Policy Research Working Paper 9263

Measuring Employment
Experimental Evidence from Urban Ghana

Rachel Heath
Ghazala Mansuri
Bob Rijkers
William Seitz
Dhiraj Sharma

Development Economics, Development Research Group & Poverty and Equity Global Practice
June 2020

Abstract

Using a randomized survey experiment in urban Ghana, this paper demonstrates that the length of the reference period and the interview modality (in person or over the phone) affect how people respond in labor surveys, with impacts varying markedly by job type. Survey participants report significantly more self-employment spells when the reference period is shorter than the traditional one week, with the impacts concentrated among those in home-based and mobile self-employment. In contrast, there is no impact of the reference period on the incidence of wage employment. The wage employed report working fewer days and hours when confronted with a shorter reference period. Finally, interviews conducted on the phone yield lower estimates of employment, hours worked, and days worked among the self-employed who are working from home or a mobile location as compared with in-person interviews.

This paper is a product of the Development Research Group, Development Economics and the Poverty and Equity Global Practice. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at brijkers@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly.
The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Measuring Employment: Experimental Evidence from Urban Ghana

Rachel Heath, Ghazala Mansuri, Bob Rijkers, William Seitz, and Dhiraj Sharma

Keywords: Labor statistics, labor force surveys, reference period, interview mode, self-employment, survey design, phone-based surveys

JEL Codes: J20, J21, J22, J46

Rachel Heath, email: rmheath@uw.edu (Associate Professor, University of Washington); Ghazala Mansuri, email: gmansuri@worldbank.org (Lead Economist, World Bank); Bob Rijkers, email: brijkers@worldbank.org (Senior Economist, World Bank); William Hutchins Seitz, email: wseitz@worldbank.org (Economist, World Bank); and Dhiraj Sharma, email: dsharma5@worldbank.org (Economist, World Bank). The authors thank Erhan Artuc, Kathleen Beegle, Andrew Dabalen, Eric Edmonds, Adriana Kugler, Vasco Molini, three anonymous referees, and conference participants at the 2016 Annual Bank Conference on Development Economics, the World Bank, and Oxford University for helpful comments. Paolo Falco and Mary Hallward-Driemeier provided support in the early stages of the project. Arthur Lagrange, Jonathan Lain, Poorna Mazumdar, Lokendra Phadera, Mohammed Y. H. Saleh, Vaclav Tehle, and Asmus Zoch provided excellent research assistance. The research was supported by funding from the World Bank’s research support budget, the Knowledge for Change Program (KCP), and the Strategic Research Program (SRP). Views expressed in this paper do not necessarily represent the views of the World Bank Group, its affiliated organizations, the executive directors of the World Bank Group or the governments they represent.

1. Introduction

Accurate measurement of labor market activity is crucial for policy design, especially in developing countries, where households are disproportionately reliant on labor income. Current practice favors the use of a one-week reference period in labor market surveys, even though a shorter reference period, such as a day, might provide a sharper snapshot of labor market activity when such activity is short-lived and transitory, as is typical in large swaths of the developing world where self- and informal employment are the norm (Hussmanns, Mehran, and Verma 1992). At the same time, with the penetration of mobile phones in developing countries, new modalities of data collection have become feasible. These technological developments have also rejuvenated interest in the impact of survey mode on data quality.¹

This paper reports the findings of a randomized survey experiment conducted in urban Ghana to assess the impact of the reference period and survey modality (being interviewed over the phone) on labor statistics. The experiment tracked the labor market behavior of 954 respondents of the Ghana Urban Household Panel Survey for six months, using four monitoring instruments: (a) a baseline in-person interview, (b) a high-frequency sequence of interviews spanning 10 consecutive weeks, (c) an endline in-person interview conducted approximately three months after the baseline, and (d) a follow-up phone survey conducted three months after completion of the endline interview (i.e., six months after the start of the survey). Survey respondents were randomized into three ‘treatment’ arms for the high-frequency interviews, which varied both the reference period and survey modality.² One-third of respondents were interviewed three times a week (using two separate 24-hour recall questions) by phone, while two-thirds were allocated to a weekly interview but faced distinct survey modes: in one arm, respondents were interviewed in person, while in the other, they were interviewed over the phone. Phones were provided to all respondents interviewed on the phone. The impact of receiving this phone on labor market behavior is likely limited, since 95.7 percent of survey participants owned a cellphone before the experiment began.

¹ High-frequency labor supply data are also of policy interest, as recent work on labor supply responses to shocks has shown (see e.g. Heath et al. forthcoming).

² There were also two control arms, each comprising approximately 300 respondents. Phones were also provided to all respondents in one of the control arms. Since individuals in the control arms did not partake in the high-frequency interviews, we do not analyze their reporting in this paper. Results, which are available upon request from the authors, suggest that receiving a phone or participating in the high-frequency surveys did not durably alter the labor market behavior of respondents.

The results show significant and economically meaningful phone and reference period effects on reporting, which vary strongly by job type. Starting with the former, in any given week, workers who are interviewed over the phone once each week instead of in person are 8 percentage points less likely to report working (relative to a mean of 86%), and report 0.6 fewer days of work and 7 fewer hours of work. These effects mask marked heterogeneity across different types of workers; phone effects for wage employees are insignificant. The results are driven almost entirely by the self-employed in mobile or home-based self-employment (individuals who either work inside their homes or outside their homes but not in a permanent structure), who exhibit strong phone effects.
The reporting of self-employed workers in comparatively stable jobs (proxied by working outside the home in a permanent structure) is not significantly impacted by survey mode.

We also document sizeable reference period effects, which impact both wage workers and the self-employed, but are much stronger for the latter. On average, those interviewed over the phone three times a week and reporting over the past two days (by means of two separate 24-hour recall questions) are 7 percentage points more likely to report having done any work than those interviewed over the phone each week. Conditional on reporting any work, they report 0.6 fewer days (relative to an average of 5.2) and 4.7 fewer hours (relative to an average of 44.4) than those interviewed each week. The increase in the odds of reporting work is exclusively concentrated among the self-employed. Self-employed workers in the triweekly arm are 13 percentage points more likely to report having done any work over the course of a given week as compared to self-employed individuals who were asked to recall their labor market behavior over the previous week. By contrast, we do not find evidence of an impact of shortening the reference period on the reported incidence of employment by wage workers. Moreover, reference period effects are strongest for the self-employed in home-based or mobile jobs.

The reference period not only impacts the reporting of the incidence of employment, but also its duration. Wage workers in the triweekly phone group report working 0.5 fewer days (0.6 fewer days conditional on reporting any) than wage workers in the weekly phone group and 5.4 fewer hours (6.3 conditional on reporting working). Among the self-employed we detect a similar but marginally smaller reduction in the number of days worked conditional on working (of 0.6 days), but no reduction in the unconditional number of hours or days worked.
Assessing which labor market reports are most accurate is challenging because we do not observe true values.³ While phone interviews may reduce social desirability bias by increasing anonymity, they could also elicit impatience and recall error. We are not in a position to ascertain which modality is to be preferred, but the survey-induced differences in labor market reporting are of interest in and of themselves in an era where researchers are progressively reliant on phone-based surveys (Dabalen et al., 2016). The sensitivity of labor market reports by certain types of self-employed workers to survey modality also calls for caution when making inferences using indicators derived from surveys with different modalities, especially when making comparative statements about the labor market outcomes of different types of workers.

Our results regarding the impact of the reference period on reporting also point to the possibility that the reference period of one week (the norm in most labor force surveys) may systematically underestimate the employment rate of self-employed workers. It may also overestimate the number of days and hours worked, especially for the wage employed. Put differently, shorter reference periods are associated with reporting more spells of shorter average duration, which suggests the possibility of more accurate recall, but more research is needed to confirm this hypothesis.

The rest of the paper is organized as follows. Section 2 reviews the literature on recall bias, telescoping, and interview mode in labor surveys and elaborates on the paper’s hypotheses. Section 3 describes the experiment. Section 4 presents the data. Section 5 documents survey modality effects by comparing the reports of the weekly in-person and weekly phone groups. Section 6 documents reference period effects by comparing the reports of the weekly phone group with the reports of those in the triweekly phone group.
The last section summarizes the paper’s main conclusions and the implications of our results for future research. Appendix A defines the variables used in the paper. Appendix B provides statistics on the survey design, data quality checks, balance, and survey compliance. Appendix C elaborates on the aggregation of triweekly interviews to weekly labor reports. Appendix D presents additional tests.

³ In retrospect, our design for assessing the impact of shortening the reference period was not ideal; it would have been better to interview workers each day of the week. Time-use diaries are of course more of a gold standard, but they often have compliance issues which may require daily interviews, in any case, to ensure completion. Such approaches are likely the closest possible approximations to respondents’ true labor supply, given that third-party verification of their time use is very challenging in this context of widespread informality and self-employment. Employer records, for instance, would only be available for a relatively small subsample of the population in this context.

2. Recall Bias, Telescoping, and Interview Mode in Labor Surveys

A substantial body of literature (summarized in Beckett et al. 2001 and Bound et al. 2001) shows that retrospective reports are prone to recall bias, the magnitude of which depends on factors such as the salience of the events to be recalled, social desirability, respondent characteristics (Bardasi et al. 2011), and the reference period (Arthi et al., 2018; de Nicola and Gine, 2014).⁴ Longer reference periods have been associated with increased recall bias that leads to differences in reported labor outcomes (see e.g. Horvath, 1982; Mathiowetz and Duncan, 1988; and Pierret, 2001).
Studies that use time diaries, in which the reference period is typically very short, yield lower estimates of labor supply (Barrett and Hamermesh, 2016; Duncan and Stafford 1980; Hamermesh 1990; Robinson and Bostrom 1994; Bonke 2005; Robinson et al. 2011; Juster and Stafford, 1991).

The length of the reference period can also impact telescoping error, the tendency of respondents to report events as occurring earlier (backward telescoping) or more recently (forward telescoping) than they actually did (Neter and Waksberg, 1964; Mathiowetz, 1986; Mathiowetz and Duncan, 1988; Sudman and Bradburn, 1974, 1982). While telescoping bias can go in either direction, it usually results in overreporting when the reference period is short (Bound et al., 2001), because respondents tend to include events occurring before the reporting period (Akerlof and Yellen, 1985), leading to an overstatement of the number of events reported for recent periods and an understatement for more distant periods.

⁴ The choice of reference period in labor market surveys has been the subject of extensive debate (Hussmanns, Mehran, and Verma 1990; Stewart 2014). The International Labour Organization (ILO) identifies both a day and a week as appropriate reference periods, as they correspond closely to an instantaneous (stock) measure of employment and are less vulnerable to the memory-dependent errors that arise over longer periods of recall. Most labor market surveys use a reference period of one week, because of both the practicality of measurement and consistency with other sources. When full-time formal-sector employment is the norm, using weekly as opposed to daily recall has the additional advantage of resulting in a lower variance while giving similar average results. The ILO notes that when intermittent, casual, and short-term employment is widespread, as is the case in developing countries, shorter reference periods may enhance accuracy. If self-employment is more volatile than wage employment, one would expect the choice of the reference period to have a greater impact on reporting of self-employment than wage employment.

Another strand of the literature examines the impact of survey mode in social surveys. Phone interviews may suffer higher rates of nonresponse (De Leeuw 1992), and can (but certainly do not always) also yield slightly different responses than in-person interviews (Groves 1990). These differences are often ascribed either to social desirability bias or satisficing. Starting with the former, greater “social distance” due to the physical absence of interviewers may reduce the proclivity of respondents to offer socially desirable responses in phone interviews. But the larger “social distance” could also have an opposite effect by making respondents less willing to confess to socially undesirable and stigmatized behavior (Gregson et al. 2002; Langhaug, Sherr and Cowan 2010). To the extent that working is socially desirable but not stigmatized, one might anticipate phone interviews to be associated with lower estimates of days and hours worked.

A second potential reason for different responses is survey satisficing: telephone interviewing may increase the likelihood of respondents limiting the amount of cognitive effort they devote to reporting accurately, because phone interviews tend to move more quickly, because of the threat of interruptions, or because of the possibility of multitasking during the phone interview. Such survey satisficing could result in both more recall bias and a lower propensity to report work disruptions.

3. Experiment Design

The experiment was designed to examine the impact of the length of the reference period and survey modality on labor market reporting. It included three treatments (table 1). Participants in all three groups were interviewed in person at baseline and endline and by phone in a three-month follow-up.
The treatment groups comprised (a) one group that was interviewed by phone three times a week for 10 weeks (30 interviews), (b) another that was interviewed by phone once a week for 10 weeks (10 interviews), and (c) a third that was interviewed in person once a week for 10 weeks (10 interviews). The questionnaire was identical for both phone and in-person interviews, and the two types of interviews were conducted at the same times. All participants interviewed by phone received cellphones to avoid selection bias associated with phone ownership.

To incentivize participation, all respondents were paid (see appendix table B1). All participants received 3 cedis ($1.36) for completing the baseline survey, 3 cedis ($1.36) for completing the endline survey, and 4 cedis ($1.82) worth of airtime credit for completing the three-month follow-up. Participants who were interviewed once a week (by phone or in person) received 3 cedis ($1.36) for each completed interview, and participants who were interviewed three times a week received 2 cedis ($0.91) for each completed interview. All payments, except the follow-up survey payment, were made at endline. Individuals assigned to the weekly phone and triweekly phone arms received phones (whether or not they already owned one) with a SIM card and 1 cedi ($0.46) of phone credit.

The sample was drawn from respondents in three cities (Accra, Cape Coast, and Kumasi) of the Ghana Urban Household Panel Survey (GUHPS).⁵ A baseline in-person interview was conducted with all respondents before the high-frequency survey was initiated. It served multiple functions, including collecting baseline information on key variables of interest, collecting contact information, distributing cellphones to participants, familiarizing respondents with the survey questions, and cultivating trust between enumerators and respondents.
All respondents provided their phone numbers (often more than one) and were asked to provide their preferred phone number (typically their existing number) for completing phone interviews.

Subjects were allocated to one of the three arms. The arms were balanced on a range of observable characteristics, including gender, education, age, occupation, marital status, dependency ratio, and asset ownership. Randomization was applied at the household level, so that everyone in a household was assigned to the same arm, in order to avoid intrahousehold spillovers arising from assigning members from the same household to different arms.

⁵ The Ghana Urban Household Panel Survey (GUHPS) is a panel labor market survey administered by the Centre for the Study of African Economies at the University of Oxford. The sampling frame for the experiment consisted of respondents who had been interviewed in the GUHPS, excluding individuals under the age of 20 or above the age of 60 in 2013, individuals not contacted in either 2010 or 2012, and individuals located in Takoradi-Secondi (to cut costs, the number of locations in which the survey experiment was conducted was limited). The resulting sample frame consisted of 2,251 individuals from 720 households.

A number of strategies helped reduce attrition and enhance the quality of the data. The in-person baseline interview was key to this process. At baseline, enumerators were paired with specific respondents, of the same gender, for the entirety of the survey. To convey interest in individuals’ welfare, interviewers asked questions about well-being before inquiring about labor market outcomes. The high-frequency interviews were also short; they were designed to take no more than five minutes. This was assisted by the use of Computer Assisted Personal Interviewing (CAPI) software that prepopulated time-invariant information, allowing interviewers to focus on questions that were expected to vary over the course of the survey.
A log was kept of all calls made from the phones used for the survey. This allowed for verification of the date, time, and duration of calls and also maintained a record of call attempts, network problems, and other usage statistics. Information entered by enumerators in the hand-held device was also verifiable against data from the call logs automatically stored in the phone assigned to each enumerator, enabling better monitoring. During the baseline interview, respondents were also asked to indicate when they preferred to be interviewed. Finally, data quality checks were performed throughout the survey.

This approach was successful: the average phone interview took less than four minutes.⁶ More than 95 percent of the interviews were matched with a record in the call logs; unmatched interviews largely reflected the fact that respondents asked to use a number that was not provided at the time of the baseline or endline interview. After completion of the endline survey, 5 percent of respondents were randomly selected for a verification survey. They were asked to verify whether they had received a phone, whether the phone was sealed in a box, how often they were interviewed, what their employment status at baseline was, and what economic activity they were engaged in at the time of the baseline interview. Their responses are highly consistent with their baseline responses and attest to the credibility of the collected data (see appendix table B2).

⁶ The duration of interviews did not decline over the course of the survey, suggesting that respondents did not (start to) respond strategically to reduce the duration of the interview. (Namely, by responding that they had not worked, they could have reduced survey duration by about a minute.)

While phone interviews were associated with somewhat lower compliance than in-person interviews (consistent with earlier studies such as De Leeuw, 1992), survey compliance was very high overall: only 6.4 percent of weekly in-person interviews, 11.2 percent of weekly phone interviews, and 10.8 percent of triweekly interviews were not completed (table 1 and appendix table B3). Interestingly, interviewing respondents more frequently thus did not induce higher noncompliance. Appendix table B4 analyzes the determinants of compliance with the high-frequency surveys and demonstrates that attrition is difficult to predict.⁷ Nonetheless, we will attrition-weight our regressions when analyzing reporting differences.

Survey retention rates between baseline and endline interviews were even higher, with only 2 of 954 individuals not interviewed at the in-person endline and just 9 percent of respondents not available for an interview at the three-month follow-up conducted over the phone. Moreover, the overwhelming majority of participants (97.5 percent) indicated that they would be willing to participate again. These rates of compliance are much higher than in other studies (see, e.g., Dillon 2012; Croke et al. 2014; Garlick, Orkin, and Quinn 2015) and probably reflect a variety of factors, including a short questionnaire, flexible interview schedules, adequate compensation, and the cultivation of trust between respondents and enumerators.

4. Data

4.1 Descriptive Statistics, Balance, and External Validity

Table 2 assesses balance across treatment arms at baseline for the sample that participated in the survey (sampling was done before the survey was fielded, so examining balance at baseline is a strong test of whether randomization was successful), distinguishing between variables that were explicitly targeted in the randomization, presented in Panel A, and variables that were not, presented in Panel B. Sample sizes per treatment arm differ slightly from the target of 320, with 318 respondents participating in weekly in-person interviews, 315 in weekly phone interviews, and 321 in triweekly phone interviews.
The null hypothesis that the treatment arms were balanced on all variables that were used to stratify the sample is not rejected, as is shown in Panel A. Just under three-fifths of respondents in the experiment were women, and the average respondent age was 34. Approximately two-thirds of respondents were employed, and approximately three-fifths of those working reported being self-employed. Our survey population thus constitutes a group of fairly poor urban dwellers who are likely to be mainly reliant on labor income and are thus of particular interest for a study of this type, which examines how best to collect labor market data. Our sample is representative of urban labor markets in Ghana, as is shown in appendix table B5, which compares our sample with that of the respondents in the same cities from the Ghana Living Standard Survey (GLSS) 2012/13, a nationally representative household survey, and shows that average socioeconomic characteristics of experiment participants and their households are not statistically different from GLSS participants.

⁷ The predictive power of models of interview completion presented in appendix table B4 is very low; the adjusted R² of the best performing model is 0.013. Moreover, few of the covariates significantly predict attrition.

However, at baseline, the different groups exhibited significant differences in some of the variables not targeted in the randomization, notably in the propensity to be engaged in manufacturing and in the public sector. To account for these differences, in appendix tables D1 and D2 we present regressions in which we include controls for baseline sector of employment, days and hours worked per week at baseline, as well as baseline pay per week, in addition to the variables used in the randomization. Our main findings, which are presented below, are robust to including these controls.
Balance tables for the self-employed and wage employed are presented in appendix tables B6 and B7, respectively. The results are similar to those for the main sample; we cannot reject the null that there are no differences across treatment groups in terms of the variables targeted in the randomization, but do detect some differences across groups for variables that were not directly targeted, notably sector of employment and days worked per week for the self-employed, and days worked per week for the wage employed. These differences attest to the need to include controls for these baseline differences when analyzing the reporting behavior of the different groups.

4.2 Aggregating Triweekly Interviews

Most of the triweekly interviews were conducted on the same days of the week (Tuesday, Thursday, and Saturday), as is shown in appendix table C1. Respondents were asked to recall their labor market outcomes over the previous two days. As a result, Saturday was consistently missed in the high-frequency reporting. The follow-up survey aimed to fill this gap by inquiring about respondents’ current and past labor market behavior on Saturdays, as well as all other days of the week.

To obtain estimates of how many hours the triweekly group worked each week they reported on, we simply add up the number of days and hours they reported working over 3 consecutive two-day interviews, and we impute the number of hours they worked each Saturday using information they provided at endline using the following rules: (i) if respondents reported never working on a Saturday (approximately 52% of respondents), no adjustment was made (i.e. no hours or days were added); (ii) if a respondent reported working all Saturdays, we added one day of work and the usual number of hours they reported working on Saturdays whenever the respondent reported working any of the six other days of the week for which direct observational data were available.
For weeks for which no other work was reported, no adjustment was made; and (iii) for respondents reporting working only some Saturdays, we followed the same procedure but instead of adding a day and hours to each week in which they reported doing any work, we randomly selected a number of weeks in which they reported doing any work. Selection probabilities corresponded to the number of times respondents typically worked on a Saturday in any month (e.g., if a respondent reported working two Saturdays each month, then additional days and hours would be added to 50% of the weeks they reported doing any work). Appendix C furnishes descriptive evidence suggesting that imputing Saturday hours based on usual hours reported at endline is not a bad (though certainly not ideal) approximation for actual hours worked.

5. Survey Mode Effects

This section analyzes differences in reporting between the weekly in-person and weekly phone groups during high-frequency interviews to assess survey modality effects. We use the weekly in-person reports as the benchmark, since in-person interviews remain the dominant survey mode for both labor and household surveys. We restrict attention to those who reported being employed at baseline. The following equation was estimated to assess the impact of survey modality on the reporting of labor market outcomes by respondents:

$$Y_{iw} = \beta_P \, \mathit{Phone}_i + \tau_w + \varepsilon_{iw} \qquad (1)$$

where $Y_{iw}$ is a labor market outcome of interest of individual $i$ in week $w$ (whether individual $i$ reported doing any work in week $w$, days worked per week, and hours worked per week, both unconditionally and conditional on having reported any work that week); $\mathit{Phone}_i$ is a dummy variable that takes the value 1 if respondent $i$ was interviewed over the phone once a week and 0 otherwise; the omitted category is thus the weekly in-person interviews; $\tau_w$ is a vector of calendar week dummies; and $\varepsilon_{iw}$ is a random error term. Standard errors are clustered at the treatment level, i.e. by household.
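As a rough illustration of equation (1): if the calendar-week dummies are dropped, the weighted least-squares estimate of the phone coefficient reduces to the difference in weighted mean outcomes between the weekly phone and weekly in-person groups. The sketch below works under that simplifying assumption and uses the attrition weights defined in this section (the inverse of the number of completed weeks per respondent). The records, field names, and function names are hypothetical and are not taken from the paper's data or code; clustering of standard errors is not shown.

```python
from collections import defaultdict

def attrition_weights(records):
    """Weight each person-week by 1 / (completed weeks observed for that person)."""
    completed = defaultdict(int)
    for r in records:
        completed[r["id"]] += 1
    return [1.0 / completed[r["id"]] for r in records]

def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def phone_effect(records):
    """Weighted difference in mean outcome: phone group minus in-person group.

    Equals the weighted OLS coefficient on the phone dummy when the only
    regressors are a constant and that dummy (week dummies omitted here).
    """
    weights = attrition_weights(records)
    values = {0: [], 1: []}
    wts = {0: [], 1: []}
    for r, w in zip(records, weights):
        values[r["phone"]].append(r["y"])
        wts[r["phone"]].append(w)
    return weighted_mean(values[1], wts[1]) - weighted_mean(values[0], wts[0])

# Hypothetical person-week records: y = 1 if any work was reported that week.
records = [
    {"id": "A", "phone": 0, "y": 1},
    {"id": "A", "phone": 0, "y": 1},
    {"id": "B", "phone": 0, "y": 0},
    {"id": "C", "phone": 1, "y": 1},
    {"id": "C", "phone": 1, "y": 0},
    {"id": "D", "phone": 1, "y": 0},
]

print(phone_effect(records))  # prints -0.25
```

With these weights every respondent contributes the same total weight regardless of how many interviews they completed, so respondents with more completed weeks do not dominate the comparison.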
Note that we have at most 10 observations per individual. To minimize potential selection bias associated with differences in interview completion rates across individuals, all observations are attrition weighted, with weights corresponding to the inverse of the number of complete weeks for which an individual was observed. The coefficient $\beta_P$ provides an estimate of the impact of the survey modality on reported labor market behavior (recall that weekly in-person interviews are the omitted category and thus serve as the “control” group in this specification).

To assess whether modality effects vary between the wage workers and self-employed, we also estimate:

$$Y_{iw} = \beta_P \, \mathit{Phone}_i + \beta_S \, \mathit{SE}_i + \beta_{PS} \, (\mathit{SE}_i \times \mathit{Phone}_i) + \tau_w + \varepsilon_{iw} \qquad (2)$$

where $\mathit{SE}_i$ is a dummy variable that takes the value 1 if respondent $i$ was self-employed at baseline and 0 otherwise. Labor market status was very stable over the course of the survey, as is shown in table 3, which presents a matrix of transitions between employment status at baseline and endline. Only 9 workers who were self-employed at baseline (2% of all self-employed workers at baseline) had switched to wage employment at the time of the endline and only 14 workers who were wage employed at baseline (5% of all wage employed at baseline) had switched to self-employment. Baseline employment status is thus a good measure of a person’s employment type during the course of the survey and avoids potential concerns about the endogeneity of employment status to treatment. The coefficient $\beta_{PS}$ estimates the differential sensitivity of the self-employed to survey modality.

Using a similar strategy, we examine heterogeneity among the self-employed. We divide the self-employed into two groups: those who work in a fixed structure that is outside their home, and those who do not (i.e. those who either work at home or work outside their homes but have a mobile workplace).
The former group is likely to comprise a more established set of entrepreneurs, whose labor market attachment is less volatile, and whose labor market reporting may, as a result, be less sensitive to the particular way labor statistics are collected. Specifically, focusing on the sample of those who were self-employed at baseline, we estimate:

$$Y_{iw} = \beta_P \, Phone_i + \beta_F \, FLOH_i + \beta_{PF} \, FLOH_i \times Phone_i + \tau_w + \varepsilon_{iw} \qquad (3)$$

where $FLOH_i$ is a dummy for working in a fixed location outside of the house in a permanent structure. The coefficient $\beta_{PF}$ measures how sensitivity to modality varies among the self-employed.

5.1 Results

Respondents assigned to weekly phone interviews reported significantly less employment, fewer hours, and fewer days worked than respondents interviewed in person each week, as shown in table 4, which presents the main survey mode results. The effects are both statistically and economically significant. Relative to weekly in-person interviews, weekly phone-based interviews are associated with a statistically significant 8 percentage point reduction in the likelihood of reporting work. This is a sizeable effect given that 86% of the respondents who took part in weekly in-person interviews and reported being employed at baseline (the omitted category) report working in any given week. The reported number of days worked per week is a statistically significant 0.57 lower overall, some 12% below the average of 4.60 days reported by respondents in the in-person weekly group. Conditional on having reported working, respondents in the weekly phone group reported 0.15 fewer days, only some 3% less than the average of 5.37 reported by respondents who took part in weekly in-person interviews. This difference is not statistically significant. The number of hours worked reported is 6.9 lower overall, a sizeable and significant effect: 17% less than the average number of hours worked reported by participants in in-person interviews (41.3 hours).
Conditional on having reported any work, differences are smaller and no longer significant, but still present; phone interviews are associated with reporting 3.6 fewer hours than in-person interviews, which is only 7% less than the average of 48.3 hours reported by those who participated in weekly in-person interviews. Put differently, phone interviews are associated with reporting fewer work spells, days and hours. However, conditional on having reported any work, the number of days and hours are not significantly different.

5.2 Heterogeneity between wage workers and the self-employed

The results presented in panel A mask marked heterogeneity between wage workers and the self-employed, as shown in panel B of table 4. Survey modality does not significantly affect labor market reporting by the wage employed: none of the labor market outcomes reported by respondents who were wage employed at baseline and assigned to the weekly phone treatment differ on average from the outcomes of wage workers who were interviewed in person each week. By contrast, the self-employed are sensitive to survey mode. The interaction terms indicate that the phone effect on the likelihood of reporting work in any given week is 12 percentage points more negative for the self-employed than for wage workers, with 0.8 fewer days and 8 fewer hours reported on the same relative basis. Summing the main and interaction effects, self-employed workers interviewed over the phone were 14 percentage points less likely to report working than self-employed workers interviewed in person, and reported 0.9 fewer days and 10.5 fewer hours. They did not, however, report working significantly fewer days or hours conditional on working.

5.3 Heterogeneity among the self-employed

Given the marked sensitivity of the self-employed to survey modality, table 5 assesses which types of self-employed workers are most affected. The omitted category is self-employed workers who work out of their homes or have a mobile place of employment. Their reporting is very sensitive to modality; when these self-employed workers were interviewed over the phone, they were 18 percentage points less likely to report having done any work than when they were interviewed in person.
They also reported working one day a week less when they were interviewed over the phone as opposed to in person. By contrast, the self-employed who work in a fixed location outside their homes are much less sensitive to survey modality; the hypothesis that their reporting was not affected by survey modality cannot be rejected. Put differently, the survey modality effect is entirely driven by those who are self-employed in home-based or mobile jobs. In appendix table D3 we assess heterogeneity in reporting among the wage employed and show that the reporting of neither formal nor informal wage employees is affected by survey modality.

6. Impact of the Reference Period

This section analyzes the impact of the reference period on labor market reporting by comparing the responses of individuals in the weekly and triweekly phone groups. Since these groups were both interviewed over the phone, differences in reporting are not driven by modality. We use the reports of those in the weekly phone group as the benchmark, since the 7-day reference period remains the norm. We again confine attention to those who reported working at baseline.

To assess the impact of the reference period on labor market reporting, the following equation is estimated:

$$Y_{iw} = \beta_T \, PhoneTriweekly_i + \tau_w + \varepsilon_{iw} \qquad (4)$$

where $Y_{iw}$ is a labor market outcome of interest of individual $i$ in week $w$ as defined above and $PhoneTriweekly_i$ is a dummy variable that takes the value 1 if respondent $i$ was interviewed over the phone three times a week and 0 otherwise; the omitted category is thus the weekly phone interviews; $\tau_w$ is a vector of calendar week dummies; and $\varepsilon_{iw}$ is a random error term. Triweekly labor market reports are aggregated to the weekly level in order to compare them with the weekly labor market reports obtained in the weekly arms (see the discussion in section 4.2 and Appendix C). Standard errors are again clustered at the treatment level, i.e.
by household, and all observations are again attrition weighted, with weights equal to the inverse of the number of complete weeks for which an individual was observed. Analogous to the analysis of survey mode effects, we analyze heterogeneity by interacting the dummy for the triweekly phone treatment with an indicator for having been self-employed at baseline. We also examine heterogeneity among the self-employed.

6.1 Results

Respondents in the triweekly phone arm (who were asked to respond to two 24-hour recall questions) were 7 percentage points more likely to report having done any work in a given week than respondents in the weekly phone group with a one-week reference period, as shown in Table 6, which presents the main reference period results. There was no significant change in the average reported number of days or hours worked overall. However, conditional on reporting having done any work that week, the number of days fell significantly, by 0.6. The number of hours worked conditional on working dropped by 4.7. Shortening the reference period from one week to two days thus yielded reports of more employment spells but not a significantly different number of days or hours worked overall. One explanation for these findings is that more frequent reporting improves accuracy by enabling respondents to better recall both short-lived employment spells and disruptions.

6.2 Heterogeneity between wage workers and the self-employed

The length of the reference period also affects self-employed and wage-employed workers differently, as shown in panel B of table 6. The reference period does not affect the reported incidence of employment among the wage employed (the omitted category in this regression). Yet wage-employed participants who were interviewed three times a week reported 0.5 fewer days worked than participants who completed weekly interviews, and 0.6 fewer days conditional on reporting any work.
They also reported working 5.4 fewer hours overall and 6.3 fewer hours conditional on reporting any work. By contrast, self-employed participants who were interviewed three times a week were 13 percentage points more likely to report having worked in any given week than those interviewed once a week. Relative to the corresponding effect among the wage employed, they also reported 0.7 more days and 8.4 more hours; the implied total effects for the self-employed (0.2 more days and 3.0 more hours than those interviewed once a week) are not statistically significant. Conditional on reporting any work, they did not report working more days or hours. These findings suggest the differences were driven by reporting more spells, i.e. an “extensive” margin effect, not by reporting more or fewer hours for a given spell, i.e. not by an “intensive” margin effect.

6.3 Heterogeneity among the self-employed

To assess which type of self-employed workers drive the reference period effect, table 7 examines heterogeneity in the impact of being interviewed three times a week among the self-employed. The self-employed who work from their homes or have a mobile place of work exhibit the greatest sensitivity to the reference period. When they were interviewed three times a week using two 24-hour reference periods, they were 23 percentage points more likely to report having done any work that week than when they were interviewed once a week using a seven-day reference period. They also reported working 0.9 more days and 10 more hours when reporting with a reference period shorter than one week. They did not, however, report significantly fewer days or hours conditional on reporting work.

The reporting of the self-employed who work in a fixed location outside their homes is much less sensitive to the reference period. They did not report working significantly more often and did not report working more days and hours when they were interviewed three times a week instead of once a week.
They did report working fewer days and hours conditional on reporting any work, but the difference with self-employed workers who were working from their homes is not statistically significant. Put simply, it seems that the sensitivity of the self-employed to the reference period is driven by those working at home or in mobile places of employment.

In the appendix we explore heterogeneity among the wage employed, dividing them into formal and informal workers. While appendix table D4 indicates that differences in the impact of survey mode and the reference period across these groups are generally not statistically significant, the point estimates suggest a pattern similar to the results comparing the self-employed with and without a fixed place of employment outside their home: there is some evidence that informal wage workers reported more employment than formal ones when the reference period was shortened.

7. Conclusion

A randomized survey experiment was conducted in urban Ghana to assess the impact of reference period and survey modality on reported labor market statistics. Interviewing people over the phone instead of in person reduced the reported incidence of employment by 8 percentage points, the number of days worked by 0.6 and the number of hours by 7, on average. These results mask marked heterogeneity in reporting across different types of jobs. Phone effects are concentrated among the self-employed who work from home or have a mobile place of employment. They are insignificant for the wage employed and for the self-employed who work outside of their homes in a permanent structure.

Shortening the reference period from one week to two days (by means of two separate 24-hour recall questions) also has different impacts on wage workers and the self-employed; survey participants who reported their labor market activity three times a week reported significantly more self-employment spells than those reporting only once a week, but they did not report more wage employment.
These effects are particularly pronounced among self-employed workers who are working at home or in a mobile place of employment, i.e. those with less stable work. Wage employees do report working fewer days and hours when interviewed three times a week instead of once a week.

These results point to the possibility that a reference period of one week may systematically underestimate the incidence of certain types of self-employment, but more research is needed to substantiate this conclusion. Future studies seeking to assess the role of the reference period in contexts with significant self-employment would be well advised to interview respondents on a daily basis and/or use time diaries. The use of administrative and/or observational data is of course ideal, but highly unlikely to be available in such settings.

Although identifying the mechanisms that cause differences in reporting is beyond the scope of this paper, our findings demonstrate that labor market statistics are very sensitive to the method by which they are collected, with those in the least stable types of self-employment being most affected by changes in survey mode and the reference period. It is important to pay attention to these differences when examining the determinants of labor supply, especially in contexts where self-employment is widespread.

References

Akerlof, G. and J. Yellen. 1985. “Unemployment Through the Filter of Memory.” Quarterly Journal of Economics 100 (3): 747–773.
Arthi, V.S., K. Beegle, J. De Weerdt, and A. Palacios-Lopez. 2018. “Not Your Average Job: Measuring Farm Labor in Tanzania.” Journal of Development Economics 130 (C): 160–172.
Bardasi, E., K. Beegle, A. Dillon, and P. Serneels. 2011. “Do Labor Statistics Depend on How and to Whom the Questions Are Asked? Results from a Survey Experiment in Tanzania.” World Bank Economic Review 25 (3): 418–47.
Barrett, G., and D. S. Hamermesh. 2016.
“Labor Supply Elasticities: Overcoming Nonclassical Measurement Error Using More Accurate Hours Data.” NBER Working Paper 22920. National Bureau of Economic Research, Cambridge, MA.
Beckett, M., J. Da Vanzo, N. Sastry, C. Panis, and C. Peterson. 2001. “The Quality of Retrospective Data: An Examination of Long-Term Recall in a Developing Country.” Journal of Human Resources 36: 593–625.
Bonke, J. 2005. “Paid Work and Unpaid Work: Diary Information versus Questionnaire Information.” Social Indicators Research 70 (3): 349–68.
Bound, J., C. Brown, and N. Mathiowetz. 2001. “Measurement Error in Survey Data.” In Handbook of Econometrics, ed. J. Heckman and E. Leamer, 3705–843. Amsterdam: Elsevier.
Croke, K., A. Dabalen, G. Demombynes, M. Giugale, and J. Hoogeveen. 2014. “Collecting High-Frequency Panel Data in Africa Using Mobile Phone Interviews.” Canadian Journal of Development Studies 35 (1): 186–207.
Dabalen, A., A. Etang, J. Hoogeveen, E. Mushi, Y. Schipper, J. von Engelhardt. 2016. Mobile Phone Panel Surveys in Developing Countries: A Practical Guide for Microdata Collection. Directions in Development: Poverty. Washington, DC: World Bank.
De Leeuw, E. 1992. “Data Quality in Mail, Telephone and Face to Face Surveys.” Netherlands Organization for Scientific Research, The Hague.
de Nicola, F., and X. Gine. 2014. “How Accurate Are Recall Data: Evidence from Coastal India?” Journal of Development Economics 106: 52–65.
Dillon, B. 2012. “Using Mobile Phones to Collect Panel Data in Developing Countries.” Journal of International Development 24: 518–27.
Duncan, G. J., and F. P. Stafford. 1980. “Do Union Members Receive Compensating Wage Differentials?” American Economic Review 70 (3): 355–71.
Garlick, R., K. Orkin, and S. Quinn. 2015. “Call Me Maybe: Experimental Evidence on Using Mobile Phones to Survey African Microenterprises.” CSAE Working Paper WPS/2016-14. Centre for the Study of African Economies, Oxford University.
Gregson, S., T. Zhuwau, J. Ndlovu, and C.A.
Nyamukapa. 2002. “Methods to Reduce Social Desirability Bias in Sex Surveys in Low-Development Settings: Experience in Zimbabwe.” Sexually Transmitted Diseases 29 (10): 568–75.
Groves, R. M. 1990. “Theories and Methods of Telephone Surveys.” Annual Review of Sociology 16: 221–40.
Hamermesh, D. S. 1990. “Shirking or Productive Schmoozing: Wages and the Allocation of Time at Work.” Industrial & Labor Relations Review 43 (3): 121S–33S.
Heath, R., G. Mansuri, and B. Rijkers. “Labor Supply Responses to Health Shocks: Evidence from High-Frequency Labor Market Data from Urban Ghana.” Journal of Human Resources (forthcoming).
Horvath, F. W. 1982. “Forgotten Unemployment: Recall Bias in Retrospective Data.” Monthly Labor Review 105 (3): 40–43.
Hussmanns, R., F. Mehran, and V. Verma. 1990. Surveys of Economically Active Population, Employment, Unemployment and Underemployment: An ILO Manual on Concepts and Methods. Geneva: International Labour Office.
ILO (International Labour Organization). 2015. World Employment and Social Outlook: Trends 2015. Geneva.
Juster, F. T., and F. P. Stafford. 1991. “The Allocation of Time: Empirical Findings, Behavioral Models, and Problems of Measurement.” Journal of Economic Literature 29 (2): 471–522.
Langhaug, L. F., L. Sherr, and F. M. Cowan. 2010. “How to Improve the Validity of Sexual Behaviour Reporting: Systematic Review of Questionnaire Delivery Modes in Developing Countries.” Tropical Medicine and International Health 15 (3): 362–81.
Mathiowetz, N. A. 1986. “The Problem of Omissions and Telescoping Error: New Evidence from a Study of Unemployment.” Proceedings of the Section on Surveys Methods Research (American Statistical Association, Alexandria, VA).
Mathiowetz, N. A., and G. J. Duncan. 1988. “Out of Work, Out of Mind: Response Errors in Retrospective Reports of Unemployment.” Journal of Business and Economic Statistics 6 (2): 221–29.
Neter, J., and J. Waksberg. 1964.
“A Study of Response Errors in Expenditures Data from Household Interviews.” Journal of the American Statistical Association 59 (305): 18–55.
Pierret, C. 2001. “Event History Data and Survey Recall.” Journal of Human Resources 36: 439–66.
Robinson, J. P., and A. Bostrom. 1994. “The Overestimated Workweek? What Time Diary Measures Suggest.” Monthly Labor Review 11–23.
Robinson, J. P., S. Martin, I. Glorieux, and J. Minnen. 2011. “The Overestimated Workweek Revisited.” Monthly Labor Review 134 (6): 43–53.
Stewart, J. 2014. “The Importance and Challenges of Measuring Work Hours.” IZA World of Labor. Available at https://wol.iza.org/articles/importance-and-challenges-of-measuring-work-hours/long.
Sudman, S. and N. M. Bradburn. 1974. Response Effects in Surveys. Aldine Publishing Company, Chicago.
Sudman, S. and N. M. Bradburn. 1982. Asking Questions. A Practical Guide to Questionnaire Design. San Francisco.

Table 1: Survey design and interview completion rates

                                         Weekly        Weekly       Triweekly
                                         in-person     phone        phone
                                         interview     interview    interview    Total
Baseline (Aug-Sept 2013)
  Targeted # participants                320           320          320          960
  Interviews completed                   318           315          321          954
High Frequency Surveys (Aug-Oct 2013)
  Interviews attempted                   3180          3150         9630         15,960
  Interviews completed                   2977          2797         8593         14,367
  Non-completion rate                    6.4%          11.2%***     10.8%***     10.0%
Endline (Oct-Nov 2013)
  Interviews attempted                   318           315          321          954
  Interviews completed                   318           314          320          952
  Non-completion rate                    0.0%          0.3%         0.3%         0.2%
3-Month Follow up (March 2014)
  Interviews attempted                   318           315          321          954
  Interviews completed                   281           289          295          865
  Non-completion rate                    11.6%         8.3%         8.1%         9.3%

Note: ***, **, * indicate that the completion rate differs from that of the weekly in-person group at the 1%, 5%, and 10% significance levels respectively.

Table 2: Baseline Differences by Treatment Status

                                                       Means                                p-values
                                                       In-person   Phone     Phone
                                                       weekly      weekly    triweekly
                                                       (1)         (2)       (3)         (2=1)    (3=1)    (3=2)
A. Variables used in randomization
  Male                                                 0.44        0.41      0.40        0.51     0.34     0.78
  Age                                                  34.26       33.21     34.04       0.17     0.79     0.29
  Education                                            8.61        9.09      9.11        0.20     0.20     0.97
  Married                                              0.44        0.41      0.41        0.42     0.45     0.97
  Self-employed                                        0.47        0.35      0.42        0.00     0.24     0.09
  Wage-employed                                        0.24        0.31      0.26        0.05     0.60     0.19
  Accra                                                0.50        0.53      0.57        0.62     0.35     0.65
  Cape Coast                                           0.09        0.06      0.07        0.62     0.65     0.96
  Dependency ratio                                     0.38        0.39      0.41        0.73     0.31     0.47
  Log hh assets                                        7.34        7.44      7.53        0.57     0.23     0.60
  p-value joint F test all variables used in A                                           0.29     0.71     0.67
B. Variables not targeted in the randomization
  Days worked per week                                 4.03        3.64      3.75        0.10     0.25     0.60
  Hours worked per week                                37.29       31.77     32.74       0.03     0.07     0.66
  Manufacturing                                        0.07        0.05      0.06        0.58     0.73     0.80
  Trade                                                0.36        0.27      0.27        0.02     0.03     0.84
  Services                                             0.26        0.28      0.32        0.51     0.12     0.30
  Public sector                                        0.03        0.07      0.03        0.07     0.95     0.08
  Pay per week                                         88.14       81.21     83.59       0.61     0.75     0.85
  p-value joint F test all variables used in A and B                                     0.43     0.49     0.81
Total number of individuals                            318         315       321

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors are clustered at treatment level (Household).

Table 3: Transition Matrix

                        Status at Endline
Status at Baseline      Self-employed   Wage-employed   Not working   Attrited   Total
Self-employed           348             9               38            0          395
                        88.1%           2.3%            9.6%          0.0%       100.0%
Wage-employed           14              215             25            0          254
                        5.5%            84.7%           9.8%          0.0%       100.0%
Not working             32              31              240           2          305
                        10.5%           10.2%           78.7%         0.7%       100.0%
Total                   394             255             303           2          954
                        41.3%           26.7%           31.8%         0.2%       100.0%

Note: Attrited is a category reserved for individuals that did not complete the endline survey. Not working is a composite category that comprises both the unemployed and those who are out of the labor force.
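The row percentages in Table 3 are simply each cell divided by its baseline row total. A minimal sketch, with counts copied from the table and illustrative variable names, recomputes them:

```python
# Counts copied from Table 3 (rows: status at baseline; columns: status at endline).
columns = ["Self-employed", "Wage-employed", "Not working", "Attrited"]
counts = {
    "Self-employed": [348, 9, 38, 0],
    "Wage-employed": [14, 215, 25, 0],
    "Not working": [32, 31, 240, 2],
}


def row_shares(row):
    """Percentage of a baseline group found in each endline status."""
    total = sum(row)
    return [round(100.0 * c / total, 1) for c in row]


shares = {status: row_shares(row) for status, row in counts.items()}

# Diagonal entries: share of each baseline group with the same status at endline.
stayers = {status: shares[status][columns.index(status)] for status in counts}

for status, pct in stayers.items():
    print(f"{status}: {pct}% unchanged at endline")
```

Recomputed this way, the wage-employed diagonal comes out at 84.6%, marginally below the 84.7% printed in the table; the printed figure may have been adjusted so that the row sums to exactly 100%.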
Table 4: The Impact of Survey Mode on Weekly Labor Market Reporting
Phone weekly vs in person weekly (omitted category)

                                         Working     Days        Days          Hours       Hours
                                                                 conditional               conditional
                                                                 on work                   on work
Panel A: Main effects                    (1)         (2)         (3)           (4)         (5)
Phone weekly                             -0.08***    -0.57***    -0.15         -6.86***    -3.61
                                         (0.03)      (0.18)      (0.10)        (2.49)      (2.38)
                                         [0.01]      [0.01]      [0.13]        [0.01]      [0.13]
Mean in person weekly                    0.86        4.60        5.37          41.34       48.31
N                                        3,951       3,951       3,190         3,951       3,190
Adjusted R2                              0.029       0.033       0.004         0.025       0.004

Panel B: Heterogeneity by
self-employment status                   (6)         (7)         (8)           (9)         (10)
Phone weekly                             -0.01       -0.09       -0.02         -2.51       -2.22
                                         (0.04)      (0.27)      (0.14)        (3.47)      (3.24)
                                         [0.87]      [0.87]      [0.90]        [0.74]      [0.74]
Self-employed                            0.04        0.29        0.11          0.55        -1.49
                                         (0.04)      (0.26)      (0.15)        (3.31)      (3.08)
                                         [0.74]      [0.74]      [0.74]        [0.90]      [0.86]
Phone weekly*Self-employed               -0.12**     -0.83**     -0.23         -7.99*      -3.21
                                         (0.05)      (0.35)      (0.20)        (4.35)      (4.03)
                                         [0.16]      [0.16]      [0.74]        [0.34]      [0.74]
Mean wage employed in-person weekly      0.83        4.41        5.30          41.00       49.23
Phone effect self-employed               -0.14       -0.92       -0.24         -10.51      -5.43
p                                        0.02        0.02        0.25          0.07        0.43
Number of observations                   3,951       3,951       3,190         3,951       3,190
Adjusted R2                              0.035       0.040       0.006         0.033       0.009

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses below the coefficient. Multiple-testing (FDR) adjusted q-values are presented in square brackets. The omitted category are individuals who were working at baseline and completed weekly in person interviews. Regressions are attrition weighted (with weights corresponding to 1/number of weeks for which we have data for each individual). All regressions include week of interview dummies.
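The "Phone effect self-employed" row in panel B of Table 4 corresponds to the sum of the phone main effect and the phone-by-self-employed interaction from equation (2). A short sketch, using the point estimates printed in the table and illustrative variable names, shows the arithmetic; the small gaps relative to the printed row are presumably rounding of the underlying unrounded coefficients.

```python
# Point estimates copied from Table 4, panel B (columns 6 to 10).
outcomes = ["working", "days", "days_cond", "hours", "hours_cond"]
phone_main = [-0.01, -0.09, -0.02, -2.51, -2.22]
phone_x_self = [-0.12, -0.83, -0.23, -7.99, -3.21]
reported_total = [-0.14, -0.92, -0.24, -10.51, -5.43]


def total_effect(main, interaction):
    """Phone effect for the self-employed: main effect plus interaction."""
    return [round(m + x, 2) for m, x in zip(main, interaction)]


implied = dict(zip(outcomes, total_effect(phone_main, phone_x_self)))

# Absolute gap between the implied sum and the row printed in the table.
rounding_gap = {
    o: round(abs(implied[o] - r), 2) for o, r in zip(outcomes, reported_total)
}

for o in outcomes:
    print(f"{o}: implied {implied[o]:.2f}, gap vs table {rounding_gap[o]:.2f}")
```

The same sum-of-coefficients logic applies to the "Phone effect FLOH" and "Triweekly effect" rows in Tables 5 to 7.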
Table 5: The Impact of Survey Mode on Weekly Labor Market Reporting Among the Self-Employed
Heterogeneity among the self-employed
Phone weekly vs in person weekly (omitted category)

                                         Working     Days        Days          Hours       Hours
                                                                 conditional               conditional
                                                                 on work                   on work
                                         (1)         (2)         (3)           (4)         (5)
Phone weekly                             -0.18***    -1.05***    -0.19         -8.29       -1.08
                                         (0.06)      (0.40)      (0.28)        (5.29)      (5.63)
                                         [0.03]      [0.03]      [0.57]        [0.18]      [0.85]
Fixed location outside home (FLOH)       0.09***     0.89***     0.46*         13.54**     10.34*
                                         (0.03)      (0.31)      (0.23)        (6.18)      (6.08)
                                         [0.03]      [0.03]      [0.13]        [0.09]      [0.15]
Phone weekly*FLOH                        0.15*       1.05*       0.37          7.06        1.75
                                         (0.08)      (0.61)      (0.39)        (8.75)      (8.16)
                                         [0.15]      [0.15]      [0.46]        [0.53]      [0.85]
Mean in-person weekly (not in FLOH)      0.88        4.57        5.20          36.97       42.09
Phone effect FLOH                        -0.03       -0.01       0.18          -1.23       0.67
p                                        0.63        0.98        0.48          0.86        0.91
N                                        1269        1269        1053          1269        1053
Adjusted R2                              0.12        0.14        0.04          0.10        0.03

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses. Multiple-testing (FDR) adjusted q-values are presented in square brackets. The omitted category are individuals who were working at baseline and completed weekly in person interviews. Regressions are attrition weighted (with weights corresponding to 1/number of weeks for which we have data for each individual). All regressions include week of interview dummies.
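The table notes report FDR-adjusted q-values without stating which procedure produced them. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up adjustment, one common way to compute such values; it is an assumption that something similar was used, and the p-values are hypothetical rather than taken from the paper.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg step-up q-values for a list of p-values."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values are monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for five outcomes (not from the paper).
hypothetical_p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = bh_qvalues(hypothetical_p)

for p, qv in zip(hypothetical_p, q):
    print(f"p = {p:.3f} -> q = {qv:.5f}")
```

Under this procedure neighbouring hypotheses can end up with identical q-values, because each q-value is capped by the adjusted value of the next-largest p-value.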
Table 6: The Impact of Reference Period on Weekly Labor Market Reporting
Phone triweekly vs phone weekly (omitted category)

                                         Working     Days        Days          Hours       Hours
                                                                 conditional               conditional
                                                                 on work                   on work
Panel A: Main effects                    (1)         (2)         (3)           (4)         (5)
Phone triweekly                          0.07**      -0.14       -0.59***      -0.84       -4.70**
                                         (0.03)      (0.20)      (0.12)        (2.19)      (1.94)
                                         [0.05]      [0.61]      [0.00]        [0.70]      [0.04]
Mean phone weekly                        0.75        3.90        5.19          33.41       44.45
N                                        3220        3220        2530          3220        2530
Adjusted R2                              0.008       0.005       0.043         0.003       0.015

Panel B: Heterogeneity by
self-employment status                   (6)         (7)         (8)           (9)         (10)
Phone triweekly                          -0.01       -0.52*      -0.60***      -5.37*      -6.28**
                                         (0.05)      (0.27)      (0.17)        (3.01)      (2.68)
                                         [0.86]      [0.09]      [0.01]        [0.10]      [0.07]
Self-employed                            -0.09**     -0.54**     -0.12         -7.53***    -4.78*
                                         (0.04)      (0.24)      (0.13)        (2.84)      (2.60)
                                         [0.07]      [0.07]      [0.42]        [0.07]      [0.10]
Phone triweekly*Self-employed            0.13**      0.70*       0.04          8.39**      3.50
                                         (0.06)      (0.36)      (0.23)        (4.13)      (3.70)
                                         [0.07]      [0.09]      [0.86]        [0.09]      [0.42]
Mean wage employed phone weekly          0.81        4.24        5.26          37.85       46.97
Triweekly effect self-employed           0.13        0.17        -0.56         3.02        -2.78
p                                        0.00        0.50        0.00          0.31        0.30
Number of observations                   3220        3220        2530          3220        2530
Adjusted R2                              0.015       0.011       0.043         0.014       0.021

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses. Multiple-testing (FDR) adjusted q-values are presented in square brackets. The omitted category are individuals who were working at baseline and completed weekly phone interviews. Regressions are attrition weighted (with weights corresponding to 1/number of weeks for which we have data for each individual). All regressions include week of interview dummies.
Table 7: The Impact of Reference Period on Weekly Labor Market Reporting Among the Self-Employed
Heterogeneity among the self-employed
Phone triweekly vs phone weekly (omitted category)

                                         Working     Days        Days          Hours       Hours
                                                                 conditional               conditional
                                                                 on work                   on work
                                         (1)         (2)         (3)           (4)         (5)
Phone triweekly                          0.23***     0.93**      -0.25         10.14*      0.71
                                         (0.07)      (0.44)      (0.33)        (5.95)      (6.07)
                                         [0.01]      [0.07]      [0.47]        [0.13]      [0.91]
Fixed Location Outside of Home (FLOH)    0.24***     1.93***     0.82***       20.54***    11.98**
                                         (0.08)      (0.52)      (0.31)        (6.23)      (5.48)
                                         [0.01]      [0.01]      [0.03]        [0.01]      [0.07]
Phone triweekly*FLOH                     -0.18**     -1.44**     -0.59         -13.69*     -7.26
                                         (0.09)      (0.63)      (0.42)        (7.75)      (6.91)
                                         [0.08]      [0.06]      [0.21]        [0.12]      [0.34]
Mean phone weekly not in FLOH            0.66        3.28        4.95          26.37       39.80
Triweekly effect FLOH                    0.05        -0.50       -0.84         -3.55       -6.56
p                                        0.40        0.27        0.00          0.49        0.08
N                                        966         966         785           966         785
Adjusted R2                              0.106       0.079       0.031         0.082       0.022

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses. Multiple-testing (FDR) adjusted q-values are presented in square brackets. The omitted category are individuals who were working at baseline and completed weekly phone interviews. Regressions are attrition weighted (with weights corresponding to 1/number of weeks for which we have data for each individual). All regressions include week of interview dummies.

Appendix A Variable Definitions

This appendix consists of five tables (tables A.1–A.5) that define the variables used in the paper.
Table A.1 Definition of variables used from 2010, 2012 and 2013 rounds of the Ghana Urban Panel Survey

Variable              Definition
Male                  Dummy variable = 1 if male, 0 if female
Age                   Age in years
Education             Years of formal schooling
Married               Married = 1, 0 otherwise
Self-employed         Dummy variable = 1 if primary work activity is self-employment, 0 if primary work activity is wage employment, out of labor force, or unemployed
Wage-employed         Dummy variable = 1 if primary work activity is wage employment, 0 if primary work activity is self-employment, out of labor force, or unemployed
Dependency ratio      Number of household members younger than 15 or older than 64 divided by household size
Log hh assets         Log of household assets
Accra                 Dummy variable taking value 1 if the individual lives in Accra, 0 otherwise
Cape Coast            Dummy variable taking value 1 if the individual lives in Cape Coast/Takoradi, 0 otherwise

Table A.2 Definition of baseline variables

Variable              Definition
Self-employed         Dummy variable = 1 if primary work activity is self-employment, 0 if primary work activity is wage employment, out of labor force, or unemployed
Wage-employed         Dummy variable = 1 if primary work activity is wage employment, 0 if primary work activity is self-employment, out of labor force, or unemployed
Manufacturing         Dummy variable = 1 if respondent works in the manufacturing sector, 0 otherwise
Trade                 Dummy variable = 1 if respondent works in the trade sector, 0 otherwise
Services              Dummy variable = 1 if respondent works in the services sector, 0 otherwise
Public sector         Dummy variable = 1 if respondent works in the public sector, 0 otherwise
Pay per week          Usual weekly earnings in Ghanaian Cedis
Days worked per week  Total number of days worked per week
Hours worked per week Total number of hours worked in a typical week (calculated by multiplying the number of days worked per week by the number of hours worked per day)

Table A.3 Definition of variables used from weekly/triweekly interviews
Variable                    Definition
Phone weekly                Dummy variable = 1 if respondent was assigned to weekly phone interview treatment
In person weekly            Dummy variable = 1 if respondent was assigned to weekly in person interview treatment
Phone triweekly             Dummy variable = 1 if respondent was assigned to triweekly phone interview treatment
Working                     Dummy variable = 1 if respondent completed work over the reference period
Days                        Days worked per week
Days conditional on work    Days worked per week conditional on having reported any work that week
Hours                       Hours worked over the reference period (e.g. if the reference period is a week, it is the number of days worked per week x the number of hours worked per day; if the reference period is a day, it is the number of hours worked that day)
Hours conditional on work   Hours worked per week over the reference period (if the reference period is a week, the number of hours conditional on work is calculated by multiplying the number of days worked per week by the number of hours worked per day; if it is a day, it is simply the number of hours) conditional on having reported any work during the reference period

Table A.4 Definition of variables used from endline survey

Variable                    Definition
Self-employed               Dummy variable = 1 if primary work activity is self-employment, 0 if primary work activity is wage employment, out of labor force, or unemployed
Wage-employed               Dummy variable = 1 if primary work activity is wage employment, 0 if primary work activity is self-employment, out of labor force, or unemployed

Table A.5 Definition of variables used from 3-month follow up survey

Variable                    Definition
Fixed Location Outside of the House (FLOH)   (Self-employed only) Dummy variable = 1 if respondent’s place of work is outside of the home in a permanent structure, 0 otherwise (e.g. when the respondent works either at home or in a mobile location outside of the home).
Informal                    (Wage employees only) Wage employees who do not get paid for doing overtime, do not have a pension, and do not get paid sick leave or paid holidays. This definition of informality is ad hoc but allows us to identify roughly even-sized groups of informal and formal wage workers.

Appendix B Survey Design and Implementation

Table B1: Incentives

                                                  Participation fees (per completed interview)
                    Received    Weekly phone                  High frequency interviews
                    a phone     credit           Baseline     Weekly       3x weekly       Endline     3 Month Follow up
In person weekly    -           -                3            3            -               3           4
Phone weekly        Yes         1                3            3            -               3           4
Phone triweekly     Yes         1                3            -            2               3           4

Note: amounts in Ghanaian cedis (2013 exchange rate of 0.42 cedis to 1 dollar). Baseline interviews took place from August until September 2013, high frequency interviews were conducted between August and October 2013. Endline interviews were conducted between October and November 2013.

Table B.2 Results of verification check

                                                                   Congruence (percent)
Did you receive a phone at the time of the baseline interview?     98
Was it a new phone, packed in a box?                               97
How often were you interviewed per week?                           90
Employment status at baseline                                      93
Occupation at baseline                                             80

Note: Check administered to 5 percent of participants. Congruence means that the respondent gave an answer that was consistent with his or her treatment assignment or the answer he or she provided at baseline.
Table B.3 Completed high-frequency interviews as percent of scheduled interviews

| Week | Weekly in-person (N = 318): completed | Weekly in-person: percent | Weekly phone (N = 315): completed | Weekly phone: percent | Triweekly phone (N = 321): interview | Triweekly phone: completed | Triweekly phone: percent |
|---|---|---|---|---|---|---|---|
| 1 | 314 | 98.7% | 315 | 100.0% | 1 | 320 | 99.7% |
| | | | | | 2 | 257 | 80.1% |
| | | | | | 3 | 267 | 83.2% |
| 2 | 244 | 76.7% | 231 | 73.3% | 4 | 272 | 84.7% |
| | | | | | 5 | 290 | 90.3% |
| | | | | | 6 | 271 | 84.4% |
| 3 | 301 | 94.7% | 248 | 78.7% | 7 | 278 | 86.6% |
| | | | | | 8 | 283 | 88.2% |
| | | | | | 9 | 295 | 91.9% |
| 4 | 306 | 96.2% | 282 | 89.5% | 10 | 296 | 92.2% |
| | | | | | 11 | 289 | 90.0% |
| | | | | | 12 | 288 | 89.7% |
| 5 | 306 | 96.2% | 279 | 88.6% | 13 | 296 | 92.2% |
| | | | | | 14 | 293 | 91.3% |
| | | | | | 15 | 277 | 86.3% |
| 6 | 294 | 92.5% | 279 | 88.6% | 16 | 294 | 91.6% |
| | | | | | 17 | 298 | 92.8% |
| | | | | | 18 | 295 | 91.9% |
| 7 | 306 | 96.2% | 292 | 92.7% | 19 | 292 | 91.0% |
| | | | | | 20 | 284 | 88.5% |
| | | | | | 21 | 281 | 87.5% |
| 8 | 297 | 93.4% | 288 | 91.4% | 22 | 281 | 87.5% |
| | | | | | 23 | 282 | 87.9% |
| | | | | | 24 | 296 | 92.2% |
| 9 | 307 | 96.5% | 295 | 93.7% | 25 | 284 | 88.5% |
| | | | | | 26 | 297 | 92.5% |
| | | | | | 27 | 293 | 91.3% |
| 10 | 302 | 95.0% | 288 | 91.4% | 28 | 276 | 86.0% |
| | | | | | 29 | 288 | 89.7% |
| | | | | | 30 | 280 | 87.2% |
| Total | 2977 | 93.6% | 2797 | 88.8% | | 8593 | 89.2% |

Table B.4 Compliance with the High Frequency Survey

Dependent variable: Interview completed?

| | (1) coef | (1) se | (2) coef | (2) se | (3) coef | (3) se |
|---|---|---|---|---|---|---|
| Phone weekly (PW) | -0.06*** | 0.01 | -0.06*** | 0.01 | -0.05 | 0.04 |
| Phone triweekly (PT) | -0.05*** | 0.01 | -0.05*** | 0.01 | -0.02 | 0.03 |
| Controls | | | | | | |
| Male | | | -0.01 | 0.01 | -0.00 | 0.01 |
| Age | | | -0.00* | 0.00 | -0.00 | 0.00 |
| Education | | | -0.00 | 0.00 | -0.00 | 0.00 |
| Cape Coast | | | -0.03 | 0.02 | -0.00 | 0.02 |
| Kumasi | | | -0.01 | 0.01 | 0.01 | 0.01 |
| Wage employed | | | 0.00 | 0.02 | 0.07*** | 0.01 |
| Self-employed | | | 0.03*** | 0.01 | 0.03** | 0.01 |
| Dependency ratio | | | -0.02 | 0.02 | -0.01 | 0.03 |
| Assets | | | -0.00 | 0.00 | -0.00 | 0.00 |
| Hours (baseline) | | | -0.00 | 0.00 | -0.00 | 0.00 |
| Days (baseline) | | | -0.00 | 0.01 | -0.01 | 0.00 |
| Pay (baseline) | | | -0.00** | 0.00 | -0.00* | 0.00 |
| Manufacturing | | | 0.05 | 0.04 | 0.04 | 0.03 |
| Trade | | | 0.03 | 0.04 | 0.03 | 0.03 |
| Services | | | 0.04 | 0.03 | 0.02 | 0.03 |
| Public Sector | | | 0.08** | 0.04 | 0.06 | 0.04 |
| Controls * Phone weekly (PW) | | | | | | |
| PW*Male | | | | | 0.03 | 0.02 |
| PW*Age | | | | | 0.00 | 0.00 |
| PW*Education | | | | | -0.00 | 0.00 |
| PW*Cape Coast | | | | | -0.01 | 0.03 |
| PW*Kumasi | | | | | -0.00 | 0.02 |
| PW*Wage employed | | | | | 0.06*** | 0.02 |
| PW*Self-employed | | | | | 0.00 | 0.02 |
| PW*Dependency ratio | | | | | -0.07 | 0.05 |
| PW*Assets | | | | | 0.00 | 0.00 |
| PW*Hours (baseline) | | | | | -0.01* | 0.00 |
| PW*Days (baseline) | | | | | -0.01 | 0.01 |
| PW*Pay (baseline) | | | | | -0.00 | 0.00 |
| PW*Manufacturing | | | | | 0.09 | 0.06 |
| PW*Trade | | | | | 0.09* | 0.05 |
| PW*Services | | | | | 0.10* | 0.05 |
| PW*Public Sector | | | | | 0.11* | 0.06 |
| Controls * Phone triweekly (PT) | | | | | | |
| PT*Male | | | | | -0.01 | 0.02 |
| PT*Age | | | | | -0.00 | 0.00 |
| PT*Education | | | | | 0.00 | 0.00 |
| PT*Cape Coast | | | | | -0.04 | 0.03 |
| PT*Kumasi | | | | | -0.03 | 0.02 |
| PT*Wage employed | | | | | -0.08*** | 0.02 |
| PT*Self-employed | | | | | 0.00 | 0.02 |
| PT*Dependency ratio | | | | | 0.00 | 0.04 |
| PT*Assets | | | | | -0.00 | 0.00 |
| PT*Hours (baseline) | | | | | 0.00 | 0.00 |
| PT*Days (baseline) | | | | | 0.00 | 0.01 |
| PT*Pay (baseline) | | | | | -0.00 | 0.00 |
| PT*Manufacturing | | | | | -0.01 | 0.06 |
| PT*Trade | | | | | -0.01 | 0.06 |
| PT*Services | | | | | 0.00 | 0.06 |
| PT*Public Sector | | | | | 0.01 | 0.07 |
| Constant | 0.95*** | 0.00 | 0.99*** | 0.02 | 0.97*** | 0.02 |
| P-value joint F Controls | | | 0.06 | | 0.00 | |
| P-value joint F PW*Controls | | | | | 0.00 | |
| P-value joint F PT*Controls | | | | | 0.02 | |
| Number of observations | 15,890 | | 15,020 | | 15,020 | |
| Adjusted R2 | 0.005 | | 0.011 | | 0.013 | |

Note: *** p<0.01, ** p<0.05, * p<0.1. Standard errors are clustered by household.

Table B5: Comparison with representative household survey

| | | G-HFLS (1) | GLSS 6 (2) | Difference (3) | p-value (4) |
|---|---|---|---|---|---|
| Individual Characteristics | | | | | |
| Male | | 0.412 | 0.454 | -0.042 | 0.904 |
| Age | | 33.857 | 35.465 | -1.731 | 0.831 |
| Education | Female | 9.415 | 10.022 | -0.607 | 0.931 |
| | Male | 11.233 | 12.886 | -1.653 | 0.839 |
| Employed | Female | 0.646 | 0.768 | -0.122 | 0.749 |
| | Male | 0.731 | 0.834 | -0.102 | 0.800 |
| of which self-employed | Female | 0.692 | 0.732 | -0.040 | 0.935 |
| | Male | 0.503 | 0.414 | 0.090 | 0.887 |
| of which in manufacturing | Female | 0.075 | 0.132 | -0.057 | 0.882 |
| | Male | 0.105 | 0.113 | -0.008 | 0.984 |
| of which in trade | Female | 0.589 | 0.437 | 0.152 | 0.785 |
| | Male | 0.255 | 0.177 | 0.078 | 0.873 |
| of which in services | Female | 0.294 | 0.299 | -0.005 | 0.993 |
| | Male | 0.584 | 0.399 | 0.185 | 0.769 |
| of which in public sector | Female | 0.042 | 0.074 | -0.032 | 0.913 |
| | Male | 0.045 | 0.129 | -0.083 | 0.846 |
| Unemployed | Female | 0.068 | 0.030 | 0.038 | 0.804 |
| | Male | 0.046 | 0.028 | 0.018 | 0.918 |
| Not in the labor force | Female | 0.285 | 0.196 | 0.090 | 0.802 |
| | Male | 0.223 | 0.135 | 0.087 | 0.814 |
| Sample size | | N = 948 | N = 13204 | | |
| Household Characteristics | | | | | |
| Household size | | 4.410 | 3.649 | 0.760 | 0.782 |
| Dependency ratio | | 0.438 | 0.310 | 0.128 | 0.693 |
| Maximum education level in the household | | 11.537 | 14.630 | -3.093 | 0.881 |
| Female headed households | | 0.376 | 0.341 | 0.034 | 0.950 |
| Labor force participation rate (20-60 year olds) | | 0.747 | 0.585 | 0.162 | 0.660 |
| Asset index | | 0.000 | 0.102 | -0.102 | 0.939 |
| Sample size | | N = 354 | N = 7440 | | |

Note: For household characteristics, the GLSS sample is restricted to urban households. For individual characteristics, both samples are confined to urban respondents aged 20-60.

Table B6: Baseline Differences by Treatment Status: Self-Employed

| | Mean: In-person weekly (1) | Mean: Phone weekly (2) | Mean: Phone triweekly (3) | p-value: Phone weekly vs in-person weekly (2=1) | p-value: Phone triweekly vs in-person weekly (3=1) | p-value: Phone triweekly vs phone weekly (3=2) |
|---|---|---|---|---|---|---|
| A. Variables used in randomization | | | | | | |
| Male | 0.37 | 0.35 | 0.39 | 0.74 | 0.73 | 0.54 |
| Age | 36.57 | 38.22 | 37.36 | 0.18 | 0.49 | 0.48 |
| Education | 7.83 | 8.09 | 8.56 | 0.62 | 0.16 | 0.40 |
| Married | 0.57 | 0.61 | 0.60 | 0.60 | 0.68 | 0.89 |
| Self-employed | 1.00 | 1.00 | 1.00 | | | |
| Wage-employed | 0.00 | 0.00 | 0.00 | | | |
| Accra | 0.45 | 0.53 | 0.61 | 0.40 | 0.06 | 0.32 |
| Cape Coast | 0.09 | 0.04 | 0.05 | 0.29 | 0.46 | 0.66 |
| Dependency ratio | 0.37 | 0.41 | 0.43 | 0.29 | 0.08 | 0.49 |
| Log hh assets | 7.40 | 7.31 | 7.62 | 0.68 | 0.27 | 0.16 |
| p-value joint F test all variables used in A | | | | 0.72 | 0.36 | 0.88 |
| B. Variables not targeted in the randomization | | | | | | |
| Days worked per week | 5.69 | 5.57 | 5.58 | 0.43 | 0.44 | 0.97 |
| Hours worked per week | 53.82 | 46.92 | 49.30 | 0.03 | 0.13 | 0.44 |
| Manufacturing | 0.10 | 0.06 | 0.07 | 0.30 | 0.49 | 0.76 |
| Trade | 0.69 | 0.67 | 0.56 | 0.81 | 0.04 | 0.09 |
| Services | 0.21 | 0.26 | 0.36 | 0.37 | 0.01 | 0.11 |
| Public sector | 0.00 | 0.00 | 0.00 | | | |
| Pay per week | 116.87 | 103.60 | 135.61 | 0.57 | 0.49 | 0.24 |
| p-value joint F test all variables used in A and B | | | | 0.27 | 0.10 | 0.88 |
| Total number of individuals | 150 | 110 | 135 | | | |

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors are clustered at treatment level (Household).

Table B7: Baseline Differences by Treatment Status: Wage Employed

| | Mean: In-person weekly (1) | Mean: Phone weekly (2) | Mean: Phone triweekly (3) | p-value: Phone weekly vs in-person weekly (2=1) | p-value: Phone triweekly vs in-person weekly (3=1) | p-value: Phone triweekly vs phone weekly (3=2) |
|---|---|---|---|---|---|---|
| A. Variables used in randomization | | | | | | |
| Male | 0.56 | 0.59 | 0.55 | 0.72 | 0.89 | 0.63 |
| Age | 33.53 | 31.60 | 32.83 | 0.21 | 0.69 | 0.45 |
| Education | 9.40 | 10.03 | 9.51 | 0.31 | 0.87 | 0.41 |
| Married | 0.35 | 0.34 | 0.29 | 0.93 | 0.46 | 0.48 |
| Self-employed | 0.00 | 0.00 | 0.00 | | | |
| Wage-employed | 1.00 | 1.00 | 1.00 | | | |
| Accra | 0.63 | 0.61 | 0.57 | 0.85 | 0.60 | 0.73 |
| Cape Coast | 0.04 | 0.07 | 0.09 | 0.40 | 0.37 | 0.81 |
| Dependency ratio | 0.44 | 0.40 | 0.47 | 0.27 | 0.67 | 0.14 |
| Log hh assets | 7.13 | 7.53 | 7.59 | 0.09 | 0.05 | 0.83 |
| p-value joint F test all variables used in A | | | | 0.32 | 0.47 | 0.18 |
| B. Variables not targeted in the randomization | | | | | | |
| Days worked per week | 5.69 | 5.51 | 5.51 | 0.22 | 0.23 | 0.96 |
| Hours worked per week | 50.49 | 49.96 | 47.02 | 0.87 | 0.26 | 0.25 |
| Manufacturing | 0.08 | 0.10 | 0.11 | 0.66 | 0.52 | 0.90 |
| Trade | 0.16 | 0.10 | 0.15 | 0.31 | 0.82 | 0.36 |
| Services | 0.67 | 0.62 | 0.66 | 0.56 | 0.92 | 0.61 |
| Public sector | 0.09 | 0.14 | 0.09 | 0.28 | 0.86 | 0.18 |
| Pay per week | 139.95 | 146.23 | 103.97 | 0.85 | 0.25 | 0.07 |
| p-value joint F test all variables used in A and B | | | | 0.19 | 0.45 | 0.19 |
| Total number of individuals | 75 | 97 | 82 | | | |

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors are clustered at treatment level (Household).
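The completion percentages in Table B.3 can be reproduced by dividing completed interviews by scheduled interviews. The sketch below assumes that the number of scheduled interviews equals the group size N times the number of interview rounds (10 for the weekly arms, 30 for the triweekly arm); this assumption is inferred from the reported totals and is not stated in the table. The function name `completion_rate` is hypothetical.

```python
def completion_rate(completed: int, respondents: int, rounds: int = 1) -> float:
    """Completed interviews as a percent of scheduled interviews.

    Assumption: scheduled interviews = respondents x rounds.
    """
    scheduled = respondents * rounds
    return 100.0 * completed / scheduled


# Totals from Table B.3: (completed, N, number of rounds)
totals = {
    "Weekly in-person": (2977, 318, 10),
    "Weekly phone": (2797, 315, 10),
    "Triweekly phone": (8593, 321, 30),
}

for arm, (completed, n, rounds) in totals.items():
    print(f"{arm}: {completion_rate(completed, n, rounds):.1f}%")
```

Under this assumption the three totals come out at 93.6, 88.8, and 89.2 percent, matching the last row of Table B.3.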
Appendix C Aggregating Triweekly Interviews

Table C1 Interview Schedule

| Day interview was conducted | Number of interviews conducted | Percent of total |
|---|---|---|
| Sunday | 151 | 1.76 |
| Monday | 9 | 0.11 |
| Tuesday | 2,659 | 31.05 |
| Wednesday | 177 | 2.07 |
| Thursday | 2,722 | 31.79 |
| Friday | 131 | 1.53 |
| Saturday | 2,714 | 31.69 |

Table C2: Assessing Imputation Bias

| Day of the week HF interview refers to | Usual hours (a), conditional on working that day (reported at 3 month follow up) | Actual hours (b), conditional on working (average hours worked reported during high frequency survey) | Mean bias (a-b) | N |
|---|---|---|---|---|
| Sunday | 9.07 | 8.43 | 0.63 | 15 |
| Monday | 9.45 | 8.11 | 1.34 | 126 |
| Tuesday | 9.55 | 7.32 | 2.23 | 58 |
| Wednesday | 9.54 | 7.48 | 2.06 | 41 |
| Thursday | 8.92 | 8.02 | 0.90 | 26 |
| Friday | 8.96 | 8.10 | 0.86 | 25 |
| Saturday | 7.67 | 5.67 | 2.00 | 3 |
| Average | 9.36 | 7.85 | 1.51 | 294 |

Appendix D: Additional Results

Table D1: The Impact of Survey Mode on Weekly Labor Market Reporting – with controls

Phone weekly vs in person weekly (omitted category), with controls

| | Working | Days | Days conditional on work | Hours | Hours conditional on work |
|---|---|---|---|---|---|
| Panel A: Main effects | (1) | (2) | (3) | (4) | (5) |
| Phone weekly | -0.09*** | -0.52*** | -0.10 | -5.81*** | -3.28 |
| | (0.03) | (0.16) | (0.07) | (2.11) | (2.06) |
| | [0.00] | [0.00] | [0.19] | [0.01] | [0.14] |
| Mean in person weekly | 0.85 | 4.50 | 5.39 | 41.60 | 48.69 |
| N | 3,562 | 3,562 | 2,852 | 3,562 | 2,852 |
| Adjusted R2 | 0.073 | 0.165 | 0.272 | 0.247 | 0.329 |
| Panel B: Heterogeneity by self-employment status | (6) | (7) | (8) | (9) | (10) |
| Phone weekly | -0.02 | -0.15 | -0.05 | -4.71 | -5.38* |
| | (0.05) | (0.27) | (0.11) | (3.76) | (3.23) |
| | [0.68] | [0.68] | [0.68] | [0.61] | [0.37] |
| Self-employed | 0.05 | 0.18 | -0.14 | -2.60 | -7.60*** |
| | (0.05) | (0.30) | (0.12) | (3.81) | (2.94) |
| | [0.63] | [0.68] | [0.61] | [0.68] | [0.16] |
| Phone weekly*Self-employed | -0.11* | -0.62* | -0.07 | -1.84 | 3.60 |
| | (0.06) | (0.35) | (0.16) | (4.51) | (3.74) |
| | [0.37] | [0.37] | [0.68] | [0.68] | [0.63] |
| Mean wage employed in-person weekly | 0.82 | 4.40 | 5.35 | 41.75 | 50.78 |
| Phone effect self-employed | -0.23 | -1.24 | -0.15 | -3.68 | 7.20 |
| p | 0.05 | 0.08 | 0.64 | 0.68 | 0.34 |
| Number of observations | 3,562 | 3,562 | 2,852 | 3,562 | 2,852 |
| Adjusted R2 | 0.077 | 0.168 | 0.272 | 0.247 | 0.330 |

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses below the coefficient. FDR-adjusted q-values, which adjust for multiple testing, are presented in square brackets. The omitted category consists of individuals who were working at baseline and completed weekly in person interviews. Regressions are attrition weighted with weights corresponding to 1/number of weeks for which we have data for each individual. All regressions control for gender, age, marital status, education, days worked per week at baseline, hours worked per week at baseline, pay per week at baseline, self-employed at baseline, wage-employed at baseline, dummies for being employed in manufacturing, trade, services, and the public sector, household dependency ratio, log household assets, and week of interview dummies.

Table D2: The Impact of Reference Period on Weekly Labor Market Reporting – with controls

Phone triweekly vs phone weekly (omitted category)

| | Working | Days | Days conditional on work | Hours | Hours conditional on work |
|---|---|---|---|---|---|
| Panel A: Main effects | (1) | (2) | (3) | (4) | (5) |
| Phone triweekly | 0.08*** | -0.03 | -0.52*** | -0.14 | -3.84*** |
| | (0.03) | (0.18) | (0.11) | (1.73) | (1.48) |
| | [0.02] | [0.93] | [0.00] | [0.93] | [0.02] |
| Mean phone weekly | 0.74 | 3.84 | 5.18 | 32.96 | 44.50 |
| N | 2,925 | 2,925 | 2,284 | 2,925 | 2,284 |
| Adjusted R2 | 0.066 | 0.119 | 0.164 | 0.252 | 0.334 |
| Panel B: Heterogeneity by self-employment status | (6) | (7) | (8) | (9) | (10) |
| Phone triweekly | -0.01 | -0.47* | -0.56*** | -3.93 | -4.89** |
| | (0.05) | (0.28) | (0.17) | (2.91) | (2.36) |
| | [0.88] | [0.14] | [0.01] | [0.24] | [0.09] |
| Self-employed | -0.10** | -0.57** | -0.13 | -7.28** | -5.46** |
| | (0.05) | (0.28) | (0.14) | (3.20) | (2.63) |
| | [0.09] | [0.09] | [0.44] | [0.09] | [0.09] |
| Phone triweekly*Self-employed | 0.15*** | 0.76** | 0.07 | 6.52* | 1.82 |
| | (0.06) | (0.35) | (0.21) | (3.76) | (2.99) |
| | [0.08] | [0.09] | [0.80] | [0.14] | [0.63] |
| Mean wage employed phone weekly | 0.79 | 4.15 | 5.23 | 36.95 | 46.57 |
| Triweekly effect self-employed | 0.15 | 0.29 | -0.49 | 2.60 | -3.07 |
| p | 0.00 | 0.21 | 0.00 | 0.25 | 0.10 |
| Number of observations | 2,925 | 2,925 | 2,284 | 2,925 | 2,284 |
| Adjusted R2 | 0.073 | 0.124 | 0.164 | 0.255 | 0.335 |

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses. FDR-adjusted q-values, which adjust for multiple testing, are presented in square brackets. The omitted category consists of individuals who were working at baseline and completed weekly phone interviews. Regressions are attrition weighted with weights corresponding to 1/number of weeks for which we have data for each individual. All regressions control for gender, age, marital status, education, days worked per week at baseline, hours worked per week at baseline, pay per week at baseline, self-employed at baseline, wage-employed at baseline, dummies for being employed in manufacturing, trade, services, and the public sector, household dependency ratio, log household assets, and week of interview dummies.

Table D3: The Impact of Survey Mode on Weekly Labor Market Reporting – Heterogeneity among Wage Employees

Phone weekly vs in person interviews (omitted category)

| | Working | Days | Days conditional on work | Hours | Hours conditional on work |
|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) |
| Phone weekly | -0.01 | -0.01 | 0.03 | -2.85 | -2.94 |
| | (0.05) | (0.32) | (0.15) | (4.80) | (4.79) |
| | [0.99] | [0.99] | [0.99] | [0.99] | [0.99] |
| Informal | -0.07 | -0.38 | -0.00 | -4.26 | -0.84 |
| | (0.06) | (0.39) | (0.24) | (5.92) | (6.32) |
| | [0.99] | [0.99] | [0.99] | [0.99] | [0.99] |
| Phone weekly*Informal | 0.02 | 0.07 | -0.03 | 1.99 | 1.15 |
| | (0.08) | (0.52) | (0.29) | (7.15) | (7.15) |
| | [0.99] | [0.99] | [0.99] | [0.99] | [0.99] |
| Mean in-person weekly (formal) | 0.85 | 4.48 | 5.25 | 41.97 | 49.20 |
| Phone effects informal | 0.01 | 0.06 | 0.00 | -0.86 | -1.79 |
| p | 0.86 | 0.89 | 1.00 | 0.87 | 0.73 |
| N | 1,451 | 1,451 | 1,178 | 1,451 | 1,178 |
| Adjusted R2 | 0.024 | 0.024 | -0.004 | 0.012 | -0.006 |

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses. FDR-adjusted q-values, which adjust for multiple testing, are presented in square brackets. The omitted category consists of individuals who were working at baseline and completed weekly in person interviews. Regressions are attrition weighted with weights corresponding to 1/number of weeks for which we have data for each individual. All regressions include week of interview dummies.

Table D4: The Impact of Reference Period on Weekly Labor Market Reporting – Heterogeneity among Wage Employees

Phone triweekly vs phone weekly (omitted category)

| | Working | Days | Days conditional on work | Hours | Hours conditional on work |
|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) |
| Phone triweekly | -0.09 | -0.99*** | -0.70*** | -11.25*** | -9.65*** |
| | (0.06) | (0.36) | (0.27) | (3.95) | (3.60) |
| | [0.30] | [0.04] | [0.04] | [0.04] | [0.04] |
| Informal | -0.06 | -0.33 | -0.05 | -2.50 | 0.11 |
| | (0.06) | (0.35) | (0.16) | (3.99) | (3.27) |
| | [0.49] | [0.49] | [0.82] | [0.67] | [0.97] |
| Phone triweekly*Informal | 0.15 | 0.84 | 0.15 | 10.09 | 4.83 |
| | (0.09) | (0.57) | (0.35) | (6.20) | (5.23) |
| | [0.27] | [0.30] | [0.76] | [0.27] | [0.49] |
| Mean phone weekly (formal) | 0.83 | 4.40 | 5.29 | 38.95 | 46.82 |
| Phone effects informal | 0.06 | -0.15 | -0.55 | -1.15 | -4.82 |
| p | 0.38 | 0.73 | 0.02 | 0.81 | 0.22 |
| N | 1,273 | 1,273 | 1,017 | 1,273 | 1,017 |
| Adjusted R2 | 0.019 | 0.034 | 0.049 | 0.031 | 0.036 |

Note: *** p<0.01, ** p<0.05, * p<0.10. Standard errors clustered by household are presented in parentheses. FDR-adjusted q-values, which adjust for multiple testing, are presented in square brackets. The omitted category consists of individuals who were working at baseline and completed weekly phone interviews. Regressions are attrition weighted with weights corresponding to 1/number of weeks for which we have data for each individual. All regressions include week of interview dummies.
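Two pieces of arithmetic behind Tables D3 and D4 can be illustrated with a short sketch. First, the "effects informal" rows appear to equal the treatment coefficient plus the interaction coefficient; this reading is inferred from the reported numbers, not stated in the notes. Second, the attrition weights are described as 1/number of weeks observed per individual. The function names and the example identifiers below are hypothetical, and the p-values of the combined effects cannot be reproduced here because they require the coefficient covariances.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def combined_effect(main: float, interaction: float) -> float:
    """Treatment effect for the interacted group: main effect + interaction."""
    return main + interaction


def attrition_weights(person_ids: Sequence[Hashable]) -> List[float]:
    """One weight per person-week observation: 1 / weeks observed for that person."""
    weeks_observed = Counter(person_ids)
    return [1.0 / weeks_observed[pid] for pid in person_ids]


# Table D4, column (2) "Days": Phone triweekly and Phone triweekly*Informal.
days_informal = combined_effect(-0.99, 0.84)
print(round(days_informal, 2))

# Hypothetical panel: person "a" observed in 4 weeks, person "b" in 2 weeks.
ids = ["a", "a", "a", "a", "b", "b"]
weights = attrition_weights(ids)
print(weights)
```

With these weights every individual contributes a total weight of one, so respondents who completed more interviews do not dominate the estimates.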