Welfare Analysis of Changing Notches: Evidence from Bolsa Família

We analyze the welfare impacts of a reform that expanded an eligibility notch in one of the world’s largest cash transfer programs, Bolsa Família. We develop a novel framework to bound the welfare impacts of reforms to transfer programs featuring notches using two sufficient statistics: (1) the number of households bunching at the old notch who move toward the new notch, and (2) the number of households who “jump” down to the new notch. We estimate these two statistics using longitudinal administrative data and a difference-in-difference strategy. Despite strong evidence of behavioral responses to this reform, we find that the corresponding efficiency costs are small relative to the equity benefits: the reform’s MVPF is between 0.90 and 1.12. Because the Bolsa Família eligibility threshold is based on self-reported income, our findings suggest that the efficiency costs of targeting based on self-reported income can be relatively small even in high-informality settings.


Introduction
Cash transfers are among the most popular social safety net programs in developing countries, with around 130 low- and middle-income countries having at least one unconditional cash transfer program (Bastagli et al., 2016). Because cash transfer programs are typically targeted to poor households in high-informality settings, such programs have been criticized for their potential to create disincentive effects (e.g., misreporting, shifting from formal to informal employment, reductions in labor supply). While a number of papers have explored disincentive effects of cash transfer programs (e.g., Garganta and Gasparini (2015), De Brauw et al. (2015)), few have attempted to quantify the efficiency costs arising from these behavioral responses and trade them off against the equity benefits cash transfers provide. 1 In other words, little is known about the welfare effects of these very popular programs.
This paper helps fill this gap by analyzing the welfare impacts of a reform to one of the world's largest cash transfer programs, Bolsa Família (BF). BF, a means-tested anti-poverty program in Brazil, is a particularly interesting program to study for two reasons: (1) BF is one of the few cash transfer programs for which eligibility is based on self-reported household income (Fruttero, Leichsenring and Paiva, 2020), 2 and (2) the benefit schedule features a pronounced eligibility threshold (a notch) wherein households are eligible for benefits only if they report an income below this threshold. Given the large portion of Brazil's workforce in the informal sector (Henley, Arabsheibani and Carneiro, 2009), these two features generate substantial scope for income misreporting. We study a 2014 reform to BF in which the government increased both the eligibility threshold and the amount of money given to beneficiary households. We develop a novel sufficient statistics framework to bound the welfare impacts of reforms to transfer programs featuring notches and apply this framework to the 2014 BF reform using longitudinal administrative data. We find strong evidence of behavioral responses to this reform; however, we find that the corresponding efficiency costs are small relative to the equity benefits. Plugging our empirical estimates into our sufficient statistics framework, we find that the marginal value of public funds (MVPF) is between 0.90 and 1.12, implying that the reform was welfare improving if the government values giving R$0.90 to BF recipients more than spending R$1 on its next best alternative and welfare decreasing if the government values giving R$1.12 to BF recipients less than spending R$1 on its next best alternative. Because BF has high coverage of very poor households (Bastagli (2008), Lindert et al. (2007)), we argue that it is highly likely that the government is willing to spend R$1 to get R$0.90 to BF recipients.
This result highlights the importance of having theoretical frameworks that translate behavioral responses to social programs into welfare impacts, a point recently emphasized in Gerard and Gonzaga (2021). This paper begins by developing a novel theoretical framework to analyze the welfare impacts of changing notches in transfer schedules. In particular, we show how to bound the welfare impact of a notch reform using two sufficient statistics that capture the relevant behavioral responses: (1) the number of agents bunching at the original threshold who move towards the new threshold as a result of the reform, and (2) the number of agents who "jump" down to bunch at the new threshold as a result of the reform. Importantly, our bounds require only minimal structure on the agent problem: agents can have arbitrary preference heterogeneity, agents can have any number of choice variables (and, therefore, can respond to the reform in various ways), and agents may face optimization frictions such as limited choice sets and/or adjustment costs. As such, our framework can easily be adapted to study notch reforms in other transfer programs. Given the prevalence of notches in tax and transfer programs across the world (Slemrod (2010); Kleven (2016)), developing a novel sufficient statistics method to study notch reforms is a key contribution of our paper.
1. Bergolo and Cruces (2021), who explore a cash transfer program in Uruguay, are a recent exception.
2. While eligibility is based on self-reported income, the government cross-checks information provided by households with other administrative databases.
To build intuition for our framework, we first derive bounds on welfare in a simple, static model in which a government provides a constant transfer only to households who report an income below a certain threshold; hence, the transfer schedule features a notch. Households are endowed with an income and choose how much income to report to the government subject to misreporting costs. We derive bounds on the MVPF of changing the notch, comprising both an increase in the benefit level and eligibility threshold. The MVPF is the ratio of households' willingness-to-pay (WTP) for the reform relative to the government's budgetary cost of the reform (Hendren and Sprung-Keyser, 2020). We show that aggregate WTP can be bounded using two empirical objects: (1) the number of "bunching households" who bunch at the old notch and move towards the new notch, and (2) the number of "jumping households" who jump down to bunch at the new notch. To show this, we first note that for households who do not jump or bunch, their WTP is simply equal to the amount of additional money they receive. However, for the bunching and jumping households, their WTP differs from their increase in benefits; this is because we cannot simply appeal to the envelope condition in our setting as we consider arbitrary discrete reforms. Instead, we use revealed preference arguments to bound the amount that each bunching and jumping household is willing to pay. We can therefore bound aggregate WTP and, in turn, the MVPF provided we can observe the number of bunching and jumping households.
Because the welfare impact of a reform is equal to the MVPF multiplied by a normative welfare weight, our bounds on the MVPF enable us to make welfare statements about the impacts of changing a notch using estimable, reduced-form objects.
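As a rough illustration of how bounds on the MVPF translate into welfare statements, the sketch below classifies a reform from its MVPF bounds. It assumes a positive budgetary cost and a single average welfare weight on beneficiaries expressed relative to the shadow value of public funds; `relative_weight` and the function name are hypothetical and are not objects estimated in the paper.

```python
def classify_reform(mvpf_lower, mvpf_upper, relative_weight):
    """Classify a reform from MVPF bounds.

    relative_weight is an assumed normative input: the value the government
    places on R$1 in beneficiaries' hands relative to R$1 spent on its next
    best alternative. The reform raises welfare when
    relative_weight * MVPF exceeds 1.
    """
    if relative_weight * mvpf_lower > 1.0:
        # Even the most pessimistic MVPF passes the test.
        return "welfare improving"
    if relative_weight * mvpf_upper < 1.0:
        # Even the most optimistic MVPF fails the test.
        return "welfare decreasing"
    # The bounds straddle the break-even point.
    return "ambiguous"


# MVPF bounds reported in the paper; the weights are illustrative only.
for weight in (0.8, 1.0, 1.2):
    print(weight, classify_reform(0.90, 1.12, weight))
```

With the reported bounds of 0.90 and 1.12, any relative weight above 1/0.90 is enough to conclude that the reform was welfare improving, without taking a stand on where the MVPF lies inside the bounds.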
While our results are initially stated in the context of a simple, static misreporting model to build intuition, we show that our welfare bounds are highly robust to the agent problem. In particular, our bounds hold in models with any sort of behavioral response margin (e.g., agents can respond to the reform via labor supply responses instead of misreporting responses), arbitrary preference heterogeneity, adjustment costs, limited choice sets (e.g., agents face labor supply frictions), and (non-extreme) misperceptions of the benefit schedule. Moreover, we discuss how our bounds can be augmented to allow for dynamic decision making with uncertainty and for more complex policy environments. We therefore refer to the number of jumping and bunching households as sufficient statistics to bound the welfare impact of a notch change. We view this generality as a key contribution of our framework as not only does it allow for our framework to be adapted to analyze notch reforms in other transfer programs, but it also highlights a new way to use reduced-form bunching evidence to inform welfare analysis without making strong assumptions on the agent problem.
We then turn our attention to bounding the welfare impact of the June 2014 reform to the Bolsa Família benefit schedule. Prior to June 2014, households reporting a per-capita income below the extreme-poverty threshold of R$70 per-month were eligible for an unconditional monthly benefit of R$70 per-month, whereas households reporting an income above this threshold were not. 3 In June 2014, both the eligibility threshold and benefit were raised by 10%: the eligibility threshold increased from R$70 per-capita, per-month to R$77 per-capita, per-month, while the unconditional benefit increased from R$70 per-month to R$77 per-month.
We have access to administrative data spanning December 2011 to September 2016 from Cadastro Único, which is the Brazilian government's national registry used to determine eligibility for all federal social welfare programs. Using these data, we seek to estimate the number of original bunching households who move towards the new notch and the number of households jumping down to the new notch as a result of the reform. The number of bunching households is simply equal to the reduction in the bunching mass at the old notch, while the number of jumping households is equal to the increase in the bunching mass at the new notch less the reduction in the bunching mass at the old notch. However, when looking at the raw data, there are clear time trends in the number of households locating at the old and new notch in the pre-reform period. Thus, our identification challenge is to understand how the number of households reporting incomes at the old and new notch would have evolved in the post-reform period had the reform not occurred.
At a high level, our identification strategy is based on the following insight: the incentives to report an income at and above the original notch are affected by the reform, but the incentives to report an income below the original notch are unaffected by the reform. Hence, we can use portions of the reported income distribution below the original notch as control groups for underlying time trends in the reported income distribution at and above the original notch. To do so, we use a generalized difference-in-difference strategy as employed in, for example, Wolfers (2006) or Mora and Reggio (2013). A standard difference-in-difference estimator requires that the control groups and treatment groups have "parallel trends" in the pre-reform period that would have persisted in the post-reform period had the reform not occurred. Similarly, the generalized difference-in-difference estimator requires that there exist stable relationships, which are well approximated by relatively low order polynomials, between treatment and control groups which existed pre-reform and would have persisted in the post-reform period had the reform not occurred. In our setting, we discretize the reported income distribution into bins, using bins below the original notch as control bins while using bins that include the original and new notches as treatment bins. We then use the estimated relationships between treatment and control bins in the pre-reform period to create counterfactuals for how the treatment bins would have evolved absent the reform. Fundamentally, there are complex structural relationships between the number of people reporting in each income bin; these relationships are determined by how the true income distribution is evolving over time, how the true income distribution is mapped to the reported income distribution, as well as growth in applicants to the BF program over time.
Our generalized difference-in-difference strategy essentially assumes that the differences over time between the number of people reporting in each income bin are well approximated by (low-order) polynomials which would have persisted in the absence of the reform; this assumption appears to hold in the pre-reform period, lending credence to our identification assumptions. Moreover, we implement a number of placebo tests which further support the validity of our identification assumptions.
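The idea of extrapolating a pre-reform relationship between a treatment bin and a control bin can be sketched as follows. This is a deliberately simplified illustration, not the estimator used in the paper: it uses one treatment bin, one control bin, a polynomial in time fitted to the pre-reform gap between them, and synthetic noise-free data. The function name, the polynomial degree, and all numbers are assumptions made for the example.

```python
import numpy as np


def counterfactual_effect(treat, control, n_pre, degree=2):
    """Estimate post-period effects on a treatment bin.

    Fits a polynomial in time to the pre-period gap (treat - control),
    extrapolates that gap into the post-period, and subtracts the implied
    counterfactual from the observed treatment series.
    """
    treat = np.asarray(treat, dtype=float)
    control = np.asarray(control, dtype=float)
    t = np.arange(len(treat), dtype=float)
    gap = treat - control
    # Fit the treatment-control relationship on pre-reform periods only.
    coefs = np.polyfit(t[:n_pre], gap[:n_pre], deg=degree)
    predicted_gap = np.polyval(coefs, t[n_pre:])
    counterfactual = control[n_pre:] + predicted_gap
    return treat[n_pre:] - counterfactual


# Synthetic example: 18 pre-reform and 6 post-reform periods.
t = np.arange(24)
n_pre = 18
control = 1000 + 5 * t                # households in a control bin
gap = 200 + 3 * t + 0.1 * t**2        # smooth pre-existing gap between bins
true_effect = 500                     # jump in the treatment bin at the reform
treat = control + gap + true_effect * (t >= n_pre)

effects = counterfactual_effect(treat, control, n_pre)
print(effects.round(2))
```

Because the synthetic gap is exactly quadratic, the fitted polynomial recovers the imposed effect; with real data the quality of the counterfactual depends on how well a low-order polynomial describes the pre-reform relationship, which is what the pre-period fit and placebo tests are meant to assess.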
Using this generalized difference-in-difference strategy, we find that the number of households reporting incomes at the new notch increased by approximately 49,000 and the number of households reporting incomes at the old notch decreased by approximately 27,000 as a result of the reform. Hence, the reform induced 27,000 bunching households to move toward the new threshold and an additional 22,000 households to jump down to the new threshold. This translates into a lower bound for the MVPF of the reform of approximately 0.90 and an upper bound for the MVPF of approximately 1.12. In terms of welfare implications, as long as the government values giving R$0.90 (in a non-distortionary manner) to the BF households more than spending R$1 on its next best alternative, the reform was welfare improving. We argue that this is likely the case given that BF has high coverage of households in extreme poverty and given that the households misreporting to get the benefit typically fall in the poorest half of the population (Bastagli (2008) and Lindert et al. (2007)). A back-of-the-envelope calculation suggests that the welfare effect of spending R$1 on the BF reform is at least as high as the welfare effect of spending R$1.50 on a non-distortionary universal transfer.
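The mapping from estimated changes in bunching mass to the two sufficient statistics is simple arithmetic, shown here with the approximate estimates reported above. The function name is illustrative.

```python
def bunching_and_jumping(increase_at_new_notch, decrease_at_old_notch):
    """Split the change in bunching mass into the two sufficient statistics.

    Bunching households: those who leave the old notch.
    Jumping households: the extra mass at the new notch that is not
    accounted for by households leaving the old notch.
    """
    bunching = decrease_at_old_notch
    jumping = increase_at_new_notch - decrease_at_old_notch
    return bunching, jumping


bunching, jumping = bunching_and_jumping(49_000, 27_000)
print(bunching, jumping)  # 27000 22000
```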
Hence, our empirical findings contribute to the debate around targeted vs. universal transfers in developing settings (Hanna and Olken, 2018): even in a setting with a highly pronounced notch and substantial scope for misreporting, the efficiency cost generated by behavioral responses is simply not large enough to outweigh the equity gain associated with the increased generosity of benefits targeted to poor households (relative to a universal transfer).
Relationship to the literature: We contribute to the small but growing literature on estimating the welfare impacts of cash transfer programs in developing countries (e.g., Bergolo and Cruces (2021); Bergstrom and Dodds (2021b), Hanna and Olken (2018)). Existing papers on cash transfers in developing settings typically estimate impacts (e.g., behavioral responses), but often lack theoretical frameworks to infer associated welfare implications (Bergolo and Cruces (2021), Gerard and Gonzaga (2021)). We develop a theoretical framework to translate behavioral responses to notch reforms into welfare impacts, and, in doing so, show that despite strong evidence of behavioral responses to the BF reform, such responses are unlikely to be large enough to generate efficiency losses that outweigh the equity benefits. Our findings are similar to Bergolo and Cruces (2021) who estimate sizable equity benefits relative to efficiency costs for a cash transfer program in Uruguay that bases eligibility, in part, on reported incomes. These results provide some evidence against the commonly held belief that targeting transfer programs based on self-reported incomes will have substantial efficiency costs in settings with high informality. Thus, our results have important implications for the future design of cash transfer programs, especially if developing countries are to increasingly tie eligibility of social programs, in part, to reported incomes.
We also contribute to the large literature that explores bunching at notches and kinks. Typically papers in this literature use reduced-form bunching evidence to pin down structural parameters of interest (e.g., the elasticity of taxable income) by making assumptions on the agent optimization problem (Kleven, 2016). These parameters can then be used to inform the welfare impacts of previous or proposed policy changes, although, in this literature, they are more commonly used to identify alternative schedules that would improve welfare (see, for example, Best et al. (2015) or Bachas and Soto (2021)). However, the chief limitation of this approach is that translating a bunching mass into a structural parameter typically relies heavily on the modeling assumptions of the agent problem (Kleven, 2016). 4 Conversely, our framework uses reduced-form bunching evidence in a sufficient statistics approach, i.e., without making substantive parametric assumptions on the agent optimization problem. Hence, our approach is complementary to standard bunching approaches in the sense that they have opposing strengths and weaknesses: our framework avoids making strong assumptions on the agents' optimization problem, but our framework can only be used to answer a narrower set of questions than the standard bunching approach. Furthermore, our empirical strategy differs from standard bunching analyses. We focus on estimating changes in bunching that result from a reform, whereas standard bunching analysis typically focuses on estimating bunching at a given notch or kink. 5 Using a difference-in-difference strategy, we use portions of the distribution unimpacted by the reform to control for underlying time trends in portions of the distribution which are impacted by the reform.
Conversely, the standard bunching approach uses portions of the distribution unimpacted by the notch combined with smoothness assumptions on the underlying counterfactual distribution to identify bunching induced by the notch. One may wonder if we could have adapted the standard bunching approach to estimate changes in bunching while controlling for underlying time trends as done in Carril (2022); however, as discussed further in Section 4.3, our distribution of reported incomes is extremely non-smooth (we have extreme bunching at numbers equal to 0 mod 50, substantial bunching at numbers which were notches many years earlier, and less extreme bunching at numbers equal to 0 mod 10). While bunching estimation methods have been augmented to deal with "round-number" bunching and bunching at "reference-points" (e.g., Kleven and Waseem (2013) and Best and Kleven (2017)), we believe that the pervasiveness and the variability of round-number and reference-point bunching in our setting make it too difficult to precisely identify the counterfactual distributions around our notches (in particular, around the original notch of R$70 which is equal to 0 mod 10). Thus, our empirical strategy highlights an alternative way to estimate changes in bunching when smoothness assumptions cannot be used to estimate counterfactual distributions. 6

Finally, our paper contributes to the sufficient statistics approach for welfare analysis (Chetty, 2009). One broad contribution to this literature is showing how to apply a sufficient statistics framework to settings with large, discrete policy reforms when the envelope condition cannot be applied. Kleven (2021) discusses how to do approximate welfare analysis using an expanded set of sufficient statistics when reforms are large; however, he argues that these additional statistics are difficult to estimate. We overcome the need to estimate a large set of complex parameters by focusing on welfare bounds.
This paper is also related to Lockwood (2020) who observes that in the presence of notches, the sufficient statistic approach for the welfare analysis of tax systems needs to be augmented to include a correction term which captures the change in bunching at a notch in response to a tax reform. Our key theoretical contribution relative to Lockwood (2020) is the generality of our results. Whereas Lockwood (2020) characterizes the welfare impacts of particular infinitesimal reforms to notches in a model with quasi-linear utility, one dimension of agent heterogeneity, and one choice variable, our approach allows us to bound the welfare impacts of reforms to programs with notches of any size while placing very little structure on the agent problem. 7

The remainder of the paper is organized as follows. Section 2 discusses the structure of the Bolsa Família program, Section 3 introduces the theoretical framework for welfare analysis of changing notches, Section 4 discusses our strategy to empirically identify the number of jumping and bunching households for the June 2014 reform, Section 5 presents our results and robustness analysis, and Section 6 concludes.

Bolsa Família
In this section we discuss details of the Bolsa Família program, the June 2014 reform, and the data we have access to.
6. While methodologically different, our approach is conceptually similar to Best et al. (2020) who also use the behavior of non-bunching individuals to make inferences about the counterfactual behavior of bunching individuals.
7. We also allow for changes in the location of the notch, which turns out to be more technically complicated.

The Bolsa Família Program
The Brazilian anti-poverty program Bolsa Família (BF), which has been recently supplanted by Auxílio Households were then classified as eligible if their reported per-capita income fell below one of two thresholds. First, households reporting a per-capita income below the "extreme-poverty threshold" were eligible for an unconditional benefit (referred to as the "basic benefit"). Second, households with children reporting a per-capita income below the higher "poverty threshold" were eligible for a conditional benefit (referred to as the "variable benefit") provided that they made health and education investments in their children.
Notably, not all households reporting incomes below these thresholds received the benefit. This is because there was a quota (cap) on the number of beneficiaries per municipality. Prior to 2009, these quotas were based on the predicted number of households below the poverty threshold in each municipality. After 2009, these quotas were based on the predicted number of households below the poverty threshold scaled by 1.18 (Gerard, Naritomi and Silva, 2021). Hence, if quotas were binding in a given municipality, some eligible households may not have received benefits. 8

Finally, the MDS had several enforcement mechanisms to prevent income misreporting. First, during the interview, the income questions came at the end of the questionnaire so that questions on expenditures and assets could help the interviewer assess the veracity of the reported income (Bastagli, 2008). Second, during the interview, the applicant was reminded of her responsibility to provide true statements under penalty of losing the right to be eligible for government programs (Gazola Hellmann, 2015). Third, the ministry conducted audits, which could be triggered by citizens' complaints and cross-checks of registry data with other datasets such as administrative data on formal employment, deaths, or automobile purchases (Gazola Hellmann, 2015).
However, despite these attempts, the large informal sector in the Brazilian economy generated substantial scope for income misreporting.
8. In Section 3.3 we discuss how our framework can be augmented when only a fraction of eligible households receive benefits, and in Section 5.2, we discuss why this feature of the program is unlikely to impact our MVPF bounds.

The Transfer Schedule and the June 2014 Reform
Between July 2009 and June 2014, the extreme-poverty threshold was R$70 per-capita, per-month and the poverty threshold was R$140 per-capita, per-month. 9 The basic (unconditional) benefit was equal to R$70 per-month, while the variable (conditional) benefits were based on the number and ages of the children in the household (see Appendix B.4 for more information on the variable benefits). In June 2014, the government increased both the benefits and thresholds by 10%. Thus, the extreme-poverty threshold was raised from R$70 to R$77 per-capita, per-month, the poverty threshold was raised from R$140 to R$154 per-capita, per-month, the basic benefit was raised from R$70 to R$77 per-month, and the variable benefits were also increased by 10%. This reform was announced on national television by the president in April 2014.
Our main empirical analysis will focus on single individual households (households with one adult and no children). These households are not eligible for the variable benefits as they do not have children. Thus, prior to June 2014, these households were eligible for R$70 per-month if their reported income was less than or equal to R$70 per-month and 0 otherwise. After June 2014, these households were eligible for R$77 per-month if their reported income was less than or equal to R$77 per-month and 0 otherwise. Thus, the benefit schedule for these households has a single notch which increased both in level and location as a result of the June 2014 reform; see Figure 1. We focus on single individual households because in February 2013, the government instituted a guaranteed minimum income of R$70 per-capita for all households which was subsequently raised to R$77 in the June 2014 reform. 10 However, because the basic benefit is equal to the guaranteed minimum income, the benefit schedule for single individual households is not impacted by this minimum. In contrast, for all other households, this minimum creates a kink in the benefit schedule below the extreme poverty threshold (the location of this kink will vary based on household composition); moreover, the location of this kink changed with the 2014 reform. For example, prior to the reform, households with two adults and no children had a kink at the reported per-capita income level of R$35 which increased to R$38.5 post June 2014. 11 This is problematic because, as will be discussed in Section 4, one of our identification assumptions is that the reported income distribution below the extreme-poverty threshold is unaffected by the June 2014 reform; clearly this is not necessarily true for households with more than one member because the kink created by the guaranteed minimum income changes concurrently with the notch. Nonetheless, we will discuss the impacts of the June 2014 reform on households with more than one individual in Section 5.5.
9. In 2003, the extreme-poverty threshold and the poverty threshold were set to equal one-fourth and one-half of the monthly minimum wage of R$200, respectively. These thresholds have since been periodically adjusted for inflation; the time between adjustments is ad hoc and not linked to the minimum wage. Prior to June 2014, the last readjustment was in July 2009 (Gazola Hellmann, 2015).
10. This guaranteed minimum income was instituted earlier for households with children. In particular, this guarantee was instituted in June 2012 for households with children below the age of 6, in November 2012 for households with children below the age of 15, and in February 2013 for all remaining households.
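The notched schedule faced by single individual households before and after the June 2014 reform can be written as a small function. The function and dictionary names are illustrative; the benefit and threshold values are those stated in the text.

```python
def basic_benefit(reported_income, benefit, threshold):
    """Monthly basic benefit (R$) for a single individual household.

    The household receives the full benefit if its reported monthly income
    is at or below the threshold and nothing otherwise (a notch).
    """
    return benefit if reported_income <= threshold else 0.0


PRE_REFORM = {"benefit": 70.0, "threshold": 70.0}   # before June 2014
POST_REFORM = {"benefit": 77.0, "threshold": 77.0}  # after June 2014

for income in (60.0, 70.0, 75.0, 77.0, 80.0):
    pre = basic_benefit(income, **PRE_REFORM)
    post = basic_benefit(income, **POST_REFORM)
    print(f"reported R${income:.0f}: pre R${pre:.0f}, post R${post:.0f}")
```

A household reporting R$75, for example, is ineligible under the old schedule and receives the full R$77 under the new one, which is why reports between the old and new notch are directly affected by the reform.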
Finally, in June 2016, there was another reform to the BF program where both the benefit and the threshold were further increased. This reform, like the June 2014 reform, affected households of all compositions. As discussed next, our data end in September 2016; thus, we do not have sufficient data beyond June 2016 to analyze the June 2016 reform.

Data Sources and Sample Description
We have access to the Cadastro Único household registry, which is used to determine the eligibility of households for BF as well as all other targeted federal social programs (Veras Soares, 2011). Many of these other programs have eligibility criteria above the BF thresholds which explains the large number of ineligible applicants in the registry (see Table 1). 12 These other programs do not change concurrently with the BF reform that we analyze and are discussed in more detail in Appendix B.2. Table 1. Note, Table 1 also presents separate statistics for single individual households as our main empirical analysis will focus on these households.
11. In particular, prior to June 2014, a two member household with a reported per-capita income less than R$35, \(\hat{y} < 35\), will receive an additional monthly benefit equal to \(2(70 - \hat{y}) - 70\). E.g., a two member household reporting a per-capita income of R$20 will receive R$70 in the basic benefit and an additional benefit of R$30.
12. Moreover, the government aims to register all families with per-capita incomes below half the minimum wage (or total incomes below three times the minimum wage) (Veras Soares, 2011). The minimum wage was R$724 per-month in 2014; thus half the minimum wage was R$362, which is substantially higher than the highest BF threshold.
13. Every time the MDS analyzes the Cadastro Único data, it creates one of these extractions. Therefore, the frequency of the extractions is a result of previous data analyses by the ministry. Appendix B.3 contains a figure depicting the timeline of the data extractions.
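The top-up formula in footnote 11 and the kink locations mentioned in Section 2.2 can be checked numerically. The sketch below generalizes the two-member formula to a household with `n_members` adults and no children; that generalization, the function names, and the use of a single `level` for both the basic benefit and the guaranteed per-capita minimum (R$70 before the reform, R$77 after) are assumptions made for illustration, not formulas stated in the paper.

```python
def guaranteed_minimum_top_up(reported_pc_income, n_members, level):
    """Additional monthly benefit (R$) from the guaranteed minimum income.

    Assumed rule: the top-up fills the household's total gap to `level`
    per capita, net of the basic benefit (also equal to `level`), and is
    never negative. For n_members = 2 and level = 70 this reduces to the
    two-member formula 2 * (70 - y) - 70.
    """
    if reported_pc_income > level:
        return 0.0  # not eligible for the basic benefit at all
    gap = n_members * (level - reported_pc_income)
    return max(0.0, gap - level)


def kink_location(n_members, level):
    """Per-capita reported income at which the top-up reaches zero."""
    return level - level / n_members


print(guaranteed_minimum_top_up(20.0, 2, 70.0))        # 30.0
print(kink_location(2, 70.0), kink_location(2, 77.0))  # 35.0 38.5
print(guaranteed_minimum_top_up(50.0, 1, 70.0))        # 0.0
```

Under this assumed rule a single individual household never receives a top-up at any non-negative reported income, which matches the statement that its benefit schedule is not affected by the guaranteed minimum.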

Theoretical Framework for Welfare Analysis of Changing Notches
In order to analyze the welfare impacts of the 2014 BF reform, we need a theoretical framework. In this section we devise a sufficient statistics framework to bound the welfare impacts of changing a notch in a transfer program. While we will apply this framework to the BF reform, we believe this framework can be used more generally to analyze reforms to notches in other transfer programs. To begin, we derive bounds in a simple, static misreporting model. This simple setup is useful not only for building intuition but also because misreporting responses are likely common in response to the BF reform. We show that we can bound the welfare impact of a notch change using two empirical objects: (1) the number of households bunching at the old notch who move towards the new notch as a result of the reform, and (2) the number of households who jump down to the new notch as a result of the reform. We then show that these bounds hold in a much more general model which puts minimal structure on the household problem, thereby arguing that our two empirical objects are "sufficient statistics" to bound the change in welfare from a notch change.

Baseline Model Set-Up
To begin, we consider a static world in which households choose how much income to report, \(\hat{y}\), subject to a policy \(p = \{b, \tau\}\) where \(b\) denotes the level of the benefit and \(\tau\) denotes the eligibility threshold such that those reporting \(\hat{y} \le \tau\) receive \(b\) and those reporting \(\hat{y} > \tau\) receive nothing. 14 Households have two dimensions of heterogeneity: (1) endowed income \(y\) distributed according to CDF \(F(y)\), and (2) aversion to misreporting governed by \(\mu \in \{1, 2\}\) with probability mass function \(\pi(\mu)\). Type \(\mu = 1\) households are "truth-telling" households who never misreport, whereas type \(\mu = 2\) households are willing to misreport their income. For simplicity, we assume there is no fixed cost of reporting an income and that all households know the policy \(p\). We also assume households have quasi-linear utility in consumption. Type \(\mu = 1\) households therefore have utility under policy \(p\) given by:
\[ u(y, \mu = 1; p) = y + b \cdot \mathbf{1}(y \le \tau). \]
Utility for type \(\mu = 2\) households under policy \(p\) is equal to:
\[ u(\hat{y}; y, \mu = 2, p) = y + b \cdot \mathbf{1}(\hat{y} \le \tau) - v(y - \hat{y}) \, \mathbf{1}(y > \hat{y}), \]
where \(v(y - \hat{y}) \, \mathbf{1}(y > \hat{y})\) captures the disutility of reporting \(\hat{y}\) when true income is \(y\). We assume \(v' > 0\) and \(v'' \ge 0\) so that the cost of misreporting is increasing and convex in the discrepancy between true and reported income. 15 If a household is indifferent between reporting truthfully and over-reporting, we break their indifference by assuming they report truthfully.
14. For sake of parsimony, our simple framework abstracts from a number of complexities. For instance, we abstract from the presence of other tax and transfer programs. We also assume that all households reporting below the threshold \(\tau\) receive the benefit with certainty. We discuss in Section 3.3 how to incorporate these more complex policy environments into our framework, and in Section 5.2 we discuss how these additional complexities may impact our empirical results.
Optimal reported incomes, ŷ*, for type µ = 2 households are characterized as follows (see Appendix A.1 for a formal derivation):

$$\hat{y}^*(y, \mu = 2; p) = \begin{cases} y & \text{if } y \le \tau \\ \tau & \text{if } y \in (\tau, y^c(p)] \\ y & \text{if } y > y^c(p) \end{cases}$$

where y^c(p) is the income level at which households are indifferent between misreporting at τ and reporting truthfully, implicitly defined by y^c(p) + b − v(y^c(p) − τ) = y^c(p). In words, type µ = 2 individuals with y ≤ τ report truthfully and get the benefit, those with y ∈ (τ, y^c(p)] get the benefit by misreporting and bunching at the notch, and those with y > y^c(p) report their income truthfully and do not get the benefit.
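To make the reporting rule concrete, the following Python sketch computes the indifference income y^c(p) and the optimal report of a single household. It is an illustration only, under stated assumptions: the quadratic cost v(d) = κd² and all parameter values are hypothetical choices that satisfy v′ > 0 and v′′ ≥ 0 for d > 0, and the function names are ours rather than objects defined in the paper.

```python
def cost(d, kappa=0.01):
    """Hypothetical misreporting cost v(d) = kappa * d**2 for a discrepancy d >= 0."""
    return kappa * d * d


def indifference_income(b, tau, v=cost, hi=1e6, tol=1e-9):
    """Solve v(y_c - tau) = b for y_c by bisection.

    Assumes v is increasing on d > 0 and that v(hi - tau) >= b,
    so the root lies in [tau, hi].
    """
    lo = tau
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v(mid - tau) < b:
            lo = mid   # misreporting cost still below the benefit: move up
        else:
            hi = mid
    return 0.5 * (lo + hi)


def reported_income(y, mu, b, tau, v=cost):
    """Optimal report: truth-tellers (mu = 1) always report y; type mu = 2
    households bunch at tau when tau < y <= y_c and report y otherwise."""
    if mu == 1 or y <= tau:
        return y
    y_c = indifference_income(b, tau, v)
    return tau if y <= y_c else y


if __name__ == "__main__":
    b, tau = 70.0, 70.0   # hypothetical policy values
    print(indifference_income(b, tau))
    for y in (50.0, 100.0, 200.0):
        print(y, reported_income(y, 2, b, tau))
```

With this quadratic cost the threshold has the closed form y^c = τ + sqrt(b/κ), which offers a simple check on the bisection routine.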
Next, we define G(x; p) to capture the number of households reporting an income less than or equal to x under policy p:

$$G(x; p) = \sum_{\mu} \pi(\mu) \int \mathbb{1}\big(\hat{y}^*(y, \mu; p) \le x\big) \, dF(y)$$

Hence, G(τ; p) captures the number of households locating at or below the eligibility threshold τ. Finally, we assume that total social welfare under policy p, W(p), is given by a weighted sum of household utilities evaluated at optimal choices under policy p less the budgetary cost of the policy multiplied by the shadow value of public funds λ, which captures the welfare gain of spending a dollar on the government's next best alternative. In other words, we assume that spending $1 on the program comes at the cost of spending $1 less on some other program, which decreases welfare by a constant value λ. Welfare under policy p is therefore given by:

$$W(p) = \sum_{\mu} \pi(\mu) \int \phi(y, \mu) \, U(y, \mu; p) \, dF(y) - \lambda \, b \, G(\tau; p) \tag{3}$$

where U(y, µ; p) denotes the utility of a household with income y and type µ evaluated at its optimal report under policy p, and ϕ(y, µ) denotes the government's welfare weight on that household.

Welfare Effect of the Reform in the Baseline Model
Our goal is to evaluate the welfare impact of a reform from policy p = {b, τ } to p ′ = {b ′ , τ ′ }, defining {∆b, ∆τ } = p ′ − p. For ease of exposition, let us assume ∆b > 0 and ∆τ > 0 (i.e., we increase the level and location of the notch). Note that we are not restricting to infinitesimal reforms; the bounds we derive allow for arbitrary, discrete reforms to b and τ . Our goal is to derive bounds for W(p ′ ) − W(p) in terms of empirically observable objects (we discuss the advantages of focusing on bounding the welfare impact as opposed to characterizing the exact impact in Section 3.3). Hendren and Sprung-Keyser (2020) show that the welfare impacts of any policy reform can be expressed in terms of a normative welfare weight along with a positive sufficient statistic, denoted the marginal value of public funds (MVPF), which captures households' willingness-to-pay (WTP) for the reform relative to the total budgetary cost of the reform. Our goal then is to derive bounds for the MVPF.
Because the utility function is quasi-linear, a household with income y and misreporting type µ has a WTP for the reform implicitly defined by:

$$U(y, \mu; p') - WTP(y, \mu) = U(y, \mu; p)$$

where U(y, µ; p) denotes the household's utility under policy p evaluated at its optimal report. Equivalently, WTP is simply equal to the compensating variation. We now heuristically derive bounds on the WTP for all households impacted by the reform (we formally derive these bounds in the proof of Lemma 1). To do so, we split the set of households impacted by the reform into four groups and discuss the WTP for each group separately. We call these four groups the mechanical households, bunching households, threshold households, and jumping households. Figure 2 depicts which households fall into each of the four groups for a hypothetical change to the reported income distribution.
Figure 2 note: This figure shows a hypothetical density of reported incomes under the initial policy p = {b, τ} (in grey) and how this density changes as a result of a reform that changes the policy to p′ = {b′, τ′} (in black). The vertical grey line at τ and the vertical black line at τ′ represent the bunching households under the initial policy and the new policy, respectively. This figure also depicts which households are classified as mechanical households, bunching households, threshold households, and jumping households.

The mechanical households are the households who report at or below τ under policy p and who do not change their behavior in response to the reform. The number of mechanical households is given by M = G(τ; p′). Notably, M is not equal to the mass reporting at or below τ under policy p, as this mass includes bunching households who do update their behavior as a result of the reform. Conversely, those reporting at or below τ under p′ will also report at or below τ under p. Because the reform increases benefits for mechanical households by ∆b, their WTP for the reform is simply ∆b.
The bunching households are the households who misreport and bunch at the original threshold τ under policy p and move with the threshold as it is increased (i.e., they report in (τ, τ′] under policy p′). 16 Thus, the number of bunching households is equal to the reduction in households locating at or below τ as a result of the reform: B = G(τ; p) − G(τ; p′). The bunching households receive an increase in benefits equal to ∆b. Moreover, they experience a reduction in their misreporting costs as they move from τ to τ′. Hence, the bunching households have a WTP of at least ∆b. 17 Moreover, by revealed preference arguments, the most the bunching households can value this reduction in misreporting costs is b dollars. If they valued this reduction by more than b dollars, it would not have been optimal for these households to bunch at τ under policy p. 18 Thus, the bunching households' WTP for the reform is in [∆b, ∆b + b] = [∆b, b′].

The threshold households are the households who report in (τ, τ′] under policy p; we denote the number of threshold households by T = G(τ′; p) − G(τ; p). These households will not update their behavior in response to the reform because all households reporting above τ given policy p are reporting truthfully regardless of their type µ. Under p′ the optimal choice for these households is to continue reporting truthfully, as doing so allows them to receive b′ without incurring any misreporting costs. These households go from receiving no benefits to receiving b′ in benefits. Hence their WTP is equal to b′.
Finally, the jumping households are the households who report above τ′ given policy p but who report at τ′ given policy p′. In particular, households who were previously close to indifferent between misreporting at the threshold and truthfully reporting above the threshold but opted for the latter, i.e., type µ = 2 households with y ∈ (y^c(p), y^c(p′)], will now jump and misreport to the new threshold τ′. 19 Notably, we describe the behavioral response of these households as a "jump" because these households experience a discontinuous change in their optimal reported income as we move from p to p′. The number of jumping households, J, is given by J = G(τ′; p′) − G(τ′; p), which is equivalent to the increase in households reporting at or below the new threshold as a result of the reform (note that the number of jumping households is not equal to the mass of households locating at τ′ given policy p′, as some of the households locating at τ′ may be original bunching households who moved with the notch). By revealed preference, jumping households' utility is improved by changing their behavior; hence, their WTP is weakly positive. 20 Moreover, after the reform, jumping households get b′ more dollars but incur misreporting costs. Therefore jumping households' WTP cannot exceed b′, as misreporting costs are weakly positive. Hence the WTP of jumping households is in [0, b′]. Table 2 summarizes the bounds on the WTP for each of our four groups along with the number of households falling into each group and the cost that each group imposes on the government.

16. Not all bunching households will move all the way to τ′. In particular, bunchers with y ∈ (τ, τ′) will report ŷ = y under p′.

17. Even for an infinitesimal reform, the utility gain bunching households experience from moving with the notch has a first-order impact on welfare. This is because the envelope theorem cannot be applied for these individuals: in order to argue that the derivative of indirect utility with respect to the policy is equal to the derivative of utility with respect to the policy evaluated at optimal decisions, utility must be differentiable with respect to the policy given any fixed choices (see Theorem 2 of Milgrom and Segal (2002)). However, utility is actually discontinuous as a function of the parameter τ holding decisions fixed: for example, individuals reporting an income of τ see a discrete drop in consumption (and hence utility) if τ is reduced by any amount.

18. Suppose bunching households have a WTP for the relaxation in their misreporting costs greater than b: v(y − τ) − v(y − τ′) > b. This implies v(y − τ) > b. However, if v(y − τ) > b, bunching households would have preferred to report truthfully above τ over misreporting at τ under policy p.

19. For large changes in τ such that y^c(p) < τ′, only µ = 2 households with y ∈ (τ′, y^c(p′)] jump and misreport to the new notch.

20. For an infinitesimal reform, jumping households have a WTP of 0, as the only households who jump are those that are indifferent between locating at the notch and reporting truthfully. Thus, our lower bound for the WTP for jumping households is exact for an infinitesimal reform. Note, however, that the jumping households still have a first-order impact on social welfare through their effect on the government's budget, as each jumping household costs the government b′ dollars (despite the fact that the mass of jumpers is measure 0 for an infinitesimal reform; see Bergstrom and Dodds (2021a) for further discussion).
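Table 2

Group        Number of households           WTP         Cost to government (per household)
Mechanical   M = G(τ; p′)                   ∆b          ∆b
Bunching     B = G(τ; p) − G(τ; p′)         [∆b, b′]    ∆b
Threshold    T = G(τ′; p) − G(τ; p)         b′          b′
Jumping      J = G(τ′; p′) − G(τ′; p)       [0, b′]     b′

Note: This table shows the willingness-to-pay (WTP) and cost to the government for all the households impacted by the reform of moving from policy p to p′. We split the households into four groups: mechanical, bunching, threshold, and jumping households.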
This brings us to Lemma 1, which bounds the total WTP for the reform:

Lemma 1. If individuals solve Problem (2), the total WTP of the reform from p to p′ with p′ − p = {∆b, ∆τ} > 0 can be bounded as follows:

$$\Delta b \, (M + B) + b' T \;\le\; \text{Total WTP} \;\le\; \Delta b \, M + b' (B + T + J)$$

where M, B, T, and J denote the mass of mechanical, bunching, threshold, and jumping households mathematically defined in Table 2.

Moreover, using the cost per household to the government given in Table 2, we can express the total budgetary cost of the reform as follows:

$$\text{Total Cost} = \Delta b \, (M + B) + b' (T + J) = b' G(\tau'; p') - b \, G(\tau; p) \tag{4}$$

Hence, we can construct bounds for the MVPF of the reform:

Proposition 1. If individuals solve Problem (2), the marginal value of public funds of the reform from p to p′ with p′ − p = {∆b, ∆τ} > 0 can be bounded as follows:

$$MVPF_L \equiv 1 - \frac{b' J}{\text{Total Cost}} \;\le\; MVPF \;\le\; 1 + \frac{b B}{\text{Total Cost}} \equiv MVPF_U$$

Proof. This follows directly from Lemma 1 and Equation (4).
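As a worked illustration of how the bounds are assembled, the Python sketch below takes the four masses G(τ; p), G(τ; p′), G(τ′; p), G(τ′; p′) together with the benefit levels and returns the group sizes, the total cost, and the implied MVPF bounds, using the per-group WTP bounds and costs described in the text. The function name, argument names, and the numbers in the example are hypothetical and are not estimates from the paper.

```python
def mvpf_bounds(b, b_new, G_tau_p, G_tau_pnew, G_taunew_p, G_taunew_pnew):
    """Group masses, total cost, and MVPF bounds for a notch reform.

    G_tau_p       = G(tau;  p)     G_tau_pnew    = G(tau;  p')
    G_taunew_p    = G(tau'; p)     G_taunew_pnew = G(tau'; p')
    """
    db = b_new - b
    M = G_tau_pnew                      # mechanical households
    B = G_tau_p - G_tau_pnew            # bunching households
    T = G_taunew_p - G_tau_p            # threshold households
    J = G_taunew_pnew - G_taunew_p      # jumping households

    # Cost per household: db for M and B, b_new for T and J.
    total_cost = db * (M + B) + b_new * (T + J)
    # Lower WTP bounds: db (M), db (B), b_new (T), 0 (J).
    wtp_low = db * (M + B) + b_new * T
    # Upper WTP bounds: db (M), b_new (B), b_new (T), b_new (J).
    wtp_high = db * M + b_new * (B + T + J)

    return {
        "M": M, "B": B, "T": T, "J": J,
        "total_cost": total_cost,
        "mvpf_low": wtp_low / total_cost,
        "mvpf_high": wtp_high / total_cost,
    }


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the arithmetic.
    out = mvpf_bounds(b=100.0, b_new=120.0,
                      G_tau_p=1000.0, G_tau_pnew=800.0,
                      G_taunew_p=1100.0, G_taunew_pnew=1250.0)
    print(out)
```

Because the bounds are ratios of total WTP to total cost, the sketch also makes the limiting case transparent: when B = J = 0 the two bounds coincide at 1.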
The lower bound for the MVPF captures the fact that the lower bound for the WTP of the jumping households is 0 while the cost they impose on the government is b′ each (whereas the lower bound for the WTP of bunching households is exactly equal to the cost they impose on the government). Hence, if all jumping households value the reform at their lower bound, b′J/(Total Cost) of each dollar of spending is "wasted". Meanwhile, the upper bound for the MVPF captures the fact that the upper bound for the WTP of the bunching households is b′ while the cost they impose on the government is only b′ − b (whereas the upper bound for the WTP of jumping households is exactly equal to the cost they impose on the government). Hence, if bunching households all value the reform at their upper bound, the government "gains" bB/(Total Cost) for each dollar of spending. We can then use these bounds on the MVPF to construct bounds on the money metric welfare gain relative to the budgetary cost of the reform using Proposition 2: 21

Proposition 2. If individuals solve Problem (2) and social welfare is given by Equation (3), then the money metric welfare gain relative to the budgetary cost of the reform from p to p′ with p′ − p = {∆b, ∆τ} > 0 can be bounded as follows:

$$\omega_L \, MVPF_L - 1 \;\le\; \frac{W(p') - W(p)}{\lambda \cdot \text{Total Cost}} \;\le\; \omega_U \, MVPF_U - 1$$

where ω_L (ω_U) captures the weighted average money-metric welfare gain from giving a dollar to mechanical, bunching, threshold, and jumping households, where the weights are determined by the relative size of each group's lower bound (upper bound) for WTP.
Proof. We do not provide a separate proof for Proposition 2; we only provide a proof for Proposition 3, which nests Proposition 2 (see Appendix A.3).
Proposition 2 bounds the increase in total welfare from spending $1 on the reform. In particular, MVPF_L captures our lower bound on the total WTP of the mechanical, bunching, threshold, and jumping households when we spend $1 on the reform, while ω_L denotes the welfare gain, measured in dollars, of splitting $1 among the mechanical, bunching, threshold, and jumping households (where the split is determined by the lower bounds on each group's WTP for the reform). Subtracting the budgetary cost of $1 from ω_L · MVPF_L gives a lower bound for the total welfare gain of spending $1 on the reform. Symmetric logic explains why ω_U · MVPF_U − 1 is an upper bound for the increase in total welfare, measured in dollars, of spending $1 on the reform.
Finally, in light of Proposition 1, we can express M V P F L and M V P F U in terms of two positive objects: (1) the number of bunching households, and (2) the number of jumping households. Thus, the number of bunchers and jumpers are the empirical objects needed to construct bounds for the welfare impact of the reform. 22
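The bounds can also be checked numerically in the baseline misreporting model. The Python sketch below simulates a population of households, computes each household's exact WTP as the change in utility between the two policies (valid here because utility is quasi-linear), and compares total WTP with the bounds built from M, B, T, and J. Everything specific in it is an assumption made for illustration: the quadratic cost v(d) = κd², the uniform income distribution, the equal shares of the two types, and the policy values.

```python
import random


def v(d, kappa=0.01):
    """Hypothetical convex misreporting cost; zero when not under-reporting."""
    return kappa * d * d if d > 0 else 0.0


def choose(y, mu, b, tau):
    """Return (reported income, utility) for one household under policy (b, tau)."""
    if mu == 1 or y <= tau:
        # Truth-tellers, and households that are already eligible, report y.
        return y, y + (b if y <= tau else 0.0)
    bunch_utility = y + b - v(y - tau)
    if bunch_utility >= y:              # weakly prefers bunching at the notch
        return tau, bunch_utility
    return y, y                         # reports truthfully, no benefit


def simulate(n=20000, seed=0, p=(50.0, 70.0), p_new=(60.0, 80.0)):
    rng = random.Random(seed)
    households = [(rng.uniform(0.0, 300.0), rng.choice((1, 2))) for _ in range(n)]
    (b, tau), (b_new, tau_new) = p, p_new
    old = [choose(y, mu, b, tau) for y, mu in households]
    new = [choose(y, mu, b_new, tau_new) for y, mu in households]

    def G(x, outcomes):
        """Number of households reporting at or below x."""
        return sum(1 for reported, _ in outcomes if reported <= x)

    M = G(tau, new)
    B = G(tau, old) - G(tau, new)
    T = G(tau_new, old) - G(tau, old)
    J = G(tau_new, new) - G(tau_new, old)
    db = b_new - b
    total_wtp = sum(u1 - u0 for (_, u0), (_, u1) in zip(old, new))
    return {
        "M": M, "B": B, "T": T, "J": J,
        "lower": db * (M + B) + b_new * T,
        "total_wtp": total_wtp,
        "upper": db * M + b_new * (B + T + J),
        "cost": b_new * G(tau_new, new) - b * G(tau, old),
    }


if __name__ == "__main__":
    r = simulate()
    print(r)
    print("MVPF bounds:", r["lower"] / r["cost"], r["upper"] / r["cost"])
    print("exact MVPF :", r["total_wtp"] / r["cost"])
```

In this setup the exact MVPF is known because the cost function is specified; the point of the bounds is that they require only the four counts, not the cost function itself.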

Robustness to Model Specification
To highlight the robustness of our welfare bounds, we now show that we actually require very little structure on preferences or behavioral responses of agents. Suppose households have several decision variables denoted by the vector x within a choice set X. Household decisions are made conditional on primitives denoted by the vector θ ∈ Θ and the policy p. Households get the benefit b if their reported income ŷ, which can be a decision variable or a function of decision variables and primitives, is below τ. Household income, denoted y, is also potentially a function of decisions x. 23 Households solve:

$$U(\theta; p) = \max_{x \in X} \; u(c, x; \theta) \quad \text{s.t.} \quad c = y + b \cdot \mathbb{1}(\hat{y} \le \tau) \tag{5}$$

where c denotes consumption. We assume total welfare is given by a weighted sum of utilities, with welfare weights given by ϕ(θ):

$$W(p) = \int \phi(\theta) \, U(\theta; p) \, dF(\theta) - \lambda \, b \, G(\tau; p) \tag{6}$$

where λ represents the shadow value of public funds and $G(\tau; p) = \int_{\theta : \hat{y}(\theta, p) \le \tau} dF(\theta)$ represents the number of households receiving the benefit under policy p. More generally, we define $G(z; p) = \int_{\theta : \hat{y}(\theta, p) \le z} dF(\theta)$. This setup allows us to substantially generalize Proposition 1 and Proposition 2:

Proposition 3. Suppose households solve Problem (5), welfare is given by Equation (6), and τ′ > τ. Defining:

$$MVPF_L = 1 - \frac{b' J}{b' G(\tau'; p') - b \, G(\tau; p)}, \qquad MVPF_U = 1 + \frac{b B}{b' G(\tau'; p') - b \, G(\tau; p)}$$

with J = G(τ′; p′) − G(τ′; p) and B = G(τ; p) − G(τ; p′), we have:

$$\omega_L \, MVPF_L - 1 \;\le\; \frac{W(p') - W(p)}{\lambda \, \big( b' G(\tau'; p') - b \, G(\tau; p) \big)} \;\le\; \omega_U \, MVPF_U - 1 \tag{9}$$

where ω_L (ω_U) captures the weighted average money-metric welfare gain from giving a dollar to mechanical, bunching, threshold, and jumping households, where the weights are determined by the relative size of each group's lower bound (upper bound) for WTP.

21. Note that the welfare change is expressed in dollar units (as opposed to welfare units) as we divide through by the shadow value of public funds, λ.

22. Technically, we also need to know the total cost of the reform; however, Equation (4) shows that the total cost is a function solely of G(τ′; p′) and G(τ; p), which are needed to construct J and B.
Proposition 3 highlights that we can bound the MVPF and, consequently, bound the welfare impacts of the reform using empirically observable objects while only putting limited structure on the household problem. 24 25 In particular, to bound the MVPF we simply need to estimate how the number of people locating below the new notch changes as a result of the reform, J = G(τ′; p′) − G(τ′; p), as well as how the number of people locating below the old notch changes as a result of the reform, B = G(τ; p) − G(τ; p′). In the context of our simple baseline model, B equals the number of households bunching at the original notch and J equals the number of households jumping down to bunch at the new notch. However, if households solve the more general household problem (5), the interpretation of these terms may change. For instance, consider a labor supply model in which households can only work full-time, half-time, or not at all. In this case B = G(τ; p) − G(τ; p′) simply captures the change in the number of households reporting at or below the original notch as a result of the reform, which does not correspond to a reduction in the bunching mass at the original notch, as almost all households cannot precisely bunch. Similarly, J = G(τ′; p′) − G(τ′; p) simply captures the change in the number of households reporting below the new notch as a result of the reform, which does not correspond to the number of households jumping down to bunch at the new notch, as almost all households cannot precisely bunch. Nonetheless, for ease of exposition, we will continue to refer to B as the number of bunching households and J as the number of jumping households.

23. For example, in a labor supply model, x could equal household labor supply l, θ could include productivity n, and ŷ = y = nl.

24. We are slightly abusing notation here. Our upper bound for the MVPF, MVPF_U, is actually the upper bound on the MVPF from moving from policy p′ to p. This is because our upper bound on the WTP is actually an upper bound on the negative WTP of moving from policy p′ to p, or equivalently, it is an upper bound on the willingness-to-accept (WTA) to move from policy p to p′. We obviate this distinction in Proposition 1 by making the assumption that utility is quasi-linear in consumption, so that WTP = WTA.

25. Note that in Proposition 3 we have assumed τ′ > τ and that b′G(τ′; p′) − bG(τ; p) > 0. Assuming τ′ > τ is WLOG because one can also use Equation (9) to get bounds on the welfare gain from moving from ...
The core intuition for Proposition 3 is that we can again use revealed preference arguments to bound WTP for a reform from policy p to policy p ′ . And these revealed preference arguments require very little structure on household utility, household heterogeneity, the choice variables, or choice sets available to households.
Problem (5) can encompass a variety of realistic features: (1) households may respond by changing labor supply instead of misreporting their income (or respond on a variety of dimensions), 26 (2) households may face limited choice sets (e.g., restrictions on labor supply), (3) households may face reporting costs (e.g., hassle or time costs) and thereby face a decision of whether to report/update their income on the registry, 27 and (4) households may have a wide range of heterogeneity in their utility functions (e.g., households may have varying preferences over the labor/leisure trade-off or varying preferences to locate at round numbers). We view this robustness as perhaps the most important aspect of our theory: we can construct bounds for the MVPF of changing a notch in a manner which is, in large part, model-free.
But of course there are some implicit restrictions encoded in the assumed household problem (Problem (5)) used to prove Proposition 3. Perhaps most importantly, Proposition 3 requires that households correctly perceive the benefit schedule and the reform. The proof to Proposition 3 uses the fact that household reoptimization improves utility; if misperceptions are extreme for many households this may not be the case.
However, it is straightforward to extend Proposition 3 to the case in which households misperceive the schedule if we are willing to assume that, on average, behavioral responses to the reform from p to p′ improve welfare (i.e., perceptions are not so extreme that households, on average, harm themselves by responding to the reform; see Appendix A.6). For example, if some proportion of households are entirely unaware of the reform while the rest of the population is perfectly aware of the reform, our bounds hold, as unaware households will not respond to the reform whereas those who are aware of the reform improve their utility via their behavioral response. 28

Moreover, Problem (5) is also a static problem, so that Proposition 3 does not allow for dynamic decision making or uncertainty. We augment Proposition 3 in Appendix A.7 to show that we can bound the discounted welfare impact of the policy over time relative to the discounted total budgetary cost in a general dynamic model allowing for income dynamics, savings, and stochastic shocks. In this case, the relevant bounds for the MVPF are constructed using the discounted sum of the expected number of jumping households over time and the discounted sum of the expected number of bunching households over time. Proposition 3 also requires that there are no externalities from household decisions, so that decisions of one household do not directly impact the utility of any other household. Our bounds can be generalized to allow for externalities by augmenting the upper and lower bounds for the MVPF with an additional term measuring the WTP for these externalities relative to the total cost; however, measuring WTP for externalities is likely difficult in practice.

26. This is consistent with Feldstein (1999), who argues that the efficiency costs of taxation do not depend on whether behavioral responses occur on the labor supply margin or the misreporting margin.

27. To capture adjustment/updating costs, one could suppose x = ŷ_t and θ consists of current income y_t, aversion to misreporting µ, and prior reported income ŷ_{t−1}; households incur an adjustment cost k if their reported income today differs from prior reported income.

28. In the extreme case where all households are unaware, both B and J are 0 so that our upper and lower bounds for the MVPF coincide at 1, which is definitionally the MVPF of non-distortionary cash transfers.
Additionally, Proposition 3 can be augmented to allow for more complex policy environments. For instance, Proposition 3 holds even if agents only receive the benefit with some probability if they report below the notch (see Appendix A.4). The intuition is that this probability appears in both the numerator and denominator of the MVPF bounds (as it impacts both households' WTP for the reform and the total cost of the reform) and thus cancels out. Moreover, while Proposition 3 assumes that there is no underlying tax and transfer system beyond the benefit b given to those with a reported income ŷ ≤ τ, it can easily be extended to account for more complex underlying tax and transfer schedules; in this case, the total budgetary cost of the reform must include the impacts that behavioral responses have on other programs that impact the government's budget (i.e., we need to calculate the fiscal externalities associated with the reform). 29

Lastly, we discuss the advantages of bounding the welfare impacts (as opposed to exactly characterizing the welfare impacts) of a notch reform. First, because we cannot apply envelope conditions in our setting, exactly characterizing welfare impacts requires one to take a stance on the margins along which households respond, the functional form of their utility function, the sorts of frictions they face, etc. In contrast, bounding welfare impacts requires very little structure on the household optimization problem. In fact, we prove in Appendix A.5 that the bounds in Proposition 3 are as tight as possible without making additional assumptions on primitives. Second, our bounds on the welfare impact of a notch change are expressed in terms of estimable reduced-form objects, whereas exactly characterizing welfare impacts would require one to estimate a potentially large number of structural parameters.

Empirical Strategy
We now discuss our empirical strategy to estimate the number of bunching and jumping households for the June 2014 BF reform. To begin, we re-express our bounds for the MVPF in terms of parameters of the BF reform. Focusing on single individual households, the June 2014 reform changed the policy from p = {b, τ = 70} to p′ = {b′, τ′ = 77}. Thus, our bounds on the MVPF in period t are given by: 30

$$MVPF_{L,t} = 1 - \frac{b' J_t}{b' G_t(77; p') - b \, G_t(70; p)}, \qquad MVPF_{U,t} = 1 + \frac{b B_t}{b' G_t(77; p') - b \, G_t(70; p)}$$

where J_t = G_t(77; p′) − G_t(77; p) and B_t = G_t(70; p) − G_t(70; p′). We will estimate these bounds for June 2016, i.e., we will set t = t̄ = June 2016. 31 Thus, we need to estimate the number of jumping and bunching households as of June 2016, which, in turn, requires us to estimate the number of households locating below the old and new notch under both p and p′ at t̄: G_t̄(70; p), G_t̄(70; p′), G_t̄(77; p), and G_t̄(77; p′). However, because this was a national reform, only G_t̄(77; p′) and G_t̄(70; p′) are observed directly because only policy p′ was offered in period t̄. Thus, our goal is to estimate the number of households reporting less than or equal to R$70 and R$77 in period t̄ had the reform not happened: G_t̄(77; p) and G_t̄(70; p).

29. In particular, the total cost of the reform would equal b′G(τ′; p′) − bG(τ; p) + ∆R, where R(p) denotes net government spending under policy p (exclusive of spending on BF), and ∆R = R(p′) − R(p) captures the fiscal externalities of the reform. In this case, the lower and upper bounds for the MVPF are given by:

$$MVPF_L = \frac{b' G(\tau'; p') - b \, G(\tau; p) - b' J}{b' G(\tau'; p') - b \, G(\tau; p) + \Delta R}, \qquad MVPF_U = \frac{b' G(\tau'; p') - b \, G(\tau; p) + b B}{b' G(\tau'; p') - b \, G(\tau; p) + \Delta R}$$

Thus, in addition to B and J, we need to observe ∆R to assess the welfare impact of the reform.
Ideally, we would estimate Gt(77; p) and Gt(70; p) using control groups from random experimental variation (e.g., from staggering the implementation of the reform randomly across geographies); however, as mentioned above, the BF reform was a national reform implemented uniformly across Brazil starting June 2014. Consequently, our identification strategy will rely on using regions of the reported income distribution that were not impacted by the reform to control for underlying time trends in the portions of the reported income distribution that were impacted by the reform. This brings us to our two identification assumptions.

Identification Assumption 1
Our identification strategy relies on first finding a region of the reported income distribution which was not impacted by the reform. In the context of the baseline model in Section 3.1, the number of single individual households reporting an income strictly below R$70 should be unchanged by the reform. The intuition for this is that anyone who has a true income below R$70 always reports truthfully both pre- and post-reform, and anyone who has a true income above R$70 prefers to misreport at the threshold rather than misreport to an income level below the threshold (because misreporting costs are increasing in the distance between true and reported incomes). While in this baseline model bunching should occur precisely at R$70, in reality bunching is typically more diffuse due to small optimization errors and/or frictions (Kleven, 2016). Thus, our first identification assumption is that the number of people reporting incomes at or below R$63 is unaffected by the reform (i.e., we assume that R$63 is sufficiently far below R$70 such that there are no "bunchers" at or below R$63):

Identification Assumption 1. The distribution of reported incomes below R$63 is unaffected by the reform:

$$G_t(x; p') = G_t(x; p) \quad \text{for all } x \le 63$$

To provide suggestive evidence that Assumption 1 is reasonable, Figure 3 plots the number of single individual households reporting in income bins of size 7 from R$0 to R$63 between June 2012 and June 2016 (with the numbers in each bin normalized to 1 in June 2012). Figure 3 shows that it is not obvious any of ...

30. By writing the bounds for a specific time period, we are implicitly assuming agents are not making dynamic decisions (e.g., agents are myopic) so that we can apply Proposition 3 and calculate the impact of the reform for a given time period.

31. As mentioned in Section 2, there is another reform to the schedule in June 2016, so we cannot estimate the welfare impact of the reform beyond June 2016.
In particular, Bt is equal to the reduction in households locating in R$(63,70] as a result of the reform (i.e., the reduction in households bunching at the old notch), while Jt is equal to the increase in households locating in R$(70,77] less the reduction in households locating in R$(63,70] as a result of the reform (i.e., the increase in households bunching at the new notch less the reduction in households bunching at the old notch).
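The accounting in this paragraph can be written out in a few lines. The Python sketch below converts estimated effects of the reform on the log counts of the two treatment bins into counterfactual counts and then into B_t and J_t. It is a minimal illustration under stated assumptions: the function names are ours, the input counts and log effects are hypothetical, and the counterfactual is recovered by assuming the reform shifted each bin's log count by the stated amount.

```python
import math


def counterfactual_count(observed, log_effect):
    """Count that would have been observed absent the reform, assuming the
    reform shifted the log count of the bin by `log_effect`."""
    return observed * math.exp(-log_effect)


def bunchers_and_jumpers(obs_63_70, obs_70_77, effect_63_70, effect_70_77):
    """Return (B_t, J_t) from observed bin counts and estimated log effects.

    B_t: reduction in households in R$(63, 70] caused by the reform.
    J_t: increase in households in R$(70, 77] less that reduction.
    """
    cf_63_70 = counterfactual_count(obs_63_70, effect_63_70)
    cf_70_77 = counterfactual_count(obs_70_77, effect_70_77)
    B = cf_63_70 - obs_63_70
    J = (obs_70_77 - cf_70_77) - B
    return B, J


if __name__ == "__main__":
    # Hypothetical inputs: the old-notch bin halves, the new-notch bin triples.
    B, J = bunchers_and_jumpers(
        obs_63_70=400.0, obs_70_77=1500.0,
        effect_63_70=math.log(0.5), effect_70_77=math.log(3.0),
    )
    print(B, J)
```

With these hypothetical inputs the counterfactual counts are roughly 800 and 500, so B_t is about 400 and J_t is about 600.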
32. We discuss reasons why Identification Assumption 1 may fail to hold and explore robustness to Identification Assumption 1 in Section 5.3.
33. The total cost of the reform can be estimated from the components necessary to estimate B_t and J_t as Total ...

34. Similarly, the number of households on the Cadastro Único registry is growing over this time period (see Figure 12 in Appendix C.1). This too is likely due to a variety of factors including population growth, a struggling Brazilian economy, and increased awareness/understanding of the BF program over time.
35. Alternatively, households may wait to update simply because they are only required to update every two years and/or to avoid suspicion of misreporting that could result from updating immediately after the reform.

Identification Assumption 2
At a high level, our second identification assumption relates how the numbers reporting in bins below R$63 evolve over time to how the numbers reporting in our two bins of interest, R$(63, 70] and R$(70, 77], would have evolved over time in the absence of the reform. In this sense, bins below R$63 can be viewed as our "control bins" while R$(63, 70] and R$(70, 77] can be viewed as our "treatment bins". Building towards our second identification assumption and our main empirical specification, consider the following regression, where N_{(x−7,x],t} denotes the number of individuals reporting in income bin R$(x − 7, x] in month t: where post_t takes value 1 if month t is after the reform and 0 otherwise (i.e., post_t = 1 if t ≥ June 2014 and 0 otherwise). Regression (14) indicates an average trend break of −0.038 across these bins, indicating that the number of people reporting incomes in bins {R$(0, 7], R$(7, 14], ..., R$(56, 63]} was, on average, approximately 3.8% lower in June 2016 relative to the polynomial trend. However, under Identification Assumption 1, any post-reform trend break for these bins must be due to underlying time variation unrelated to the reform. Loosely speaking, our second identification assumption is that our two treatment bins, R$(63, 70] and R$(70, 77], would have seen the same deviation post-reform as the control bins had the reform not happened, i.e., these two bins also would have seen a 3.8% reduction in the number of people in June 2016 (relative to their bin-specific polynomial time trends) had the reform not happened. However, as can be seen in Figure ...

Identification Assumption 2. In the absence of the reform, the (log) number of people reporting in each bin evolves according to:

$$\log N_{(x-7,x],t} = \sum_{k=0}^{K} \alpha_{k,x} \, t^k + h(t) + \varepsilon_{x,t}$$

Under Identification Assumption 2, the difference between the (log) number of households in any two 7-increment bins below R$77 is, in the absence of the reform, governed by a stable polynomial plus a random error term.
Combining Assumptions 1 and 2, we can use our control bins (i.e., bins ≤R$63) to identify h(t) in the post-reform period. 36 In other words, we can use our control bins to identify the expected deviation from bin-specific polynomials in the post-reform period if the reform did not occur. Any deviation observed above and beyond h(t) in our two treatment bins is then attributed to the reform. This brings us to our main empirical specification: a generalized difference-in-difference specification where we allow for flexible pre-treatment dynamics between treatment and control bins: where \(\delta_t\) represents a set of month fixed effects; \(treat_x\) takes value 1 if x ∈ {70, 77} and 0 otherwise; and \(\sum_{k=0}^{K} \alpha_{k,x} t^{k}\) captures polynomial time trends for each bin (indexed by x) which predate the reform and are assumed to persist into the post-reform period had the reform not occurred. Under our two identification assumptions, the causal impact of the reform on the (log) number of households locating in R$(
Before we present the results from Equation (15) in Section 5 (using various polynomial degrees K), let us discuss the statistical and economic interpretations of Identification Assumption 2. From a statistical perspective, Identification Assumption 2 is simply a generalization of the standard difference-in-difference assumption (this generalization is employed in, for example, Wolfers (2006) and is discussed more generally in Mora and Reggio (2013)). 37 Setting K = 0 results in a standard difference-in-difference estimator in which the control groups and treatment groups are assumed to have "parallel trends" pre-reform that would have persisted in the post-reform period had the reform not occurred. The standard difference-in-difference strategy therefore assumes that the difference between the treatment and control groups follows a 0th-order polynomial in absence of the reform (i.e., that they differ by a constant). Larger values of K require progressively weaker parallel assumptions. For instance, setting K = 1 requires the "parallel growths" assumption, which asserts that the growth rates in the treatment and control groups over time are the same (i.e., that second and higher differences between treatment and control groups are constant over time). 38 Setting K = 2 leads to the "parallel accelerations" assumption, which asserts that the acceleration rates in the treatment and control groups over time are the same (i.e., third and higher differences between treatment and control groups are constant over time). Setting K > 2 results in what Mora and Reggio (2013) refer to as the "parallel-K" assumption, which asserts that the K + 1 and higher differences between the treatment and control groups are constant over time.
36. We use the distribution below R$63 rather than the distribution above R$77 to control for underlying time trends because the distribution above R$77 is presumably impacted by the reform given that the jumping households would have located above R$77 in absence of the reform. However, reported incomes sufficiently far above R$77 should not be impacted by the reform. The question is, how far is sufficiently far? Kleven and Waseem (2013) use an iterative procedure to determine how far above a notch is sufficiently far so that the distribution beyond this point is unaffected by the notch. Their procedure relies on equating the excess mass at and below the notch with the missing mass above the notch. However, such an exercise is not feasible in our setting due to extensive margin responses. In particular, while some of the "excess" households locating below R$77 would have located above R$77 in absence of the reform, some of them may have simply decided to not report an income and, thus, not be on the registry. Thus, we cannot equate the excess mass below R$77 with the missing mass above R$77. Consistent with our selection of control bins, Kleven (2016) notes that when extensive margin responses are strong, one should consider only using data below the notch to estimate the bunching mass.
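To make the generalized difference-in-difference logic concrete, the sketch below fits a specification with month effects, bin-specific degree-K polynomials, and a post-reform level and slope shift for each treated bin. The functional form is an assumption chosen to match the verbal description of Equation (15), the function name `fit_generalized_did` is hypothetical, and the data are synthetic rather than drawn from the registry. Standard errors and clustering are deliberately left out.

```python
import numpy as np

def fit_generalized_did(log_n, treated, post, K=3):
    """Least-squares fit of month effects, bin-specific degree-K polynomials,
    and a post-reform level and slope shift for each treated bin.

    log_n   : (n_bins, n_months) array of log counts
    treated : boolean array of length n_bins
    post    : boolean array of length n_months
    Returns {bin_index: (beta1, beta2)}, with beta2 per unit of rescaled time.
    """
    n_bins, n_months = log_n.shape
    t = np.arange(n_months) / n_months   # rescale time to [0, 1) for conditioning
    cols, where = [], {}

    def add(bin_idx, values):
        # Build one regressor that is non-zero only for the selected bin(s).
        col = np.zeros((n_bins, n_months))
        col[bin_idx, :] = values
        cols.append(col.ravel())
        return len(cols) - 1

    for m in range(n_months):            # month fixed effects, common to all bins
        add(slice(None), np.eye(n_months)[m])
    for x in range(n_bins):              # bin-specific polynomial trends
        for k in range(K + 1):
            add(x, t ** k)
    for x in np.flatnonzero(treated):    # treat_x * post_t * (beta1 + beta2 * t)
        where[int(x)] = (add(x, post * 1.0), add(x, post * t))

    X = np.column_stack(cols)
    # X is rank deficient (month effects absorb any common polynomial), so use
    # lstsq; the treatment coefficients stay identified via the control bins.
    coef, *_ = np.linalg.lstsq(X, log_n.ravel(), rcond=None)
    return {x: (coef[i], coef[j]) for x, (i, j) in where.items()}

# Synthetic, noise-free illustration: 9 control bins and 2 treated bins.
rng = np.random.default_rng(0)
n_bins, n_months = 11, 48
t = np.arange(n_months) / n_months
post = np.arange(n_months) >= 24
treated = np.zeros(n_bins, dtype=bool)
treated[[9, 10]] = True
common = rng.normal(size=n_months)                 # arbitrary common component h(t)
poly = rng.normal(size=(n_bins, 4))                # bin-specific cubic coefficients
log_n = common + sum(poly[:, [k]] * t ** k for k in range(4))
log_n[9] += post * (-0.12)                         # hypothetical effect, old-notch bin
log_n[10] += post * (1.6 + 0.3 * t)                # hypothetical effect, new-notch bin
est = fit_generalized_did(log_n, treated, post, K=3)
```

Because the synthetic data contain no noise, the fit returns the planted level and slope shifts. With real data one would add the clustered inference discussed in the text.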
But what is the economic meaning of Identification Assumption 2? Fundamentally, there are complex structural relationships between the number of people reporting in each income bin; these relationships are determined by how the true income distribution is evolving over time, how the true income distribution is mapped to the reported income distribution, as well as growth of the BF registry over time. By making Identification Assumption 2, we are essentially assuming that the differences over time between the number of people reporting in each income bin (which are determined by unknown structural relationships) are well approximated by polynomials which would have persisted in absence of the reform. 39 Identification Assumption 2 is not fully testable in the same way that the standard differences-in-differences assumption (that parallel pre-trends persist post treatment) is not fully testable. However, as in the case of the standard "parallel trends" assumption, we can gauge whether Identification Assumption 2 seems sensible by looking at pre-reform data. Identification Assumption 2 implies that the difference between the log number of households in any two 7-increment bins below R$77 is governed by a stable polynomial in absence of the reform. Appendix C.4 shows how the differences \(\log N_{(70,77]} - \log N_{(x-7,x]}\) and \(\log N_{(63,70]} - \log N_{(x-7,x]}\) evolved in the pre-reform period for x ∈ {7, 14, ..., 63}. These differences follow stable, low-order polynomial relationships in the pre-period. While this of course does not imply that these relationships would have persisted into the post-reform period (just as parallel pre-trends do not imply that the trends would continue on a parallel path post treatment), it is at least suggestive that stable relationships exist between the various income bins.
However, because we have several control bins, we can partially test the validity of our two identification assumptions. In particular, under Identification Assumption 1, our control bins should be unaffected by the reform. Thus, in the post-reform period, we can test whether the number of households locating in each of our control bins can be accurately predicted by our other control bins using Identification Assumption 2. Accordingly, at the end of Section 5, we run a number of placebo tests, which suggest that our two identification assumptions are reasonable.
37. Equation (2) in Wolfers (2006) uses an analogous generalized difference-in-difference accounting for differential quadratic pre-trends. Our Equation (15) is a combination of Equation (17) and Equation (20) from Mora and Reggio (2013).
38. A number of papers discuss the extended version of differences-in-differences under the "parallel growths" assumption (e.g., Mora and Reggio (2013), Rambachan and Roth (2020), and Bilinski and Hatfield (2019)).
39. One criticism of the standard bunching approach (see Blomquist et al. (2021)) is that smoothness assumptions used to identify counterfactual distributions around notches/kinks are implicitly assumptions on the shape of underlying primitive distributions. Similarly, Identification Assumption 2 is implicitly making assumptions about how primitives (e.g., endowed incomes, labor productivities, preferences over misreporting, etc.) are evolving over time. However, in contrast to the standard bunching approach, we can leverage the longitudinal nature of our data to explore the validity of these implicit assumptions by investigating whether the relationships between bins are stable in the pre-reform period and by doing placebo tests in the post-reform period.

Could We Use a Bunching Estimator to Estimate B and J?
Our empirical strategy amounts to estimating the reduction in the mass of households bunching at the old notch and the increase in the mass of households bunching at the new notch as a result of the reform using a difference-in-difference strategy. But one may wonder whether we could have used standard bunching techniques (developed in Kleven and Waseem (2013) and Saez (2010) (2022). However, this strategy is unlikely to be feasible in our setting. As seen in Figure 17 in Appendix C.3, the reported income distribution in Brazil is extremely non-smooth. For instance, in June 2014, in addition to substantial bunching at the notch of R$70, there is extreme bunching at numbers equal to 0 mod 50, less severe bunching at numbers equal to 0 mod 10, and substantial bunching at R$60. R$60 was a previous BF eligibility threshold (the notch of R$60 was implemented in 2006 while the notch of R$70 was implemented in 2009; see Gazola Hellmann (2015)). While bunching estimators have been augmented to deal with "round-number" bunching and other "reference-point" bunching (e.g., Kleven and Waseem (2013) and Best and Kleven (2017)), we believe that the pervasiveness and the variability of round-number and reference-point bunching in our setting will make it too difficult to precisely identify the counterfactual distributions around our notches (in particular, around the original notch of R$70 which is equal to 0 mod 10). Thus, our empirical strategy highlights an alternative way to estimate changes in bunching when smoothness assumptions cannot be used to estimate counterfactual distributions.

Results
In this section, we present results from our generalized difference-in-difference specification, Equation (15), and, in turn, calculate the number of bunching and jumping households along with our bounds on the MVPF.
Based on these bounds, we then discuss the welfare implications of the reform. We then show robustness of our results to a variety of factors: we show robustness to Identification Assumption 1 by allowing the number of households between R$(56, 63] to also be impacted by the reform; we show robustness to a more general version of Identification Assumption 2; and, we show robustness of our results to household composition.
Finally, we conduct a placebo exercise to explore the validity of our two identification assumptions.

Difference-in-Difference Results
First, we present results from estimating Equation (15) setting K = 3, which assumes the difference in the number of households reporting in any two income bins below R$77 approximately follows a cubic polynomial over time (in absence of the reform). Once we have estimated Equation (15), we can recover the causal impact of the reform on the log number of households in R$(x − 7, x] for x ∈ {70, 77} in any given month t, denoted \(\Delta \log N_{(x-7,x],t} = \beta_{1,x} + \beta_{2,x} t\). Figure 6
Translating these log changes into levels, the reform led to an increase of approximately 49,000 households locating in R$(70, 77] and a decrease of approximately 27,000 households locating in R$(63, 70]. Plugging these numbers into Equations (12) and (13), we get that \(B_t \approx 27{,}000\) and \(J_t \approx 22{,}000\). Thus, of the additional 49,000 households bunching at the new notch, 27,000 would have bunched at the old notch had the reform not happened, while 22,000 would have located above the new notch had the reform not happened (i.e., 22,000 households jump into the program as a result of the reform). 41 We repeat this exercise under the alternative assumptions that the differences in the (log) numbers in any two bins follow quadratic, quartic, or quintic polynomials over time, i.e., we re-estimate Equation
Columns (3) and (4) show the estimated number of bunching and jumping households for June 2016, \(B_t\) and \(J_t\), calculated using Equations (12) and (13). Columns (5) and (6) show the estimated upper and lower bounds for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are presented in parentheses and are computed via the delta method from the clustered standard errors estimated in Equation (15).
40. All standard errors for our analysis are clustered at the bin level (using Stata's default small-number-of-clusters bias adjustment) as this is the level of "treatment assignment"; see Abadie et al. (2017). Thus, we have 9 control clusters and 2 treatment clusters. However, Imbens and Kolesár (2016) show that these standard errors are modestly under-estimated when the number of clusters is small (they find that for samples with 10 clusters, 95% confidence intervals only have a 91% coverage rate). Hence, our standard errors may be modestly overstating the true statistical confidence we have in our estimates. An alternative approach is to use non-clustered wild bootstrapped p-values (note, wild cluster bootstrapped p-values lead to severe bias in difference-in-difference settings with a small number of treated clusters; see Roodman et al. (2019)). In our setting, these p-values are smaller than those generated from our cluster-robust standard errors. Finally, Ferman and Pinto (2019) suggest another alternative for difference-in-difference settings with a small number of treated clusters; unfortunately, their results rely on asymptotic theory in the number of control clusters, which we believe is inappropriate for our setting given that we only have 9 control clusters.
41. One may then wonder whether we estimate a reduction of 22,000 households locating above R$77 as a result of the reform. To investigate this, we can include income bins above R$77 in Equation (15) as additional treatment bins. Cumulatively, we estimate that income bins between R$77 and R$147 saw a decline of around 14,300 households when setting K = 3 (with a standard error of around 6,800). However, as mentioned in footnote 36, the set of jumping households consists of both households who would have located above R$77 in absence of the reform and households who would not have reported an income to the registry at all in absence of the reform (i.e., new entrants). Because of these new entrants, it is not surprising that we estimate the number of jumping households as exceeding the reduction in households reporting above R$77.
42. Our results are also robust to estimating Equation (15) with more granular income bins; see Appendix C.5.

Discussion
Our lower bound on the MVPF of the June 2014 BF reform is approximately 0.90 whereas our upper bound on the MVPF is approximately 1.12. Notably, these bounds are somewhat tightly centered around
1. This is because the number of mechanical households (households locating below R$70 both with and without the reform) is much larger than the number of bunching or jumping households. In particular, for  45 Moreover, we assume utility of consumption is CRRA: u(c) = c 1−γ /(1 − γ). Based on these assumptions, the government prefers giving R$0.90 (in a non-distortionary manner) to BF households more than spending R$1 on a non-distortionary UBI as long as γ > 0.14. While estimates for γ vary widely, typically estimates fall between 1 and 10 (Outreville, 2014). Alternatively, if γ = 1 (i.e., u(c) = log(c)), giving R$0.90 (in a non-distortionary manner) to BF households yields equivalent welfare gains to spending R$1.50 on a non-distortionary UBI. Thus, if the next best alternative policy is a non-distortionary UBI and the government is utilitarian, the BF reform was almost certainly welfare improving. This finding contributes to the debate around targeted vs. universal transfers in developing settings (Hanna and Olken, 2018): we find that even in a setting with a highly pronounced notch and substantial scope for misreporting, the efficiency cost generated by behavioral responses is simply not large enough to outweigh the equity gain associated with increased benefit generosity targeted to poor households (relative to a universal transfer).
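The utilitarian comparison with a universal transfer can be checked with a short calculation. The sketch below is a reconstruction under the stated assumptions (log-normal incomes pinned down by the Gini coefficient, CRRA utility, BF households drawn from the bottom half of the distribution); the closed form \(2\Phi(\gamma\sigma)\) for the relative welfare weight is derived here by completing the square in the normal density and is not taken from the paper. Function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def lognormal_sigma(gini):
    """Log-standard-deviation of a log-normal distribution with the given Gini."""
    return sqrt(2.0) * STD_NORMAL.inv_cdf((1.0 + gini) / 2.0)

def bottom_half_weight(gamma, sigma):
    """E[c^-gamma | c below the median] / E[c^-gamma] for log-normal c.

    Completing the square in the normal density gives 2 * Phi(gamma * sigma);
    the mean of the distribution cancels out of the ratio.
    """
    return 2.0 * STD_NORMAL.cdf(gamma * sigma)

sigma = lognormal_sigma(0.533)                      # Gini reported for Brazil, 2016

# UBI spending that matches the welfare gain of R$0.90 given to BF households
# when utility is logarithmic (gamma = 1).
ubi_equivalent = 0.90 * bottom_half_weight(1.0, sigma)

# Smallest gamma at which R$0.90 to BF households beats R$1 of UBI:
# solve 0.90 * 2 * Phi(gamma * sigma) = 1 for gamma.
gamma_star = STD_NORMAL.inv_cdf(1.0 / (2.0 * 0.90)) / sigma
```

Under these assumptions `ubi_equivalent` comes out at roughly R$1.5 and `gamma_star` at roughly 0.14, in line with the figures quoted above. The mean income does not enter because it scales both groups' marginal utilities equally.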
43. Technically, the reform is welfare improving as long as the government values spending R$1 on their next best alternative less than splitting R$0.90 (in a non-distortionary manner) among the mechanical, threshold, bunching, and jumping households, where the split is determined by the lower bounds on each group's WTP for the reform. Similarly, the reform is welfare decreasing if the government values spending R$1 on their next best alternative more than splitting R$1.12 (in a non-distortionary manner) among the mechanical, threshold, bunching, and jumping households, where the split is determined by the upper bounds on each group's WTP for the reform.
44. In 2016, the mean per-capita income in Brazil was R$615.32 and the Gini coefficient was 0.533; these parameters fully pin down the income distribution if we assume log-normality.
45. Bastagli (2008) finds that of all transfers paid in 2004, 91% went to households in the bottom 50% of the income distribution. Alternatively, Lindert et al. (2007) suggests even better targeting performance, finding that 94% of all benefits go to the bottom 40%. Notably, transfers are not evenly distributed across the bottom 50% of households: the poorest households within this group receive more (e.g., Bastagli (2008) finds that the bottom 20% of the income distribution receive 50% of BF transfers, while Lindert et al. (2007) report that the bottom 20% receive 73% of BF transfers). Thus, we believe the assumption that the income distribution of BF households is the same as the income distribution for the bottom 50% of income earners in Brazil is highly conservative.
Finally, when calculating our bounds for the MVPF, we have abstracted from a couple of complexities in the policy environment. First, we have ignored the impact that behavioral responses to the BF reform had on other components of the government's budget. There is evidence to suggest that some beneficiaries partially substitute informal sector employment for formal sector employment to become eligible for the BF benefit (as informal income is less easily verified, making misreporting less costly; see De Brauw et al. (2015)).
While it is easier to evade taxes in the informal sector, it is unlikely that this behavioral response will affect income tax revenues as BF beneficiaries are almost certainly sufficiently poor so as to be exempt from income taxation. 46 However, such behavioral responses may lead to reductions in payroll tax revenues (as argued by Bergolo and Cruces (2021) in the context of a cash transfer program in Uruguay). A reduction in payroll tax revenue would increase the total cost of the reform, leading to a reduction in both our lower and upper bounds for the MVPF (see footnote 29). However, there is an offsetting effect: Gerard, Naritomi and Silva (2021) recently showed that a reform to BF in 2009, which led to an increase in transfers, also led to a rise in formal sector employment of non-beneficiaries. They show that this offsetting effect dominates, generating a net positive impact on overall formal sector employment and tax revenue. Thus, we conjecture that the June 2014 BF reform may have had a small, positive impact on formal sector employment, generating a modest positive fiscal externality. A positive fiscal externality would raise our estimates for both the lower and upper bound of the MVPF, and would, thus, only reinforce our finding that the 2014 reform was welfare improving relative to a non-distortionary UBI.
Second, as mentioned in Section 2, not all eligible households receive the BF benefit (due to municipality-level quotas on the number of beneficiaries). We discussed an extension in Section 3.3 (and Appendix A.4) that showed that our MVPF bounds are robust to eligible households only receiving the benefit with some probability. However, this extension assumed that the probability of receiving the benefit does not vary with reported income (conditional on reporting an income below the threshold) and does not change with the reform (these assumptions allow us to cancel out the probability in both the numerator and denominator of the MVPF). Fortunately, this appears to be the case for the BF reform: using data from the Cadastro Único registries on which eligible households are BF beneficiaries, we see that in June 2014, 77.7% of single individual households reporting at or below R$70 are beneficiaries, with this percentage essentially constant across all eligible bins, and in June 2016, 77.6% of those reporting at or below R$77 are beneficiaries, with this percentage essentially constant across all eligible bins. These percentages are very similar to those in Gerard, Naritomi and Silva (2021), who find that in 2010, 79% of households reporting below the extreme poverty threshold are BF beneficiaries.

Robustness to Identification Assumption 1
We now explore robustness of our main results. First, Identification Assumption 1 may not hold if optimization frictions are large. For example, consider households who, in response to the reform, jump below the threshold by changing their labor supply but face large labor market frictions. These households may not be able to perfectly jump and bunch at the new notch, but may instead jump to an income level below R$63. Alternatively, under the original policy, some households may wish to bunch at R$70, but are only able to locate at an income below R$63 due to labor market frictions; if these households are able to update their income to an income in R$(70, 77], they may move to the new notch as a result of the reform. Hence, large labor market frictions may imply that the distribution below R$63 is impacted by the reform. To test for this possibility (or equivalently, to allow for larger frictions), we relax Identification Assumption 1 by instead assuming that the number of people reporting in bins ≤R$56 is unaffected by the reform, allowing for the possibility that the number reporting in bin R$(56, 63] is affected by the reform. We augment Equation (15) by setting \(treat_x = 1\) if x ∈ {63, 70, 77} and 0 otherwise, i.e., we include R$(56, 63] as a treatment bin. Figure 7 plots the actual and counterfactual paths for \(\log N_{(56,63],t}\), \(\log N_{(63,70],t}\), and \(\log N_{(70,77],t}\). It does not appear that the reform had a significant impact on the number reporting in R$(56, 63], and the impact of the reform on our two original treatment bins is robust to including R$(56, 63] as a treatment bin. Table 7 in Appendix C.6 presents results for this alternative specification.
Note: We assume the difference in the (log) number reporting in any two bins follows a cubic time trend in absence of the reform (i.e., we set K = 3 in Equation (15)). Confidence intervals are constructed from clustered standard errors at the bin level. The timing of the reform is indicated by the gray, shaded region.
46. For example, in 2015, individuals earning less than R$1903.98 per month were exempt from income taxation.

Robustness to Identification Assumption 2
Identification Assumption 2 essentially boils down to assuming that the differences over time between the (log) number of people reporting in each income bin (which are determined by unknown, complex structural relationships) are well approximated by polynomials which would have persisted in absence of the reform.
Identification Assumption 2 therefore implies that if, for example, bin R$(35, 42] experiences a 1% deviation from its bin-specific polynomial trend, bin R$(63, 70] would also experience a 1% deviation from its bin-specific polynomial trend in absence of the reform. However, one may argue that, for example, a structural change in the economy which leads to a 1% deviation from the bin-specific polynomial time trend for bin R$(35, 42] would lead to a 2% deviation for bin R$(63, 70]. As such, we relax Identification Assumption 2 to instead assume that the log number reporting in each bin evolves according to:
Thus, we now assume that deviations from underlying bin-specific polynomials (in logs or, equivalently, percentage terms) are not necessarily the same across all bins. This relaxation of Identification Assumption 2 allows for a structural change in the economy which creates a 1% deviation from bin-specific polynomial trend for R$(35, 42] to create a \((\gamma_{70}/\gamma_{42})\)% deviation from bin-specific polynomial trend for bin R$(63, 70].
Under this relaxed version of Identification Assumption 2, we estimate the following non-linear least squares regression: 47 Essentially, running Regression (16) will estimate a common set of month dummies, δ t , as well as bin-specific factors, γ x , which multiply these common month dummies for each bin x. We show in Appendix C.7 that our results are very robust to this relaxation of Identification Assumption 2.
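One way to write the relaxed assumption and Regression (16) that is consistent with the description above is sketched below. This is an assumed form, not a quotation: the symbols \(h(t)\), \(\gamma_x\), \(\delta_t\), \(\alpha_{k,x}\), \(\beta_{1,x}\), and \(\beta_{2,x}\) follow the surrounding text, while the exact functional form and any normalization of \(\gamma_x\) (for example, fixing \(\gamma_x = 1\) for one reference bin so that the scale of \(\delta_t\) is pinned down) are assumptions.

```latex
% Relaxed Identification Assumption 2 (assumed form):
% each bin loads on the common component h(t) with its own factor gamma_x
\log N_{(x-7,x],t} = \gamma_x\, h(t) + \sum_{k=0}^{K} \alpha_{k,x}\, t^{k} + \varepsilon_{x,t}

% Regression (16) (assumed form): nonlinear in parameters because gamma_x multiplies delta_t
\log N_{(x-7,x],t} = \gamma_x\, \delta_t + \sum_{k=0}^{K} \alpha_{k,x}\, t^{k}
  + treat_x \cdot post_t \cdot \left(\beta_{1,x} + \beta_{2,x}\, t\right) + \varepsilon_{x,t}
```

Setting \(\gamma_x = 1\) for every bin recovers the baseline specification, which is the sense in which Identification Assumption 2 is nested in this form. A common shock that moves bin R$(35, 42] by 1% then moves bin R$(63, 70] by \(\gamma_{70}/\gamma_{42}\) percent.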

Robustness to Household Composition
We also consider the possibility that some of the estimated behavioral impact of the reform for single individual households may be coming from households misreporting their family composition. For example, a two adult household can receive greater benefits if they report to be two separate one adult households as benefits are paid out per-household as opposed to per-capita. 48 Hence, the reform may have increased incentives to misreport family composition as well as income. 49 From a theoretical perspective, Proposition 3 holds even if misreporting responses occur on the family composition margin. However, such a behavioral response may affect the validity of our identification strategy (for example, it may no longer be reasonable to assume that the distribution below R$63 is unaffected by the reform if these new "single individual" households enter at income levels well below the threshold). Thus, we re-do our main analysis restricting to single individual households whose composition does not vary over the sample period. Our point estimates for the lower bound for the MVPF still lie between 0.88 and 0.91, while our point estimates for the upper bound are now smaller lying between 1.01 and 1.05; see Appendix C.8.
Finally, our main analysis is restricted to single individual households because the schedule for households with more than one individual features a notch at R$70 and a kink below R$70 (e.g., this kink is at R$35 for two adult households with no children; see Section 2). Consequently, the 2014 BF reform led to both a change in the notch and a change in the kink for households with more than one individual. Thus, for these households, it is not necessarily reasonable to make Identification Assumption 1 as we might expect the reform to affect the number of people reporting in bins around the kink. We nonetheless proceed with estimating the bounds on the MVPF for two adult households without children, noting that our identification assumptions are imperfect in this setting. For these households we see the same pattern of behavioral responses to the reform and estimate lower bounds for the MVPF between 0.88 and 0.96 and upper bounds for the MVPF between 1.08 and 1.12; see Appendix C.9. 50
47. Clearly, Identification Assumption 2 is nested within this more general functional form.
48. E.g., a household with two adults and no children that has a combined per-capita income of R$60 is eligible for R$70 in transfers pre-reform and R$77 in transfers post-reform. However, if this household were to report that they were actually two single individual households with incomes of R$60, they would each be eligible for R$70 in transfers pre-reform and R$77 in transfers post-reform.
49. However, the ability of households to misreport the number of members is limited as individuals must provide government-issued IDs for all family members to be on the registry. Moreover, household composition is arguably more verifiable than income for many households given the large informal sector in Brazil. Thus, we suspect that households are more likely to misreport income than family composition.

Placebo Tests
We can partially test the validity of our two identification assumptions via placebo tests. Together, Identification Assumptions 1 and 2 imply that all of the income bins below R$63 should evolve approximately according to a common time trend plus a bin-specific polynomial throughout the entire analysis period. To test this, we re-estimate Equation (15) but only include bins below R$63 (i.e., x ∈ {7, 14, ..., 63}) and randomly assign some of these bins to be "treatment" bins (i.e., \(treat_x = 1\)). If both of our identification assumptions are correct, the "treatment effects" from these placebo regressions should be close to zero, i.e., the post-reform deviation of the "treated" bins relative to the post-reform deviation of the "control" bins should be close to 0. There are a total of 510 different ways to assign our nine bins below R$63 "treatment" status. 51 Figure 8 shows a fairly tight clustering of these treatment effects around zero: the mean "treatment effect" (in absolute value) across all bins is around 4% (0.04 log points). Hence, the "treatment effects" are relatively small for bins below R$63. 52 This suggests that Identification Assumptions 1 and 2 are fairly reasonable.
Note: This figure shows the placebo "treatment effects" (in logs) for each income bin R$(x − 7, x] obtained from estimating Regression (15)
(15) in a manner akin to randomization inference. 53 In Figure 8, the solid red line shows the treatment effect of about 1.6 log points for R$(70, 77] while the dashed red line shows the treatment effect of about -0.12 log points for R$(63, 70], both from our preferred specification (K = 3). The treatment effect for R$(70, 77] is an order of magnitude larger than any estimated "treatment effect" from placebo regressions, giving us high certainty that the number of individuals reporting incomes in R$(70, 77] was impacted by the reform. Similarly, the treatment effect for R$(63, 70] is larger (in absolute magnitude) than 97.8% of the placebo "treatment effects", implying a p-value of 0.022 against the null hypothesis that R$(63, 70] was not impacted by the reform.
50. In Appendix C.10, we show strong suggestive evidence of a behavioral response to the change in the basic benefit notch for households with children, but we do not attempt to bound the MVPF for these households. This is because households with children may also receive the "variable benefit". Both the level and location of the notch associated with the variable benefit also changed with the June 2014 reform. Thus, the WTP of households with children needs to account for changes in the variable benefit schedule in addition to changes in the basic benefit schedule. This exercise is beyond the scope of the current paper.
51. There are \(\binom{9}{1}\) ways to pick one "treatment" bin, \(\binom{9}{2}\) ways to pick two "treatment" bins, and so on; thus, there are a total of \(\sum_{i=1}^{8} \binom{9}{i} = 510\) possible regressions.
52. We find a negative "average treatment effect" of -0.105 log points for R$(14, 21]. However, R$(14, 21] seems implausibly far away from the notch to be impacted by the reform, especially given that other bins around R$(14, 21] do not appear to be impacted by the reform. Hence, we interpret this negative "average treatment effect" simply as random variation.
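The placebo design can be sketched in a few lines: enumerate every way of labeling between one and eight of the nine control bins as "treated", then compare an actual estimate with the distribution of placebo estimates. The helper `placebo_p_value` and the example placebo effects are hypothetical illustrations; only the count of assignments is tied to the text.

```python
from itertools import combinations

# Upper edges of the nine control bins R$(0, 7], ..., R$(56, 63].
bins = [7, 14, 21, 28, 35, 42, 49, 56, 63]

# Every way to label between 1 and 8 of the nine bins as placebo-"treated";
# at least one bin must remain in the control group.
assignments = [c for r in range(1, len(bins)) for c in combinations(bins, r)]

def placebo_p_value(effect, placebo_effects):
    """Share of placebo effects at least as large in absolute value as `effect`."""
    hits = sum(abs(p) >= abs(effect) for p in placebo_effects)
    return hits / len(placebo_effects)

# Hypothetical placebo estimates, purely to show the calculation.
example_p = placebo_p_value(-0.12, [0.01, -0.2, 0.05, 0.03])
```

The enumeration yields 510 assignments, matching the count in the text. The p-value helper mirrors the logic of ranking the actual estimate within the placebo distribution.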

Conclusion
We analyze the welfare impacts of a reform that expanded both the benefit level and eligibility threshold in one of the world's largest cash transfer programs, Bolsa Família. To do so, we develop a novel sufficient statistics framework to bound the welfare impacts of reforms to transfer programs featuring notches. We then estimate these sufficient statistics for the 2014 BF reform using longitudinal administrative data. Despite finding strong evidence of behavioral responses to the BF reform, we find that the corresponding efficiency costs are small relative to the equity benefits of increased transfers to poor households: a back-of-the-envelope calculation suggests that the welfare effect of spending R$1 on the reform is at least as high as the welfare effect of spending R$1.50 on a non-distortionary universal transfer. Because the Bolsa Família eligibility threshold is based on self-reported income, this result provides some evidence against the commonly held belief that targeting transfer programs based on self-reported incomes will generate substantial efficiency costs in high-informality settings. Thus, our findings have important implications for the future design of cash transfer programs, especially if developing countries are to increasingly use self-reported income to determine eligibility for social programs.
53. See Imbens and Rubin (2021) or Young (2018) for discussions of randomization inference.
Moreover, we believe that our sufficient statistics framework highlights a new manner in which reduced-form evidence on jumping and bunching can be used to inform policy. Given the ubiquity of notches, we hope that the methods developed in this paper will be useful for analyzing reforms in a variety of other contexts such as Medicaid, income-dependent tax credits, or firm tax schedules. Finally, the bounding techniques developed in this paper may be helpful for bounding welfare impacts of large reforms which do not necessarily feature notches.

A.1 Optimal Reported Incomes in Baseline Model
Type \(\mu = 2\) households with \(y < \tau\) clearly prefer to report \(y \le \hat{y} \le \tau\) (and are indifferent between reporting income levels in this range as only under-reporting is costly). We assume they break this indifference by reporting at \(y\). Anyone with \(y > \tau\) prefers misreporting to \(\tau\) over misreporting to an income less than \(\tau\) because \(v' > 0\). By definition, those with \(y = y^{c}(p)\) are indifferent between misreporting at the threshold \(\tau\) and truthfully reporting (we break their indifference by saying they will misreport at \(\tau\)), where \(y^{c}(p)\) solves \(y^{c}(p) + b - v(y^{c}(p) - \tau) = y^{c}(p)\). 54 Those with \(\tau < y \le y^{c}(p)\) prefer misreporting to \(\tau\) over truthfully reporting and those with \(y > y^{c}(p)\) prefer the reverse by the fact that \(v' > 0\).
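The indifference condition simplifies to \(v(y^{c}(p) - \tau) = b\): the marginal misreporter is the household whose cost of under-reporting down to the threshold exactly equals the benefit. A minimal numerical sketch follows, assuming a hypothetical quadratic misreporting cost and illustrative parameter values (neither is taken from the paper). It solves for the cutoff by bisection, which relies only on \(v\) being increasing.

```python
def cutoff_income(b, tau, v, hi=1e6, tol=1e-9):
    """Solve v(y_c - tau) = b for y_c by bisection.

    Assumes v is increasing with v(0) < b < v(hi - tau), so a unique root exists.
    """
    lo = tau
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v(mid - tau) < b:
            lo = mid      # cost of misreporting still below the benefit
        else:
            hi = mid      # cost of misreporting at or above the benefit
    return 0.5 * (lo + hi)

def quadratic_cost(m):
    """Hypothetical misreporting cost for under-reporting by m >= 0."""
    return 0.01 * m ** 2

# Illustrative values only: benefit b = 70 and threshold tau = 70.
y_c = cutoff_income(b=70.0, tau=70.0, v=quadratic_cost)
```

With these illustrative numbers the cutoff is \(70 + \sqrt{7000}\), roughly 153.7: households with true income between the threshold and this cutoff would misreport to the threshold, and those above it would report truthfully.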

A.2 Proof of Lemma 1
Let us begin by calculating the WTP for the mechanical households, which are the households who report an income ≤ τ under both p and p ′ . Because households with y > τ will never report ≤ τ under policy p ′ as v ′ > 0, mechanical households are those with y ≤ τ ; they always report truthfully and receive the benefit.
Hence, their utility is equal to y + b ′ under policy p ′ and equal to y + b under policy p. Clearly, their WTP for the reform (or compensating variation) is equal to b ′ − b = ∆b. Because only mechanical households report under τ given policy p ′ , the number of such households is given by M = G(τ, p ′ ).
Next, let us discuss threshold households with income y ∈ (τ, τ ′ ] and µ = 1 (note, if y c (p) ≤ τ ′ , then type µ = 2 households with y ∈ (y c (p), τ ′ ] are also threshold households). These households all receive utility y under policy p and receive utility y + b ′ under policy p ′ . Hence, their WTP is given by b ′ . Definitionally, threshold households are those reporting incomes in (τ, τ ′ ] under policy p so that the total number of such households is given by T = G(τ ′ ; p) − G(τ ; p).
Finally, there are jumping households with type µ = 2 and income y ∈ (y_c(p), y_c(p′)] (or y ∈ (τ′, y_c(p′)] if y_c(p) ≤ τ′). These households receive utility y under policy p and receive utility y + b′ − v(y − ŷ*(y, 0; p′)) under policy p′, where ŷ*(y, 0; p′) ≤ τ′. For these individuals we have: WTP = b′ − v(y − ŷ*(y, 0; p′)). Because y + b′ − v(y − ŷ*(y, 0; p′)) ≥ y by revealed preference, WTP ≥ 0. And because v(y − ŷ*(y, 0; p′)) ≥ 0, WTP ≤ b′ for jumping individuals. Jumping households are those who report incomes above τ′ under policy p and (weakly) below τ′ under policy p′. Because all households who report incomes under τ′ under policy p also report incomes under τ′ under policy p′, the number of jumping households is given by the increase in households reporting at or below τ′ as a result of the reform: J = G(τ′; p′) − G(τ′; p).

54. The existence of a unique y_c(p) follows from v′ > 0.

55. Note, if y_c(p) ≤ τ′, i.e., the change in τ is large, all bunching households will report truthfully under p′.
Putting this all together we get:

A.3 Proof of Proposition 3
Proof. We start with proving the lower bound for . First, welfare under policy p is given by: Next, note that by revealed preference, we have the following for any x ∈ X: Put simply, optimal decisions conditional on any given θ under p, x * (θ, p), yield weakly higher utility than any other set of decisions x that one could make. This yields the following bound on welfare under policy p ′ = {b ′ , τ ′ }, which ensues by evaluating utility under policy p ′ , but holding household decisions constant at their values under policy p (i.e., by revealed preference): So as to slightly reduce some cumbersome notation, let us define: Thus, for the reform from p = {b, τ } to p ′ = {b ′ , τ ′ } with p ′ − p = {∆b, ∆τ } and ∆τ > 0: Note, the change in utility for those withŷ * (θ, p) > τ ′ is zero as we move from p to p ′ , holding decisions fixed, as they do not receive a transfer. Next, define η {ŷ * (θ,p)≤τ } as the government's average welfare weight on the households who optimally report incomesŷ * ≤ τ under policy p. η {ŷ * (θ,p)≤τ } captures the average welfare gain from giving these households an extra $1: And define η {ŷ * (θ,p)∈(τ,τ ′ ]} as the government's average welfare weight of giving a dollar to the households who optimally report incomesŷ * ∈ (τ, τ ′ ] under policy p. η {ŷ * (θ,p)∈(τ,τ ′ ]} captures the average welfare gain from giving these households an extra $1: 56 Note, by the mean value theorem, η {ŷ * (θ,p)≤τ } and η {ŷ * (θ,p)∈(τ,τ ′ ]} are equal to some average social marginal utilities of consumption for their respective groups of households. 
Next, let us define the aggregate welfare weight, η_L, which equals the weighted average welfare weight of giving a dollar to all households, where the weights are determined by the lower bound of WTP for the reform: Then, dividing Equation (19) through by the budgetary effect multiplied by λ, we have (recall we assume the budgetary effect is > 0): where ω_L = η_L/λ and MVPF_L is given by:

Next, we prove the upper bound. We use identical revealed preference logic to bound welfare under policy p = {b, τ} by evaluating utility under policy p, but holding household decisions constant at their values under policy p′: Hence, for the reform from p = {b, τ} to p′ = {b′, τ′} with p′ − p = {∆b, ∆τ} and ∆τ > 0: Next, define η_{ŷ*(θ,p′)≤τ} as the government's average welfare weight on the households who optimally report incomes ŷ* ≤ τ under policy p′. η_{ŷ*(θ,p′)≤τ} captures the average welfare gain from giving these households an extra $1: And define η_{ŷ*(θ,p′)∈(τ,τ′]} as the government's average welfare weight of giving a dollar to the households who optimally report incomes ŷ* ∈ (τ, τ′] under policy p′. η_{ŷ*(θ,p′)∈(τ,τ′]} captures the average welfare gain from giving these households an extra $1: 57 Again, by the mean value theorem, η_{ŷ*(θ,p′)≤τ} and η_{ŷ*(θ,p′)∈(τ,τ′]} are equal to average social marginal utilities of consumption for their respective groups of households. Next, let us define the aggregate welfare weight, η_U, which equals the weighted average welfare weight of giving a dollar to all households, where the weights are determined by the upper bound of WTP for the reform: Then, dividing Equation (18) through by the budgetary effect multiplied by λ, we have (recall we assume the budgetary effect is > 0): where ω_U = η_U/λ and MVPF_U is given by:

A.4 Proof of Proposition 3 when Eligible Households Do Not Receive Benefit With Certainty
We now prove that our bounds still hold in a model where households reporting below the threshold receive the benefit with some probability q. Those reporting above the threshold do not receive the benefit.
Under this more complex policy environment, the household problem is as follows: where α denotes a Bernoulli random variable that takes value 1 with probability q and takes value 0 with probability 1 − q. We again consider the impact on welfare from moving from policy p to policy p′. We start by proving that the lower bound on the welfare impact is unchanged. This proof simply requires some minor adjustments to the steps used to prove the lower bound in Appendix A.3.
First, welfare under policy p is given by: Next, note that by revealed preference, we have the following for any x ∈ X: This yields the following bound on welfare under policy p′ = {b′, τ′}, which ensues by evaluating utility under policy p′, but holding household decisions constant at their values under policy p (i.e., by revealed preference): So as to slightly reduce some cumbersome notation, let us define: Thus, for the reform from p to p′ we get: Next, define η_{ŷ*(θ,p)≤τ} as the government's average welfare weight on the households who optimally report incomes ŷ* ≤ τ under policy p. η_{ŷ*(θ,p)≤τ} captures the average welfare gain from giving these households an extra $1 with probability q: And define η_{ŷ*(θ,p)∈(τ,τ′]} as the government's average welfare weight of giving a dollar to the households who optimally report incomes ŷ* ∈ (τ, τ′] under policy p. η_{ŷ*(θ,p)∈(τ,τ′]} captures the average welfare gain from giving these households an extra $1 with probability q: 58 Note, by the mean value theorem, η_{ŷ*(θ,p)≤τ} and η_{ŷ*(θ,p)∈(τ,τ′]} are equal to some average expected social marginal utilities of consumption for their respective groups of households. Next, let us define the aggregate welfare weight, η_L, which equals the weighted average welfare weight of giving a dollar to all households, where the weights are determined by the lower bound of WTP for the reform: Then, dividing Equation (19) through by the budgetary effect multiplied by λ, we have (recall we assume the budgetary effect is > 0): where ω_L = η_L/λ and MVPF_L is given by:

The upper bound can be proved in an analogous manner by adjusting the upper bound portion of the proof in Appendix A.3 for the fact that the benefit is only received with some probability q.

A.5 Proposition 3 Cannot Be Improved
Proposition 4. Without further assumptions on primitives, the bounds in Proposition 3 cannot be improved.
Proof. To show that one cannot construct tighter bounds than Proposition 3 without additional structure on the agent's problem, we provide examples for which these bounds are attained. In particular, we create examples for which (1) bunching households have a WTP of ∆b and jumping households have a WTP of 0 as well as (2) bunching and jumping households both have a WTP of b′. 59

Example 1: Consider our baseline misreporting model in Section 3.1 with v(0) = 0, v′ > 0 and v′′ > 0. Suppose that ∆b, ∆τ > 0. Consider a distribution for the misreporting types (µ = 2) with a mass point at y = τ, no density on (τ, y_c′), and another mass point at y = y_c′, where y_c′ solves b′ = v(y_c′ − τ′). All bunching households therefore have y = τ. Thus, the utility change for bunching households is equal to: (τ + b′) − (τ + b) = ∆b. Hence, the WTP for bunching households is ∆b. Because there are no individuals with incomes in (τ, y_c′), all jumping households have y = y_c′. Hence, all the jumping households report truthfully (and do not get the benefit) under the original policy p. We break their indifference by assuming that they jump to the threshold τ′ under the new policy p′. The change in utility for jumpers is therefore given by: [y_c′ + b′ − v(y_c′ − τ′)] − y_c′ = b′ − v(y_c′ − τ′) = 0. Thus, each jumping household's WTP is equal to 0. Hence, the total WTP for the reform equals (M + B)∆b + T b′, so the lower bound is attained.

Example 2: Consider our baseline misreporting model in Section 3.1 with v(0) = 0, v′ > 0 and v′′ > 0. Suppose that ∆b, ∆τ > 0. Let y_c solve b = v(y_c − τ) and y_c′ solve b′ = v(y_c′ − τ′). Finally, suppose that τ′ − τ is large enough so that τ′ > y_c. Consider a distribution for the misreporting types (µ = 2) with no density on [τ, y_c), a mass point at y = y_c, and no density on (y_c, y_c′]. The µ = 2 individuals with y = y_c are indifferent between bunching at τ and reporting truthfully under policy p. In the former case, they are bunching individuals (and there are no jumping individuals), and in the latter case they are jumping individuals (and there are no bunching individuals). Regardless, the utility gain for these households is equal to: (y_c + b′) − y_c = b′. Thus the total WTP for the reform will equal M∆b + (B + T + J)b′, which is MVPF_U times the budgetary effect of the reform, so the upper bound is attained.
59. Note that our examples require mass points of the income distribution. But one can approximate our example scenarios arbitrarily well with smooth income distributions; hence, we can get arbitrarily close to the cases when either (1) all bunching households have a WTP of ∆b and all jumping households have a WTP of 0 or (2) all bunching and jumping households both have a WTP of b ′ .
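To make the two extreme cases concrete, the following sketch computes the implied MVPF bounds from the four household counts. It rests on assumptions that should be checked against Equations (7) and (8): the lower-bound WTP is taken to be (M + B)∆b + T b′, the upper-bound WTP is M∆b + (B + T + J)b′, and the budgetary effect is b′G(τ′; p′) − bG(τ; p) with G(τ; p) = M + B and G(τ′; p′) = M + B + T + J. The function name `mvpf_bounds` and the household counts in the usage example are hypothetical and are not estimates from the paper.

```python
def mvpf_bounds(M: float, B: float, T: float, J: float,
                b: float, b_new: float) -> tuple[float, float]:
    """Return (lower, upper) MVPF bounds under the assumptions stated above.

    M, B, T, J: counts of mechanical, bunching, threshold, and jumping households.
    b, b_new:   benefit level before and after the reform.
    """
    delta_b = b_new - b
    # Assumed budgetary effect: b' * G(tau'; p') - b * G(tau; p).
    budget = b_new * (M + B + T + J) - b * (M + B)
    if budget <= 0:
        raise ValueError("the bounds assume a positive budgetary effect")
    # Lower bound: bunchers valued at delta_b, jumpers valued at 0.
    wtp_lower = (M + B) * delta_b + T * b_new
    # Upper bound: bunchers and jumpers both valued at b'.
    wtp_upper = M * delta_b + (B + T + J) * b_new
    return wtp_lower / budget, wtp_upper / budget


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    lower, upper = mvpf_bounds(M=100, B=10, T=20, J=5, b=70, b_new=77)
    print(f"MVPF bounds: [{lower:.3f}, {upper:.3f}]")
```

The gap between the two bounds grows with B and J, which is why these two counts are the statistics that need to be estimated.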

A.6 Incorporating Misperceptions of the Schedule
We now assume that households do not necessarily understand how the policy impacts their consumption.
Households solve the following problem: In words, households make decisions under the assumption that their consumption is some function of their true income y, their reported income ŷ, the policy p, and state variables θ. For instance, this framework allows for households to misperceive the threshold τ or the benefit level b (e.g., f(y(x, θ), ŷ(x, θ), p, θ) = y(x, θ) + (b + θ_1)1(ŷ(x, θ) ≤ (τ + θ_2)), where θ_1, θ_2 are state variables for the household). Total welfare is still given by:

In order for Proposition 3 to hold under the more general model with misperceptions (Problem (20)), we need to make two additional assumptions. First, we need to assume that when the policy changes from p to p′, household behavioral re-optimization improves welfare, on average. In other words, misperceptions of the policy reform cannot be so severe that households make themselves worse off (on average) by responding to the reform. Mathematically, we require that: Note that the previous inequality holds by revealed preference if households correctly perceive the schedule (i.e., behavioral re-optimization can only improve utility). If agents misperceive the schedule, we simply need to assume that behavioral responses improve welfare on average. Correspondingly, our second assumption is that if, hypothetically, the policy were to change from p′ to p, household behavioral re-optimization would also improve welfare, on average. Mathematically, this amounts to assuming: If we are willing to make these two assumptions, the rest of the proof of Proposition 3 goes through, so that we can bound the welfare impact of changing notches even if individuals misperceive the schedule. Hence, we can state:

Proposition 5. Suppose households solve Problem (20), welfare is given by Equation (21) and τ′ > τ. If we assume: Then as long as b′G(τ′; p′) − bG(τ; p) > 0 we have: where MVPF_L is given by Equation (7), MVPF_U is given by Equation (8)

A.7 Discounted Welfare Impact of Reform
Suppose households have several decision variables at time t denoted by the vector x_t (within a potentially limited choice set X_t). Household decisions are made conditional on state variables denoted by the vector θ_t ∈ Θ_t and the policy p. Households get the benefit b if their reported income ŷ_t, which can be a function of decision variables x_t or included as a decision variable, is below τ. Household income, denoted y_t, is also potentially a function of decisions x_t. 60 Households in period t solve: where c_t denotes consumption in period t, β is a discount factor, and E_{θ_{t+1}|θ_t,x_t}[V(θ_{t+1})] represents the expected value of starting period t + 1 with state variables θ_{t+1}, noting that the expectation over θ_{t+1} may be impacted by current state variables θ_t and current decisions x_t. Equivalently, we can write out individual utility from the perspective of time period 0 as: where E_{θ_t|θ_0,{x_t}} represents the expectation over θ_t from the perspective of time period 0 (taking into account the impact of all conditional decisions {x_t} between time 0 and time t on the underlying expectations). The equality above follows from the law of iterated expectations.

60. For example, in a dynamic misreporting model, x_t = ŷ_t and θ_t could include current income y_t, aversion to misreporting µ, prior reported income ŷ_{t−1}, and a parameter governing expected future income growth. Households may also make savings decisions if assets are a state variable in θ_t, current savings is included in x_t, and c_t represents post-transfer income.
So as to slightly reduce some cumbersome notation, let us define: Using this notation, we again assume total discounted welfare is given by a weighted discounted sum of utilities, with welfare weights given by ϕ(θ_0), less the total discounted budgetary cost of the policy multiplied by a shadow value of public funds λ: where β represents the government's discount factor, λ represents the shadow value of public funds in time t = 0 (so that the shadow value of public funds in future periods equals β^t λ), and: represents the expected number of households receiving the benefit under policy p in period t. More generally, we define: as the expected number of households with a reported income below z under policy p in time period t.
This setup allows us to bound the welfare impacts of a policy reform over many time periods: Proposition 6. Suppose households solve Problem (22), total welfare is given by Equation (23), and τ ′ > τ . Defining: Then as long as where ω L,T (ω U,T ) captures the discounted weighted average money-metric welfare gain from giving a dollar from giving these households an extra $1: Next, we prove the upper bound for giving a dollar to all households, where the weights are determined by the upper (discounted) bound of expected WTP for the reform: Then, dividing Equation (25) through by the budgetary effect multiplied by λ, we have (recall we assume the budgetary effect is > 0): where ω U,T = η U,T /λ and M V P F U,T is given by:

B.2 Other Social Security Programs Based on the Cadastro Único
This appendix describes other programs that set their eligibility based on information from the Cadastro Único database.

Benefício de Prestação Continuada (BPC): This benefit targets the elderly (above 65 years of age) and disabled. It gives a minimum wage to all households with per-capita income up to a quarter of the minimum wage. Table 4 reports the minimum wage and BPC threshold across all years of the analysis. The Brazilian Social Security System administers its own exam to define eligibility for this program.
Carteira do Idoso: This "Elderly Card" guarantees all individuals who are 60 years of age or older and have income up to two times the minimum wage at least a 50% discount on any interstate trip by road, rail, or waterway.
Créditos Instalação do Programa Nacional de Reforma Agrária: Households with per-capita income up to three times the minimum wage and that are living in camping grounds get points in a system that selects beneficiaries to be settled through the Brazilian land reform.
Facultativo de Baixa Renda: This is an option to contribute to social security at a lower rate (5% of the minimum wage). The individual cannot have any income and household income must be below two times the minimum wage.
Identidade Jovem (ID Jovem): Discounts for cultural events and trips by road, rail, or waterway for individuals between 15 and 29 years of age living in a household with up to two times the minimum wage.
Isenção de taxas de inscrição em concursos públicos: Since 2008, households with per-capita income up to half of the minimum wage or total income of up to three times the minimum wage are exempt from public tender registration payment.
Política Nacional Assistência Técnica Rural - PNATER Brasil Sem Miséria: Technical assistance for households working on activities for their own consumption in rural areas.
Programa Água para Todos - Programa Nacional de Universalização do Acesso e Uso da Água: Since July 2011, the government has installed cisterns to ensure access to clean water for all Brazilians, with priority going to those who satisfy the criteria for the BF program.
Bolsa Estiagem: This is a benefit of at least R$80 per month to households with total income up to two times the minimum wage that live in areas hit by natural disasters.
Programa Bolsa Verde - Programa de Apoio à Conservação Ambiental: Since October 2011, this program has transferred R$300 every 3 months to households that are in extreme poverty (first threshold of BF) and that follow the requirements for using natural resources.
Programa Cisternas: This program aims to provide cisterns to low-income families registered in the Cadastro Único.
Programa de Erradicação do Trabalho Infantil: This program transfers benefits similar to the BF (R$25 and R$40 per child per month in municipalities with fewer than and more than 250,000 inhabitants, respectively) to households whose incomes are above the BF threshold with working children (up to 16 years of age), conditional on these children attending school 85% of the time instead of working.

Households with per-capita income above the second threshold are not eligible for any BF cash transfers.

B.3 Bolsa Família Program and Data Extraction Timeline
Moreover, the total variable benefit was capped at R$160 (5 children per household) and the total teenager benefit was capped at R$76 (two teenagers per household). 63 The reform this paper studies, which occurred in June 2014, increased the extreme poverty threshold from R$70 to R$77 and the poverty threshold from R$140 to R$154. The basic benefit was raised from R$70 to R$77, the benefit per child from R$32 to R$35, and the benefit per teenager from R$38 to R$42. Note that the thresholds are based on per-capita income but the benefits are denominated in raw amounts. This reform was announced on national television by the president in April 2014. Table 5 summarizes these aspects of the schedule before (first column) and after (second column) the reform for households with children.

63. Households with children need to fulfill three additional conditions to receive the variable benefit and/or teenager benefit: (1) children must maintain a minimum of 85% school attendance between ages 6 and 15 and 75% school attendance between 16 and 17; (2) households must keep track of their children's vaccines; and (3) parents must maintain at least 85% attendance in a social-education program if the household has violated child labor laws in the past. All conditionalities were held constant during the analysis period.

Note: The first two rows correspond to the extreme-poverty and poverty thresholds, respectively. These are measured in monthly, per-capita income. I.e., before the reform a household is below the extreme-poverty threshold if their monthly, per-capita income is below R$70. The third, fourth, and fifth rows display the benefits given to households; these are denoted in monthly amounts. I.e., before the reform a household below the extreme-poverty threshold receives R$70 per month in the basic benefit.

For purposes of illustration, Figure 11 plots how the benefit schedules for two particular household compositions varied over time.

Note: ŷ denotes the reported, per-capita, monthly household income. B(ŷ) denotes the monthly, per-capita benefits a household receives if they report ŷ. These benefits will also depend on household composition. For example, a household with 2 adults and 1 teenager reporting ŷ = 0 in December 2011 will receive R$70 in the basic benefit and R$38 in the teenager benefit. Thus ŷ + B(ŷ) = (70 + 38)/3 = 36.
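The worked example in the note above can be reproduced with a short sketch of the notched schedule. This is a simplified illustration, not the program's full rules: it ignores the guaranteed minimum income and the conditionalities, treats eligibility as ŷ at or below each threshold, and assumes the per-household caps of 5 children and 2 teenagers also apply after the reform. The names `Schedule`, `PRE_REFORM`, `POST_REFORM`, and `per_capita_benefit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    extreme_poverty_threshold: float  # monthly per-capita income, R$
    poverty_threshold: float          # monthly per-capita income, R$
    basic_benefit: float              # monthly, per household, R$
    child_benefit: float              # monthly, per child, R$
    teen_benefit: float               # monthly, per teenager, R$
    max_children: int = 5             # cap on children counted (assumed unchanged)
    max_teens: int = 2                # cap on teenagers counted (assumed unchanged)


PRE_REFORM = Schedule(70, 140, 70, 32, 38)
POST_REFORM = Schedule(77, 154, 77, 35, 42)


def per_capita_benefit(y_hat: float, adults: int, children: int,
                       teens: int, s: Schedule) -> float:
    """Monthly per-capita benefit B(y_hat) under the simplified schedule."""
    size = adults + children + teens
    total = 0.0
    if y_hat <= s.extreme_poverty_threshold:   # basic-benefit notch
        total += s.basic_benefit
    if y_hat <= s.poverty_threshold:           # variable-benefit notch
        total += s.child_benefit * min(children, s.max_children)
        total += s.teen_benefit * min(teens, s.max_teens)
    return total / size


if __name__ == "__main__":
    # Two adults and one teenager reporting zero income, pre-reform schedule.
    print(per_capita_benefit(0.0, adults=2, children=0, teens=1, s=PRE_REFORM))
```

Under the pre-reform schedule this reproduces the per-capita figure of R$36 from the note; passing `POST_REFORM` instead gives (77 + 42)/3 for the same household.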

C.5 Results with Smaller Bin Sizes
This Appendix contains results using a bin size of 3.5 rather than a bin size of 7 as in the main text.
We estimate Equation (15) using this smaller bin size and report the results in Table 6. The estimated MVPF bounds are relatively similar to our main results in Table 3.

Note: Columns (4) and (5) show the estimated number of bunching and jumping households for June 2016, B_t and J_t, calculated using Equations (12) and (13). Columns (6) and (7) show the estimated upper and lower bounds for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are presented in parentheses and are computed from the delta method from the clustered standard errors estimated in Equation (15).

C.7 Results Estimated from Nonlinear Least Squares Equation (16)
Note: This figure shows the log number of single individual households reporting incomes in R$(63, 70] and R$(70, 77] over time along with the counterfactual paths had the reform not happened. The counterfactual paths are equal to the actual number of people reporting in the given interval minus the causal impact of the reform, β_{1,x} + β_{2,x}t, estimated using Equation (16) where we set treat_x = 1 if x ∈ {70, 77} and K = 3. Confidence intervals are constructed from clustered standard errors at the bin level. The timing of the reform is indicated by the gray, shaded region.

Note: Columns (3) and (4) show the estimated number of bunching and jumping households for June 2016, B_t and J_t, calculated using Equations (12) and (13). Columns (5) and (6) show the estimated upper and lower bounds for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are presented in parentheses and are computed from the delta method from the clustered standard errors estimated in Equation (15).

C.8 Results for Households with Constant Composition
In this appendix, we present results for our sample of single individual households restricted to those who do not change their reported family composition throughout the analysis period (June 2012 to June 2016). To do so, we drop any household that reports a composition change over the analysis period. Moreover, we drop any household that enters the registry after June 2014, as we cannot tell whether these new entrants are truly new households or are pre-existing households (with multiple adults) pretending to be separate households so as to increase the amount of benefits they receive. In other words, we restrict our sample to single individual households who (a) were on the registry prior to June 2014, and (b) do not report a change in composition over the analysis period. This reduces our sample from 1,938,653 single individual households with incomes below R$77 in June 2016 to 1,039,573 single individual households with incomes below R$77 in June 2016. Table 9 presents results for this exercise. We find very similar estimates for the MVPF lower bound and slightly smaller estimates for the MVPF upper bound. Note that the estimated numbers of jumpers and bunchers are smaller than in Table 3 due to the smaller sample size.

Note: Columns (1) and (2) report estimates from Equation (15) with various polynomial degrees K ∈ {2, 3, 4, 5}, restricting the sample to single individual households who do not change their reported household composition throughout the analysis period. Columns (3) and (4) show the estimated number of bunching and jumping households for June 2016, B_t and J_t, calculated using Equations (12) and (13). Columns (5) and (6) show the estimated upper and lower bounds for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are presented in parentheses and are computed from the delta method from the clustered standard errors estimated in Equation (15).

C.9 Results for Two Adult Households with No Children
In this appendix, we present results for households with two adults and no children. As discussed in Section 2.2 and Appendix B.4, the guaranteed minimum income, which was instituted in February 2013 for two adult households without kids, generates a kink in the benefit schedule that occurs at a per-capita income of R$35. The June 2014 reform not only changed the extreme poverty threshold and the basic benefit both from R$70 to R$77 but also changed the guaranteed minimum income from R$70 per-capita to R$77 per-capita. Hence, the June 2014 reform changed the location and level of both the notch and the kink in the benefit schedule (for example, the location of the notch moved from R$70 to R$77 per-capita while the location of the kink moved from R$35 to R$38.5 per-capita). Identification Assumption 1 is now harder to justify given that the June 2014 reform changed incentives for households to locate around R$35 by changing the kink in the benefit schedule from R$35 to R$38.5. However, many studies have shown that behavioral responses to kinks are typically very small; kinks generally induce substantially less bunching than do notches (Kleven, 2016). Hence, we simply ignore the presence of the kink and use all seven-increment income bins below R$63 as control groups just as in our main analysis.

Results from estimating Equation (15) for various polynomial degrees K can be found in Table 10. The MVPF bounds are roughly similar in magnitude to those for the single individual households discussed in Section 5 but are a bit more sensitive to the degree of polynomial used. 64

64. Note these bounds are for the MVPF associated with changing the location and level of the notch only, i.e., we ignore any welfare impacts of changing the level and location of the kink.

Note: Columns (1) and (2) report estimates from Equation (15) with various polynomial degrees K ∈ {2, 3, 4, 5}, restricting the sample to two adult households without children. Columns (3) and (4) show the estimated number of bunching and jumping households for June 2016, B_t and J_t, calculated using Equations (12) and (13). Columns (5) and (6) show the estimated upper and lower bounds for the MVPF for June 2016, calculated using Equations (10) and (11). Standard errors are presented in parentheses and are computed from the delta method from the clustered standard errors estimated in Equation (15).

C.10 Results for Households with Kids
The benefit schedule for families with children is substantially more complex than the benefit schedule for households without children, as discussed in detail in Appendix B.4. Prior to June 2014, in addition to the guaranteed minimum income and the basic benefit, there was also a variable per-child benefit for households below the poverty threshold of R$140 per-capita. The June 2014 reform led to changes in the levels and locations of the guaranteed minimum income kink, the basic benefit notch, and the variable benefit notch.
For example, the poverty threshold was raised from R$140 per-capita to R$154 per-capita and the levels of the variable benefits were also increased by around 10% (see Appendix B.4 for more details).
Estimating the MVPF of the reform for households with children would require estimating the WTP for all of the different changes to the BF schedule. Estimating the WTP for the change to the variable benefit schedule would be particularly difficult given that this benefit is made conditional on investments in children.
For example, suppose parents are under-investing in their children's education and the reform increases school attendance. Then we would need to estimate the children's WTP for the increased education they receive as a result of the reform to the variable benefit schedule. Thus, we leave calculating the MVPF of the reform for households with children to future work. We do, however, show strong evidence that households with kids respond to the reform. In particular, Figure 26 shows prima facie evidence that the reform increased the number of households reporting incomes in R$(70, 77].
Note: This figure shows the number of households with kids that report incomes in the various bins. The number in each bin is normalized to 1 in June 2012. The timing of the reform is indicated by the gray, shaded region.