WPS8524
Policy Research Working Paper 8524
Not(ch) Your Average Tax System
Corporate Taxation under Weak Enforcement
Pierre Bachas
Mauricio Soto
Development Economics, Development Research Group
July 2018

Abstract
How should developing countries tax corporate income? This paper studies this question in Costa Rica, where firms face discontinuously higher average tax rates on profits when their revenue marginally increases. The paper combines a discontinuity and a bunching design to estimate the profit elasticity and separate it into revenue and cost elasticities. Faced with higher tax rates, firms slightly reduce revenue but considerably increase costs, generating a large elasticity of profits. In this context, the revenue maximizing rate for profit taxation is below 25 percent and broadening the tax base while lowering the rate can increase revenue for these firms by 80 percent.

This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/research. The authors may be contacted at pbachas@worldbank.org. The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors.
They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent. Produced by the Research Support Team Not(ch) Your Average Tax System: Corporate Taxation under Weak Enforcement Pierre Bachas & Mauricio Soto∗ JEL Classification: H25, H26, H32, O23 Keywords: Corporate Tax, Tax Evasion, Tax Elasticity, Notch ∗ Pierre Bachas: World Bank Research, pbachas@worldbank.org. Mauricio Soto: Banco Central de Costa Rica, SOTORM@bccr.fi.cr - We gratefully acknowledge data assistance by Oscar Fonseca and Jorge Richard Munoz from DGH at the Ministry of Finance and DGIE and DIE at BCCR. We thank Edward Miguel and Emmanuel Saez, for their support and encouragement, and have benefitted from comments and suggestions from Miguel Almunia, Juan Pablo Atal, Alan Auerbach, Youssef Benzarti, Michael Best, Anne Brockmeyer, Natalie Cox, Paul Gertler, Anders Jensen, Claus Kreiner, Etienne Lehman, Joana Naritomi, Alvaro Ramos, Andres Rodriguez Clare, Joel Slemrod, Danny Yagan, Andrew Zeitlin, Gabriel Zucman and seminar participants at Columbia, IDB, IFS, Michigan, Munich, Oxford Center for Business Taxation, Paris II, Princeton, TSE, UC Berkeley, U. de Costa Rica, UPF, World Bank DEC, ZEW, Zurich. Financial support from the Burch Center for Tax Policy, the Center for Equitable Growth and the Julis-Rabinowitz Center for Public Policy and Finance is gratefully acknowledged. The findings and conclusions are those of the authors and do not represent the views of the World Bank, the BCCR or any other institution. 1 Introduction Lower-income countries only collect 20% of their GDP in taxes, compared to 35% on average for OECD countries (Gordon and Li 2009, Besley and Persson 2013). 
For the corporate income tax, the slope of tax-take on GDP is similar: low-income countries collect 2% of their national income, while rich countries collect 3.5%. A plausible explanation for this pattern is that the elasticity of corporate profits with respect to the tax rate is larger in developing countries due to tax evasion. Since the corporate tax typically allows for all production costs to be deducted, firms could reduce their tax base by over-reporting costs (Slemrod, Collins, Hoopes, Reck and Sebastiani 2015, Carrillo, Pomeranz and Singhal 2017). Some countries address this challenge by applying a lower rate to a broader base, with limited deductions, thereby reducing evasion incentives. However, this introduces distortions to firms’ optimal scale and violates production efficiency (Diamond and Mirrlees 1971). Best, Brockmeyer, Kleven, Spinnewijn and Waseem (2015) model this revenue versus production efficiency trade-off and empirically find that a broader corporate tax base is desirable in Pakistan. However, their variation mixes firms’ revenue and profit responses, which prevents them from separately estimating these key elasticities. The elasticity of profits is required to measure the revenue maximizing rate under a profit tax, and both the revenue and profit elasticities are needed to jointly estimate the optimal rate and base.

We take advantage of Costa Rica’s corporate tax design and rich administrative data to make three contributions. First, we estimate the elasticity of profits with respect to the tax rate for small firms: we find large elasticities (3-5), which severely constrain the revenue collection of profit taxes. Second, we separately estimate the elasticities of revenue and costs and show that two-thirds of firms’ responses to higher tax rates are explained by increases in reported costs.
Using these elasticities we simulate the optimal tax system and find that broadening the base while lowering the rate generates tax revenue gains of up to 80%, keeping profits constant. Third, using additional data sources we show that the revenue elasticity is driven by tax evasion, with limited evidence of production responses, which reinforces the recommendation of a broader base with a lower rate.

While corporate tax systems often tax profits at a flat rate, Costa Rica imposes increasing average tax rates on profits as a function of firms’ revenue. The determinant of the tax rate (revenue) is different from the tax base (profits). This generates a notched tax schedule: firms’ average tax rate jumps from 10 to 20% at the first revenue threshold and from 20 to 30% at the second threshold. Theoretically, the changes in average tax rates above the thresholds should induce two types of behavioral responses. First, firms which would have reported revenue slightly above the thresholds, absent the rate increase, have a strong incentive to reduce their revenue to just below the thresholds (bunching), which lowers the average tax rate they face on their entire profits. Moreover, firms should select into bunching as a function of their costs: firms with low costs gain the most by reporting revenue at the thresholds, while firms with sufficiently large costs (e.g. zero profits) have no incentives to change their revenue. Second, for firms further up the revenue distribution and hence infra-marginal to the bunching behavior, changes in the tax rates create higher incentives to lower revenue and to increase costs.

In administrative corporate tax returns we observe clear excess mass in the revenue distribution just below the thresholds and missing mass above them.
We use the estimated counterfactual change in revenue of the marginal bunching firm to measure the elasticity of revenue with respect to the net of tax rate, following the notch estimation of Kleven and Waseem (2013). More importantly, we find that firms reporting revenue above the thresholds respond to higher tax rates by sharply reducing their reported profits. This is evidenced by a large “donut-hole” discontinuity in average profits by revenue on each side of the thresholds, including for firms infra-marginal to the bunching behavior. The profit response is a mix of firms reducing their revenue and increasing their costs when faced with higher tax rates. Using the revenue elasticity, estimated in the first step, we hold revenue responses constant, such that the remaining profit discontinuity only identifies changes in reported costs. By combining the revenue and cost responses we estimate the elasticity of profits with respect to the net of tax rate. The resulting elasticities are very large: 5 at the first threshold and 3 at the second threshold, an order of magnitude higher than estimates for small firms in OECD countries (Devereux, Liu and Loretz 2014, Patel, Seegert and Smith 2015).

This estimation procedure provides a robust measure of the elasticity of profits. Intuitively, the elasticity of profits is identified from the “donut-hole” discontinuity in profits around the threshold, while bunching is used to separate the revenue and cost elasticities. In contrast, the separation of the revenue and cost elasticities suffers from two limitations: it abstracts from selection into bunching as a function of costs and it only estimates the revenue response of the marginal bunching firm.1 Under heterogeneity in revenue elasticities, this corresponds to the highest elasticity firm and provides an upper (lower) bound to the average revenue (cost) elasticity.
To address these limitations we model selection into bunching as a function of firms’ costs by assuming a counterfactual for the revenue and cost distribution. We then simultaneously estimate the revenue and cost elasticities to match the two key moments: (1) the excess mass at the threshold, (2) the donut-hole discontinuity in costs.2 We find that costs are much more elastic than revenue to changes in the tax rate, and account for over two-thirds of the drop in reported profits.3

1 The bunching estimation measures the counterfactual revenue of the marginal buncher in order to satisfy the constraint that the bunching mass below the notch equals the missing mass above it.
2 We also estimate the share of firms with large adjustment frictions as the share of firms located above the threshold which face a marginal tax rate above one.
3 The large elasticity of costs rationalizes the frequent use of tax policies determined by revenue instead of profits in developing countries. Examples include registration thresholds, large taxpayers units, and corporate tax systems with different rates as a function of revenue such as Costa Rica’s, which exist in over 30 countries (KPMG tax guides).

We perform several robustness tests to support these results. First, the “donut-hole” discontinuity relies on the credible extrapolation to the thresholds of the relation between firms’ average profits and their revenue. We show that this relation is extremely stable and linear away from the thresholds. Second, observable fixed characteristics (industry, geography, years of existence) are similar for infra-marginal firms on either side of the thresholds. Third, the profit discontinuity at the thresholds is not driven by a few outliers but occurs at all percentiles of the profit distribution, across sectors and years.
Fourth, we show that firms’ dynamic behavior mirrors the static patterns: when a firm’s revenue grows past the threshold, its reported profit margin sharply drops and remains lower in subsequent years. On the contrary, firms growing at the same rate but remaining within their tax bracket hardly change their profit margins. The returns to scale estimated from the panel data imply that the cost discontinuity is an order of magnitude too large to be explained by a reduction in economic activity and corresponds to tax evasion.

We use the estimated parameters to inform optimal corporate tax policy. First, regardless of the mechanisms driving responses, the profit elasticities of 5 and 3 severely constrain the range of desirable rates under a profit tax: rates above 17% and 25% are on the wrong side of the Laffer curve for SMEs in Costa Rica. Locally reducing the rates at the second and third tax bracket (20% and 30% respectively) is Pareto improving. Second, we use the estimated revenue and cost elasticities to characterize the optimal tax base and tax rate following the model of Best et al. (2015). We assume that the revenue elasticity corresponds to the real output elasticity and the cost elasticity to the evasion elasticity and simulate the revenue gains from broadening the base while lowering the rate, leaving total profits unchanged. We find that a reform which considerably broadens the base while lowering the rate leads to revenue collection gains of up to 80% for these firms.4 Concretely, two policies could be implemented: (1) moving to a turnover tax (no deductions) with a tax rate of 2.9% realizes 93% of the revenue gains; (2) removing the deductibility of non-wage administrative costs would double the tax base and under a rate of 5% realizes 97% of possible gains.5

Finally, we draw on rich administrative data to study if firms’ revenue responses are driven by evasion, avoidance or production decisions.
In addition to corporate tax returns, we use data on audits, social security, monthly sales tax receipts and the registry of economic groups. We find evidence that for bunching firms under-reporting revenue explains part of the responses: their revenue is more likely to display inconsistencies with third-party reports and it adjusts upwards following audit threats. On the contrary, we find no evidence of specific production and avoidance responses: employment and wage bill are continuous at the thresholds in the social security data, revenue shifting across fiscal years appears limited in monthly sales receipts, and large firms do not divide into smaller ones in the group registry. Although the confidence intervals are not tight enough to reject production responses, the evidence on mechanisms supports revenue evasion and the desirability of a broader base. If profit responses are entirely driven by evasion, then firms facing a 30% tax rate evade taxes on as much as 70% of their profits.

4 We simulate the tax system locally and note that these parameters might not apply to large firms.
5 This is probably a lower bound since non-wage administrative costs appear very elastic in the data.

1.1 Contribution and Related Literature

Estimating the elasticity of corporate profits is challenging. First, corporate tax reforms are endogenous to the economic context and changes in tax rates often happen simultaneously with changes in the base and in enforcement (Kawano and Slemrod 2016). Second, when using variation generated by tax reforms, the most common methodology is to instrument the tax rate change with the counterfactual tax rate change, assuming taxpayers earned their base-year income.6 This estimation is prone to mean reversion and sensitive to specification choices (Gruber and Saez 2002, Kopczuk 2005, Weber 2014).
Third, corporate tax schedules are often flat and less amenable to the discontinuity methods applied to the personal income tax (Saez 2010, Chetty, Friedman, Olsen and Pistaferri 2011, Kleven and Waseem 2013). The discontinuities in the design of the corporate tax in Costa Rica allow us to estimate one of the first profit elasticities in a developing country, which appears substantially higher than in rich countries: Devereux et al. (2014) for the UK and Patel et al. (2015) for the US estimate corporate elasticities of 0.5 for small firms. The large elasticities we find can be rationalized by the weaker enforcement environment; an expanding literature (Pomeranz 2015, Khan, Khwaja and Olken 2015, Naritomi 2016, Brockmeyer and Hernandez 2017) shows that missing third-party information and low fiscal capacity lead to large evasion in developing countries.7

Our results imply that lowering the corporate rate could increase tax revenue from small and medium firms. This resonates with Gorodnichenko et al. (2009) and Kopczuk (2012), who find that tax reforms in Eastern Europe, which substantially decreased the rate and simplified the tax code, led to higher reported income. Similarly, Waseem (2018) finds a large increase in tax revenue from a reform in Pakistan which changed incentives for incorporation.

Our paper also contributes to the literature on optimal taxation under weak enforcement (Emran and Stiglitz 2005, Gordon and Li 2009). We find that costs are very elastic, which complements Carrillo et al. (2017) and Slemrod et al. (2015) who find that following tighter enforcement of revenue by the tax administration, firms report higher revenue but substitute by over-reporting costs, leaving their tax liability unchanged. By separately estimating the profit and revenue elasticities, we can be more precise in our recommendation for a broader tax base, following the work of Best et al. (2015).
We find that broadening the tax base leads to substantial revenue gains, which supports the empirical relevance of the revenue versus production tradeoff.

6 Using this method Gruber and Rauh (2007) estimate a profit elasticity of 0.2 for large US corporations and Dwenger and Steiner (2012) an elasticity of 0.5 for German firms.
7 Tax evasion is not only a developing country issue: Kleven et al. (2011) and Slemrod et al. (2001) use randomized audits to estimate tax evasion in Denmark and Minnesota, respectively; they find tax evasion rates as high as 40% on income not subject to third-party reporting, which is concentrated among the self-employed.

Finally, we contribute to the literature using discontinuities in tax design to estimate tax elasticities. Saez (2010) and Chetty et al. (2011) develop the framework for kink points, extended to notches by Kleven and Waseem (2013).8 In our setting, notches are determined by revenue, but the tax rate applies to profits. We develop a method to adjust for manipulation of the running variable (revenue) to estimate the profit elasticity and separate revenue and cost elasticities.9 We also apply a model-based estimation, to address selection into bunching as a function of costs, a feature of this tax design which also exists in several large middle-income countries, such as India, Indonesia, Thailand and Vietnam.10

The paper is organized as follows. Section 2 introduces the tax system and the theoretical framework. Section 3 presents the data, methods and main results. Section 4 adds structure to improve the estimation of the revenue and cost elasticities. Section 5 shows that firms’ dynamic behavior is consistent with the static distributional results. Section 6 discusses implications for optimal corporate tax policy. Section 7 shows that part of the revenue responses is driven by evasion, with limited evidence of production and avoidance responses. Section 8 concludes.
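The Laffer-curve rates quoted in the introduction (17% and 25% for profit elasticities of 5 and 3) can be checked with a back-of-the-envelope calculation. The derivation below is only a sketch and is not taken from the paper: it assumes that reported profits have a constant elasticity $\varepsilon$ with respect to the net-of-tax rate $(1-\tau)$ and that only revenue from this tax matters. The scale constant $\pi_0$ is a hypothetical normalization.

```latex
\begin{align*}
R(\tau) &= \tau \, \pi(\tau), \qquad \pi(\tau) = \pi_0 (1-\tau)^{\varepsilon} \\
\frac{dR}{d\tau} &= \pi_0 (1-\tau)^{\varepsilon-1}\left[(1-\tau) - \varepsilon\tau\right] = 0
\;\Longrightarrow\; \tau^{*} = \frac{1}{1+\varepsilon} \\
\varepsilon = 5 &: \ \tau^{*} = \tfrac{1}{6} \approx 16.7\%, \qquad
\varepsilon = 3 : \ \tau^{*} = \tfrac{1}{4} = 25\%
\end{align*}
```

Under these assumptions, the statutory rates of 20% and 30% in the second and third brackets lie above the corresponding revenue-maximizing rates, which is consistent with the statement that they are on the wrong side of the Laffer curve.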
2 Tax System and Theoretical Framework

2.1 Corporate Tax System in Costa Rica

Figure 1 presents Costa Rica’s corporate tax schedule. A corporation pays an average tax rate of 10%, 20% or 30% on its profits as a function of its revenue: firms with revenue below the first threshold face a 10% average tax rate, firms with revenue in between the two thresholds face a 20% rate, and firms with revenue above the second threshold face a 30% tax rate.11 A noteworthy feature of the Costa Rican corporate tax design is that the determinant of the tax rate, revenue, is different from the tax base, profits. Profits are defined as revenue minus deductible costs and follow the accrual basis of accounting. A non-exhaustive list of deductible costs includes material inputs, cost of labor, contracted services, insurance payments, interest payments, financial costs, capital depreciation, marketing costs and travel expenses (Article 8 of Law 7092). Although comparing tax bases across countries is challenging due to their multi-dimensionality (Kawano and Slemrod 2016), the definition of the tax base in Costa Rica appears to follow international standards with a few exceptions: payments to foreign consultants cannot exceed 10 percent of revenue, loss carry-backs are never allowed and loss carry-forwards are limited to the manufacturing sector. Importantly, the revenue thresholds only determine the tax liability and do not determine any other policy. The current design was implemented in 1988 and has remained unchanged since.

8 For a thorough review of bunching methods see Kleven (2016).
9 Diamond and Persson (2017) simultaneously developed a method to estimate how manipulation itself impacts future outcomes for the manipulation “compliers”. They apply it to teachers’ grading discretion on students’ earnings. In contrast, our method uses the bunching moment to estimate manipulation of the running variable, which we use to adjust the discontinuity on the upper side of the threshold.
10 With knowledge of the implied marginal tax rate change at the threshold, our method can be extended to any revenue-dependent threshold. For example, Almunia and Lopez-Rodriguez (2018) study the impact of an enforcement threshold on Spanish firms’ reporting behavior, and Asatryan and Peichl (2016) of registration thresholds in Armenia.
11 In 2014, the revenue thresholds were 52,710,000 and 106,026,000 colones, corresponding to 150,000 and 300,000 USD in Purchasing Power Parity. The thresholds are indexed to inflation and grow on average by 4% yearly.

2.2 Theoretical Framework: Baseline

In the model, firms decide on their revenue production and can evade taxes by under-reporting revenue and over-reporting costs. We model tax evasion from the outset, since we find evidence that evasion is important to explain the large elasticities. When evading taxes, firms incur resource costs and risk detection. We use this framework to highlight the potential impact of the Costa Rican corporate tax system on firm behavior and derive empirical predictions.

Consider a firm producing good $y$, subject to a convex cost function $c(y)$. The production costs incurred by the firm are tax-deductible and hence, in the model, a flat tax rate on profits is non-distortionary.12 The firm can under-report revenue, such that revenue evasion is $(y_i - \tilde{y}_i)$, where $\tilde{y}_i$ is reported revenue, and over-report costs, such that cost evasion is $(\tilde{c}_i - c_i)$, where $\tilde{c}_i$ is reported costs. In doing so it incurs resource costs13 and risks detection, which generates a convex cost of evasion $R(y_i - \tilde{y}_i, \tilde{c}_i - c_i)$. Finally, the firm faces the tax rate $\tau$ which applies to reported profit, $\tilde{\pi}_i = \tilde{y}_i - \tilde{c}_i$.
The firm chooses the triplet $(y, \tilde{y}, \tilde{c})$ of revenue to produce, revenue to report and cost to report to maximize its after-tax profit:

$$\Pi_i = y_i - c(y_i) - \tau (\tilde{y}_i - \tilde{c}_i) - R(y_i - \tilde{y}_i, \tilde{c}_i - c_i) \tag{1}$$

To generate heterogeneity while presenting a tractable model we make two more assumptions. First, the cost function takes the form $c(y_i; \phi_i, \alpha_i) = \alpha_i + \frac{k(y_i)}{\phi_i}$, where $\alpha_i$ are fixed costs (equivalent to a demand shifter) and $\phi_i$ is a productivity parameter, which scales variable costs $k(y_i)$. Second, we assume that the cost of evasion function is separable in revenue and cost evasion such that $R(y_i - \tilde{y}_i, \tilde{c}_i - c_i) = h(y_i - \tilde{y}_i) + g(\tilde{c}_i - c(y_i))$. The first assumption serves to generate heterogeneity in costs and is not critical for the results. The second allows us to clearly illustrate the role of revenue and cost reporting responses. There is no uncertainty in the model since the firm knows its type determined by the pair $(\alpha_i, \phi_i)$ and knows the cost of evasion. The firm’s profits are:

$$\Pi_i = y_i - c(y_i; \phi_i, \alpha_i) - \tau (\tilde{y}_i - \tilde{c}_i) - h(y_i - \tilde{y}_i) - g(\tilde{c}_i - c(y_i)) \tag{2}$$

12 We do not pretend that corporate taxation is generally non-distortionary but make this assumption for the tractability of the model. The corporate tax is non-distortionary in a cost of capital model (Jorgenson and Hall 1967) with immediate expensing: if all costs, including returns to capital, are immediately deductible, then the corporate income tax is a tax on pure profits and does not impact production decisions.
13 In addition to risking detection by the tax authority, resource costs from evasion include keeping multiple sets of records, forgoing business relations with formal firms and limiting interactions with banks. See Chetty (2009).
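Equation (2) can be explored numerically once functional forms are chosen. The sketch below is purely illustrative: the quadratic variable cost $k(y) = y^2/2$, the quadratic evasion costs $h(x) = x^2/(2\theta_h)$ and $g(x) = x^2/(2\theta_g)$, and all parameter values are assumptions that do not appear in the paper. Under these assumed forms, maximizing equation (2) gives closed-form choices, which the code evaluates at two tax rates.

```python
def cost(y, phi, alpha):
    """True cost c(y; phi, alpha) = alpha + k(y)/phi, with assumed k(y) = y**2 / 2."""
    return alpha + y**2 / (2.0 * phi)


def after_tax_profit(y, y_rep, c_rep, tau, phi, alpha, theta_h, theta_g):
    """Equation (2) with assumed quadratic evasion costs h and g."""
    h = (y - y_rep) ** 2 / (2.0 * theta_h)                    # cost of hiding revenue
    g = (c_rep - cost(y, phi, alpha)) ** 2 / (2.0 * theta_g)  # cost of inflating costs
    return y - cost(y, phi, alpha) - tau * (y_rep - c_rep) - h - g


def optimal_choice(tau, phi, alpha, theta_h, theta_g):
    """Closed-form maximizer of after_tax_profit under the assumed forms."""
    y = phi                                        # marginal cost y/phi equals 1
    y_rep = y - theta_h * tau                      # marginal cost of revenue evasion equals tau
    c_rep = cost(y, phi, alpha) + theta_g * tau    # marginal cost of cost evasion equals tau
    return y, y_rep, c_rep


# Hypothetical parameter values, chosen only for illustration.
params = dict(phi=100.0, alpha=30.0, theta_h=20.0, theta_g=50.0)
low = optimal_choice(tau=0.10, **params)
high = optimal_choice(tau=0.20, **params)
for label, (y, y_rep, c_rep) in (("tau=10%", low), ("tau=20%", high)):
    print(label, "produced:", y, "reported revenue:", y_rep,
          "reported costs:", c_rep, "reported profit:", y_rep - c_rep)
```

In this toy parameterization, produced revenue is the same at both tax rates, while reported revenue falls and reported costs rise with the rate, so reported profit drops even though real activity is unchanged.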
An interior optimum solution satisfies the first order conditions with respect to $(y, \tilde{y}, \tilde{c})$:

$$1 = \frac{k'(y_i)}{\phi_i} \tag{3}$$

$$h'(y_i - \tilde{y}_i) = \tau \tag{4}$$

$$g'(\tilde{c}_i - c_i) = \tau \tag{5}$$

Equation (3) determines the revenue produced $y$, which in this model is independent of the tax rate and only depends on productivity. Equations (4) and (5) state that the marginal return to revenue and cost evasion, $\tau$, equals the marginal cost of evasion, which depends on the amount evaded. Revenue increases with productivity $\phi_i$ but is independent of the fixed cost draw $\alpha_i$, such that $\frac{dy_i^*}{d\phi_i} > 0$ and $\frac{dy_i^*}{d\alpha_i} = 0$. Costs are given by $c^*(y^*; \phi_i, \alpha_i) = \alpha_i + \frac{k(y^*)}{\phi_i}$ and increase both with productivity $\phi_i$ and with the fixed cost $\alpha_i$, such that $\frac{dc_i^*}{d\phi_i} > 0$ and $\frac{dc_i^*}{d\alpha_i} > 0$.

Under a continuous and differentiable joint distribution of productivity and fixed cost parameters $f_0(\phi, \alpha)$ the distribution of revenue and costs is smooth. We assume that the cost of evasion functions $h(y_i - \tilde{y}_i)$ and $g(\tilde{c}_i - c_i)$ are continuous and differentiable and therefore the distributions of reported revenue and reported costs are also smooth and differentiable.

2.3 Theoretical Framework: Revenue Dependent Notches

A noteworthy aspect of Costa Rica’s corporate schedule is that the average tax rate applied on profits increases from $\tau$ to $\tau + d\tau$ when firms report revenue above the threshold $y^T$. The tax liability is a function of reported revenue $\tilde{y}$ and reported costs $\tilde{c}$:

$$T(\tilde{y} - \tilde{c}; \tilde{y}) = \begin{cases} \tau (\tilde{y} - \tilde{c}) & \text{if } \tilde{y} \leq y^T \\ (\tau + d\tau)(\tilde{y} - \tilde{c}) & \text{if } \tilde{y} > y^T \\ 0 & \text{if } \tilde{y} - \tilde{c} \leq 0 \end{cases} \tag{6}$$

Let us suppose that the above tax system is imposed as a small tax reform over a previously flat corporate tax rate $\tau$. Since only the productivity parameter $\phi_i$ determines firm $i$’s revenue, there exists a productivity threshold $\underline{\phi}$ such that a firm with productivity $\phi = \underline{\phi}$ reports revenue exactly equal to the threshold $\tilde{y} = y^T$, and all firms with $\phi \leq \underline{\phi}$ declare revenue below the threshold $\tilde{y} \leq y^T$.
These firms are not impacted by the tax change. For firms with $\phi > \underline{\phi}$, which locate above the threshold pre-reform, there are two possible responses: (1) reduce reported revenue, either through production or evasion decisions, by an amount such that the new reported revenue equals the threshold or (2) remain above the threshold and face a higher tax rate. Firms which remain above the threshold respond to the higher tax rate by changing reported revenue and reported cost, such that the marginal cost of evasion equals the new tax rate. Firms choose one of these two responses as a function of their productivity and fixed cost draw (i.e. they select into bunching as a function of their revenue distance to the notch and their costs).

For every productivity draw $\phi_i$ in an interval $[\underline{\phi}, \phi_{max}]$ there exists a fixed cost $\alpha_i$ such that all firms within the interval $[\underline{\phi}, \phi_{\alpha}]$ bunch at the threshold. $\phi_{\alpha}$ is determined by the indifference condition between profits at the threshold and profits at the interior solution above the threshold: $\Pi^{Threshold}(y, \tilde{y}^T, \tilde{c} \mid \phi_{\alpha}, \alpha) = \Pi^{Interior}(y', \tilde{y}', \tilde{c}' \mid \phi_{\alpha}, \alpha)$, where $(y, \tilde{y}^T, \tilde{c})$ is the triplet of produced revenue, reported revenue and reported costs at the threshold, and $(y', \tilde{y}', \tilde{c}')$ is the triplet in the interior solution. Firms with $\phi_i > \phi_{\alpha}$ remain above the threshold and adjust their reporting behavior such that the marginal returns from evasion on revenue and costs equal the higher tax rate.

To illustrate the effect of costs on the bunching decision, consider a firm with productivity $\phi_i > \underline{\phi}$ and fixed cost $\alpha_i$, mapping into produced revenue, reported revenue and costs $(y_0, \tilde{y}_0, \tilde{c}_0)$, such that $\tilde{y}_0 > y^T$ before the tax change. After the tax increase past the threshold, the firm can report revenue at the threshold by reducing revenue with a mix of production and evasion. We denote the production response $dy$ and the evasion response $d\tilde{y}$.
The total change in reported revenue is $\Delta y = dy + d\tilde{y}$, such that $\Delta y$ is the revenue distance to the threshold ($\Delta y = \tilde{y}_0 - y^T$). For small $d\tau$,14 the gains from bunching compared to reporting pre-tax change revenue are approximated by:

$$\text{Gains} \approx d\tau (y^T - \tilde{c}_0) + \Delta y (\tau + d\tau) - d\tilde{y} \cdot h'(y_0 - \tilde{y}_0 + d\tilde{y}) - dy \cdot [1 - c'(y_0 - dy)] \tag{7}$$

The first term of equation (7) is a noteworthy feature of the Costa Rican corporate income tax: the gains from lowering revenue to reach the threshold are proportional to the change in the tax rate $d\tau$ and to the firm’s reported tax base at the threshold, $y^T - \tilde{c}_0$. Therefore heterogeneity in the fixed cost parameter generates different incentives to bunch for firms of equal productivity. The other terms of equation (7) state that the firm directly gains by not paying taxes on undeclared and unproduced revenue $\Delta y$, but incurs larger resource costs from the additional revenue under-reporting and loses profits due to lower production. Note that if all responses are due to evasion ($dy = 0$, so that $\Delta y = d\tilde{y}$), then equation (7) simplifies to:

$$\text{Gains} \approx d\tau (y^T - \tilde{c}_0) + d\tilde{y} \left[ (\tau + d\tau) - h'(y_0 - \tilde{y}_0 + d\tilde{y}) \right]$$

14 A small $d\tau$ allows us to ignore intensive margin responses to the higher tax rate above the threshold. In practice $d\tau$ is not small, but we assume so to illustrate the importance of firms’ cost in the bunching decision.

Prediction 1: Bunching at the revenue thresholds

The distribution of productivity and fixed cost $f(\phi, \alpha)$ maps into the distribution of reported revenue and reported costs $\psi_0(\tilde{y}_0, \tilde{c}_0)$ such that a mass of firms bunch at the revenue threshold:

$$B = \int_{\tilde{c}_0} \int_{\tilde{y}_0 = y^T}^{y^T + \Delta y(\tilde{c}_0)} \psi_0(\tilde{y}_0, \tilde{c}_0) \, d\tilde{y}_0 \, d\tilde{c}_0 \tag{8}$$

where $\Delta y(\tilde{c}_0)$ is the revenue response of a firm with counterfactual costs $\tilde{c}_0$. With knowledge of the joint counterfactual distribution of revenue and costs we can estimate the elasticity of revenue $y$ which generates a given amount of bunching.
Absent the counterfactual cost distribution we can still estimate the revenue response of the marginal buncher, defined as the firm with the maximal change in revenue. Given the support of the cost distribution $[\underline{c}_0, \bar{c}_0]$ the marginal buncher’s revenue response is $\Delta y^{mb} = \Delta y(\tilde{c}_0 = \underline{c}_0)$. With the lower support of the cost distribution we can estimate the revenue elasticity of the marginal buncher, which in the model corresponds to the response of the firm with the lowest cost and largest elasticity. Under homogeneous elasticities the marginal revenue elasticity corresponds to the average revenue elasticity.

Prediction 2: Missing mass above the thresholds and no strictly dominated region

A corollary of the first prediction is that the firm density should display missing mass just above the threshold, corresponding to the excess mass at the threshold. The missing mass is a function of the revenue distance to the threshold and the counterfactual cost distribution at that revenue level. Firms are dominated if they face a marginal tax rate above one when reporting revenue above the threshold: in this setting only a subset of firms with sufficiently low costs are dominated above the threshold.15 As a consequence, even in a frictionless world, there is a mass of firms just above the threshold corresponding to firms with low profits. Figure 2 displays predictions 1 and 2. It plots the counterfactual distribution of firms by revenue under a 10% tax rate and the expected density following the introduction of the notch.

Prediction 3: Increased Revenue and Cost Evasion Past the Thresholds

Infra-marginal firms do not bunch at the revenue threshold but face an increase in the marginal return to evasion, which jumps from $\tau$ to $\tau + d\tau$. They respond to the higher tax rate by increasing both revenue and cost evasion until the marginal resource cost of evasion equals the new tax rate.
As a result, firms above the threshold report lower revenue and higher costs than under the lower tax rate and observed profits by revenue jump down discontinuously past the threshold.

15 The standard notch setting (e.g. Kleven and Waseem (2013)) displays a deterministic dominated revenue interval past the threshold, as both the tax base and tax rate depend on income: any taxpayer reporting income in that interval is leaving money on the table since lowering its income to the threshold would increase its after tax income.

3 Behavioral Responses and Tax Elasticities

This section estimates the elasticity of profits with respect to the net of tax rate, and presents a method to separate revenue and cost responses which provides an upper (lower) bound on the revenue (cost) elasticity. Section 4 imposes additional structure on firms’ counterfactual profits to obtain point-estimates of the revenue and cost elasticities. Identification relies on two assumptions: first, absent the average tax rate increase past the thresholds, the revenue distribution would be smooth and continuous, and can be approximated by a flexible polynomial.16 Second, average costs by revenue would not jump discontinuously at the thresholds.

Under these assumptions, we develop a three step methodology. In a first step, we use bunching at the revenue thresholds to measure revenue responses to higher tax rates. In a second step, we use the discontinuities in average costs by revenue, on each side of the thresholds, to estimate the cost response for infra-marginal firms. Using the revenue elasticity estimated in the first step, we adjust the cost discontinuities to take into account intensive margin revenue responses. In a third step, we combine the revenue and cost responses to compute the profit response at the thresholds.

3.1 Data and Evidence of Responses to the Tax Schedule

Costa Rica is a middle-income country, with per capita GDP of $15,000 and stable institutions.
It collects 21% of its GDP in revenue, of which 14% is tax revenue and 7% social security contributions. Our study uses the 2008 to 2014 universe of administrative corporate tax returns, from the Ministry of Finance. All registered corporations are required to file the yearly tax declaration D101 electronically, in which they report profits, revenue and costs. Costs are reported in five line items: administrative costs and wages, material inputs, capital depreciation, interest payments and other costs. The data contain 617,588 firm-year observations and 222,352 unique firms. In total, the corporate income tax raises 18% of tax revenue, roughly 2.5% of GDP. We study the behavior of small firms with yearly revenue below 150 M CRC ($450,000 in PPP). They represent 81% of the firm population, declare 20% of total profits and generate 13% of corporate tax revenue.

Figure 3 shows the key features of the data by revenue bins of half a million CRC, pooling all years together. Panel A shows the distribution of firms by revenue, which displays clear excess mass below the revenue thresholds and missing mass just above. Panel B shows the average profit margin by revenue, where profit margin is defined as profits over revenue.17 Profit margin by revenue follows a downward step function: it remains constant within a given tax bracket and jumps down discontinuously at the thresholds. Average profit margin within the first tax bracket is 17%, falls to 8% in the second bracket and decreases further to 5% in the third. We also observe that firms reporting revenue at the thresholds display profit margins in excess of 22% and 9%, respectively at the first and second thresholds.18 The estimation strategy uses the moments of the distributions of Figure 3 to estimate the profit elasticity and separate it between revenue and cost responses.
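The binned moments behind Figure 3 can be illustrated with a minimal sketch. It assumes that profits equal reported revenue minus reported costs and that amounts are expressed in million CRC; the function name `bin_firms` and the three sample firms are hypothetical and are not taken from the tax data.

```python
import math
from collections import defaultdict

def bin_firms(firms, bin_width=0.5):
    """Group (revenue, costs) pairs into revenue bins of `bin_width` million CRC.

    Returns {bin_start: (number of firms, average profit margin)}, where the
    profit margin is assumed to be (revenue - costs) / revenue.
    """
    grouped = defaultdict(list)
    for revenue, costs in firms:
        if revenue <= 0:
            continue  # the margin is undefined without positive revenue
        start = math.floor(revenue / bin_width) * bin_width
        grouped[start].append((revenue - costs) / revenue)
    return {s: (len(m), sum(m) / len(m)) for s, m in sorted(grouped.items())}

# Hypothetical firms around a 50 million CRC threshold.
sample = [(49.6, 39.68), (49.9, 37.425), (50.2, 46.184)]
binned = bin_firms(sample)
```

Plotting the counts against the bin starts would give a density in the spirit of Panel A, and plotting the average margins would give a step pattern in the spirit of Panel B.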
16 The ability to approximate the counterfactual density with a flexible polynomial is a usual assumption in bunching papers. However, it is not trivial: it requires smooth production functions, credible shapes for the extrapolated distributions and no extensive margin responses. In our context there appears to be support for these assumptions, and the parametric choices (e.g. polynomial order, bunching interval limits) do not impact the estimated parameters.

17 Figure A2 shows average profits (Panel A) and average costs (Panel B) by revenue. Profit margin is unit free and very stable within tax bracket in our data, which highlights the large discontinuity in the tax base at the threshold.

18 As shown in equation 7, firms with low costs have incentives to select into bunching: sufficient heterogeneity in costs for a given level of revenue can lead to large average profit margins for bunchers.

3.2 Revenue Elasticity: Bunching Estimation

3.2.1 Bunching Methodology

Since in the empirical section we only observe reported revenue and reported costs, we drop the tilde notation ($\tilde{y}$ and $\tilde{c}$) used in the model to refer to reported quantities, and simply denote them by $y$ and $c$ in what follows. Counterfactual revenue (costs) are reported revenue (costs) under a flat 10% corporate tax rate. To estimate the change in reported revenue of the marginal bunching firm, we use the integration constraint that the excess mass below the threshold must equal the missing mass above it (Kleven and Waseem 2013). We slice the data into revenue bins of half a million CRC and obtain a counterfactual revenue density by fitting a polynomial of degree five:19

$$F_j = \sum_{k=0}^{5} \beta_k (y_j)^k + \sum_{i=y_l}^{y_u} \delta_i \, \mathbb{1}(y_j = i) + \nu_j \tag{9}$$

where $F_j$ is the number of firms in revenue bin $j$, $y_j$ is the revenue midpoint of interval $j$, $[y_l, y_u]$ is the excluded region and the $\delta_i$'s are dummy shifters for the excluded region.
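A minimal sketch of how equation (9) could be fitted is given here, assuming ordinary least squares on binned counts with NumPy. The function name `fit_counterfactual`, the standardisation of revenue and the synthetic density are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def fit_counterfactual(y_mid, counts, y_l, y_u, degree=5):
    """OLS fit of equation (9); returns the polynomial part evaluated at every bin."""
    y_mid = np.asarray(y_mid, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Standardise revenue so high powers stay well conditioned; this spans the
    # same polynomial space as powers of y_j, so the fit is the same in exact arithmetic.
    z = (y_mid - y_mid.mean()) / y_mid.std()
    poly = np.vander(z, degree + 1, increasing=True)   # columns z^0 ... z^degree
    # One dummy per bin inside the excluded region [y_l, y_u].
    excluded = np.flatnonzero((y_mid >= y_l) & (y_mid <= y_u))
    dummies = np.zeros((y_mid.size, excluded.size))
    dummies[excluded, np.arange(excluded.size)] = 1.0
    coef, *_ = np.linalg.lstsq(np.hstack([poly, dummies]), counts, rcond=None)
    return poly @ coef[: degree + 1]                   # dummies are left out on purpose

# Synthetic example (all numbers hypothetical): a smooth density in which 30% of
# the firms in (50, 55] are relocated to the bin at a threshold of 50.
y = np.arange(30.0, 70.5, 0.5)
smooth = 1000.0 * np.exp(-y / 40.0)
hole = (y > 50.0) & (y <= 55.0)
moved = 0.3 * smooth[hole].sum()
counts = smooth.copy()
counts[hole] *= 0.7
counts[np.isclose(y, 50.0)] += moved

cf = fit_counterfactual(y, counts, y_l=48.0, y_u=55.0)
excess = (counts - cf)[(y >= 48.0) & (y <= 50.0)].sum()
missing = (cf - counts)[hole].sum()
```

In this noise-free example the excluded-region dummies absorb the distortion, so the polynomial part recovers the smooth density and the excess and missing masses coincide.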
We use the estimated $\beta_k$'s to obtain the counterfactual firm distribution by revenue absent the tax change:

$$\hat{F}_j = \sum_{k=0}^{5} \hat{\beta}_k (y_j)^k \tag{10}$$

The estimation procedure requires that the excess mass below the threshold ($E$) equals the missing mass past the threshold ($M$), defined as:

$$\hat{E} = \sum_{j=y_l}^{y^T} (F_j - \hat{F}_j) \quad \text{and} \quad \hat{M} = \sum_{j=y^T}^{y_u} (\hat{F}_j - F_j) \tag{11}$$

Here $y^T$ is the revenue threshold and the bounds of the excluded region $[y_l, y_u]$ are obtained as follows. The lower limit $y_l$ is the first bin whose density is statistically different from a local regression on all bins to its left.20 The upper limit, $y_u = y^T + \Delta y$, is estimated by imposing the restriction that the empirical excess mass ($\hat{E}$) equals the missing mass ($\hat{M}$). Starting from $y_u$ just above the threshold, we estimate equation (9) and compute $\hat{E}$ and $\hat{M}$. For a low value of $y_u$, the excess mass is much larger than the missing mass ($\hat{E} > \hat{M}$). We iteratively increase $y_u$ until the excess mass converges to the missing mass ($\hat{E} = \hat{M}$). The estimated upper bound, $\hat{y}_u$, is the counterfactual revenue of the marginal firm which responds to the tax change. Under heterogeneity in revenue elasticities, this is the response of the highest elasticity firm and therefore provides an upper bound to the size of revenue responses.

19 The order of the polynomial is selected with the Akaike information criterion. Table A1 shows robustness to the polynomial order.

20 We show in Table A1 that changing $y_l$ does not impact the results.

By imposing the restriction that the excess mass equals the missing mass, the point of convergence method generates two potential concerns.21 First, it assumes that there are no extensive margin responses. Extensive responses could occur if firms become informal when faced with higher tax rates. This would generate additional missing mass past the threshold and imply that $E < M$.
In our setting extensive margin responses should play a limited role, as Costa Rica has one of the lowest informality rates in Latin America (ILO 2012), and it is unlikely that growing firms revert to informality.22 With extensive margin responses, the true revenue elasticity is smaller than the estimated one, which is consistent with our interpretation of the estimate as an upper bound on the revenue elasticity. Second, the standard bunching method ignores intensive margin revenue responses past the threshold. Intensive responses imply that above the threshold, the counterfactual firm distribution is higher than the observed distribution. We take into account this second order effect by shifting the counterfactual distribution above the threshold with the factor implied by the estimated revenue elasticity.23

Finally, note that our objective is to estimate the revenue elasticity, defined as:

$$\varepsilon_{y,1-t} = \frac{\%\,\text{change revenue}}{\%\,\text{change (net of marginal tax rate)}} = \frac{\Delta y}{y^T} \cdot \frac{(1-\tau_0)}{(\tau^* - \tau_0)} \tag{12}$$

where $\tau_0$ and $y^T$ are known parameters and $\Delta y$ is estimated with the bunching method. $\tau^*$ is the marginal tax rate faced by the firm. In the case of a notch, and in particular of a notch with selection based on costs, the change in the marginal tax rate is not as straightforward as with a kink. Given the tax liability $T(y - c \mid y)$, we define the implicit marginal tax rate $\tau^*$, for an increase in revenue $\Delta y$, as the change in tax liability over the change in revenue:

$$\tau^* = \frac{T(y^T + \Delta y) - T(y^T)}{\Delta y} = \frac{(\tau_0 + d\tau)(y^T + \Delta y - c) - \tau_0 (y^T - c)}{\Delta y} = (\tau_0 + d\tau) + \frac{d\tau\,(y^T - c)}{\Delta y} \tag{13}$$

where $\tau_0$, $d\tau$, $y^T$ and $\Delta y$ are known or estimated parameters. However, the cost of the marginal buncher $c$ is unknown. From the theory section we know that the marginal buncher is the firm with the lowest costs within its revenue bin. Therefore, the marginal buncher should have costs in the lowest percentile of the cost distribution for its revenue bin.
In practice, to ensure that we estimate an upper bound on the revenue elasticity we assume that the cost of the marginal buncher corresponds to the 10th percentile of the cost distribution.

21 These limitations are also noted in Kleven and Waseem (2013).

22 Another type of extensive margin response occurs if firms display jumps in their production functions. In this case, the missing mass could be generated by firms which never incur the fixed costs of growing past the threshold.

23 The intensive margin adjustment occurs simultaneously with the iterative method to determine $y_u$. In our setting with substantial elasticities, this adjustment slightly reduces the estimated revenue response.

3.2.2 Bunching Results

Figure 4 shows the distribution of firms by revenue and the counterfactual density, estimated from the polynomial fit around each threshold. The firm revenue distribution in Costa Rica follows a power law (Zipf law) away from the threshold, which has been documented extensively across countries (Axtell 2001, Garicano, LeLarge and Van Reenen 2016). The polynomial approximates a standard firm size distribution, and a fully parametric fit with a power law provides similar results: this alleviates concerns about the ad-hoc properties of counterfactuals fitted locally with polynomials (Blomquist, Kumar, Liang and Newey 2015, Seegert, Andrew and Marinho 2017).

The estimated parameters are displayed in the top right corner of each panel. For the first threshold (Panel A), the excess mass is 2.3 times the counterfactual, meaning that there is 3.3 times the density that should be expected. In the absence of the notch, the marginal buncher would have revenue of 58.3 million CRC, 16% higher than the threshold. For the second threshold (Panel B), the excess mass is 1.1 times the counterfactual and the marginal buncher has revenue of 107.7 M CRC, 7.6% higher than the threshold.
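Equations (12) and (13) translate a bunching estimate into an elasticity, and the mapping can be sketched in a few lines. The function names are hypothetical, and the inputs are illustrative only: in particular, the cost of the marginal buncher is not reported in the text, so the value of 46 used here is an assumption chosen purely for the example.

```python
def implicit_marginal_tax_rate(y_T, dy, cost, tau0, dtau):
    """Equation (13): extra tax liability per unit of revenue reported past the notch."""
    return (tau0 + dtau) + dtau * (y_T - cost) / dy

def revenue_elasticity(y_T, dy, cost, tau0, dtau):
    """Equation (12), with the implicit marginal tax rate taken from equation (13)."""
    tau_star = implicit_marginal_tax_rate(y_T, dy, cost, tau0, dtau)
    return (dy / y_T) * (1 - tau0) / (tau_star - tau0)

# Illustrative inputs: threshold normalised to 100, marginal buncher 16% above it,
# hypothetical cost of 46, tax rate rising from 10% to 20%.
tau_star = implicit_marginal_tax_rate(y_T=100.0, dy=16.0, cost=46.0, tau0=0.10, dtau=0.10)
eps_y = revenue_elasticity(y_T=100.0, dy=16.0, cost=46.0, tau0=0.10, dtau=0.10)
```

A lower assumed cost raises the implicit marginal tax rate and therefore lowers the implied elasticity, which is why assigning the marginal buncher a cost above the minimum yields an upper bound.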
Given the estimated revenue responses, we compute with equation (13) the implicit marginal tax rate faced by the marginal buncher: at the first threshold the resulting revenue elasticity with respect to the net of tax rate is 0.33. This implies that firms respond to a 10% reduction in the net of tax rate by reducing reported revenue by 3.3%. At the second threshold, the elasticity of revenue is 0.08. Table 2 reports the parameters and the resulting revenue elasticities at each threshold. Standard errors are estimated from 1,000 bootstrap iterations, resampling from the joint distribution of revenue and costs.24

Although graphically compelling, the behavioral responses to the revenue notches produce moderate revenue elasticities. Three points should be highlighted. First, on a small profit base a modest change in revenue can generate a large profit elasticity. Second, notches generate sizable changes in implicit marginal tax rates, so large bunching is consistent with moderate elasticities. Third, lowering revenue is only one of two possible responses to a higher tax rate, since firms can also reduce their tax liability by increasing costs. We investigate cost responses below.

24 After resampling from the joint distribution of revenue and costs we run the bunching point of convergence method on the new firm density by revenue.

3.3 Cost Elasticity: Donut-Hole Discontinuity

Panel B of Figure 3 presented the step pattern of average profit margin by revenue. Profit margin is visually attractive since it is unit free and, in our data, very stable within tax brackets. However, to quantify the increase in costs due to the higher tax rate, we plot directly in Figure 5 the average of firms' costs by revenue bins, around the first threshold. Importantly, some firms have selected into the revenue range around the threshold, as a function of their costs. From the bunching analysis, we know that selection occurs in the revenue bins which correspond to the excess and missing mass intervals, $[y_l, y_u]$. Therefore, we exclude these intervals from the cost discontinuity analysis by adding dummy variables for the excess and missing mass areas. We measure the discontinuity in costs at the threshold as follows:

$$\text{costs}_j = \alpha + \delta \cdot \mathbb{1}(y_j^{dist} > 0) + \beta_1 \, y_j^{dist} + \beta_2 \, y_j^{dist} \, \mathbb{1}(y_j^{dist} > 0) + \sum_{i=y_l^{dist}}^{y_u^{dist}} \gamma_i \, \mathbb{1}(y_j^{dist} = i) + \epsilon_j \tag{14}$$

where $\text{costs}_j$ represents the average of firms' costs in bin $j$, $y_j^{dist} = y_j - y^T$ is the revenue distance to the threshold and the $\gamma_i$'s are dummy shifters for firms with revenue in the excluded excess and missing mass intervals. $\beta_1$ provides the slope of costs on revenue below the threshold and $\beta_1 + \beta_2$ the slope above the threshold. The parameter of interest is $\delta$, the discontinuity in reported costs at the threshold. This specification directly provides the percentage change in costs at the threshold as $\delta / \alpha$.

Our objective is to measure the discontinuity in costs, holding revenue responses constant. However, the cost discontinuity estimated from equation (14) could entirely be due to intensive margin responses of revenue. To see this, note that the running variable is revenue, which also responds to the increase in the tax rate: absent the tax change, firms in the upper tax bracket would have declared higher revenue, shifting up the costs by revenue relation. Using the estimated revenue elasticity from Section 3.2, we can adjust for the intensive margin revenue responses. For firms in revenue bin $j$, with revenue midpoint $y_j$, the adjusted revenue (counterfactual revenue absent the change in the tax rate) would be:

$$y_j^{adj} = \begin{cases} y_j & \text{if } y_j \le y^T \\ y_j + \hat{\varepsilon}_{y,1-t} \cdot y_j \cdot \dfrac{d\tau}{1-\tau} & \text{if } y_j > y^T \end{cases} \tag{15}$$

We clarify four properties of the revenue adjustment. First, it only applies to firms with revenue above the threshold, since firms below do not face a tax rate change.
Second, for firms with revenue sufficiently past the threshold, the increase in the average tax rate is equivalent to an increase in the marginal rate. Third, the revenue adjustment uses the estimated revenue elasticity: under heterogeneity in revenue responses, the revenue elasticity is an upper bound of the average elasticity. Hence the revenue adjustment is an upper bound to the true adjustment, which implies that the estimated cost elasticity is a lower bound. Fourth, while bunching is local to the thresholds, the discontinuity in cost is estimated from infra-marginal firms. We are therefore implicitly assuming that these firms are comparable and that the parameters apply across the firm distribution.

We apply the revenue adjustment and re-estimate equation 14. The coefficient $\delta$ now measures the increase in reported costs due to the tax change, holding revenue responses constant. The discontinuity in costs is reported in Table 1, with and without the revenue adjustment.25 Figure 5 presents the results graphically for the first threshold: Panel A plots firms' average costs by revenue bin and shows the revenue adjustment, which shifts costs horizontally for firms past the threshold. We then fit separate lines to the right and left of the threshold, excluding the interval impacted by bunching responses, $[y_l, y_u]$. The linear extrapolation to the threshold on the left provides a counterfactual average cost for firms at the threshold under a 10% tax rate and absent the notch. The extrapolation to the right provides the average costs for a 20% tax rate, absent revenue responses. We interpret the discontinuity at the threshold as the change in reported costs due to the increase in the tax rate. Panel B zooms in on the discontinuity in predicted average costs at the first threshold. We estimate a cost jump of 2.5 million at the threshold.
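The revenue adjustment of equation (15) and the donut-hole regression of equation (14) can be sketched together as follows, assuming ordinary least squares on bin averages with NumPy. The function names and the synthetic data (a linear cost schedule with a built-in jump of 2.5 and a revenue elasticity of 0.33) are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def adjust_revenue(y, y_T, eps_y, tau, dtau):
    """Equation (15): shift revenue above the threshold to its level absent the tax change."""
    y = np.asarray(y, dtype=float)
    return np.where(y > y_T, y + eps_y * y * dtau / (1.0 - tau), y)

def donut_discontinuity(y_run, costs, above, excluded, y_T):
    """Equation (14) by OLS, with one dummy per excluded bin; returns (alpha, delta)."""
    dist = np.asarray(y_run, dtype=float) - y_T
    above = np.asarray(above, dtype=float)
    idx = np.flatnonzero(excluded)
    dummies = np.zeros((dist.size, idx.size))
    dummies[idx, np.arange(idx.size)] = 1.0
    X = np.column_stack([np.ones_like(dist), above, dist, dist * above, dummies])
    coef, *_ = np.linalg.lstsq(X, np.asarray(costs, dtype=float), rcond=None)
    return coef[0], coef[1]

# Synthetic bins (hypothetical numbers): costs are 0.8 times counterfactual revenue,
# plus a jump of 2.5 above the threshold; observed revenue above the threshold is
# compressed by the revenue response.
y_T, tau, dtau, eps_y = 50.0, 0.10, 0.10, 0.33
y_obs = np.arange(30.0, 70.5, 0.5)
above = y_obs > y_T
k = eps_y * dtau / (1.0 - tau)
y_cf = np.where(above, y_obs * (1.0 + k), y_obs)
costs = 0.8 * y_cf + 2.5 * above
excluded = (y_obs >= 48.0) & (y_obs <= 55.0)
costs[excluded] += 3.0  # arbitrary contamination from selection into bunching

alpha_raw, delta_raw = donut_discontinuity(y_obs, costs, above, excluded, y_T)
y_adj = adjust_revenue(y_obs, y_T, eps_y, tau, dtau)
alpha_adj, delta_adj = donut_discontinuity(y_adj, costs, above, excluded, y_T)
```

In this example the unadjusted regression overstates the jump, because part of the measured discontinuity comes from the horizontal shift in revenue, while the adjusted regression recovers the built-in jump of 2.5.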
Given the net of tax rate decrease of 11% and counterfactual average costs of 41.97 million at the threshold, the elasticity of cost is:

$$\varepsilon_{c,1-\tau} = \frac{\Delta c|_{y^T}}{c^T} \cdot \frac{(1-\tau_0)}{d\tau} = \frac{-2.55}{41.97} \cdot \frac{0.9}{0.1} = -0.55$$

For a net of tax rate reduction of 10%, firms respond by increasing reported costs by 5.5%. At the second threshold, costs jump by 1.2 million on a 92 million base. Since the second threshold corresponds to a net of tax rate decrease of 12.5%, this implies a cost elasticity of -0.11.

The estimation is equivalent to a donut-hole RD (Almond and Doyle 2011, Cohodes and Goodman 2014) under a local linear extrapolation. This design is credible if the counterfactual conditional expectation of costs as a function of revenue is well approximated by a linear relation. Formally this implies: $E[\text{Costs} \mid \text{Revenue}, \text{No Notch}] = \beta \cdot \text{Revenue} + \epsilon$. To support this assumption we plot the linear and quadratic fit of average costs by revenue, below and above each threshold (Figure A3). In each of the four quadrants (below and above each threshold, away from the thresholds) the quadratic fit is indistinguishable from the linear fit. We then compute the adjusted R-squared from the linear, quadratic and cubic regressions and find that the linear model has the highest adjusted R-squared (Table A2). In Table A3 we investigate the impact on the estimated cost discontinuity of different models and parameters. Columns (1)-(4) correspond to the first threshold and columns (5)-(8) to the second threshold. Columns (1) & (5) show that with a quadratic fit, the cost discontinuity is even larger than under the linear model; if the linearity assumption were introducing bias, we would be underestimating the cost discontinuity and therefore estimating lower bounds on the cost and profit elasticities.
Columns (2) & (6) show the cost discontinuity when the revenue adjustment is computed with a revenue elasticity which decreases as a function of revenue (instead of being constant), as suggested by the size of our two estimates, one at each threshold.26 Finally, columns (3)-(4) and (7)-(8) vary the size of the extrapolation window, which has little impact on the estimated cost discontinuity.

25 The revenue adjustment shifts costs horizontally: under a sufficiently large revenue elasticity, the entire cost discontinuity could be due to intensive margin revenue responses. For this to be the case, the elasticity of revenue would have to be 0.83 at the first threshold and 0.21 at the second, more than twice our estimates.

3.4 Profit elasticity: Combining Revenue and Cost Responses

By combining the revenue and cost responses, we can now estimate the elasticity of profits with respect to the net of tax rate. The elasticity of profits is a central parameter to set optimal tax rates and a sufficient statistic for revenue collection under a flat tax rate. It is defined as:

$$\varepsilon_{\pi,1-\tau} = \frac{\%\,\text{change profits}}{\%\,\text{change (net of tax rate)}} = \frac{\Delta \pi}{\pi^T} \cdot \frac{1-\tau}{\Delta \tau} = \frac{(\Delta y - \Delta c)}{\pi^T} \cdot \frac{1-\tau}{\Delta \tau} \tag{16}$$

where $\Delta \pi$, $\Delta y$ and $\Delta c$ are estimated at the threshold (we drop the $\Delta x|_{y^T}$ notation for simplicity). $\pi^T$ is the counterfactual average profit level at the threshold, assuming a flat 10% tax rate. We already estimated the change in costs at the threshold $\Delta c|_{y^T}$ and compute the change in revenue at the threshold $\Delta y|_{y^T}$ using the estimated revenue elasticity as: $\Delta y|_{y^T} = y^T \cdot \varepsilon_{y,1-\tau} \cdot \frac{\Delta \tau}{1-\tau}$.

Table 2 summarizes the elasticity estimates and changes in revenue, costs, and profits at each threshold. Standard errors are computed from 1,000 bootstrap iterations, where we resample with replacement from the joint distribution of revenue and cost. At the first threshold, we estimate a profit elasticity with respect to the net of tax rate of 4.9 and at the second threshold an elasticity of 2.9.
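The elasticity arithmetic in Sections 3.3 and 3.4 can be checked with a short sketch. It uses the convention that a tax increase of $d\tau$ changes the net of tax rate by $-d\tau/(1-\tau)$ in percentage terms, and the flat-tax Laffer formula $1/(1+\varepsilon_{\pi,1-\tau})$ stated in the paper's footnote on the revenue maximizing rate. The function names are hypothetical, and the inputs are the rounded figures quoted in the text.

```python
def net_of_tax_elasticity(dx, x, tau, dtau):
    """Percent change in x divided by the percent change in the net of tax rate (1 - tau)."""
    pct_change_x = dx / x
    pct_change_net_of_tax = -dtau / (1 - tau)  # a tax increase lowers 1 - tau
    return pct_change_x / pct_change_net_of_tax

def laffer_rate(profit_elasticity):
    """Revenue maximizing flat rate, 1 / (1 + elasticity of profits)."""
    return 1 / (1 + profit_elasticity)

# Cost response at the first threshold: +2.55 million on a base of 41.97 million
# when the rate rises from 10% to 20%.
cost_eps = net_of_tax_elasticity(dx=2.55, x=41.97, tau=0.10, dtau=0.10)
print(round(cost_eps, 2))          # -0.55
print(round(laffer_rate(4.9), 3))  # 0.169
print(round(laffer_rate(2.9), 3))  # 0.256
```

With the rounded elasticities the implied rates are about 17% and 26%; any small gap with rates computed from unrounded estimates would presumably reflect rounding.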
These are very large elasticities and imply that the revenue maximizing rate is 17% for micro firms and 25% for small firms.27

Another result in Table 2 is the comparison between the cost and revenue elasticities. Slightly over 60% of the discontinuity in profits is due to an increase in reported costs and 40% to a decrease in reported revenue. Reported costs appear more elastic than revenue to a change in the tax rate, even though we estimated a lower bound on the cost elasticity and an upper bound on the revenue elasticity. The difference is statistically significant at the first threshold and holds qualitatively at the second. In Section 4 we estimate more precise revenue and cost elasticities.

Finally, note that the profit elasticity is robust to the bunching estimation. The profit elasticity hinges on the assumption that, absent the tax change, average reported costs by revenue follow the linear relation which applies away from the threshold. The large drop in profits past the thresholds is the key identifying variation for the profit response, and the bunching estimation decomposes the response into the revenue and cost components. Due to the revenue adjustment term used to estimate the cost discontinuity, a higher revenue elasticity implies a larger adjustment, which reduces the cost elasticity. This mechanical negative correlation between the revenue and cost elasticities leaves the profit elasticity unchanged.

26 The revenue adjustment uses the estimated elasticity at the threshold and applies it to all firms with revenue above the threshold. In Columns (2) and (6) of Table A3 we instead assume a linearly decreasing revenue elasticity as a function of revenue, with a slope proportional to the drop in elasticities between the first and second threshold.

27 Under a flat corporate tax, the government revenue maximizing rate is $\tau^{Laffer} = \frac{1}{1 + \varepsilon_{\pi,1-\tau}}$. Note that we only obtain local estimates of profit elasticities which might not apply to large firms. Hence we do not claim that lowering rates uniformly would increase tax revenue in Costa Rica.

3.5 Heterogeneity and Robustness of Profit Discontinuity

Ideally we would repeat the previous analysis for each economic sector; however, the estimation relies on large sample sizes, such that the revenue distribution and average costs by revenue are smooth and the counterfactual extrapolation credible. To study heterogeneity by sector, we simply estimate the "donut-hole" discontinuity in profit margin for each sector (following equation 14), without applying the revenue adjustment. Table 3, Column 1 shows the sector level profit margin discontinuities $\delta_s$, separating the economy into fifteen sectors. All sectors, except "NGO and public administration", display a sharp drop in profits past the threshold. The proportional change in profit margin is remarkably similar across sectors (Column 3): infra-marginal firms above the threshold always report margins 40 to 50% lower than same sector firms below the threshold. Figure A5 illustrates this result by plotting the firm distributions and average profit margin for each sector.

Given the homogeneity of profit responses, sorting of firms from different sectors is unlikely to provide a key explanation. However, we directly test if firms sort on either side of the thresholds by estimating a donut-hole discontinuity (equation 14) for the few fixed characteristics in the tax data: province, economic sectors and number of years filing taxes. Table A4 shows the resulting coefficients: overall there appears to be very limited sorting based on these fixed characteristics. Among the 46 regressions, only two coefficients are significant, corresponding to the share of firms in real estate and in consultancies, which fall at the first threshold.28 Finally, Figure A4 shows that the drop in profit margin past the threshold is not driven by a few outliers but by an entire downward shift in the profit distribution.
It plots the quartiles of profit margin against revenue: the median profit margin starts at 6% below the threshold and drops to 3% above it. We observe a similar proportional decrease at the 25th and 75th percentiles.29

28 One way to deal with the possibility of sorting in these two sectors is to reweigh the discontinuity in cost holding constant the sector composition. This explains less than 3% of the drop in profit margin.

29 We also confirm that bunching and the profit discontinuity occur in every year (Figure A6).

4 Estimating Elasticities with Selection into Bunching as a Function of Cost

4.1 Rationale for Added Structure

While the profit elasticity estimated in section 3 appears robust to assumptions, the separation into revenue and cost responses is more fragile. The bunching method abstracts from selection into bunching as a function of costs and only estimates the revenue response of the marginal bunching firm. Under heterogeneity in revenue elasticities, this corresponds to the highest elasticity firm (Kleven and Waseem 2013) and provides an upper (lower) bound to the average revenue (cost) elasticity. To address these limitations we model selection into bunching as a function of firms' costs by assuming a specific counterfactual for the entire profit margin distribution. We then simultaneously estimate the revenue and cost elasticities to match the two key moments: (1) the excess mass at the threshold, (2) the donut-hole discontinuity in average costs. For any pair of revenue and cost elasticities, we compute the number of predicted bunching firms to match the empirical bunching.30 A higher revenue elasticity increases both bunching and the cost discontinuity, while a larger cost elasticity increases the discontinuity but reduces bunching, since the interior solution of facing the higher tax rate is less costly when firms can easily adjust reported costs.
Finally, we estimate a third parameter: the share of firms which cannot adjust their revenue due to large frictions,31 which we measure as the ratio of the number of firms reporting revenue just past the threshold while facing a marginal tax rate above one to the corresponding counterfactual number. The larger the share of firms which cannot adjust their revenue, the higher the revenue elasticity needed to match the empirical bunching.32

To obtain a joint counterfactual of revenue and costs we (1) maintain the counterfactual revenue distribution fitted in section 3 and (2) assume that, absent the tax rate increase, the entire distribution of profit margin by revenue stays constant above the threshold. We use the profit margin distribution of firms with revenue below the threshold as a counterfactual for margins of firms with revenue above the threshold. Specifically, we assume that the profit margin distribution of firms with 40 to 45 million CRC in revenue would apply to firms with revenue 10 to 30% larger, absent the tax change. If this is a credible assumption, then firms in revenue intervals not impacted by bunching should display stable profit margin distributions. Figure A7 shows that this is the case for revenue intervals 10 to 20% lower than the threshold, and we never reject the null of identical distributions in a Kolmogorov-Smirnov test.33 Thereafter, we use the profit margin distributions of Figure A7 (Panel A) as the counterfactual distribution for firms above the threshold under a flat 10% tax rate.

30 This method is akin to the Bunching-Hole method in Kleven and Waseem (2013), in that we fill the bunching mass from the hole region, with the addition of selection into bunching based on costs.

31 Chetty and Saez 2013, Gelber et al. 2013 show the importance of frictions to explain taxpayers' behavior.

32 Our setting does not have a revenue interval where all firms' behavior is dominated, since only firms with sufficiently low costs have incentives to lower their revenue to the threshold. Empirically we measure the ratio of firms reporting revenue just above the threshold which face a marginal tax rate above one, compared to the counterfactual.

33 Figure A7 (B) shows that the profit margin distributions are also stable in revenue bins 20-30% past the threshold.

4.2 Model with Selection into Bunching as a Function of Cost

From section 2.3, the bunching mass is a function of counterfactual revenue and costs, $\tilde{y}_0$ and $\tilde{c}_0$:

$$B = \int_{\tilde{c}_0} \int_{\tilde{y}_0 = y^T}^{y^T + \Delta y(\tilde{c}_0)} \psi_0(\tilde{y}_0, \tilde{c}_0) \, d\tilde{y}_0 \, d\tilde{c}_0$$

We can replace counterfactual costs, $\tilde{c}_0$, with counterfactual profit margin, $\tilde{m}_0$, and obtain:34

$$B = \int_{\tilde{m}_0} \int_{\tilde{y}_0 = y^T}^{y^T + \Delta y(\tilde{m}_0)} \phi_0(\tilde{y}_0, \tilde{m}_0) \, d\tilde{y}_0 \, d\tilde{m}_0 = \int_{\tilde{m}_0} \int_{\tilde{y}_0 = y^T}^{y^T + \Delta y(\tilde{m}_0)} \phi_{\tilde{y}_0}(\tilde{y}_0) \, \phi_{\tilde{m}_0}(\tilde{m}_0) \, d\tilde{y}_0 \, d\tilde{m}_0 \tag{17}$$

where the second equality relies on the assumption that the counterfactual profit margin distribution is stable and independent of revenue (as shown in 4.1). The counterfactual distribution of revenue $\phi_{\tilde{y}_0}(\tilde{y}_0)$ is obtained from the polynomial fit of Figure 4 and the counterfactual distribution of profit margin $\phi_{\tilde{m}_0}(\tilde{m}_0)$ corresponds to the profit margin distribution of smaller firms (Figure A7, Panel A). We rewrite equation 17 to first integrate over profit margin and then over revenue:

$$B = \int_{\tilde{y}_0 = y^T}^{\infty} \phi_{\tilde{y}_0}(\tilde{y}_0) \int_{\bar{m}(\tilde{y}_0)}^{1} \phi_{\tilde{m}_0}(\tilde{m}_0) \, d\tilde{m}_0 \, d\tilde{y}_0 \tag{18}$$

which highlights that, for a given counterfactual revenue $\tilde{y}_0$, all firms with counterfactual profit margin above $\bar{m}(\tilde{y}_0)$ should bunch. The next steps link the profit margin level $\bar{m}(\tilde{y}_0)$, at which a firm is indifferent between bunching and the interior solution, with the elasticities of revenue and costs. From section 3.2, the definition of the elasticity of revenue is:

$$\varepsilon_{y,1-\tau} = \frac{\Delta y}{y} \cdot \frac{1-\tau_0}{d\tau} = \frac{\Delta y}{y} \cdot \frac{1-\tau_0}{\tau^* - \tau_0} \tag{19}$$

where $\tau^*$ is the implicit marginal tax rate that a firm faces by reporting additional revenue past the threshold, that is, the change in tax liability per unit of revenue between the threshold and the interior solution:

$$\tau^* = \frac{T(y^T + \Delta y) - T(y^T)}{\Delta y} = (\tau_0 + d\tau) + \frac{d\tau\,(y^T - c_0) - (\tau_0 + d\tau) \cdot \Delta c}{\Delta y} \tag{20}$$

Replacing $\tau^*$ from equation 20 into equation 19:

$$\varepsilon_{y,1-\tau} = \frac{\Delta y}{y} \cdot \frac{1-\tau_0}{\tau^* - \tau_0} = \frac{(\Delta y)^2}{y} \cdot \frac{(1-\tau_0)}{d\tau \cdot \Delta y + d\tau\,(y^T - c_0) - (\tau_0 + d\tau) \cdot \Delta c} \tag{21}$$

For an elasticity of revenue $\varepsilon_{y,1-\tau}$, distance to the threshold $\Delta y$ and cost response $\Delta c$, there is a counterfactual cost level, $\bar{c}_0$, such that all firms with cost lower than $\bar{c}_0$ bunch. $\Delta c$ is the increase in cost that the firm would report if it faced the higher tax rate, which is a function of the cost elasticity ($\Delta c = \frac{\varepsilon_{c,1-\tau} \cdot c_0 \cdot d\tau}{1-\tau_0}$). Note that we are assuming a homogeneous cost elasticity and, as a result, the cost response of bunchers had they not bunched equals the average cost response, estimated from the discontinuity.35

34 Reported profit margin is $\tilde{m} = \frac{\tilde{y} - \tilde{c}}{\tilde{y}}$; hence knowledge of any pair from $(\tilde{y}, \tilde{m}, \tilde{c})$ determines the remaining one.

35 Hence we are not allowing for selection into bunching as a function of cost responses, which is equivalent to assuming that bunchers would have had the same cost response as infra-marginal firms had they not bunched.

Finally we reorder equation 21, with $y = y^T + \Delta y$, to have the cost threshold, $\bar{c}_0$, on the left hand side and then transform it into the equivalent profit margin threshold, $\bar{m}_0$:

$$\bar{c}_0 = \left[ y^T + \Delta y - \frac{(\Delta y)^2 \cdot (1-\tau_0)}{d\tau \cdot \varepsilon_{y,1-\tau} \cdot (y^T + \Delta y)} \right] \Big/ \left[ 1 + \frac{(\tau_0 + d\tau) \cdot \varepsilon_{c,1-\tau}}{1-\tau_0} \right] \iff$$

$$\bar{m}_0 = 1 - \left[ 1 - \frac{(\Delta y)^2 \cdot (1-\tau_0)}{d\tau \cdot \varepsilon_{y,1-\tau} \cdot (y^T + \Delta y)^2} \right] \Big/ \left[ 1 + \frac{(\tau_0 + d\tau) \cdot \varepsilon_{c,1-\tau}}{1-\tau_0} \right] \tag{22}$$

Equation 22 states that a firm with counterfactual distance to the threshold $\Delta y$ will bunch, given a pair of revenue and cost elasticities $(\varepsilon_{y,1-\tau}, \varepsilon_{c,1-\tau})$, if its counterfactual profit margin is above $\bar{m}_0$. We implement this formula for each revenue bin past the threshold.
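The bin-by-bin use of equation 22 can be sketched as follows. The sketch assumes that the cost elasticity is entered as a positive magnitude, so that the implied cost response $\Delta c$ is an increase in reported costs; this sign convention is an assumption made here, since the paper reports the cost elasticity with a negative sign. The function names, the margins and the firm count are hypothetical.

```python
def margin_threshold(dy, y_T, eps_y, eps_c, tau0, dtau):
    """Equation 22: counterfactual profit margin above which a firm bunches.

    dy is the counterfactual distance to the threshold; eps_c is entered as a
    positive magnitude (assumed convention, see the lead-in).
    """
    y0 = y_T + dy
    revenue_term = dy ** 2 * (1 - tau0) / (dtau * eps_y * y0 ** 2)
    cost_term = (tau0 + dtau) * eps_c / (1 - tau0)
    return 1 - (1 - revenue_term) / (1 + cost_term)

def predicted_bunchers(counterfactual_count, margins, m_bar):
    """Counterfactual firms in a bin times the share of counterfactual margins above m_bar."""
    share = sum(m > m_bar for m in margins) / len(margins)
    return counterfactual_count * share

# Hypothetical bin located 5 million past a threshold of 50 million.
m_bar = margin_threshold(dy=5.0, y_T=50.0, eps_y=0.25, eps_c=0.62, tau0=0.10, dtau=0.10)
bunchers = predicted_bunchers(100, [0.10, 0.20, 0.40, 0.50], m_bar)
```

Under this convention the threshold margin rises with the distance to the notch and with the cost elasticity, and falls with the revenue elasticity, in line with the comparative statics described in section 4.1.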
Given the counterfactual number of firms and profit distribution in bin $j$ with distance $\Delta y_j$, we compute the number of firms which bunch, for a pair of revenue and cost elasticities.

4.3 Results

We estimate the pair of revenue and cost elasticities with an iterative process, using as initial values the revenue and cost elasticities $\varepsilon^{Step\,1}_{y,1-\tau}$ and $\varepsilon^{Step\,1}_{c,1-\tau}$ obtained in section 3. This combination of elasticities applied to the model predicts substantially more bunching than observed. We re-estimate the revenue elasticity $\varepsilon^{Step\,2}_{y,1-\tau}$, such that the number of predicted bunchers equals the excess mass at the threshold. With the resulting revenue elasticity, $\varepsilon^{Step\,2}_{y,1-\tau}$, we measure the cost response to match the "donut-hole" discontinuity (adjusted with the revenue elasticity) and obtain a new cost elasticity, $\varepsilon^{Step\,2}_{c,1-\tau}$. We iterate this process until we converge to the fixed point $(\hat{\varepsilon}_{y,1-\tau}, \hat{\varepsilon}_{c,1-\tau})$, which matches (1) the excess mass and (2) the cost discontinuity.36 We then add a third parameter: the share of firms facing large adjustment costs. We measure it as the ratio of firms with revenue in the 2M interval past the threshold which face an implicit marginal tax rate above one, to the counterfactual, and find that 11% of firms cannot adjust their revenue.37 We repeat the iteration process with this new parameter: at the first threshold, the estimated revenue elasticity is 0.25 and the cost elasticity is -0.62.38 This implies that 69% of the profit response is due to an increase in reported costs and 31% to a decrease in reported revenue. The share of cost to revenue response is larger than in section 3, which was expected since section 3 recovered an upper bound on the revenue elasticity.

Finally, note that the above analysis treats revenue and cost responses as independent decisions.
In practice, there could be substitutability across revenue and cost evasion, as audit triggers might depend on the overall tax evasion level. This changes the interpretation of the estimated revenue elasticity: since bunching firms face incentives to maximize evasion on the revenue margin, we estimate the maximum revenue elasticity when cost evasion is substituted for additional revenue evasion. Hence potential substitution across evasion technologies does not impact the qualitative result that it is easier for firms to over-report cost than under-report revenue. Similarly, the profit elasticity estimated from the discontinuity is not impacted by the revenue elasticity interpretation.

36 At each iteration we also re-estimate the counterfactual density distribution of firms due to the correction for intensive margin responses on the upper side of the threshold. This correction term only has second order effects.

37 We rewrite equation 18 with the adjustment friction as: $B = (1 - \alpha) \int_{\tilde{y}_0 = y^T}^{\infty} \phi_{\tilde{y}_0}(\tilde{y}) \int_{\bar{m}(\tilde{y})}^{1} \phi_{\tilde{m}_0}(\tilde{m}) \, d\tilde{m} \, d\tilde{y}$, where $\alpha$ is the share of firms with large adjustment frictions.

38 We present a graphical illustration in Figure A8. The number of predicted bunchers is the triangular area between the counterfactual firm density and the curves. Panel A displays these curves for three values of the revenue elasticity, holding the cost elasticity constant. Panel B shows the result for the last iteration: in this scenario the revenue and cost elasticities predict both bunching and the cost discontinuity.

5 Dynamic Responses

Thus far we combined the yearly distributions to study the static impact of the tax system on firms’ reporting behavior. In this section we instead analyze the change in reporting behavior across years as a function of firms’ tax brackets. The objective is twofold. First, to observe if the cross-sectional drop in profit margin at the threshold is mirrored by firms’ dynamic profit reporting behavior when changing tax brackets.
Second, to obtain a measure of returns to scale in order to bound the share of cost responses that could be explained by real production responses. In the model of section 2, under real production responses firms limit their revenue when faced with a higher tax rate; if they display large increasing returns to scale, then this generates a discontinuity in costs around the threshold, as firms on the upper side have reduced their profitability.

We use the panel dimension of the data and study changes in firms’ reported profit margin as a function of their tax brackets at time t and t+1. Figure A9 plots the average yearly profit margin change from t to t+1, as a function of firms’ tax bracket in year t and their tax bracket at t+1. On average firms which remain in the same bracket in consecutive years slightly increase their profit margins: by 0.7% for firms remaining in the first bracket, 0.5% in the second bracket and 0.2% in the third bracket. On the contrary, firms jumping to a higher tax bracket declare lower profit margins compared to the previous year. For firms jumping from the 10% rate to the 20% rate, the profit margin falls by 2.7% and for those jumping from the 20% rate to the 30% rate the drop is 0.5%. Symmetrically, firms which fall to a lower bracket declare a higher profit margin.

The simple comparison of average profit margin by tax brackets does not control for firms’ revenue growth: in the absence of a tax rate change, fast-growing firms could still face different profit margin changes than slow-growing firms. To address this issue, we study firms’ yearly profit margin change as a function of base year revenue, controlling for firms’ revenue growth. Figure 6 plots average profit margin changes between year t and t+1 for several base year revenue bins. Each panel is conditioned on the firms’ revenue growth, ranging from 1-3 million to 7-9 million.
In between the dashed lines are the revenue bins containing firms which, due to their revenue growth, change tax bracket from year t to t+1 and face a 10 percentage point increase in their tax rate. On average these firms decrease their reported profit margin by 2 to 4%. We compare this change in profit margin to that of firms growing at the same rate within a tax bracket. The firms in the revenue bins to the left of the first dashed line grow within the 10% tax rate bracket, and the firms to the right of the second dashed line grow within the 20% tax rate bracket; both groups slightly increase their profit margin. Figure A10 shows that this result applies in reverse for shrinking firms: firms which face a lower bracket, as a result of a drop in their revenue, increase their profit margin by 3% to 5%, while firms shrinking within their initial tax bracket do not change their reported profit margin. Taken together, this implies that the within firm drop in profit margin is correlated with the change in the tax rate.

We also investigate the relation between profit margin and revenue with a panel regression:

$$margin_{it} = \alpha_i + \gamma_t + \beta \cdot y_{it} + \delta \cdot \mathbb{1}(\tau_{it} = \tau + d\tau) + \sum_{j = y_l}^{y_u} \psi_j \cdot \mathbb{1}(y_j = j) + \varepsilon_{it} \tag{23}$$

where $margin_{it}$ is firm $i$’s profit margin in year $t$, $\alpha_i$ and $\gamma_t$ are respectively firm and year fixed effects, $y_{it}$ is revenue of firm $i$ at time $t$, $\tau_{it}$ is the tax rate faced by firm $i$ at time $t$ and the $\psi_j$ are dummy shifters for the revenue intervals impacted by bunching. The coefficient $\delta$ captures the change in profitability when crossing the threshold and $\beta$ measures the return to profitability from higher revenue. We run this model separately at the first and second threshold, and include all firm-year observations with revenue in a 70 million CRC window around each threshold. Table A5 presents the results. The dummy on jumping to the higher tax bracket, $\hat{\delta}$, is negative and significant at each threshold.
Firms crossing the first threshold decrease their profit margin by 3.06% (on a 14.56% base) and firms crossing the second threshold decrease their profit margin by 0.86% (on a 5.62% base). In addition, in the year when firms bunch they exhibit abnormally high profit margins, which is consistent with firms selecting into bunching in years with low costs. The coefficients on revenue, $\hat{\beta}$, show that conditional on staying within the same tax bracket, firms display very small increasing returns to scale. A firm which grows within the first tax bracket by 1 million in revenue increases its profit margin by .012% and a firm which grows within the second tax bracket by 1 million increases its profit margin by .007%.

Using these estimates to proxy for average returns to scale we ask the following question: if all revenue responses above the threshold are production responses, what share of the discontinuity in costs can we mechanically explain with increasing returns to scale? In section 3 we estimated an upper bound elasticity of revenue of 0.33, which implies that a firm lowers its revenue by 3.6% when faced with a net of tax increase of 11%. Using the returns to scale estimate of .012% at the first threshold, we can only explain an increase in cost of 0.1 million (a 0.2% decrease in profit margin),39 while the actual estimated cost increase at the threshold, adjusted for revenue responses, is 2.55 million. Hence the cost discontinuities at the thresholds cannot be explained by production responses in the standard model, given observed returns to scale.40

39 These numbers are obtained from the following calculation: a firm with revenue of 50 million would decrease its revenue by 50 ∗ 0.036 = 1.8 million. This mechanically explains a 1.8 ∗ 0.012 = 0.22% decrease in profit margin.

However we have not discarded avoidance responses: a potential explanation is that the profit margin drop is due to firms shifting costs across fiscal years.
Under this hypothesis, the profit margin of firms changing tax bracket should bounce back at t+2. Figure A11 does not validate this hypothesis. It shows the change in profit margin at t+1 and t+2, compared to the base year t, for firms which jumped to the 20% bracket at t+1 and remained in the 20% bracket at t+2. The average profit margin at t+2 remains lower than at t, and indistinguishable from t+1.

Overall, we find that (1) cost discontinuities at the thresholds cannot be mechanically explained by real responses given observed returns to scale, and (2) in years when firms face higher tax rates they report lower profit margins and higher costs. These results reinforce the cross-sectional results, and show that the discontinuity in cost is not simply due to a static selection pattern.

6 Optimal Tax Policy

6.1 Laffer Rates and the Current Tax System

The estimated profit elasticities imply that a 1 percent decrease in the net of tax rate leads to a drop in reported profits of 5% for small firms and 3% for slightly larger firms, an order of magnitude larger than the profit elasticity estimated for similar sized firms in rich countries (Devereux et al. 2014, Patel et al. 2015). The tax rate maximizing revenue is $\tau^{max} = \frac{1}{1 + \varepsilon_{\pi,1-\tau}}$: regardless of the mechanisms driving responses, rates above 17% and 25% are on the wrong side of the Laffer curve.
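The two Laffer rates quoted above follow directly from the formula for the revenue-maximizing rate, one over one plus the profit elasticity with respect to the net of tax rate. The short Python check below is a sketch of that arithmetic only. It assumes a constant elasticity and plugs in the rounded elasticities of 5 and 3 mentioned in the text; the function name is ours, not the paper's.

```python
def laffer_rate(profit_elasticity):
    """Revenue-maximizing tax rate when the tax base has a constant
    elasticity with respect to the net of tax rate (1 - tau)."""
    return 1.0 / (1.0 + profit_elasticity)


if __name__ == "__main__":
    # Rounded elasticities quoted in the text for the two thresholds.
    for eps in (5.0, 3.0):
        print(f"elasticity {eps:.0f}: revenue-maximizing rate = {laffer_rate(eps):.1%}")
    # elasticity 5: revenue-maximizing rate = 16.7%
    # elasticity 3: revenue-maximizing rate = 25.0%
```

With an elasticity of 5 the rate is about 17%, and with an elasticity of 3 it is 25%, which matches the two cutoffs stated in the paragraph.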
Lowering locally the rates applied to the second and third tax brackets would be Pareto improving by increasing tax revenue while lowering production distortions.41 These high elasticities might not characterize the responses of large firms, which remit 80% of profits.42 Indeed, based on two estimates, one at each threshold, the profit elasticity appears to decrease with revenue.43 A falling profit elasticity with size provides a plausible rationale for tax systems with average tax rates as a function of revenue: such systems tag firms based on revenue, which we find is hard to adjust, and apply increasing tax rates satisfying an inverse elasticity rule.

40 While we rule out real responses explaining the cost discontinuity with smooth production functions, other production functions could generate a cost discontinuity if the decision to cross the threshold is correlated with incurring fixed costs. However, reported capital stock appears continuous across the thresholds. Another threat is “lazy” reporting: as the tax rate increases, firms have more incentives to file real costs they did not use to file. However note that the tax rate is always positive, which mitigates this concern.

41 Gorodnichenko et al. 2009 and Kopczuk 2012 both find that flat tax reforms in Eastern Europe, which decreased substantially the rate and simplified the tax code, led to large increases in reported income.

42 We denote by large firms all firms not considered in the analysis, that is with revenue above half a million USD.

43 However large firms could have access to sophisticated evasion schemes (e.g. transfer pricing and debt shifting) and hence the profit elasticity might not necessarily fall with size.

In addition, the low initial rate on small firms encourages formalization: once firms are registered, reverting to informality could be harder and increasing enforcement might increase tax collection in the medium run.
Note that increasing marginal tax rates on profits cannot achieve the same outcome; while it reduces bunching incentives, it generates a loss in tax revenue as infra-marginal firms with large profits and low elasticities reduce their tax bill on the initial portion of their profits without substantially increasing their reported base.

6.2 Optimal Tax Rate and Tax Base

In addition to choosing the tax rate, the government can broaden the tax base by limiting taxable deductions. While a deviation from a pure profit tax violates production efficiency (Diamond and Mirrlees 1971), it also reduces tax evasion, thereby increasing tax revenue. Best, Brockmeyer, Kleven, Spinnewijn and Waseem (2015) present this trade-off clearly: a broadening of the tax base distorts production proportionally to the elasticity of real output with respect to the effective tax rate, but generates revenue by reducing returns to tax evasion, proportionally to the evasion elasticity. The key parameter is the ratio of the evasion elasticity to the real output elasticity. A limitation of their study is that the output elasticity is unknown and has to be assumed. Here we can use the estimated revenue elasticity as an upper bound to the real output elasticity, and reformulate the estimated cost elasticity as an evasion elasticity. With these parameters, we simulate the model of Best et al. (2015) and jointly estimate the optimal tax rate and tax base, under the constraint that total firm profits are unchanged under the new tax policy.44 We provide complete implementation details in Appendix B.

We find that the optimal tax policy considerably broadens the base, to only allow 21% of costs to be deducted, while simultaneously lowering the rate to 3.4%. This generates revenue gains of 79%, holding total firms’ profits constant. Switching to a pure turnover tax (no deductions allowed) with a tax rate of 2.9% leads to a revenue gain of 74%, 93% of realizable gains.
More generally the optimal profile is fairly flat over a range of pairs of tax bases and rates (Figure B1, panel (a)), such that, for example, setting the base to a half with a rate of 5% would generate revenue gains of 77%, 97% of possible gains. Limiting taxable deductions by a half corresponds to a concrete policy: the removal of tax deductibility of administrative costs (excluding wages), which appear to respond strongly to tax incentives (Figure A12). Hence targeting these particular deductions could lead to even larger revenue gains than estimated. Finally, note that the revenue elasticity is certainly an upper bound of the true output elasticity since we find that part of the revenue responses can be explained by tax evasion (Section 7). This reinforces the desirability of a broader tax base with a lower rate and implies that our estimates of revenue gains represent lower bounds.

44 Note that we only estimate this model locally, that is for firms in a 60 million window around the first threshold.

7 Mechanisms: Evasion, Avoidance and Production Responses

7.1 Automated Desk Audits

The tax administration audits comprehensively 300 firms per year, implying that only 0.4% of firms are audited in a given year. Firms are selected with a risk-based analysis, which considers information from third-parties, deviation from industry averages and taxpayers’ history.45 The tax administration’s limited capacity to conduct comprehensive audits is partially compensated by computer audits, which automatically notify firms of discrepancies between self-reported revenue and revenue estimated from third-party data. Third-party data are collected from credit card sales, POS devices and via the D151 informative tax form which requires firms to declare purchases and sales to other firms (Brockmeyer et al. 2017).46 Notifications require firms to adjust their revenue to match the amount assessed with third-party data or justify the difference.
As of 2015 bunching behavior did not generate a flag. Figure 7 shows the proportion of firms receiving a notification in 2012 by revenue bins of two million CRC and plots the linear fit excluding revenue intervals around the thresholds.47 About a third of small firms receive a notification, which highlights that tax declarations are often incomplete and that revenue is frequently under-reported. Compared to their expected probability (from a linear extrapolation) bunching firms are 8.3% more likely to receive a notification at the first threshold and 11.5% at the second threshold. Hence part of bunching appears driven by revenue under-reporting: while bunching firms sometimes get caught for inconsistencies, they appear willing to incur the expected costs.48 Two other statistics are worth noticing. First, firms reporting revenue just above the thresholds receive slightly fewer notifications, which might indicate that firms in the hole region report revenue more truthfully.49 Second, the proportion of notifications by revenue is fairly constant on either side of the threshold, away from the notch, in contrast to the large discontinuity in profit margin presented in Section 3. This is not inconsistent with our results: the automated system detects revenue under-reporting, while we find that the drop in profits past the threshold occurs mainly from cost over-reporting.

A possible explanation for the ease of manipulating costs relative to revenue is that third-party information always provides a lower bound on true values. Since firms’ incentives are to under-report revenue, observing revenue below the lower bound provides a binary signal of tax evasion and generates an automatic notification. On the contrary, firms’ incentives are to over-report costs, hence a lower bound on costs reveals cost over-reporting only if the tax administration is confident that it observes all the firms’ costs. In a context with limited information, the signal to noise ratio for costs is low, and cost evasion is particularly difficult to detect for the tax administration.

45 Figure A13 plots the number of comprehensive audits (Panel a) and percentage of firms audited (Panel b) by broad revenue bins. The SMEs we study have revenue to the left of the figure and are rarely audited: they face a 0.2% probability of being audited. The percentage of audited firms increases with revenue to reach 3% for the largest firms.

46 A D151 is required anytime two parties transact for 2 million CRC or more in the year ($6,000 in PPP). For commissions, professional fees or rental agreements the reporting threshold is only 50,000 colones ($150).

47 Note that we did not have access to the underlying micro-data and only received tabulations for 2012.

48 From discussions with the tax administration, a non-trivial share of firms do not adjust their tax declaration following a notification. Firms can revise their declarations at minimal cost and do not get systematically prosecuted. However failure to comply increases the risk of a comprehensive audit.

49 Truthful reporting can explain part of the mass in the hole region only if it is difficult/costly to precisely control revenue by lowering production.

7.2 Variation in Audit Probability by Sectors

Next we use variation in audit probabilities by sectors, generated by the program of Special Audit Attention. In the first semester of 2012 the tax administration determined a list of sectors to which it assigned dedicated tax inspectors. This information was posted on the website of the ministry of finance and implied that firms in these sectors faced a discrete jump in their audit probability. Sectors were not randomly selected but chosen based on their underlying evasion risk and their growth rate compared to their tax payments’ growth.
The twelve sectors selected in 2012 were real estate, private education, hotels and tour agencies, transport of merchandise, sale of vehicles, sports, production of pineapple, yucca, flowers and plants, casinos and betting, performances and recycling.50 The difference in difference analysis of firms within the selected sectors versus firms in other sectors shows a significant increase in reported profits following their assignment to the program. However, given the endogenous selection mechanism, it is difficult to establish causality.

Instead, we run a triple difference in difference to study firms’ evasion behavior around the threshold: we compare the change in reported revenue of bunching firms in selected sectors to bunching firms in other sectors, and smaller firms in the same sectors. We assume that all firms within a sector adjust their audit probability equally and hypothesize that bunching firms are evading more revenue than slightly smaller firms (not bunching) before the program. Therefore, when faced with a higher audit risk, bunching firms should increase reported revenue by more than non-bunching firms. To test this hypothesis, we estimate the following equation:

$$y_{ijt} = \alpha_i + \beta \cdot Bunch_{ijt_0} \cdot Audit_j \cdot Post_t + \gamma \cdot Bunch_{ijt_0} \cdot Post_t + \delta \cdot Audit_j \cdot Post_t + \zeta \cdot Post_t + \varepsilon_{ijt} \tag{24}$$

where $y_{ijt}$ is either revenue, costs, profits or taxes of firm $i$ in sector $j$ at time $t$, $Bunch_{ijt_0}$ equals one if firm $i$ declares revenue in the two million revenue interval below the threshold in 2011 ($t_0$) and zero otherwise, $Audit_j$ is a sector dummy equal to one if a firm belongs to an audited industry, and $Post_t$ is a time dummy equal to one in 2012 and 2013 and zero in 2011.51

50 The sectors selected in 2013 were almost identical and therefore we do not perform a sector specific event study for each year of the variation; instead we look at the effect of the first audit announcement in 2011 on the two subsequent tax declarations in 2012 and 2013. Sectors correspond to 3 digits ISIC classification.

51 We restrict the sample to a balanced panel and trim the top and bottom 1% outliers in terms of revenue growth.

Table 4 presents the results, using as a control group firms with revenue 10% below the threshold in 2011. The coefficient $\beta$ on the triple interaction measures the change in reporting behavior of bunching firms in selected sectors. Column (1) supports our hypothesis: bunching firms in high audit sectors increase their reported revenue by 5 million CRC more than smaller firms in the same sectors and bunching firms in other sectors (10% of their baseline revenue). However, these firms simultaneously increase reported costs by a larger amount (Column 2), leading to a 1.2 million drop in reported profits (Column 3). Nonetheless, their tax liability increases (Column 4), since the audit threat pushed these firms into the 20% tax bracket. We also note that the coefficient $\delta$ of the interaction between $Audit$ and $Post$ is positive and significant for reported profit and tax liability: sectors with higher audit intensity remit more taxes. Finally, placebo treatments which assume the treatment occurred in 2009 or in 2010 show no significant effect on any of the four outcomes.

The results show that following an increase in the audit probability, bunching firms report substantially more revenue than smaller firms in their sectors and bunching firms in other sectors. This provides evidence that bunching firms were initially under-reporting their revenue to reach the threshold. We also find that while audits induce bunching firms to report more revenue, taxes paid only slightly increase since firms compensate by over-reporting cost when faced with a higher tax rate. This substitution between cost and revenue evasion following more stringent enforcement mirrors results from recent studies by Carrillo et al. (2017) and Slemrod et al. (2015).
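To make the triple interaction in equation 24 concrete, the sketch below computes a triple difference from cell means on a small synthetic dataset. It is an illustration under stated assumptions, not the authors' estimation code. In a balanced panel the saturated specification with firm fixed effects yields the same coefficient on the triple interaction as this difference of cell means. The data, the variable names and the built-in effect of 5 are all made up for the example.

```python
import numpy as np


def triple_difference(y, bunch, audit, post):
    """Triple difference of cell means: the pre/post change for bunchers in
    audited sectors, net of the change for bunchers in other sectors, minus
    the same difference-in-differences computed for non-bunchers."""
    y, bunch, audit, post = map(np.asarray, (y, bunch, audit, post))

    def cell_mean(b, a, p):
        mask = (bunch == b) & (audit == a) & (post == p)
        return y[mask].mean()

    def diff_in_diff(b):
        audited_change = cell_mean(b, 1, 1) - cell_mean(b, 1, 0)
        other_change = cell_mean(b, 0, 1) - cell_mean(b, 0, 0)
        return audited_change - other_change

    return diff_in_diff(1) - diff_in_diff(0)


if __name__ == "__main__":
    # Synthetic balanced panel (hypothetical): two firms per cell, one pre and
    # one post period, with a built-in triple-interaction effect of 5.
    rows = []
    for b in (0, 1):
        for a in (0, 1):
            for firm_level in (40.0, 45.0):
                for p in (0, 1):
                    outcome = firm_level + 1.0 * p + 2.0 * b * p + 3.0 * a * p + 5.0 * b * a * p
                    rows.append((outcome, b, a, p))
    y, bunch, audit, post = zip(*rows)
    print(triple_difference(y, bunch, audit, post))  # prints 5.0
```

The example recovers the built-in effect exactly because the synthetic data contain no noise; with real data one would estimate equation 24 by regression to obtain standard errors as well.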
7.3 Employment, Wage Bill and Assets

Breaking firms’ costs into the line items reported on the corporate tax return brings limited insights. Figure A12 shows the discontinuity by revenue for different cost categories. The two main categories, “Administrative & Operational Costs” and “Material & Production Costs” account respectively for 60% and 40% of the total discontinuity. The other categories (interest deductions, depreciation and other costs) only represent 10% of total costs together. None of these categories display a discontinuity at the thresholds. In tax returns, wages are reported in “Administrative & Operational Costs” and cannot be separated. To study employment and wage bill, we instead use social security data which are available from the central bank for 2011 and 2012. Capital is reported on the tax returns.

There are two reasons to believe that labor inputs reported in social security records are relatively accurate. First, employees have incentives to report their wages for social security as benefits are generous in Costa Rica. Second, estimated evasion on payroll and personal income tax of wage earners is much lower than evasion on other margins: for example, the ILO estimates that among formal firms in Costa Rica, only 9% of employees are informal. In theory, higher productivity firms need fewer inputs to produce a given revenue than lower productivity firms: hence employment, wage bill and capital should fall discontinuously at the threshold if firms limit production due to the higher tax rate.

We test these assumptions by running the bunching-hole discontinuity regression (Equation 14) on employment, wage bill and capital. Figure A15 plots the average number of employees and wage bill on revenue at each threshold. For each plot we show the linear fit on each side of the thresholds, excluding revenue bins around the thresholds, and display the estimated discontinuity at the thresholds.
The wage bill corresponds to 20% of firms’ revenue. Employment slightly drops at the thresholds, though neither coefficient on the discontinuity is significant and the wage bill as a share of revenue remains unchanged. These null results are consistent with no production response; however, the standard errors around the coefficients on employment are too large to reject production responses. At the first threshold the discontinuity corresponds to a 1% drop in employment. Given the estimated revenue elasticity and assuming a homothetic production function, production responses should generate a drop in employment three times larger. At the second threshold, we would predict a 1.5% drop in employment, while the discontinuity corresponds to a 2% drop in employment. Note that if firms can manipulate their reported wage bill, the results are biased against uncovering production responses (Almunia and Lopez-Rodriguez 2018),52 since the increase in the tax rate lowers the incentives to manipulate the wage bill, which could lead to an upward discontinuity. However the corporate tax rate remains below the tax rate on labor everywhere, which mitigates this mechanism. Finally, Figure A16 shows that the capital stock value reported on the tax returns is continuous at the thresholds. However the data on capital stock value are noisy and given the confidence intervals we cannot reject a drop in capital at the threshold.
7.4 Firm Division and Profit Shifting to Subsidiaries

Large firms could take advantage of the tax design to create small subsidiaries on which to offload their profits: subsidiaries would report low revenue and high profits, taxed at 10%, while parent firms would report high revenue and low profits, taxed at 30%.53 We formulate two hypotheses: first, if parent firms incur a cost of opening and administering subsidiaries, then subsidiaries should disproportionately report revenue just below the thresholds, and the share of subsidiaries should be higher in lower tax brackets. Second, if subsidiaries exist to avoid taxes, then they should display higher profit margins than other firms. We test these hypotheses with a unique dataset of economic affiliates compiled by the Central Bank. Firms are linked by matching shareholders’ names from the corporate ownership registry and direct calls and visits to the firms’ premises.54

52 Under collusion between employers and employees, firms should under-report wages instead of over-reporting them, since the sum of the payroll and personal income tax rates is always larger than the corporate tax rate in Costa Rica. The payroll tax rate is 35.67% and the marginal tax rates on personal income taxes are 10 to 15%.

53 Large firm division and profit shifting could explain both bunching and discontinuities in profits at the thresholds.

54 The Central Bank constructed the data of economic groups in 2012, with the goal to obtain accurate information on corporate ownership structure and firm linkages. These data were merged with the corporate tax returns for the years 2008-2012 such that the final dataset (named REVEC) contains firms’ reported revenue and profits.

For any pair of affiliates, we define a subsidiary (parent) as the lower (higher) revenue firm.55 Figure A14 plots the share of firms which are subsidiaries by revenue and the linear fit on each side of the thresholds, excluding intervals around the thresholds.
On average, 5% of small firms are subsidiaries of larger firms and the relation appears continuous on either side of the threshold. The estimated discontinuities indicate that the share of subsidiaries slightly drops at the thresholds, by 0.39% at the first and 0.14% at the second; however, neither estimate is significant. The coefficients on excess bunching subsidiaries are small: there are only 0.31% more bunching subsidiaries than the linear prediction at the first threshold and 0.83% more at the second, and only the latter is significantly different from zero. Finally, we find no excess profitability of subsidiaries compared to non-subsidiaries at either threshold: if anything subsidiaries appear on average slightly less profitable. Hence tax motivated division of firms and profit-shifting to smaller subsidiaries can only explain a tiny fraction of bunching and none of the profits discontinuity at the thresholds. The absence of tax motivated firm division could be due to the registration fees and yearly stamp duties required to keep firms active or to the ease of over-reporting costs, making firm division suboptimal. It is also possible that the data on economic groups do not fully capture firm affiliations; in particular, evading firms might use more sophisticated strategies to appear unaffiliated.

7.5 Timing of Revenue Realization

The timing of revenue realization is another margin for firms to lower their taxes (le Maire and Schjerning 2013). To remain below the threshold, firms could date some of their revenue from September (end of fiscal year) to October, or stop production once they reach the threshold. Both types of responses imply that bunching firms generate a lower share of revenue at the end of the fiscal year, compared to non-bunching firms. Retiming also predicts a higher share of revenue early in the fiscal year.
Since testing these hypotheses requires monthly revenue data, we use the subsample of firms liable for monthly sales taxes and run the following regressions:

$$\begin{aligned} y_{imt} &= \beta_1 \cdot \mathbb{1}(m = Sept) \cdot Bunch_{i,t} + \delta_1 \cdot Bunch_{i,t} + \alpha_m + \gamma_t + \varepsilon_{imt} \\ y_{imt} &= \beta_2 \cdot \mathbb{1}(m = Oct) \cdot Bunch_{i,t-1} + \delta_2 \cdot Bunch_{i,t-1} + \alpha_m + \gamma_t + \varepsilon_{imt} \end{aligned} \tag{25}$$

where $y_{imt}$ is revenue of firm $i$, in month $m$ and fiscal year $t$, $\alpha_m$ are month fixed effects, and $\gamma_t$ are fiscal year fixed effects. $Bunch_{i,t}$ equals one if firm $i$’s revenue falls in the one million interval below the threshold in year $t$ and 0 otherwise. The coefficient $\beta_1$ ($\beta_2$) measures the differential monthly revenue of bunchers in September (October), compared to other months and firms.

55 Results are not sensitive to the definition of subsidiaries, since empirically subsidiaries tend to be owned by much larger firms. For example results remain unchanged when we define subsidiaries as firms affiliated with a firm with revenue over 150 million CRC, far above the second threshold.

If firms retime their income or limit production at the end of the fiscal year, then $\beta_1$ should be negative and $\beta_2$ positive. Table A6 presents the results, and finds no evidence that bunching firms report a different share of revenue in September or October compared to smaller or larger firms. These results are consistent with limited production and time-shifting responses; however, we note that sales tax liable firms belong to sectors such as retail and restaurants which display lower bunching.

8 Conclusion

Empirical estimates of tax elasticities for firms in developing countries have been limited by a lack of credible variation in tax rates. In this paper, we use the design of the tax system in Costa Rica, which creates large variation in average tax rates as a function of marginal changes in firms’ revenue, to estimate behavioral responses to taxes. We estimate profit elasticities of 3-5, implying that tax rates above 25% are locally on the wrong side of the Laffer curve.
We also document a new mechanism: firms find it considerably easier to manipulate cost than to adjust revenue, and increasing reported costs explain over two-thirds of the reduction in the tax base when firms face higher rates. Using these new parameters, we simulate the optimal tax rate and tax base and find that broadening the base while lowering the rate can increase government revenue by up to 80%, holding profits constant.

Three dimensions should be considered for the external validity of the results. First, while Costa Rica’s corporate tax system appears unusual, several large middle-income countries such as India, Indonesia, Malaysia, Thailand and Vietnam also apply increasing tax rates on profits as a function of revenue and our estimation method could be applied there. Second, the elasticity estimates concern small firms and might not apply to large firms, which have access to very different evasion and avoidance technologies, as documented by the literature on transfer pricing and debt shifting. Third, tax elasticities are a function of the institutional environment: on the one hand, Costa Rica’s institutions are strong for its income level – for example, it ranks second lowest in Latin America for corruption. On the other hand, Costa Rica’s tax system is complex and fragmented,56 which could contribute to the large profit elasticity and costs over-reporting.

56 In addition to the corporate and personal income taxes, Costa Rica has a self-employed regime and a micro-sellers regime, which applies to firms with revenue further below the firms we study. Moreover it does not have a fully fledged VAT system, even though its sales tax shares many aspects of a VAT.

References

Almond, Douglas and Joseph J. Doyle, “After midnight: A regression discontinuity design in length of postpartum hospital stays,” American Economic Journal: Economic Policy, 2011, 3 (3), 1–34.
Almunia, Miguel and David Lopez-Rodriguez, “Under the Radar: The Effects of Monitoring Firms on Tax Compliance,” American Economic Journal: Economic Policy, 2018, 10 (1).
Asatryan, Zareh and Andreas Peichl, “Responses of Firms to Tax, Administrative and Accounting Rules: Evidence from Armenia,” Working Paper, Mannheim University, 2016.
Axtell, R. L., “Zipf distribution of U.S. firm sizes,” Science, 2001, 293 (5536), 1818–1820.
Besley, Timothy and Torsten Persson, “Chapter 2 - Taxation and Development,” in Alan J. Auerbach, Raj Chetty, Martin Feldstein, and Emmanuel Saez, eds., Handbook of Public Economics, Vol. 5, Elsevier, 2013, pp. 51–110.
Best, Michael, Anne Brockmeyer, Henrik Kleven, Johannes Spinnewijn, and Mazhar Waseem, “Production vs Revenue Efficiency With Limited Tax Capacity: Theory and Evidence From Pakistan,” Journal of Political Economy, 2015, 123 (6), 1311–1355.
Blomquist, Soren, Anil Kumar, Che-Yuan Liang, and Whitney Newey, “Individual Heterogeneity, Nonlinear Budget Sets, and Taxable Income,” Working Paper, 2015.
Brockmeyer, Anne and Marco Hernandez, “Taxation, Information and Withholding: Evidence from Costa Rica,” Working paper, the World Bank, 2017.
Brockmeyer, Anne, Marco Hernandez, Stewart Kettle, and Spencer Smith, “Casting a Wider Tax Net: Experimental Evidence from Costa Rica,” Working paper, the World Bank, 2017.
Carrillo, Paul, Dina Pomeranz, and Monica Singhal, “Dodging the Taxman: Firm Misreporting and Limits to Tax Enforcement,” American Economic Journal: Applied Economics, 2017, 9 (2).
Chetty, Raj, “Is the Taxable Income Elasticity Sufficient to Calculate Deadweight Loss? The Implications of Evasion and Avoidance,” American Economic Journal: Economic Policy, 2009, 1 (2), 31–52.
Chetty, Raj and Emmanuel Saez, “Teaching the Tax Code: Earnings Responses to an Experiment with EITC Recipients,” American Economic Journal: Applied Economics, 2013, (1), 1–31.
Chetty, Raj, John N. Friedman, Tore Olsen, and Luigi Pistaferri, “Adjustment Costs, Firm Responses, and Micro vs. Macro Labor Supply Elasticities: Evidence from Danish Tax Records,” The Quarterly Journal of Economics, 2011, 126 (2), 749–804.
Cohodes, Sarah and Joshua Goodman, “Merit Aid, College Quality and College Completion: Massachusetts’ Adams Scholarship as an In-Kind Subsidy,” American Economic Journal: Applied Economics, 2014, 6 (4), 251–285.
Devereux, Michael P., Li Liu, and Simon Loretz, “The Elasticity of Corporate Taxable Income: New Evidence from UK Tax Records,” American Economic Journal: Economic Policy, May 2014, 6 (2), 19–53.
Diamond, Peter A and James A Mirrlees, “Optimal Taxation and Public Production: I–Production Efficiency,” American Economic Review, March 1971, 61 (1), 8–27.
Diamond, Rebecca and Petra Persson, “The Long-term Consequences of Teacher Discretion in Grading of High-Stakes Tests,” Working Paper, Stanford University, 2017.
Dwenger, Nadja and Viktor Steiner, “Profit Taxation And The Elasticity Of The Corporate Income Tax Base: Evidence From German Corporate Tax Return Data,” National Tax Journal, March 2012, 65 (1), 118–50.
Emran, M. Shahe and Joseph E. Stiglitz, “On selective indirect tax reform in developing countries,” Journal of Public Economics, April 2005, 89 (4), 599–623.
Garicano, Luis, Claire LeLarge, and John Van Reenen, “Firm Size Distortions and the Productivity Distribution: Evidence from France,” American Economic Review, 2016, 106 (11), 3439–3479.
Gelber, Alexander M., Damon Jones, and Daniel W. Sacks, “Earnings Adjustment Frictions: Evidence From Social Security Earnings Test,” Working Papers 13-50, Center for Economic Studies, U.S. Census Bureau, September 2013.
Gordon, Roger and Wei Li, “Tax structures in developing countries: Many puzzles and a possible explanation,” Journal of Public Economics, 2009, 93 (7–8), 855–866.
Gorodnichenko, Yuriy, Jorge Martinez-Vazquez, and Klara Sabirianova Peter, “Myth and Reality of Flat Tax Reform: Micro Estimates of Tax Evasion Response and Welfare Effects in Russia,” Journal of Political Economy, 2009, 117 (3).
Gruber, Jon and Emmanuel Saez, “The elasticity of taxable income: Evidence and implications,” Journal of Public Economics, 2002, 84 (1), 1–32.
Gruber, Jonathan and Joshua Rauh, “How Elastic Is the Corporate Income Tax Base?,” in Alan J. Auerbach, James R. Hines Jr., and Joel B. Slemrod, eds., Taxing Corporate Income in the 21st Century, Cambridge University Press, 2007, pp. 140–163.
ILO, http://laborsta.ilo.org/informal economy E.html 2012.
Jorgenson, Dale and R.E. Hall, “Tax Policy and Investment Behavior,” American Economic Review, 1967, 57, 391–414. Reprinted in Bobbs-Merrill Reprint Series in Economics, Econ-130. Investment 2, ch. 1, pp. 1-26.
Kawano, Laura and Joel Slemrod, “How do corporate tax bases change when corporate tax rates change? With implications for the tax rate elasticity of corporate tax revenues,” International Tax and Public Finance, 2016, 23 (3), 401–433.
Khan, Adnan Q., Asim I. Khwaja, and Benjamin A. Olken, “Tax Farming Redux: Experimental Evidence on Performance Pay for Tax Collectors,” The Quarterly Journal of Economics, 2015.
Kleven, Henrik J. and Mazhar Waseem, “Using Notches to Uncover Optimization Frictions and Structural Elasticities: Theory and Evidence from Pakistan,” The Quarterly Journal of Economics, 2013, 128 (2), 669–723.
Kleven, Henrik Jacobsen, “Bunching,” Annual Review of Economics, 2016, 8.
Kleven, Henrik, Martin Knudsen, Klaus Kreiner, Soren Pedersen, and Emmanuel Saez, “Unwilling or Unable to Cheat: Evidence from a Tax Audit Experiment in Denmark,” Econometrica, May 2011, 79 (3), 651–692.
Kopczuk, Wojciech, “Tax bases, tax rates and the elasticity of reported income,” Journal of Public Economics, 2005, 89 (11-12), 2093–2119.
Kopczuk, Wojciech, “The Polish Business ‘Flat’ Tax and its Effect on Reported Incomes: a Pareto Improving Tax Reform?,” Working Paper, Columbia University, 2012.
le Maire, Daniel and Bertel Schjerning, “Tax bunching, income shifting and self-employment,” Journal of Public Economics, 2013, 107, 1–18.
Naritomi, Joana, “Consumers as Tax Auditors,” Working Paper, 2016.
Patel, Elena, Nathan Seegert, and Matt Smith, “At a Loss: the Elasticity of Corporate Income at the Zero Kink,” Working Paper, 2015.
Pomeranz, Dina, “No Taxation without Information: Deterrence and Self-Enforcement in the Value Added Tax,” American Economic Review, 2015, 105 (8), 2539–69.
Saez, Emmanuel, “Do Taxpayers Bunch at Kink Points?,” American Economic Journal: Economic Policy, 2010, 2 (3), 180–212.
Seegert, Nathan, McCallum Andrew, and Berthana Marinho, “Better Bunching, Nicer Notching,” Working Paper, 2017.
Slemrod, Joel, Brett Collins, Jeffrey Hoopes, Daniel Reck, and Michael Sebastiani, “Does Credit-card Information Reporting Improve Small-business Tax Compliance?,” NBER Working Papers 21412, National Bureau of Economic Research, Inc, July 2015.
Slemrod, Joel, Marsha Blumenthal, and Charles Christian, “Taxpayer response to an increased probability of audit: evidence from a controlled experiment in Minnesota,” Journal of Public Economics, March 2001, 79 (3), 455–483.
Waseem, Mazhar, “Taxes, Informality and Income Shifting: Evidence from a Recent Pakistani Tax Reform,” Journal of Public Economics, 2018, 157.
Weber, Caroline E., “Toward obtaining a consistent estimate of the elasticity of taxable income using difference-in-differences,” Journal of Public Economics, 2014, 117, 90–103.

Figure 1: Costa Rica’s Corporate Tax Schedule
[Figure: average tax rate on profits (tax base: profit = revenue − cost) as a step function of revenue (million CRC, 0 to 140). The rate is 10% below threshold T1, 20% between T1 and T2, and 30% above T2.]
Figure 1 shows the design of the corporate income tax in Costa Rica, as discussed in section 2.1.
Firms face increasing average tax rates on their profits (revenue minus cost) as a function of their revenue. When revenue exceeds the first threshold, the average tax rate jumps from 10% to 20%, and it jumps from 20% to 30% past the second threshold. Thresholds are adjusted yearly for inflation.

Figure 2: Bunching with a Notched Schedule Based on Revenue
[Figure: firm density by revenue (million CRC) around the threshold $y_T$, showing the observed density, the counterfactual density, the excess mass below the threshold, the missing mass above it, and the firms whose costs are too high to bunch.]
Figure 2 displays the theoretical density distributions, discussed in section 2.3. The counterfactual firm density is drawn under the assumption of a flat 10% tax rate. The notch induces some firms with counterfactual revenue above the threshold to reduce their revenue and bunch just below the threshold. The decision to bunch depends on firms’ revenue distance to the threshold and on their costs, such that for each revenue bin past the threshold, only firms with sufficiently low costs bunch. This implies that the observed density should match the counterfactual density up to the threshold and exhibit excess mass at the threshold, corresponding to the missing mass above it. However, there is no interval with zero density, as firms with sufficiently large costs never have an incentive to bunch. Note that the observed density is permanently lower than the counterfactual past the threshold due to intensive margin responses, which lower reported revenue.

Figure 3: Firm Density and Average Profit Margin
[Figure: Panel A, firm density (number of firms) by revenue; Panel B, average profit margin with 95% CI by revenue. Revenue in million CRC, from 20 to 140.]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure 3 presents the key patterns of the corporate tax data, discussed in Section 3.1. The figure pools data from years 2008 to 2014.
Panel A shows the density of firms by revenue. Panel B displays the average profit margin by revenue. Profit margin is defined as profits over revenue. The size of the revenue bins is 575,000 CRC.

Figure 4: Revenue Bunching Estimation
[Figure: firm density (number of firms) by revenue (million CRC) with the fitted counterfactual and the excluded interval $[y_l, y_u]$. Panel A, first threshold: B = 2.3, $y_u$ = 58.3. Panel B, second threshold: B = 1.1, $y_u$ = 107.7.]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure 4 displays the firm density by revenue and fits the counterfactual distribution to the first and second thresholds, discussed in Section 3.2. $B$ is the excess mass as a share of the counterfactual and $y_u$ the revenue of the marginal buncher, obtained with the point of convergence method. The counterfactual is obtained from the regression of a polynomial of degree 5 (the order selected by the Akaike criterion) on all data points outside the $[y_l, y_u]$ interval. The lower bound $y_l$ is determined as the first bin with a statistically different density, compared to a local regression on all revenue bins to the left of the excluded region. The upper bound $y_u$ is estimated from an iterative process: starting from $y_u$ close to the threshold, we obtain the counterfactual and estimate the excess mass ($B$) below the threshold and the missing mass ($M$) above the threshold. For low $y_u$, the excess mass is larger than the missing mass ($B \gg M$). We iteratively increase $y_u$ until the two masses are equal ($B = M$).
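The iterative search for the upper bound described in the Figure 4 note can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `masses` and `point_of_convergence`, the relative tolerance `rel_tol`, the rescaling of revenue before the polynomial fit, and the synthetic data are all hypothetical. The sketch takes the lower bound as given and returns masses in numbers of firms rather than as a share of the counterfactual.

```python
import numpy as np


def masses(rev, counts, threshold, y_l, y_u, degree=5):
    """Fit the counterfactual on bins outside [y_l, y_u]; return (B, M, counterfactual)."""
    rev = np.asarray(rev, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Rescale revenue so that the polynomial fit is numerically well behaved.
    x = (rev - threshold) / (rev.max() - rev.min())
    outside = (rev < y_l) | (rev > y_u)
    coefs = np.polyfit(x[outside], counts[outside], degree)
    counterfactual = np.polyval(coefs, x)
    below = (rev >= y_l) & (rev < threshold)
    above = (rev >= threshold) & (rev <= y_u)
    excess = float(np.sum(counts[below] - counterfactual[below]))   # B, in firms
    missing = float(np.sum(counterfactual[above] - counts[above]))  # M, in firms
    return excess, missing, counterfactual


def point_of_convergence(rev, counts, threshold, y_l, degree=5, rel_tol=1e-6):
    """Increase y_u bin by bin until the missing mass catches up with the excess mass."""
    rev = np.asarray(rev, dtype=float)
    for y_u in rev[rev >= threshold]:
        excess, missing, _ = masses(rev, counts, threshold, y_l, y_u, degree)
        if excess - missing <= rel_tol * abs(excess):
            return float(y_u), excess, missing
    return None  # no convergence within the observed revenue range


# Synthetic example (hypothetical numbers): a linear counterfactual from which
# 30% of firms in bins 50..57 relocate to the three bins just below 50.
rev = np.arange(20.0, 101.0)
true_cf = 3000.0 - 20.0 * rev
counts = true_cf.copy()
hit = (rev >= 50) & (rev <= 57)
moved = 0.3 * true_cf[hit]
counts[hit] -= moved
counts[(rev >= 47) & (rev < 50)] += moved.sum() / 3
y_u_hat, B_hat, M_hat = point_of_convergence(rev, counts, threshold=50.0, y_l=47.0)
```

The synthetic data are built so that the two masses balance only once the excluded interval covers every bin that lost firms. With real, noisy data the stopping rule would need a looser tolerance or a sign-change criterion.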
Figure 5: Donut-Hole Discontinuity in Costs by Revenue
[Figure: average cost (million CRC) by revenue (million CRC) around the first threshold, with the excluded interval $[y_l, y_u]$. Panel A shows average cost, cost adjusted for $\varepsilon_y$, the linear fit and the extrapolated linear fit. Panel B zooms in on the cost discontinuity: cost jump = 2.55 M.]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure 5 displays the average reported costs for each revenue bin around the first threshold, discussed in section 3.3. To estimate the cost discontinuity at the threshold, absent revenue responses, we adjust for intensive margin revenue responses: firms declaring revenue above the threshold reduced their reported revenue due to the tax rate increase. To take intensive responses into account, we horizontally shift firms’ costs proportionally to the revenue elasticity, estimated from bunching. For example, given a revenue elasticity of 0.25 and a firm with revenue of 60M:

$$
\text{revenue}_{\text{counter}} = 60 + \varepsilon_{y,1-t} \cdot y \cdot \frac{dt}{1-t} = 60 + 0.25 \times 60 \times \frac{0.1}{0.9} \approx 61.7.
$$

We linearly fit costs by revenue below and above the threshold, excluding revenue bins impacted by bunching behavior. We then extrapolate the linear fits to the threshold. The resulting cost discontinuity represents the average increase in reported costs, for a firm at the threshold, from an increase in the tax rate from 10 to 20%.
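The two steps in the Figure 5 note, the horizontal revenue adjustment and the donut-hole extrapolation to the threshold, can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' code: the helper names (`counterfactual_revenue`, `linear_fit`, `donut_hole_jump`) and the synthetic bins are hypothetical, and the fit is a plain least-squares line on each side of the excluded interval.

```python
def counterfactual_revenue(revenue, elasticity, dt, net_of_tax):
    """Shift reported revenue back to its level absent the tax increase."""
    return revenue + elasticity * revenue * dt / net_of_tax


def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def donut_hole_jump(revenue, cost, threshold, y_l, y_u):
    """Fit lines outside [y_l, y_u], extrapolate both to the threshold, return (jump, base)."""
    below = [(r, c) for r, c in zip(revenue, cost) if r < y_l]
    above = [(r, c) for r, c in zip(revenue, cost) if r > y_u]
    a0, b0 = linear_fit(*zip(*below))
    a1, b1 = linear_fit(*zip(*above))
    base = a0 + b0 * threshold              # extrapolated cost just below
    jump = (a1 + b1 * threshold) - base     # discontinuity at the threshold
    return jump, base


# Worked example from the note: elasticity 0.25, revenue 60M, dt = 0.1, 1 - t = 0.9.
shifted = counterfactual_revenue(60.0, 0.25, 0.1, 0.9)   # roughly 61.67

# Synthetic bins (hypothetical numbers) with a built-in cost jump of 2.5 at 50.
threshold, y_l, y_u = 50.0, 46.0, 58.0
reported = [float(r) for r in range(30, 81)]
adjusted = [counterfactual_revenue(r, 0.25, 0.1, 0.9) if r > threshold else r
            for r in reported]
cost = [0.8 * r if r <= threshold else 42.5 + 0.9 * (r - threshold)
        for r in adjusted]
# Bins are selected on reported revenue but fitted on adjusted revenue.
keep = [(a, c) for r, a, c in zip(reported, adjusted, cost) if r < y_l or r > y_u]
jump, base = donut_hole_jump([a for a, _ in keep], [c for _, c in keep],
                             threshold, threshold, threshold)
pct_jump = jump / base
```

In the synthetic example the jump is expressed both in million CRC and as a share of the extrapolated cost just below the threshold, the same ratio $\delta/\alpha$ reported as the percentage jump in cost.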
Figure 6: Profit Margin Change Across Years for Growing Firms
[Figure: four panels, for firms with 1−3, 3−5, 5−7 and 7−9 million growth in revenue. Each panel plots the change in profit margin between t and t+1 (%, from −6 to 6) against the distance to the threshold at t=0 (million CRC, from −20 to 20).]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure 6 plots the average change in profit margins between year t and t+1 for firms with different revenue growth, around the 1st threshold (section 5). Bars represent the 95% confidence intervals. In between the dashed lines are the firms whose revenue growth pushed them into the upper tax bracket. Firms to the left of the first dashed line grew within the 10% tax rate bracket, and firms to the right of the second dashed line grew within the 20% tax rate bracket. Revenue bins correspond to two million CRC.

Figure 7: Notifications of Discrepancies with Third-Party Data
[Figure: percentage of firms in each revenue bin receiving a notification (from .2 to .45) by revenue (million CRC, from 30 to 100), with a fitted line excluding the dotted area and a 95% CI.]
Source: Administrative data from the audit department of the Ministry of Finance for 2012.
Figure 7 displays the percentage of firms within a revenue bin receiving automated notifications from the tax administration (section 7.1). Notifications are generated from discrepancies between self-reported revenue and revenue estimated from third-party data. Revenue bins are 2 million CRC wide. The fitted line excludes the revenue intervals impacted by the bunching selection.
Table 1: Donut-Hole Cost Discontinuity by Revenue

| | 1st Threshold: (1) Cost | 1st Threshold: (2) Cost (rev. adjusted) | 2nd Threshold: (3) Cost | 2nd Threshold: (4) Cost (rev. adjusted) |
|---|---|---|---|---|
| Jump in cost δ | 4.203 (0.212) | 2.548 (0.226) | 2.223 (0.416) | 1.277 (0.432) |
| Slope below threshold β1 | 0.834 (0.010) | 0.834 (0.010) | 0.933 (0.015) | 0.933 (0.015) |
| Slope change above threshold β2 | 0.103 (0.014) | 0.069 (0.014) | 0.017 (0.026) | 0.008 (0.026) |
| Threshold intercept α | 41.971 | 41.971 | 93.863 | 93.863 |
| Observations (revenue bins) | 80 | 80 | 80 | 80 |
| % Jump in cost δ/α | +10.01% | +6.07% | +2.37% | +1.36% |

Source: Administrative data from the Ministry of Finance 2008-2014.
Table 1 shows the results from the donut-hole regression discontinuity of average cost by revenue from equation 14 in section 3.3. At each threshold we report the discontinuity in cost δ without the revenue adjustment (columns 1 and 3) and with the revenue adjustment (columns 2 and 4), which is our main specification. The adjustment uses the revenue elasticity estimated with bunching to control for the change in reported revenue above the threshold, such that the discontinuity only identifies responses in reported cost. An observation is a revenue bin of 0.575 million colones. Standard errors are shown in parentheses and stars indicate statistical significance level.

Table 2: Elasticity Estimates from Point of Convergence Method

| Parameters: y^T | Δy | 1 − τ0 | τ* | Elasticity: Revenue | Cost | Profit | Threshold jump: Δy at y^T | Δc at y^T | Δπ at y^T |
|---|---|---|---|---|---|---|---|---|---|
| 50 | 8.3 (1.3) | 0.9 | 0.55 | 0.33 (0.07) | -0.55 (0.06) | 4.93 (0.28) | -1.84 | 2.55 | -4.39 |
| 100.5 | 7.2 (1.8) | 0.8 | 0.77 | 0.08 (0.04) | -0.11 (0.05) | 2.92 (0.71) | -0.99 | 1.27 | -2.26 |

Source: Administrative data from the Ministry of Finance 2008-2014.
Table 2 shows the elasticity estimates when the point of convergence method is used to estimate the revenue elasticities, as discussed in section 3.4. Cost elasticities are estimated using the discontinuity in costs at the thresholds, controlling for revenue responses. The profit elasticity combines revenue and cost responses.
Standard errors are estimated from 1,000 bootstrap iterations, sampling with replacement from the joint distribution of revenue and cost. The left panel displays the parameters used to estimate the elasticities: y^T is the revenue threshold in million CRC and Δy is the revenue response of the marginal buncher estimated with bunching. 1 − τ0 is the net-of-tax rate below each threshold and τ* is the implicit marginal tax rate faced by the marginal buncher, from equation (13). The right panel shows the implied discontinuities at the thresholds in firms’ revenue, cost and profits. Standard errors in parentheses.

Table 3: Industry Level Results (First Threshold)

| Sector | Profit margin (%): (1) Drop | (2) Base | (3) % Drop | Bunching: (4) Excess mass | # Firm-years: (5) Total | (6) % Below T1 |
|---|---|---|---|---|---|---|
| Agriculture | -4.1 | 8.4 | -48.8 | 1.95 | 33,095 | 59.5 |
| Manufacture | -3.2 | 6.8 | -47.1 | 2.34 | 34,799 | 45.4 |
| Construction | -5.2 | 9.5 | -54.7 | 3.17 | 26,410 | 51 |
| Wholesale & Motor Vehicle | -3.5 | 7 | -50 | 1.11 | 63,544 | 45.1 |
| Retail | -4.9 | 8.5 | -57.6 | 1.17 | 100,552 | 47.9 |
| Hotel & Restaurants | -3.5 | 7 | -50 | 1.41 | 21,483 | 49 |
| Transport | -4.1 | 9.9 | -41.4 | 2 | 36,294 | 54.7 |
| Financial Activities | -10.3 | 21.8 | -47.2 | 3.93 | 26,366 | 71.9 |
| Real Estate | -13 | 36.4 | -35.7 | 4.05 | 91,525 | 85.1 |
| Legal & Econ. Consultants | -9.5 | 17 | -55.9 | 6.27 | 64,617 | 73.3 |
| Other Services | -9.3 | 14.6 | -63.7 | 4.37 | 37,091 | 69.3 |
| Education & Culture | -1.3 | 5.8 | -22.4 | 2.64 | 14,228 | 56.8 |
| Health | -8.4 | 17.1 | -49.1 | 3.23 | 19,611 | 65.2 |
| NGO & Public Admin. | .8 | 28.1 | 2.8 | -.19 | 10,608 | 68.8 |
| Undetermined | -9.6 | 19.9 | -48.2 | 3.66 | 36,044 | 80.8 |

Source: Administrative data from the Ministry of Finance 2008-2014.
Table 3 shows the industry level profit margin discontinuity and excess mass at the first threshold, as discussed in section 3.5. Column 1 shows the drop in profit margin at the first threshold, column 2 the base profit margin and column 3 the percentage drop in profit margin. Column 4 computes the excess mass at the threshold.
Column 5 shows the total number of firm-year observations by industry and column 6 the percentage of these firms below the first threshold.

Table 4: Variation in Audit Risk at the Sector Level
Control: firms with revenue 10% below threshold in 2011

| Outcome (million CRC) | (1) Revenue | (2) Cost | (3) Profit | (4) Taxes |
|---|---|---|---|---|
| Bunch*Audit*Post | 5.00 (2.36) | 6.59 (2.20) | -1.23 (0.33) | 0.25 (0.11) |
| Bunch*Post | -0.25 (2.13) | -0.52 (2.00) | -0.42 (2.00) | 0.04 (0.05) |
| Audit*Post | -5.48 (3.02) | -5.82 (3.11) | 1.24 (0.29) | 0.15 (0.04) |
| Post | 11.51 (2.26) | 11.22 (2.15) | 0.12 (0.24) | 0.21 (0.02) |
| Firm Fixed Effects | YES | YES | YES | YES |
| Observations | 7,203 | 7,203 | 7,203 | 7,203 |
| R-squared | 0.01 | 0.01 | 0.02 | 0.02 |

Source: Administrative data from the Ministry of Finance 2008-2014.
Table 4 shows the effect of the program of special audit attention on firms’ reporting, from equation 24 (section 7.2). The coefficient of interest is the triple interaction Bunch*Audit*Post, which shows the change in reported revenue, costs, profits and tax liabilities of bunching firms in 2011 in sectors which received an increase in audit probability. The control corresponds to bunching firms in other sectors and firms 10% smaller within the same sectors (firms with revenue 4 to 8M CRC below the threshold in 2011). Standard errors are clustered at the 3-digit ISIC sector level (48 clusters) and shown in parentheses.
Appendix

A ADDITIONAL FIGURES AND TABLES

Figure A1: Corporate Income Tax Across Countries
[Figure: two scatter plots of countries, labelled by country code, against log GDP (PPP). Panel (a), Tax Revenue on GDP: corporate tax revenue (% of GDP), slope = 0.37 (0.09). Panel (b), Tax Rates on GDP: corporate tax rate, slope = −1.36 (0.50).]
Source: Corporate tax revenue data from ICTD. Top statutory corporate tax rates collected by the authors for 2013. GDP data in PPP from the World Bank. We exclude countries with fewer than 1 million inhabitants. N = 101.
Figure A1, Panel (a) plots corporate tax revenue as a share of GDP on log GDP per capita. Panel (b) plots the top statutory corporate tax rate on log GDP per capita. The dotted lines show the linear fit and the 95% confidence interval, while the slope and standard error are reported in the box.

Figure A2: Average Profits and Costs by Revenue
[Figure: Panel A, average profits by revenue; Panel B, average costs by revenue. Revenue in million CRC, from 20 to 140.]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure A2 shows average profit (Panel A) and average costs (Panel B) by revenue, pooling together 2008 to 2014.

Figure A3: Linear Relation of Average Costs by Revenue
[Figure: four panels of average costs by revenue (million CRC), below and above the 1st threshold and below and above the 2nd threshold, each with a linear fit and a quadratic fit.]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure A3 shows the linear and quadratic relation of average costs by revenue, for each revenue interval around the threshold, as discussed in Section 3.3.

Figure A4: Quartiles of Profit Margin by Revenue
[Figure: three panels (bottom quartile, median, top quartile) of profit margin by revenue (million CRC, from 20 to 140). Note: the y-axis scale is not constant across panels.]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure A4 shows profit margins by revenue for each quartile within a revenue bin, as discussed in section 3.5.

Figure A5: Density and Profit Margin by Revenue for each Sector
[Figure: fifteen panels, one per sector: Agriculture; Manufacture; Construction; Wholesale & Motor vehicle; Retail; Hotel & restaurants; Transport; Fin. activities & equipment rental; Real estate; Legal & econ. consultancies; Other services; Education & Culture; Health; Assoc. & Public admin.; Undetermined.]
Source: Administrative data from the Ministry of Finance 2008-2014.
The continuous line shows the firm density within a sector of economic activity (measured on the left vertical axis) and the dots the average profit margin by revenue for each sector (measured on the right vertical axis). The vertical line corresponds to the first revenue threshold, where the average tax rate jumps from 10 to 20%. These fifteen sectors contain the universe of registered firms.

Figure A6: Yearly Density & Profit Margins, 1st Threshold
[Figure: one panel per year from 2008 to 2014, first for firm density and then for profit margin, each plotted by revenue from 20 to 70.]
Source: Administrative data from the Ministry of Finance 2008-2014.

Figure A7: Profit Margin Distributions Away From the Threshold
[Figure: densities of profit margin by revenue distance to the 1st threshold (million). Panel A, below threshold: [−6,−8], [−8,−10], [−10,−12], [−12,−14]. Panel B, above threshold: [10,12], [12,14], [14,16], [16,18].]
Source: Administrative data from the Ministry of Finance 2008-2014.
This figure shows that the profit margin distribution is stable away from the threshold (section 4.1). Panel A shows the distribution for revenue intervals 10 to 20% below the threshold and Panel B for revenue intervals 20 to 30% above the threshold.
Each curve corresponds to the profit margin distribution of firms within a 2 million CRC revenue interval. Kolmogorov-Smirnov tests never reject the hypothesis that profit margins are sampled from populations with identical distributions, across all pairs of revenue intervals. The figures use an Epanechnikov kernel with a bandwidth of 0.04.

Figure A8: Illustration of Model Based Estimation of Elasticities
[Figure: number of firms by revenue distance to the 1st threshold (million CRC, from 0 to 10). Panel A, Revenue Elasticity Scenarios: counterfactual density and curves for e_y = 0.03, e_y = 0.14 and e_y = 0.33. Panel B, Estimated Revenue Elasticity: bunching firms, counterfactual density and the curve for e_y = 0.25.]
Figure A8 illustrates the estimation of section 4.3. For a given revenue elasticity $\varepsilon_y$ and cost elasticity $\varepsilon_c$, the area between the counterfactual density (green) and the curves represents the number of bunching firms. Panel A displays the profile of these curves for several values of the revenue elasticity, holding the cost elasticity fixed. Panel B displays the result for the last iteration (corresponding to the actual estimates of the revenue and cost elasticity): in this scenario predicted bunchers equal observed bunchers and the combination of revenue and cost elasticities predicts the discontinuity in cost at the threshold.

Figure A9: Profit Margin Change Across Years by Tax Bracket
[Figure: change in profit margin between year t and t+1 (%, from −5 to 5) by tax bracket in year t (10%, 20%, 30%) and tax bracket in year t+1 (10%, 20%, 30%).]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure A9 plots firms’ average change in profit margins between year t and t+1 as a function of their tax bracket in year t and t+1, as discussed in section 5. Bars represent 95% confidence intervals for standard errors of the mean.
Figure A10: Profit Margin Change for Shrinking Firms
[Figure: four panels, for firms with 1−3, 3−5, 5−7 and 7−9 million fall in revenue. Each panel plots the change in profit margin between t and t+1 (%, from −6 to 6) against the distance to the threshold at t=0 (million CRC, from −20 to 20).]
Figure A10 plots the average change in profit margins between year t and t+1 for firms with different drops in revenue around the 1st threshold, from section 5. The narrow bars show the 95% confidence interval of the mean. In between the dashed lines are firms whose revenue drop pushed them into the lower tax bracket. The firms to the left of the first dashed line shrank within the 10% tax rate bracket, and the firms to the right of the second dashed line shrank within the 20% tax rate bracket. The figure visually shows a difference in differences between groups of firms that changed tax bracket from year t to t+1 and firms which stayed within the same tax bracket, controlling for revenue growth.

Figure A11: Profit Margin Change over Three Periods
[Figure: profit margin relative to year t=−1 (from −.03 to 0) by years relative to crossing the threshold (−1, 0, 1). Sample restricted to firms that are below the threshold in year −1 and above it in years 0 and 1.]
Figure A11 plots the average change in firms’ profit margins over a three-year period, as discussed in section 5. The baseline year is t-1, and the sample consists of all firms which switched from the 10% to the 20% tax bracket at time t and remained in the 20% bracket at t+1. The bars display the 95% confidence interval.
Figure A12: Cost Categories Breakdown
[Figure: cost as a share of revenue (from 0 to .7) by revenue (million CRC, from 30 to 80), for three cost categories: Admin & Operations, Material & Production, and Other.]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure A12 shows the cost discontinuity by revenue, broken down into the three main cost categories reported on the tax returns (“Formulario D101”), as discussed in section 7.3. Each cost category is displayed as a percentage of revenue. The five categories on the corporate tax returns are administrative and operational costs, material and production costs, depreciation, interest deductions and other costs; we group the latter three categories together.

Figure A13: Audits by Revenue in Costa Rica
[Figure: Panel (a), Number of Audits by Revenue; Panel (b), Percentage of Firms Audited by Revenue. Revenue in million CRC, from 0 to more than 1500.]
Source: Administrative data from the audit division of the Ministry of Finance 2009-2010.
Figure A13, Panel (a) shows the total number of audits over 2009-2010, by revenue bins of 40 million CRC. Panel (b) shows the percentage of firms audited in each revenue bin.

Figure A14: Share of Subsidiaries by Revenue
[Figure: share of subsidiary firms (% of total) by distance to the threshold (million CRC, from −20 to 20). 1st threshold: discontinuity = −.39 (.26), buncher = .31 (.34). 2nd threshold: discontinuity = −.14 (.39), buncher = .83 (.33).]
Source: Central Bank’s registry of economic groups 2008-2012 (REVEC database).
Figure A14 plots the share of subsidiary firms by revenue at each threshold (section 7.4). It also plots its linear fit on each side of the thresholds, excluding intervals around the thresholds.
The coefficients correspond to the estimated discontinuity in the share of subsidiaries at the thresholds and to the excess share of subsidiaries bunching, compared to the linear prediction. Subsidiaries are defined as firms affiliated to a larger firm.

Figure A15: Employment and Wage Bill by Revenue
[Figure: four panels by distance to the threshold (million CRC, from −20 to 20). Employment (number of employees): 1st threshold discontinuity = −.035 (.053), 2nd threshold discontinuity = −.122 (.089). Wage bill share of revenue: 1st threshold discontinuity = .015 (.01), 2nd threshold discontinuity = 0 (.006).]
Source: Administrative data from social security records for the years 2011-2012 merged with corporate tax returns by the Central Bank (REVEC database).
Figure A15 shows the average number of employees and wage bill by revenue around the first and second thresholds, as discussed in section 7.3. It displays the coefficient and standard errors from the discontinuity on the grouped data at the threshold and the dummy coefficient for firms in the bunching interval.

Figure A16: Capital Stock by Revenue
[Figure: capital stock value (million CRC) by distance to the threshold (million CRC, from −20 to 20). First threshold: discontinuity = 3.09 (13.19). Second threshold: discontinuity = 17.59 (19.17).]
Source: Administrative data from the Ministry of Finance 2008-2014.
Figure A16 plots the average capital stock value on revenue around each threshold, as discussed in section 7.3. The capital stock value is reported on the corporate tax returns.
Table A1: Robustness of Bunching Estimates

Panel A: Varying the order of the polynomial

| Order of polynomial | 4 | 5 | 6 |
|---|---|---|---|
| First threshold: B | 2.4 | 2.2 | 2.2 |
| First threshold: y_u | 59.4 | 58.3 | 58.8 |
| First threshold: ε_{y,1−τ} | 0.41 | 0.33 | 0.36 |
| Second threshold: B | 1.1 | 1.1 | 1.1 |
| Second threshold: y_u | 108.3 | 107.7 | 107.7 |
| Second threshold: ε_{y,1−τ} | 0.10 | 0.08 | 0.08 |

Panel B: Varying the excluded zone, y_l

| Number of excluded bins | 6 | 7 | 8 |
|---|---|---|---|
| First threshold: B | 2.0 | 2.2 | 2.3 |
| First threshold: y_u | 57.1 | 58.3 | 58.3 |
| First threshold: ε_{y,1−τ} | 0.25 | 0.33 | 0.33 |
| Second threshold: B | 1.1 | 1.1 | 1.0 |
| Second threshold: y_u | 107.1 | 107.7 | 106.6 |
| Second threshold: ε_{y,1−τ} | 0.07 | 0.08 | 0.06 |

Source: Administrative data from the Ministry of Finance 2008-2014. Table A1 shows, under different scenarios, the estimates of the excess mass B, the revenue of the marginal buncher y_u and the resulting revenue elasticity ε_{y,1−τ} (section 3.3). Panel A varies the order of the polynomial and Panel B the number of excluded bins on the lower side.

Table A2: Adjusted R-squared of Average Costs on Revenue

Variable: Adj. R-squared, by order of polynomial

| Revenue interval | Linear | Quadratic | Cubic |
|---|---|---|---|
| Below 1st Threshold | .9977 | .9981 | .9980 |
| Above 1st Threshold | .9971 | .9970 | .9969 |
| Below 2nd Threshold | .9933 | .9933 | .9932 |
| Above 2nd Threshold | .9872 | .9871 | .9870 |

Source: Administrative data from the Ministry of Finance 2008-2014. Table A2 shows the model fit for different specifications of the regression of average costs on revenue, discussed in Section 3.3. Based on the adjusted R-squared, the simple linear model fits the data well and higher order terms are not necessary. Only below the first threshold could the quadratic fit be preferred.

Table A3: Alternative Models for Cost Discontinuity by Revenue

Columns (1)-(4): 1st Threshold. Columns (5)-(8): 2nd Threshold.

| Model specification | (1) Quadratic | (2) Falling ε_y | (3) Narrow Window | (4) Wide Window | (5) Quadratic | (6) Falling ε_y | (7) Narrow Window | (8) Wide Window |
|---|---|---|---|---|---|---|---|---|
| Jump in cost, δ | 3.804 (.583) | 2.326 (.246) | 2.688 (.249) | 2.465 (.204) | 3.113 (.726) | 1.277 (.432) | 1.389 (.591) | .85 (.392) |
| Slope below T. | .603 (.089) | .834 (.009) | .823 (.012) | .841 (.007) | .821 (.078) | .933 (.015) | .924 (.02) | .944 (.012) |
| ∆ Slope above T. | .297 (.092) | .107 (.014) | .079 (.017) | .063 (.011) | -.052 (.098) | .008 (.026) | .014 (.042) | .018 (.021) |
| Quadratic below T. | -.009 (.003) | | | | -.004 (.003) | | | |
| ∆ Quadratic above T. | .009 (.003) | | | | .01 (.003) | | | |
| Intercept, α | 40.682 | 41.971 | 41.86 | 42.046 | 93.217 | 93.863 | 93.777 | 93.99 |
| Observations | 80 | 80 | 70 | 90 | 80 | 80 | 70 | 90 |
| % Jump in cost, δ/α | +9.34% | +5.54% | +6.42% | +5.86% | +3.36% | +1.36% | +1.48% | +0.91% |

Source: Administrative data from the Ministry of Finance 2008-2014. Table A3 shows the regressions of average costs by revenue on revenue for different model specifications (section 3.3). The parameter of interest is the jump in declared costs at the threshold, δ, from Equation (14). Compared to the main specification of Table 1, columns (1) & (5) assume a quadratic fit instead of a linear fit. Columns (2) & (6) assume that the revenue elasticity is falling with revenue, at the speed estimated between the first and second threshold. Columns (3)-(4) & (7)-(8) vary the revenue interval over which the line is fitted. The last row reports δ/α for the same column; each entry is placed under the column whose δ and α reproduce it. An observation is a revenue bin of 0.575 million Colones. Standard errors are shown in parentheses.

Table A4: Donut-Hole Discontinuity of Fixed Characteristics

| Outcome | 1st Threshold | 2nd Threshold |
|---|---|---|
| Share of firms within province | | |
| San Jose | .056 (.046) | -.085 (.092) |
| Alajuela | -.024 (.029) | .094 (.065) |
| Cartago | -.023 (.022) | .066 (.047) |
| Heredia | -.006 (.035) | -.070 (.061) |
| Guanacaste | -.010 (.014) | -.038 (.042) |
| Puntarenas | -.002 (.019) | .021 (.038) |
| Limon | .019 (.018) | -.001 (.040) |
| Share of firms within sector | | |
| Agriculture | .001 (.024) | -.07 (.037) |
| Manufacture | .008 (.027) | .142 (.060) |
| Construction | -.007 (.025) | -.003 (.047) |
| Wholesale & Motor Vehicle | -.007 (.032) | .007 (.087) |
| Retail | -.017 (.042) | -.083 (.078) |
| Hotel & Restaurants | .002 (.021) | -.012 (.046) |
| Transport | .010 (.021) | -.018 (.049) |
| Financial Activities | -.007 (.014) | .008 (.022) |
| Real Estate | -.088 (.038) | .021 (.051) |
| Legal & Econ. Consultants | -.084 (.032) | -.092 (.054) |
| Other Services | .012 (.016) | .037 (.048) |
| Education & Culture | -.017 (.015) | .010 (.025) |
| Health | .018 (.016) | .064 (.041) |
| NGO & Public Admin. | -.007 (.010) | .001 (.021) |
| Undetermined | .013 (.008) | -.011 (.017) |
| Number of years filing taxes | -.016 (.020) | -.001 (.030) |

Table A4 tests for sorting of infra-marginal firms on either side of the thresholds by estimating the coefficient from the donut-hole discontinuity of equation 14 (section 3.5). The outcomes are the following fixed characteristics: share of firms within a province, share of firms within a sector and number of years the firm has filed taxes. Significant discontinuities only occur at the 1st threshold for two sectors of activity: "Real Estate" and "Legal and Economic consultancies". This indicates very limited firm sorting based on these characteristics. Robust standard errors are in parentheses.

Table A5: Dynamic Firm Behavior

| Dep. var.: Profit Margin | (1) 1st Threshold | (2) 2nd Threshold |
|---|---|---|
| Revenue (Million CRC) | 0.0115 (0.0039) | 0.0071 (0.0029) |
| Higher Tax Bracket | -3.06 (0.17) | -0.86 (0.14) |
| Bunching (Narrow) | 1.56 (0.27) | 0.46 (0.14) |
| Bunching (Broad) | 0.84 (0.20) | 0.60 (0.27) |
| Above threshold (Narrow) | -0.33 (0.18) | -0.10 (0.17) |
| Above threshold (Broad) | -0.12 (0.11) | 0.02 (0.10) |
| Constant | 14.63 | 5.62 |
| Firm + Year fixed effects | YES | YES |
| Observations | 289,744 | 88,493 |

Source: Administrative data from the Ministry of Finance 2008-2014. Table A5 shows the results from the panel regression of firm profit margin on revenue from equation 23, as discussed in section 5. All firms with revenue in a 70 million CRC window centered around the thresholds are included in the sample. Profit margin is defined as profit over revenue. "Bunching" and "Above threshold" are dummies for declaring revenue in the intervals below and just above the threshold. Bunching narrow (broad) corresponds to reporting revenue in the half (half to four) million interval below the threshold.
Above threshold narrow (broad) is defined as having revenue between 0 and 3 (3 and 9) million above the threshold. Standard errors are shown in parentheses.

Table A6: Timing of Monthly Revenue at End of Fiscal Year

Dependent variable: Monthly Revenue (Million CRC). Columns (1)-(4): all firms. Columns (5)-(8): CIT revenue = sales tax revenue.

Buncher*Sept: 0.10 (0.12), 0.05 (0.11), 0.22 (0.12), 0.15 (0.07)
Buncher*Oct: -0.02 (0.12), -0.21 (0.13), -0.01 (0.18), -0.12 (0.10)

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| Firm Fixed Effects | NO | YES | NO | YES | NO | YES | NO | YES |
| Observations | 596,705 | 596,705 | 596,705 | 596,705 | 115,649 | 115,649 | 115,649 | 115,649 |
| R-squared | 0.01 | 0.01 | 0.01 | 0.01 | 0.64 | 0.64 | 0.64 | 0.64 |

Source: Administrative data from the Ministry of Finance on sales taxes 2008-2013. Table A6 tests for revenue retiming at the end of the fiscal year using the revenue reported on the monthly sales tax payment (section 7.5). Observations are at the firm-month level and are restricted to the 13,989 firm-year observations with corporate tax returns in a 30 million CRC window around the first threshold. Specifications (1)-(4) are run on the entire sample while specifications (5)-(8) are run on the subsample for which the corporate tax revenue matches the sum of monthly sales tax revenue (max 5% discrepancy). Robust standard errors are shown in parentheses.

Appendix B Optimal Tax System

This section presents the assumptions and simulations for the optimal tax base and tax rate results presented in section 6.2, following Best, Brockmeyer, Kleven, Spinnewijn and Waseem (2015). We return to the model of section 2.2, where a firm maximizes profits by choosing its revenue to produce and its costs to report.[57] In addition to the tax rate τ, the government now also sets the tax base µ, which is the share of tax deductible costs.
µ = 0 corresponds to a turnover tax and µ = 1 to a pure profit tax:

$$\Pi(y, \tilde{c}) = (1-\tau)\,y - c(y) + \tau\mu\,\tilde{c} - g(\tilde{c} - c(y)) \tag{1}$$

The first order conditions with respect to revenue produced $y$ and to reported costs $\tilde{c}$ are:

$$c'(y) = 1 - \tau\,\frac{1-\mu}{1-\tau\mu} = 1 - \tau_E \tag{2}$$

$$g'(\tilde{c} - c(y)) = \tau\mu \tag{3}$$

Revenue decreases with the effective tax rate $\tau_E$ and is undistorted under a pure profit tax. Evasion increases with the tax rate (higher $\tau$) and decreases with a wider tax base (lower $\mu$). The government maximizes revenue collection $T(y, \tilde{c}) = \tau(y - \mu\tilde{c})$ under the constraint that firms' total profits are unchanged from their current level.[58] Solving this problem numerically requires fixing the initial aggregate profit level and hence assuming a specific form for the production and resource cost of evasion functions. We follow the parametrization of Best et al. (2015), where the production function is governed by a firm-specific productivity parameter $A_i$, a fixed cost of production parameter $F_i$ and a constant elasticity of production $\varepsilon_y$:

$$y_i = A_i\,(c_i - F_i)^{\frac{\varepsilon_y}{1+\varepsilon_y}} \tag{4}$$

The evasion cost function is governed by a firm-specific evasion scale parameter $B_i$ and a constant elasticity of evasion $\varepsilon_{\tilde{c}-c}$ with respect to $\tau\mu$:

$$g_i(\tilde{c} - c(y)) = B_i\,\frac{\varepsilon_{\tilde{c}-c}}{1+\varepsilon_{\tilde{c}-c}}\,(\tilde{c} - c)^{\frac{1+\varepsilon_{\tilde{c}-c}}{\varepsilon_{\tilde{c}-c}}} \tag{5}$$

These imply the following production choices $(y_i, c_i)$ and evasion choices $\tilde{c}_i - c_i$:

$$y_i = A_i^{1+\varepsilon_y}\left((1-\tau_E)\,\frac{\varepsilon_y}{1+\varepsilon_y}\right)^{\varepsilon_y} \tag{6}$$

$$c_i = F_i + A_i^{1+\varepsilon_y}\left((1-\tau_E)\,\frac{\varepsilon_y}{1+\varepsilon_y}\right)^{1+\varepsilon_y} \tag{7}$$

$$\tilde{c}_i - c_i = \left(\frac{\tau\mu}{B_i}\right)^{\varepsilon_{\tilde{c}-c}} \tag{8}$$

[57] Note that we simplify the model of section 2.2 by not allowing revenue evasion.
[58] In this model tax evasion represents a net social loss and not just a transfer from the government to firms.

To run the tax policy simulations we use the elasticity estimates from section 4.3:
• We assume that the elasticity of production equals the revenue elasticity estimated in section 4.3 ($\varepsilon_{y,1-\tau_E} = 0.25$).
This elasticity represents an upper bound on the production elasticity, since part of the revenue response is due to evasion, as shown in section 7. In turn, this implies that we under-estimate the revenue gains from broadening the tax base presented below.
• We assume that the cost elasticity corresponds to an evasion elasticity, as supported by section 5, which shows that cost discontinuities cannot be mechanically explained by real responses given observed returns to scale. Note that the cost elasticity we estimate in section 4.3 measures the percentage change in reported costs as a function of the net-of-tax rate and is estimated at 0.62. Assuming that this elasticity is constant, the evasion level doubles when going from the first to the second bracket. The evasion level in the first tax bracket (τ = 10%), which corresponds to a 10% change in the net-of-tax rate starting from 1, is 0.27 = 0.062/(0.17 + 0.062), where 0.062 = 0.62 × 0.10 and 0.17 corresponds to firms' average observed profit margin. In the second bracket the evasion level is then 0.54 = 0.12/(0.17 + 0.062). Given the lower elasticity of cost found at the second threshold, we might be under-estimating total evasion levels if the cost elasticity is decreasing with firm size.
• We calibrate the firm-specific productivity parameter Ai, fixed cost parameter Fi, and evasion scale parameter Bi in order to match the firm revenue distribution, the reported cost distribution and the average evasion level discussed above.
In our main scenario we run this simulation for firms in a 60 million CRC window around the first threshold. For these firms the current tax policy is (τ = 0.1, µ = 1) if y ≤ yT and (τ = 0.2, µ = 1) if y > yT. We then calculate, for each pair of tax rate and tax base (τ, µ), firms' total after-tax profits, net of evasion cost, and the government's tax revenue gain.
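As a minimal sketch of how the closed-form choices (6)-(8) translate into outcomes for one firm at a given pair (τ, µ), the Python snippet below evaluates the effective tax rate, revenue, true costs, evasion, tax paid and after-tax profit. The closed forms are derived from the first order conditions (2)-(3) under the functional forms (4)-(5). The function names and the values used for A, F, B and the evasion elasticity are illustrative assumptions, not the calibrated parameters of the simulation.

```python
def effective_tax_rate(tau, mu):
    """Effective rate on production: tau_E = tau * (1 - mu) / (1 - tau * mu)."""
    return tau * (1.0 - mu) / (1.0 - tau * mu)


def firm_choices(tau, mu, A, F, B, eps_y, eps_e):
    """Closed-form choices of one firm under tax rate tau and tax base mu.

    A, F, B are the productivity, fixed cost and evasion scale parameters;
    eps_y and eps_e are the production and evasion elasticities.
    """
    tau_e = effective_tax_rate(tau, mu)
    k = (1.0 - tau_e) * eps_y / (1.0 + eps_y)
    revenue = A ** (1.0 + eps_y) * k ** eps_y               # equation (6)
    true_cost = F + A ** (1.0 + eps_y) * k ** (1.0 + eps_y)  # equation (7)
    evasion = (tau * mu / B) ** eps_e                        # equation (8)
    reported_cost = true_cost + evasion
    # Resource cost of evasion, equation (5)
    evasion_cost = B * eps_e / (1.0 + eps_e) * evasion ** ((1.0 + eps_e) / eps_e)
    tax = tau * (revenue - mu * reported_cost)
    profit = revenue - true_cost - tax - evasion_cost        # equation (1)
    return {
        "tau_e": tau_e,
        "revenue": revenue,
        "true_cost": true_cost,
        "evasion": evasion,
        "tax": tax,
        "profit": profit,
    }


# Hypothetical firm: parameter values are assumptions chosen for illustration.
example = firm_choices(tau=0.10, mu=1.0, A=30.0, F=30.0, B=0.01,
                       eps_y=0.25, eps_e=0.62)
for name, value in example.items():
    print(f"{name:>10s}: {value:.4f}")
```

Under a pure profit tax (µ = 1) the effective rate is zero, so revenue is undistorted and only reported costs respond to the tax rate.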
Starting from the current tax system (τ ), we consider all rate and base pairs which leave firms' total profits net of resource cost of evasion unchanged. Figure B1, panel (a), plots the revenue gains as a percentage of current revenue collected from these firms against the breadth of the tax base µ. On the right vertical axis, it plots the optimal tax rate corresponding to the chosen base level. Given that firms below the threshold only face a 10% rate and that firms above are on the wrong side of the Laffer curve, applying a higher rate of 17% increases revenue collection by 35% for these firms. Broadening the base while lowering the rate leads to revenue gains of up to 79%, which is reached for a base of 0.21 and a tax rate of 3.4%. However, we note that the revenue gains are large and very similar for a wide range of base parameters and their corresponding optimal tax rates. Panel (b) assumes that the new policy only applies to firms in a 30 million CRC interval above the threshold, and shows the revenue gains relative to the revenue collected previously on these same firms. For firms above the threshold, revenue gains of up to 48% can be achieved with a base of 0.17 and a tax rate of 2.5%.

Figure B1: Simulation results
[Figure: panel (a) Policy applies to all firms; panel (b) Policy applies above threshold. Each panel plots revenue gains (% of baseline, left axis) and the optimal tax rate (right axis) against the breadth of the tax base µ. µ = 0 corresponds to a turnover tax and µ = 1 to a pure profit tax.]
This figure shows the revenue collection gains (% of current revenue) as a function of the tax base. The optimal tax rate corresponding to the tax base chosen is shown on the right vertical axis.
Each panel corresponds to a different sample of firms considered and shows revenue collection gains relative to the revenue collected previously on these same firms. Panel (a) shows the optimal policy applied to all firms in a 60 million CRC interval around the 1st threshold: given that firms below the threshold only face a 10% rate and that firms above are locally on the wrong side of the Laffer curve, simply applying a higher rate of 17% increases revenue collection by almost 40% for these firms. Further broadening the base combined with lowering the rate leads to revenue gains of almost 80%. Panel (b) assumes that the new policy only applies to firms in a 30 million CRC interval above the threshold, and shows the revenue gains relative to the revenue collected previously on these same firms, which at the optimum represents a 45% increase.
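The logic behind a curve like the one in Figure B1 can be sketched as follows: for each tax base µ, find the tax rate that leaves firms' total profits (net of evasion costs) at their baseline level, then compare tax revenue with the baseline. The Python sketch below does this for three hypothetical firms. It is a simplified illustration under loud assumptions: the firm parameters and the evasion elasticity are made up, the baseline is a single 10% rate with µ = 1 rather than the two-bracket schedule, and the profit-neutral rate is taken to be the lowest rate that satisfies the constraint. It does not reproduce the figures reported in this appendix.

```python
EPS_Y = 0.25  # production elasticity (value quoted in the text)
EPS_E = 0.62  # evasion elasticity: illustrative assumption

# Hypothetical firms (A_i, F_i, B_i); NOT the calibrated parameters.
FIRMS = [(28.0, 27.0, 0.012), (30.0, 30.0, 0.010), (32.0, 33.0, 0.009)]


def totals(tau, mu, firms=FIRMS):
    """Return (total profit net of evasion cost, total tax revenue)."""
    tau_e = tau * (1.0 - mu) / (1.0 - tau * mu)
    k = (1.0 - tau_e) * EPS_Y / (1.0 + EPS_Y)
    profit = tax = 0.0
    for A, F, B in firms:
        y = A ** (1.0 + EPS_Y) * k ** EPS_Y                 # revenue
        c = F + A ** (1.0 + EPS_Y) * k ** (1.0 + EPS_Y)      # true costs
        x = (tau * mu / B) ** EPS_E                          # evasion
        g = B * EPS_E / (1.0 + EPS_E) * x ** ((1.0 + EPS_E) / EPS_E)
        t = tau * (y - mu * (c + x))                         # tax paid
        tax += t
        profit += y - c - t - g
    return profit, tax


def profit_neutral_rate(mu, target_profit, steps=500, tau_max=0.25):
    """Lowest tau on [0, tau_max] at which total profit equals the target.

    Scans a grid for the first crossing, then refines it by bisection.
    Returns None if profits never fall to the target on the grid.
    """
    prev = 0.0
    for i in range(1, steps + 1):
        tau = tau_max * i / steps
        if totals(tau, mu)[0] <= target_profit:
            lo, hi = prev, tau
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if totals(mid, mu)[0] > target_profit:
                    lo = mid
                else:
                    hi = mid
            return hi
        prev = tau
    return None


def revenue_gain_curve(mus, base_tau=0.10, base_mu=1.0):
    """For each base mu: (mu, profit-neutral tau, revenue gain in %)."""
    base_profit, base_tax = totals(base_tau, base_mu)
    curve = []
    for mu in mus:
        tau = profit_neutral_rate(mu, base_profit)
        if tau is None:
            continue
        gain = 100.0 * (totals(tau, mu)[1] / base_tax - 1.0)
        curve.append((mu, tau, gain))
    return curve


curve = revenue_gain_curve([i / 10 for i in range(11)])
for mu, tau, gain in curve:
    print(f"mu={mu:.1f}  tau={tau:.4f}  revenue gain={gain:6.1f}%")
```

With these made-up parameters the sketch only illustrates the mechanism: a broader base (lower µ) reduces the return to over-reporting costs, so the same level of firm profits is compatible with more tax revenue.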