Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence Through Tobacco Taxation in Ukraine

Acknowledgements

This report was prepared under the overall coordination of Patricio V. Marquez, Lead Public Health Specialist, Health, Nutrition and Population Global Practice, World Bank Group, by a team comprised of:

Laura Webber, Director, Public Health Modeling, UK Health Forum, London, UK, and Honorary Lecturer, School of Environmental Health, London School of Hygiene and Tropical Medicine.

Tatiana I. Andreeva, Associate Researcher, Alcohol and Drug Information Center (ADIC-Ukraine), Kiev, Ukraine, and Visiting Professor, Cluj School of Public Health, College of Political, Administrative and Communication Sciences, Babeș-Bolyai University, Cluj-Napoca, Romania.

Renzo Sotomayor, Health Specialist, Health, Nutrition and Population Global Practice, World Bank Group.

Abbygail Jaccard, Deputy Director, Public Health Modeling, UK Health Forum, London, UK.

Lise Retat, Mathematical Modeler, UK Health Forum, London, UK.

Michael Xu, Software Engineer, UK Health Forum, London, UK.

Comments, inputs, and advice were provided by:

Michal Stoklosa, Senior Economist, American Cancer Institute.

Konstantin Krasovsky, Head, Tobacco Control Unit, Institute for Strategic Research, Ministry of Health of Ukraine.

Alberto Gónima, Consultant, World Bank Group.

Joy Townsend, Emeritus Professor of Economics and Primary Health Care, Department of Social and Environmental Health Research, London School of Hygiene and Tropical Medicine.

Feng Zhao, Human Development Program Leader, Ukraine, Moldova, and Belarus, Ukraine Country Office, World Bank Group.

Olena Doroshenko, Health Economist, Health, Nutrition and Population Global Practice, World Bank Group.

Support was provided by Oleksandra Griaznova, Ukraine Country Office, World Bank Group, and Akosua O. Dakwa, Health, Nutrition and Population Global Practice, World Bank Group.

Draft versions of the report were peer reviewed by:

Professor Prabhat Jha, University of Toronto Chair in Global Health and Epidemiology, Dalla Lana School of Public Health, and Executive Director, Centre for Global Health Research, St. Michael’s Hospital, Canada.

Sheila Dutta, Senior Health Specialist, Health, Nutrition and Population Global Practice, World Bank Group.

The report was edited by Alexander Irwin.

The preparation of this report was supported under the World Bank’s Global Tobacco Control Program, co-financed by the Bill and Melinda Gates Foundation and Bloomberg Foundation.

Kiev, London and Washington, DC, August 2016-February 2017.

Introduction

Smoking is a leading cause of preventable premature deaths. Smoking’s effects will continue to devastate lives in many countries, including Ukraine, if measures are not implemented to reduce its prevalence. Smoking is a major cause of many chronic diseases, such as cardiovascular disease, respiratory disease, and smoking-related cancers.

Over recent years, successful tobacco control policies in Ukraine have resulted in one of the fastest declines in smoking prevalence in the world (1). This is largely due to multifaceted tobacco control legislation, adopted from 2005 and subsequently upgraded.

Ukraine ratified the WHO Framework Convention on Tobacco Control (FCTC) in 2006. Currently, Ukrainian legislation basically corresponds to FCTC requirements. In 2005, Ukraine adopted its first tobacco-control law. Since then, several additional tobacco-control policies have been implemented in the country.

Smoke-free policies supported by media campaigns have covered many workplaces and public places since the middle of 2006. Under these policies, at least 50 percent of the area of restaurants and bars had to be isolated from the smoking area, so that tobacco smoke did not penetrate into smoke-free areas. This measure was supported by an intensive media campaign and public movement in favor of smoke-free policies. Many restaurants went completely smoke-free both before and after implementing this measure. As of December 2012, restaurants, workplaces, and other public places became 100 percent smoke-free. Designated smoking places, which figured in the legislation between 2006 and 2012, were abolished in the amended laws.

As of late 2006, cigarette packs carried textual warning labels covering 30 percent of their surface, in place of a previous warning which covered 10 percent of the front surface and stated: “Ministry of Health warns: Smoking is bad for your health.” Since October 4, 2012, large (50 percent of the pack surface area), graphic health-warning labels on tobacco packaging have been introduced.

A ban on outdoor tobacco advertising since January 2009 was followed by a more comprehensive tobacco advertising ban, which entered into force on September 16, 2012.

In addition to the reduced tobacco affordability observed during the global economic recession, the average tax incidence was increased between August 2008 and July 2010 from 0.5 UAH (Ukrainian hryvnia, the national currency of Ukraine) to 3 UAH per cigarette pack. Further changes in tobacco tax rates were less substantial and were above inflation only in some years.

However, while the policies described were definitely beneficial, much progress remains to be made. As of late 2015, the prevalence of current smoking among men in Ukraine was 45 percent (2, 3), although prevalence is much lower in women, at 11 percent (2, 3).

It is not a given that smoking prevalence will continue to decrease in Ukraine unless effective tobacco control measures are sustained and strengthened. Especially when the economy grows, commodities/luxuries such as smoking will become more affordable.

Tobacco industry tactics can also become an important factor in determining the level of cigarette consumption. One of the mechanisms of this influence derives from the industry’s right to determine the maximum retail price of cigarettes and thus to manipulate the net-of-tax portion of the price. In 2016, a new tax policy stipulated that the minimal specific tobacco tax increase by 40 percent. Thus, the retail price was expected to increase and the consumption of cigarettes to decrease. However, the actual level of cigarette consumption increased. This happened because tobacco companies, aiming to keep their customers, initiated “price wars.” This example illustrates that more factors are at play than are usually taken into account in weighing policy choices. Price and tax factors are extremely important and need to be considered when forecasting trends.

The present report provides evidence from a modeling exercise undertaken to predict the health and related cost impacts that may stem from the implementation of a tobacco excise tax increase in Ukraine. Impacts are calculated relative to the status quo before the tax hike, and are modeled, beginning in 2017, for 2025 and 2035.

A microsimulation model was employed to simulate the long-term impact of tobacco taxation on the future burden of a range of non-communicable diseases (NCDs). Specifically, the disease outcomes quantified were coronary heart disease (CHD), stroke, chronic obstructive pulmonary disease (COPD), and lung cancer. The microsimulation model has been deemed by the OECD the most relevant method for NCD modeling based on risk-factor data (4). This report complements modeling work done to estimate the fiscal-revenue impact and expected reduction in consumption that might stem from proposed additional tobacco excise tax increases in Ukraine. That work was carried out by the World Bank, using the Tobacco Tax Simulation Model (TaXSiM) developed by the World Health Organization (WHO) (5, 6).

Summary of results

Table 1 presents a summary of total disease cases (epidemiological) and costs (economic) by parameter, year, and scenario as rates per Ukraine population.

The model estimated that by 2035 the specified tax increase would result in the avoidance of 126,730 new cases of smoking-related disease; 29,172 premature deaths; and 267,098 potential years of life lost, relative to no change in tax. These reductions in disease and death will avoid 1.5 billion UAH in healthcare costs and 16.5 billion UAH in premature mortality costs, respectively.

Table 1: Summary Table of the Outputs as Rates per Ukraine Population, by Year

Output | Year | Sc0 (Baseline) | Sc1
Epidemiological outputs | | |
Cumulative incident cases | 2025 | 5480948 [±4237] | 5427558 [±4237]
Cumulative incident cases | 2035 | 11366868 [±5753] | 11255173 [±5753]
Cumulative incident cases avoided | 2025 | NA | 56224 [±6341]
Cumulative incident cases avoided | 2035 | NA | 126730 [±9123]
Incident cases per year | 2025 | 589035 [±1545] | 582341 [±1545]
Incident cases per year | 2035 | 646600 [±1545] | 640799 [±1727]
Attributable incident cases | 2025 | 218221 [±1121] | 208475 [±1121]
Attributable incident cases | 2035 | 222603 [±1041] | 211984 [±1041]
Cumulative premature deaths avoided | 2025 | NA | 6372
Cumulative premature deaths avoided | 2035 | NA | 29172
Cumulative potential years of life lost relative to baseline | 2025 | NA | 48923
Cumulative potential years of life lost relative to baseline | 2035 | NA | 267098
Economic outputs | | |
Direct costs avoided (millions UAH) | 2025 | NA | 542.23
Direct costs avoided (millions UAH) | 2035 | NA | 1545.81
Cumulative premature mortality costs avoided (millions UAH) | 2025 | NA | 3568.4
Cumulative premature mortality costs avoided (millions UAH) | 2035 | NA | 16536.4

Summary of Methods

Methods

- The model simulates a virtual population of Ukraine, based on known population statistics.
- Initial smoking prevalence by age and sex was extracted from the 2015 Annual Household Survey conducted by the National Statistics Service of Ukraine.
- Scenarios took account of price impacts on uptake of smoking and cessation.
- Individuals within the model have a specified smoking status and a probability of contracting, dying from, or surviving a disease.
- Future prevalence of smoking is calculated based on the numbers of smokers and non-smokers who are still alive in a particular year.
- Data for disease incidence and mortality were extracted from the Global Burden of Disease database.
- Relative risks of contracting diseases in smokers compared to non-smokers were extracted from peer-reviewed literature.
- A five-module microsimulation model was used to predict the future health and economic impacts of smoking prevalence by 2035.
- The model quantifies the future impact on health and related costs of different levels of tax increase relative to a “no change” scenario.

Assumptions

- Smoking prevalence follows a static trend from 2015 estimates.
- A specified percentage of smokers who are affected by the tax increase move to the “never-smoker” category in 2017, in order to account for reductions in uptake due to price increases.
- If an individual’s smoking status is changed by the intervention, their smoking status will then remain fixed for the entire simulation.
- Time since cessation is included in the model to account for change in disease risk for an ex-smoker.
- Smokers react quickly to the tax: we modeled an immediate effect and then a linear trend, in line with TaXSiM.

Limitations

- No data on survival for the specified NCDs were available.
- Data on the percentages of ex-smokers in Ukraine are limited.
- The model does not take account of future changes in policy or technology.
- No change in secondhand smoke exposure is modeled.
- Baseline is static over time.
- The simulation only includes four smoking-related diseases, so results are likely underestimates of the true effects.
- No data on non-healthcare costs, e.g. lost productivity due to disease, were available.
- No data were available to explore differences by social groups.
- For the scenarios, smokers were moved to never smokers in order to account for change in uptake. This will overestimate the impact of the tax increases.
- No in-depth uncertainty analysis was conducted.

Methods

Data Collection

Smoking Prevalence Data

Baseline smoking prevalence data were extracted from the 2015 Annual Household Survey, conducted by the National Statistic Service in Ukraine (7). Additional data on percentages of occasional smokers and ex-smokers were extrapolated from the omnibus surveys conducted by Kiev International Institute of Sociology (2, 3).

Data manipulation and assumptions

1. Daily versus current smokers: The Annual Household Survey provides prevalence data on daily smoking only, rather than current smoking. Current smoking data are preferable, since the WHO target is focused on a total reduction in smoking prevalence, rather than number of cigarettes smoked. Modeling proceeded with a focus on prevalence of current smoking. Pooled estimates from other smaller surveys with more detailed collection of smoking status data were collated and the 2015 Household Survey adjusted to include estimated proportions of occasional smokers.

2. Ex-smokers: No data by age and sex were available for percentages of ex-smokers within the Annual Household Survey. However, some data were available from the recent omnibus surveys (2013-2015). Therefore, in order to take account of ex-smokers (who have a greater disease risk than never smokers), the distributions of ex-smokers and never smokers from the omnibus surveys were used to proportion out the non-smoker data from the Annual Household Survey into ex-smokers and never smokers. This enabled us to initialize the model in the start year with a more accurate estimation of ex-smokers than would be done using proxy ex-smoker data.

3. Sample sizes: Often, sample sizes by age group were not presented, therefore the total sample size was proportioned across the five-year age groups and the variance increased based on data from the UN population prospects database (8).

4. Age groups: Data for some years were in wide age groups of more than 20 years (e.g., 30-59 years), therefore prevalence was assumed to be the same across these groups. Once raw data, more detailed data, or data by five-year age groups become available, the data can be updated.

Disease Data

For this study, the following smoking-related NCDs were modeled: coronary heart disease (CHD), stroke, lung cancer, and chronic obstructive pulmonary disease (COPD). Incidence and mortality data by age and sex were extracted from the Global Burden of Disease study database (9). Lung cancer data were grouped with trachea and bronchus data in the database, which slightly overestimates cases relative to Globocan (10). No survival data were available for these diseases in Ukraine, therefore survival data were calculated using DISMOD equations from the World Health Organization (11).

Table 2: References for Disease Data

Disease | Incidence | Mortality | Survival | Direct healthcare costs
CHD | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)
Stroke | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)
COPD | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)
Lung Cancer | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)

Relative risks (RR) for smokers compared to non-smokers were extracted from prospective cohort studies which observed the development of CHD (12-17), COPD (17-21), lung cancer (17, 21-23), and stroke (24-26).
As various cohort studies usually observed participants of different age groups, their estimates were compared and combined to cover the modeled population: Thus, relative risks for various age groups may derive from different studies. However, if RRs for neighboring age groups from various studies differed much, some smoothing was undertaken. Appendix 3 describes the method of creating RRs in more detail.

For ex-smokers, RRs were assumed to decrease over time since cessation. The ex-smoker RR was computed using a decay function method developed by Hoogenveen and colleagues (27). This function uses the current smoker RR for each disease as the starting point and then models the decline in relative risk of disease for an ex-smoker over time, as detailed in Appendix 1 of the supplementary appendix.

Health-Economic Data

Data on direct health care costs by disease were extracted from the literature (28), but no data on indirect, non-healthcare costs by disease were available. Data on direct health care costs were included in the model and the direct healthcare cost impacts output from the model.

It was possible to calculate premature mortality costs by including average annual income.1 This accumulates the lost earnings due to death before age 65 to provide a different measure of lost productivity in terms of losses of GDP due to death. However, this does not take account of losses in productivity due to morbidity. We carried out a sensitivity analysis on the costs, running the model with a discount rate of 5 percent, as is used in Russia (http://www.ispor.org/peguidelines/countrydet.asp?c=18&t=4). No discount rate was available for Ukraine.

1 http://data.worldbank.org/indicator/NY.GDP.PCAP.CD?year_high_desc=true - average of 5 years taken from the World Bank and OECD national accounts data: $3320 per year

Population Data

In order to simulate the population of Ukraine, the population by age and sex, births by mother’s age, and total fertility rate statistics were taken from the UN population prospects database (8).

Total mortality rates were taken from the WHO global health estimates database (29).

These parameters enable the model to simulate the Ukrainian population as close to reality as is possible.

The Microsimulation Model

The UK Health Forum (UKHF) microsimulation model was originally developed for the English government’s Foresight enquiry (30, 31) and has been further developed over the past decade to incorporate a number of additional interacting risk factors, including smoking. (Methods are described in greater detail in (32, 33) and in our supplementary appendix 1.) The model simulates a virtual population that reproduces the characteristics and behavior of a large sample of individuals (20-100 million). These characteristics (age, sex, smoking status) can evolve over the life course based on known population statistics and risk factor data. Individuals can be born and die in the model.

Figure 1 illustrates the modular nature of the model.

Module 1 uses cross-sectional data on the prevalence of the risk factor - cigarette smoking in this case. For the current study, 2015 smoking prevalence data for Ukraine was extrapolated forward to 2035. It was assumed that the proportions of the population within each smoking category as calculated in 2015 remained constant until 2035.

Module 2 is a microsimulation model which uses the prevalence of the risk factor over time, along with the specified data on the risks of developing diseases, to make projections of future disease burden.
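To make the interplay between a static risk-factor prevalence (Module 1) and disease-risk projection (Module 2) more concrete, the following is a minimal illustrative sketch and not the UKHF model. It assumes a closed cohort with no births, deaths, ageing, or ex-smokers, and every name and number in it (the function `simulate_incident_cases`, the prevalence, baseline incidence, and relative risk values) is a hypothetical placeholder rather than a value from this report.

```python
import random


def simulate_incident_cases(n_people, smoking_prevalence, baseline_incidence,
                            relative_risk, years, seed=0):
    """Toy Monte Carlo projection of cumulative incident cases.

    All inputs are hypothetical placeholders, not values from the report.
    """
    rng = random.Random(seed)
    # Module 1 analogue: assign smoking status from a prevalence held static.
    smokers = [rng.random() < smoking_prevalence for _ in range(n_people)]
    diseased = [False] * n_people
    cumulative_cases = 0
    for _ in range(years):
        for i in range(n_people):
            # Draw for everyone so that runs sharing a seed stay comparable.
            u = rng.random()
            if diseased[i]:
                continue  # only disease-free individuals are susceptible
            # Module 2 analogue: smokers face the baseline risk times the RR.
            risk = baseline_incidence * (relative_risk if smokers[i] else 1.0)
            if u < risk:
                diseased[i] = True
                cumulative_cases += 1
    return cumulative_cases


# Hypothetical comparison: same cohort and random draws, lower prevalence.
baseline = simulate_incident_cases(20_000, 0.40, 0.005, 3.0, years=10)
scenario = simulate_incident_cases(20_000, 0.35, 0.005, 3.0, years=10)
print("cases avoided in this toy run:", baseline - scenario)
```

Reusing one seed for both runs means the two scenarios share the same random draws, so the difference between them reflects the change in prevalence rather than sampling noise. The model described in this report additionally tracks demography, time since cessation, several diseases, and costs, none of which this sketch attempts.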
The model produces a wide range of different outputs, including incidence, cumulative incidence, prevalence, premature mortality, direct healthcare costs avoided, and disability-adjusted life years.

To our knowledge, no other studies have used a microsimulation model to quantify the future costs and health impacts of tobacco taxation policy scenarios in Ukraine.

Figure 1: Illustration of the Microsimulation Model
[Diagram. Labels: risk data, population data, disease data, health economic data, intervention scenarios (input datasets); distribution programme and UKHF Microsimulation© programme (software programmes); output data (output).]

Development of Scenarios

An initial modeling study was carried out using the World Health Organization (WHO) TaXSiM model.2 Within this model, a scenario that reflects tobacco excise tax changes in 2017 was simulated to calculate the revenue impact as a result of this tax increase (Table 3). TaXSiM also calculated the percentage reduction in total cigarette consumption (%) due to the suggested tax changes. These taxation changes result in non-smokers (predominantly young people) not initiating smoking, smokers quitting, and smokers reducing the number of cigarettes smoked. Details of the TaXSiM model scenarios can be found in Table 3.

2 WHO tobacco tax simulation model (TaXSiM) http://who.int/tobacco/economics/taxsim/en/

Table 3: TaXSiM Model Scenarios and Outputs

Column definitions:
Actual 2015.
Expected Baseline Situation (2016): Ad valorem (12%), minimum specific (8.515 UAH) and simple specific (6.365 UAH).
SCENARIO 1 (2017): Ad valorem tax is the same as in 2016 (12%), and 40% increase in both the minimum specific excise (11.92 UAH), and simple specific (8.91 UAH)**.

Indicator | Actual 2015 | Contribution to GDP | Expected Baseline Situation (2016) | Expected contribution to GDP | Scenario 1 (2017) | Expected contribution to GDP
Total cigarettes taxed (billion pieces) | 74.0 | | 77.0 | | 70.1 |
Average cigarette price (UAH per pack) | 15.3 | | 20.8 | | 25.7 |
Average cigarette price (US$ per pack)* | $0.63 | | $0.81 | | $0.92 |
Average excise tax (UAH per 1000 pieces) | 308.9 | | 431.4 | | 600.0 |
Total excise tax revenue (billion UAH) | 22.9 | 1.0% | 33.2 | 1.4% | 41.8 | 1.6%
Total excise tax revenue (US$ billion)* | $0.94 | | $1.30 | | $1.50 |
Total government revenue (excise, VAT and levies, billion UAH) | 34.9 | 1.6% | 49.9 | 2.2% | 60.1 | 2.3%
Total government revenue (excise, VAT and levies, US$ billion)* | $1.44 | | $1.95 | | $2.16 |
Total expenditure on cigarettes (billion UAH) | 56.4 | | 79.9 | | 90.0 |
Percentage change in total cigarette consumption (%) | | | 4.1 | | -9.0 |

* World Bank Group actual (2015-2016) and forecast (2017): Annual average exchange rate = 2016 (1US$/25.6 UAH); 2017 (1US$/27.80 UAH)
** per pack of 20 cigarettes

Scenario Assumptions

1. Several studies suggest that around 50 percent of the effect of price increases on overall cigarette consumption results from participation changes (34, 35). Therefore, 50 percent of the estimated reduction in cigarette consumption was used as an estimate of the reduction in the total prevalence of smoking. While taxation which results in increased real prices of tobacco might reduce the intensity of smoking, research suggests that people who cut down actually inhale more, as measured by serum cotinine levels (36). Further, the WHO target is focused on a total reduction in smoking prevalence. Therefore, modeling proceeded with a focus on current smoking prevalence, as opposed to the number of cigarettes smoked.

2. Our analysis of the omnibus surveys 2013-2015 showed that, in males, 55 percent of the change in smoking prevalence was due to a reduction in uptake. Specifically, the percentage of smokers decreased, the percentage of ex-smokers did not change, while the percentage of never smokers increased. Therefore, these changes were probably due to males “not starting smoking.” Among females, 100 percent of the change in consumption was due to “not starting smoking” (2).

3. While these average changes were not the same for each group, and people usually initiate smoking while they are under 30 years old (2), the model did not take these age differences into account, and the relative decline in percentages of current smokers was applied to all age groups. Therefore, it was assumed that taxation would result in changes in uptake.

4. A baseline “static” trend was included. This assumed that smoking prevalence remains constant at 2015 rates. The tax increase scenario was compared to this baseline.

5. The tax increase scenario represents the tax change adopted in January 2017.

6. The change in smoking prevalence occurs in the second year of the simulation only (2017). This is in line with TaXSiM. As noted above, tobacco companies’ “price wars” were observed to cause an increase in cigarette consumption, rather than the decrease expected according to taxation policies. Taking this into account, it was assumed that, in 2016, there would be no change in smoking prevalence from the 2015 level (37). The baseline used in the model was 2015 smoking prevalence held constant.

7. The scenarios are based on Monte Carlo simulations (individuals were sampled from the population and run through the simulation).

8. The specified percentage of smokers who are affected by the tax increase move to the never-smoker category in 2017.

9. If an individual’s smoking status is changed by the scenario, their smoking status will remain fixed for the entire simulation.

10. We assumed an immediate reduction in smoking prevalence due to the tax increases in 2017. We learned via personal communication with Prof. Joy Townsend that there are different views on the temporal impact of a tax: Econometricians follow Becker’s model, assuming that, as tobacco is very addictive, the reaction to price increases is slow and greater in the long run. Becker, therefore, uses a lagged variable y(t-1) (38). Townsend and Atkinson take the opposite view (39): that smokers tend to react quickly to a price change. We used a model similar to theirs, with an immediate effect and then a linear trend, and in line with the TaXSiM model outputs.

There were two scenarios:

1. A baseline “static” trend. This assumed that smoking prevalence stays constant at 2015 rates.

2. A tax increase scenario. An earlier iteration of TaXSiM calculated that an increase in ad valorem tax of 15 percent, a 30 percent increase in the minimum specific excise, 11.08 UAH, and a simple specific of 8.28 UAH would result in a reduction of 10.2 percent in cigarettes smoked. Using the assumptions above, this translated into a reduction in uptake of 5.61 percent in males (10.2 x 0.55), and 10.2 percent in females (10.2 x 1.00). Therefore, in 2017, this specified percentage of smokers was moved to the never-smoker group in order to take account of uptake and maintain 100 percent of smokers in the model (the population cannot exceed 100 percent). This slightly underestimates the effect of the scenario, as described in the discussion. This change occurred in 2017 only. Appendix 2 provides the full TaXSiM analysis from the earlier iteration.

Scenario 1 is summarized in Table 4:

Table 4: Summary of Scenarios

% reduction in cigarettes consumption as per Table 2 (%) | Estimated expected reduction in smoking (males): Uptake (%) | Estimated expected reduction in smoking (males): Number of cigarettes smoked (%) | Estimated expected reduction in smoking (females): Uptake (%) | Estimated expected reduction in smoking (females): Number of cigarettes smoked (%)
10.2 | 5.61 | 4.59 | 10.2 | 0

Results

A number of different outputs are produced from the model, and these are defined below:

Smoking Prevalence (%)

Table 5 shows smoking prevalence for males, females, and both males and females combined for the baseline and Scenario 1.
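As a rough consistency check on how the Scenario 1 inputs relate to the prevalence results, the one-off shift of smokers into the never-smoker group (5.61 percent of male smokers and 10.2 percent of female smokers, from Table 4) can be applied directly to the 2020 baseline prevalences reported in Table 5 (40.3 percent for males, 5.4 percent for females). The sketch below is only a back-of-the-envelope approximation under that assumption; the helper name `shifted_prevalence` is hypothetical, and the shortcut ignores the births, deaths, and ageing that the full microsimulation tracks.

```python
# Uptake reductions from Table 4 (percent of smokers moved to never-smokers).
UPTAKE_REDUCTION = {"male": 5.61, "female": 10.2}

# 2020 baseline (Scenario 0) smoking prevalence from Table 5, in percent.
BASELINE_2020 = {"male": 40.3, "female": 5.4}


def shifted_prevalence(prevalence_pct, reduction_pct):
    """Prevalence left after moving a share of smokers to never-smokers."""
    return prevalence_pct * (1 - reduction_pct / 100)


# Approximate Scenario 1 prevalence in 2020, rounded to one decimal place.
approx_2020 = {
    sex: round(shifted_prevalence(BASELINE_2020[sex], UPTAKE_REDUCTION[sex]), 1)
    for sex in BASELINE_2020
}
print(approx_2020)
```

The approximation comes to about 38.0 percent for males and 4.8 percent for females, close to the 38.2 and 4.9 percent that Table 5 reports for Scenario 1 in 2020. The small gap is expected, since the reported figures come from the full model rather than from this shortcut.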
0 Females Females Table 5: Smoking Prevalence by Year, Sex and Scenario (%) Baseline Scenario 0 Baseline Scenario 1 Year Scenario 0 (baseline) Scenario 1 Figure 3: Female Smoking Prevalence by Year for Each Scenario. M F TOTAL M F TOTAL 2016 40.8 5.7 21.5 40.8 5.7 21.5 Epidemiological Indicators 2020 40.3 5.4 21.1 38.2 4.9 19.8 Results from the microsimulation are presented as rates per 100,000, then 2025 39.4 5.3 20.5 37.5 4.8 19.4 scaled to the Ukraine population for that year, as estimated by the UN 2030 38.9 5.1 20.1 37.2 4.8 19.2 population prospects (8). 2035 38.2 5.0 19.8 36.7 4.7 19.0 1. Cumulative incidence rate per year per Ukraine population The total number of new cases of a disease divided by the total number of susceptible people in a given year and accumulated over a specified period 2016 42 of the simulation from the year 2016. Therefore, the cumulative number of 2020 incident cases represents a sum of all of the incident cases from the start of the 41 2025 simulation. 2030 40 2035 2. Cumulative incidence avoided per Ukraine population over the Percentage (%) 39 simulation period 38 The total number of incident cases of disease avoided or gained as compared 37 to baseline (i.e., scenario 0). A positive value represents the number of cases avoided, whereas a negative value represents the number of cases gained. 36 35 3. Incidence The total number of new cases of a disease, divided by the total number of 34 susceptible people in a given year presented as a rate per population. Males Males Baseline Scenario 0 Baseline Scenario 1 4. Attributable incidence rate per Ukraine population per year The number of new cases of a disease attributable to being a smoker or ex- Figure 2: Male Smoking Prevalence by Year for Each Scenario. smoker in the Ukraine population. 14 15 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine 5. 
Premature mortality rates per Ukraine population
Premature mortality refers to the total number of deaths in a given year that occur before the individual's life expectancy in the Ukraine population. Results are presented per year in the total population and cumulatively over a given period of the simulation.

6. Potential years of life lost per Ukraine population
For each individual, the difference between the reference age (life expectancy at birth) and the age of death is calculated. The average annual PYLL was calculated each year in the microsimulation. This metric considers individuals who have died in a given year and is output as a rate per 100,000, which is then scaled to a rate per Ukraine population.

Economic outputs

7. Direct cost avoided
These are cumulative direct costs across the period of the simulation. For example, the result for 2020 represents the cumulative costs avoided for the period 2016 to 2020. These costs are scaled to the total population of Ukraine.

8. Premature mortality costs
This relates to lost earnings due to premature deaths. The premature mortality costs for each individual in the year of death are calculated by summing the income costs from the age of death until the individual's life expectancy (LE) at birth.

Summary Table
Table 6 presents a summary table of total disease cases (epidemiological) and costs (economic) by parameter, year, and scenario as rates per Ukraine population.

Scenario 0 (Sc0) refers to the baseline scenario where smoking prevalence was assumed constant based on 2015 smoking prevalence.
Scenario 1 (Sc1) refers to the one-off tax scenario as summarized in table 3.

Table 6: Summary Table of the Outputs as Rates per Ukraine Population, by Year

Output                                                          Year   Sc0 (Baseline)      Sc1
Epidemiological outputs
Cumulative incident cases                                       2025   5480948 [±4237]     5427558 [±4237]
                                                                2035   11366868 [±5753]    11255173 [±5753]
Cumulative incident cases avoided                               2025   NA                  56224 [±6341]
                                                                2035   NA                  126730 [±9123]
Incident cases per year                                         2025   589035 [±1545]      582341 [±1545]
                                                                2035   646600 [±1545]      640799 [±1727]
Attributable incident cases                                     2025   218221 [±1121]      208475 [±1121]
                                                                2035   222603 [±1041]      211984 [±1041]
Cumulative premature deaths avoided                             2025   NA                  6372
                                                                2035   NA                  29172
Cumulative potential years of life lost relative to baseline    2025   NA                  48923
                                                                2035   NA                  267098
Economic outputs
Direct costs avoided (millions UAH)                             2025   NA                  542.23
                                                                2035   NA                  1545.81
Cumulative premature mortality costs avoided (millions UAH)     2025   NA                  3568.4
                                                                2035   NA                  16536.4

Cumulative Incident Cases
Table 7 presents the cumulative incident cases for each disease by year, and Table 8 presents the cumulative incident cases avoided.

Table 7: Cumulative Incident Cases for Each Disease by Year for the Total Ukraine Population

Year   Scenario   CHD                COPD               Lung Cancer       Stroke             Total
2025   Sc 0       3712722 [±3390]    665256 [±1695]     194068 [±847]     908901 [±1695]     5480948 [±4237]
2025   Sc 1       3679248 [±3390]    655510 [±1695]     188136 [±847]     904664 [±1695]     5427558 [±4237]
2035   Sc 0       7707697 [±4719]    1346232 [±1966]    392110 [±1180]    1920828 [±2360]    11366868 [±5753]
2035   Sc 1       7638872 [±4719]    1323421 [±1966]    379132 [±1180]    1913749 [±2360]    11255173 [±5753]

Figure 4: Cumulative Incident Cases per Ukraine Population by 2025 and 2035.

Table 8: Cumulative Incident Cases Avoided Relative to Scenario 0 for the Ukraine Population by 2025 and 2035

Year   CHD              COPD             Lung Cancer      Stroke          Total
2025   35252 [±4908]    10263 [±2677]    6247 [±1338]     4462 [±2677]    56224 [±6341]
2035   78092 [±7586]    25881 [±3123]    14725 [±1784]    8032 [±3569]    126730 [±9123]

Figure 5: Cumulative Incident Cases Avoided.

Incident and Attributable Incident Cases per Year
Table 9 presents the incidence rates per Ukraine population, and Table 10 shows the incidence rate attributable to smoking per Ukraine population by year for each disease. Figure 8 presents the incident cases by scenario for 2025 by disease. The blue bars show incident cases and the red bars show incident cases attributable to smoking in the specified year per Ukraine population. For scenario 1, the cases attributable to smoking contribute a smaller portion of the overall new cases than at baseline (scenario 0). This is to be expected, since the scenario affects smokers, so we would expect the avoided cases attributable to smoking to increase over time.

Table 9: Incident Cases in the Total Population per Year

Year   Scenario   CHD               COPD            Lung Cancer     Stroke           Total
2025   Sc 0       398492 [±1338]    70059 [±446]    20080 [±446]    100404 [±446]    589035 [±1545]
2025   Sc 1       394476 [±1338]    68274 [±446]    19634 [±446]    99957 [±446]     582341 [±1545]
2035   Sc 0       439546 [±1338]    73629 [±446]    21419 [±446]    112006 [±446]    646600 [±1545]
2035   Sc 1       435976 [±1338]    71844 [±446]    20527 [±446]    112452 [±892]    640799 [±1727]

Figure 6: Incident Cases in Ukraine in 2025 and 2035.

Figure 7: Attributable Incident Cases per Ukraine Population in the Years 2025 and 2035.
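As a quick arithmetic check of the statement that smoking-attributable cases make up a smaller share of new cases under scenario 1, the sketch below divides the attributable incident cases by the incident cases per year, using the totals reported in Table 6. The variable and function names are illustrative only and are not part of the study's model.

```python
# Totals per year taken from Table 6: (incident cases, attributable incident cases).
# The names below are illustrative; only the numbers come from the report.
TOTALS = {
    (2025, "Sc0"): (589035, 218221),
    (2025, "Sc1"): (582341, 208475),
    (2035, "Sc0"): (646600, 222603),
    (2035, "Sc1"): (640799, 211984),
}


def attributable_share(incident, attributable):
    """Fraction of new cases in a year that are attributable to smoking."""
    return attributable / incident


shares = {key: attributable_share(*values) for key, values in TOTALS.items()}

for (year, scenario), share in sorted(shares.items()):
    print(f"{year} {scenario}: {share:.1%} of incident cases attributable to smoking")
```

In both years the scenario 1 share is lower than the baseline share, by a little over one percentage point.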
Table 10: Attributable Incident Cases per Year

Year   Scenario   CHD              COPD            Lung Cancer     Stroke          Total
2025   Sc 0       145339 [±847]    35170 [±424]    15678 [±424]    22034 [±424]    218221 [±1121]
2025   Sc 1       138983 [±847]    33051 [±424]    15254 [±424]    21187 [±424]    208475 [±1121]
2035   Sc 0       148664 [±787]    36183 [±393]    15338 [±393]    22418 [±393]    222603 [±1041]
2035   Sc 1       142371 [±787]    33823 [±393]    14552 [±393]    21238 [±393]    211984 [±1041]

Figure 8: Incident and Attributable Incident Cases in 2025 for Baseline and Scenario 1 by Disease.

Premature Deaths
Table 11 presents the premature deaths, premature deaths avoided, and cumulative premature deaths avoided in the total Ukraine population relative to scenario 0. The results show that, by 2025, 6,372 premature deaths will be averted given the scenario 1 tax increase. By 2035, this increases to 29,172 premature deaths averted for scenario 1. Figure 9 presents the cumulative premature deaths avoided by scenario for 2025 and 2035.

Table 11: Premature Mortality in the Total Ukraine Population

Year   Scenario   Premature deaths   Premature deaths avoided   Cumulative premature deaths avoided
2025   Sc 0       307204             NA                         NA
2025   Sc 1       305086             2119                       6372
2035   Sc 0       328004             NA                         NA
2035   Sc 1       326824             1180                       29172

Figure 9: Cumulative Premature Deaths Avoided in the Ukraine Population by 2025 and 2035.

Potential Years of Life Lost
Table 12 presents the cumulative potential years of life lost (PYLL) for each scenario by year in the total Ukraine population. By 2035, scenario 1 is predicted to avoid 267,098 PYLL relative to baseline.

Table 12: Cumulative PYLL by scenario and year, and PYLL avoided due to scenario 1 relative to scenario 0

Year   Sc 0        Sc 1        PYLL avoided (Sc 1 rel. Sc 0)
2025   47173316    47124394    48923
2035   92950950    92683852    267098

Direct Costs Avoided
Table 13 presents the cumulative direct healthcare costs avoided for scenario 1 relative to scenario 0. Relative to scenario 0, scenario 1 results in the avoidance by 2035 of the following direct healthcare costs by disease: CHD (UAH 1.1bn/US$130 million; see footnote 3), followed by COPD (UAH 0.16bn/US$25 million).

3 The exchange rate of 1US$/23.8 UAH is used here.

Table 13: Direct Cumulative Healthcare Costs Avoided (UAH millions), Scenario 1 Relative to Scenario 0

Year   CHD               COPD              Lung Cancer       Stroke            Total
2025   408.37 [±0.06]    26.16 [±0.06]     54.05 [±0.06]     53.65 [±0.06]     542.22 [±0.12]
2035   1133.1 [±0.08]    160.54 [±0.08]    143.35 [±0.08]    108.82 [±0.08]    1545.82 [±0.16]

Figure 10: Direct Cumulative Healthcare Costs Avoided (UAH millions)

As expected, discounting at 5 percent has a large impact on the cumulative direct costs avoided by 2025 and 2035. For example, scenario 1 is predicted to result in 399 million UAH avoided by 2025 with discounting, compared to 542 million UAH avoided by 2025 without discounting. The results are presented in Table 14.

Table 14: Direct Cumulative Healthcare Costs Avoided (UAH millions) (with 5 percent discounting), Scenario 1 Relative to Scenario 0

Year   CHD       COPD     Lung Cancer   Stroke   Total
2025   301.18    16.66    40.76         40.13    398.74
2035   660.51    82.23    85.24         68.36    896.34

Premature Mortality Costs Avoided
Table 15 presents the premature mortality costs avoided, relative to baseline. In 2035 alone, UAH 1.97 billion (US$82.7 million) in premature mortality costs could be avoided. Cumulatively, by 2035, UAH 16.5 billion (US$695 million) in premature mortality costs could be avoided relative to baseline.

Table 15: Premature Mortality Costs Avoided (UAH millions)

Year   PM costs avoided   Cumulative costs avoided
2025   206                3568.4
2035   1968.3             16536.4

Discussion

This study explored the impact of a one-time tobacco tax increase in Ukraine on the future burden of four smoking-related diseases through 2035. The results showed that small changes in smoking prevalence in one year can have large impacts in terms of disease incidence and premature mortality cases avoided into the future. Our results show that implementation of a one-off tax has an impact on the smoking-related health burden, but they also highlight the need for continuous tobacco control measures if smoking prevalence is to continue to decrease and have sizeable impacts on related disease occurrence.

As well as benefits in terms of morbidity, particularly CHD cases avoided, we observe large savings in terms of premature mortality and potential years of life lost. This is important, since Ukraine is experiencing a decreasing population over time, and a lower life expectancy compared to the EU average of 78 for males and around 83 for females (40). Tobacco taxation is one important step to improving life expectancy in Ukraine, especially amongst men, whose smoking prevalence is high.

The study included just four smoking-related diseases (CHD, COPD, stroke, lung cancer). However, we know that smoking is responsible for many more diseases, and harms almost every organ in the body (41). Therefore, we are likely to see much wider epidemiological benefits than those observed here. Future work could update this study by including additional smoking-related diseases.

While the microsimulation method is advantageous in NCD modeling, one key disadvantage is that the model is data intensive. Fortuitously, during the period of the study, the Global Burden of Disease team published an online database that included many of the data inputs that were required (9). While country-specific data are preferable, and the GBD is based on modeled estimates and recommended as a cross-country comparative tool, few other data were available for Ukraine. Inter alia, neither survival data nor relative risks specific to Ukraine were available. Once better data become available, the model can easily be updated.

No data on indirect costs such as productivity losses by disease were available. Large savings to the health system were observed with just a small change in smoking prevalence. However, wider societal costs such as losses in productivity are likely to be higher than those reflected here, making a stronger case for the implementation of regular tax hikes for tobacco control (42). If indirect cost data by disease become available, then the model can once again easily be updated in the future.

One notable limitation of our scenario methodology is that smokers were moved to the never-smokers category to account for changes in uptake due to the tax. While this is not realistic, it was the only solution by which to model change in uptake within the total population (and ensure we maintain 100 percent of people in the population). This approach could result in an underestimation of the health impact of a tobacco tax increase. This effect could arise because some of those smokers who become never smokers may already have a smoking-related disease.

We know that social groups react differently to tax increases (43). Due to small sample sizes, it was not possible to model the long-term health impacts on different social groups within the microsimulation. However, we can infer from research conducted in Ukraine (43, 44) that the largest impact of taxation will be observed in the poorest social groups. This is important, since it means that tobacco taxation could contribute to reducing social inequalities in health.

One specific limitation of any predictive model is that it does not take account of major future changes in circumstances, such as the behavior of the tobacco industry, or the introduction of new drugs or technologies. In theory, their effects can be estimated by altering parameters in the model, but these will significantly increase the degrees of uncertainty. However, they could be simulated as additional scenarios in the future relative to a "no change" scenario.

At present, the model does not take account of multimorbidity and the joint effect of several risk factors on disease occurrence and related mortality. However, individuals can get more than one smoking-related disease in their lifetime. Future work could expand the scope of the model to take account of technological and economic changes and their potential effects, and also to model the clustering of risk factors and diseases in the same individuals.

The model did not take account of passive smoking/secondhand smoke. Understanding the combined risk of smoking and passive smoking will enable us to model the combined impact of these risk factors on later disease outcomes.

It was beyond the scope of this study, given the time constraints, to carry out an in-depth uncertainty and sensitivity analysis. We are aware that this is good practice; however, there is a lack of validated datasets with which to compare our outputs. Furthermore, the microsimulation is complex relative to, for example, spreadsheet models. It involves many thousands of calculations, which are completed during the simulation of 50 million individuals. Given this complexity, local uncertainty analysis would demand many thousands of consecutive runs and would require a supercomputer to complete the exercise in a realistic time scale. However, we did carry out a small sensitivity analysis of the costs, running the model with and without a 5 percent discount rate.

Further work should develop more sophisticated interventions; for example, individuals of different ages could be affected differently by the intervention. It was beyond the scope of this study to include this development within the microsimulation. However, a prototype, user-friendly tool has been developed for Ukraine that enables the user to select different age cohorts (as opposed to a population distribution of individuals) and run simulations to quantify health and cost impacts by population groups.

Further work should also explore the impact of other potential policies in Ukraine, such as a tobacco duty escalator, as well as a combination of tobacco-control measures including smoking cessation services.

This study complements that which was carried out using the TaXSiM model and shows the health and related economic benefits of increasing tobacco tax in Ukraine. Even small reductions in smoking prevalence in one year will have long-term impacts on disease incidence and subsequent health costs.

Bibliography

1. WHO. WHO report on the global tobacco epidemic 2011: Warning about the dangers of tobacco. Geneva: WHO; 2011.
2. Andreeva T. Results of omnibus surveys with tobacco-related questions conducted in Ukraine in 2013, 2014, 2015. 2016.
3. The results of KIIS survey on tobacco smoking in Ukraine as of 2015 compared to 2013-2014. Kiev: Kiev International Institute of Sociology; 2016. Available at: http://www.kiis.com.ua/?lang=eng&cat=reports&id=587&page=1.
4. Oderkirk J, Sassi F, Cecchini M, Astolfi R, OECD Health Division. Toward a New Comprehensive International Health and Health Care Policy Decision Support Tool. OECD Directorate for Employment, Labour and Social Affairs; 2012.
5. Feenberg D, Coutts E. An introduction to the TAXSIM model. Journal of Policy Analysis and Management. 1993;12(1):189-94.
6. Butrica BA, Burkhauser RV. Estimating federal income tax burdens for Panel Study of Income Dynamics (PSID) families using the National Bureau of Economic Research TAXSIM model. 1997.
7. State Statistical Service of Ukraine. Population's self-perceived health status and availability of selected types of medical aid in 2015 (in Ukrainian). Kiev: State Statistical Service of Ukraine; 2016. Available at: http://ukrstat.gov.ua/druk/katalog/kat_u/2015/sb/zb_snsz_2015.zip.
8. United Nations. World population prospects 2015. Available at: http://esa.un.org/unpd/wpp/.
9. Global Burden of Disease. Global Health Data Exchange. In: Institute for Health Metrics and Evaluation, editor. Available at: http://ghdx.healthdata.org/gbd-results-tool2016.
10. World Health Organization. Globocan 2012: Estimated Cancer Incidence, Mortality and Prevalence Worldwide in 2012. Available at: http://globocan.iarc.fr/Default.aspx.
11. World Health Organization. Health statistics and information systems - Software tools - DISMOD II 2014 [23/02/15]. Available at: http://www.who.int/healthinfo/global_burden_disease/tools_software/en/.
12. Song YM, Cho HJ. Risk of stroke and myocardial infarction after reduction or cessation of cigarette smoking: A cohort study in Korean men. Stroke. 2008;39(9):2432-8.
13. Baba S, Iso H, Mannami T, Sasaki S, Okada K, Konishi M, et al. Cigarette smoking and risk of coronary heart disease incidence among middle-aged Japanese men and women: The JPHC Study Cohort I. European Journal of Cardiovascular Prevention & Rehabilitation. 2006;13(2):207-13.
14. Tolstrup JS, Hvidtfeldt UA, Flachs EM, Spiegelman D, Heitmann BL, Balter K, et al. Smoking and risk of coronary heart disease in younger, middle-aged, and older adults. American Journal of Public Health. 2014;104(1):96-102.
15. Burns DM. Epidemiology of smoking-induced cardiovascular disease. Progress in Cardiovascular Diseases. 2003;46(1):11-29.
16. Cronin EM, Kearney PM, Kearney PP, Sullivan P, Perry IJ. Impact of a national smoking ban on hospital admission for acute coronary syndromes: A longitudinal study. Clinical Cardiology. 2012;35(4):205-9.
17. U.S. Department of Health and Human Services. The health consequences of smoking—50 years of progress: a report of the Surgeon General. Washington, DC: U.S. Department of Health and Human Services; 2014.
18. Prescott E, Bjerg AM, Andersen PK, Lange P, Vestbo J. Gender difference in smoking effects on lung function and risk of hospitalization for COPD: Results from a Danish longitudinal population study. The European Respiratory Journal. 1997;10(4):822-7.
19. Johannessen A, Omenaas E, Bakke P, Gulsvik A. Incidence of GOLD-defined chronic obstructive pulmonary disease in a general adult population. International Journal of Tuberculosis and Lung Disease. 2005;9(8):926-32.
20. Terzikhan N, Verhamme KMC, Hofman A, Stricker BH, Brusselle GG, Lahousse L. Prevalence and incidence of COPD in smokers and non-smokers: The Rotterdam Study. European Journal of Epidemiology. 2016;31(8):785-92.
21. Thun MJ, Carter BD, Feskanich D, Freedman ND, Prentice R, Lopez AD, et al. 50-Year Trends in Smoking-Related Mortality in the United States. New England Journal of Medicine. 2013;368(4):351-64.
22. Freedman ND, Leitzmann MF, Hollenbeck AR, Schatzkin A, Abnet CC. Cigarette smoking and subsequent risk of lung cancer in men and women: Analysis of a prospective cohort study. Lancet Oncol. 2008;9(7):649-56.
23. Bae JM, Lee MS, Shin MH, Kim DH, Li ZM, Ahn YO. Cigarette smoking and risk of lung cancer in Korean men: The Seoul male cancer cohort study. J Korean Med Sci. 2007;22(3):508-12.
24. Mannami T, Iso H, Baba S, Sasaki S, Okada K, Konishi M, et al. Cigarette smoking and risk of stroke and its subtypes among middle-aged Japanese men and women: The JPHC Study Cohort I. Stroke. 2004;35(6):1248-53.
25. Shinton R, Beevers G. Meta-analysis of relation between cigarette smoking and stroke. BMJ. 1989;298(6676):789-94.
26. Wannamethee SG, Shaper AG, Whincup PH, Walker M. Smoking cessation and the risk of stroke in middle-aged men. Journal of the American Medical Association. 1995;274(2):155-60.
27. Hoogenveen RT, van Baal PH, Boshuizen HC, Feenstra TL. Dynamic effects of smoking cessation on disease incidence, mortality and quality of life: The role of time since cessation. Cost Eff Resour Alloc. 2008;6:1.
28. Denisova I, Kuznetsova P. The effects of tobacco taxes on health: An analysis of the effects by income quintile and gender in Kazakhstan, the Russian Federation, and Ukraine. The World Bank; 2014 Oct.
29. World Health Organisation. World Health Statistics 2015. Global Health Observatory (GHO) data 2015. Available at: http://www.who.int/gho/publications/world_health_statistics/2015/en/.
30. McPherson K, Marsh T, Brown M. Foresight tackling obesities: Future choices – modelling future trends in obesity and the impact on health. Foresight Tackling Obesities Future Choices. 2007.
31. Wang YC, McPherson K, Marsh T, Gortmaker SL, Brown M. Health and economic burden of the projected obesity trends in the USA and the UK. Lancet. 2011;378(9793):815-25.
32. Forum CRUUH. Aiming High: Why the UK should aim to be tobacco-free. 2016.
33. Forum UH. Appendix B4. Detailed Methodology Technical Document. http://econdaproject.eu/2015.
34. Farrelly MC, Bray JW, Zarkin GA, Wendling BW. The joint demand for cigarettes and marijuana: Evidence from the National Household Surveys on Drug Abuse. J Health Econ. 2001;20(1):51-68.
35. Response to increases in cigarette prices by race/ethnicity, income, and age groups--United States, 1976-1993. Morbidity and Mortality Weekly Report. 1998;47(29):605-9.
36. Fidler JA, Stapleton JA, West R. Variation in saliva cotinine as a function of self-reported attempts to reduce cigarette consumption. Psychopharmacology (Berl). 2011;217(4):587-93.
37. Krasovsky K. Public health and revenue impact of cigarette "price wars" in Ukraine. ECTOH-20172017.
38. Becker G, Murphy K. A Theory of Rational Addiction. Journal of Political Economy. 1988;96(4):675-700.
39. Atkinson AB, Skegg JL. Anti-Smoking Publicity and the Demand for Tobacco in the U.K.*†. The Manchester School. 1973;41(3):265-82.
40. Eurostat: Statistics Explained. File: Life expectancy at birth, EU-28, 2002-14 2016 [cited 2016 20.12.2016]. Available at: http://ec.europa.eu/eurostat/statistics-explained/index.php/File:Life_expectancy_at_birth,_EU-28,_2002%E2%80%9314_%28%C2%B9%29_%28years%29_YB16.png.
41. Centers for Disease Control and Prevention. Health Effects of Cigarette Smoking 2016. Atlanta: Centers for Disease Control and Prevention; 2016. Available at: https://www.cdc.gov/tobacco/data_statistics/fact_sheets/health_effects/effects_cig_smoking/.
42. Action on smoking and health. The economics of tobacco. ASH factsheet. 2015.
43. Krasovsky K. Sharp changes in tobacco products affordability and the dynamics of smoking prevalence in various social and income groups in Ukraine in 2008-2012. Tob Induc Dis. 2013;11(1):21.
44. Krasovsky K, Andreeva T, Krisanov D, Mashliakivsky M, Rud G. The Economics of tobacco control in Ukraine from the public health perspective. Kiev 2002. 128 p.

Appendix 1. Technical Appendix
1 Microsimulation Framework

Our simulation consists of two modules. The first module calculates the predictions of risk factor trends over time based on data from rolling cross-sectional studies. The second module performs the microsimulation of a virtual population, generated with demographic characteristics matching those of the observed data. The health trajectory of each individual from the population is simulated over time, allowing them to contract, survive, or die from a set of diseases or injuries related to the analyzed risk factors. The detailed description of the two modules is presented below.
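To make the second module concrete, the following is a deliberately simplified sketch of an individual-level simulation loop in Python. It is not the model used in this study: the single disease, the annual probabilities, the relative risk, and all names (`Person`, `simulate`, `make_population`, `P_DISEASE_NONSMOKER`, and so on) are hypothetical and chosen only to illustrate how each virtual individual can contract, survive, or die from a disease year by year.

```python
import random
from dataclasses import dataclass

# Hypothetical annual probabilities, for illustration only.
P_DISEASE_NONSMOKER = 0.005   # chance of contracting the disease in a year
RELATIVE_RISK_SMOKER = 3.0    # multiplier applied to smokers' incidence
P_DEATH_BACKGROUND = 0.01     # chance of dying from other causes in a year
P_DEATH_DISEASED = 0.05       # extra chance of dying once diseased


@dataclass
class Person:
    smoker: bool
    diseased: bool = False
    alive: bool = True


def make_population(size, smoking_prevalence, rng):
    """Generate a virtual population with the requested smoking prevalence."""
    return [Person(smoker=rng.random() < smoking_prevalence) for _ in range(size)]


def simulate(population, years, rng):
    """Advance every living individual one year at a time and count events."""
    new_cases = 0
    deaths = 0
    for _ in range(years):
        for person in population:
            if not person.alive:
                continue
            # Death: diseased individuals carry the extra disease mortality.
            p_death = P_DEATH_BACKGROUND + (P_DEATH_DISEASED if person.diseased else 0.0)
            if rng.random() < p_death:
                person.alive = False
                deaths += 1
                continue
            # Incidence: only disease-free survivors can contract the disease.
            if not person.diseased:
                risk = RELATIVE_RISK_SMOKER if person.smoker else 1.0
                if rng.random() < P_DISEASE_NONSMOKER * risk:
                    person.diseased = True
                    new_cases += 1
    return new_cases, deaths


rng = random.Random(42)
baseline = make_population(20_000, 0.30, rng)
cases, deaths = simulate(baseline, years=20, rng=rng)
print(f"new cases: {cases}, deaths: {deaths}")
```

Running the same loop on a second population that differs only in `smoking_prevalence` would mimic, in miniature, the comparison of an intervention scenario against the baseline.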
1.1 Module One: Predictions 1.1 Prevalence 1.1 For the risk factor (RF), let of Smoking Module One: Predictions of Smoking Pre Module One: Predictions of Smoking Pr N be the number of categories for a given risk fa Over Time 1.1 1.1 Module One: Predictions of Smoking Prevalence Over Time 1.1 Module One: Predictions of Smoking Prevalence Over Time 1.1 Module One: Predictions of Smoking Prevalence Over Time 1.1 Module One: Predictions of Smoking Prevalence O Module One: Predictions of Smoking Prevalence Over Time smoking. Let ! = 1, 2, …, N number these categories, and For the risk factor (RF), let For the risk factor (RF), let For the risk factor (RF), let N be the number of categories for a given N #$ (&) denote the be the number of categories f N be the number of categories fo risk 1.1 Module One: Predictions of Smoking Prevalence Over Time For the risk factor (RF), let For the risk factor (RF), let For the risk factor (RF), let N For the risk factor (RF), let N be the number of categories for a given risk be the number of categories for a given risk factor, e.g. N be the number of categories for a given risk factor, e.g. For the risk factor (RF), let N be the number of categories for a given risk factor, e.g. corresponds to the category N be the number of categories for a given risk factor, e.g. N N = 3 for = 3 for and #$ (&) using a multino N factor, e.g. N = 3 for smoking. Let k = 1, 2, …, NN smoking. Let smoking. Let = 3 for ! ! number at time ! these t = 1, 2, …, = 1, 2, …, . We estimate NN number these categories, and number these categories, and categories, # #$ For the risk factor (RF), let N be the number of categories for a given risk factor, e.g. smoking. Let ! = 1, 2, …, N N = 3 for smoking. Let number these categories, and # ! = 1, 2, …, (&) N number these categories, and #$ (&) denote t denote the prevalence of the RF that smoking. Let smoking. Let ! = 1, 2, …, N number these categories, and ! = 1, 2, …, denote N number these categories, and smoking. 
Let $k = 1, 2, \ldots, N$ number these categories, and let $p_k(t)$ denote the prevalence of the RF that corresponds to category $k$ at time $t$. We estimate $p_k(t)$ using a multinomial logistic regression model with the prevalence of RF category $k$ as the outcome and time $t$ as a single explanatory variable. For $k < N$, we have

$$\ln\left(\frac{p_k(t)}{p_1(t)}\right) = \beta_0^k + \beta_1^k t \tag{1.1}$$

The prevalence of the first category is obtained by using the normalization constraint $\sum_{k=1}^{N} p_k(t) = 1$. Solving equation (1.1) for $p_k(t)$, we obtain

$$p_k(t) = \frac{\exp\left(\beta_0^k + \beta_1^k t\right)}{1 + \sum_{k'=1}^{N} \exp\left(\beta_0^{k'} + \beta_1^{k'} t\right)} \tag{1.2}$$

which respects all constraints on the prevalence values, i.e. normalization and [0, 1] bounds.
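As a minimal sketch of how equation (1.2) might be evaluated, the Python function below assumes that category 1 is the reference category with zero coefficients, so that it contributes the leading 1 in the denominator and the sum runs over the remaining categories. The function name `category_prevalences` and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from this report.

```python
import math


def category_prevalences(beta0, beta1, t):
    """Return [p_1(t), p_2(t), ...] for a multinomial logistic trend model.

    beta0[j] and beta1[j] are the intercept and slope of the j-th
    non-reference category; the reference category (listed first in the
    result) is assumed to have both coefficients equal to zero.
    """
    scores = [b0 + b1 * t for b0, b1 in zip(beta0, beta1)]
    # Subtract the largest exponent (0 for the reference) to avoid overflow.
    shift = max(0.0, max(scores))
    reference_weight = math.exp(-shift)
    weights = [math.exp(s - shift) for s in scores]
    denominator = reference_weight + sum(weights)
    return [reference_weight / denominator] + [w / denominator for w in weights]


# Illustrative (made-up) coefficients for two non-reference categories.
prev = category_prevalences([-0.5, -1.0], [0.02, -0.03], t=10.0)
print(prev, sum(prev))
```

By construction the returned values lie strictly between 0 and 1 and sum to one, and the log ratio of each category to the reference is linear in time, as in equation (1.1).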
1.1.1 Multinomial logistic regression for smoking prevalence

Measured data consist of sets of probabilities, with their variances, at specific time values (typically the year of the survey). For any particular time, the sum of these probabilities is unity. Typically such data might be the probabilities of being a smoker, an ex-smoker, or a never smoker, as extracted from the survey data set. Each data point is treated as a normally distributed¹ random variable; together they form a set of N groups (one per survey year) of K probabilities, $\{\{t_i, \mu_{ki}, \sigma_{ki} \mid k \in [0, K-1]\} \mid i \in [0, N-1]\}$. For each year the set of K probabilities forms a distribution: their sum is equal to unity.

¹ Depending on the circumstances, this assumption will be more or less accurate and more or less necessary. In general, it is both extremely useful and accurate. For simple surveys, the individual Bayesian prior and posterior probabilities are Beta distributions, the likelihood being binomial. For reasonably large samples, the approximation of the Beta distributions by normal distributions is both legitimate and a practical necessity. For complex, multi-PSU, stratified surveys, it is again assumed that these base probabilities are approximately normally distributed and, again, it is an assumption that makes the analysis tractable. Depending on the nature of the raw data set it may be possible to use non-parametric statistical methods for this analysis.

The regression consists of fitting a set of logistic functions $\{p_k(a, b, t) \mid k \in [0, K-1]\}$ to these data, one function for each k-value. At each time value, the sum of these functions is unity. Thus, for example, when measuring smoking in the three states already mentioned, the k = 0 regression function represents the probability of being a never smoker over time, k = 1 the probability of being an ex-smoker, and k = 2 the probability of being a smoker.

The regression equations are most easily derived from a familiar least squares minimization. In the following equation set the weighted difference between the measured and predicted probabilities is written as S; the logistic regression functions $p_k(a, b; t)$ are chosen to be ratios of sums of exponentials. (This is equivalent to modeling the log probability ratios, $p_k/p_0$, as linear functions of time.)

$$S(a, b) = \frac{1}{2} \sum_{k=0}^{K-1} \sum_{i=0}^{N-1} \frac{\left(p_k(a, b; t_i) - \mu_{ki}\right)^2}{\sigma_{ki}^2} \tag{1.3}$$

$$\begin{aligned}
p_k(a, b, t) &\equiv \frac{e^{A_k}}{1 + e^{A_1} + \cdots + e^{A_{K-1}}} \\
a &\equiv (a_0, a_1, \ldots, a_{K-1}), \quad b \equiv (b_0, b_1, \ldots, b_{K-1}) \\
A_0 &\equiv 0, \quad A_k \equiv a_k + b_k t
\end{aligned} \tag{1.4}$$

The parameters $A_0$, $a_0$ and $b_0$ are all zero and are used merely to preserve the symmetry of the expressions and their manipulation. For a K-dimensional set of probabilities, there will be 2(K-1) regression parameters to be determined.

For a given dimension K there are K-1 independent functions $p_k$, the remaining function being determined from the requirement that the complete set of K form a distribution and sum to unity.

Note that the parameterization ensures the necessary requirement that each $p_k$ be interpretable as a probability: a real number lying between 0 and 1.

The minimum of the function S is determined from the equations

$$\frac{\partial S}{\partial a_j} = \frac{\partial S}{\partial b_j} = 0 \quad \text{for } j = 1, 2, \ldots, K-1 \tag{1.5}$$

noting the relations

$$\begin{aligned}
\frac{\partial p_k}{\partial A_j} &= \frac{\partial}{\partial A_j}\left(\frac{e^{A_k}}{1 + e^{A_1} + \cdots + e^{A_{K-1}}}\right) = p_k \delta_{kj} - p_k p_j \\
\frac{\partial}{\partial a_j} &= \frac{\partial}{\partial A_j} \\
\frac{\partial}{\partial b_j} &= t \frac{\partial}{\partial A_j}
\end{aligned} \tag{1.6}$$

The values of the vectors a, b that satisfy these equations are denoted $\hat{a}, \hat{b}$. They provide the trend lines, $p_k(\hat{a}, \hat{b}; t)$, for the separate probabilities. The confidence intervals for the trend lines are derived most easily from the underlying Bayesian analysis of the problem.
The 2 ( ( ) ) 2 k =0 i =0 22 k (ka(,a i) ki k = K -1 i = N -1 p k =a ( K,-k1b ; =iK=-t N i ) -p 1- 1N -1 i= µ( p b,;b t; ) -µµ ti - kiki ) confidence intervals for derived most easily from the underlying Bayesian analysis of the problem. the trend lines are derived most derived most easily from the underlying Bayesian analysis of the problem. easily from the The values of the vectors a, b that satisfy thes S ( ) 2 ( S ()a, b2) = 2 2 a , b = 1 S a , b = 1k 1 k =0 ks å å kki åå åå ss ki2 2 e Ak (1.3) (1.3) (1.3) underlying Bayesian The values of the vectors a, b that satisfy these equations are denoted analysis of the 1.1.2 Bayesian interpretation problem. aˆ,b ˆ lines pk a ˆ,b ( . They provide the trend ) ˆ ; t , for the separate probabilities pk ( a, ki b, t ) º k =0 i =0 =i0=ki0 i =0 1.1.2 Bayesian interpretation e A 1 + e A1 + .. + e AK -1 The values of the vectors lines pk a lines pk a ˆ,b ˆ ˆ ; t , for the separate probabilities. The confidence intervals for the trend lines are The 2 1.1.2 K Bayesian -2 regression parameters {( ( )) a, b that satisfy these equations are denoted interpretation The 2K-2 regression parameters { ˆ , b; t , for the separate probabilities. The confidence intervals for the trend lines are a,b a ˆ ˆ , b . They provide the trend } are regarded as random variables whose posterior distribu is proportional to the function exp(-S(a,b)). The maximum likelihood estimate a,b} are regarded as random variables whose derived most easily from the underlying Baye (e Ak t ) ºa º ( a e0A, -1 ) , b º ( b0 ,b1 ,..,bK -1 ) k a1 ,.., a k pk ( a, b, t ) º pk ( a p,kb a , ,) t b, º A1 AKK (1.4) derived most easily from the underlying Bayesian analysis of the problem. The 2K-2 regression parameters {a,b} is proportional to the function exp(- S(aare regarded as random 1.1.2 variables ,b)). The maximum likelihood estimate of this probability Bayesian interpretation K1 e A1e+ .. -1 + ++ .. 
+ eK -1 A -1 derived most easily from the underlying Bayesian analysis of the problem. distribution function, the minimum of the function S, is obtained at the values 1 + e A1 + .. + e 1A+ A e º 0, Ak º ak + bk t whose posterior distribution is proportional to the function exp(-S(a,b)). The The 2 K -2 regression parameters { a,b} are rega ˆ . Other proper a º ( a0 , a1 ,.., aK -1 ) , b º ( b 0 1 ,..,bK -1 ) 0 ,b (1.4) 1.1.2 Bayesian interpretation distribution function, the minimum of the function S, is obtained at the values of the (2 K -2)-dimensional probability distribution function are obtained by firs aˆ , b a º ( a0 , a1 ,..,a K -( 1) a ,0 , ab ºa 1 ,.., (b 1 )1,,..,b bKº -1 )( b0 ,b1 ,..,bK -1 ) aº 1.1.2 Bayesian interpretation -,b The regression consists of fitting a set of logistic functions { (1.4) k(a, b, t)|kÎ[0,K-1]} to these data – one The 2K-2 regression parameters { p(1.4) maximum likelihood estimate of this probability distribution is proportional to the function exp(- function, the S(a,b)). Th K0 a,b} are regarded as random variables whose posterior distribution (2K-2)-dimensional normal distribution whose mean is the maximum likelihood A º 0, A º a + b t of the (2 The 2K-2 regression parameters { K -2)-dimensional probability distribution function are obtained by first approximating it a,b} are regarded as random variables whose posterior distribution ˆ . They provide the trend A0 º 0, AA The parameters function for each 0, a0 and 0,bk tb 0 are all The values of the vectors minimum of a, b the function S, is obtained at the values a that satisfy these equations are denoted ˆ distribution function, the minimum of the fun ,b . Other properties of k zero k and are used merely to preserve the symmetry of the kk-value. At each time value, the sum of these functions is unity. Thus, for example, k º A0aº k + 0 A k º ak + bk t is proportional to the function exp(- is proportional to the function exp(- S ( a ,b ) S (a,b)). 
The maximum likelihood estimate of this probability amounts to expanding the function S(a,b) in a Taylor series as far as terms qua ). The maximum likelihood estimate of this probability (2K-2)-dimensional normal distribution whose mean is the maximum likelihood estimate. This expressions and their manipulation. For The parameters A0, a0 and b0 when measuring smoking in the three states already mentioned, the K -dimensional are all zero and are used merely to preserve the symmetry of the regression parameters to be determined. a set of probabilities, there will be 2( K -1) k = 0 regression function lines pk a ˆ,b ˆ ; tthe amounts to expanding the function ( (2K-2)-dimensional probability distribution function , for the separate probabilities. The confidence intervals for the trend lines are distribution function, the minimum of the function S, is obtained at the values distribution function, the minimum of the function S, is obtained at the values ) differences S( (a,- aˆ ), b - b a ( ˆ,b (2 ) of the (2 are ˆ . Other properties b) in a Taylor series as far as terms quadratic in the K a K-2)-dimensional probability distribut ˆ obtained ,b by first ˆ . Other properties ˆ about the maximum likelihood estimate -2)-dimensional normal distribution whose ˆ ºS a S ( ˆ,b ˆ 0, a 0 and b0A The Aparameters , a0all their and b 0 and are all represents the probability of being a never smoker over time, zero For and are used merely to preserve the of symmetry of k = 1 the probability of being and ex- of the (2 the approximating it as a (2 K -2)-dimensional normal distribution whose mean is ( ) The parameters expressions 0are and zero manipulation. are used a merely K-dimensional to preserve set of the symmetry probabilities, there the will be 2(K -1) expressions expressions and regression parameters to be determined. their and their The manipulation. a K-dimensional manipulation. 
parameters For For A0, a0 smoker, and a andK-dimensional b0k set are of all zero set probabilities, and of areprobabilities, usedwill there = 2 the probability of being a smoker. merely be there to K-1) 2(preservewill be the 2(K-1) (2 of the (2 (2K derived most easily from the underlying Bayesian analysis of the problem. K K-2)-dimensional probability distribution function are obtained by first approximating it as a K-2)-dimensional probability distribution function are obtained by first approximating it as a differences the maximum (a - a ˆ ), b - b likelihood -2)-dimensional normal distribution whose mean is the maximum likelihood estimate. This -2)-dimensional normal distribution whose mean is the maximum likelihood estimate. This ( ) ˆ about the maximum likelihood estimate estimate. This amounts to expanding amounts to expanding the function the functionˆ º S2 a S ˆ,b ˆ . Hence S(a,b) in a For a given dimension K there are K-1 independent functions pk – the remaining function being regression parameters to be determined. 1.1.2 Bayesian interpretation k = K -1 i = N -1 p ( k ( a, b; ti ()a--µa ˆki),) b - b ( ) ˆ about the maximu 2 å å regression parameters to be determined. he regression consists of fitting a set of logistic functions { symmetry of the pk(K , b, t)|kÎ[0, a-1 independent functions expressions and K-1]} to these data – one their manipulation. For a K-dimensional set S(a,b) in a Taylor amounts to expanding the function amounts to expanding the function S(a,series as far as terms S (a , b) = 1 quadratic b) in a Taylor series as far as terms quadratic in the S(a,b) in a Taylor series as far as terms quadratic in the differences in the differences determined from the requirement that the complete set of For a given dimension K there are pk – the remaining function being K form a distribution and sum to unity. 
(a ˆ ) about the maximum likelihood estimate ( )) ( ) s ki 2 The 2K-2 regression parameters { a,b} are regarded as random variables whose posterior distribution unction for each k-value. At each time value, the sum of these functions is unity. Thus, for example, determined from the requirement that the complete set of For a given dimension For a given dimension of probabilities, K there are K-1 independent functions The regression equations are most easily derived from a familiar least square minimization. In the there will be 2(K-1) regression K there are K-1 independent functions K form a distribution and sum to unity. p k – the remaining function being pk – the remaining function being parameters to be determined. differences (a - a differences (aˆ- ),about b), ˆ -b b - theb maximum ( likelihood ˆ about the maximum likelihood estimate ) k = K -1 i = N -1 estimate pk ( a, b; S ( ti ) - ˆkµ ˆ ºS a , =0 Sb ˆ .2 ˆ . Hence ki º S a Hence i= 0 ˆ,b ˆ . Hence when measuring smoking in the three states already mentioned, the following equation set the weighted difference between the measured and predicted probabilities is k = 0 regression function Note that the parameterization ensures the necessary requirement that each pk be interpretable as is proportional to the function exp(- S ( a, b ) = 2 S 1 ( a , b ) å å ). The maximum likelihood estimate of this probability ºS a ˆ, b s ˆ2 + 1 a - a ( ) ( ˆ, b - b ) ( ˆ P -1 a - a ˆ, b - b ) ˆ k =+ K- 1 i = N -1 p ... ( k ( a, b; ti ) ( ) 2 å å determined from the requirement that the complete set of determined from the requirement that the complete set of K K form a distribution and sum to unity. form a distribution and sum to unity. 
2 ˆ S a , b = 1 ( ) ˆ a , b 2 Note that the parameterization ensures the necessary requirement that each For a given dimension written as S ; the logistic regression functions K there are K-1 independent functions k – the p be interpretable as p ( a , b ;t ) are chosen to be ratios of sums of k = K - 1 i = N - 1 p ( distribution function, the minimum of the function S, is obtained at the values a , b ; t ) k -= 0 µ i = 0 ki . Other properties s ( ki ) ˆ 2 epresents the probability of being a never smoker over time, k = 1 the probability of being and ex- a probability – a real number lying between 0 and 1. k pk (2ia1 ; tki) - µ 2 2 å 1 å S ( a, b ) = 1 ˆ ˆki ( ) (,b )( () ) k1 k = K -1 i = N - ˆ ˆ ¶2S k =0 i =0 ¶2 S a probability – a real number lying between 0 and 1. exponentials (This is equivalent to modeling the log probability ratios, pk/p0, as linear functions of of the (2 K S ( a, b ) =k =20 å =0 Så -2)-dimensional probability distribution function are obtained by first approximating it as a º a ˆ, b s + ki 2 a2- a i ˆ, b -» b SPa -1 ˆ ˆ , b a+ -2a1 å ˆ , b( - ai b -a ˆ+i )... (S aj - a ˆˆ j )+ +1 2 å ( ai - a 1 ˆˆ ) -1 ˆ moker, and k = 2 the probability of being a smoker. remaining function being determined from p Note that the parameterization ensures the necessary requirement that each Note that the parameterization ensures the necessary requirement that each The minimum of the function set of K form S time). a is determined from the equations distribution and sum to unity. S the requirement k be interpretable as is determined from the equations that the complete pk be interpretable as (2 K -2)-dimensional normal distribution whose mean is the maximum likelihood estimate. This º S a ˆ, b ˆ + a-a 1 i k =0 ˆ, b - b i =0 ( ) ( ˆ P a-a - 1 ) ( s ) ki ˆ, b - b ˆ + ... i, j ¶a ˆi ¶a ˆº j a ( ) ( ˆ, b 2 a i , j- a ) ( ˆ , b - ib ¶P ˆi ¶ba a j a probability – a real number lying between 0 and 1. a probability – a real number lying between 0 and 1. 
The minimum of the function 2 ˆ 2 ˆ ( ) ( ( )) å( ¶ S ) ¶ ( ( ) )( S () ) 2 ˆ ˆ ˆ ( ( ) ( ) ( ) å) amounts to expanding the function ˆ S ( a , b1) in a Taylor series as far as terms quadratic in the 2 ˆ 2 ˆ S ˆ + ¶ˆ S ˆ ˆ, b + 2 » ˆ (a ˆ, b ) b12 å -( ˆi ) - he regression equations are most easily derived from a familiar least square minimization. In the ºS a 1 aS -a a ,b ˆ ,b -b +P a -a ˆ- + ... aˆ ˆ j S +(1 ¶ ˆ b¶ 2 i - ai + 2j ˆ- a ˆa - a j - b j ˆb) j )» 1 2å ˆ i å ( ai - a ( ) ( ) ¶2S ˆ 2 bi -¶b iS a j ˆ 2 a i (1.7) +S 1 aˆ, b bi + -ˆ b1 j - bj ¶S ¶the S necessary requirement differences (ˆa ˆ , b- + a )2 ˆ ,å b- (a bˆ i - ai ) ( aijˆ about the maximum likelihood estimate ˆ j ) + 2 å ( aii ,i- ¶ aˆ ¶ j i) aˆ Sˆˆ bº ˆS ˆ ˆ bˆja,+ b . Hence ¶ aˆ ¶ b ˆ ˆ ( ) »S a ˆ -a ˆj a i ollowing equation set the weighted difference between the measured and predicted probabilities is 2 - 2 The minimum of the function S is determined from the equations b; ti ) - µki ( a,(1.5) ¶a ˆi ¶a ˆj 1 1 The minimum of the function S is determined from the equations Note that the parameterization ¶S ¶S ensures k = K -1 i = N -1that k each ˆ ¶bi ¶ j a ¶bi ¶b j ,j i, j =j=1,2,....,k-1 = 0 for 1 j=1,2,....,k-1 p ˆ j ¶2S ¶ 2jS i (1.7) j i, j ( ) ( ) i, j (1.5) ¶a ˆi ¶a ¶a ˆi ¶b written as S; the logistic regression functions pk(be a,binterpretable ;t) are chosen to be ratios of sums of ¶a as ¶ = =0 abprobability for ¶a j– a ¶ realb j numberS ( a, blying ) = 2between 0 and 1. 
2 å å (1.3) » S a ˆ , bˆ + 2å i, j 1 ( a - aˆ ) ˆ j¶ S (2 ˆj j )22 ˆ 2 å i i ¶a a - aˆi, j + 1 ( a - a ˆ j ) ˆ ¶ b - j 2 ˆ j b S 1 ˆ + ˆ å( ) ( () ( ) ( ( ) ) i i ¶a ˆi ¶a ¶S ¶S ¶S ¶S xponentials (This is equivalent to modeling the log probability ratios, j j pk/p0, as linear functions of k = 0 i = 0 s ki + 2 å bi - bi 1 2 ( a j - a 1 ˆ ¶k + 2 S i= , ˆ jK1-1 i = N -1 b ( ( ) - p b ˆ ( a 1 The (2 ˆ j )k+ i2 å ˆb , b ;) ( åti i - bi ki ) )) K- µ ˆ a ¶- Sa ˆ + -2)-dimensional covariance matrix i, j jbj - b 1 ˆ b ˆi ¶b - b ˆ j ˆ ¶b P bå - bˆ ˆ is the inverse of the appropriate ex + ˆ i, j b - b ¶2S (a j - a ˆj )+ å å ˆ ¶a i j 2j i i j j i, j ( S a, b ) =¶2 2 i i = = 0 = for = 0 j=1,2,....,k-1 for j=1,2,....,k-1 (1.5) (1.5) bˆ¶ aˆ 2 S ˆ ¶ b ¶ aˆ ¶ b ˆ ¶bˆ 2 ˆ i, j This matrix is central to the construction of the confidence limits for the trend ¶ b ¶ b ˆ ˆ k =¶ ( ) ( ) ( ) ˆ ¶ S b -b i , j i =0 ( a - a ˆ j )s noting the relations ¶noting the relations a j ¶b j ¶a j ¶b j i , j i2 j ˆ i j i j 2 å bi - bi 2 å bi - bi me). +1 i j 0 +ki1 i j e Ak ˆ ˆ j ¶bi ¶a j P is the inverse of the appropriate expansion coefficients. ˆ ¶bi ¶b j ˆ j j ¶pk ¶ æ ¶2 pe Ak ¶ æ ö = p d e- Ak p k ( a , b ö , t ) º 1 + e A1 + .. + e AK -1 The (2 K i, j -2)-dimensional covariance matrix º S aˆ , bˆ + The (2K-2)-dimensional covariance matrix 1 2 a - aˆ , b ( ) ( - bˆ) ( 1.1.3 P ) -1 a i, j - Estimation of the confidence intervals a ˆ , b - bˆ + ... The (2K-2)-dimensional covariance matrix P is P is the inverse of the appropriate expansion coefficien noting the relations noting the relations k = K -1 i = N -1 p a, b (Aj ;¶ = tiA )j- ( ç µki A1 = AK -1ç ÷k ) k kj p p k j AK -1 ÷ = pkd kj - pk p j This matrix is central to the construction of the confidence limits for the trend lines. The (2 The logistic regression functions This matrix is central to the construction of the confidence limits for the trend lines. 
K-2)-dimensional covariance matrix P ˆ 2 is the inverse of the appropriate expansion coefficients. p k(t) can be approximated as a normally distri This matrix is central to the construction of th 2 ˆ (1.7) S ( a, b ) = 21 k¶ å å 2 k s ki è 1 + e ¶A + j .. + e ¶Aj è 1 + e aø A1 +º .. (+ae 0 , a1 ,.., (1.3) ø aK -1 ) , b º ( b0 ,b1 ,..,bK -1 ) (1.4) 1.1.3 »S a ˆ, b ˆ + 1 (a - a Estimation of the confidence intervals å ( ) This matrix is central to the construction of the confidence limits for the trend lines. ¶ S ˆi ) random variable ( ( ) ( aj - a ) ˆj)+N 1 ˆ å p ((at ),-s a ˆi() 2 t ) ¶ S by expanding bj - b ˆ p+ k about its maximum likelihoo ˆ Estimation of the confidence inter ¶1.1.3 2 i 2 k i k j ¶pk k =0¶ iæ =0 ¶p eA æ ö e Ak¶ = ¶ ö ¶a ˆi ¶a ˆj aˆi ¶b = ¶Aj ¶Aj è 1 + e k = ç ¶A A1 ¶A ç 1A+ + ¶ .. + e K -1 ÷ e A1 = + p ¶ .. a d k kj A¶ + j e - 1 ÷ A p p= k j k kj p d ¶ - p = p k j ¶ A 0 º 0, A k º a k + b (1.6) k t 1.1.3 1.1.3 Estimation of the confidence intervals The logistic regression functions ( ) Estimation of the confidence intervals ( The logistic regression functions ) ip k 2 ˆ ( ) ( ) , j (t) can be approximated as a normally distributed time-varying line) pˆ k p ( t ( ) t = p aˆ , bˆ i, j , t 2 ˆ The logistic regression functions ) can be approximated as a normally distributed time-varying j pk(t) can be a ( ) (1.6) N pˆ t s 2 t ¶ S ¶ S ( ) ( ) ( ( ) ( )) ( ) ( ) è ø K- ø random variable , by expanding p about its maximum likelihood estimate (the trend pk(t( ˆj )+ 1 ˆ k (t ),s k (t ) by expandin j ˆ ˆ ˆ 2 å bi - bi 2 å bi - bi k j j e Ak The parameters ¶a j ¶Aj +1 The logistic regression functions k k aj - a k ) can be approximated as a normally distributed time-varying b b j - random variable Np 2 ( ) ˆ¶ ( ˆ ¶p ˆ about its maximum likelihood estimate (the tre ) j p a, b , t º ¶ A0, ¶ a0 and b0 are all zero and are used merely to preserve the symmetry of the ( ) ˆ ¶ b N aˆ pˆ t , s 2 t ¶ b b random variable ( ) by expanding j ( a, b, t ) = p ˆ +b-b 
ˆ,t ¶ p ˆ t p aˆ b t ¶ ¶ =¶ line) = , , t ( ) i kp aˆ + a ˆ- a ˆ, b i , j i , j 34 ˆ k (t ),s k ( ) ˆ k k(t ) , t 35 k i j k 1 + e A1 + ..expressions + A = e K -1 = ¶ ¶ k random variable N p 2 t k line) p = p a, b ˆ ( ) ( () k ¶and b their ¶ A manipulation. For a K-dimensional (1.6) set (1.6) of probabilities, there will be 2(K-1) by expanding pk about its maximum likelihood estimate (the trend ¶Aj ¶a j j ¶Aj j =t ˆ k (t ) = p a ˆ , t P is the inverse of the appropriate expansion coefficients. a º ( a0 , a1 ,.., aK -1 ) , b ¶ ºa(jb 0 ,b1 ,..,bK -1 ) regression parameters to be determined. ¶b j ¶ Aj (1.4) The (2 ˆk ( line) ˆp K-2)-dimensional covariance matrix t)= p a k ( a, b, t ) = pk a p ˆ ,b ˆ +a-a ) ˆ, bˆ +b-b ˆ,t æa-a ˆö line) p ˆ , b, t ( ) = ˆ p ( t ) + Ñ , Ñ pˆ ( t ) pk ( a, b ç ÷ + ( ... A º 0, A º a¶ + b t ¶ ¶ è -)b ¶ This matrix is central to the construction of the confidence limits for the trend lines. k aˆ ˆ b k b, t ˆ= pk a ˆ+ a-a ˆ (1.8) ø ( p ( a, b;kt=0) - iµ=0 ) s ki a, b() p=( a,ab,;bt )=- µ () k ( i ki ) ) S (å 2 (å ) å å å k = K -1 i = N -1 S ( a, b ) = å å k i ki k i ki S 1 1 ki ( ) å 1 k i S a, b = 1 2 ˆ . Other properties ( ˆ()a+ ( ˆ ) P -1 ( a - a ˆ ) + ... k = s ki s 2 2 ( s ) 2 2 2 º k= S kK ˆ ap -1 i = =0 , b iN,b -1 =01 ta)- ;distribution function, the minimum of the function S, is obtained at the values - µaˆ , b - b ki ˆ , b - b s 0 i =0 2 ˆ,b a 2 k =0 i =0 ki S ( a, b ) ˆ= å å k i ki k =0 i =0 ki 1 (a ˆ) + (a - ˆ) P (a - a ˆ ) + ... 1 º S (a ˆ, b ) + ( a - a ( a - aˆ, b K ˆ ) P sof the (2 ˆ ) + ... ˆ 2 ( ) ( ( ) ) (( ) ( ) --2)-dimensional probability distribution function are obtained by first approximating it as a ˆ ˆº S ˆ, b ˆˆ a ˆ, b ˆ ˆ, b - b -1 2 1 ˆ, b - b -1 2 b -1-b Sa ˆa ,ˆb ,b 1 ˆa ˆ , b 2 º - b+ P 2 aa -- a , b -b bˆ P + ... a - a - b +(1.7) ... 
ˆ ˆ, 2b- -1 2k =0 i =0 ki ºS a 2, ˆ b +1 a- 2 ¶ S ¶ S ºS a (( )) ( ˆ, bˆ + 1 a» -S a ( ) (å ˆ , ba ˆ-,bˆ bˆ ¶) +2 S P - (1(2 ˆ a - K ) å( ) -2)-dimensional normal distribution whose mean is the maximum likelihood estimate. This (a ˆi, b -- a ˆb( i ) + ˆ ... ) a j - a ˆ j2 S ˆ+ 2 1 (2a ˆˆi - ˆi )1 (1.7) 2 b a j -¶ bˆ2 S j + ¶2S ˆ ¶2S ˆ (1.7) Modeling the Long-Term Health and Cost Impacts of Reducing ( ( )) ¶ ( ) ˆ ˆ (1.7) ˆ »S a ˆ ˆ , b + 2 å ( ai - a 2 1 ˆi ) ( ) 2 i,( ( ) å jaj - a ˆ j )» +S ( 2å 1 ˆ ¶ amounts to expanding the function ) (å ) aa ¶ (a ˆ aji + -1 aˆi ) ( a - a å ˆ b) » ji ,- S¶ jS(b ˆ a,S a,+ ˆ ( ba +- 2¶ å aˆ ) (+ ¶ a ˆ bi¶1-S å aˆi()a - a j b) in a Taylor series as far as terms quadratic in the ) ( aj -1 aˆ jb ) +-2b 1 å ˆ ( ai - a ˆi )S ¶ 2 ˆ b jˆ- b j + (1.7) Smoking Prevalence through Tobacco Taxation in Ukraine ˆ ( i i(1.7) ,ij ) ( ii , j i ) ¶ˆa iˆ , b ˆ ˆ ˆ aˆ ˆ ˆ + ˆ ¶a ˆi2 ˆj a » S a ˆ, b + 1 a - a i j ¶aa ˆij¶a - a ˆ j ¶ja + a - a ˆ ¶b jbˆ - b + ( ) ¶ ¶a ˆ 2¶ˆ b ˆ2 ( ) ( ) ˆ 2 i i j j 2 i i j j ¶ S , j ¶i S j ¶a ˆ2i ¶a ˆi ¶ ˆbº j ( ) ¶a ˆi ¶ i ,ajˆ ˆi ¶b j i j i, j i, j »S a ˆ, bˆ + 1 (a - a å ˆ ) ( differences a 2- ˆ aˆ ) + ( a1 å - aˆ( ) a, b - - aˆ iˆ b) ˆ 2i ,ˆjj + about the maximum likelihood estimate b - b S S j i, j aˆ , bˆ . 
Hence ¶a å( ˆ ¶j S j a ˆ2 ) ( ( ) () ( å ( ) ( ) ˆ ˆ j ¶ jS b - b j 2 ˆ ˆ 2 i ˆ i b¶a ˆi ¶ˆa 2 ˆ i i ˆ ˆ j¶ S ¶2S ( ) ˆ i, j + ¶ 21 S - b ) ( () ) ( ( ) () ) ( ( ) ) j 1-i a ,ˆ j¶ S j + 2ˆ ¶¶a ˆ b 2¶ ˆb i -j bi + 1 2 ˆ 1 +1 å b - b 2 ( a i- aˆ ) ) å i j+ 1 ˆå b - ˆ j i + 2i å b b - bˆ iS ( ˆ å j) ˆ b j 2å a ¶ S b¶2- ˆ jb ˆ ( ¶ a S - aˆ ) + 1 ˆ å b - b ˆ 2 ˆ ¶ S ¶b bj - b ˆ ( ¶2bii¶ )a ( ) ˆb ˆ- bi ij,+ aj - ¶b baˆ ¶ b+ - ba 1 b j - b ji , j ˆ 1 ˆ ˆ- ˆb ˆ- b ˆ ˆ2 å 2 i i j j j 2 i i j j 2 i i j i b ¶ iˆ ¶ b ¶b ¶ a ˆˆi, j ¶ b ¶ ib ˆ j 1 b ˆ aˆ b - ii aˆ bˆ + ˆ b b - b ( å )å ( ) ¶ ¶ ¶ S ¶ 2 S j¶ ¶ ( ) ( ) 2 ( a¶ tiˆ)j - µki i, j , j , j 1.2 Module Two: Microsimulation The (2K-2)-dimensional (a j - a j ) + 2 å bi - bi ˆcovariance matrix ˆ P is the inverse -1 the j of pk appropriate i i j j 2 å bi - bi ˆ i j i , j i j bk =K - 2 iˆ 1 i=N i i ,b bˆ;¶ i , j j j 2 i i ˆ ¶b ˆ j j - bi i a ¶b i j +1 1 expansion ¶ bˆ ¶a ˆ coefficients. The (2K-2)-dimensional covariance matrix This matrix is S ( central a ¶, b bˆ¶ P ) to bˆ= the 1 2 construction j, j of i is the inverse of the appropriate expansion coefficients. the i, j i j i , j The (2K-2)-dimensional covariance matrix i j i , j P is the inverse of the appropriate expansion coefficients. The (2K-2)-dimensional covariance matrix j The (2K-2)-dimensional covariance matrix k =0 i =0 s ki 2 P is the inverse of the appropriate expansion coefficients. P is the inverse of the appropriate expansion coefficients. confidence limits for the trend lines. This matrix is central to the construction of the confidence limits for the trend lines. This matrix is central to the construction of the confidence limits for the trend lines. This matrix is central to the construction of the confidence limits for the trend lines. The (2K-2)-dimensional covariance matrix P is the inverse of the appropriate expansion coefficients. 
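As an illustration of equations (1.4), (1.6), (1.9) and the 95% interval, the following Python sketch evaluates the trend probabilities, their gradient with respect to the regression parameters, and the resulting confidence interval. It is a minimal sketch, not the report's implementation: the function names, the example parameter values and the diagonal covariance matrix are all hypothetical, and the covariance matrix P is assumed to have been obtained already by inverting the second-derivative matrix of S.

```python
import math

def probabilities(a, b, t):
    """Equation (1.4): p_k = exp(A_k) / sum_j exp(A_j), with A_0 = 0, A_k = a_k + b_k * t.

    a and b hold only the K-1 free parameters (a_1..a_{K-1}, b_1..b_{K-1}).
    """
    A = [0.0] + [ak + bk * t for ak, bk in zip(a, b)]
    m = max(A)  # subtract the maximum for numerical stability
    e = [math.exp(x - m) for x in A]
    total = sum(e)
    return [x / total for x in e]

def gradient(a, b, t, k):
    """Gradient of p_k with respect to (a_1..a_{K-1}, b_1..b_{K-1}), using equation (1.6)."""
    p = probabilities(a, b, t)
    n = len(a)
    # dp_k/dA_j = p_k * delta_kj - p_k * p_j for j = 1..K-1
    dA = [p[k] * ((1.0 if k == j else 0.0) - p[j]) for j in range(1, n + 1)]
    # d/da_j = d/dA_j and d/db_j = t * d/dA_j
    return dA + [t * d for d in dA]

def confidence_interval(a_hat, b_hat, P, t, k, z=1.96):
    """Equation (1.9): sigma_k^2 = g P g^T, then the interval p_hat -/+ z * sigma."""
    g = gradient(a_hat, b_hat, t, k)
    var = sum(g[i] * P[i][j] * g[j] for i in range(len(g)) for j in range(len(g)))
    sigma = math.sqrt(max(var, 0.0))
    p_hat = probabilities(a_hat, b_hat, t)[k]
    return p_hat - z * sigma, p_hat + z * sigma

# Hypothetical values for K = 3 (never, ex-, current smoker); not taken from the report.
a_hat = [-0.5, -0.2]
b_hat = [0.01, -0.03]
P_example = [[0.01, 0, 0, 0], [0, 0.01, 0, 0], [0, 0, 1e-4, 0], [0, 0, 0, 1e-4]]
low, high = confidence_interval(a_hat, b_hat, P_example, t=10.0, k=2)
```

Because the interval is built from a linear expansion, it is not guaranteed to stay inside [0, 1] when the variance is large; how the model handles that case is not stated in the text.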
1.2 Module Two: Microsimulation

1.2.1 Microsimulation initialization: Birth, disease and death models

Simulated people are generated with the correct demographic statistics in the simulation's start-year. In this year, women are stochastically allocated the number and years of birth of their children; these are generated from known fertility and mother's age at birth statistics (valid in the start-year). If a woman has children, then those children are generated as members of the simulation in the appropriate birth year.

The microsimulation is provided with a list of relevant diseases. These diseases use the best available incidence, mortality, survival, relative risk and prevalence statistics (by age and sex). Individuals in the model are simulated from their year of birth (which may be before the start year of the simulation). In the course of their lives, simulated people can die from one of the diseases caused by smoking that they might have acquired, or from some other cause. The probability that a person of a given age and sex dies from a cause other than the disease is calculated in terms of known death and disease statistics valid in the start year. It is constant over the course of the simulation. The survival rates from tobacco-related diseases will change as a consequence of the changing distribution of smoking level in the population.

The microsimulation incorporates a sophisticated economic module. The module employs Markov-type simulation of long-term health benefits, health care costs, and cost-effectiveness of specified interventions. It synthesizes and estimates evidence on cost-effectiveness analysis and cost-utility analysis. The model can be used to project the differences in quality-adjusted life years (QALYs), direct and indirect lifetime health-care costs, and, as a consequence of interventions, incremental cost effectiveness ratios (ICERs) over a specified time scale. Outputs can be discounted for any specific discount rate.
This section provides an overview of the initialization of the microsimulation model and will be expanded upon in the next sections.

1.2.2 Population models

Populations are implemented as instances of the TPopulation C++ class. A TPopulation instance is created from a population (*.ppl) file. Usually a simulation will use only one population, but it can simultaneously process multiple populations (for example, different ethnicities within a national population).
1.2.2.1 Population Editor

The Population Editor allows editing and testing of TPopulation objects.

The population is created in the start year and propagated forwards in time by allowing females to give birth. An example population pyramid which can be used when initializing the model is shown in Figure 1. It shows the 2015 population distribution in Ukraine used in the initialization of the model.

Figure 1: Population Pyramid in 2015 in Ukraine

Age group    Males (20.8m)    Females (24.1m)
90+          0.3%             0.8%
80-89        2.2%             4.8%
70-79        5.8%             9.7%
60-69        10.5%            13.0%
50-59        14.2%            14.6%
40-49        14.2%            13.1%
30-39        17.1%            14.6%
20-29        14.9%            12.3%
10-19        9.9%             8.1%
0-9          9.9%             8.8%

1.2.2.2 Birth model

Any female in the childbearing years {AgeAtChild.lo, AgeAtChild.hi} is deemed capable of giving birth. The number of children, $n$, that she has in her life is dictated by the Poisson distribution $p_\lambda(n)$, where the mean $\lambda$ of the Poisson distribution is the Total Fertility Rate (TFR) parameter.

The probability that a mother (who does give birth) gives birth to a child at age $a$ is determined from the BirthsByAgeOfMother distribution as $p_b(a)$. For any particular mother, the births of multiple children are treated as independent events, so that the probability that a mother who produces $N$ children produces $n$ of them at age $a$ is given as the Binomially distributed variable,

\[
p_b(n \text{ at } a \mid N) = \frac{N!}{n!\,(N-n)!} \left( p_b(a) \right)^{n} \left( 1 - p_b(a) \right)^{N-n} \tag{1.11}
\]

The probability that the mother gives birth to $n$ children at age $a$ is

\[
p_b(n \text{ at } a) = e^{-\lambda} \sum_{N=n}^{\infty} \frac{\lambda^{N}}{N!}\, p_b(n \text{ at } a \mid N)
= e^{-\lambda} \sum_{N=n}^{\infty} \frac{\lambda^{N}}{N!} \frac{N!}{n!\,(N-n)!} \left( p_b(a) \right)^{n} \left( 1 - p_b(a) \right)^{N-n} \tag{1.12}
\]

Performing the summation in this equation gives the simplifying result that the probability $p_b(n \text{ at } a)$ is itself Poisson distributed with mean parameter $\lambda p_b(a)$,

\[
p_b(n \text{ at } a) = e^{-\lambda p_b(a)} \frac{\left( \lambda p_b(a) \right)^{n}}{n!} \equiv p_{\lambda p_b(a)}(n) \tag{1.13}
\]
e b (1.13) n = pl pb ( a ) ( n ) -l bp((aa) )(1.13) n a p!b l pb ( a ) (1.12) n! pb ( n at a ) = e ( b (a ) l p ( a ) ) N =n ( l pb !a ) ) n( ( a ) ( n ) 4#2 0 children in th - l pb ( a ) b Thus, on average, a mother at age 0 will produce 4#2 0 children in that year. Thus, on average, a mother at age pb ( n at 0= a )p=l pbe will produce - l pb ( a ) ) (n) = pl pb ( a(1.13 Thus, on average, a mother at age Thus, on average, a mother at age 0 will produce will produce 4#2 0 children in that year. Performing the summation in this equation gives the simplifying result that the probability children in that year. n! n ! children in that y pb(n at a) Figure 1 Population Pyramid in 2015 in Ukraine 0 Thus, on average, a mother at age 4# 2 0 0 will produce 4# 2 0 People within the model can die from specific diseases or from other causes. A The gender of the children 3 Thus, onpaverage, a mother at age a will produce 4#3 is itself Poisson distributed with mean parameter 0 , children in that year. Figure 1 Population Pyramid in 2015 in Ukraine Figure 1 Population Pyramid in 2015 in Ukraine is determined by the probability The gender of the children 3 =1-pfemale is determined by the probability male . In the baseline model this The gender of the children Thus, on average, a mother at age 0 p =1- will produce p 4#2 is determined by the probability . In the baseline model this 0 children in that year. pmale=1-pfe Thus, on average, a mother at age The gender of the children3 is determined by the probability pmale female3 0 will produce 4#2 0 children in that y . In the baseline model this male=1-pfemale is determined by the probability 2 disease file is created within the program to represent deaths from other causes. The gender of the children pmale=1-pfemale People within the model can die from specific diseases or from other causes. A disease file is created is taken to be the probability Nm/(Nm+N is taken to be the probability f ). Nm/(Nm+Nf). 
is taken to be the probability Nm/( nNm+Nf). People within the model can die from specific diseases or from other causes. A disease file is created The following distributions are required by the Population Editor (Table 1). People within the model can die from specific diseases or from other causes. A disease file is created within the program to represent deaths from other causes. The following distributions are required is taken to be the probability The Nm /(Nm +Nf). 3 is The gender of the children gender of the children is taken to be the probability 3 is determined by the probability determined by The gender of the children ( ) - l pthe ( l pb3 is determined by the probability b ( a ) probability (a N)m)/(N pm +Nf). male=1-pfemale. In the baseline model thi pmale=1- pfemale p n at a = e = p l pb ( a ) ( n ) (1.13) within the program to represent deaths from other causes. The following distributions are required The Population Editor menu item Population Editor\Tools\Births\show random birthList creates an within the program to represent deaths from other causes. The following distributions are required The Population Editor menu item Population Editor\Tools\Births\show random birthList creates an In theis taken to be the probability baseline model this b is taken Nm/(to m+ N Nbe f). probability is taken to be the probability the n ! Nm/(Nm+Nf). by the Population Editor (Table 1). The Population Editor menu item Population Editor\Tools\Births\show The Population Editor menu item Population Editor\Tools\Births\show random birthList creates an The Population Editor menu item Population Editor\Tools\Births\show ran by the Population Editor (Table 1). by the Population Editor (Table 1). 
instance of the TPopulation class and uses it to generate and list a (selectable) sample of mothers instance of the TPopulation class and uses it to generate and list a (selectable) sample of mothers Table 1: Summary of the Parameters Representing the Distribution Component instance of the TPopulation class and uses it to generate and list a (sele instance of the TPopulation class and uses it to generate and list a (selectable) sample of mothers Thus, on average, a mother at age instance of the TPopulation class and uses it to generate and list a (selecta 0 will produce 4#2 0 children in that year. The Population Editor menu item Population Editor\Tools\Births\show random birthList creates an The Population Editor menu item Population Editor\Tools\Births\show ran Table 1 Summary of the Parameters Representing the Distribution Component Table 1 Summary of the Parameters Representing the Distribution Component and the years in which they give birth. and the years in which they give birth. The Population Editor menu and the years in which they give birth. item Population Editor\Tools\Births\show random and the years in which they give birth. and the years in which they give birth. Table 1 Summary of the Parameters Representing the Distribution Component instance of the TPopulation class and uses it to generate and list a (selectable) sample of mothers 3instance of the TPopulation class and uses it to generate and list a (selecta The gender of the children birthList 1.2.2.3 Deaths from modeled diseases creates an instance of is determined by the probability the TPopulation class and uses pmale it to=1- pfemale. In the baseline model this generate Distribution name Distribution name symbol symbol note note 1.2.2.3 Deaths from modeled diseases 1.2.2.3 Deaths from modeled diseases and the years in which they give birth. and the years in which they give birth. 
1.2.2.3 Deaths from modeled diseases Distribution name Distribution name symbol symbol note note is taken to be the probability and list a (selectable) sample of mothers1.2.2.3 N m /(Deaths from modeled diseases N The simulation models any number of specified diseases, some of which may be fatal. In the start m + N ). fand the years in which they give birth. The simulation models any number of specified diseases, some of which may be fatal. In the start The simulation models any number of specified diseases, some of which may be fatal. In the start 1.2.2.3 The simulation models any number of specified diseases, some of which m The simulation models any number of specified diseases, some of whic Deaths from modeled diseases MalesByAgeByYear #/ 0 Input in year 0 – probability of a male having age a 1.2.2.3 year, the simulation’s death model uses the diseases’ own mortality statistics to adjust the Deaths from modeled diseases MalesByAgeByYear 0 Input in year Input in year 0 – probability of a male having The Population Editor menu item Population Editor\Tools\Births\show random birthList creates an year, the simulation’s death model uses the diseases’ own mortality statistics to adjust the MalesByAgeByYear # 0 – probability of a male having age a year, the simulation’s death model uses the diseases’ own mortality statistics to adjust the year, the simulation’s death model uses the diseases’ own mortality statis The simulation models any number of specified diseases, some of which may be fatal. In the start MalesByAgeByYear / 0 #/ Input in year age a 0 – probability of a male having age a probabilities of death by age and gender. 
In the start year, the net effect is to maintain the same 1.2.2.3 Deaths from modeled The simulation models any number of specified diseases, some of which m year, the simulation’s death model uses the diseases’ own mortality st diseases instance of the TPopulation class and uses it to generate and list a (selectable) sample of mothers probabilities of death by age and gender. In the start year, the net effect is to maintain the same probabilities of death by age and gender. In the start year, the net effect i FemalesByAgeByYear #1 0 probabilities of death by age and gender. In the start year, the net effect is to maintain the same Input in year0 – probability of a female having age a year, the simulation’s death model uses the diseases’ own mortality statistics to adjust the year, the simulation’s death model uses the diseases’ own mortality statis probabilities of death by age and gender. In the start year, the net effe probability of death by age and gender as before; in subsequent years, however, the rates at which The simulation models any number of specified diseases, some of which may be FemalesByAgeByYear # 0 Input in year Input in year – probability of a female having age a 0 – probability of a female having probability of death by age and gender as before; in subsequent years, however, the rates at which and the years in which they give birth. probability of death by age and gender as before; in subsequent years, however, the rates at which probability of death by age and gender as before; in subsequent years, ho FemalesByAgeByYear FemalesByAgeByYear #11 0 Input in year0 0 – probability of a female having age a probabilities of death by age and gender. In the start year, the net effect is to maintain the same probabilities of death by age and gender. In the start year, the net effect i people die from modeled diseases will change as modeled risk factors change. 
The population probability of death by age and gender as before; in subsequent years, age a fatal. In the start year, the simulation’ s death model uses the diseases’ own mortality people die from modeled diseases will change as modeled risk factors change. The population people die from modeled diseases will change as modeled risk factors cha BirthsByAgeofMother #2 0 Input in year0 – conditional probability of a birth at age a| the probability of death by age and gender as before; in subsequent years, however, the rates at which people die from modeled diseases will change as modeled risk factors change. The population 1.2.2.3 probability of death by age and gender as before; in subsequent years, ho Deaths from modeled diseases dynamics sketched above will be only an approximation to the simulated population’s dynamics. The BirthsByAgeofMother # 0 Input in year Input 0 – conditional probability of a birth at age a| the in year statistics to adjust the probabilities people die from modeled diseases will change as modeled risk factors of death by age and gender. In the start year, the BirthsByAgeofMother BirthsByAgeofMother #22 0 Input in year 0 – conditional probability of a 0 – conditional probability of a birth at age a| the mother gives birth. dynamics sketched above will be only an approximation to the simulated population’s dynamics. The dynamics sketched above will be only an approximation to the simulated p people die from modeled diseases will change as modeled risk factors change. The population people die from modeled diseases will change as modeled risk factors cha The simulation models any number of specified diseases, some of which may be fatal. In the start latter will be known only on completion of the simulation. mother gives birth. birth at age a| the mother gives birth. dynamics sketched above will be only an approximation to the simulated population’s dynamics. 
The net effect is to maintain the same latter will be known only on completion of the simulation. dynamics sketched above will be only an approximation to the simulat probability of death by age and gender as before; mother gives birth. latter will be known only on completion of the simulation. dynamics sketched above will be only an approximation to the simulated population’s dynamics. Th dynamics sketched above will be only an approximation to the simulated latter will be known only on completion of the simulation. year, the simulation’s death model uses the diseases’ own mortality statistics to adjust the NumberOfBirths #l 3 TFR, Poisson distribution, probability of lºTFR, Poisson distribution, probability of giving birth to n 2 This could be 1.2.3 The risk factor model made to be time in subsequent years, latter will be known only on completion of the simulation. however, the rates at which people die from modeled diseases latter will be known only on completion of the simulation. latter will be known only on completion of the simulation. NumberOfBirths NumberOfBirths # 3 lº TFR, Poisson distribution, probability of giving birth to n dependent; in the1.2.3 baseline model probabilities of death by age and gender. In the start year, the net effect is to maintain the same The risk factor model 1.2.3 The risk factor model NumberOfBirths #l 3 l giving birth to n children lºTFR, Poisson distribution, probability of giving birth to n children will change as modeled risk factors change. The population dynamics sketched it is constant. 
The distribution of risk factors (RF) in the population is estimated using regression analysis stratified children 1.2.3 The risk factor model probability of death by age and gender as before; in subsequent years, however, the rates at which The distribution of risk factors (RF) in the population is estimated using regression analysis stratified 1.2.3 1.2.3 The risk factor model The distribution of risk factors (RF) in the population is estimated using re The risk factor model children above will be by both sex S = {male, female} and age group A1.2.3 only an approximation to The risk factor model the simulated population’s dynamics. The = {0-9, 10-19, ..., 70-79, 80+}. The fitted trends are people die from modeled diseases will change as modeled risk factors change. The population The distribution of risk factors (RF) in the population is estimated using regression analysis stratified child gender S = {male, female} and age group by both sex The probability of = {0-9, 10-19, ..., 70-79, 80+}. The fitted trends are Aby both sex S = {male, female} and age group A = {0-9, 10-19, ..., 70-79, 80 The distribution of risk factors (RF) in the population is estimated using 3 The distribution of risk factors (RF) in the population is estimated using regression analysis stratified The distribution of risk factors (RF) in the population is estimated using re can be made time dependent. latter will be known only on completion of the simulation. dynamics sketched above will be only an approximation to the simulated population’s dynamics. The S = {male, female} and age group by both sex S = {male, female} and age group 2 A = {0-9, 10-19, ..., 70-79, 80+}. The fitted trends are by both sex by both sex by both sex A = {0-9, 10-19, ..., 70-79, 80+}. The fitted trends are S = {male, female} and age group S = {male, female} and age group AA = {0-9, 10-19, ..., 70-79, 80 = {0-9, 10-19, ..., 70-79, This could be made to be time dependent; in the baseline model it is constant. 
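As a minimal numerical sketch of the birth model in Section 1.2.2.2 (it is not the TPopulation implementation), the Python fragment below evaluates the truncated sum in equation (1.12) and compares it with the closed form in equation (1.13). The function names, the truncation limit n_max, and the values λ = 1.5 and p_b(a) = 0.08 are illustrative assumptions, not inputs taken from the model.

```python
import math


def binomial_pmf(n, big_n, p):
    """Equation (1.11): chance that n of a mother's big_n births fall at age a."""
    return math.comb(big_n, n) * p**n * (1.0 - p) ** (big_n - n)


def poisson_pmf(k, mean):
    """Poisson probability of k events for a given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def births_at_age_by_summation(n, lam, p, n_max=100):
    """Equation (1.12), with the infinite sum over N truncated at n_max."""
    return sum(
        poisson_pmf(big_n, lam) * binomial_pmf(n, big_n, p)
        for big_n in range(n, n_max + 1)
    )


def births_at_age_closed_form(n, lam, p):
    """Equation (1.13): Poisson distribution with mean lam * p."""
    return poisson_pmf(n, lam * p)


if __name__ == "__main__":
    lam = 1.5     # assumed TFR, for illustration only
    p_b_a = 0.08  # assumed value of p_b(a) at one age, for illustration only
    for n in range(4):
        summed = births_at_age_by_summation(n, lam, p_b_a)
        closed = births_at_age_closed_form(n, lam, p_b_a)
        print(n, summed, closed)
```

For each n, the two printed values should agree to within floating-point rounding, which is the simplification that allows the model to draw the number of births at each age directly from a Poisson distribution with mean λ p_b(a).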
1.2.3 The risk factor model

The distribution of risk factors (RF) in the population is estimated using regression analysis stratified by both sex S = {male, female} and age group A = {0-9, 10-19, ..., 70-79, 80+}. The fitted trends are extrapolated to forecast the distribution of each RF category in the future. For each sex-and-age-group stratum, the set of cross-sectional, time-dependent, discrete distributions $D = \{p_k(t) \mid k = 1, \dots, N;\ t > 0\}$ is used to manufacture RF trends for individual members of the population.

We model different risk factors, some of which are continuous (such as BMI) and some of which are categorical (smoking status).

1.2.3.1 Categorical risk factors

Smoking is the categorical risk factor. Each individual in the population may belong to one of the three possible smoking categories {never smoked, ex-smoker, smoker} with their probabilities $\{p_0, p_1, p_2\}$. These states are updated on receipt of the information that the person is either a smoker or a non-smoker. They will be a never smoker or an ex-smoker depending on their original state (an ex-smoker can never become a never smoker).

The complete set of longitudinal smoking trajectories and the probabilities of their happening is generated for the simulation years by allowing all possible transitions between smoking categories:

{never smoked} → {never smoked, smoker}
{ex-smoker} → {ex-smoker, smoker}
{smoker} → {ex-smoker, smoker}

When the probability of being a smoker is p, the allowed transitions are summarized in the state update equation:

$$\begin{bmatrix} p_0' \\ p_1' \\ p_2' \end{bmatrix} = \begin{bmatrix} 1-p & 0 & 0 \\ 0 & 1-p & 1-p \\ p & p & p \end{bmatrix} \begin{bmatrix} p_0 \\ p_1 \\ p_2 \end{bmatrix} \tag{1.14}$$

After the final simulation year, the smoking trajectories are completed until the person's maximum possible age of 110 by supposing that their smoking state stays fixed. The life expectancy calculation will consist in summing over the probability of being alive in each possible year of life.

In the initial year of the simulation, a person may be in one of the three smoking categories; after N updates there will be $3 \times 2^{N}$ possible trajectories. These trajectories will each have a calculated probability of occurring; the sum of these probabilities is 1.

In each year the probability of being a smoker or a non-smoker will depend on the forecast smoking scenario, which provides exactly that information. Note that these states are two-dimensional and cross-sectional {non-smoking, smoking}, and they are turned into three-dimensional states {never smoked, ex-smoker, smoker} as described above. The time evolution of the three-dimensional states gives the smoking trajectories necessary for the computation of disease-table disease and death probabilities.

1.2.3.2 Smoking

The microsimulation framework applied to smoking enables us to measure the future health impact of changes in rates of tobacco consumption. This includes the impact of giving up smoking on the following diseases: i) chronic obstructive pulmonary disease (COPD), ii) coronary heart disease (CHD), or myocardial infarction if CHD data are not available, iii) stroke, and iv) lung cancer. In the simulation, each person is categorized into one of the three smoking groups: smokers, ex-smokers, and people who have never smoked. Their initial distribution is based on the distribution of smokers, ex-smokers and never smokers from published data.

During the simulation, a person may change smoking states, and their relative risk will change accordingly. Relative risks associated with smokers and people who have never smoked have been collected from published data. The relative risks associated with ex-smokers ($RR_{\text{ex-smoker}}$) are related to the relative risk of smokers ($RR_{\text{smoker}}$). The ex-smoker relative risks are assumed to decrease over time with the number of years since smoking cessation ($T_{\text{cessation}}$). These relative risks are computed in the model using equations 1.15 and 1.16 (1).

$$RR_{\text{ex-smoker}}(A, S, T_{\text{cessation}}) = 1 + \bigl(RR_{\text{smoker}}(A, S) - 1\bigr)\exp\bigl(-\gamma(A)\,T_{\text{cessation}}\bigr) \tag{1.15}$$

$$\gamma(A) = \gamma_0 \exp(-\eta A) \tag{1.16}$$

where γ is the regression coefficient of time dependency. The constants γ0 and η are the intercept and the regression coefficient of age dependency, respectively, which are related to the specified disease (Table 2).

Table 2: Parameter Estimates for γ0 and η Related to Each Disease (1)

Disease        γ0        η
AMI            0.24228   0.05822
Stroke         0.31947   0.01648
COPD           0.20333   0.03087
Lung cancer    0.15637   0.02065
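The state update in equation (1.14) and the ex-smoker relative risk in equations (1.15) and (1.16) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the model's code: the function names, the example state (0.5, 0.2, 0.3), the smoking probability 0.25, the smoker relative risk of 3.0, and the age of 50 are hypothetical, while the γ0 and η values are those listed in Table 2.

```python
import math

# (gamma_0, eta) pairs as listed in Table 2
TABLE_2 = {
    "AMI": (0.24228, 0.05822),
    "Stroke": (0.31947, 0.01648),
    "COPD": (0.20333, 0.03087),
    "Lung cancer": (0.15637, 0.02065),
}


def update_smoking_state(state, p):
    """Equation (1.14): state is (p0, p1, p2) for never smoked, ex-smoker, smoker."""
    p0, p1, p2 = state
    return (
        (1.0 - p) * p0,         # never smokers who stay never smokers
        (1.0 - p) * (p1 + p2),  # ex-smokers who stay ex-smokers, plus smokers who quit
        p * (p0 + p1 + p2),     # anyone who is a smoker after the update
    )


def gamma(age, gamma0, eta):
    """Equation (1.16): age-dependent decay rate."""
    return gamma0 * math.exp(-eta * age)


def rr_ex_smoker(rr_smoker, age, t_cessation, gamma0, eta):
    """Equation (1.15): ex-smoker relative risk after t_cessation years."""
    return 1.0 + (rr_smoker - 1.0) * math.exp(-gamma(age, gamma0, eta) * t_cessation)


if __name__ == "__main__":
    # Hypothetical starting probabilities and smoking probability
    print(update_smoking_state((0.5, 0.2, 0.3), p=0.25))

    # Hypothetical smoker relative risk of 3.0 at age 50, stroke parameters
    g0, eta = TABLE_2["Stroke"]
    for years in (0, 1, 5, 10, 20):
        print(years, rr_ex_smoker(3.0, 50, years, g0, eta))
```

Because the input probabilities sum to 1, the updated smoker probability equals p and the updated state again sums to 1. The ex-smoker relative risk starts at the smoker value when the cessation time is zero and moves toward 1 as the cessation time grows.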
This includes the impact of giving up smoking on the these probabilities }, and they are turned into three-dimensional states { non-smoking, smoking is 1. never onsumption. This includes the impact of giving up smoking on the following diseases: i) Chronic obstructive pulmonary disease (COPD), ii) Coronary heart disease (or smoked, ex-smoker, smoker} as described above. The time evolution of the three-dimensional states calculated by the method detailed below (equations (1.17), (1.18) and (1.19)). Where time t is equal 40 structive pulmonary disease (COPD), ii) Coronary heart disease (or are the smoking trajectories necessary for the computation of disease-table disease and death to the age A of an individual. 41 probabilities. 7 7 r Exsmk ( t ) = 1 + ( r smk - 1) f ( t ) (1.17) 1.2.3.2 Smoking 0.8 0.8 0.8 T_cess AMI Stroke T_ces T_ces T_cessati AMI 0.24228 0.05822 0.8 0.6 Stroke 0.6 0.6 COPD Stroke 0.31947 0.01648 0.6 COPD Lung cancer 0.4 Modeling the Long-Term Health and Cost Impacts of Reducing COPD 0.20333 0.03087 0.4 0.4 Lung cancer Smoking Prevalence through Tobacco Taxation in Ukraine 0.4 Lung cancer 0.15637 0.02065 0.2 0.2 0.2 0.2 0 0 0 0 15 5 5 0 20 10 5 10 0 15 10 20 However, a minimum exists when the cessation time is equal to η-1. The 0 1.2.4 Relative risks However, a minimum exists when the cessation time is equal to η-1. The minimum value was Age Age minimum value was calculated by the method detailed below (equations calculated by the method detailed below (equations (1.17), (1.18) and (1.19)). Where time t is equal 0 5 10 15 The reported incidence risks for any disease do not make reference toAge 20 any Age (1.17), (1.18) and (1.19)). Where time t is equal to the age A of an individual. to the age A of an individual. underlying risk factor. The microsimulation requires this dependence to be made 1.2.4manifest. 
1.2.4 Relative risks

The reported incidence risks for any disease do not make reference to any underlying risk factor. The microsimulation requires this dependence to be made manifest. The risk factor dependence of disease incidence has to be inferred from the distribution of the risk factor in the population (here denoted as $\pi$); it is a disaggregation process.

Suppose that $\alpha$ is a risk factor state of some risk factor A, and denote by $p_A(d \mid \alpha, a, s)$ the incidence probability for the disease d given the risk state, $\alpha$, the person's age, a, and gender, s. The relative risk $\rho_A$ is defined by equation (1.22).

$$ p_A(d \mid \alpha, a, s) = \rho_{A|d}(\alpha \mid a, s)\, p_A(d \mid \alpha_0, a, s), \qquad \rho_{A|d}(\alpha_0 \mid a, s) \equiv 1 \tag{1.22} $$

Where $\alpha_0$ is the zero risk state (for example, the moderate state for alcohol consumption).

The incidence probabilities, as reported, can be expressed in terms of the equation,

$$ p(d \mid a, s) = \sum_{\alpha} p_A(d \mid \alpha, a, s)\, \pi_A(\alpha \mid a, s) $$
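The disaggregation step can be illustrated with a short Python sketch. Substituting equation (1.22) into the sum over risk states gives $p(d \mid a, s) = p_A(d \mid \alpha_0, a, s) \sum_{\alpha} \rho_{A|d}(\alpha \mid a, s)\, \pi_A(\alpha \mid a, s)$, so the zero-risk incidence is the reported incidence divided by the prevalence-weighted relative risk. The function names, the three smoking states used as keys, and all numbers below are hypothetical and chosen only for illustration.

```python
def baseline_incidence(reported_incidence, relative_risks, prevalence):
    """Incidence in the zero-risk state for one age and sex group.

    relative_risks and prevalence are dicts keyed by risk state; the
    zero-risk state must have a relative risk of 1.
    """
    if abs(sum(prevalence.values()) - 1.0) > 1e-9:
        raise ValueError("prevalence must sum to 1")
    weighted_rr = sum(relative_risks[state] * prevalence[state] for state in prevalence)
    return reported_incidence / weighted_rr


def incidence_by_state(reported_incidence, relative_risks, prevalence):
    """Equation (1.22) applied to every risk state."""
    p0 = baseline_incidence(reported_incidence, relative_risks, prevalence)
    return {state: relative_risks[state] * p0 for state in prevalence}


if __name__ == "__main__":
    # Hypothetical inputs for a single age and sex group.
    rr = {"never smoked": 1.0, "ex-smoker": 1.5, "smoker": 3.0}
    pi = {"never smoked": 0.5, "ex-smoker": 0.2, "smoker": 0.3}
    print(incidence_by_state(0.01, rr, pi))
```

Recombining the state-specific incidences with the prevalence weights returns the reported incidence, which is a convenient consistency check on the inputs.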
1.2.6.5 Survival rates

It is common practice to describe survival in terms of a survival rate R, supposing an exponential death-distribution. In this formulation, the probability of surviving t years from some time $t_0$ is given as

$$ p_{\mathrm{survival}}(t) = 1 - R \int_0^t e^{-Ru}\, du = e^{-Rt} \tag{1.31} $$

For a time period of 1 year

$$ p_{\mathrm{survival}}(1) = e^{-R} \;\Rightarrow\; R = -\ln\left(p_{\mathrm{survival}}(1)\right) = -\ln\left(1 - p_{\omega}\right) \tag{1.32} $$

For a time period of, for example, 4 years,

$$ p_{\mathrm{survival}}(t = 4) = 1 - R \int_0^4 e^{-Ru}\, du = e^{-4R} = \left(1 - p_{\omega}\right)^4 \tag{1.33} $$

In short, the rate is minus the natural log of the 1-year survival probability.

1.2.6.6.2 Survival model 1

The model uses two parameters $\{p_1, R\}$. Given the 1-year survival probability $p_{\mathrm{survival}}(1)$ and the 5-year survival probability $p_{\mathrm{survival}}(5)$,

$$ p_1 = 1 - p_{\mathrm{survival}}(1), \qquad R = -\frac{1}{4} \ln\left(\frac{p_{\mathrm{survival}}(5)}{p_{\mathrm{survival}}(1)}\right) $$

1.2.6.6.3 Survival model 2

The model uses three parameters $\{p_1, R, R_{>5}\}$. Given the 1-year survival probability $p_{\mathrm{survival}}(1)$ and the 5-year survival probability $p_{\mathrm{survival}}(5)$ (and, for $R_{>5}$, the 10-year survival probability $p_{\mathrm{survival}}(10)$),

$$ p_1 = 1 - p_{\mathrm{survival}}(1), \qquad R = -\frac{1}{4} \ln\left(\frac{p_{\mathrm{survival}}(5)}{p_{\mathrm{survival}}(1)}\right), \qquad R_{>5} = -\frac{1}{5} \ln\left(\frac{p_{\mathrm{survival}}(10)}{p_{\mathrm{survival}}(5)}\right) \tag{1.36} $$
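The survival-rate formulas translate directly into a few lines of Python. The sketch below covers equations (1.31), (1.32) and the three parameters of equation (1.36); the function names and the survival probabilities in the example are hypothetical and are not taken from the report's data.

```python
import math


def rate_from_one_year_survival(p_survival_1):
    """Equation (1.32): R = -ln(p_survival(1))."""
    return -math.log(p_survival_1)


def survival_probability(rate, years):
    """Equation (1.31): p_survival(t) = exp(-R * t)."""
    return math.exp(-rate * years)


def survival_model_2(p_survival_1, p_survival_5, p_survival_10):
    """Equation (1.36): returns (p1, R, R_gt5)."""
    p1 = 1.0 - p_survival_1
    rate = -math.log(p_survival_5 / p_survival_1) / 4.0
    rate_gt5 = -math.log(p_survival_10 / p_survival_5) / 5.0
    return p1, rate, rate_gt5


if __name__ == "__main__":
    # Hypothetical 1-, 5- and 10-year survival probabilities.
    p1, rate, rate_gt5 = survival_model_2(0.8, 0.5, 0.4)
    print(p1, rate, rate_gt5)
```

A quick check of the construction: surviving year 1 with probability $1 - p_1$ and then four further years at rate R reproduces the 5-year survival probability that was supplied.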
p1(k) is otherwise known as the disease prevalence, ). Hence the relation (k). Hence the relation Modeling the Long-Term ppreHealth (k). Hence the relation and Cost Impacts of Reducing Given the 1-year survival probability psurvival (1) and the 5-year survival probability survival survival ppre(k). Hence the relation The probability of dying from the disease in the age interval [k, k+1] is pw ppre(k). Hence the relation k p1 (k ) - this is otherwise Smoking Prevalence through Tobacco Taxation in Ukraine 11 ææ 1 æp p p survival(( survival 10))öö 10 10 ö pmor ( k ) p (k ) p1 = 1 - psurvival (1) R R R = = = -- - ln ln ln ç ç ç survival ÷ ÷ ÷ the (cross-sectional) disease mortality, pmor(p k). p wk = 1(k) is otherwise known as the disease prevalence, pwk = mor (1.39) pwk = pmor ( k ) pmor ( k ) p pre ( k ) p pre ( k ) > 5 >>55 55 5 èè p p p ((55 5)) ppre(k). Hence the relation è survival survival survival øøø pwk = pwk = (1.39) (1.39) 1 æ psurvival ( 5) ö p pre ( k ) p pre ( k ) 1.2.6.7 Approximating R =single-state - ln ç disease ÷ survival data from mortality (1.36) For exponential survival probabilities, the probability of dying from the disease in the age-interval For exponential survival probabilities, the probability of dying from the disease in the ag For exponential survival probabilities, p the ( k )probability of dying from the disease in 1.2.6.7 1.2.6.7 1.2.6.7 4 è psurvival (1) ø Approximating single-state disease survival data from mortality and prevalence Approximating single-state disease survival data from mortality and prevalence Approximating single-state disease survival data from mortality and prevalence pwk = mor For exponential survival probabilities, the probabi (1.39) and prevalence [k, k+1] is denoted the Wk and is given by the formula page-interval [k, [k, k+1] is denoted k+1] is denoted p p pre ( k ) W k and is given by the formula and is given by the formula p and is given by the formula An example is provided here with a standard life-table 
analysis for a disease For exponential survival probabilities, the probability of dying from the disease in the age-interval For exponential survival probabilities, the probability of dying from the disease in the age-interval d . [k, k+1] is denoted An example is provided here with a standard life-table analysis for a disease An An example is provided here with a standard life-table analysis for a disease dd . . W k example is provided here1 withæa (10life-table )ö pstandard analysis for a disease d. [kp k+1] is denoted , W p R>5 = - ln ç survival [k, k+1] is denoted Wk and is given by the formula k and is given by the formula Þ Rk = - ln (1 5 è psurvival ( 5) ø ÷ pwk = 1 - e- R k pwkp - =wk1)- e- R For exponential survival probabilities, the probability of dying from the disease in the age-interval k Þ Rk = - ln ( 1 - pwk ) (1.40) Consider the 4 following states: Consider the 4 following states: Consider the 4 following states: pwk = 1 - e- Rk Þ Consider the 4 following states: [k, k+1] is denoted pWk and is given by the formula 1.2.6.7 Approximating single-state disease survival data from mortality and prevalence 1 - e- Rk Þ pwk = Rk1- =e Rk --ln (1Þ pwkR )k = - ln (1 - pwk ) pwk = When, as is the case for most cancers, these survival probabilities are known, the microsimulation - When, as is the case for most cancers, these survival probabilities are known, the micro (1.40) (1.40) state state state Description Description Description An example is provided here with a standard life-table analysis for a disease d. When, as is the microsimulation will case pwk for most cancers, these survival will use them. When they are not known or are too old to be any longer of any use, the probabilities are known, the When, as is the case for most cancers, these surviv will use them. When they are not known or are too old to be any longer of any use, the = 1 - e- R Þ Rk = - ln (1 - pwk ) k use them. 
When they are not known or are too old to be any (1.40) state Description When, as is the case for most cancers, these survival probabilities are known, the microsimulation When, as is the case for most cancers, these survival probabilities are known, the microsimulation will use them. When they are not known or are to microsimulation uses survival statistics inferred from the prevalence and mortality statistics microsimulation uses survival statistics inferred from the prevalence and mortality stati longer of any use, the microsimulation uses survival statistics inferred from the 00 0 Consider the 4 following states: alive without disease alive without disease alive without disease d d d will use them. When they are not known or are too old to be any longer of any use, the (equation (1.39)). (equation (1.39)). will use them. When they are not known or are too old to be any longer of any use, the microsimulation uses survival statistics inferred fro When, as is the case for most cancers, these survival probabilities are known, the microsimulation 0 alive without disease d prevalence and mortality statistics (equation (1.39)). microsimulation uses survival statistics inferred from the prevalence and mortality statistics microsimulation uses survival statistics inferred from the prevalence and mortality statistics will use them. When they are not known or are too old to be any longer of any use, the (equation (1.39)). state Description An alternative derivation equation (1.39) is as follows. Let Nk be the number of people in the An alternative derivation equation (1.39) is as follows. Let Nk be the number of people i 11 1 1 alive with disease alive with disease alive with disease alive with disease d d d d (equation (1.39)). microsimulation uses survival statistics inferred from the prevalence and mortality statistics (equation (1.39)). population aged k, and let nk be the number of people in the population aged k with the disease. 
population aged k, and let nk be the number of people in the population aged k with the 0 alive without disease d An alternative derivation equation (1.39) is as follows. (equation (1.39)). An alternative derivation equation (1.39) is as follo Let Nk be the number of Then, the number of deaths from the disease of people aged k can be given in two ways: as pwk Then, the number of deaths from the disease of people aged k n k can be given in two way 22 2 2 dead from disease dead from disease d dead from disease dead from disease dd d An alternative derivation equation (1.39) is as follows. Let An alternative derivation equation (1.39) is as follows. Let people in Nkthe population aged Nk be the number of people in the be the number of people in the and, equivalently, as pmor(k)Nk. Observing that the disease prevalence is An alternative derivation equation (1.39) is as follows. Let k, and let nk be the and, equivalently, as pmor( kk N population aged number of people ) be the number of people in the k, and let in the nk be the number of pe nk/Nk leads to the equation Nk. Observing that the disease prevalence is nk/Nk leads to t 1 alive with disease d population aged k, population aged k, and let nk be the number of people in the population aged and let nk be the number of people in the population aged population aged k with the disease. k with the disease. Then, the k with the disease. Then, the number of deaths from the disease of p number of deaths from the disease of 3 dead from another disease population aged k, and let nk be the number of people in the population aged k with the disease. 
2 33 3 dead from disease d dead from another disease dead from another disease dead from another disease Then, the number of deaths from the disease of people aged Then, the number of deaths from the disease of people aged people aged k can be given in two ways: as k can be given in two ways: as k can be given pWkin Then, the number of deaths from the disease of people aged nk = pways: two mor ( k ) N k pWk nkp= as pwknand, equivalently, as k and, equivalently, w kn k can be given in two ways: as pk pmor as mor ( k ) p N(k)Nk. Observing that the kn k wk and, equivalently, as and, equivalently, as pmor(k)Nk. Observing that the disease prevalence is pmor(k)Nk. Observing that the disease prevalence is Observing that the disease n k/ N k is nk/Nk leads to the equation leads to the equation prevalencen leads to the equation n and, equivalently, as pmor(k)Nk. Observing that the disease prevalence is nk/Nk leads to the equation 3 p p p ik ik ik is the is the probability of disease is the probability of disease d incidence, is the probability of disease dead from another disease probability of disease dd d incidence, aged incidence, aged incidence, aged aged k kk k p pre ( k ) = k p pre ( k ) = k pWk nk = N N pWk nk = pmor ( k ) p pmor ( kp)WN k k NWkk nk = knk = pmor ( k ) N k (1.41) pik is the probability of disease p p p w k is the probability of dying from the disease is the is the probability of dying from the disease d incidence, aged k d, aged k is the probability of dying from the disease probability of dying from the disease d, aged k d d , aged , aged kk kÞ Þ p pre ( k ) = wwkk n n nk ( k ) p pre ( k ) = k p pre ( k ) = k p pre ( k ) = pmor p (k ) pwk is the probability of dying from the disease d, aged k Nk Nk p = N pWk = mor # is the probability of dying other than from disease d, aged is theis the probability of dying other than from disease d, aged is the probability of dying other than from disease d, aged kk k Wk p k (k ) p pre ( k ) Þ # # >$ >$ >$ 
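The single-state life-table update of equation (1.37) and the survival inference of equations (1.39) and (1.40) can be sketched in Python as follows. The symbol $p_{\alpha k}$ appears in the matrix but is not defined in the surrounding text; here it is treated simply as the probability of moving from state 1 back to state 0, which is where it sits in the matrix, and it defaults to zero. Function names and all numerical inputs are hypothetical.

```python
import math


def transition_matrix(p_i, p_w, p_wbar, p_alpha=0.0):
    """Equation (1.37) for one age k (columns are the 'from' states 0..3)."""
    alive_with_disease = 1.0 - p_w - p_wbar
    return [
        [(1.0 - p_wbar) * (1.0 - p_i), alive_with_disease * p_alpha, 0.0, 0.0],
        [(1.0 - p_wbar) * p_i, alive_with_disease * (1.0 - p_alpha), 0.0, 0.0],
        [0.0, p_w, 1.0, 0.0],
        [p_wbar, p_wbar, 0.0, 1.0],
    ]


def step(matrix, state):
    """Advance the state probabilities [p0, p1, p2, p3] by one year."""
    return [sum(row[j] * state[j] for j in range(4)) for row in matrix]


def death_probability(p_mor, p_pre):
    """Equation (1.39): p_w = p_mor / p_pre."""
    return p_mor / p_pre


def survival_rate(p_w):
    """Equation (1.40): R = -ln(1 - p_w)."""
    return -math.log(1.0 - p_w)


if __name__ == "__main__":
    # Hypothetical mortality, prevalence, incidence and other-cause mortality.
    p_w = death_probability(p_mor=0.002, p_pre=0.02)
    matrix = transition_matrix(p_i=0.005, p_w=p_w, p_wbar=0.01)
    print(step(matrix, [0.98, 0.02, 0.0, 0.0]), survival_rate(p_w))
```

Because every column of the matrix sums to one, the four state probabilities still sum to one after each update.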
1.2.6.8 Approximating multi-state disease survival data from incidence and mortality, assuming no remission

Disease mortality statistics give the probability that a person will die from the disease in a given year of life. They make no reference to when the disease from which the person dies was contracted.

Disease survival statistics give the probability that a person will die from the disease in a given year of life, given that they contracted the disease in some earlier year.

The connection between the two is provided by the equation of the form

$$ p_{mor}(a) = \sum_{a_0 < a} p_{\omega}(a \mid a_0)\, p_{inc}(a_0) \tag{1.42} $$

This equation can be used to infer survival statistics when only the incidence and mortality statistics are known, essentially by choosing the survival statistics so as to get the mortality statistics calculated from equation (1.42) as close as possible to the known set.
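As an illustration of how equation (1.42) can be used to infer survival from incidence and mortality, the following Python sketch makes two simplifying assumptions that are not in the text: the yearly probability q of dying from the disease is constant after onset, so that $p_{\omega}(a \mid a_0) = (1 - q)^{a - a_0 - 1} q$, and the best q is found by a plain grid search on the squared error. The function names and the synthetic inputs are hypothetical; the report's own fitting procedure may differ.

```python
def death_probability_at_age(a, a0, q):
    """P(die from the disease at age a | onset at age a0), constant yearly q."""
    if a <= a0:
        return 0.0
    return (1.0 - q) ** (a - a0 - 1) * q


def mortality(a, incidence, q):
    """Equation (1.42): sum over onset ages a0 < a."""
    return sum(death_probability_at_age(a, a0, q) * incidence[a0] for a0 in range(a))


def fit_death_probability(incidence, observed_mortality, grid_size=1000):
    """Grid search for the q whose implied mortality is closest to the data."""
    best_q, best_error = None, float("inf")
    for i in range(1, grid_size):
        q = i / grid_size
        error = sum(
            (mortality(a, incidence, q) - observed_mortality[a]) ** 2
            for a in range(len(observed_mortality))
        )
        if error < best_error:
            best_q, best_error = q, error
    return best_q


if __name__ == "__main__":
    # Synthetic example: mortality generated with q = 0.2, then recovered.
    incidence = [0.01] * 10
    observed = [mortality(a, incidence, 0.2) for a in range(10)]
    print(fit_death_probability(incidence, observed))
```

With real data the fit would not be exact, and an age-dependent or stage-dependent q would replace the single constant used here.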
Multi-state diseases have mortality, survival, and incidence statistics that are state dependent. Aside from this additional level of complexity, the determination of disease survival proceeds in the same way.

1.2.6.8.1 Setup

For each sex, consider an N-stage, terminal disease for which both the inter-stage transition probabilities, $p_{i,j}(a)$ (the probability to go from stage i to stage j), and the state-dependent mortality probabilities $p^{K}_{mor}(a)$ (K denotes the stage number and a the age) are known. The following algorithm allows for an optimal determination of the stage-dependent survival probabilities. In the special case of a single-state disease, it reduces to the previously developed single-stage determination of survival.

1.2.6.8.2 Definitions

$p_{inc}(a_0, K_0)$: the probability of first getting the disease at age $a_0$ in state $K_0$, given no disease at age 0
$p_K(a \mid a_0, K_0)$: the probability of not having died from the disease and being in stage K at age a, given that the disease was contracted at $a_0$ in state $K_0$
$p_{\omega}(a \mid a_0, K_0)$: the probability of being dead (from the disease) at age a, given that the disease was contracted at $a_0$ in state $K_0$
$p^{K}_{\omega}(a \mid a_0)$: the probability of dying from the disease in stage K at age a, given that the disease was contracted at age $a_0$ and that the person was alive at age a-1

These probabilities are linked by a state-update equation.

1.2.6.8.3 Disease incidence

The probability that a person, who at age 0 does not have the disease, first gets the disease at age $a_0$ in state $K_0$ is given as

$$ p_{inc}(a_0, K_0) = p_{0,K_0}(a_0) \prod_{a=0}^{a_0 - 1} \left(1 - \sum_{k=1}^{N} p_{0,k}(a)\right) \tag{1.43} $$

1.2.6.8.4 Disease survival

Once a person has the disease, they can possibly change disease stage, or they can die from the disease. (This analysis focuses only on the identified disease and does not allow for the possibility that they die from other causes.) Suppose they acquire the disease at age $a_0$ in stage K, then the initial state vector is determined from the initial conditions

$$ p_i(a_0) = \delta_{iK} \quad (K > 0), \qquad p_{\omega}(a_0) = 0. $$

At subsequent ages, the state probabilities are given by the recursion equation

$$ \begin{pmatrix} p_0(a+1 \mid a_0, K_0) \\ p_1(a+1 \mid a_0, K_0) \\ \vdots \\ p_{\omega}(a+1 \mid a_0, K_0) \end{pmatrix} = T(a, a_0) \begin{pmatrix} p_0(a \mid a_0, K_0) \\ p_1(a \mid a_0, K_0) \\ \vdots \\ p_{\omega}(a \mid a_0, K_0) \end{pmatrix} $$

with the transition matrix

$$ T(a, a_0) \equiv \begin{pmatrix} 1 - \sum_{k>0} p_{0,k} & 0 & \cdots & 0 \\ p_{0,1} & 1 - \sum_{k>1} p_{1,k} - p^{1}_{\omega}(a \mid a_0) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \end{pmatrix} $$
ç ÷ çæ a = a0 -1 K ÷ 16 ( a the probability of dying from the disease in stage K at age pN -150 + 1 | ap 0 , K( ) p -1 ( a p | a0( ,a ö K)0 )p a , given that the pw ( a | a0 ) ç K ç p (a + inc 0a ÷ 0 ÷ , K 0 ) = Õ ç ç ç 1N- å 0k ÷÷÷ 0 K0 (1.43) 16 51 è w 1 | a 0 , K ) disease was contracted at age a 0 ø a = 0 è è w 0p (k a =1 ) | a and that the person was alive at age a-1 0 , K ø 0 ø 2.6.8.4 Disease survival Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine 1.2.7 Approximating attributable cases 1.2.7 Approximating attributable cases 1.2.10.1 Direct costs 1.2.7 Approximating attributable cases The smoking attribute cases (I The smoking A ) for a disease (d) is calculated by dividing the number of new cases of attribute cases (IA) for a disease (d) is calculated by dividing the Direct costs are calculated based on a cost per case, which is constant throughout The smoking attribute cases (I ) for a disease (d) is calculated by dividing the number of new cases of a disease among individuals who are either smokers or ex-smokers (n) by the total number of people number of new A cases of a disease among individuals who are either smokers or ex- the simulation. a disease among individuals who are either smokers or ex-smokers (n) by the total number of people in the population in a given year. smokers (n) by the total number of people in the population in a given year. 1.2.7 Approximating attributable cases in the population in a given year. 
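The state-update recursion for a multi-stage disease can be illustrated with a short C++ sketch. It is not taken from the model's source code: the names `buildTransitionMatrix`, `updateState`, and `sumsToOne` are hypothetical, the matrix is assumed to be the same at every age, and the layout (one column per current stage, with the last row and column standing for death from the disease) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Hypothetical helper: builds an (N+1)x(N+1) matrix T for one age.
// transition[j][i] is the probability of moving from stage j to stage i (i != j);
// mortality[j] is the probability of dying from the disease while in stage j.
// Column j of T therefore sums to 1; the last index is the "dead" state.
Mat buildTransitionMatrix(const Mat& transition, const Vec& mortality) {
    const std::size_t n = mortality.size();
    assert(transition.size() == n);
    Mat T(n + 1, Vec(n + 1, 0.0));
    for (std::size_t j = 0; j < n; ++j) {
        assert(transition[j].size() == n);
        double stay = 1.0 - mortality[j];
        for (std::size_t i = 0; i < n; ++i) {
            if (i == j) continue;
            T[i][j] = transition[j][i];
            stay -= transition[j][i];
        }
        assert(stay >= -1e-12);   // probabilities leaving stage j must not exceed 1
        T[j][j] = stay;           // probability of remaining in stage j
        T[n][j] = mortality[j];   // probability of dying from stage j
    }
    T[n][n] = 1.0;                // death is absorbing
    return T;
}

// One step of the recursion: p(a+1) = T * p(a).
Vec updateState(const Mat& T, const Vec& p) {
    assert(T.size() == p.size());
    Vec next(p.size(), 0.0);
    for (std::size_t i = 0; i < T.size(); ++i) {
        for (std::size_t j = 0; j < p.size(); ++j) {
            next[i] += T[i][j] * p[j];
        }
    }
    return next;
}

// Checks that the state probabilities still sum to 1 after an update.
bool sumsToOne(const Vec& p) {
    double s = 0.0;
    for (double x : p) s += x;
    return std::fabs(s - 1.0) < 1e-9;
}
```

Starting from the vector (1, 0, 0) for a two-stage disease contracted in stage 0, repeated calls to `updateState` give the probability of being in each stage, or dead, at each later age.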
1.2.7 Approximating attributable cases

The smoking-attributable cases ($I_{A_d}$) for a disease (d) are calculated by dividing the number of new cases of the disease among individuals who are either smokers or ex-smokers (n) by the total number of people in the population (N) in a given year (y).

$$I_{A_d}(y) = \frac{n(y)}{N(y)} \qquad (1.1)$$

1.2.8 Potential Years of Life Lost (PYLL) (2, 3)

The PYLL for an individual (PYLL(i)) who dies in a given year will be calculated from equation (1.2).

$$PYLL(i) = \begin{cases} Age_{ref} - Age_{death} & \text{if } Age_{death} < Age_{ref} \\ 0 & \text{if } Age_{death} \ge Age_{ref} \end{cases} \qquad (1.2)$$

For each individual, the difference between the reference age (life expectancy) and the age of death will be calculated. The total PYLL (TotalPYLL(year)) will be calculated each year in the microsimulation. This metric will consider individuals who have died in a given year ($N_{died}(year)$).

As the simulation projects into the future, and simulates a cohort of children with defined age groups and therefore year of birth, life expectancy values for each simulated individual will be based on their life-expectancy values at birth, obtained for each country from national statistics repositories.

$$TotalPYLL(year) = \frac{\sum_{i=1}^{N_{died}(year)} PYLL(i)}{N_{population}(year)} \qquad (1.3)$$

1.2.9 Premature mortality rate

The premature mortality rate (PM(year)), based on the number of individuals who die prematurely in a given year, is calculated from equations (1.4) and (1.5).

$$premature(i) = \begin{cases} 1 & \text{if } age_{death} < 70 \\ 0 & \text{if } age_{death} \ge 70 \end{cases} \qquad (1.4)$$

$$PM(year) = \frac{\sum_{i=1}^{N_{died}(year)} premature(i)}{N_{population}(year)} \qquad (1.5)$$

1.2.10 Costs module

The cost module includes both direct and indirect cost calculations.

1.2.10.1 Direct costs

Direct costs are calculated based on a cost per case, which is constant throughout the simulation.

$$\text{Direct cost (\$) per individual (year)} = \frac{\text{Cost per case (\$)} \times \text{Prevalence(year)}}{\text{Alive(year)}} \qquad (1.6)$$

The direct costs are displayed per million $ and at a rate defined by the user.

$$\text{Direct costs (M\$) per rate (year)} = \frac{\text{Direct cost (\$) per individual (year)} \times \text{rate}}{10^6} \qquad (1.7)$$

95% confidence intervals are calculated from the prevalence rates per individual (P) by the equation below.

$$95\%\,CI = \text{Direct costs (year)} \times 1.96 \sqrt{\frac{P(1-P)}{Trials}} \qquad (1.8)$$

1.2.11 Premature mortality costs (PMC) (4, 5)

$$PMC(i) = \begin{cases} \sum_{j=age_{death}}^{LE\ at\ birth - 1} Income(j) & \text{if } age_{death} < LE\ at\ birth \\ 0 & \text{if } age_{death} \ge LE\ at\ birth \end{cases} \qquad (1.9)$$

The premature mortality costs for each individual (PMC(i)) are calculated by summing over the income costs, Income(j) at age j, from the age of death until the maximum age. The maximum age can be defined as the pension age or by some other value.

The model outputs average PMC per 100,000 as shown in equation (1.10).

$$PMC\ per\ 100{,}000 = \frac{PMC_{Total}}{N_{population}} \times 100{,}000 \qquad (1.10)$$

1.2.12 Propagation of errors equation

To include totals for each of the outputs, each disease output (e.g. incidence, prevalence) was summed across diseases. The total errors ($E_T$) were calculated using the propagation of errors equation:

$$E_T = \sqrt{E_1^2 + E_2^2 + \cdots + E_n^2} \qquad (1.11)$$

where $E_n$ is the error for each individual disease output which has been included in the sum.
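As a worked illustration of the per-individual PYLL, the direct cost per individual, the premature mortality cost, and the propagation of errors, the following C++ sketch evaluates each quantity for simple inputs. The function names and the `incomeByAge` input layout (one income value per year of age, indexed from age 0) are assumptions made for this example and are not taken from the model's source code; the square root in `totalError` assumes the standard root-sum-of-squares form.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Potential years of life lost for one individual:
// reference age minus age at death, or 0 if death occurs at or after the reference age.
double pyll(double ageRef, double ageDeath) {
    return ageDeath < ageRef ? ageRef - ageDeath : 0.0;
}

// Direct cost per individual in a year: cost per case * prevalence / number alive.
double directCostPerIndividual(double costPerCase, double prevalence, double alive) {
    assert(alive > 0.0);
    return costPerCase * prevalence / alive;
}

// Premature mortality cost for one individual: income summed from the age of death
// up to (life expectancy at birth - 1). incomeByAge[j] is the income at age j
// (hypothetical input layout). Returns 0 if death occurs at or after life expectancy.
double prematureMortalityCost(const std::vector<double>& incomeByAge,
                              int ageDeath, int lifeExpectancyAtBirth) {
    assert(ageDeath >= 0);
    double total = 0.0;
    const int lastAge = static_cast<int>(incomeByAge.size());
    for (int j = ageDeath; j < lifeExpectancyAtBirth && j < lastAge; ++j) {
        total += incomeByAge[static_cast<std::size_t>(j)];
    }
    return total;
}

// Propagation of errors: square root of the sum of squared individual errors.
double totalError(const std::vector<double>& errors) {
    double sumSquares = 0.0;
    for (double e : errors) sumSquares += e * e;
    return std::sqrt(sumSquares);
}
```

For example, with a reference age of 75, a death at age 60 contributes 15 potential years of life lost, while a death at age 80 contributes none; two disease outputs with errors 3 and 4 combine to a total error of 5.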
2 Software Architecture

2.1 Aim of the Model

This model utilizes a common bespoke method to predict the impact of changing risk factors to measure chronic disease. UKHF currently have risk-factor models for obesity, tobacco, albuminuria, and eGFR and related diseases that include myocardial infarction/coronary heart disease, stroke, Type 2 diabetes, chronic obstructive pulmonary disease, chronic kidney disease, and lung cancer.

The model is an epidemiological/medical competing-risk application that uses both stochastic and deterministic processing capable of projecting cohort mortality rates of an individual or a population, taking account of the individuals' risk factors and medical profiles. Through its interactive scenario specification, the model allows for the effects of ageing and the projection of future mortality rates, either with or without taking into account possible future trends in risk factors or medical conditions.

2.2 Summary of the Architecture of the Existing Model

The existing solution is written in C++ (compiler Embarcadero C++ Builder). It is a modular, object-oriented design and is compiled to run under the Windows operating system.

The application has a limited interactive graphics capability designed for the rapid assessment of outputs and comparative assessments of batched runs. Diagrams and graphs produced in this way can be exported from the application in suitable file formats. The model is equipped with a suite of editors allowing flexible and traceable input of individual, cohort, or population data.

The model's inputs are in the form of tab-delimited text files. The application has a number of editors that can create, edit, and store these files. The simulation, disease, and scenario editors allow the user to specify all input data files, parameters and processing rules necessary for a run of the program. The application's many data inputs are processed in a similar fashion: for a specified run-configuration, the application dynamically creates and maintains lists of software objects, each object being constructed from a designated data file (the disease and scenario objects sketched below are examples of this process). Files input in this way into their corresponding dynamical objects are automatically checked for their data integrity by the newly created software object's own methods. Each run creates and stores a time-tagged configuration file specifying the complete set of input files, output files, parameter settings and rule set. Provided that the necessary input data are available, it is possible to rerun a simulation by reusing the configuration file.

Outputs are handled in a similar way but in reverse: run-time generated outputs are stored in dynamically created output objects; at the end of a run, the objects write their data to tab-delimited text files. Outputs can be summary files, or medium- or low-level data files, which can be further processed by standard software packages.

Figure 2.2: The Model Structure. (Input editors, Input Files, MODEL process, Output Files, Output editors.)

2.3 Main C++ Classes Used by the Model

Individual members of the population and diseases are modeled by the C++ classes Tperson and Tdisease respectively. The risk-factor trends and scenarios are modeled by the C++ class Tscenario. The principal operations of the program can be regarded as the interactions of Tperson, Tdisease, and Tscenario objects. These classes have some of their fields and methods highlighted in the following section and subsections; the idea is to give an indication of the processing chain implemented in the model. The software closely follows the real-world life of individuals: they age in a personal risk factor environment and possibly catch diseases from which they may recover or die.

Figure 3 shows a schematic of the model illustrating the overall structure of the model and each class.

Figure 3: Schematic of the Model. (Input datasets: risk data, population data, disease data, health economic data, intervention scenarios. Software programmes: distribution programme, UKHF Microsimulation© programme. Output: output data.)

2.3.1 Tperson C++ class

People are implemented as instances of the C++ class Tperson; an indication of its data fields and methods is provided in Table 4. The data fields are grouped into the state record and the medHistory record. The medHistory record maintains their current disease and risk factor status together with data necessary for the computation of disease-related transitions. The Tperson object's data fields are updated annually with the yearByYear(diseaseList, scenarioList, …) method. The method needs to be supplied with pointers to the list of disease object pointers being modeled and the list of risk-factor scenario object pointers, which determines how the person's set of risk factors change over the year.

Table 4: The C++ Tperson Class

| Tperson | Description |
| --- | --- |
| Data field | |
| State | State vector record |
| medHistory | Medical history record |
| … | … |
| Method | |
| Tperson(state0,medHistory0) | Constructor for initial state and history |
| yearByYear(…) | Updates state and medHistory by one year |
| … | … |

2.3.2 Tdisease C++ class

Both single and multistage diseases are implemented as instances of the C++ class Tdisease. An example of their structure is shown in Figure 4. Each stage of a disease has its own set of disease statistics such as incidence risks, remission risks, and survival risks. Moreover, each disease stage will also contain economic data such as direct and indirect costs. In addition to the data fields, there are also methods that are included in the Tdisease class, some examples of which are provided in Table 5. Disease data are stored in structured text files, one file for each disease or version of that disease. The key method of the Tdisease class is the function GetRisk(). For a specified person state, medical history, and risk factor type, the method returns the relevant transition probability. If the application is running in a stochastic transition mode, this probability is compared to an application-generated random number to determine if the transition takes place; in deterministic mode, the same transition probability is included in the relevant life-disease table that computes and lists the probabilities of being alive and in possible exclusive disease states or dead.

Figure 4: Multistage Disease Architecture. (Each disease stage, stage 1 and stage 2, holds its own CostQoL, indirect cost, direct cost, and DiseaseState data. The incidence risks inc 01, inc 02, inc 12 and the remission risks rem 10, rem 20, rem 21 each carry relative risks for bmi and smk.)

KEY: inc = incidence; rem = remission; pre = prevalence; mor = mortality; sur = survival; bmi = body mass index; smk = smoking; 01 = stage 0 to stage 1; 02 = stage 0 to stage 2; 20 = stage 2 to stage 0; 11 = stage 1.

Table 5: The C++ Tdisease Class

| TDisease | Description |
| --- | --- |
| Data field | |
| name | Disease name |
| terminal | Boolean, true if the disease is terminal |
| state | Disease state {normal, severe,…} |
| *DataAvailability | Boolean array of data availability by risk type |
| **IncidenceRisk | Incidence rates by age, gender |
| ***SurvivalRisk | Survival rates by age, gender, state |
| **PrevalenceRisk | Prevalence rates by age, gender |
| ***RemissionRisk | Remission rates by age, gender, state |
| ***MortalityRisk | Mortality rates by age, gender, state |
| ***RelRisk | Relative risks by risk factor type, age, gender |
| … | … |
| Method | |
| TDisease(aFile) | Constructor using data from aFile |
| LoadFromFile(aFile) | Fills the data fields from aFile |
| WriteToFile(aFile) | Writes the data fields to aFile |
| GetRisk(state, medHistory, risktype) | Returns risk for specified risktype |
| … | … |
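The interaction between a person object and a disease object in stochastic mode can be sketched as follows. This is a minimal illustration under stated assumptions, not the model's actual code: `SketchPerson`, `SketchDisease`, `SketchState`, `SketchMedHistory`, and `RiskType` are hypothetical stand-ins for Tperson, Tdisease, and their records; each disease has a single stage with constant incidence and mortality probabilities; and remission, risk factors, and scenarios are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <string>
#include <utility>
#include <vector>

enum class RiskType { Incidence, Mortality };

struct SketchState { int age; bool alive; };

struct SketchMedHistory {
    std::vector<std::string> diseases;   // names of diseases the person currently has
    bool has(const std::string& name) const {
        return std::find(diseases.begin(), diseases.end(), name) != diseases.end();
    }
};

// Hypothetical single-stage disease with constant transition probabilities.
class SketchDisease {
public:
    SketchDisease(std::string name, bool terminal, double incidence, double mortality)
        : name_(std::move(name)), terminal_(terminal),
          incidence_(incidence), mortality_(mortality) {
        assert(incidence >= 0.0 && incidence <= 1.0);
        assert(mortality >= 0.0 && mortality <= 1.0);
    }
    const std::string& name() const { return name_; }
    bool terminal() const { return terminal_; }
    // Returns the transition probability for the requested risk type.
    // A fuller version would use the person's state and medical history.
    double GetRisk(const SketchState&, const SketchMedHistory&, RiskType type) const {
        return type == RiskType::Incidence ? incidence_ : mortality_;
    }
private:
    std::string name_;
    bool terminal_;
    double incidence_;
    double mortality_;
};

class SketchPerson {
public:
    SketchPerson(SketchState state0, SketchMedHistory medHistory0)
        : state_(state0), medHistory_(std::move(medHistory0)) {}

    // Advances the person by one year in stochastic mode: each transition
    // probability is compared with a uniform random number in [0, 1).
    void yearByYear(const std::vector<const SketchDisease*>& diseaseList, std::mt19937& rng) {
        if (!state_.alive) return;
        std::uniform_real_distribution<double> uniform(0.0, 1.0);
        for (const SketchDisease* d : diseaseList) {
            if (medHistory_.has(d->name())) {
                if (d->terminal() &&
                    uniform(rng) < d->GetRisk(state_, medHistory_, RiskType::Mortality)) {
                    state_.alive = false;   // death from a terminal disease
                    return;
                }
            } else if (uniform(rng) < d->GetRisk(state_, medHistory_, RiskType::Incidence)) {
                medHistory_.diseases.push_back(d->name());   // disease contracted this year
            }
        }
        ++state_.age;
    }

    const SketchState& state() const { return state_; }
    const SketchMedHistory& medHistory() const { return medHistory_; }
private:
    SketchState state_;
    SketchMedHistory medHistory_;
};
```

In deterministic mode the same probabilities would instead be accumulated in a life-disease table rather than compared with random numbers; that branch is not shown here.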
Processing is user-specified to be either random (Monte Carlo) or deterministic. The random option can process any specified population or cohort; the deterministic option processes only cohorts. In this context, a population is a specified number of males and females whose age distributions and risk factor distributions are input as appropriate tab-delimited text files; a cohort is a text file of individuals specifying, for each individual, their initial state and medical history. The user options and necessary data files are specified in the application's simulation editor.

The user must also specify the set of diseases and the set of risk factors that are being simulated. Again, this is done via the appropriate application editor: the disease editor allows the construction and identification of a batch file of disease files; the simulation editor allows for the specification of the mix of risk factors and, where necessary, their distributions by age and gender. The simulation editor also provides the mechanism by which essential run parameters are specified: the start year, stop year, number of trials, and so on.

Individuals are processed one at a time from the simulation's start year until they either die or reach the simulation's stop year. In each simulated year they can either contract any mix of the modeled diseases which they do not yet have, achieve remission from any disease or disease stage they might have, die from any terminal disease they might have, or die from other causes. (Other causes are modeled as a single, instantly fatal, terminal disease; its incidence probability is constructed via the disease editor from the modeled diseases' mortality statistics and the appropriate national mortality statistics.)

Each run of the model requires the specification of a risk-factor scenario for each risk factor modeled. These scenarios can simply maintain risk-factor distributions at their start year values, or they can allow for the modeling of risk factor trends or medical advances resulting in the reduction of disease incidence or improvements in the survivability of specified diseases.

2.3.3 Tscenario C++ class

Scenarios are modeled as instances of the C++ Tscenario class and are constructed by the scenario editor, which is accessed via the simulation editor. Runs can be organized into batches, with different runs having different risk-factor scenarios. This allows for direct comparisons to be made, for example, what happens to life expectancy with or without improvements to the treatment of stroke.

An indication of the data fields and methods of the Tscenario class is provided in Table 6. The scenario objects are constructed from files that are created by the scenario editor.

Much of the input data (disease data, mortality data, demographic data, etc.) is typically changed on an annual basis. Such changes are easily accommodated and logged via the input editors: the disease, distribution, and simulation editors.

New diseases that are described by the current set of risk factors can be added to (or subtracted from) the simulation via the disease editor.

The model has essentially only two external software dependencies: its own C++ development environment and its host processor's operating system. The configuration was chosen for ease of its maintainability.

Table 6: The C++ Tscenario Class

| Tscenario | Description |
| --- | --- |
| Data field | |
| scenarioType | Type of scenario eg. {,smoking,… } |
| start year | Year at which scenario starts |
| stop year | Year at which scenario stops |
| futureRiskFile | File specifying future risk distribution |
| targetAgeGroup | Target age group eg. {18+} |
| targetGenderGroup | Target gender group eg. {males,females} |
| … | … |
| Method | |
| Tscenario(aFile) | Constructor using data from aFile |
| LoadFromFile(aFile) | Fills data fields from aFile |
| … | … |

References

1. Hoogenveen RT, van Baal PH, Boshuizen HC, Feenstra TL. Dynamic effects of smoking cessation on disease incidence, mortality and quality of life: The role of time since cessation. Cost Eff Resour Alloc. 2008.

2. Gardner JW, Sanborn JS. Years of Potential Life Lost (YPLL)-What Does it Measure? Epidemiology. 1990;1(4):322-9.

3. Health and Social Care Information Centre. Indicator Specification: CCG OIS 1.1 Potential Years of Life lost (PYLL) from causes considered amenable to healthcare. 2015. Available at: https://indicators.hscic.gov.uk/download/Clinical%20Commissioning%20Group%20Indicators/Specification/CCG_1.1_I00767_S.pdf.

4. Gold M, Siegel JE, Russell LB, Weinstein MC. Cost-effectiveness in Health and Medicine. New York: Oxford University Press; 1996.

5. Menzin J, Marton JP, Menzin JA, Willke RJ, Woodward RM, Federico V. Lost productivity due to premature mortality in developed and emerging countries: An application to smoking cessation. BMC Medical Research Methodology. 2012;12(1):1.

Appendix 2.
Results of the TaXSiM model: Ukraine Summary Cigarette Tax - Scenarios Output – 2015 – 2017 SCENARIO 2 SCENARIO 3 SCENARIO 4 SCENARIO 1 Baseline Situation (2017): Increase (2017): Increase (2017): Increase (2017): Ad valorem (2016): Ad valorem Ad valorem tax 30% the Ad va- Ad valorem and remains equal and Expected (12%) minimum Expected Expected (15%), and 30% Expected lorem, and 50% Expected specific tax (40%), Expected Actual 12% tax increase Government Revenue Type Contribution specific (8.515 Contribution Contribution Increase in the Contribution Increase in the Contribution adopting a simpli- Contribution 2015 in minimum spe- to GDP UAH) and simple to GDP to GDP minimum specific to GDP minimum specific to GDP fied tax structure to GDP cific excise (9.54 specific (6.365 excise (11.08 UAH), excise (12.77 UAH), with uniform UAH), and simple UAH) and simple specific and simple specific specific excise tax specific (7.13 UAH) (8.28 UAH) (9.55 UAH) (11.92 UAH) Total cigarettes taxed (billion 73.8 66.9 64.0 60.1 53.4 48.8 pieces) Average cigarette price (UAH per 15.2 19.2 21.2 24.7 32.9 41.4 pack) Average cigarette price (US$ per $0.63 $0.81 $0.87 $1.01 $ 1.35 $1.69 pack) * Average excise tax (UAH per 1000 308.9 430.7 482.6 573.0 825.8 1106.1 pieces) Total excise tax revenue (billion 22.8 1.0% 28.8 1.3% 30.9 1.2% 34.4 1.3% 44.1 1.7% 54.0 2.1% UAH) Total excise tax revenue (US$ $0.94 $1.21 $ 1.27 $1.41 $ 1.81 $ 2.21 billion) * Additional tobacco excise (billion 6.0 0.3% 2.1 0.1% 5.6 0.2% 15.3 0.6% 25.1 1.0% UAH) /percentage of GDP Additional tobacco excise (U$ $ 254 $85 $230 $ 626 $1,030 million) * Total government revenue (excise, 34.9 1.6% 42.1 1.8% 45.0 1.8% 49.9 1.9% 61.9 2.4% 73.9 2.9% VAT and levies, billion UAH) Total government revenue (excise, $1.4 $1.8 $1.8 $2.0 $ 2.5 $ 3.0 VAT and levies, US$ billion) * Total expenditure on cigarettes 56.3 64.3 67.9 74.2 88.0 100.9 (billion UAH) Percentage change in total ciga- -9.3 -4.3 -10.2 -20.2 -27.1 rette consumption 
(%) * World Bank Group forecast: Annual average exchange rate = 2016 (1US$/23.8 UAH); 2017 (1US$/24.4 UAH) 62 63 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine Subsequently, from the graphs included in papers on the Framingham study, Appendix 3. Adjustment of Epidemiological Input Data the biennial levels of new cases of CHD per 1000 were extracted, smoothed to for the Microsimulation Model of the Health Impacts of five-year intervals, and multiplied by 50 to switch to rates per year per 100000. Tobacco Taxation in Ukraine Then, average unweighted incidence for people aged 35-84 was calculated, the coefficients were determined to adjust the incidence to the expected value based on mortality levels (1.6 for men and 1.9 for women), and the The goal of this annex is to summarize research findings relevant to the expected incidence was calculated. It was assumed that people older than 84 microsimulation model, including data on the incidence of tobacco-use- years have the same incidence as people aged 80-84. In Table 3.2, columns related diseases, relative risks of their development in smokers and former which correspond to age groups 0-34 are not shown, as their incidence of smokers compared with never smokers, and related risk of premature death. CHD is assumed to be 0. Below, diseases included in the model are considered and, for each of them, Table 3.2: Calculation of Estimated Incidence of CHD in Ukraine By Gender And Age, per 100 000 extracts are shown from studies related to incidence, relative risks, and Population per Year mortality. Based on the published evidence and statistics available for Ukraine, estimates of incidence and relative risks were elaborated and suggested as Age groups Coefficients inputs for the microsimulation model. 
35-39 40-44 45-49 50-54 55-59 60-64 65-69 70-74 75-79 80-84 >84 As estimates from the Global Burden of Disease database became available Framingham during our research (Global Burden of Disease, 2016), we utilized incidence biennial per inputs from this database. Estimates for relative risks were used, as explained 1000 below. male 9 9 21 21 40 40 48 48 52 52 3.1 Cardiovascular Diseases, Particularly Coronary Heart Disease (CHD) female 1 1 7 7 20 20 26 26 46 46 Smoothed 3.1.1 Incidence Because there are no proper estimates of CHD incidence in Ukraine, and most male 9 13 17 27 34 43 45 49 51 52 studies report only relative risks and no absolute risks, data were adapted from female 1 3 5 11 16 22 24 33 39 46 the seminal study on ischemic heart disease – the Framingham study (Castelli, 1984; Lerner & Kannel, 1986). Per year per 1000000 The Framingham study reports average morbidity and mortality for men and male 450 650 850 1367 1683 2133 2267 2467 2533 2600 1700 women. It was calculated that the number of CHD cases is about five times female 50 150 250 567 783 1100 1200 1633 1967 2300 1000 greater than the number of CHD deaths. SDR for CHD in Ukraine was taken Multiplied for the years 2012-2014 and multiplied by the coefficient to estimate the for Ukraine approximate level of average incidence. male 735 1062 1389 2233 2750 3485 3703 4030 4139 4248 4248 1.6 Table 3.1: Calculation of Estimated CHD Morbidity for Ukraine (per 100 000) female 93 280 466 1056 1460 2051 2237 3045 3666 4288 4288 1.9 Morbidity/ Framingham Ukraine mortality morbidity mortality ratio morbidity mortality Males 28.7 6.2 4.6 2777 600 Females 14.5 2.8 5.2 1864 360 64 65 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine 3.1.2 Relative risk 16 Peer-reviewed literature shows that younger smokers have a much greater relative risk of contracting CHD than older ones. 
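Returning briefly to the incidence adjustment of Section 3.1.1, the arithmetic behind Tables 3.1 and 3.2 can be retraced with a short C++ sketch. This is an illustration only, not the report's own code: the function names are hypothetical, and the three-point moving average is an assumption (the text says only that the series was "smoothed"), chosen because it reproduces the smoothed rows of Table 3.2. The scaling coefficient is taken here as the expected Ukrainian incidence (Ukrainian mortality times the Framingham morbidity/mortality ratio) divided by the unweighted mean Framingham incidence, which gives roughly 1.63 for men and 1.86 for women, in line with the rounded 1.6 and 1.9 quoted in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Three-point moving average; the first and last values are kept unchanged.
// ASSUMPTION: the report does not name its smoothing rule. This one is
// inferred because it reproduces the "Smoothed" rows of Table 3.2.
std::vector<double> smoothThreePoint(const std::vector<double>& raw) {
    assert(raw.size() >= 3);
    std::vector<double> smoothed(raw);
    for (std::size_t i = 1; i + 1 < raw.size(); ++i) {
        smoothed[i] = (raw[i - 1] + raw[i] + raw[i + 1]) / 3.0;
    }
    return smoothed;
}

// Hypothetical helper retracing Table 3.2 for one sex.
// Input: Framingham biennial CHD cases per 1000 by five-year age group,
// the Framingham morbidity and mortality figures from Table 3.1, and the
// Ukrainian CHD mortality per 100 000.
// Output: estimated Ukrainian CHD incidence per 100 000 per year by age group.
std::vector<double> estimateUkraineIncidence(
        const std::vector<double>& framinghamBiennialPer1000,
        double framinghamMorbidity,
        double framinghamMortality,
        double ukraineMortalityPer100000) {
    assert(framinghamMortality > 0.0);

    // Step 1: smooth, then convert biennial per 1000 to annual per 100 000
    // (divide by 2 years, multiply by 100), i.e. multiply by 50.
    std::vector<double> annual = smoothThreePoint(framinghamBiennialPer1000);
    for (double& rate : annual) {
        rate *= 50.0;
    }

    // Step 2: unweighted mean incidence over the age groups supplied.
    const double meanIncidence =
        std::accumulate(annual.begin(), annual.end(), 0.0) /
        static_cast<double>(annual.size());

    // Step 3: expected Ukrainian incidence from mortality and the
    // Framingham morbidity/mortality ratio (Table 3.1).
    const double expectedIncidence =
        ukraineMortalityPer100000 * framinghamMorbidity / framinghamMortality;

    // Step 4: rescale every age group by the resulting coefficient.
    const double coefficient = expectedIncidence / meanIncidence;
    for (double& rate : annual) {
        rate *= coefficient;
    }
    return annual;
}
```

Called with the male Framingham row {9, 9, 21, 21, 40, 40, 48, 48, 52, 52}, morbidity 28.7, mortality 6.2, and Ukrainian mortality 600, the function returns values that round to 735 for ages 35-39 and 4248 for ages 80-84, matching the "Multiplied for Ukraine" row of Table 3.2.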
Relative to never smokers, CHD risk among current smokers was highest in the youngest and lowest in the oldest participants. For example, among women aged 40 to 49 years, the hazard ratio was 8.5 (95% confidence interval [CI] = 5.0, 14), while it was 3.1 (95% CI = 2.0, 4.9) among those aged 70 or older. The largest absolute risk differences between current smokers and never smokers were observed among the oldest participants. Finally, the majority of CHD cases among smokers were attributable to smoking. For example, attributable proportions of CHD by age group were 88% (40-49 years), 81% (50-59 years), 71% (60-69 years), and 68% (70 years and older) among women who smoked (Tolstrup et al., 2014).

Graphs from this systematic review which display RR by age groups are shown below.

[Figure 3.1: RRs of Developing CHD among Female Smokers Compared with Non-Smokers by Age Groups, Results from a Systematic Review (Tolstrup et al., 2014). Vertical axis: Relative Risk (ticks at 1, 2, 4, 8, 16); horizontal axis: Age Group, Years (40-49, 50-59, 60-69, >70).]

[Figure 3.2: RRs of Developing CHD among Male Smokers Compared with Non-Smokers by Age Groups, Results from a Systematic Review (Tolstrup et al., 2014). Vertical axis: Relative Risk (ticks at 1, 2, 4, 8, 16); horizontal axis: Age Group, Years (40-49, 50-59, 60-69, >70).]

A population-based prospective cohort study of 19 782 men and 21 500 women aged 40-59 years between 1990-1992 and 2001 was conducted to examine the relationship between smoking status and the risk of CHD. A total of 260 incident cases of CHD were confirmed among men, including 174 myocardial infarctions (MI). The numbers among women were 66 and 43, respectively. The multivariate relative risk [95% confidence interval (CI)] for current smokers versus never-smokers in men, after adjustment for cardiovascular risk factors and several lifestyle factors, was 2.85 (1.98, 4.12) for total CHD and 3.64 (2.27, 5.83) for MI. These respective risks in women were 3.07 (1.48, 6.40) and 2.90 (1.18, 7.18). Among men, a dose-dependent relationship was observed between the number of cigarettes and the risk of MI. The population-attributable risk percent (95% CI) of CHD was 46% (34, 55) in men and 9% (0, 18) in women. Smoking cessation, however, led to a rapid decline in the CHD risk within 2 years (Baba et al., 2006).

Smoking fewer cigarettes/day for a longer duration was more deleterious than smoking more cigarettes/day for a shorter duration (P < 0.01). For 50 pack-years (365,000 cigarettes), estimated RRs of CVD were 2.1 for accrual at 20 cigarettes/day and 1.6 for accrual at 50 cigarettes/day (Lubin et al., 2016).

3.1.2.1 Women versus men

A systematic review and meta-analysis of prospective cohort studies with data for 3 912 809 individuals and 67 075 coronary heart disease events from 86 prospective trials concluded as follows: In 75 cohorts (2.4 million participants) that adjusted for cardiovascular risk factors other than coronary heart disease, the pooled adjusted female-to-male relative risk ratio (RRR) of smoking compared with not smoking for coronary heart disease was 1.25 (95% CI 1.12-1.39, p<0.0001). This outcome was unchanged after adjustment for potential publication bias, and there was no evidence of important between-study heterogeneity (p=0.21). The RRR increased by 2% for every additional year of study follow-up (p=0.03). In pooled data from 53 studies, there was no evidence of a sex difference in the RR between participants who had previously smoked compared with those who never had (RRR 0.96, 95% CI 0.86-1.08, p=0.53) (Huxley & Woodward, 2011).

3.1.2.2 Effects of quitting smoking

In a cohort of 475 734 Korean men aged 30 to 58 years, compared with non-reducing heavy smokers (>= 20 cigarettes/d), those who quit smoking showed significantly lower risks of MI with hazard ratios (95% confidence intervals [CI]) of 0.43 (0.34 to 0.53) (Song & Cho, 2008).

3.1.2.3 Suggested RR for smokers compared with non-smokers

Based on the above results from (Tolstrup et al., 2014) and (Song & Cho, 2008), the updated risk ratios might be as follows.

Table 3.3: Suggested Input Risk Ratios of Developing CHD among Smokers Compared to Never Smokers in Ukraine, by Gender and Age

          Age groups
Gender    35-40   45-50   50-55   55-60   60-65   >65
men       5.0     4.0     3.0     3.0     2.0     2.0
women     8.5     8.5     6.6     4.8     3.4     3.1

3.1.3 Mortality

In the Greek cohort study (Notara et al., 2015), which observed 10-year Acute Coronary Syndrome (ACS) prognosis among 2172 cardiovascular patients, patients with >60 pack-years of smoking had 57.8 % higher ACS mortality and 24.6 % higher risk for any ACS event. A nested model, adjusted only for age and sex, revealed that, for every 30 pack-years of smoking increase, the associated ACS risk increased by 13 % (95 % CI 1.03, 1.30, p = 0.001).

Smoking is a strong independent risk factor for cardiovascular events and mortality even at older age, advancing cardiovascular mortality by more than five years, and demonstrating that smoking cessation in these age groups is still beneficial in reducing the excess risk. Random effects meta-analysis of the association of smoking status with cardiovascular mortality (based on the data of 503 905 participants aged 60 and older, of whom 37 952 died from cardiovascular disease) yielded a summary hazard ratio of 2.07 (95% CI 1.82 to 2.36) for current smokers and 1.37 (1.25 to 1.49) for former smokers compared with never smokers. Corresponding summary estimates for risk advancement periods were 5.50 years (4.25 to 6.75) for current smokers and 2.16 years (1.38 to 2.39) for former smokers. The excess risk in smokers increased with cigarette consumption in a dose-response manner and decreased continuously with time since smoking cessation in former smokers (Mons et al., 2015).

In Sweden and Estonia, a 13-year follow-up regarding all-cause and cardiovascular mortality revealed that smoking and, to a lesser extent, plasma levels of interleukin-6 were significant predictors of CVD and non-CVD mortality in men, but none of the other conventional risk factors reached statistical significance (Jensen-Urstad, Viigimaa, Sammul, Lenhoff, & Johansson, 2014).

In a large prospective cohort of women (Sandhu et al., 2012) without coronary heart disease at baseline (among 101 018 women participating in the Nurses' Health Study), a strong dose-response relationship between cigarette smoking and sudden cardiac death (SCD) risk was observed, and smoking cessation significantly reduced and eventually eliminated excess SCD risk. Compared with never smokers, current smokers had a 2.44-fold (95% CI, 1.80-3.31) increased risk of SCD after controlling for coronary risk factors. In multivariable analyses, the quantity of cigarettes smoked daily (P value for trend, <0.0001) and smoking duration (P value for trend, <0.0001) were linearly associated with SCD risk among current smokers. Small-to-moderate amounts of cigarette consumption (1-14 per day) were associated with a significant 1.84-fold (95% CI, 1.16-2.92) increase in SCD risk and every 5 years of continued smoking was associated with an 8% increase in SCD risk (hazard ratio, 1.08; 95% CI, 1.05-1.12; P<0.0001). The SCD risk linearly decreased over time after quitting and was equivalent to that of a never-smoker after 20 years of cessation (P value for trend, <0.0001).

3.1.3.1 Effects of quitting smoking

A systematic review was conducted to determine the magnitude of risk reduction achieved by smoking cessation in patients with CHD. The researchers estimated a 36% reduction in crude relative risk (RR) of mortality for patients with CHD who quit, compared with those who continued smoking (RR, 0.64; 95% confidence interval [CI], 0.58-0.71) (Critchley & Capewell, 2003).

3.2 COPD

3.2.1 Incidence among the population

Incident cases of COPD in a population-based prospective 9-year study in Sao Paulo, Brazil, ranged from 1.4% to 4.0%, depending on the diagnostic criterion used (Moreira et al., 2015).

In the Rotterdam Study (Terzikhan et al., 2016), the overall incidence rate (IR) was higher in men (13.3/1000 PY, 95 % CI 12.4–14.3) than in women (6.1/1000 PY, 95 % CI 5.6–6.6); age-specific IR ranged between 8.7 and 17.6/1000 PY in males and 3.0–7.9/1000 PY in females. The incidence of COPD increased from the age of 45 in both sexes to the age of 80 in men and 75 in women (Figure 3.3).

[Figure 3.3: Age-Specific Incidence of COPD by Sex and Age, Drawn from (Terzikhan et al., 2016). Vertical axis: Incidence of COPD/1,000 PY (0 to 20); horizontal axis: Age categories (45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, >80).]

Although some studies report either relative risks of COPD among people in older age groups or mention that incidence increases after age 45 (Terzikhan et al., 2016), this does not mean that COPD is only occurring among people over 45. A systematic review on global burden of COPD (Halbert et al., 2006) reports pooled COPD prevalence at the level of 3.1% (1.8–5.0) among people younger than 40. Additionally, the WHO global report on mortality attributable to tobacco (World Health Organization, 2012) reports tobacco-related mortality starting from 30 years of age and estimates that 39% of COPD deaths among people aged 30-44 are attributable to tobacco.

The only study which reported the incidence of COPD by age and sex was conducted in Japan (Kojima et al., 2007). Its findings were used to estimate incidence by age groups in Ukraine.

Table 3.4: Extract on the Incidence of COPD by Age and Sex in Japan (Kojima et al., 2007)

              Males                                                  Females
Age           n        Number of          Incidence rate             n        Number of          Incidence rate
(years)                incidence cases    (per 100 person-years)              incidence cases    (per 100 person-years)
                       for COPD                                               for COPD
Total         11,160   387                0.81                       5,946    79                 0.31
25-29         94       2                  0.62                       36       0                  0.00
30-34         625      7                  0.31                       181      1                  0.16
35-39         1,609    24                 0.35                       712      4                  0.13
40-44         1,973    45                 0.47                       1,161    10                 0.18
45-49         2,153    65                 0.61                       1,279    12                 0.19
50-54         1,879    90                 1.05                       1,157    21                 0.42
55-59         1,729    74                 1.25                       1,002    12                 0.35
60-64         745      39                 1.67                       264      9                  1.02
65-69         252      24                 2.75                       109      7                  1.69
70-74         101      17                 4.95                       45       3                  2.05

Another study conducted in Japan (Fukuchi et al., 2004) reported the prevalence of COPD among adults: 10.9% altogether, 16.4% among men and 5.0% among women.

These data on incidence and prevalence were considered in order to obtain extrapolated estimates for Ukraine. However, the only disease occurrence indicator available for Ukraine is the prevalence of COPD from the WHO European Health for All database (HFADB), which reports a level of 3.7-3.9% in 2005-2015.

However, studies aimed at COPD measurement conducted, for instance, in Norway (Johannessen, Omenaas, Bakke, & Gulsvik, 2005) found that about half of COPD cases remain undiagnosed. Another study (Nielsen, 2009) projected COPD prevalence to be 15-25% of the adult population. Yet the prevalence of COPD in Norway reported in HFADB is 0.2%.

Additionally, there is a recognized discrepancy in COPD prevalence across different countries and various studies. This is believed to be determined by the methods and definitions used to measure disease (Halbert, Isonaka, George, & Iqbal, 2003). Prevalence in most countries where proper measures were conducted was found to be between 4% and 10%.
Thus, there are no serious grounds to expect that the incidence of COPD in Ukraine should be lower than in Japan. Data on COPD incidence from the Japan study (Kojima et al., 2007) are suggested for use.

Table 3.4: Suggested Incidence Rates for COPD (per 100 person-years, as in Kojima et al., 2007)

          Age groups
          25-29  30-34  35-39  40-44  45-49  50-54  55-59  60-64  65-69  70-74
Men       0.62   0.31   0.35   0.47   0.61   1.05   1.25   1.67   2.75   4.95
Women     0.00   0.16   0.13   0.18   0.19   0.42   0.35   1.02   1.69   2.05

3.2.2 Risk of COPD in smokers

The incidence rate (IR) (Terzikhan et al., 2016) was higher in current and former smokers than in never smokers (19.7/1000 PY, 95 % CI 18.1–21.4 in current smokers, 8.3/1000 PY, 95 % CI 7.6–9.1 in former smokers and 4.1/1000 PY, 95 % CI 3.6–4.7, in never smokers). The IR of COPD in smoking men was 15.0/1000 PY (95 % CI 13.9–16.2), compared to 8.6/1000 PY (95 % CI 7.8–9.5) in smoking women. The age-specific IR of COPD in ever smokers ranged between 7.3 and 15.3/1000 PY. The IR was 6.0/1000 PY (95 % CI 4.6–7.8) in never-smoking men and 3.7/1000 PY (95 % CI 3.1–4.3) in never-smoking women. The age-specific incidence of COPD in never smokers increased by age, but to a lesser extent than the incidence of COPD in ever smokers (Figure 3.5).

[Figure 3.5: Age-Specific Incidence of COPD by Smoking Status, Extracted from (Terzikhan et al., 2016). Series: Current smokers, All, Former smokers, Never smokers. Vertical axis: Incidence of COPD/1,000 PY (0 to 30); horizontal axis: Age categories (45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, >80).]

The abovementioned study from Norway found that adjusted odds ratios (OR) for current smokers and ex-smokers were 9.6 (95% CI 3.6-25.2) and 5.0 (95% CI 1.8-13.8), compared to never smokers (Johannessen et al., 2005).

The comparisons of risks from these two studies were extracted for the model.

Table 3.5: Suggested Estimates for COPD RR in Smokers and Former Smokers, Compared with Never Smokers

              Age groups
              35-40  45-50  50-55  55-60  60-65  65-69  70-74  75+
Smokers       9.6    8.4    6.6    5.0    4.6    4.5    4.9    5.0
Ex smokers    5.0    4.7    4.2    3.8    4.0    4.5    4.8    5.0

The suggested RRs are equal for men and women. With regard to various tobacco-related diseases, some researchers have reported that the risk of contracting these diseases is greater in women than in men, while others found that the risk is identical. In Danish cohorts (Prescott, Bjerg, Andersen, Lange, & Vestbo, 1997), it was seen that risk associated with pack-years was higher in females than in males. As women smokers in Ukraine on average smoke fewer cigarettes, equal RRs for men and women can be considered justified.

3.3 Hypertension

3.3.1.1 Effects of smoking

While some authors find an association between tobacco smoking and high blood pressure (Tesfaye, Byass, & Wall, 2009; Tesfaye, Byass, Wall, Berhane, & Bonita, 2008), it is necessary to distinguish between short-term, acute hypertensive effects and the long-term risk of developing chronic hypertension.

Cigarette smoking acutely exerts a hypertensive effect, mainly through the stimulation of the sympathetic nervous system. As regards the impact of chronic smoking on blood pressure, available data do not provide evidence of a direct causal relationship between these two cardiovascular risk factors (Poulter, 2002), a concept supported by the evidence that lower blood pressure values have not been observed after chronic smoking cessation (Virdis, Giannarelli, Neves, Taddei, & Ghiadoni, 2010). Though the prevalence of hypertension was higher in former smokers than in never smokers (13.5 versus 8.8%, P < 0.001) and the risk of hypertension was higher [odds ratio (OR) 1.31 (1.13-1.52), P < 0.001] in former smokers than in never smokers (Halimi et al., 2002), these findings were from a cross-sectional study, and no grounds for cause-and-effect association are found in this regard (Poulter, 2002).

3.3.1.1 Effects of quitting smoking

The effect of smoking cessation on the risk of developing hypertension (HPT) and on blood pressure (BP) values was studied in a longitudinal study, with a follow-up period of 8 years, which included the participants of the Olivetti Heart Study. These were 430 untreated normotensive non-diabetic men with normal renal function (D'Elia et al., 2014). After 8 years of follow-up, BP changes (delta) were significantly lower in ex-smokers than in smokers (delta SBP/DBP: 12.6 +/- 13.4/7.9 +/- 8.1 vs. 16.0 +/- 14.9/10.3 +/- 10.1 mm Hg; P < 0.05; M +/- SD), also after adjustment for potential confounders. Moreover, at the last examination, the overall HPT prevalence was 33%, with lower values in ex-smokers than in smokers (25 vs. 38%, P = 0.01). After accounting for age, BP and BMI at baseline, and changes in smoking habits over the 8-year period, ex-smokers still had significantly lower risk of HPT than smokers (odds ratio 0.30, 95% confidence interval 0.15-0.58; P < 0.01).

Taking into account contradictory data on the impact of tobacco smoking on developing hypertension, we decided to exclude hypertension from the list of diseases modeled in the microsimulation of tobacco-related health impacts.

3.4 Lung Cancer

3.4.1 Relative risk

3.4.1.1 Effects of smoking

In the Seoul Male Cancer Cohort Study (SMCC), which included 14 272 men, cigarette smoking was associated with 4.18-fold risk of lung cancer in Korean men (Bae et al., 2007).

3.4.1.1 Women versus men

Analysis was conducted on the data of 279 214 men and 184 623 women from eight states in the USA, aged 50-71 years at study baseline, participating in the NIH-AARP Diet and Health study. Findings revealed that incidence rates were 20.3 (95% CI 16.3-24.3) per 100 000 person-years in men who had never smoked (99 cancers) and 25.3 (21.3-29.3) in women who had never smoked (152 cancers); for this group, the adjusted hazard ratio for lung cancer was 1.3 (1.0-1.8) for women compared with men.

Smoking was associated with increased risk of lung cancer in men and women. The incidence rate of current smokers who smoked more than two packs per day was 1259.2 (1035.0-1483.3) in men and 1308.9 (924.2-1693.6) in women. In current smokers, in a model adjusted for typical smoking dose, the HR was 0.9 (0.8-0.9) for women compared with men.

For former smokers, in a model adjusted for years of cessation and typical smoking dose, the HR was 0.9 (0.9-1.0) for women compared with men. Incidence rates of adenocarcinoma, small-cell carcinoma, and undifferentiated tumors were similar in men and women; incidence rates of squamous tumors in men were about twice those in women. These findings suggest that women are not more susceptible than men to the carcinogenic effects of cigarette smoking in the lung. In smokers, incidence rates tended to be higher in men than women with comparable smoking histories, but differences were modest; smoking was strongly associated with lung cancer risk in both men and women (Freedman, Leitzmann, Hollenbeck, Schatzkin & Abnet, 2008).

3.4.2 Mortality

3.4.2.1 Effects of smoking

In the Japan Collaborative Cohort (JACC) Study, with 45 010 males and 55 724 females aged 40-79 years, 52.2% and 14.8% of lung cancer deaths in males were attributable to current and former cigarette smoking, respectively. In females, the corresponding figures were 11.8% and 2.8%. Among current male smokers, the relative risk was strongly correlated with the intensity and duration of cigarette smoking. In contrast, the population attributable risk (PAR) was associated with an intermediate level of smoking except for the years of smoking: the largest PARs were observed in those with 20-29 cigarettes per day, 40-59 pack-years and 20-22 years old at smoking inception. Absolute risks were estimated to increase with age and duration of smoking and not to decrease even after cessation (Ando et al., 2003).

3.4.1.1 Effects of quitting smoking

Pooled data from three large-scale cohort studies in Japan were used to evaluate the impact of smoking cessation on the decrease in risk of lung cancer death in male ex-smokers by age at quitting. For simplicity, subjects were limited to male never smokers and former or current smokers who started smoking at ages 18-22 years. 110 002 men aged 40-79 years at baseline were included. During the mean follow-up of 8.5 years, 968 men died from lung cancer. The mortality rate ratio compared to current smokers decreased with increasing attained age in men who stopped smoking before age 70 years. Among men who quit in their fifties, the cohort-adjusted mortality rate ratios (95% confidence interval) were 0.57 (0.40-0.82), 0.44 (0.29-0.66) and 0.36 (0.13-1.00) at attained ages 60-69, 70-79 and 80-89 years, respectively. The corresponding figures for those who quit in their sixties were 0.81 (0.44-1.48), 0.60 (0.43-0.82) and 0.43 (0.21-0.86). Overall, the mortality rate ratio for current smokers, relative to nonsmokers, was 4.71 (95% confidence interval 3.76-5.89) and those for ex-smokers who had quit smoking 0-4, 5-9, 10-14, 15-19, 20-24 and >= 25 years before were 3.99 (2.97-5.35), 2.55 (1.80-3.62), 1.87 (1.23-2.85), 1.21 (0.66-2.22), 0.76 (0.33-1.75) and 0.67 (0.34-1.32), respectively. Although earlier cessation of smoking generally resulted in a lower rate of lung cancer mortality in each group of attained age, the absolute mortality rate decreased appreciably after stopping smoking even in men who quit at ages 60-69 years (Wakai et al., 2007).

3.5 Peripheral arterial disease (PAD)

In a meta-analysis of the association between cigarette smoking and PAD, the pooled OR for current smokers was 2.71 (95% CI 2.28 to 3.21); for ex-smokers, the pooled OR was 1.67 (95% CI 1.54 to 1.81). The magnitude of the association is greater than that reported for coronary heart disease. The risk is lower among ex-smokers but, nonetheless, significantly increased compared with never smokers (Lu, Mackay, & Pell, 2014).

3.5.1 Any stroke

3.5.1.1 Effects of smoking

In a meta-analysis on the possible risks of stroke from cigarette smoking (Shinton & Beevers, 1989), the overall relative risk of stroke associated with cigarette smoking was 1.5 (95% confidence interval 1.4 to 1.6). Considerable differences were seen in relative risks among the subtypes: cerebral infarction 1.9, cerebral hemorrhage 0.7, and subarachnoid hemorrhage 2.9. An effect of age on the relative risk was also noted: less than 55 years 2.9, 55-74 years 1.8, and greater than or equal to 75 years 1.1. A dose response between the number of cigarettes smoked and relative risk was noted, and there was a small increased risk in women compared with men. Ex-smokers under the age of 75 seemed to retain an appreciably increased risk of stroke (1.5); for all ages, the relative risk in ex-smokers was 1.2.

In a prospective study (Wannamethee, Shaper, Whincup, & Walker, 1995) of cardiovascular disease and its risk factors, 7735 men aged 40 through 59 years were drawn at random from the age-sex registers of one general practice in each of 24 British towns from 1978 through 1980 (the British Regional Heart Study). During the 12.75 years of follow-up, there were 167 major stroke events (43 fatal and 124 non-fatal) in the 7264 men with no recall of previous ischemic heart disease or stroke. After full adjustment for other risk factors, current smokers had a nearly fourfold relative risk (RR) of stroke compared with never smokers (RR, 3.7; 95% confidence interval [CI], 2.0 to 6.9).

Ex-smokers showed lower risk than current smokers, but showed excess risk compared with never smokers (RR, 1.7; 95% CI, 0.9 to 3.3; P = .11); those who switched to pipe or cigar smoking showed a significantly increased risk (RR, 3.3; 95% CI, 1.6 to 7.1), similar to that of current light smokers. Primary pipe or cigar smokers also showed increased risk (RR, 2.2; 95% CI, 0.6 to 8.0), but the number of subjects involved was small. The benefit of giving up smoking completely was seen within five years of quitting, with no further consistent decline in risk thereafter, but this was dependent on the amount of tobacco smoked. Light smokers (< 20 cigarettes/d) reverted to the risk level of those who had never smoked. Heavy smokers retained a more than twofold risk compared with never smokers (RR, 2.2; 95% CI, 1.1 to 4.3). The age-adjusted RR of stroke in those who quit smoking during the first five years of follow-up (recent quitters) was reduced compared with continuing smokers (RR, 1.8; 95% CI, 0.7 to 4.6 vs. RR, 4.3; 95% CI, 2.1 to 8.8). The benefit of quitting smoking was observed in both normotensive and hypertensive men, but the absolute benefit was greater in hypertensive subjects. Thus, smoking cessation is associated with a considerable and rapid benefit in decreasing the risk of stroke, particularly in light smokers (< 20 cigarettes/d); a complete loss of risk is not seen in heavy smokers. Switching to pipe or cigar smoking confers little benefit, emphasizing the need for complete cessation of smoking. The absolute benefit of quitting smoking on risk of stroke is most marked in hypertensive subjects.

In the Japan Public Health Center-based Prospective Study on Cancer and Cardiovascular Disease (JPHC Study), relative risks (95% CIs) among men for current smokers compared with never-smokers, after adjustment for cardiovascular risk factors and public health center, were 1.27 (1.05 to 1.54) for total stroke, 0.72 (0.49 to 1.07) for intraparenchymal hemorrhage, 3.60 (1.62 to 8.01) for subarachnoid hemorrhage, and 1.66 (1.25 to 2.20) for ischemic stroke. The respective multivariate relative risks among women were 1.98 (1.42 to 2.77), 1.53 (0.86 to 4.25), 2.70 (1.45 to 5.02), and 1.57 (0.86 to 2.87). There was a dose-response relation between the number of cigarettes smoked and risks of ischemic stroke for men. A similar positive association was observed between smoking and risks of lacunar infarction and large-artery occlusive infarction, but not embolic infarction (Mannami et al., 2004).

3.5.1.1 Women versus men

In a systematic review and meta-analysis which aimed to estimate the effect of smoking on stroke in women compared with men (Peters, Huxley, & Woodward, 2013), with data from 81 prospective cohort studies that included 3 980 359 individuals and 42 401 strokes, the pooled multiple-adjusted RRR indicated a similar risk of stroke associated with smoking in women compared with men (RRR, 1.06 [95% confidence interval, 0.99-1.13]). In a regional analysis, there was evidence of a more harmful effect of smoking in women than in men in Western populations (RRR, 1.10 [1.02-1.18]), but not in Asian populations (RRR, 0.97 [0.87-1.09]). Compared with never-smokers, the beneficial effects of quitting smoking on stroke risk among former smokers were similar between the sexes (RRR, 1.10 [0.99-1.22]).

3.5.2 Ischemic stroke

Smoking is associated with an increased risk of ischemic stroke or CV death in the Atherosclerosis Risk in Communities (ARIC) Study, which comprised mostly middle-aged to young-old subjects (65-74 years), but not in the Cardiovascular Health Study (CHS), which comprised mostly middle-old or oldest-old (>= 75 years) adults with atrial fibrillation. Compared with never smokers, current smokers had a higher incidence of the composite endpoint in ARIC [HR: 1.65 (1.21-2.26)], but not in CHS [HR: 1.05 (0.69-1.61)] (Kwon et al., 2016).

3.5.2.1 Effects of quitting or reducing smoking

In a cohort of 475 734 Korean men aged 30 to 58 years, compared with non-reducing heavy smokers (>= 20 cigarettes/d), those who quit smoking showed significantly lower risks of ischemic stroke with hazard ratios (95% confidence intervals [CI]) of 0.66 (0.55 to 0.79). Compared with non-reducing heavy smokers, the risks of all strokes combined and MI among reducers tended to decrease, although the decrements were not statistically significant (Song & Cho, 2008).

3.5.3 Hemorrhagic stroke

3.5.3.1 Relative risk

3.5.3.1.1 Effects of smoking

In a study of incident cerebral microbleeds (CMBs), which are asymptomatic precursors of intracerebral hemorrhage, conducted among 2635 individuals aged 66 to 93 years from the population-based Age, Gene/Environment Susceptibility (AGES)-Reykjavik Study, relative risk for current smoking was 1.47 [95% CI, 1.11-1.94] (Ding et al., 2015).

3.5.1.1.1 Effects of quitting or reducing smoking

In a cohort of 475 734 Korean men aged 30 to 58 years, compared with non-reducing heavy smokers (>= 20 cigarettes/d), those who quit smoking showed significantly lower risks of subarachnoid hemorrhage with hazard ratios (95% confidence intervals [CI]) of 0.58 (0.38 to 0.90). For hemorrhagic stroke, quitters showed lower risk compared with heavy smokers, but the difference was not statistically significant (hazard ratio 0.82, 95% CI: 0.64 to 1.06). The risks of subarachnoid hemorrhage in those who reduced from moderate to light smoking tended to be lower than in non-reducing moderate (10 to 19 cigarettes/d) smokers (Song & Cho, 2008).

3.5.1.2 Suggested relative risks

Based on the above literature on cumulative risk of all strokes, the below RRs are suggested for the model.

Table 3.6: Estimates of All Strokes Relative Risk for Smokers and Ex-Smokers Compared to Never

3.6 All-Cause Mortality

3.6.1.1 Effects of smoking

In a large community-based prospective cohort study comprising 6209 Beijing adults (aged >= 40 years) studied for approximately eight years (1991-1999), the multivariable-adjusted HRs for all-cause mortality were 2.7 (95% confidence interval (CI): 1.56-4.69) in young adult smokers (40-50 years) and 1.31 (95% CI: 1.13-1.52) in old smokers (>50 years) (Li et al., 2016). Mortality differences (/10,000 person-years) were 15.99 (95% CI: 15.34-16.64) in the young and 74.61 (68.57-80.65) in the old. Compared with current smokers, the HRs of all-cause deaths for former smokers in younger and older adults were 0.57 (95% CI: 0.23-1.42) and 0.96 (95% CI: 0.73-1.26), respectively.

Among 20 033 individuals participating in the Health Effects of Arsenic Longitudinal Study (HEALS) in Bangladesh, cigarette/bidi smoking was positively associated with all-cause (HR 1.40, 95% CI 1.06-1.86) and cancer mortality (HR 2.91, 1.24-6.80), and there was a dose-response relationship between increasing intensity of cigarette/bidi consumption and increasing mortality. An elevated risk of death from ischemic heart disease (HR 1.87, 1.08-3.24) was associated with current cigarette/bidi smoking. Among women, the corresponding HRs were 1.65 (95% CI 1.16-2.36) for all-cause mortality and 2.69 (95% CI 1.20-6.01) for ischemic heart disease mortality. Cigarette/bidi smoking accounted for about 25.0% of deaths in men and 7.6% in women (Wu et al., 2013).

3.6.1.1 Effects of quitting smoking

Effects of quitting smoking on all-cause mortality were measured in a cohort of 1 494 Chinese people (961 men, 533 women) followed for 18 years (1976-1994) to assess changes in smoking behavior and then for an additional 17 years (1994-2011) to examine the relationships of continuing to smoke and new quitting with mortality risk. Ever smokers had increased risks of lung cancer, coronary heart disease, thrombotic stroke, and COPD, with dose-response relationships. For all tobacco-related mortality, the relative risk for new quitters compared with continuing smokers was 0.68 (95% confidence interval: 0.46, 0.99) for those who had quit two to seven years previously and 0.56 (95% confidence interval: 0.37, 0.85) for those who had quit eight years or more previously.
The corresponding relative risks were 0.69 and 0.45 for lung cancer, Smokers for the Microsimulation Model by Age Group, Both Genders 0.78 and 0.51 for coronary heart disease, 0.76 and 0.84 for thrombotic stroke, and 0.89 and 0.61 for COPD, respectively (He et al., 2014). Age groups 35-40 40-50 50-55 55-60 60-65 >65 Smokers 1.7 1.7 1.5 1.5 1.2 1.2 Ex smokers 4.3 3.7 2.9 1.8 1.8 1.2 78 79 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine In the Singapore Chinese Health Study, a cohort study of middle-aged and elderly Chinese in Singapore (n=48 251), compared with current smokers, the References adjusted HR (95% CI) for total mortality was 0.84 (0.76 to 0.94) for new quitters, 0.61 (0.56 to 0.67) for long-term quitters and 0.49 (0.46 to 0.53) for never- Ando, M., Wakai, K., Seki, N., Tamakoshi, A., Suzuki, K., Ito, Y., . . . Grp, J. S. (2003). smokers. New quitters had a 24% reduction in lung cancer mortality (HR: 0.76, Attributable and absolute risk of lung cancer death by smoking status: Findings 95% CI 0.57 to 1.00), and long-term quitters had a 56% reduction (HR: 0.44, from the Japan Collaborative Cohort Study. International Journal of Cancer, 95% CI 0.35 to 0.57). The risk for coronary heart disease mortality was reduced 105(2), 249-254. doi:10.1002/ijc.11043 in new quitters (HR: 0.84, 95% CI 0.66 to 1.08) and long-term quitters (HR: 0.63, 95% CI 0.52 to 0.77), although the result for new quitters was of borderline Baba, S., Iso, H., Mannami, T., Sasaki, S., Okada, K., Konishi, M., . . . Grp, J. S. (2006). significance due to the relatively small number of cardiovascular deaths. The Cigarette smoking and risk of coronary heart disease incidence among middle- risk for chronic pulmonary disease mortality was reduced in long-term quitters aged Japanese men and women: The JPHC Study Cohort I. European Journal but increased in new quitters. 
The authors concluded that significant reduction of Cardiovascular Prevention & Rehabilitation, 13(2), 207-213. doi:10.1097/01. in the risk of total mortality, specifically for lung cancer mortality, can be hjr.0000194417.16638.3d achieved within five years of smoking cessation (Lim, Tai, Yuan, Yu, & Koh, 2013). Bae, J. M., Lee, M. S., Shin, M. H., Kim, D. H., Li, Z. M., & Ahn, Y. O. (2007). Cigarette smoking and risk of lung cancer in Korean men: The Seoul male cancer cohort study. Journal of Korean Medical Science, 22(3), 508-512. Castelli, W. P. (1984). Epidemiology of coronary heart disease: The Framingham study. Am J Med, 76(2a), 4-12. Critchley, J. A., & Capewell, S. (2003). Mortality risk reduction associated with smoking cessation in patients with coronary heart disease: A systematic review. Journal of the American Medical Association, 290(1), 86-97. doi:10.1001/ jama.290.1.86 D’Elia, L., De Palma, D., Rossi, G., Strazzullo, V., Russo, O., Iacone, R., . . . Galletti, F. (2014). Not smoking is associated with lower risk of hypertension: Results of the Olivetti Heart Study. European Journal of Public Health, 24(2), 226-230. doi:10.1093/eurpub/ckt041 Ding, J., Sigurdsson, S., Garcia, M., Phillips, C. L., Eiriksdottir, G., Gudnason, V., . . . Launer, L. J. (2015). Risk factors associated with incident cerebral microbleeds according to location in older people: The Age, Gene/Environment Susceptibility (AGES)-Reykjavik Study. JAMA Neurology, 72(6), 682-688. doi:10.1001/ jamaneurol.2015.0174 Freedman, N. D., Leitzmann, M. F., Hollenbeck, A. R., Schatzkin, A., & Abnet, C. C. (2008). Cigarette smoking and subsequent risk of lung cancer in men and women: Analysis of a prospective cohort study. Lancet Oncology, 9(7), 649-656. doi:10.1016/51470-2045(08)70154-2 80 81 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine Fukuchi, Y., Nishimura, M., Ichinose, M., Adachi, M., Nagai, A., Kuriyama, T., . . 
Kwon, Y., Norby, F. L., Jensen, P. N., Agarwal, S. K., Soliman, E. Z., Lip, G. Y. H., . Zaher, C. (2004). COPD in Japan: The Nippon COPD Epidemiology Study. . . . Chen, L. Y. (2016). Association of smoking, alcohol, and obesity with Respirology, 9(4), 458-465. doi:10.1111/j.1440-1843.2004.00637.x cardiovascular death and ischemic stroke in atrial fibrillation: The Atherosclerosis Risk in Communities (ARIC) Study and Cardiovascular Health Study (CHS). PLoS Global Burden of Disease. (2016). Global Health Data Exchange. Retrieved from: One, 11(1), 13. doi:10.1371/journal.pone.0147065 http://ghdx.healthdata.org/gbd-results-tool Lerner, D. J., & Kannel, W. B. (1986). Patterns of coronary heart disease morbidity Halbert, R. J., Isonaka, S., George, D., & Iqbal, A. (2003). Interpreting COPD and mortality in the sexes: A 26-year follow-up of the Framingham population. prevalence estimates: What is the true burden of disease? Chest, 123(5), 1684- Am Heart J, 111(2), 383-390. 1692. Li, K. B., Yao, C. H., Di, X., Yang, X. C., Dong, L., Xu, L., & Zheng, M. L. (2016). Smoking Halbert, R. J., Natoli, J. L., Gano, A., Badamgarav, E., Buist, A. S., & Mannino, D. M. and risk of all-cause deaths in younger and older adults: A population-based (2006). Global burden of COPD: Systematic review and meta-analysis. Eur Respir J, prospective cohort study among Beijing adults in China. Medicine (Baltimore), 28(3), 523-532. doi:10.1183/09031936.06.00124605 95(3), 5. doi:10.1097/md.0000000000002438 Halimi, J. M., Giraudeau, B., Vol, S., Caces, E., Nivet, H., & Tichet, J. (2002). The risk of Lim, S. H., Tai, B. C., Yuan, J. M., Yu, M. M. C., & Koh, W. P. (2013). Smoking cessation hypertension in men: Direct and indirect effects of chronic smoking. Journal of and mortality among middle-aged and elderly Chinese in Singapore: The Hypertension, 20(2), 187-193. doi:10.1097/00004872-200202000-00007 Singapore Chinese health study. Tobacco Control, 22(4), 235-240. 
doi:10.1136/ tobaccocontrol-2011-050106 He, Y., Jiang, B., Li, L. S., Li, L. S., Sun, D. L., Wu, L., . . . Lam, T. H. (2014). Changes in smoking behavior and subsequent mortality risk during a 35-year follow-up Lu, L., Mackay, D. F., & Pell, J. P. (2014). Meta-analysis of the association between of a cohort in Xi’an, China. Am J Epidemiol, 179(9), 1060-1070. doi:10.1093/aje/ cigarette smoking and peripheral arterial disease. Heart, 100(5), 414-423. kwu011 doi:10.1136/heartjnl-2013-304082 Huxley, R. R., & Woodward, M. (2011). Cigarette smoking as a risk factor for Lubin, J. H., Couper, D., Lutsey, P. L., Woodward, M., Yatsuya, H., & Huxley, R. coronary heart disease in women compared with men: A systematic review R. (2016). Risk of cardiovascular disease from cumulative cigarette use and and meta-analysis of prospective cohort studies. Lancet, 378(9799), 1297-1305. the impact of smoking intensity. Epidemiology, 27(3), 395-404. doi:10.1097/ doi:10.1016/s0140-6736(11)60781-2 ede.0000000000000437 Jensen-Urstad, M., Viigimaa, M., Sammul, S., Lenhoff, H., & Johansson, J. (2014). Mannami, T., Iso, H., Baba, S., Sasaki, S., Okada, K., Konishi, M., . . . (2004). Cigarette Impact of smoking: All-cause and cardiovascular mortality in a cohort of smoking and risk of stroke and its subtypes among middle-aged Japanese men 55-year-old Swedes and Estonians. Scandinavian Journal of Public Health, 42(8), and women: The JPHC Study Cohort I. Stroke, 35(6), 1248-1253. doi:10.1161/01. 780-785. doi:10.1177/1403494814550177 STR.0000128794.30660.e8 Johannessen, A., Omenaas, E., Bakke, P., & Gulsvik, A. (2005). Incidence of GOLD- Mons, U., Muezzinler, A., Gellert, C., Schottker, B., Abnet, C. C., Bobak, M., defined chronic obstructive pulmonary disease in a general adult population. . . . Consortium, C. (2015). Impact of smoking and smoking cessation on Int J Tuberc Lung Dis, 9(8), 926-932. 
cardiovascular events and mortality among older adults: Meta-analysis of individual participant data from prospective cohort studies of the CHANCES Kojima, S., Sakakibara, H., Motani, S., Hirose, K., Mizuno, F., Ochiai, M., & consortium. British Medical Journal, 350, 12. doi:10.1136/bmj.h1551 Hashimoto, S. (2007). Incidence of chronic obstructive pulmonary disease, and the relationship between age and smoking in a Japanese population. J Moreira, G. L., Gazzotti, M. R., Manzano, B. M., Nascimento, O., Perez-Padilla, Epidemiol, 17(2), 54-60. R., Menezes, A. M. B., & Jardim, J. R. (2015). Incidence of chronic obstructive pulmonary disease based on three spirometric diagnostic criteria in Sao Paulo, Brazil: A nine-year follow-up since the PLATINO prevalence study. Sao Paulo Medical Journal, 133(3), 245-251. doi:10.1590/1516-3180.2015.9620902 82 83 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine Nielsen, R. (2009). Present and future costs of COPD in Iceland and Tesfaye, F., Byass, P., Wall, S., Berhane, Y., & Bonita, R. (2008). Association of Norway: Results from the BOLD study. Eur Respir J, 34(4), 850-857. smoking and khat (Catha edulis Forsk) use with high blood pressure among doi:10.1183/09031936.00166108 adults in Addis Ababa, Ethiopia, 2006. Prev Chronic Dis, 5(3), A89. Notara, V., Panagiotakos, D. B., Kouroupi, S., Stergiouli, I., Kogias, Y., Stravopodis, Tolstrup, J. S., Hvidtfeldt, U. A., Flachs, E. M., Spiegelman, D., Heitmann, B. L., P., . . . Investigators, G. S. (2015). Smoking determines the 10-year (2004-2014) Balter, K., . . . Feskanich, D. (2014). Smoking and risk of coronary heart disease prognosis in patients with Acute Coronary Syndrome: The GREECS observational in younger, middle-aged, and older adults. American Journal of Public Health, study. Tobacco Induced Diseases, 13, 9. doi:10.1186/s12971-015-0063-6 104(1), 96-102. doi:10.2105/ajph.2012.301091 Peters, S. A. E., Huxley, R. 
R., & Woodward, M. (2013). Smoking as a risk factor for Virdis, A., Giannarelli, C., Neves, M. F., Taddei, S., & Ghiadoni, L. (2010). Cigarette stroke in women compared with men: A systematic review and meta-analysis smoking and hypertension. Curr Pharm Des, 16(23), 2518-2525. of 81 cohorts, including 3 980 359 individuals and 42 401 strokes. Stroke, 44(10), 2821-2828. doi:10.1161/strokeaha.113.002342 Wakai, K., Marugame, T., Kuriyama, S., Sobue, T., Tamakoshi, A., Satoh, H., . . . Tsugane, S. (2007). Decrease in risk of lung cancer death in Japanese men after Poulter, N. R. (2002). Independent effects of smoking on risk of hypertension: smoking cessation by age at quitting: Pooled analysis of three large-scale cohort Small, if present. Journal of Hypertension, 20(2), 171-172. doi:10.1097/00004872- studies. Cancer Science, 98(4), 584-589. doi:10.1111/j.1349-7006.2007.00423.x 200202000-00002 Wannamethee, S. G., Shaper, A. G., Whincup, P. H., & Walker, M. (1995). Smoking Prescott, E., Bjerg, A. M., Andersen, P. K., Lange, P., & Vestbo, J. (1997). Gender cessation and the risk of stroke in middle-aged men. JAMA, 274(2), 155-160. difference in smoking effects on lung function and risk of hospitalization for COPD: Results from a Danish longitudinal population study. Eur Respir J, 10(4), World Health Organization. (2012). WHO global report on mortality attributable to 822-827. tobacco. Sandhu, R. K., Jimenez, M. C., Chiuve, S. E., Fitzgerald, K. C., Kenfield, S. A., Tedrow, Wu, F., Chen, Y., Parvez, F., Segers, S., Argos, M., Islam, T., . . . Ahsan, H. (2013). A U. B., & Albert, C. M. (2012). Smoking, smoking cessation, and risk of sudden prospective study of tobacco smoking and mortality in Bangladesh. PLoS One, cardiac death in women. Circulation-Arrhythmia and Electrophysiology, 5(6), 1091- 8(3), 11. doi:10.1371/journal.pone.0058516 1097. doi:10.1161/circep.112.975219 Shinton, R., & Beevers, G. (1989). Meta-analysis of relation between cigarette smoking and stroke. 
BMJ, 298(6676), 789-794. Song, Y. M., & Cho, H. J. (2008). Risk of stroke and myocardial infarction after reduction or cessation of cigarette smoking: A cohort study in Korean men. Stroke, 39(9), 2432-2438. doi:10.1161/strokeaha.107.512632 Terzikhan, N., Verhamme, K. M. C., Hofman, A., Stricker, B. H., Brusselle, G. G., & Lahousse, L. (2016). Prevalence and incidence of COPD in smokers and non- smokers: The Rotterdam Study. European Journal of Epidemiology, 31(8), 785-792. doi:10.1007/s10654-016-0132-z Tesfaye, F., Byass, P., & Wall, S. (2009). Population-based prevalence of high blood pressure among adults in Addis Ababa: Uncovering a silent epidemic. BMC Cardiovasc Disord, 9, 39. doi:10.1186/1471-2261-9-39 84 85 86 87