Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence Through Tobacco Taxation in Ukraine

Acknowledgements

This report was prepared under the overall coordination of Patricio V. Marquez, Lead Public Health Specialist, Health, Nutrition and Population Global Practice, World Bank Group, by a team comprised of:

Feng Zhao, Human Development Program Leader, Ukraine, Moldova, and Belarus, Ukraine Country Office, World Bank Group.

Olena Doroshenko, Health Economist, Health, Nutrition and Population Global Practice, World Bank Group.

Joy Townsend, Emeritus Professor of Economics and Primary Health Care, Department of Social and Environmental Health Research, London School of Hygiene and Tropical Medicine.

Laura Webber, Director, Public Health Modeling, UK Health Forum, London, UK, and Honorary Lecturer, School of Environmental Health, London School of Hygiene and Tropical Medicine.

Tatiana I. Andreeva, Associate Researcher, Alcohol and Drug Information Center (ADIC-Ukraine), Kiev, Ukraine, and Visiting Professor, Cluj School of Public Health, College of Political, Administrative and Communication Sciences, Babeș-Bolyai University, Cluj-Napoca, Romania.

Renzo Sotomayor, Health Specialist, Health, Nutrition and Population Global Practice, World Bank Group.

Abbygail Jaccard, Deputy Director, Public Health Modeling, UK Health Forum, London, UK.

Lise Retat, Mathematical Modeler, UK Health Forum, London, UK.

Michael Xu, Software Engineer, UK Health Forum, London, UK.

Support was provided by Oleksandra Griaznova, Ukraine Country Office, World Bank Group, and Akosua O. Dakwa, Health, Nutrition and Population Global Practice, World Bank Group.

Draft versions of the report were peer reviewed by:

Professor Prabhat Jha, University of Toronto Chair in Global Health and Epidemiology, Dalla Lana School of Public Health, and Executive Director, Centre for Global Health Research, St. Michael’s Hospital, Canada.

Sheila Dutta, Senior Health Specialist, Health, Nutrition and Population Global Practice, World Bank Group.

Comments, inputs, and advice were provided by:

Michal Stoklosa, Senior Economist, American Cancer Institute.

Konstantin Krasovsky, Head, Tobacco Control Unit, Institute for Strategic Research, Ministry of Health of Ukraine.

Alberto Gónima, Consultant, World Bank Group.

The report was edited by Alexander Irwin.

The preparation of this report was supported under the World Bank’s Global Tobacco Control Program, co-financed by the Bill and Melinda Gates Foundation and Bloomberg Foundation.

Kiev, London and Washington, DC, August 2016-February 2017.

Introduction

Smoking is a leading cause of preventable premature deaths. Smoking’s effects will continue to devastate lives in many countries, including Ukraine, if measures are not implemented to reduce its prevalence. Smoking is a major cause of many chronic diseases, such as cardiovascular disease, respiratory disease, and smoking-related cancers.

Over recent years, successful tobacco control policies in Ukraine have resulted in one of the fastest declines in smoking prevalence in the world (1). This is largely due to multifaceted tobacco control legislation, adopted from 2005 and subsequently upgraded. Ukraine ratified the WHO Framework Convention on Tobacco Control (FCTC) in 2006. Currently, Ukrainian legislation basically corresponds to FCTC requirements.

In 2005, Ukraine adopted a first tobacco-control law. Since then, several additional tobacco-control policies have been implemented in the country. Smoke-free policies supported by media campaigns have covered many workplaces and public places since the middle of 2006.
Under these policies, at least 50 percent of the area of restaurants and bars had to be isolated from the smoking area, so that tobacco smoke did not penetrate into smoke-free areas. This measure was supported by an intensive media campaign and public movement in favor of smoke-free policies. Many restaurants went completely smoke-free both before and after implementing this measure. As of December 2012, restaurants, workplaces, and other public places became 100 percent smoke-free. Designated smoking places, which figured in the legislation between 2006 and 2012, were abolished in the amended laws.

As of late 2006, cigarette packs carried textual warning labels covering 30 percent of their surface, in place of a previous warning which covered 10 percent of the front surface and stated: “Ministry of Health warns: Smoking is bad for your health.” Large graphic health-warning labels, covering 50 percent of the pack surface area, were introduced on tobacco packaging on October 4, 2012.

A ban on outdoor tobacco advertising since January 2009 was followed by a more comprehensive tobacco advertising ban, which entered into force on September 16, 2012.

In addition to the reduced tobacco affordability observed during the global economic recession, the average tax incidence was increased between August 2008 and July 2010 from 0.5 UAH (Ukrainian hryvnia, the national currency of Ukraine) to 3 UAH per cigarette pack. Further changes in tobacco tax rates were less substantial and were above inflation only in some years.

However, while the policies described were definitely beneficial, much progress remains to be made. As of late 2015, the prevalence of current smoking among men in Ukraine was 45 percent (2, 3), although prevalence is much lower in women, at 11 percent (2, 3). It is not a given that smoking trends will continue to decrease in Ukraine, unless effective tobacco control measures are sustained and strengthened.
Especially when the economy grows, commodities/luxuries such as cigarettes will become more affordable. Tobacco industry tactics can also become an important factor in determining the level of cigarette consumption. One of the mechanisms of this influence derives from the industry’s right to determine the maximum retail price of cigarettes and thus to manipulate the net-of-tax portion of the price.

In 2016, a new tax policy stipulated that the minimum specific tobacco tax increase by 40 percent. Thus, the retail price was expected to increase and the consumption of cigarettes to decrease. However, the actual level of cigarette consumption increased. This happened because tobacco companies, aiming to keep their customers, initiated “price wars.” This example illustrates that more factors are at play than are usually taken into account in weighing policy choices. Price and tax factors are extremely important and need to be considered when forecasting trends.

The present report provides evidence from a modeling exercise undertaken to predict the health and related cost impacts that may stem from the implementation of a tobacco excise tax increase in Ukraine. Impacts are calculated relative to the status quo before the tax hike, and are modeled, beginning in 2017, for 2025 and 2035.

A microsimulation model was employed to simulate the long-term impact of tobacco taxation on the future burden of a range of non-communicable diseases (NCDs). Specifically, the disease outcomes quantified were coronary heart disease (CHD), stroke, chronic obstructive pulmonary disease (COPD), and lung cancer. The microsimulation model has been deemed by the OECD the most relevant method for NCD modeling based on risk-factor data (4).

This report complements modeling work done to estimate the fiscal-revenue impact and expected reduction in consumption that might stem from proposed additional tobacco excise tax increases in Ukraine.
That work was carried out by the World Bank, using the Tobacco Tax Simulation Model (TaXSiM) developed by the World Health Organization (WHO) (5, 6).

Summary of results

Table 1 presents a summary of total disease cases (epidemiological) and costs (economic) by parameter, year, and scenario as rates per Ukraine population. The model estimated that by 2035 the specified tax increase would result in the avoidance of 126,730 new cases of smoking-related disease; 29,172 premature deaths; and 267,098 potential years of life lost, relative to no change in tax. These reductions in disease and death will avoid 1.5 billion UAH in healthcare costs and 16.5 billion UAH in premature mortality costs, respectively.

Table 1: Summary Table of the Outputs as Rates per Ukraine Population, by Year

Output | Year | Sc0 (Baseline) | Sc1
Epidemiological outputs
Cumulative incident cases | 2025 | 5480948 [±4237] | 5427558 [±4237]
Cumulative incident cases | 2035 | 11366868 [±5753] | 11255173 [±5753]
Cumulative incident cases avoided | 2025 | NA | 56224 [±6341]
Cumulative incident cases avoided | 2035 | NA | 126730 [±9123]
Incident cases per year | 2025 | 589035 [±1545] | 582341 [±1545]
Incident cases per year | 2035 | 646600 [±1545] | 640799 [±1727]
Attributable incident cases | 2025 | 218221 [±1121] | 208475 [±1121]
Attributable incident cases | 2035 | 222603 [±1041] | 211984 [±1041]
Cumulative premature deaths avoided | 2025 | NA | 6372
Cumulative premature deaths avoided | 2035 | NA | 29172
Cumulative potential years of life lost relative to baseline | 2025 | NA | 48923
Cumulative potential years of life lost relative to baseline | 2035 | NA | 267098
Economic outputs
Direct costs avoided (millions UAH) | 2025 | NA | 542.23
Direct costs avoided (millions UAH) | 2035 | NA | 1545.81
Cumulative premature mortality costs avoided (millions UAH) | 2025 | NA | 3568.4
Cumulative premature mortality costs avoided (millions UAH) | 2035 | NA | 16536.4

Summary of Methods

Methods

- The model simulates a virtual population of Ukraine, based on known population statistics.
- Initial smoking prevalence by age and sex was extracted from the 2015 Annual Household Survey conducted by the National Statistics Service of Ukraine.
- Scenarios took account of price impacts on uptake of smoking and cessation.
- Individuals within the model have a specified smoking status and a probability of contracting, dying from, or surviving a disease.
- Future prevalence of smoking is calculated based on the numbers of smokers and non-smokers who are still alive in a particular year.
- Data for disease incidence and mortality were extracted from the Global Burden of Disease database.
- Relative risks of contracting diseases in smokers compared to non-smokers were extracted from peer-reviewed literature.
- A five-module microsimulation model was used to predict the future health and economic impacts of smoking prevalence by 2035.
- The model quantifies the future impact on health and related costs of different levels of tax increase relative to a “no change” scenario.

Assumptions

- Smoking prevalence follows a static trend from 2015 estimates.
- A specified percentage of smokers who are affected by the tax increase move to the “never-smoker” category in 2017, in order to account for reductions in uptake due to price increases.
- If an individual’s smoking status is changed by the intervention, their smoking status will then remain fixed for the entire simulation.
- Time since cessation is included in the model to account for change in disease risk for an ex-smoker.
- Smokers react quickly to the tax: we modeled an immediate effect and then a linear trend, in line with TaXSiM.

Limitations

- No data on survival for the specified NCDs were available.
- Data on the percentages of ex-smokers in Ukraine are limited.
- The model does not take account of future changes in policy or technology.
- No change in secondhand smoke exposure is modeled.
- Baseline is static over time.
- The simulation only includes four smoking-related diseases, so results are likely underestimates of the true effects.
- No data on non-healthcare costs, e.g. lost productivity due to disease, were available.
- No data were available to explore differences by social groups.
- For the scenarios, smokers were moved to never smokers in order to account for change in uptake. This will overestimate the impact of the tax increases.
- No in-depth uncertainty analysis was conducted.

Full methods

Data Collection

Smoking Prevalence Data

Baseline smoking prevalence data were extracted from the 2015 Annual Household Survey, conducted by the National Statistics Service of Ukraine (7). Additional data on percentages of occasional smokers and ex-smokers were extrapolated from the omnibus surveys conducted by Kiev International Institute of Sociology (2, 3).

Data manipulation and assumptions

1. Daily versus current smokers: The Annual Household Survey provides prevalence data on daily smoking only, rather than current smoking. Current smoking data are preferable, since the WHO target is focused on a total reduction in smoking prevalence, rather than number of cigarettes smoked. Modeling proceeded with a focus on prevalence of current smoking. Pooled estimates from other smaller surveys with more detailed collection of smoking status data were collated, and the 2015 Household Survey data were adjusted to include estimated proportions of occasional smokers.

2. Ex-smokers: No data by age and sex were available for percentages of ex-smokers within the Annual Household Survey. However, some data were available from the recent omnibus surveys (2013-2015). Therefore, in order to take account of ex-smokers (who have a greater disease risk than never smokers), the distributions of ex-smokers and never smokers from the omnibus surveys were used to split the non-smoker data from the Annual Household Survey into ex-smokers and never smokers. This enabled us to initialize the model in the start year with a more accurate estimation of ex-smokers than would be done using proxy ex-smoker data.

3. Sample sizes: Often, sample sizes by age group were not presented. The total sample size was therefore proportioned across the five-year age groups and the variance increased based on data from the UN population prospects database (8).

4. Age groups: Data for some years were in wide age groups of more than 20 years (e.g., 30-59 years), therefore prevalence was assumed to be the same across these groups. Once raw data, more detailed data, or data by five-year age groups become available, the data can be updated.

Disease Data

For this study, the following smoking-related NCDs were modeled: coronary heart disease (CHD), stroke, lung cancer, and chronic obstructive pulmonary disease (COPD). Incidence and mortality data by age and sex were extracted from the Global Burden of Disease study database (9). Lung cancer data were grouped with trachea and bronchus data in the database, which slightly overestimates cases relative to Globocan (10). No survival data were available for these diseases in Ukraine, therefore survival data were calculated using DISMOD equations from the World Health Organization (11).

Relative risks (RR) for smokers compared to non-smokers were extracted from prospective cohort studies which observed the development of CHD (12-17), COPD (17-21), lung cancer (17, 21-23), and stroke (24-26). As various cohort studies usually observed participants of different age groups, their estimates were compared and combined to cover the modeled population. Thus, relative risks for various age groups may derive from different studies. However, if RRs for neighboring age groups from various studies differed substantially, some smoothing was undertaken. Appendix 3 describes the method of creating RRs in more detail.

For ex-smokers, RRs were assumed to decrease over time since cessation. The ex-smoker RR was computed using a decay function method developed by Hoogenveen and colleagues (27).
This function uses the current smoker RR for each disease as the starting point and then models the decline in relative risk of disease for an ex-smoker over time, as detailed in Appendix 1 of the supplementary appendix.

Health-Economic Data

Data on direct health care costs by disease were extracted from the literature (28), but no data on indirect, non-healthcare costs by disease were available. Data on direct health care costs were included in the model and the direct healthcare cost impacts output from the model.

It was possible to calculate premature mortality costs by including average annual income [footnote 1]. This accumulates the lost earnings due to death before age 65 to provide a different measure of lost productivity in terms of losses of GDP due to death. However, this does not take account of losses in productivity due to morbidity.

We carried out a sensitivity analysis on the costs, running the model with a discount rate of 5 percent, as is used in Russia (http://www.ispor.org/peguidelines/countrydet.asp?c=18&t=4). No discount rate was available for Ukraine.

Footnote 1: http://data.worldbank.org/indicator/NY.GDP.PCAP.CD?year_high_desc=true - average of 5 years taken from the World Bank and OECD national accounts data: $3320 per year.

Table 2: References for Disease Data

Disease | Incidence | Mortality | Survival | Direct healthcare costs
CHD | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)
Stroke | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)
COPD | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)
Lung Cancer | GBD 2016 (9) | GBD 2016 (9) | Converted from incidence and mortality | I Denisova, P Kuznetsova 2014 (28)

Population Data

In order to simulate the population of Ukraine, the population by age and sex, births by mother’s age, and total fertility rate statistics were taken from the UN population prospects database (8). Total mortality rates were taken from the WHO global health estimates database (29). These parameters enable the model to simulate the Ukrainian population as close to reality as is possible.

The Microsimulation Model

The UK Health Forum (UKHF) microsimulation model was originally developed for the English government’s Foresight enquiry (30, 31) and has been further developed over the past decade to incorporate a number of additional interacting risk factors, including smoking. (Methods are described in greater detail in (32, 33) and in our supplementary appendix 1.)

The model simulates a virtual population that reproduces the characteristics and behavior of a large sample of individuals (20-100 million). These characteristics (age, sex, smoking status) can evolve over the life course based on known population statistics and risk factor data. Individuals can be born and die in the model. Figure 1 illustrates the modular nature of the model.
Module 1 uses cross-sectional data on the prevalence of the risk factor - cigarette smoking in this case. For the current study, 2015 smoking prevalence data for Ukraine were extrapolated forward to 2035. It was assumed that the proportions of the population within each smoking category as calculated in 2015 remained constant until 2035.

Module 2 is a microsimulation model which uses the prevalence of the risk factor over time, along with the specified data on the risks of developing diseases, to make projections of future disease burden. The model produces a wide range of different outputs, including incidence, cumulative incidence, prevalence, premature mortality, direct healthcare costs avoided, and disability-adjusted life years.

To our knowledge, no other studies have used a microsimulation model to quantify the future costs and health impacts of tobacco taxation policy scenarios in Ukraine.

Figure 1: Illustration of the Microsimulation Model
[Diagram labels: input datasets (risk data, population data, disease data, health economic data, intervention scenarios); software programmes (distribution programme, UKHF Microsimulation© programme); output data.]

Development of Scenarios

An initial modeling study was carried out using the World Health Organization (WHO) TaXSiM model [footnote 2]. Within this model, a scenario that reflects tobacco excise tax changes in 2017 was simulated to calculate the revenue impact as a result of this tax increase (Table 3). TaXSiM also calculated the percentage reduction in total cigarette consumption (%) due to the suggested tax changes. These taxation changes result in non-smokers (predominantly young people) not initiating smoking, smokers quitting, and smokers reducing the number of cigarettes smoked. Details of the TaXSiM model scenarios can be found in Table 3.
Footnote 2: WHO tobacco tax simulation model (TaXSiM), http://who.int/tobacco/economics/taxsim/en/

Table 3: TaXSiM Model Scenarios and Outputs

Column definitions:
Actual 2015 = observed values for 2015.
Baseline 2016 = Baseline Situation (2016): ad valorem (12%), minimum specific (8.515 UAH) and simple specific (6.365 UAH).
Scenario 1 (2017) = ad valorem tax is the same as in 2016 (12%), and 40% increase in both the minimum specific excise (11.92 UAH) and simple specific (8.91 UAH)**.
% GDP = expected contribution to GDP, shown to the right of the value it refers to.

Indicator | Actual 2015 | % GDP | Baseline 2016 | % GDP | Scenario 1 (2017) | % GDP
Total cigarettes taxed (billion pieces) | 74.0 | | 77.0 | | 70.1 |
Average cigarette price (UAH per pack) | 15.3 | | 20.8 | | 25.7 |
Average cigarette price (US$ per pack)* | $0.63 | | $0.81 | | $0.92 |
Average excise tax (UAH per 1000 pieces) | 308.9 | | 431.4 | | 600.0 |
Total excise tax revenue (billion UAH) | 22.9 | 1.0% | 33.2 | 1.4% | 41.8 | 1.6%
Total excise tax revenue (US$ billion)* | $0.94 | | $1.30 | | $1.50 |
Total government revenue (excise, VAT and levies, billion UAH) | 34.9 | 1.6% | 49.9 | 2.2% | 60.1 | 2.3%
Total government revenue (excise, VAT and levies, US$ billion)* | $1.44 | | $1.95 | | $2.16 |
Total expenditure on cigarettes (billion UAH) | 56.4 | | 79.9 | | 90.0 |
Percentage change in total cigarette consumption (%) | | | 4.1 | | -9.0 |

* World Bank Group actual (2015-2016) and forecast (2017): Annual average exchange rate = 2016 (1US$/25.6 UAH); 2017 (1US$/27.80 UAH)
** per pack of 20 cigarettes

Scenario Assumptions

1. Several studies suggest that around 50 percent of the effect of price increases on overall cigarette consumption results from participation changes (34, 35). Therefore, 50 percent of the estimated reduction in cigarette consumption was used as an estimate of the reduction in the total prevalence of smoking. While taxation which results in increased real prices of tobacco might reduce the intensity of smoking, research suggests that people who cut down actually inhale more, as measured by serum cotinine levels (36).
Further, the WHO target is focused on a total reduction in smoking prevalence. Therefore, modeling proceeded with a focus on current smoking prevalence, as opposed to the number of cigarettes smoked.

2. Our analysis of the omnibus surveys 2013-2015 showed that, in males, 55 percent of the change in smoking prevalence was due to a reduction in uptake. Specifically, the percentage of smokers decreased, the percentage of ex-smokers did not change, while the percentage of never smokers increased. Therefore, these changes were probably due to males “not starting smoking.” Among females, 100 percent of the change in consumption was due to “not starting smoking” (2).

3. While these average changes were not the same for each group, and people usually initiate smoking while they are under 30 years old (2), the model did not take these age differences into account, and the relative decline in percentages of current smokers was applied to all age groups. Therefore, it was assumed that taxation would result in changes in uptake.

4. A baseline “static” trend was included. This assumed that smoking prevalence remains constant at 2015 rates. The tax increase scenario was compared to this baseline.

5. The tax increase scenario represents the tax change adopted in January 2017.

6. The change in smoking prevalence occurs in the second year of the simulation only (2017). This is in line with TaXSiM. As noted above, tobacco companies’ “price wars” were observed to cause an increase in cigarette consumption, rather than the decrease expected according to taxation policies. Taking this into account, it was assumed that, in 2016, there would be no change in smoking prevalence from the 2015 level (37). The baseline used in the model was 2015 smoking prevalence held constant.

7. The scenarios are based on Monte Carlo simulations (individuals were sampled from the population and simulated through).

8. The specified percentage of smokers who are affected by the tax increase move to the never-smoker category in 2017.

9. If an individual’s smoking status is changed by the scenario, their smoking status will remain fixed for the entire simulation.

10. We assumed an immediate reduction in smoking prevalence due to the tax increases in 2017. We learned via personal communication with Prof. Joy Townsend that there are different views on the temporal impact of a tax. Econometricians follow Becker’s model, assuming that, as tobacco is very addictive, the reaction to price increases is slow and greater in the long run. Becker, therefore, uses a lagged variable y(t-1) (38). Townsend and Atkinson take the opposite view (39): that smokers tend to react quickly to a price change. We used a model similar to theirs, with an immediate effect and then a linear trend, and in line with the TaXSiM model outputs.

There were two scenarios:

1. A baseline “static” trend. This assumed that smoking prevalence stays constant at 2015 rates.

2. A tax increase scenario. An earlier iteration of TaXSiM calculated that an increase in ad valorem tax of 15 percent, a 30 percent increase in the minimum specific excise (11.08 UAH), and a simple specific excise of 8.28 UAH would result in a reduction of 10.2 percent in cigarettes smoked. Using the assumptions above, this translated into a reduction in uptake of 5.61 percent in males, and 10.2 percent in females. Therefore, in 2017, this specified percentage of smokers was moved to the never-smoker group in order to take account of uptake and keep the smoking-status groups summing to 100 percent in the model (the population cannot exceed 100 percent). This slightly underestimates the effect of the scenario, as described in the discussion. This change occurred in 2017 only. Appendix 2 provides the full TaXSiM analysis from the earlier iteration.
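The arithmetic behind the scenario can be reproduced in a few lines. The sketch below assumes, consistent with the reported figures, that the uptake shares from assumption 2 (55 percent for males, 100 percent for females) are applied directly to the 10.2 percent reduction in cigarettes smoked. The function names and the example prevalence value are illustrative and are not taken from the UKHF model.

```python
def split_reduction(consumption_reduction_pct, uptake_share):
    """Split a % reduction in cigarettes smoked into an uptake part
    and a remaining 'number of cigarettes smoked' part."""
    uptake = consumption_reduction_pct * uptake_share
    intensity = consumption_reduction_pct - uptake
    return round(uptake, 2), round(intensity, 2)


def shift_prevalence(prevalence_pct, uptake_reduction_pct):
    """Apply a relative decline to current-smoker prevalence
    (smokers moved to the never-smoker group)."""
    return prevalence_pct * (1 - uptake_reduction_pct / 100)


# Shares from the omnibus-survey analysis (assumption 2).
males = split_reduction(10.2, 0.55)    # (uptake %, intensity %)
females = split_reduction(10.2, 1.00)

print("males:", males)
print("females:", females)

# Illustrative only: a 40.8% starting prevalence reduced by 5.61% (relative).
print(round(shift_prevalence(40.8, 5.61), 2))
```

The simulated prevalence in later years will not equal this one-step calculation, because the microsimulation also reflects births, deaths, and ageing in the virtual population.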
Scenario 1 is summarized in Table 4:

Table 4: Summary of Scenarios

% reduction in cigarettes consumption as per Table 2 (%) | Males: estimated expected reduction in uptake (%) | Males: estimated expected reduction in number of cigarettes smoked (%) | Females: estimated expected reduction in uptake (%) | Females: estimated expected reduction in number of cigarettes smoked (%)
10.2 | 5.61 | 4.59 | 10.2 | 0

Results

A number of different outputs are produced from the model, and these are defined below.

Smoking Prevalence (%)

Table 5 shows smoking prevalence for males, females, and both males and females combined for the baseline and Scenario 1.

Table 5: Smoking Prevalence by Year, Sex and Scenario (%)

Year | Scenario 0 (baseline) M | Scenario 0 (baseline) F | Scenario 0 (baseline) TOTAL | Scenario 1 M | Scenario 1 F | Scenario 1 TOTAL
2016 | 40.8 | 5.7 | 21.5 | 40.8 | 5.7 | 21.5
2020 | 40.3 | 5.4 | 21.1 | 38.2 | 4.9 | 19.8
2025 | 39.4 | 5.3 | 20.5 | 37.5 | 4.8 | 19.4
2030 | 38.9 | 5.1 | 20.1 | 37.2 | 4.8 | 19.2
2035 | 38.2 | 5.0 | 19.8 | 36.7 | 4.7 | 19.0

Figure 2: Male Smoking Prevalence by Year for Each Scenario.
[Bar chart: percentage (%) for Scenario 0 (baseline) and Scenario 1; series for 2016, 2020, 2025, 2030, and 2035.]

Figure 3: Female Smoking Prevalence by Year for Each Scenario.
[Bar chart: percentage (%) for Scenario 0 (baseline) and Scenario 1; series for 2016, 2020, 2025, 2030, and 2035.]

Epidemiological Indicators

Results from the microsimulation are presented as rates per 100,000, then scaled to the Ukraine population for that year, as estimated by the UN population prospects (8).

1. Cumulative incidence rate per year per Ukraine population
The total number of new cases of a disease divided by the total number of susceptible people in a given year and accumulated over a specified period of the simulation from the year 2016. Therefore, the cumulative number of incident cases represents a sum of all of the incident cases from the start of the simulation.

2. Cumulative incidence avoided per Ukraine population over the simulation period
The total number of incident cases of disease avoided or gained as compared to baseline (i.e., scenario 0). A positive value represents the number of cases avoided, whereas a negative value represents the number of cases gained.

3. Incidence
The total number of new cases of a disease, divided by the total number of susceptible people in a given year, presented as a rate per population.

4. Attributable incidence rate per Ukraine population per year
The number of new cases of a disease attributable to being a smoker or ex-smoker in the Ukraine population.

5. Premature mortality rates per Ukraine population
Premature mortality refers to the total number of deaths in a given year below the life expectancy of that individual in the Ukraine population. Results are presented per year in the total population and cumulative over a given period of the simulation.

6. Potential years of life lost per Ukraine population
For each individual, the difference between the reference age (life expectancy at birth) and the age of death is calculated. The average annual PYLL was calculated each year in the microsimulation. This metric considers individuals who have died in a given year and is output as a rate per 100,000, which is then scaled to a rate per Ukraine population.

Economic outputs

7. Direct cost avoided
These are cumulative direct costs across the period of the simulation. The result for 2020 represents the cumulative costs avoided for the period 2016 to 2020. These costs are scaled to the total population of Ukraine.

8. Premature mortality costs
This relates to lost earnings due to premature deaths. The premature mortality costs for each individual in the year of death are calculated by summing over the income costs from the age of death until the individual’s life expectancy (LE) at birth.
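Definitions 6 and 8 can be illustrated with a minimal sketch for a single simulated individual. It assumes whole years, the average annual income of $3320 quoted in the report's income footnote, and an optional discount rate such as the 5 percent used in the sensitivity analysis. The reference age of 71, the function names, and the convention that the year of death is undiscounted are all hypothetical choices made for illustration, not values or methods confirmed by the UKHF model.

```python
def pyll(age_at_death, reference_age):
    """Potential years of life lost for one individual (never negative)."""
    return max(0, reference_age - age_at_death)


def premature_mortality_cost(age_at_death, end_age, annual_income,
                             discount_rate=0.0):
    """Sum lost annual income from the age of death up to end_age.

    Year t after death is divided by (1 + discount_rate) ** t, so the
    year of death itself is not discounted (an assumed convention).
    """
    years = max(0, end_age - age_at_death)
    return sum(annual_income / (1 + discount_rate) ** t for t in range(years))


REFERENCE_AGE = 71      # hypothetical placeholder for life expectancy at birth
ANNUAL_INCOME = 3320    # US$ per year, from the report's income footnote

print(pyll(60, REFERENCE_AGE))
print(premature_mortality_cost(60, 65, ANNUAL_INCOME))
print(premature_mortality_cost(60, 65, ANNUAL_INCOME, discount_rate=0.05))
```

The `end_age` argument is left explicit so that either cut-off mentioned in the report (age 65 or life expectancy at birth) can be passed in.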
Summary Table Table 6 presents a summary table of total disease cases (epidemiological) and costs (economic) by parameter, year, and scenario as rates per Ukraine population. Scenario 0 (Sc0) refers to the baseline scenario where smoking prevalence was assumed constant based on 2015 smoking prevalence. Scenario 1 (Sc1) refers to the one-off tax scenario as summarized in table 3. Cumulative Incident Cases Table 7 presents the cumulative incident cases for each disease by year, and Table 8 presents the cumulative incident cases avoided. 16 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine Table 6: Summary Table of the Outputs as Rates per Ukraine Population, by Year Epidemiological outputs Year Sc0 (Baseline) Sc1 2025 5480948[±4237] 5427558[±4237] Cumulative incident cases 2035 11366868[±5753] 11255173[±5753] 2025 NA 56224[±6341] Cumulative incident cases avoided 2035 NA 126730[±9123] 2025 589035[±1545] 582341[±1545] Incident cases per year 2035 646600[±1545] 640799[±1727] 2025 218221[±1121] 208475[±1121] Attributable incident cases 2035 222603[±1041] 211984[±1041] 2025 NA 6372 Cumulative premature deaths avoided 2035 NA 29172 Cumulative potential years of life lost relative 2025 NA 48923 to baseline 2035 NA 267098 Economic outputs 2025 NA 542.23 Direct costs avoided (millions UAH) 2035 NA 1545.81 Cumulative premature mortality costs avoided 2025 NA 3568.4 (millions UAH) 2035 NA 16536.4 Table 7: Cumulative Incident Cases for Each Disease by Year for the Total Ukraine Population CHD COPD Lung Cancer Stroke Total Sc 0 3712722[+-3390] 665256[+-1695] 194068[+-847] 908901[+-1695] 5480948[±4237] Year 2025 Sc 1 3679248[±3390] 655510[±1695] 188136[±847] 904664[±1695] 5427558[±4237] Sc 0 7707697[+-4719] 1346232[+-1966] 392110[+-1180] 1920828[+-2360] 11366868[±5753] Year 2035 Sc 1 7638872[±4719] 1323421[±1966] 379132[±1180] 1913749[±2360] 11255173[±5753] 17 Scenario 0 12,000,000 2025 Scenario 0 2035 Scenario 1 
Figure 4: Cumulative Incident Cases per Ukraine Population by 2025 and 2035. [Bar chart of cumulative incident cases by disease (CHD, COPD, lung cancer, stroke, total) for scenarios 0 and 1.]

Table 8: Cumulative Incident Cases Avoided Relative to Scenario 0 for the Ukraine Population by 2025 and 2035

Year | CHD | COPD | Lung Cancer | Stroke | Total
2025 | 35252 [±4908] | 10263 [±2677] | 6247 [±1338] | 4462 [±2677] | 56224 [±6341]
2035 | 78092 [±7586] | 25881 [±3123] | 14725 [±1784] | 8032 [±3569] | 126730 [±9123]

Figure 5: Cumulative Incident Cases Avoided. [Bar chart of cumulative incident cases avoided under scenario 1 relative to scenario 0, by disease, for 2025 and 2035.]

Incident and Attributable Incident Cases per Year

Table 9 presents the incident cases per year in the Ukraine population, and Table 10 shows the incident cases attributable to smoking by year for each disease. Figure 8 presents the incident cases by scenario and by disease. The blue bars show incident cases and the red bars show incident cases attributable to smoking in the specified year per Ukraine population. For scenario 1, the cases attributable to smoking make up a smaller portion of the overall new cases than at baseline (scenario 0). This is to be expected, since the scenario affects smokers; the number of avoided cases attributable to smoking should therefore increase over time.
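The statement that smoking-attributable cases make up a smaller share of new cases under scenario 1 can be checked directly against the totals reported in Tables 9 and 10. The short sketch below does only that arithmetic; the dictionary layout and variable names are illustrative and not part of the model.

```python
# Totals from Table 9 (incident cases per year) and Table 10 (attributable incident cases).
incident = {("Sc0", 2025): 589035, ("Sc1", 2025): 582341,
            ("Sc0", 2035): 646600, ("Sc1", 2035): 640799}
attributable = {("Sc0", 2025): 218221, ("Sc1", 2025): 208475,
                ("Sc0", 2035): 222603, ("Sc1", 2035): 211984}

# Share of each year's new cases that is attributable to smoking.
share = {key: attributable[key] / incident[key] for key in incident}

for (scenario, year), value in sorted(share.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"{year} {scenario}: {value:.1%} of new cases attributable to smoking")
```

On these totals the attributable share falls from roughly 37.0 percent to 35.8 percent in 2025, and from roughly 34.4 percent to 33.1 percent in 2035.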
Table 9: Incident Cases in the Total Population per Year

Year | Scenario | CHD | COPD | Lung Cancer | Stroke | Total
2025 | Sc 0 | 398492 [±1338] | 70059 [±446] | 20080 [±446] | 100404 [±446] | 589035 [±1545]
2025 | Sc 1 | 394476 [±1338] | 68274 [±446] | 19634 [±446] | 99957 [±446] | 582341 [±1545]
2035 | Sc 0 | 439546 [±1338] | 73629 [±446] | 21419 [±446] | 112006 [±446] | 646600 [±1545]
2035 | Sc 1 | 435976 [±1338] | 71844 [±446] | 20527 [±446] | 112452 [±892] | 640799 [±1727]

Figure 6: Incident Cases in Ukraine in 2025 and 2035. [Bar chart of incident cases by disease for scenarios 0 and 1.]

Table 10: Attributable Incident Cases per Year

Year | Scenario | CHD | COPD | Lung Cancer | Stroke | Total
2025 | Sc 0 | 145339 [±847] | 35170 [±424] | 15678 [±424] | 22034 [±424] | 218221 [±1121]
2025 | Sc 1 | 138983 [±847] | 33051 [±424] | 15254 [±424] | 21187 [±424] | 208475 [±1121]
2035 | Sc 0 | 148664 [±787] | 36183 [±393] | 15338 [±393] | 22418 [±393] | 222603 [±1041]
2035 | Sc 1 | 142371 [±787] | 33823 [±393] | 14552 [±393] | 21238 [±393] | 211984 [±1041]

Figure 7: Attributable Incident Cases per Ukraine Population in the Years 2025 and 2035. [Bar chart of attributable incident cases by disease for scenarios 0 and 1.]

Figure 8: Incident and Attributable Incident Cases in 2025 for Baseline and Scenario 1 by Disease. [Horizontal bar chart of incident and attributable incident cases by disease, scenario, and year.]

Premature Deaths

Table 11 presents the premature deaths, premature deaths avoided, and cumulative premature deaths avoided in the total Ukraine population relative to scenario 0.
The results show that, by 2025, 6,372 premature deaths will be averted under the scenario 1 tax increase. By 2035, this increases to 29,172 premature deaths averted. Figure 9 presents the cumulative premature deaths avoided under scenario 1 by 2025 and 2035.

Table 11: Premature Mortality in the Total Ukraine Population

Year | Scenario | Premature deaths | Premature deaths avoided | Cumulative premature deaths avoided
2025 | Sc 0 | 307204 | NA | NA
2025 | Sc 1 | 305086 | 2119 | 6372
2035 | Sc 0 | 328004 | NA | NA
2035 | Sc 1 | 326824 | 1180 | 29172

Figure 9: Cumulative Premature Deaths Avoided in the Ukraine Population by 2025 and 2035. [Horizontal bar chart of cumulative premature deaths avoided under scenario 1 relative to scenario 0.]

Potential Years of Life Lost

Table 12 presents the cumulative potential years of life lost (PYLL) for each scenario by year in the total Ukraine population. By 2025, scenario 1 is predicted to avoid 48,923 PYLL relative to baseline, rising to 267,098 PYLL by 2035.

Table 12: Cumulative PYLL by Scenario and Year, and PYLL Avoided Due to Scenario 1 Relative to Scenario 0

Year | Sc 0 | Sc 1 | PYLL avoided (Sc 1 rel 0)
2025 | 47173316 | 47124394 | 48923
2035 | 92950950 | 92683852 | 267098

Direct Costs Avoided

Table 13 presents the cumulative direct healthcare costs avoided for scenario 1 relative to scenario 0. By 2035, the largest avoided direct healthcare costs are for CHD (UAH 1.1bn/US$130 million³), followed by COPD (UAH 0.16bn/US$25 million).
Table 13: Direct Cumulative Healthcare Costs Avoided (UAH millions)

Year | Scenario | CHD | COPD | Lung Cancer | Stroke | Total
2025 | Sc 1 rel 0 | 408.37 [±0.06] | 26.16 [±0.06] | 54.05 [±0.06] | 53.65 [±0.06] | 542.22 [±0.12]
2035 | Sc 1 rel 0 | 1133.1 [±0.08] | 160.54 [±0.08] | 143.35 [±0.08] | 108.82 [±0.08] | 1545.82 [±0.16]

Figure 10: Direct Cumulative Healthcare Costs Avoided (UAH millions). [Bar chart of direct cumulative healthcare costs avoided under scenario 1 relative to scenario 0, by disease, for 2025 and 2035.]

3 An exchange rate of US$1 = UAH 23.8 is used here.

As expected, discounting at 5 percent has a large impact on the cumulative direct costs avoided by 2025 and 2035. For example, scenario 1 is predicted to avoid UAH 399 million by 2025 with discounting, compared to UAH 542 million without discounting. The results are presented in Table 14.

Table 14: Direct Cumulative Healthcare Costs Avoided (UAH millions) (with 5 percent discounting)

Year | Scenario | CHD | COPD | Lung Cancer | Stroke | Total
2025 | Sc 1 rel 0 | 301.18 | 16.66 | 40.76 | 40.13 | 398.74
2035 | Sc 1 rel 0 | 660.51 | 82.23 | 85.24 | 68.36 | 896.34

Premature Mortality Costs Avoided

Table 15 presents the premature mortality costs avoided, relative to baseline. In 2035 alone, UAH 1.97 billion (US$82.7 million) in premature mortality costs could be avoided. Cumulatively, by 2035, UAH 16.5 billion (US$695 million) in premature mortality costs could be avoided relative to baseline.

Table 15: Premature Mortality Costs Avoided (UAH millions)

Year | PM costs avoided | Cumulative costs avoided
2025 | 206 | 3568.4
2035 | 1968.3 | 16536.4

Discussion

This study explored the impact of a one-time tobacco tax increase in Ukraine on the future burden of four smoking-related diseases through 2035.
The results showed that small changes in smoking prevalence in one year can have large impacts on disease incidence and premature mortality avoided in the future. Our results show that implementation of a one-off tax has an impact on the smoking-related health burden, but they also highlight the need for continuous tobacco control measures if smoking prevalence is to continue to decrease and have sizeable impacts on related disease occurrence.

As well as benefits in terms of morbidity, particularly CHD cases avoided, we observe large savings in terms of premature mortality and potential years of life lost. This is important, since Ukraine has a declining population and a lower life expectancy than the EU average of 78 years for males and around 83 years for females (40). Tobacco taxation is one important step toward improving life expectancy in Ukraine, especially among men, whose smoking prevalence is high.

The study included just four smoking-related diseases (CHD, COPD, stroke, lung cancer). However, smoking is responsible for many more diseases and harms almost every organ in the body (41). The true epidemiological benefits are therefore likely to be much wider than those observed here. Future work could update this study by including additional smoking-related diseases.

While the microsimulation method is advantageous in NCD modeling, one key disadvantage is that the model is data intensive. Fortuitously, during the period of the study, the Global Burden of Disease (GBD) team published an online database that included many of the required data inputs (9). Country-specific data are preferable, and the GBD is based on modeled estimates and recommended as a cross-country comparative tool, but few other data were available for Ukraine. Among other gaps, neither survival data nor relative risks specific to Ukraine were available. Once better data become available, the model can easily be updated.
No data on indirect costs such as productivity losses by disease were available. Large savings to the health system were observed with just a small change in smoking prevalence. However, wider societal costs such as losses in productivity are likely to be higher than those reflected here, making a stronger case for the implementation of regular tax hikes for tobacco control (42). If indirect cost data by disease become available, the model can again easily be updated.

One notable limitation of our scenario methodology is that smokers were moved to the never-smokers category to account for changes in uptake due to the tax. While this is not realistic, it was the only way to model a change in uptake within the total population (and ensure that 100 percent of people in the population are accounted for). This approach could result in an underestimation of the health impact of a tobacco tax increase. This effect could arise because some of those smokers who become never smokers may already have a smoking-related disease.

We know that social groups react differently to tax increases (43). Due to small sample sizes, it was not possible to model the long-term health impacts on different social groups within the microsimulation. However, we can infer from research conducted in Ukraine (43, 44) that the largest impact of taxation will be observed in the poorest social groups. This is important, since it means that tobacco taxation could contribute to reducing social inequalities in health.

One specific limitation of any predictive model is that it does not take account of major future changes in circumstances, such as the behavior of the tobacco industry, or the introduction of new drugs or technologies. In theory, their effects can be estimated by altering parameters in the model, but doing so would significantly increase the degree of uncertainty.
However, they could be simulated as additional scenarios in the future relative to a “no change” scenario.

At present, the model does not take account of multimorbidity and the joint effect of several risk factors on disease occurrence and related mortality. However, individuals can get more than one smoking-related disease in their lifetime. Future work could expand the scope of the model to take account of technological and economic changes and their potential effects, and also to model the clustering of risk factors and diseases in the same individuals.

The model did not take account of passive smoking/secondhand smoke. Understanding the combined risk of active and passive smoking would enable us to model the joint impact of these risk factors on later disease outcomes.

It was beyond the scope of this study, given the time constraints, to carry out an in-depth uncertainty and sensitivity analysis. We are aware that this is good practice; however, there is a lack of validated datasets against which to compare our outputs. Furthermore, the microsimulation is complex relative to, for example, spreadsheet models. It involves many thousands of calculations, which are completed during the simulation of 50 million individuals. Given this complexity, local uncertainty analysis would demand many thousands of consecutive runs and would require a supercomputer to complete the exercise in a realistic time scale. However, we did carry out a small sensitivity analysis of the costs, running the model with and without a 5 percent discount rate.

Further work should develop more sophisticated interventions; for example, individuals of different ages could be affected differently by the intervention. It was beyond the scope of this study to include this development within the microsimulation.
However, a prototype, user-friendly tool has been developed for Ukraine that enables the user to select different age cohorts (as opposed to a population distribution of individuals) and run simulations to quantify health and cost impacts by population group. Further work should also explore the impact of other potential policies in Ukraine, such as a tobacco duty escalator, as well as a combination of tobacco-control measures including smoking cessation services.

This study complements the analysis carried out using the TaXSiM model and shows the health and related economic benefits of increasing tobacco tax in Ukraine. Even small reductions in smoking prevalence in one year will have long-term impacts on disease incidence and subsequent health costs.

Bibliography

1. WHO. WHO report on the global tobacco epidemic 2011: Warning about the dangers of tobacco. Geneva: WHO; 2011.
2. Andreeva T. Results of omnibus surveys with tobacco-related questions conducted in Ukraine in 2013, 2014, 2015. 2016.
3. The results of KIIS survey on tobacco smoking in Ukraine as of 2015 compared to 2013-2014. Kiev: Kiev International Institute of Sociology; 2016. Available at: http://www.kiis.com.ua/?lang=eng&cat=reports&id=587&page=1.
4. Oderkirk J, Sassi F, Cecchini M, Astolfi R, OECD Health Division. Toward a New Comprehensive International Health and Health Care Policy Decision Support Tool. OECD Directorate for Employment, Labour and Social Affairs; 2012.
5. Feenberg D, Coutts E. An introduction to the TAXSIM model. Journal of Policy Analysis and Management. 1993;12(1):189-94.
6. Butrica BA, Burkhauser RV. Estimating federal income tax burdens for Panel Study of Income Dynamics (PSID) families using the National Bureau of Economic Research TAXSIM model. 1997.
7. State Statistical Service of Ukraine. Population’s self-perceived health status and availability of selected types of medical aid in 2015 (in Ukrainian). Kiev: State Statistical Service of Ukraine; 2016. Available at: http://ukrstat.gov.ua/druk/katalog/kat_u/2015/sb/zb_snsz_2015.zip.
8. United Nations. World population prospects 2015. Available at: http://esa.un.org/unpd/wpp/.
9. Global Burden of Disease. Global Health Data Exchange. In: Institute for Health Metrics and Evaluation, editor. Available at: http://ghdx.healthdata.org/gbd-results-tool. 2016.
10. World Health Organization. Globocan 2012: Estimated Cancer Incidence, Mortality and Prevalence Worldwide in 2012. Available at: http://globocan.iarc.fr/Default.aspx.
11. World Health Organization. Health statistics and information systems - Software tools - DISMOD II. 2014 [23/02/15]. Available at: http://www.who.int/healthinfo/global_burden_disease/tools_software/en/.
12. Song YM, Cho HJ. Risk of stroke and myocardial infarction after reduction or cessation of cigarette smoking: A cohort study in Korean men. Stroke. 2008;39(9):2432-8.
13. Baba S, Iso H, Mannami T, Sasaki S, Okada K, Konishi M, et al. Cigarette smoking and risk of coronary heart disease incidence among middle-aged Japanese men and women: The JPHC Study Cohort I. European Journal of Cardiovascular Prevention & Rehabilitation. 2006;13(2):207-13.
14. Tolstrup JS, Hvidtfeldt UA, Flachs EM, Spiegelman D, Heitmann BL, Balter K, et al. Smoking and risk of coronary heart disease in younger, middle-aged, and older adults. American Journal of Public Health. 2014;104(1):96-102.
15. Burns DM. Epidemiology of smoking-induced cardiovascular disease. Progress in Cardiovascular Diseases. 2003;46(1):11-29.
16. Cronin EM, Kearney PM, Kearney PP, Sullivan P, Perry IJ. Impact of a national smoking ban on hospital admission for acute coronary syndromes: A longitudinal study. Clinical Cardiology. 2012;35(4):205-9.
17. U.S. Department of Health and Human Services. The health consequences of smoking - 50 years of progress: a report of the Surgeon General. Washington, DC: U.S. Department of Health and Human Services; 2014.
18. Prescott E, Bjerg AM, Andersen PK, Lange P, Vestbo J. Gender difference in smoking effects on lung function and risk of hospitalization for COPD: Results from a Danish longitudinal population study. The European Respiratory Journal. 1997;10(4):822-7.
19. Johannessen A, Omenaas E, Bakke P, Gulsvik A. Incidence of GOLD-defined chronic obstructive pulmonary disease in a general adult population. International Journal of Tuberculosis and Lung Disease. 2005;9(8):926-32.
20. Terzikhan N, Verhamme KMC, Hofman A, Stricker BH, Brusselle GG, Lahousse L. Prevalence and incidence of COPD in smokers and non-smokers: The Rotterdam Study. European Journal of Epidemiology. 2016;31(8):785-92.
21. Thun MJ, Carter BD, Feskanich D, Freedman ND, Prentice R, Lopez AD, et al. 50-Year Trends in Smoking-Related Mortality in the United States. New England Journal of Medicine. 2013;368(4):351-64.
22. Freedman ND, Leitzmann MF, Hollenbeck AR, Schatzkin A, Abnet CC. Cigarette smoking and subsequent risk of lung cancer in men and women: Analysis of a prospective cohort study. Lancet Oncol. 2008;9(7):649-56.
23. Bae JM, Lee MS, Shin MH, Kim DH, Li ZM, Ahn YO. Cigarette smoking and risk of lung cancer in Korean men: The Seoul male cancer cohort study. J Korean Med Sci. 2007;22(3):508-12.
24. Mannami T, Iso H, Baba S, Sasaki S, Okada K, Konishi M, et al. Cigarette smoking and risk of stroke and its subtypes among middle-aged Japanese men and women: The JPHC Study Cohort I. Stroke. 2004;35(6):1248-53.
25. Shinton R, Beevers G. Meta-analysis of relation between cigarette smoking and stroke. BMJ. 1989;298(6676):789-94.
26. Wannamethee SG, Shaper AG, Whincup PH, Walker M. Smoking cessation and the risk of stroke in middle-aged men. Journal of the American Medical Association. 1995;274(2):155-60.
27. Hoogenveen RT, van Baal PH, Boshuizen HC, Feenstra TL. Dynamic effects of smoking cessation on disease incidence, mortality and quality of life: The role of time since cessation. Cost Eff Resour Alloc. 2008;6:1.
28. Denisova I, Kuznetsova P. The effects of tobacco taxes on health: An analysis of the effects by income quintile and gender in Kazakhstan, the Russian Federation, and Ukraine. The World Bank; 2014 Oct.
29. World Health Organisation. World Health Statistics 2015. Global Health Observatory (GHO) data 2015. Available at: http://www.who.int/gho/publications/world_health_statistics/2015/en/.
30. McPherson K, Marsh T, Brown M. Foresight tackling obesities: Future choices - modelling future trends in obesity and the impact on health. Foresight Tackling Obesities Future Choices. 2007.
31. Wang YC, McPherson K, Marsh T, Gortmaker SL, Brown M. Health and economic burden of the projected obesity trends in the USA and the UK. Lancet. 2011;378(9793):815-25.
32. Cancer Research UK, UK Health Forum. Aiming High: Why the UK should aim to be tobacco-free. 2016.
33. UK Health Forum. Appendix B4. Detailed Methodology Technical Document. http://econdaproject.eu/. 2015.
34. Farrelly MC, Bray JW, Zarkin GA, Wendling BW. The joint demand for cigarettes and marijuana: Evidence from the National Household Surveys on Drug Abuse. J Health Econ. 2001;20(1):51-68.
35. Response to increases in cigarette prices by race/ethnicity, income, and age groups - United States, 1976-1993. Morbidity and Mortality Weekly Report. 1998;47(29):605-9.
36. Fidler JA, Stapleton JA, West R. Variation in saliva cotinine as a function of self-reported attempts to reduce cigarette consumption. Psychopharmacology (Berl). 2011;217(4):587-93.
37. Krasovsky K. Public health and revenue impact of cigarette “price wars” in Ukraine. ECTOH-2017; 2017.
38. Becker G, Murphy K. A Theory of Rational Addiction.
Journal of Political Economy. 1988;96(4):675-700. 39. Atkinson AB, Skegg JL. Anti-Smoking Publicity and the Demand for Tobacco in the U.K.*†. The Manchester School. 1973;41(3):265-82. 40. Eurostat: Statistics Explained. File: Life expectancy at birth, EU-28, 2002-14 2016 [cited 2016 20.12.2016]. Available at: http://ec.europa.eu/ eurostat/statistics-explained/index.php/File:Life_expectancy_at_birth,_EU- 28,_2002%E2%80%9314_%28%C2%B9%29_%28years%29_YB16.png. 41. Centers for Disease Control and Prevention. Health Effects of Cigarette Smoking 2016. Atlanta: Centers for Disease Control and Prevention; 2016. Available at: https://www.cdc.gov/tobacco/data_statistics/fact_sheets/health_ effects/effects_cig_smoking/. 42. Action on smoking and health. The economics of tobacco. ASH factsheet. 2015. 43. Krasovsky K. Sharp changes in tobacco products affordability and the dynamics of smoking prevalence in various social and income groups in Ukraine in 2008-2012. Tob Induc Dis. 2013;11(1):21. 44. Krasovsky K, Andreeva T, Krisanov D, Mashliakivsky M, Rud G. The Economics of tobacco control in Ukraine from the public health perspective. Kiev 2002. 128 p. 31 32 Appendix 1. Technical Appendix Appendix 1. Technical Appendix Appendix 1. Technical Appendix Appendix 1. Technical Appendix Appendix 1. Technical Appendix Appendix 1. Technical Appendix Appendix 1. Technical Appendix Appendix 1. Technical Appendix Appendix 1. Technical Appendix 1. Microsimulation Framework ppendix 1 of Our simulation consists Microsimulation Framework two modules. The first module calculates the predictions of risk factor 1 trends over1timeMicrosimulation Framework 1 Microsimulation Framework Our simulation consists of two modules. The first module calculates the pr based on data from rolling cross- Microsimulation Framework 1 Microsimulation Framework 1 Microsimulation Framework 1 Microsimulation Framework 1 Microsimulation Framework trends over time based on data from rolling cross-sectional studies. 
The se Our simulation consists of two modules. The first module c Our simulation consists of two modules. The first module ca sectional studies. The second module performs the microsimulation of a mework Our simulation consists of two modules. The first module calculates the Our simulation consists of two modules. The first module calculates the predictions of risk factor Our simulation consists of two modules. The first module calculates the predictions of risk factor Our simulation consists of two modules. The first module calculates the predictions of the microsimulation of a virtual population, generated with demographic c Our simulation consists of two modules. The first module calculates the predictions of risk factor virtual population, generated with demographic trends over time based on data from rolling cross-sectiona trends over time based on data from rolling cross-sectional characteristics matching dules. The first module calculates the predictions of risk factor trends over time based on data from rolling cross-sectional studies. The trends over time based on data from rolling cross-sectional studies. The second module performs trends over time based on data from rolling cross-sectional studies. The second module performs those of trends over time based on data from rolling cross-sectional studies. The second modu those of the observed data. The health trajectory of each individual from t rends over time based on data from rolling cross-sectional studies. The second module performs the observed data. The health the microsimulation of a virtual population, generated with the microsimulation of a virtual population, generated with trajectory of each individual from m rolling cross-sectional studies. 
The second module performs the microsimulation of a virtual population, generated with demographi the microsimulation of a virtual population, generated with demographic characteristics matching the microsimulation of a virtual population, generated with demographic characteristics matching the microsimulation of a virtual population, generated with demographic characterist over time allowing them to contract, survive, or die from a set of diseases he microsimulation of a virtual population, generated with demographic characteristics matching the population is simulated over time those of the observed data. The health trajectory of each in those of the observed data. The health trajectory of each in allowing them to contract, survive, or pulation, generated with demographic characteristics matching those of the observed data. The health trajectory of each individual from those of the observed data. The health trajectory of each individual from the population is simulated those of the observed data. The health trajectory of each individual from the population is simulated those of the observed data. The health trajectory of each individual from the populati analyzed risk factors. The detailed description of the two modules is prese hose of the observed data. The health trajectory of each individual from the population is simulated die from a set of diseases or injuries related over time allowing them to contract, survive, or die from a over time allowing them to contract, survive, or die from a to the analyzed risk factors. 
The alth trajectory of each individual from the population is simulated over time allowing them to contract, survive, or die from a set of disease over time allowing them to contract, survive, or die from a set of diseases or injuries related to the over time allowing them to contract, survive, or die from a set of diseases or injuries related to the detailed over time allowing them to contract, survive, or die from a set of diseases or injuries r over time allowing them to contract, survive, or die from a set of diseases or injuries related to the description of the two modules analyzed risk factors. The detailed description of the two m analyzed risk factors. The detailed description of the two m is presented below. t, survive, or die from a set of diseases or injuries related to the 1.1 Module One: Predictions of Smoking Prevalence Ov analyzed risk factors. The detailed description of the two modules is pre analyzed risk factors. The detailed description of the two modules is presented below. analyzed risk factors. The detailed description of the two modules is presented below. analyzed risk factors. The detailed description of the two modules is presented below nalyzed risk factors. The detailed description of the two modules is presented below. description of the two modules is presented below. 1.1 Module One: Predictions 1.1 Prevalence 1.1 For the risk factor (RF), let of Smoking Module One: Predictions of Smoking Pre Module One: Predictions of Smoking Pr N be the number of categories for a given risk fa Over Time 1.1 1.1 Module One: Predictions of Smoking Prevalence Over Time 1.1 Module One: Predictions of Smoking Prevalence Over Time 1.1 Module One: Predictions of Smoking Prevalence Over Time 1.1 Module One: Predictions of Smoking Prevalence O Module One: Predictions of Smoking Prevalence Over Time smoking. Let ! 
= 1, 2, …, N number these categories, and For the risk factor (RF), let For the risk factor (RF), let For the risk factor (RF), let N be the number of categories for a given N #$ (&) denote the be the number of categories f N be the number of categories fo risk ons of Smoking Prevalence Over Time For the risk factor (RF), let For the risk factor (RF), let or the risk factor (RF), let N For the risk factor (RF), let N be the number of categories for a given risk be the number of categories for a given risk factor, e.g. N be the number of categories for a given risk factor, e.g. For the risk factor (RF), let N be the number of categories for a given risk factor, e.g. corresponds to the category N be the number of categories for a given risk factor, e.g. N N = 3 for = 3 for and #$ (&) using a multino N factor, e.g. N = 3 for smoking. Let k = 1, 2, …, NN smoking. Let smoking. Let = 3 for ! ! number at time ! these t = 1, 2, …, = 1, 2, …, . We estimate NN number these categories, and number these categories, and categories, # #$ number of categories for a given risk factor, e.g. smoking. Let ! = 1, 2, …, N N = 3 for smoking. Let number these categories, and # ! = 1, 2, …, (&) N number these categories, and #$ (&) denote t denote the prevalence of the RF that smoking. Let moking. Let ! = 1, 2, …, N number these categories, and ! = 1, 2, …, denote N number these categories, and smoking. Let the #$ (&) prevalence ! = 1, 2, …, model with prevalence of RF category denote the prevalence of the RF that of N$ # the RF that$ number these categories, and (&) denote the prevalence of the RF that corresponds to the category corresponds to the category corresponds ! as the outcome, and time to the category ! # $ (&) ! at time at time at time denote the prevalence tt. We estimate . We estimate t as a sin # #$$ (&) (&) these categories, and #$ (&) denote the prevalence of the RF that corresponds to the category corresponds to the category ! at time t. 
We estimate $p_k(t)$ using a multinomial logistic regression model with prevalence of RF category $k$ as the outcome, and time $t$ as a single explanatory variable. For $k < K$, we have

$$\ln\left(\frac{p_k(t)}{p_1(t)}\right) = \beta_0^k + \beta_1^k t \tag{1.1}$$

The prevalence of the first category is obtained by using the normalization constraint $\sum_{k=1}^{K} p_k(t) = 1$. Solving equation (1.1) for $p_k(t)$, we obtain

$$p_k(t) = \frac{\exp\left(\beta_0^k + \beta_1^k t\right)}{1 + \sum_{k'=1}^{N} \exp\left(\beta_0^{k'} + \beta_1^{k'} t\right)} \tag{1.2}$$

which respects all constraints on the prevalence values, i.e. normalization and [0, 1] bounds.
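As an illustration of equation (1.2), the following minimal Python sketch evaluates the category prevalences at one time point. It assumes that the reference category (the one in the denominator of the log ratio in (1.1)) has both coefficients fixed at zero, so that its prevalence is the reciprocal of the denominator of (1.2). The function name and the coefficient values are hypothetical and are not taken from the report.

```python
import math

def category_prevalences(beta0, beta1, t):
    """Evaluate equation (1.2) for every category at time t.

    beta0, beta1: intercepts and slopes of the non-reference categories
    (illustrative values only; the reference category is assumed to have
    both coefficients equal to zero).
    Returns [p_reference, p_other_1, p_other_2, ...].
    """
    # exp(beta0 + beta1 * t) for each non-reference category
    weights = [math.exp(b0 + b1 * t) for b0, b1 in zip(beta0, beta1)]
    denom = 1.0 + sum(weights)
    # The reference category takes the remaining probability mass, 1 / denom
    return [1.0 / denom] + [w / denom for w in weights]

# Hypothetical coefficients, not estimates from the report
p = category_prevalences(beta0=[-0.5, 0.2], beta1=[0.01, -0.03], t=10.0)
```

By construction the returned values are positive and sum to one, which is the normalization and [0, 1] property stated in the text.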
1.1.1 Multinomial logistic regression for smoking prevalence

Measured data consist of sets of probabilities, with their variances, at specific time values (typically the year of the survey). For any particular time, the sum of these probabilities is unity. Typically such data might be the probabilities of smoker, ex-smoker, never smokers as they are extracted from the survey data set. Each data point is treated as a normally distributed¹ random variable; together they are a set of $N$ groups (number of years) of $K$ probabilities $\{\{t_i, \mu_{ki}, \sigma_{ki} \mid k \in [0, K-1]\} \mid i \in [0, N-1]\}$. For each year the set of $K$ probabilities form a distribution: their sum is equal to unity.

¹ Depending on the circumstances, this assumption will be more or less accurate and more or less necessary. In general, it is both extremely useful and accurate. For simple surveys, the individual Bayesian prior and posterior probabilities are Beta distributions, the likelihood being binomial. For reasonably large samples, the approximation of the Beta distributions by normal distributions is both legitimate and a practical necessity. For complex, multi-PSU, stratified surveys, it is again assumed that these base probabilities are approximately normally distributed and, again, it is an assumption that makes the analysis tractable. Depending on the nature of the raw data set it may be possible to use non-parametric statistical methods for this analysis.
The regression consists of fitting a set of logistic functions $\{p_k(a, b, t) \mid k \in [0, K-1]\}$ to these data, one function for each $k$-value. At each time value, the sum of these functions is unity. Thus, for example, when measuring smoking in the three states already mentioned, the $k = 0$ regression function represents the probability of being a never smoker over time, $k = 1$ the probability of being an ex-smoker, and $k = 2$ the probability of being a smoker.

The regression equations are most easily derived from a familiar least squares minimization. In the following equation set the weighted difference between the measured and predicted probabilities is written as $S$; the logistic regression functions $p_k(a, b; t)$ are chosen to be ratios of sums of exponentials. (This is equivalent to modeling the log probability ratios, $p_k/p_0$, as linear functions of time.)
$$S(a, b) = \frac{1}{2} \sum_{k=0}^{K-1} \sum_{i=0}^{N-1} \frac{\left(p_k(a, b; t_i) - \mu_{ki}\right)^2}{\sigma_{ki}^2} \tag{1.3}$$

$$\begin{aligned}
p_k(a, b, t) &\equiv \frac{e^{A_k}}{1 + e^{A_1} + \cdots + e^{A_{K-1}}} \\
a &\equiv (a_0, a_1, \ldots, a_{K-1}), \qquad b \equiv (b_0, b_1, \ldots, b_{K-1}) \\
A_0 &\equiv 0, \qquad A_k \equiv a_k + b_k t
\end{aligned} \tag{1.4}$$
The parameters $A_0$, $a_0$ and $b_0$ are all zero and are used merely to preserve the symmetry of the expressions and their manipulation. For a $K$-dimensional set of probabilities, there will be $2(K-1)$ regression parameters to be determined.

For a given dimension $K$ there are $K-1$ independent functions $p_k$, the remaining function being determined from the requirement that the complete set of $K$ form a distribution and sum to unity.

Note that the parameterization ensures the necessary requirement that each $p_k$ be interpretable as a probability, that is, a real number lying between 0 and 1.
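The objective in equation (1.3) and the parameterization in equation (1.4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, time is measured in years since the first survey, the data are synthetic (generated from known parameters rather than taken from any survey), and every data point is given the same standard error.

```python
import math

def logistic_probs(a, b, t):
    """Equation (1.4): p_k = exp(A_k) / sum_j exp(A_j), with A_k = a_k + b_k * t.

    a[0] and b[0] must be 0, so that A_0 = 0 and the denominator equals
    1 + exp(A_1) + ... + exp(A_{K-1}).
    """
    A = [ak + bk * t for ak, bk in zip(a, b)]
    denom = sum(math.exp(Ak) for Ak in A)
    return [math.exp(Ak) / denom for Ak in A]

def weighted_sum_of_squares(a, b, times, mu, sigma):
    """Equation (1.3): S = 1/2 * sum_k sum_i (p_k(t_i) - mu[k][i])**2 / sigma[k][i]**2."""
    total = 0.0
    for i, t in enumerate(times):
        p = logistic_probs(a, b, t)
        for k in range(len(a)):
            total += (p[k] - mu[k][i]) ** 2 / sigma[k][i] ** 2
    return 0.5 * total

# Synthetic example with K = 3 categories and N = 3 survey years (hypothetical values)
times = [0.0, 5.0, 10.0]
a_true, b_true = [0.0, -0.4, -0.1], [0.0, 0.02, -0.05]
mu = [[logistic_probs(a_true, b_true, t)[k] for t in times] for k in range(3)]
sigma = [[0.02] * len(times) for _ in range(3)]

S_at_truth = weighted_sum_of_squares(a_true, b_true, times, mu, sigma)
S_elsewhere = weighted_sum_of_squares([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], times, mu, sigma)
```

Because the synthetic data lie exactly on the logistic curves, `S_at_truth` is zero, and any other parameter choice gives a larger value. With real survey estimates the minimum of $S$ would generally be positive.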
The minimum of the function $S$ is determined from the equations

$$\frac{\partial S}{\partial a_j} = \frac{\partial S}{\partial b_j} = 0 \quad \text{for } j = 1, 2, \ldots, K-1 \tag{1.5}$$

noting the relations

$$\begin{aligned}
\frac{\partial p_k}{\partial A_j} &= \frac{\partial}{\partial A_j}\left(\frac{e^{A_k}}{1 + e^{A_1} + \cdots + e^{A_{K-1}}}\right) = p_k \delta_{kj} - p_k p_j \\
\frac{\partial}{\partial a_j} &= \frac{\partial}{\partial A_j} \\
\frac{\partial}{\partial b_j} &= t \frac{\partial}{\partial A_j}
\end{aligned} \tag{1.6}$$
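The first relation in (1.6) can be checked numerically. The sketch below compares the closed form $p_k \delta_{kj} - p_k p_j$ with a central finite difference; the function names and the values of $A_k$ are hypothetical and serve only as an illustration.

```python
import math

def probs_from_A(A):
    """p_k = exp(A_k) / (1 + exp(A_1) + ... + exp(A_{K-1})), with A[0] fixed at 0."""
    denom = sum(math.exp(x) for x in A)
    return [math.exp(x) / denom for x in A]

def dp_dA(A, k, j):
    """First relation in (1.6): dp_k/dA_j = p_k * delta_kj - p_k * p_j."""
    p = probs_from_A(A)
    return p[k] * ((1.0 if k == j else 0.0) - p[j])

def dp_dA_numeric(A, k, j, h=1e-6):
    """Central finite difference in A_j, used only as a check."""
    up, down = list(A), list(A)
    up[j] += h
    down[j] -= h
    return (probs_from_A(up)[k] - probs_from_A(down)[k]) / (2.0 * h)

A = [0.0, 0.3, -0.7]  # illustrative values; A_0 = 0 by definition
# Only A_1 and A_2 are free parameters, so j runs over 1 and 2
max_gap = max(abs(dp_dA(A, k, j) - dp_dA_numeric(A, k, j))
              for k in range(3) for j in (1, 2))
```

Applying the chain rule to (1.3) with these relations gives the explicit form of the conditions (1.5):

$$\frac{\partial S}{\partial a_j} = \sum_{k=0}^{K-1} \sum_{i=0}^{N-1} \frac{p_k(t_i) - \mu_{ki}}{\sigma_{ki}^2}\, p_k(t_i)\left(\delta_{kj} - p_j(t_i)\right), \qquad \frac{\partial S}{\partial b_j} = \sum_{k=0}^{K-1} \sum_{i=0}^{N-1} \frac{p_k(t_i) - \mu_{ki}}{\sigma_{ki}^2}\, t_i\, p_k(t_i)\left(\delta_{kj} - p_j(t_i)\right)$$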
The values of the vectors $a$, $b$ that satisfy these equations are denoted $\hat{a}$, $\hat{b}$. They provide the trend lines, $p_k(\hat{a}, \hat{b}; t)$, for the separate probabilities. The confidence intervals for the trend lines are derived most easily from the underlying Bayesian analysis of the problem.

1.1.2 Bayesian interpretation

The $2K-2$ regression parameters $\{a, b\}$ are regarded as random variables whose posterior distribution is proportional to the function $\exp(-S(a, b))$. The maximum likelihood estimate of this probability distribution function, the minimum of the function $S$, is obtained at the values $\hat{a}$, $\hat{b}$. Other properties of the $(2K-2)$-dimensional probability distribution function are obtained by first approximating it as a $(2K-2)$-dimensional normal distribution whose mean is the maximum likelihood estimate. This amounts to expanding the function $S(a, b)$ in a Taylor series as far as terms quadratic in the differences $(a - \hat{a})$, $(b - \hat{b})$ about the maximum likelihood estimate $\hat{S} \equiv S(\hat{a}, \hat{b})$.
Hence

$$\begin{aligned}
S(a, b) &= \frac{1}{2} \sum_{k=0}^{K-1} \sum_{i=0}^{N-1} \frac{\left(p_k(a, b; t_i) - \mu_{ki}\right)^2}{\sigma_{ki}^2} \\
&\equiv S(\hat{a}, \hat{b}) + \frac{1}{2} \left(a - \hat{a},\, b - \hat{b}\right) P^{-1} \left(a - \hat{a},\, b - \hat{b}\right) + \ldots \\
&\approx S(\hat{a}, \hat{b}) + \frac{1}{2} \sum_{i,j} (a_i - \hat{a}_i) \frac{\partial^2 \hat{S}}{\partial \hat{a}_i \partial \hat{a}_j} (a_j - \hat{a}_j) + \frac{1}{2} \sum_{i,j} (a_i - \hat{a}_i) \frac{\partial^2 \hat{S}}{\partial \hat{a}_i \partial \hat{b}_j} (b_j - \hat{b}_j) \\
&\quad + \frac{1}{2} \sum_{i,j} (b_i - \hat{b}_i) \frac{\partial^2 \hat{S}}{\partial \hat{b}_i \partial \hat{a}_j} (a_j - \hat{a}_j) + \frac{1}{2} \sum_{i,j} (b_i - \hat{b}_i) \frac{\partial^2 \hat{S}}{\partial \hat{b}_i \partial \hat{b}_j} (b_j - \hat{b}_j)
\end{aligned} \tag{1.7}$$

The $(2K-2)$-dimensional covariance matrix $P$ is the inverse of the appropriate expansion coefficients. This matrix is central to the construction of the confidence limits for the trend lines.
The logistic regression functions p (t) can be approximated as a normally distri This matrix is central to the construction of th The (2 This matrix is central to the construction of the confidence limits for the trend lines. K-2)-dimensional covariance matrix ˆ P is the inverse of the appropriate expansion coefficients. ˆ (1.7) k ( t ))1.1.3 2 2 ¶ S ¶ S » S (a ˆ) + å (a - a ˆ ) random variable ( ˆ ) + Nå pˆ ((a t ),-sa ˆ () ( ˆ )p+ about its maximum likelihoo 2 1.1.3 ˆ, b Estimation of the confidence intervals 1 a -a This matrix is central to the construction of the confidence limits for the trend lines. by expanding b -b 1 k ¶a 2 ˆ ¶a ˆ i i ¶a ˆ ¶b ˆ Estimation of the confidence inter j j 2 k i ki j j ˆ (t ) = p(a ˆ , t ) 1.1.3 Estimation of the confidence intervals The logistic regression functions i, j p (t) can be approximated as a normally distributed time-varying ˆ ,b i j i, j i j line) p k The logistic regression functions p (t) can be a N (p t )) by expanding k ˆ (t ),s (ˆ k 1.1.3 Estimation of the confidence intervals The logistic regression functions ˆ ˆ p (t) can be approximated as a normally distributed time-varying N (p ˆ (t ),s (t )) by expandin 2 2 ¶ S 2 ˆ ) ¶ S ( b - random variable + å (b - b ) ˆ ) + å (b - b ˆ) random variable p about its maximum likelihood estimate (the trend ˆ(p ( a -a k b k 2 ˆ (t ),s (t )) by expanding 1 k k 1 ˆ (t ) = p(a ˆ , t ) The logistic regression functions ˆ¶ p ( t ) can be approximated as a normally distributed time-varying ˆ ¶p ˆ about its maximum likelihood estimate (the tre k k , t ) = ˆp ( a , t ) 35 ) 2 i i j j2 2 i i j j ¶b N a k ¶b b p (t ) = p(a line) p random variable ˆ ,b p ( a, b ˆ +b-b ˆ,t random variable N (p ˆ (t ),s (t )) by expanding p about its maximum likelihood estimate (the trend k i, j i j k k ˆ + a ˆ- a i, j ˆ, b ˆ i kj 2 line) ,b ˆ (t ) = p(a ˆ , t ) P is the inverse of the appropriate expansion coefficients. 
k k k k k k t ) = p(a ˆ , b, t )p line) ˆp ˆ ,b ( a, b, t ) = p ( a ˆ,t) The (2 ˆ (K-2)-dimensional covariance matrix ˆ +a-a ˆ +b-b ˆ, b æa-a ˆö ( ) k line) p ˆ p ( t ) ˆ p ( t ) b, t ˆ= p ( a k k k = + Ñ , Ñ ç ÷ + ... p ( a, b è -)b This matrix is central to the construction of the confidence limits for the trend lines. ˆ+ k ˆ a ˆ b k a-a ˆ (1.8) ø k k =0 ) ( p ( a, b;kt=0) - iµ s ki a, b() p=( a,ab,;bt )=- µ () k ( i ki ) S (å 2 (å ) å å å k = K -1 i = N -1 S ( a, b ) = å å k i ki k i S 1 1 ki ( ) å 1 k i S a, b = 1 2 S (( ˆ()a+ 2 ( a - a , b - b ) P ( a - a , b - b ) + ... s ki s 2 2 s t ) - µˆ ) ˆ -1 2 2 2 k= ˆp kK a ,b ,b -1 i =iN;distribution function, the minimum of the function S, is obtained a -1 ˆ ˆ s 2 2 k =0 i =0 S ( a, b ) ˆ= å ºå =0 1=0 ki k= 0 i =0 ki k i ki k =0 i =0 ki 1 (a ˆ) + (a - ˆ) P (a 1 º S (a ˆ, b ) + ( a - a ( a - aˆ, b K ˆ ) P sof the (2 ˆ ) + ... ˆ ( ) ( ( ) ) (( ) ( --2)-dimensional probability distribution function are obt ˆ ˆº S ˆ, b ˆˆ a ˆ, b -1 2 1 ˆ, b - b b -1 2 -1-b Sa ˆa ,ˆb,b ˆa 1 ˆ, b 2 º - b+ P 2 aa -- a , b -b bˆ P + ... a - a ˆ ˆ, 2b- -1 2k =0 i =0 ki ºS a 2, ˆ b +1 a- 2 ¶ S ¶ S ºS a (( )) ( ˆ + 1 a» ˆ, b -S aˆ , ba ˆ-,b ˆ b ( ) (å ˆ ¶+2 S P -1 (2 ) ˆ a -K ( ) å( ) -2)-dimensional normal distribution whose mean is the maxim (a ˆi, b-- ( a i ) + ˆb ) ˆ ... 
( ) a j - a ˆ j2 S ¶ ˆ+ 2 1 (2a ˆˆi - ˆi )1 (1.7) 2 b a ˆ j -¶ bˆ2 S j + ˆ ( ) ˆ( ¶ ˆS ˆ + å ( ˆ ) ˆ +1 å ( ) å S 2 ( ) ( ) ¶ ( ) å 1 ˆ ˆ ¶ a ¶ a (a j) »S a ˆ, b 2 a - aˆ a - aˆ + a - aˆ b » -S b a , b + ¶ aˆ ¶ ab - aˆ 2 i i ˆi2 ¶a ¶ amounts to expanding the function ˆj j a i , j j » S 2 a ( iˆ , b ji + 2 å 1 i »¶ S(aaa ˆi2i¶ - ˆ bˆ, aˆi ) + ˆ b i j , S j ( 1 a , j b ) in a Taylor series as far a ( a ij - a -a2 ˆ ˆ,ij ) i + 2 i j å ( aia 1 i ¶a -¶ ˆ aˆ-ˆ ij a j ¶ i)a ˆ ( ) ( ) ˆ ¶ aˆ2 ¶a ˆ i(1.7) ¶ S ¶ S ( ) ¶a ˆi ¶a ˆj i, j i, j (a ˆ j ) +(a )a ˆ about the maximum likelihood estimat j ˆ + 1 (a - a ˆ 2 å 2i 2å ˆ) ˆ(2 ) i , j i , j »S a ˆ, b differences ¶j2- S ˆa 1 -a ,i b -- aˆi b bj - ¶b i 2i ,ˆj + S j å( ) ( ( ) () ( å ) (å) ) ( ( () ) ) 2 ˆ + ¶ Sˆ i b¶a ˆ - ¶ bˆ ˆ a a - a ˆ ¶ S ˆ+ ¶ aˆ b 2¶ b -ˆ ˆ b j b - bˆ ¶ S ( ) ˆ 1 1 2 ˆ ( ˆ j ) +¶2 ˆ j ˆ (¶ ˆj )+ 2 å bi - bi +1 ˆ i , j 2 a j i- a i i j 1 ˆå bi - jˆ b å i , ( ( j j 2ˆb j -¶b ˆiSi j i +1 ( ˆ å )ˆ bi1 ¶ 2- Såˆjb a jS -a b ¶ ˆ a ( ) + ( ) b - b a - ¶ baˆ ¶ b + i b -b 1 bj - 1i ˆi i , j j ˆ ˆ å ˆ ˆ j 2 ¶b ¶ a ˆˆj i , j ii , j j 2 ¶ b ¶ ibj + ˆ i i , j 1 bˆ ¶a ˆ b j - b j 2 a¶b i i-¶a iˆˆ a bˆ+ bˆ b ( å )å ( ) ¶ ¶ 2 S ¶ 2 S j¶ ¶ ( ) ( ) 2 ( ) i, j The (2K-2)-dimensional ( ) covariance matrixˆ P is the inverse -1 the j of appropriate ˆ µ i i +1 å b - ˆ b a - aˆ + 1 å b - b i , j b k =K - - 2 iˆ b1 i=N i p i a , ¶ b b ; ¶t a i ˆ , j - j j i 2j S i( a¶ ) is the inverse of the appropriate expansion coe i j k i ki i , j 2 i expansion The (2 i ¶ ˆ ¶a b ˆj coefficients. j This matrix K-2)-dimensional covariance matrix j 2 i is central ,b b ˆ¶ Pi to bˆ = 1 j jthe construction 2 The (2 j , of the 2 i j i, j i, j The (2K-2)-dimensional covariance matrix P is the inverse of the appropriate expansion coefficients. i The (2K-2)-dimensional covariance matrix k =0 i =0 s ki K-2)-dimensional covariance matrix P is the inverse of the appropr P is confidence limits for the trend lines. 
The (2K-2)-dimensional covariance matrix This matrix is central to the construction of the confidence limits for the trend lines. This matrix is central to the construction of the This matrix is central to the construction of the confidence limits for the trend lines. P is the inv The (2K-2)-dimensional covariance matrix P is the inverse of the appropriate expansion coefficients. 1.1.3 Estimation of the confidence intervals This matrix is central to the construction of the confidence limits for the S aˆ, bˆ + 1 a-a 2 This matrix is central to the construction of the confidence limits for the trend lines. 1.1.3 ˆ, b - b ˆ P -1 a - a ºThis matrix is central to the construction of the confid ˆ, b - b Estimation of the confidence inter ( ) ( ˆ + ... ) ( ) 1.1.3 Estimation of the confidence intervals 1.1.3 Estimation of the confidence intervals 1.1.3 Estimation of the confidence intervals ¶2S ˆ The logistic regression functions The logistic p regression k ( t The logistic regression functions 1.1.3 Estimation of the confidence intervals ( ( ) ( )) ( ( ) ( )) ) can be approximated as a normally distributed time-varying functions p k ( t ) can be approximated as a normally distributed time-v » S can ˆ a ˆ b The logistic regression functions 1.1.3 be, Estimation of the confidence intervals + 1The logistic regression functions approximated ap k (-t ˆ a as a normally ) aj - a( ) å( ( (( ) ( ))) å ˆ j pk+ ) can be approximated as a normally (t) can be a 1 ai ( ( ( ) ( )) 2 i i 2 random variable N pˆ t , s 2 t by expanding p random variable about its maximum likelihood estimate (the trend ˆ k tp,ks ¶ ˆ a N ¶a pˆ ˆ t , s 2 t by expandin distributed The logistic regression functions random variable k time-varying k p k(N p ,s kt variable ˆ t random 2 random variable Np k t by expanding pk about its maximum lik The logistic regression functions by expanding i2, j by expanding t) can be approximated as a normally distributed time-varying i kj pk(t) can be approxim about its 
maximum likelihood estimate ( k i , j ˆ (t ) = p(a ˆ( ˆ ) ( s (t ))ˆ by expanding (( t( ˆ , t ) )) by expanding k k line) p random variable k ˆ ,N ,p about b line) p t t ),maximum ˆ (t ) = p(a its k ,b t ) line) p ˆ ,likelihood 2 k pestimate ˆ (t ) = + p ( (the aˆ ,b å ( t ) ˆ ˆ ,trend random variable b - b line)¶ line) ) p ˆN ˆ S tp about its maximum likelihood estimate (the trend k ) =(p ˆ ( a ),ˆs a - ,b ˆ ) a1 (t + å ( b -b ˆ ) p¶ abo Sˆ 2 k k 2 k 1 k 2 ˆ (t ) = p(a ˆ , t ) k The values of the vectors ( ˆ ˆa ˆ, t ) a , b that satisfy these equations ar ˆ ¶b ˆ k 2 i i j j 2 i i line) p ˆ ,b ˆ (t ) = p ¶ bˆ ,¶ ¶b p b(- t ), for the separate probabilities. The confiden line) p a b p ( a, b, t ) = p ( a ;) t )p (a + i, j i j i, j i ˆ( ) ,= k ˆ ,( ˆ +a- a ˆ ˆ, b + ˆ ,bˆˆ, t ˆp ab k , b- , tb ˆ ˆ lines a b p ( aˆ , b, t ) = p ˆ +a-a a ,b ( ) + k k ( ) k k p a, b , t = p aˆ + a - ˆ a , b + b - b k t k k p ( a, b, t ) = p (The (2 K ˆ +b-b ˆ ,æt) ˆ-2)-dimensional covariance matrix k P is the inverse of the a k = k pˆ ˆ + a - a t ) ( + Ñ a , ,b derived most easily from the underlying Bayesian analysis o Ñ ) (This matrix is central to the construction of the confidence limits f ˆ p k ( )ç ˆ ÷ t a-a ˆ ö + ... = pˆ æ p ( ta) ( - + (1.8) a, b, t ) = p (ˆ aÑ ö , Ñ ) pˆ (( a t ˆ= ) +ap æ ç ˆ-- a ˆa (atˆö,+ ) + k k ˆ÷ k ˆ a ˆ k k ˆ ( t ) ç ( ÷ + ... ) b b a+ -a b ˆÑ ( ) ( ) k ˆ a ˆ k = ˆ p æ è t ö ø , Ñ pˆ t + (1.8) ... 
b - b b =p ˆ ( t ) + ( Ñ , Ñ ) p 1.1.2 Bayesian interpretation ç ˆ÷k è ˆ aø ˆ b k 1.1.3 The 2 ˆ Estimation of the confidence intervals b- b ˆ (t ) + (Ñ , Ñ ˆ K k è b - b -2 regression parameters { ø ˆ a b è a,b k ø =p } are regarded as rando Denoting mean values by angled brackets, the Denoting mean values by angled brackets, the variance of p is thereby approximated as Denoting mean values by angled brackets, the variance of k p is thereby a k ˆ a Denoting mean values by The logistic regression functions angled brackets, the variance of pk(is is proportional to the function exp(- t) can be approximated as a no thereby S(a,b)). The maximum li k Denoting mean values by angled brackets, the variance of Denoting mean values by angled brackets, the variance of approximated as random variable pk is thereby approximated as Np ˆ k tæ a ,s - 2 t distribution function, the minimum of the function S, is obta kaˆ ö æ a - a( ( ) ( )) pk is thereby approximated as by expanding ˆö T pk about its maxim a s k ( t ) º ( pk ( a, b, t ) - p ˆ k ( t ) ) = ( Ñaˆ p ) Denoting mean values by angled brackets, the varian ( ) æ= 2 ( ) ( ,)t,)Ñ ˆk ( t) ( ) 2 2 ˆk s ( t2) , ( Ñ t ) ˆº k ˆ( p ( a ˆ p ( t ) ç b t ) ÷ çˆ p s ( 2 t )t) 2 ÷ º ´( p pˆa ,(b t -p p ˆ ( t ) ) line) pˆ ( of the (2 t ) = k p Kb a-2)-dimensional probability distribution function a ˆ , b , t k æ è , b a a , - ˆ b ˆö - øæ b èa a k k - bˆ ˆö ø T = Ñ k ˆ a k æ a (1.9) -a ˆ ˆ öæ a - a b k ˆ ö èb T ç = ( Ñ(2 )) Ñ k s k2 ( t ) º ( pk ( as )) ( ) 2 ,b 2, t ) - p k (T t) ºˆ k ( tp k ( a, b, ta )( ˆpˆ-K (p ˆt ) ,(Ñ k t) ˆpˆ2k ( t = ça pˆk ( t ) ÷ ç, Ñ k -2)-dimensional normal distribution whose mean is the m ˆ øèb T pˆˆø)÷ ( t ) ´ ç ÷ ç ÷ ´ ( ˆ)ø è= ( b b b b b ˆp ( Ñaˆ p ˆ k ( t ) ) = ( Ñaˆ p ) aˆPˆk ( -p ) - ˆ ˆ k T ˆ k ( t ) , Ñbˆ p ˆ k ( t ) , Ñbˆ amounts to expanding the function ˆ k ( t() ˆ k (st)è2 ( ) ˆ k(( ) b -( b) b - b () ) p t )aˆ,p ) º ( paˆk (ka, b, tˆ)bˆ -k p ˆ t Ñ p t , Ñ ˆ p t (1.9) )ˆt )ˆ ( = ˆ) in a Taylor 
series a Ñ ˆø ( tˆ , ) ( ) T (Ñ ,( Ñ ( 2 Ñ p Ñ ˆ p (t t ) = Ñ pˆ t ) , S Ñ( a è ,pˆ b ( ˆk ( t ) P Ñ pˆ ˆ kÑ ˆ(t ˆp ,Ñ k ( a, b, tT) = pk a + a - a, b + b - b, t ˆ a k ˆ a k ˆ kk p bˆ k b b a a k ( Ñaˆ p ˆ k ( t ) ) = ( Ñaˆ p T ( t ) ) P ( Ñ p )) ( ( ))) ( T ˆ k ( t ) , Ñbˆ p ˆ k ( t ) , Ñbˆ differences ˆ p (aˆaˆ- k (at ),, Ñ ˆ( t about the maximum likelihood e ˆ kb p ( ( )) ( T) b ( ) ) ˆ ( T ˆk t ,Ñ ˆ p Ñaˆ p ˆ () ˆ k t K=3 this equation can be written as the 4- ˆ k t , Ñ ˆWhen ˆk t ,Ñ ˆ pˆk t bbˆ- () = Ñaˆ p p P Ñaˆ p k t K=3 this equation can be written as the 4-dimensional inner product k When K=3 this equation can be written as the 4-dimensional inner product When b æa b ( Ñaˆ p When K=3 this equation can be written as the 4-dimensional inner product ˆ k ( t ) , Ñbˆ = p ˆ(kt ) ˆ kp () =((Ñ t) + Ña ˆ, p ˆa ˆ k bˆ()t ) Ñ ˆ,kÑ p ( tˆ) pˆçk ( 3 b ( k ( a, b; ti ) - µki ) èb 2 k = K -1 i = N -1 p When K=3 this equation can be written as the 4-dimensional inner product 2 å inner å product 2 3 When K=3 this equation can be written as the When S ( a, b ) = 1 4-dimensional K=3 this equation can be written as the 4-dimens Denoting mean values by angled brackets, the variance of k =0 i =0 ki pk is the s ( ) ( ˆ, b ºS a ˆ + 1 a-a 2 æ)¶p ( t ˆ P -1 a - a ˆ, b - b ˆ () öæ ¶p ˆ + ˆ, b - b ˆ (t ) ö ) k ç ˆˆ ÷ç ÷ k ( ) ¶¶(a aˆ 2 (t ) º S p ¶ 2 s » S (a ( ) ˆ, b ) + å ( a - a ˆ ( ) ( ) + (t )ù( 2 p a ˆ , b, t - p t ç = Ñ ˆ ÷ t , ˆ ) Ñ ÷pˆ (t ) ÷ P å k k ˆ )P ç a - 1 a k 1 ˆ a k 1 ˆ k1 ˆtP ˆP b éP Pé P P P P P é ù 綶 aˆ ¶( ˆP p a ÷ç ¶p 2 i i aa11 i k ab12 aa12 j j ab11 2 ab1 öê aa11 i , j aa12 aa11 aa12 ab11 ab12 k ( ) ( t )t ö ab11 ç ÷ç P ÷ P j k ( t )T æ ¶p ˆ t( ¶êˆP p ¶êˆP p ¶p ˆk ( t )P úP Paa ú æ ¶pˆ ( t ) æ ¶ ˆ p ( t ) 2¶p (ˆ ) ( t ) ¶ ˆ kp t ) ö ¶ ˆ kp ( ) P P P P ê ¶Paˆ 222 ÷ç ¶a ˆab ( ç )) ( ( )( ( ) aa 21 22 21 ab 2 s t = ˆ 2 ÷ s k ( t ) = çs k ( t ) = ç k ç ( ê ) ê ( ) ¶ 2 S ÷ ( ú ) ú ( ) å( 2 2k k Ñk pˆ t aak21 , Ñ ˆ p aa 22 t aa 21 ab 21 = aa Ñ 22 ab 22 ˆ p ab 21 t 
, Ñ ab ˆ p t P Ñ pˆ ÷ ˆ ˆ ) å k ˆ ÷ k ¶b ˆ ˆ è ¶a ˆ1 ˆaˆ k ¶a Pˆ 1b P P ¶b2 ˆ k êP çba ˆP k÷12 ç ˆ kP 11 ÷ ˆaˆbb P t) j( 12 è ¶a ˆ1 è ¶a ˆ2 ¶a bˆ ¶b2 øê+ 12 ˆ b ¶b bba P ø êiP- Pa bbP øa j - ú11 p ¶11a ˆˆ P b +ú ba 1 ¶ p ( bbb t i - bi ) 1 1 12 1 2 ba 11 ê i, j ê 2 12 ba11 i bb11 ¶ bˆ ba 12 ¶ a ˆ 12 bbê ç k bb úPba 21 ˆ Pba ÷ 2 ç ú 22 i , j P ÷ ë Pba 21 Pba ëP P 21 ba 21 bbP i P ba 22 bbP j ë û21ç ¶ P b 1 22 û÷ç ¶b ˆbb 21 ÷ Pbb 2 When K=3 this equation can be written as the 4-dimensional inner pr 22 22 bb bb 1 ç ÷ç ÷ The (2K-2)-dimensional covariance matrix ¶ ç kP pˆ ( is the inverse of t ) ÷ç ¶p ˆ k (t ) ÷ ç ¶b ˆ ÷ç ¶b ÷ This matrix is central to the construction of the confidence l è 2 øè ˆ2 ø (1.10) (1.10) (1.10) 1.1.3 Estimation of the confidence intervals where Pcdij º ( where c P - ˆ c ) º d (c-where where i cdij i j i j i jˆc d - ˆ ) . The 95% confidence interval for d Pcdijˆ º . The 95% confidence interval for -( d j (ci - c)(ˆ )d j -d The i95% ) ˆ . The 95% confidence interval for pk(t) is centred g j confidence k(t) is centred given as pinterval The logistic regression functions ( pk(t) can be approximated a pk(t) is centred given as for is ) [k ˆ (t ) - 1.96 p [pk (tk) - 1.96pˆ (t )[ k) + ]1.96s k (t )]. s ( t ) ˆ s (t ), p (t ) + 1.96ks (t ) . k random variable centred given as - 1.96 k s k (t ), pk (t , pk (t ) + 1.96 s (t ) k N p . 
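The variance formula (1.9) and the resulting 95% interval can be illustrated with a short Python sketch. It assumes a multinomial logistic form for the trend lines, with category 0 as the reference and p_k(t) proportional to exp(a_k + b_k t); that form is not restated in this section, so treat it as an assumption. The parameter estimates and the covariance matrix P below are hypothetical values chosen only to make the example run, and the function names are illustrative rather than part of the model's software.

```python
import math


def probabilities(a, b, t):
    """Return p_0(t)..p_{K-1}(t) for an assumed multinomial logistic trend.

    a and b hold the K-1 free intercepts and slopes; category 0 is the
    reference category with a_0 = b_0 = 0 (an assumption of this sketch).
    """
    z = [0.0] + [ai + bi * t for ai, bi in zip(a, b)]
    zmax = max(z)  # subtract the maximum for numerical stability
    w = [math.exp(v - zmax) for v in z]
    total = sum(w)
    return [v / total for v in w]


def gradient(a, b, t, k):
    """Gradient of p_k(t) with respect to (a_1..a_{K-1}, b_1..b_{K-1})."""
    p = probabilities(a, b, t)
    # d p_k / d a_j = p_k (delta_kj - p_j); d p_k / d b_j = t * d p_k / d a_j
    da = [p[k] * ((1.0 if j == k else 0.0) - p[j]) for j in range(1, len(p))]
    return da + [t * g for g in da]


def confidence_interval(a_hat, b_hat, P, t, k, z=1.96):
    """Equation (1.9): sigma_k^2 = g P g^T, then p_hat -/+ z * sigma_k."""
    p_hat = probabilities(a_hat, b_hat, t)[k]
    g = gradient(a_hat, b_hat, t, k)
    n = len(g)
    var = sum(g[i] * P[i][j] * g[j] for i in range(n) for j in range(n))
    sigma = math.sqrt(max(var, 0.0))  # guard against tiny negative rounding
    return p_hat - z * sigma, p_hat, p_hat + z * sigma


# Hypothetical K = 3 example; parameter order (a1, a2, b1, b2) as in (1.10).
a_hat = [0.2, -0.5]
b_hat = [-0.03, 0.01]
P = [
    [0.0100, 0.0000, -0.0005, 0.0000],
    [0.0000, 0.0100, 0.0000, -0.0005],
    [-0.0005, 0.0000, 0.0001, 0.0000],
    [0.0000, -0.0005, 0.0000, 0.0001],
]
for year in (0.0, 10.0, 20.0):
    lo, mid, hi = confidence_interval(a_hat, b_hat, P, year, k=1)
    print(f"t={year:4.0f}  p_1={mid:.4f}  95% CI=[{lo:.4f}, {hi:.4f}]")
```

Because the gradient is evaluated at each time point, the interval widens or narrows with t even though P itself is fixed.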
1.2 Module Two: Microsimulation

1.2.1 Microsimulation initialization: Birth, disease and death models

Simulated people are generated with the correct demographic statistics in the simulation's start-year. In this year, women are stochastically allocated the number and years of birth of their children; these are generated from known fertility and mother's age at birth statistics (valid in the start-year). If a woman has children, then those children are generated as members of the simulation in the appropriate birth year.

The microsimulation is provided with a list of relevant diseases. These diseases use the best available incidence, mortality, survival, relative risk and prevalence statistics (by age and sex). Individuals in the model are simulated from their year of birth (which may be before the start year of the simulation). In the course of their lives, simulated people can die from one of the diseases caused by smoking that they might have acquired or from some other cause. The probability that a person of a given age and sex dies from a cause other than the modeled diseases is calculated in terms of known death and disease statistics valid in the start year. It is constant over the course of the simulation. The survival rates from tobacco-related diseases will change as a consequence of the changing distribution of smoking level in the population.

The microsimulation incorporates a sophisticated economic module.
The module employs Markov-type simulation of long-term health benefits, health care costs, and cost-effectiveness of specified interventions. It synthesizes and estimates evidence on cost-effectiveness analysis and cost-utility analysis. The model can be used to project the differences in quality-adjusted life years (QALYs), direct and indirect lifetime health-care costs, and, as a consequence of interventions, incremental cost effectiveness ratios (ICERs) over a specified time scale. Outputs can be discounted for any specific discount rate. This section provides an overview of the initialization of the microsimulation model and will be expanded upon in the next sections.

1.2.2 Population models

Populations are implemented as instances of the TPopulation C++ class. The TPopulation class is created from a population (*.ppl) file. Usually a simulation will use only one population, but it can simultaneously process multiple populations (for example, different ethnicities within a national population).

1.2.2.1 Population Editor

The Population Editor allows editing and testing of TPopulation objects.

The population is created in the start year and propagated forwards in time by allowing females to give birth. An example population pyramid which can be used when initializing the model is shown in Figure 1. It shows the 2015 population distribution in Ukraine used in the initialization of the model.

Figure 1: Population Pyramid in 2015 in Ukraine

Age group    Males (20.8m)    Females (24.1m)
90+           0.3%             0.8%
80-89         2.2%             4.8%
70-79         5.8%             9.7%
60-69        10.5%            13.0%
50-59        14.2%            14.6%
40-49        14.2%            13.1%
30-39        17.1%            14.6%
20-29        14.9%            12.3%
10-19         9.9%             8.1%
0-9           9.9%             8.8%

People within the model can die from specific diseases or from other causes. A disease file is created within the program to represent deaths from other causes. The following distributions are required by the Population Editor (Table 1).
Table 1: Summary of the Parameters Representing the Distribution Component

Distribution name      Symbol             Note
MalesByAgeByYear       $p_m(a)$           Input in year 0: probability of a male having age a
FemalesByAgeByYear     $p_f(a)$           Input in year 0: probability of a female having age a
BirthsByAgeofMother    $p_b(a)$           Input in year 0: conditional probability of a birth at age a, given that the mother gives birth
NumberOfBirths         $p_\lambda(n)$     $\lambda \equiv$ TFR, Poisson distribution, probability of giving birth to n children

1.2.2.2 Birth model

Any female in the childbearing years {AgeAtChild.lo, AgeAtChild.hi} is deemed capable of giving birth. The number of children, n, that she has in her life is dictated by the Poisson distribution $p_\lambda(n)$, where the mean of the Poisson distribution is the Total Fertility Rate (TFR) parameter.²

The probability that a mother (who does give birth) gives birth to a child at age a is determined from the BirthsByAgeOfMother distribution as $p_b(a)$. For any particular mother, the births of multiple children are treated as independent events, so that the probability that a mother who produces N children produces n of them at age a is given as the Binomially distributed variable,

$$
p_b(n \text{ at } a \mid N) = \frac{N!}{n!\,(N-n)!}\,\left(p_b(a)\right)^{n}\left(1-p_b(a)\right)^{N-n}
\tag{1.11}
$$

The probability that the mother gives birth to n children at age a is

$$
p_b(n \text{ at } a) = e^{-\lambda}\sum_{N=n}^{\infty}\frac{\lambda^{N}}{N!}\,p_b(n \text{ at } a \mid N)
= e^{-\lambda}\sum_{N=n}^{\infty}\frac{\lambda^{N}}{N!}\,\frac{N!}{n!\,(N-n)!}\,\left(p_b(a)\right)^{n}\left(1-p_b(a)\right)^{N-n}
\tag{1.12}
$$

Performing the summation in this equation gives the simplifying result that the probability $p_b(n \text{ at } a)$ is itself Poisson distributed with mean parameter $\lambda p_b(a)$,

$$
p_b(n \text{ at } a) = e^{-\lambda p_b(a)}\,\frac{\left(\lambda p_b(a)\right)^{n}}{n!} \equiv p_{\lambda p_b(a)}(n)
\tag{1.13}
$$

Thus, on average, a mother at age a will produce $\lambda p_b(a)$ children in that year.
0 will produce 4#2 0 children in that y Thus, on average, a mother at age 3 Thus, of the children is determined by the probability on average, pmale a mother at age a will is itself Poisson distributed with mean parameter produce =1-pfemale. In the baseline model this The gender of the children 4#3 0 , children in that year. 2 is determined by the probability p male=1-pfe 3 The gender of the children is determined by the probability Thus, on average, a mother at age pmale=1-pfemale 0 will produce Thus, on average, a mother at age 4# 3 . In the baseline model this 0 children in that year. 0 will produce 4#2 0 children in that y The gender of the children3 is determined by the probability pmale=1-pfemale The gender of the children 2 . In the baseline model this is determined by the probability pmale=1-pfemale e the probability Nm/(Nm+N ). is taken to be the probability f Nm/(Nm+Nf). is taken to be the probability Nm/( nNm+Nf). is taken to be the probability Nm /(Nm +Nf). 3 is The The gender of the children gender of the children is taken to be the probability 3 is determined by the probability determined by - l pthe The gender of the children (a N ( a ) probability)m/(N pm pb ( n at a ) = e b +Nf). ( l pb3 is determined by the probability male=1-pfemale. In the baseline model thi p =1-p) = pl pb ( a ) ( n ) male female (1.13) on Editor menu item Population Editor\Tools\Births\show random birthList creates an The Population Editor menu item Population Editor\Tools\Births\show random birthList creates an In the is taken to be the probability baseline model this is is taken to be the probability taken N m /( toN be m + Nthe The Population Editor menu item Population Editor\Tools\Births\show f ). probability n ! N /( N + N ). 
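The step from equation (1.12) to equation (1.13) can be checked numerically. The sketch below is illustrative only and is not part of the model's software: the function names, the TFR value of 1.5, and the age-specific birth probability of 0.08 are assumptions chosen for the example, and the infinite sum in (1.12) is truncated at a hypothetical upper limit `n_max`.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of n events with the given mean (mean > 0)."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))


def births_at_age_by_summation(n, tfr, p_b, n_max=80):
    """Equation (1.12): sum over lifetime family size N of
    Poisson(N; tfr) * Binomial(n; N, p_b), truncated at n_max (assumption)."""
    total = 0.0
    for family_size in range(n, n_max + 1):
        binomial = (math.comb(family_size, n)
                    * p_b ** n
                    * (1.0 - p_b) ** (family_size - n))
        total += poisson_pmf(family_size, tfr) * binomial
    return total


def births_at_age_closed_form(n, tfr, p_b):
    """Equation (1.13): Poisson distribution with mean tfr * p_b."""
    return poisson_pmf(n, tfr * p_b)


if __name__ == "__main__":
    tfr, p_b = 1.5, 0.08  # hypothetical values, for illustration only
    for n in range(4):
        print(n,
              births_at_age_by_summation(n, tfr, p_b),
              births_at_age_closed_form(n, tfr, p_b))
```

The two columns printed for each n should agree to within floating-point error, which is the content of the simplification in (1.13).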
The Population Editor menu item Population Editor\Tools\Births\show random birthList creates an instance of the TPopulation class and uses it to generate and list a (selectable) sample of mothers and the years in which they give birth.

1.2.2.3 Deaths from modeled diseases

The simulation models any number of specified diseases, some of which may be fatal. In the start year, the simulation’s death model uses the diseases’ own mortality statistics to adjust the probabilities of death by age and gender. In the start year, the net effect is to maintain the same probability of death by age and gender as before; in subsequent years, however, the rates at which people die from modeled diseases will change as modeled risk factors change. The population dynamics sketched above will be only an approximation to the simulated population’s dynamics. The latter will be known only on completion of the simulation.

² This could be made to be time dependent; in the baseline model it is constant.
³ The probability of child gender can be made time dependent.

1.2.3 The risk factor model

The distribution of risk factors (RF) in the population is estimated using regression analysis stratified by both sex S = {male, female} and age group A = {0-9, 10-19, ..., 70-79, 80+}. The fitted trends are extrapolated to forecast the distribution of each RF category in the future. For each sex-and-age-group stratum, the set of cross-sectional, time-dependent, discrete distributions $D = \{p_k(t) \mid k = 1, \ldots, N;\ t > 0\}$ is used to manufacture RF trends for individual members of the population.

We model different risk factors, some of which are continuous (such as BMI) and some are categorical (smoking status).

1.2.3.1 Categorical risk factors

Smoking is the categorical risk factor.
Each individual in the population may belong to one of the three possible smoking categories {never smoked, ex-smoker, smoker} with their probabilities {p0, p1, p2}. These states are updated on receipt of the information that the person is either a smoker or a non-smoker. They will be a never smoker or an ex-smoker depending on their original state (an ex-smoker can never become a never smoker).

The complete set of longitudinal smoking trajectories and the probabilities of their happening is generated for the simulation years by allowing all possible transitions between smoking categories:

{never smoked} → {never smoked, smoker}
{ex-smoker} → {ex-smoker, smoker}
{smoker} → {ex-smoker, smoker}

When the probability of being a smoker is p the allowed transitions are summarized in the state update equation:

$$\begin{bmatrix} p_0' \\ p_1' \\ p_2' \end{bmatrix} = \begin{bmatrix} 1-p & 0 & 0 \\ 0 & 1-p & 1-p \\ p & p & p \end{bmatrix} \begin{bmatrix} p_0 \\ p_1 \\ p_2 \end{bmatrix} \tag{1.14}$$

After the final simulation year, the smoking trajectories are completed until the person’s maximum possible age of 110 by supposing that their smoking state stays fixed. The life expectancy calculation will consist in summing over the probability of being alive in each possible year of life.

In the initial year of the simulation, a person may be in one of the three smoking categories; after N updates there will be $3 \times 2^{N}$ possible trajectories. These trajectories will each have a calculated probability of occurring; the sum of these probabilities is 1.

In each year the probability of being a smoker or a non-smoker will depend on the forecast smoking scenario, which provides exactly that information. Note that these states are two-dimensional and cross-sectional {non-smoking, smoking}, and they are turned into three-dimensional states {never smoked, ex-smoker, smoker} as described above. The time evolution of the three-dimensional states are the smoking trajectories necessary for the computation of disease-table disease and death probabilities.

1.2.3.2 Smoking

The microsimulation framework applied to smoking enables us to measure the future health impact of changes in rates of tobacco consumption. This includes the impact of giving up smoking on the following diseases: i) Chronic obstructive pulmonary disease (COPD), ii) Coronary heart disease (or Myocardial Infarction if CHD data are not available), iii) stroke, and iv) lung cancer. In the simulation, each person is categorized into one of the three smoking groups: Smokers, ex-smokers, and people who have never smoked. Their initial distribution is based on the distribution of smokers, ex-smokers and never smokers from published data.

During the simulation, a person may change smoking states, and their relative risk will change accordingly. Relative risks associated with smokers and people who have never smoked have been collected from published data.
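The way a person changes smoking state can be illustrated with a small sketch of the state update in equation (1.14) and of the trajectory count of 3 × 2^N from Section 1.2.3.1. This is a minimal illustration under stated assumptions, not the model's own code: the function names, the initial category probabilities, and the yearly smoking probabilities are all hypothetical.

```python
from itertools import product

NEVER, EX, SMOKER = 0, 1, 2  # indices of {never smoked, ex-smoker, smoker}


def update_state(probs, p_smoker):
    """One application of the state update matrix in equation (1.14)."""
    p0, p1, p2 = probs
    q = 1.0 - p_smoker
    return (q * p0, q * (p1 + p2), p_smoker * (p0 + p1 + p2))


def next_category(category, is_smoker):
    """Turn the two-state information {non-smoking, smoking} into one of the
    three categories; an ex-smoker can never become a never smoker."""
    if is_smoker:
        return SMOKER
    return NEVER if category == NEVER else EX


def enumerate_trajectories(initial_probs, p_smoker_by_year):
    """List every trajectory with its probability: 3 starting categories
    times 2 outcomes per yearly update."""
    trajectories = []
    for start in (NEVER, EX, SMOKER):
        for flags in product((False, True), repeat=len(p_smoker_by_year)):
            path, prob = [start], initial_probs[start]
            for is_smoker, p in zip(flags, p_smoker_by_year):
                path.append(next_category(path[-1], is_smoker))
                prob *= p if is_smoker else 1.0 - p
            trajectories.append((tuple(path), prob))
    return trajectories


if __name__ == "__main__":
    initial = (0.5, 0.2, 0.3)          # hypothetical {p0, p1, p2}
    scenario = [0.30, 0.28, 0.25]      # hypothetical smoking prevalence per year
    paths = enumerate_trajectories(initial, scenario)
    print(len(paths), sum(prob for _, prob in paths))
```

Summing the trajectory probabilities by final category gives the same three numbers as applying `update_state` once per year, which is a useful consistency check on any implementation of (1.14).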
The relative risks associated with ex-smokers (RRex-smoker) are related to the relative risk of smokers (RRsmoker). The ex-smoker relative risks are assumed to decrease over time with the number of years since smoking cessation (Tcessation). These relative risks are computed in the model using equations (1.15) and (1.16) (1).

$$RR_{ex\text{-}smoker}(A, S, T_{cessation}) = 1 + \bigl(RR_{smoker}(A, S) - 1\bigr)\exp\bigl(-\gamma(A)\,T_{cessation}\bigr) \tag{1.15}$$

$$\gamma(A) = \gamma_0 \exp(-\eta A) \tag{1.16}$$

where γ is the regression coefficient of time dependency. The constants γ0 and η are the intercept and the regression coefficient of age dependency, respectively, which are related to the specified disease (Table 2).

Table 2: Parameter Estimates for γ0 and η Related to Each Disease (1)

Disease        γ0        η
AMI            0.24228   0.05822
Stroke         0.31947   0.01648
COPD           0.20333   0.03087
Lung cancer    0.15637   0.02065

However, a minimum exists when the cessation time is equal to η⁻¹. The minimum value was calculated by the method detailed below (equations (1.17), (1.18) and (1.19)), where time t is equal to the age A of an individual.

$$r_{Exsmk}(t) = 1 + (r_{smk} - 1)\,f(t) \tag{1.17}$$

$$f(t) = \exp\bigl(-\gamma_0 (t - t_0)\exp(-\eta t)\bigr) \;\Rightarrow\; f'(t) = -\gamma_0\, f(t)\, e^{-\eta t}\bigl(-\eta (t - t_0) + 1\bigr) \tag{1.18}$$

The function f(t) has the following properties:

$$\begin{aligned} f(t_0) &= 1 \\ f'(t_0) &= -\gamma_0 e^{-\eta t_0} \\ f(t) &\text{ has a minimum at } t = t_0 + \eta^{-1} \\ f(\infty) &= 1 \end{aligned} \tag{1.19}$$

In order to keep the RRex-smoker from increasing, the cessation time was set equal to η⁻¹ when the cessation time was greater than η⁻¹ (see equation (1.20)).
1.2.6.6.3 Survival model 2

The model uses three parameters {p1, R, R>5}.

Given the 1-year survival probability $p_{survival}(1)$ and the 5-year survival probability $p_{survival}(5)$:

$$\begin{aligned} p_1 &= 1 - p_{survival}(1) \\ R &= -\frac{1}{4}\ln\!\left(\frac{p_{survival}(5)}{p_{survival}(1)}\right) \\ R_{>5} &= -\frac{1}{5}\ln\!\left(\frac{p_{survival}(10)}{p_{survival}(5)}\right) \end{aligned} \tag{1.36}$$

1.2.6.7 Approximating single-state disease survival data from mortality and prevalence

An example is provided here with a standard life-table analysis for a disease d. Consider the 4 following states:

state   Description
0       alive without disease d
1       alive with disease d
2       dead from disease d
3       dead from another disease

$p_{ik}$ is the probability of disease d incidence, aged k
$p_{\omega k}$ is the probability of dying from the disease d, aged k
$p_{\bar{\omega} k}$ is the probability of dying other than from disease d, aged k

The state transition matrix is constructed as follows
+ 1 ) ú=ê (1 - êêp ê pp p w1 () 11( k k k+k p + +1 ik 1 1 ) ) úú(1 - ú =êê ê wk(1 p ( -1p 1-- -wkp)( p pww w 1 k) p kk ) - pp p a ik ik ik k ) 0 ( (11 0 1ú-- -êpp p p ( w k 1wk - w ) kk - - p ú p pww wkk )( )(1 1 1- k(1.37) - -p p paa a k) kk ) 0 0 0 0 ê p2 ( k + 1) ú ê 0 ( k + 1) ú ê == pwk 0 1 0ú ê p2 ( k ) ú p ê ú ê êê êpp p 2( k 22 k+ +1 1) úú êê 00 úê úp pw kk 1 1 1 0 ë p3 ( k + 1) û ë êê ú 0 1û ë p3 ( k ) û w w k êp úú êê ê pwk p ( k + 1) wk 3(( ))ûû p p 0 ëp ëë p33 kk+ +11 û ëë ë p pww wkk k p pww wkk k 0 0 1 It is worth noting that the separate columns correctly sum to unity. It is worth noting that the separate columns correctly sum to unity. worth noting that the separate columns correctly sum to unity. It isIt is worth noting that the separate columns correctly sum to unity. It is worth noting that the separate columns correctly sum to unity. The disease mortality equation is that for state-2, The disease mortality equation is that for state-2, 14 The disease mortality equation is that for state-2, The disease mortality equation is that for state-2, p2 ( k + 1) = pwk p2 + 1) = p )k p1 ( k ) + p2 ( k ) 1 ( k ) + p2 ( kw (1.38) (1.38 p2 ( k + 1) = pwk p1 ( k ) + p2 ( k ) (1.38) The probability of dying from the disease in the age interval [ The probability of dying from the disease in the The probability of dying from the disease in the age interval [ k, k+1] is k, pwk p1[k, age interval k k ( ) is pwk p1 k - this is otherwise +1] is k+1] - this is otherwise ( ) p k p1 (k ) this is otherwise the (cross-sectional) the (cross-sectional) disease mortality, the (cross-sectional) disease mortality, disease The probability of dying from the disease in the age interval [ kmortality, pmor(k). p1 , k p (k). p1(k) is otherwise (k) is otherwise known as the disease prevalence, +1] is - this is otherwise pmor(k). p1(k) is otherwise known as the disease prevalence, wmor ppre(k). Hence the relation known pmor(k). p1(kp as the disease prevalence, ppre(k). 
Hence the relation the (cross-sectional) disease mortality, ) is otherwise known as the disease prevalence, pre (k). Hence the relation ppre(k). Hence the relation p (k ) pmor ( k ) pwk = mor pwk = (1.39) (1.39 pmor ( k ) p pre ( k ) pwk = pre (1.39) p pre ( k ) For exponential survival probabilities, the probability of dying from the disease in the age-interval For exponential survival probabilities, the probability of dying from the disease in the age-interval [ k+1] is denoted [k, p k, k+1] is denoted pWk and is given by the formula Wk and is given by the formula For exponential survival probabilities, the probability of dying from the disease in the age-interval [k, k+1] is denoted pWk and is given by the formula pwk = 1 - e- Rk p Þ 1k-= wk =R e- Rk - ln (1 Þ R - pw ) kk = - ln (1 - pwk ) (1.40) (1.40 48 pwk = 1 - e - Rk Þ Rk = - ln (1 - pwk ) (1.40) When, as is the case for most cancers, these survival probabilities are known, the microsimulation When, as is the case for most cancers, these survival probabilities are known, the microsimulation will use them. When they are not known or are too old to be any longer of any use, the will use them. When they are not known or are too old to be any longer of any use, the When, as is the case for most cancers, these survival probabilities are known, the microsimulation microsimulation uses survival statistics inferred from the prevalence and mortality statistics microsimulation uses survival statistics inferred from the prevalence and mortality statistics will use them. 
When they are not known or are too old to be any longer of any use, the p ( k + 1) = p p ( k ) + p ( k ) The probability of dying from the disease in the ag (1.38) 2 wk 1 The probability of dying from the disease in the age interval [ k2, k+1] is pwk p1 k - this is otherwise The probability of dying from the disease in the age interval [ ( ) k, k+1] is pwk p1 k - this ( ) The probability of dying from the disease in the age interval [ bility of dying from the disease in the age interval [ the (cross-sectional) disease mortality, k, k+1] is pwk p1 k k, k ( ) +1] is pwk p1 k - this is otherwise mor ( ) p (k). p (k) is otherwise known as the disease p - this is otherwise the (cross-sectional) disease mortality, p (k). p (k) is otherwise known as the disease prevalence, the (cross-sectional) disease mortality, 1 pmor(k). p1(k mor 1 the (cross-sectional) disease mortality, sectional) disease mortality, pprep (kmor pmor(k). p1(p ). Hence the relation kpre (k). p1(k) is otherwise known as the disease prevalence, (k). Hence the relation Modeling the Long-Term ppreHealth (k). Hence the relation ) is otherwise known as the disease prevalence, and Cost Impacts of Reducing nce the relation The probability of dying from the disease in the age interval [k, k+1] is pw ppre(k). Hence the relation k p1 (k ) - this is otherwise Smoking Prevalence through Tobacco Taxation in Ukraine the (cross-sectional) disease mortality, pmor(p k). p pmor ( k ) p (k ) 1(k) is otherwise known as the disease prevalence, wk = pwk = mor (1.39) pwk = pmor ( k ) ppre(k). 
Hence the relation pmor ( k ) p pre ( k ) p pre ( k ) pwk = pwk = (1.39) (1.39) p pre ( k ) p pre ( k ) For exponential survival probabilities, the probability of dying from the disease in the age-interval For exponential survival probabilities, the probability of dying from the disease in the ag For exponential survival probabilities, p the pwk = mor ( k )probability of dying from the disease in For exponential survival probabilities, the probabi (1.39) thepage-interval [k, k+1] is denoted Wk and is given by the formula [k, [k, k+1] is denoted k+1] is denoted p p (k ) W k and is given by the formula and is given by the formula For exponential survival probabilities, the probability of dying from the disease in the age-interval ential survival probabilities, the probability of dying from the disease in the age-interval pre [k, k+1] is denoted pWk and is given by the formula denoted , W [kp k+1] is denoted k p and is given by the formula W k and is given by the formula p = 1 - e k Þ R = - ln (1 - R p- p= 1)- e- Rk Þ R = - ln ( 1 - p ) (1.40) For exponential survival probabilities, the probability of dying from the disease in the age-interval wk k wk wk k wk pwk = 1 - e- Rk Þ [k, k+1] is denoted pWk and is given by the formula 1 - e- Rk Þ pwk =Rk1-=e Rk --ln (1Þ pwkR)k = - ln (1 - pwk ) pwk = When, as is the case for most cancers, these survival probabilities are known, the microsimulation - When, as is the case for most cancers, these survival probabilities are known, the micro (1.40) (1.40) When, as is the case for most cancers, these survival probabilities are known, the When, as is the case for most cancers, these surviv will use them. When they are not known or are too old to be any longer of any use, the will use them. When they are not known or are too old to be any longer of any use, the pwk = 1 - e- Rk Þ Rk = - ln (1 - pwk ) (1.40) microsimulation will use them. When they are not known or are too old to be any will use them. 
When they are not known or are to When, as is the case for most cancers, these survival probabilities are known, the microsimulation s the case for most cancers, these survival probabilities are known, the microsimulation microsimulation uses survival statistics inferred from the prevalence and mortality statistics microsimulation uses survival statistics inferred from the prevalence and mortality stati longer of any use, (equation (1.39)). the microsimulation uses survival will use them. When they are not known or are too old to be any longer of any use, the (equation (1.39)). em. When they are not known or are too old to be any longer of any use, the statistics inferred from the microsimulation uses survival statistics inferred fro When, as is the case for most cancers, these survival probabilities are known, the microsimulation prevalence and mortality statistics (equation (1.39)). microsimulation uses survival statistics inferred from the prevalence and mortality statistics lation uses survival statistics inferred from the prevalence and mortality statistics (equation (1.39)). will use them. When they are not known or are too old to be any longer of any use, the An alternative derivation equation (1.39) is as follows. Let Nk be the number of people in the An alternative derivation equation (1.39) is as follows. Let Nk be the number of people i (equation (1.39)). microsimulation uses survival statistics inferred from the prevalence and mortality statistics (1.39)). population aged k, and let nk be the number of people in the population aged k with the disease. population aged k, and let nk be the number of people in the population aged k with the An alternative derivation equation (1.39) is as follows. (equation (1.39)). 
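The two routes to a disease death rate described above (known survival statistics, or inference from mortality and prevalence) can be sketched in a few lines of Python. This is a minimal illustration, not the model's C++ implementation: the function names are hypothetical, the numeric inputs are invented purely for demonstration, and the transition matrix follows equation (1.37) on the assumption that the state-1 diagonal entry is $1 - p_{\omega k} - p_{\bar{\omega} k}$, which is the value that makes every column sum to unity.

```python
import math


def survival_model_2_params(p_surv_1, p_surv_5, p_surv_10):
    """Equation (1.36): first-year death probability and the two rates."""
    p1 = 1.0 - p_surv_1
    rate = -math.log(p_surv_5 / p_surv_1) / 4.0        # R, from years 1 to 5
    rate_gt5 = -math.log(p_surv_10 / p_surv_5) / 5.0   # R_{>5}, from years 5 to 10
    return p1, rate, rate_gt5


def inferred_death_probability(p_mor, p_pre):
    """Equations (1.39)-(1.40): infer p_omega and R_k from mortality and prevalence."""
    if p_pre <= 0.0:
        raise ValueError("prevalence must be positive")
    p_omega = p_mor / p_pre
    if not 0.0 <= p_omega < 1.0:
        raise ValueError("mortality must be non-negative and below prevalence")
    return p_omega, -math.log(1.0 - p_omega)


def life_table_matrix(p_inc, p_omega, p_other):
    """Equation (1.37): one-year transition matrix over states 0..3."""
    return [
        [(1 - p_other) * (1 - p_inc), 0.0, 0.0, 0.0],
        [(1 - p_other) * p_inc, 1 - p_omega - p_other, 0.0, 0.0],
        [0.0, p_omega, 1.0, 0.0],
        [p_other, p_other, 0.0, 1.0],
    ]


def step(matrix, state):
    """Advance the state-probability vector by one year of age."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]


# Hypothetical inputs, chosen only to exercise the formulas.
p1, rate, rate_gt5 = survival_model_2_params(0.8, 0.5, 0.4)
p_omega, rate_k = inferred_death_probability(p_mor=0.002, p_pre=0.02)
T = life_table_matrix(p_inc=0.01, p_omega=p_omega, p_other=0.005)
column_sums = [sum(row[j] for row in T) for j in range(4)]
next_state = step(T, [0.98, 0.02, 0.0, 0.0])
```

With these inputs, the probability mass that enters state 2 in one step is $p_{\omega k}\,p_{pre}(k)$, which equals the assumed $p_{mor}(k)$; this is the identity behind equation (1.39).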
An alternative derivation of equation (1.39) is as follows. Let $N_k$ be the number of people in the population aged k, and let $n_k$ be the number of people in the population aged k with the disease. Then, the number of deaths from the disease of people aged k can be given in two ways: as $p_{\omega k}\,n_k$ and, equivalently, as $p_{mor}(k)\,N_k$. Observing that the disease prevalence is $n_k/N_k$ leads to the equation

$$
p_{\omega k}\,n_k = p_{mor}(k)\,N_k, \qquad p_{pre}(k) = \frac{n_k}{N_k}
\;\Rightarrow\;
p_{\omega k} = \frac{p_{mor}(k)}{p_{pre}(k)} \tag{1.41}
$$

1.2.6.8 Approximating multi-state disease survival data from incidence and mortality, assuming no remission

Disease mortality statistics give the probability that a person will die from the disease in a given year of life.
They make no reference to when the disease from which the person dies was contracted.

Disease survival statistics give the probability that a person will die from the disease in a given year of life, given that they contracted the disease in some earlier year.

The connection between the two is provided by an equation of the form

$$
p_{mor}(a) = \sum_{a_0, K_0} p_\omega(a \mid a_0, K_0)\, p_{inc}(a_0, K_0) \tag{1.42}
$$

where

$p_{inc}(a_0, K_0)$ is the probability of first getting the disease in state $K_0$ at age $a_0$, given no disease at age 0;
$p_\omega(a \mid a_0, K_0)$ is the probability of being dead (from the disease) at age a, given that the disease was contracted at age $a_0$ in state $K_0$.

1.2.6.8.3 Disease incidence

The probability that a person, who at age 0 does not have the disease, first gets the disease at age $a_0$ in state $K_0$ is given as

$$
p_{inc}(a_0, K_0) = \left[\prod_{a=0}^{a_0-1}\left(1 - \sum_{k=1}^{K} p_{0,k}(a)\right)\right] p_{0,K_0}(a_0) \tag{1.43}
$$

1.2.6.8.4 Disease survival

Once a person has the disease, they can possibly change disease stage, or they can die from the disease. (This analysis focuses only on the identified disease and does not allow for the possibility that they die from other causes.) Suppose they acquire the disease at age $a_0$ in stage $K_0$; then the initial state vector is determined from the initial conditions

$$
p_i(a_0) = \delta_{iK_0} \quad (K_0 > 0), \qquad p_\omega(a_0) = 0 .
$$

At subsequent ages, the state probabilities are given by the recursion equation

$$
\begin{pmatrix}
p_0(a+1 \mid a_0, K_0) \\
p_1(a+1 \mid a_0, K_0) \\
\vdots \\
p_{N-1}(a+1 \mid a_0, K_0) \\
p_\omega(a+1 \mid a_0, K_0)
\end{pmatrix}
= T(a, a_0)
\begin{pmatrix}
p_0(a \mid a_0, K_0) \\
p_1(a \mid a_0, K_0) \\
\vdots \\
p_{N-1}(a \mid a_0, K_0) \\
p_\omega(a \mid a_0, K_0)
\end{pmatrix}
$$

$$
T(a, a_0) \equiv
\begin{pmatrix}
1 - \sum_{k>0} p_{0,k} & 0 & \cdots & 0 & 0 \\
p_{0,1} & \left(1 - \sum_{k>1} p_{1,k}\right)\left(1 - p_\omega^{1}(a \mid a_0)\right) & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
p_{0,N-1} & p_{1,N-1}\left(1 - p_\omega^{1}(a \mid a_0)\right) & \cdots & 1 - p_\omega^{N-1}(a \mid a_0) & 0 \\
0 & p_\omega^{1}(a \mid a_0) & \cdots & p_\omega^{N-1}(a \mid a_0) & 1
\end{pmatrix}
\tag{1.44}
$$

Here $p_\omega^{K}(a \mid a_0)$ is the probability of dying from the disease in stage K at age a, given that the disease was contracted at age $a_0$ and that the person was alive at age a-1. For survival model 2,

$$
p_\omega^{K}(a \mid a_0) =
\begin{cases}
p_{\omega 0}^{K} & (a = a_0) \\
\ldots & (a_0 < a \le a_0 + 4) \\
\ldots
\end{cases}
\tag{1.45}
$$

[Figure 4: Multistage Disease Architecture. A Disease object holds DiseaseState objects (stage 1, stage 2), each with direct and indirect costs, and Risk objects for incidence (inc 01, inc 02, inc 12) and remission (rem 10, rem 20, rem 21), each carrying relative risks for bmi and smk. Legend: inc = incidence; rem = remission; pre = prevalence; mor = mortality; sur = survival; bmi = body mass index; smk = smoking; two-digit codes denote stage transitions, for example 01 = stage 0 to stage 1.]

Table 5: The C++ TDisease Class

| TDisease | Description |
|---|---|
| Data field | |
| name | Disease name |
| terminal | Boolean, true if the disease is terminal |
| state | Disease state {normal, severe, …} |
| *DataAvailability | Boolean array of data availability by risk type |
| **IncidenceRisk | Incidence rates by age, gender |
| ***SurvivalRisk | Survival rates by age, gender, state |
| **PrevalenceRisk | Prevalence rates by age, gender |
| ***RemissionRisk | Remission rates by age, gender, state |
| ***MortalityRisk | Mortality rates by age, gender, state |
| ***RelRisk | Relative risks by risk factor type, age, gender |
| … | … |
| Method | |
| TDisease(aFile) | Constructor using data from aFile |
| LoadFromFile(aFile) | Fills the data fields from aFile |
| WriteToFile(aFile) | Writes the data fields to aFile |
| GetRisk(state, medHistory, risktype) | Returns risk for specified risktype |
| … | … |
Processing is user-specified to be either random (Monte Carlo) or deterministic. The random option can process any specified population or cohort; the deterministic option processes only cohorts. In this context, a population is a specified number of males and females whose age distributions and risk factor distributions are input as appropriate tab-delimited text files; a cohort is a text file of individuals specifying, for each individual, their initial state and medical history. The user options and necessary data files are specified in the application’s simulation editor.

The user must also specify the set of diseases and the set of risk factors that are being simulated. Again, this is done via the appropriate application editor: the disease editor allows the construction and identification of a batch file of disease files; the simulation editor allows for the specification of the mix of risk factors and, where necessary, their distributions by age and gender. The simulation editor also provides the mechanism by which essential run parameters are specified: the start year, stop year, number of trials, and so on.

Individuals are processed one at a time from the simulation’s start year until they either die or reach the simulation’s stop year. In each simulated year they can contract any mix of the modeled diseases which they do not yet have, achieve remission from any disease or disease stage they might have, die from any terminal disease they might have, or die from other causes. (Other causes are modeled as a single, instantly fatal, terminal disease; its incidence probability is constructed via the disease editor from the modeled diseases’ mortality statistics and the appropriate national mortality statistics.)

Each run of the model requires the specification of a risk-factor scenario for each risk factor modeled.
These scenarios can simply maintain risk-factor distributions at their start year values, or they can allow for the modeling of risk factor trends or medical advances resulting in the reduction of disease incidence or improvements in the survivability of specified diseases. 2.3.3 Tscenario C++ class Scenarios are modeled as instances of the C++ Tscenario class and are constructed by the scenario editor, which is accessed via the simulation editor. Runs can be organized into batches, with different runs having different risk- factor scenarios. This allows for direct comparisons to be made – for example, what happens to life expectancy with or without improvements to the treatment of stroke. Scenarios are implemented as instances of the C++ class, Tscenario; an indication of its data fields and methods is provided in Table 6. The scenario objects are constructed from files that are created by the scenario editor. Much of the input data (disease data, mortality data, demographic data, etc.) is typically changed on an annual basis. Such changes are easily accommodated and logged via the input editors – the disease, distribution, and simulation editors. New diseases that are described by the current set of risk factors can be added to (or subtracted from) the simulation via the disease editor. The model has essentially only two external software dependencies: Its own C++ development environment and its host processor’s operating system. The configuration was chosen for ease of its maintainability. 59 Table 6: The C++ Tscenario Class Tscenario Data field scenarioType Type of scenario eg. {,smoking,… } start year Year at which scenario starts stop year Year at which scenario stops futureRiskFile File specifying future risk distribution targetAgeGroup Target age group eg. {18+} targetGenderGroup Target gender group eg. 
{males,females} … … Method Tscenario(aFile) Constructor using data from aFile LoadFromFile(aFile) Fills data fields from aFile … … 60 Modeling the Long-Term Health and Cost Impacts of Reducing Smoking Prevalence through Tobacco Taxation in Ukraine References 1. Hoogenveen RT, van Baal PH, Boshuizen HC, Feenstra TL. Dynamic effects of smoking cessation on disease incidence, mortality and quality of life: The role of time since cessation. Cost Eff Resour Alloc. 2008. 2. Gardner JW, Sanborn JS. Years of Potential Life Lost (YPLL)-What Does it Measure? Epidemiology. 1990;1(4):322-9. 3. Health and Social Care Information Centre. Indicator Specification: CCG OIS 1.1 Potential Years of Life lost (PYLL) from causes considered amenable to healthcare 2015. Available at: https://indicators.hscic.gov.uk/download/ Clinical%20Commissioning%20Group%20Indicators/Specification/CCG_1.1_ I00767_S.pdf. 4. Gold M, Siegel JE, Russell LB, Weinstein MC. Cost-effectiveness in Health and Medicine. Press OU, editor1996. 5. Menzin J, Marton JP, Menzin JA, Willke RJ, Woodward RM, Federico V. Lost productivity due to premature mortality in developed and emerging countries: An application to smoking cessation. BMC medical research methodology. 2012;12(1):1. 61 Appendix 2. 
Results of the TaXSiM model: Ukraine

Summary Cigarette Tax - Scenarios Output, 2015-2017

Scenario definitions:
Baseline situation: actual 2015.
2016: Ad valorem (12%), minimum specific (8.515 UAH), and simple specific (6.365 UAH).
Scenario 1 (2017): Ad valorem remains equal and 12% tax increase in minimum specific excise (9.54 UAH), and simple specific (7.13 UAH).
Scenario 2 (2017): Increase ad valorem tax (15%), and 30% increase in the minimum specific excise (11.08 UAH), and simple specific (8.28 UAH).
Scenario 3 (2017): Increase 30% the ad valorem, and 50% increase in the minimum specific excise (12.77 UAH), and simple specific (9.55 UAH).
Scenario 4 (2017): Increase ad valorem and specific tax (40%), adopting a simplified tax structure with uniform specific excise tax (11.92 UAH).

Values in parentheses are the expected contribution to GDP.

| Government revenue type | Baseline (actual 2015) | 2016 | Scenario 1 (2017) | Scenario 2 (2017) | Scenario 3 (2017) | Scenario 4 (2017) |
|---|---|---|---|---|---|---|
| Total cigarettes taxed (billion pieces) | 73.8 | 66.9 | 64.0 | 60.1 | 53.4 | 48.8 |
| Average cigarette price (UAH per pack) | 15.2 | 19.2 | 21.2 | 24.7 | 32.9 | 41.4 |
| Average cigarette price (US$ per pack) * | $0.63 | $0.81 | $0.87 | $1.01 | $1.35 | $1.69 |
| Average excise tax (UAH per 1000 pieces) | 308.9 | 430.7 | 482.6 | 573.0 | 825.8 | 1106.1 |
| Total excise tax revenue (billion UAH) | 22.8 (1.0%) | 28.8 (1.3%) | 30.9 (1.2%) | 34.4 (1.3%) | 44.1 (1.7%) | 54.0 (2.1%) |
| Total excise tax revenue (US$ billion) * | $0.94 | $1.21 | $1.27 | $1.41 | $1.81 | $2.21 |
| Additional tobacco excise (billion UAH) / percentage of GDP | | 6.0 (0.3%) | 2.1 (0.1%) | 5.6 (0.2%) | 15.3 (0.6%) | 25.1 (1.0%) |
| Additional tobacco excise (US$ million) * | | $254 | $85 | $230 | $626 | $1,030 |
| Total government revenue (excise, VAT and levies, billion UAH) | 34.9 (1.6%) | 42.1 (1.8%) | 45.0 (1.8%) | 49.9 (1.9%) | 61.9 (2.4%) | 73.9 (2.9%) |
| Total government revenue (excise, VAT and levies, US$ billion) * | $1.4 | $1.8 | $1.8 | $2.0 | $2.5 | $3.0 |
| Total expenditure on cigarettes (billion UAH) | 56.3 | 64.3 | 67.9 | 74.2 | 88.0 | 100.9 |
| Percentage change in total cigarette consumption (%) | | -9.3 | -4.3 | -10.2 | -20.2 | -27.1 |

* World Bank Group forecast: Annual average exchange rate = 2016 (1US$/23.8 UAH); 2017 (1US$/24.4 UAH)

Appendix 3. Adjustment of Epidemiological Input Data for the Microsimulation Model of the Health Impacts of Tobacco Taxation in Ukraine

The goal of this annex is to summarize research findings relevant to the microsimulation model, including data on the incidence of tobacco-use-related diseases, relative risks of their development in smokers and former smokers compared with never smokers, and related risk of premature death.

Below, diseases included in the model are considered and, for each of them, extracts are shown from studies related to incidence, relative risks, and mortality. Based on the published evidence and statistics available for Ukraine, estimates of incidence and relative risks were elaborated and suggested as inputs for the microsimulation model. As estimates from the Global Burden of Disease database became available during our research (Global Burden of Disease, 2016), we utilized incidence inputs from this database. Estimates for relative risks were used, as explained below.

3.1 Cardiovascular Diseases, Particularly Coronary Heart Disease (CHD)

3.1.1 Incidence

Because there are no proper estimates of CHD incidence in Ukraine, and most studies report only relative risks and no absolute risks, data were adapted from the seminal study on ischemic heart disease, the Framingham study (Castelli, 1984; Lerner & Kannel, 1986). The Framingham study reports average morbidity and mortality for men and women. It was calculated that the number of CHD cases is about five times greater than the number of CHD deaths. The standardized death rate (SDR) for CHD in Ukraine was taken for the years 2012-2014 and multiplied by this morbidity/mortality coefficient to estimate the approximate level of average incidence.
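The scaling step described above can be sketched in a few lines. This is only an illustration using the figures reported in Table 3.1; the variable and function names are hypothetical, and the assumption that the unrounded morbidity/mortality ratio was used (rather than the rounded 4.6 and 5.2) is inferred from the fact that it reproduces the tabulated Ukrainian morbidity values.

```python
# Framingham average CHD morbidity and mortality, as reported in Table 3.1
# (both columns are in the same units, so their ratio is unit-free).
FRAMINGHAM = {"males": (28.7, 6.2), "females": (14.5, 2.8)}

# Ukrainian CHD mortality per 100 000, as reported in Table 3.1.
UKRAINE_MORTALITY = {"males": 600, "females": 360}


def estimated_morbidity(sex):
    """Return (morbidity/mortality ratio, estimated Ukrainian morbidity per 100 000)."""
    morbidity, mortality = FRAMINGHAM[sex]
    ratio = morbidity / mortality
    # Assumption: the unrounded ratio is applied to Ukrainian mortality.
    return ratio, ratio * UKRAINE_MORTALITY[sex]


for sex in FRAMINGHAM:
    ratio, morbidity = estimated_morbidity(sex)
    print(f"{sex}: ratio {ratio:.1f}, estimated morbidity {morbidity:.0f} per 100 000")
```

With these inputs the sketch gives ratios of about 4.6 and 5.2 and morbidity estimates of about 2777 and 1864 per 100 000, in line with Table 3.1.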
Table 3.1: Calculation of Estimated CHD Morbidity for Ukraine (per 100 000)

| | Framingham morbidity | Framingham mortality | Morbidity/mortality ratio | Ukraine morbidity | Ukraine mortality |
|---|---|---|---|---|---|
| Males | 28.7 | 6.2 | 4.6 | 2777 | 600 |
| Females | 14.5 | 2.8 | 5.2 | 1864 | 360 |

Subsequently, from the graphs included in papers on the Framingham study, the biennial levels of new cases of CHD per 1000 were extracted, smoothed to five-year intervals, and multiplied by 50 to switch to rates per year per 100 000. Then, average unweighted incidence for people aged 35-84 was calculated, the coefficients were determined to adjust the incidence to the expected value based on mortality levels (1.6 for men and 1.9 for women), and the expected incidence was calculated. It was assumed that people older than 84 years have the same incidence as people aged 80-84. In Table 3.2, columns which correspond to age groups 0-34 are not shown, as their incidence of CHD is assumed to be 0.

Table 3.2: Calculation of Estimated Incidence of CHD in Ukraine by Gender and Age, per 100 000 Population per Year

| | 35-39 | 40-44 | 45-49 | 50-54 | 55-59 | 60-64 | 65-69 | 70-74 | 75-79 | 80-84 | >84 | Average 35-84 | Coefficient |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Framingham biennial per 1000, male | 9 | 9 | 21 | 21 | 40 | 40 | 48 | 48 | 52 | 52 | | | |
| Framingham biennial per 1000, female | 1 | 1 | 7 | 7 | 20 | 20 | 26 | 26 | 46 | 46 | | | |
| Smoothed, male | 9 | 13 | 17 | 27 | 34 | 43 | 45 | 49 | 51 | 52 | | | |
| Smoothed, female | 1 | 3 | 5 | 11 | 16 | 22 | 24 | 33 | 39 | 46 | | | |
| Per year per 100 000, male | 450 | 650 | 850 | 1367 | 1683 | 2133 | 2267 | 2467 | 2533 | 2600 | | 1700 | |
| Per year per 100 000, female | 50 | 150 | 250 | 567 | 783 | 1100 | 1200 | 1633 | 1967 | 2300 | | 1000 | |
| Multiplied for Ukraine, male | 735 | 1062 | 1389 | 2233 | 2750 | 3485 | 3703 | 4030 | 4139 | 4248 | 4248 | | 1.6 |
| Multiplied for Ukraine, female | 93 | 280 | 466 | 1056 | 1460 | 2051 | 2237 | 3045 | 3666 | 4288 | 4288 | | 1.9 |

The eleventh value in the "per year" rows (1700 for men, 1000 for women) equals the unweighted mean of the ten age groups 35-84 and is therefore shown here as the average described in the text.

3.1.2 Relative risk

Peer-reviewed literature shows that younger smokers have a much greater relative risk of contracting CHD than older ones. Relative to never smokers, CHD risk among current smokers was highest in the youngest and lowest in the oldest participants.
For example, among women aged 40 to 49 years, the hazard ratio was 8.5 (95% confidence interval [CI] = 5.0, 14), while it was 3.1 (95% CI = 2.0, 4.9) among those aged 70 or older. The largest absolute risk differences between current smokers and never smokers were observed among the oldest participants. Finally, the majority of CHD cases among smokers were attributable to smoking. For example, attributable proportions of CHD by age group were 88% (40-49 years), 81% (50-59 years), 71% (60-69 years), and 68% (70 years or older) among women who smoked (Tolstrup et al., 2014). Graphs from this systematic review which display RR by age groups are shown below.

Figure 3.1: RRs of Developing CHD among Female Smokers Compared with Non-Smokers by Age Groups, Results from a Systematic Review (Tolstrup et al., 2014). [Plot: relative risk (axis ticks 1, 2, 4, 8, 16) by age group in years (40-49, 50-59, 60-69, >70).]

Figure 3.2: RRs of Developing CHD among Male Smokers Compared with Non-Smokers by Age Groups, Results from a Systematic Review (Tolstrup et al., 2014). [Plot: relative risk (axis ticks 1, 2, 4, 8, 16) by age group in years (40-49, 50-59, 60-69, >70).]

A population-based prospective cohort study of 19 782 men and 21 500 women aged 40-59 years, followed from 1990-1992 to 2001, was conducted to examine the relationship between smoking status and the risk of CHD. A total of 260 incident cases of CHD were confirmed among men, including 174 myocardial infarctions (MI). The numbers among women were 66 and 43, respectively. The multivariate relative risk [95% confidence interval (CI)] for current smokers versus never-smokers in men, after adjustment for cardiovascular risk factors and several lifestyle factors, was 2.85 (1.98, 4.12) for total CHD and 3.64 (2.27, 5.83) for MI. These respective risks in women were 3.07 (1.48, 6.40) and 2.90 (1.18, 7.18).
Among men, a dose-dependent relationship was observed between the number of cigarettes and the risk of MI. The population-attributable risk percent (95% CI) of CHD was 46% (34, 55) in men and 9% (0, 18) in women. Smoking cessation, however, led to a rapid decline in the CHD risk within 2 years (Baba et al., 2006).

Smoking fewer cigarettes/day for a longer duration was more deleterious than smoking more cigarettes/day for a shorter duration (P < 0.01). For 50 pack-years (365,000 cigarettes), estimated RRs of CVD were 2.1 for accrual at 20 cigarettes/day and 1.6 for accrual at 50 cigarettes/day (Lubin et al., 2016).

3.1.2.1 Women versus men

A systematic review and meta-analysis of prospective cohort studies with data for 3 912 809 individuals and 67 075 coronary heart disease events from 86 prospective trials concluded as follows: In 75 cohorts (2.4 million participants) that adjusted for cardiovascular risk factors other than coronary heart disease, the pooled adjusted female-to-male RRR of smoking compared with not smoking for coronary heart disease was 1.25 (95% CI 1.12-1.39, p<0.0001). This outcome was unchanged after adjustment for potential publication bias, and there was no evidence of important between-study heterogeneity (p=0.21). The RRR increased by 2% for every additional year of study follow-up (p=0.03). In pooled data from 53 studies, there was no evidence of a sex difference in the RR between participants who had previously smoked compared with those who never had (RRR 0.96, 95% CI 0.86-1.08, p=0.53) (Huxley & Woodward, 2011).

3.1.2.2 Effects of quitting smoking

In a cohort of 475 734 Korean men aged 30 to 58 years, compared with non-reducing heavy smokers (>= 20 cigarettes/d), those who quit smoking showed significantly lower risks of MI with hazard ratios (95% confidence intervals [CI]) of 0.43 (0.34 to 0.53) (Song & Cho, 2008).
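The attributable proportions quoted earlier in Section 3.1.2 from Tolstrup et al. (2014) appear consistent with the standard attributable fraction among the exposed, (RR - 1) / RR. The sketch below checks this for the two age groups whose hazard ratios are given in the text. Treating the hazard ratio as the relative risk is an assumption made here for illustration, and the function name is hypothetical; the study itself may have used a different estimator.

```python
def attributable_fraction_exposed(rr: float) -> float:
    """Share of cases among the exposed attributable to the exposure: (RR - 1) / RR."""
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    return (rr - 1.0) / rr


# Hazard ratios for women who smoke, as quoted from Tolstrup et al. (2014).
# Assumption: the hazard ratio is used in place of the relative risk.
for age_group, hazard_ratio in (("40-49", 8.5), ("70 or older", 3.1)):
    fraction = attributable_fraction_exposed(hazard_ratio)
    print(f"{age_group}: {fraction:.0%} of CHD cases among smokers attributable to smoking")
```

The results, about 88% and 68%, match the proportions reported for these two age groups.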
3.1.2.3 Suggested RR for smokers compared with non-smokers

Based on the above results from (Tolstrup et al., 2014) and (Song & Cho, 2008), the updated risk ratios might be as follows.

Table 3.3: Suggested Input Risk Ratios of Developing CHD among Smokers Compared to Never Smokers in Ukraine, by Gender and Age

| Gender | 35-40 | 45-50 | 50-55 | 55-60 | 60-65 | >65 |
|---|---|---|---|---|---|---|
| Men | 5.0 | 4.0 | 3.0 | 3.0 | 2.0 | 2.0 |
| Women | 8.5 | 8.5 | 6.6 | 4.8 | 3.4 | 3.1 |

3.1.3 Mortality

In the Greek cohort study (Notara et al., 2015), which observed 10-year Acute Coronary Syndrome (ACS) prognosis among 2172 cardiovascular patients, patients with >60 pack-years of smoking had 57.8% higher ACS mortality and 24.6% higher risk for any ACS event. A nested model, adjusted only for age and sex, revealed that, for every 30 pack-years of smoking increase, the associated ACS risk increased by 13% (95% CI 1.03, 1.30, p = 0.001).

Smoking is a strong independent risk factor for cardiovascular events and mortality even at older age, advancing cardiovascular mortality by more than five years, and demonstrating that smoking cessation in these age groups is still beneficial in reducing the excess risk. Random effects meta-analysis of the association of smoking status with cardiovascular mortality (based on the data of 503 905 participants aged 60 and older, of whom 37 952 died from cardiovascular disease) yielded a summary hazard ratio of 2.07 (95% CI 1.82 to 2.36) for current smokers and 1.37 (1.25 to 1.49) for former smokers compared with never smokers. Corresponding summary estimates for risk advancement periods were 5.50 years (4.25 to 6.75) for current smokers and 2.16 years (1.38 to 2.39) for former smokers. The excess risk in smokers increased with cigarette consumption in a dose-response manner and decreased continuously with time since smoking cessation in former smokers (Mons et al., 2015).
In Sweden and Estonia, a 13-year follow-up regarding all-cause and cardiovascular mortality revealed that smoking and, to a lesser extent, plasma levels of interleukin-6 were significant predictors of CVD and non-CVD mortality in men, but none of the other conventional risk factors reached statistical significance (Jensen-Urstad, Viigimaa, Sammul, Lenhoff, & Johansson, 2014). In a large prospective cohort of women (Sandhu et al., 2012) without coronary heart disease at baseline (among 101 018 women participating in the Nurses’ Health Study), a strong dose-response relationship between cigarette smoking and SCD risk was observed, and smoking cessation significantly reduced and eventually eliminated excess SCD risk. Compared with never smokers, current smokers had a 2.44-fold (95% CI, 1.80-3.31) increased risk of SCD after controlling for coronary risk factors. In multivariable analyses, the quantity of cigarettes smoked daily (P value for trend, <0.0001) and smoking duration (P value for trend, <0.0001) were linearly associated with SCD risk among current smokers. Small-to-moderate amounts of cigarette consumption (1-14 per day) were associated with a significant 1.84-fold (95% CI, 1.16-2.92) increase in SCD risk and every 5 years of continued smoking was associated with an 8% increase in SCD risk (hazard ratio, 1.08; 95% CI, 1.05-1.12; P<0.0001). The SCD risk linearly decreased over time after quitting and was equivalent to that of a never-smoker after 20 years of cessation (P value for trend, <0.0001). 3.1.3.1 Effects of quitting smoking A systematic review was conducted to determine the magnitude of risk reduction achieved by smoking cessation in patients with CHD. The researchers estimated a 36% reduction in crude relative risk (RR) of mortality for patients with CHD who quit, compared with those who continued smoking (RR, 0.64; 95% confidence interval [CI], 0.58-0.71) (Critchley & Capewell, 2003). 
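Several of the figures in this subsection are ratio measures restated as percentages, for example the RR of 0.64 described as a 36% reduction and the hazard ratio of 1.08 per 5 years described as an 8% increase. The sketch below shows that conversion. It also compounds the per-5-year hazard ratio over a longer period, which is an assumption introduced here (a log-linear effect of smoking duration) and not a result reported by the cited studies; the function names are hypothetical.

```python
def percent_change(ratio: float) -> float:
    """Convert a ratio measure (RR or HR) into a percentage change in risk."""
    return (ratio - 1.0) * 100.0


def compound_hazard_ratio(hr_per_step: float, steps: int) -> float:
    """Hazard ratio over several equal steps, assuming a log-linear effect."""
    return hr_per_step ** steps


# RR 0.64 for mortality in CHD patients who quit (Critchley & Capewell, 2003).
print(f"Quitting: {percent_change(0.64):.0f}% change in mortality risk")

# HR 1.08 per 5 years of continued smoking (Sandhu et al., 2012).
print(f"Per 5 years of smoking: {percent_change(1.08):+.0f}% change in SCD risk")

# Illustrative extrapolation to 20 years (4 steps), under the log-linear assumption.
print(f"Over 20 years (assumed log-linear): HR {compound_hazard_ratio(1.08, 4):.2f}")
```

Under the stated assumption, four 5-year steps give a hazard ratio of about 1.36; this figure is illustrative only and should not be read as a finding of the study.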
3.2 COPD

3.2.1 Incidence among the population

Incident cases of COPD in a population-based prospective 9-year study in Sao Paulo, Brazil, ranged from 1.4% to 4.0%, depending on the diagnostic criterion used (Moreira et al., 2015).

In the Rotterdam Study (Terzikhan et al., 2016), the overall IR was higher in men (13.3/1000 PY, 95% CI 12.4–14.3) than in women (6.1/1000 PY, 95% CI 5.6–6.6); age-specific IR ranged between 8.7 and 17.6/1000 PY in males and 3.0–7.9/1000 PY in females. The incidence of COPD increased from the age of 45 in both sexes to the age of 80 in men and 75 in women (Figure 3.3).

Figure 3.3: Age-Specific Incidence of COPD by Sex and Age, Drawn from (Terzikhan et al., 2016). [Plot: incidence of COPD per 1,000 PY (axis 0 to 20) by age category (45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, >80).]

Although some studies report either relative risks of COPD among people in older age groups or mention that incidence increases after age 45 (Terzikhan et al., 2016), this does not mean that COPD is only occurring among people over 45. A systematic review on global burden of COPD (Halbert et al., 2006) reports pooled COPD prevalence at the level of 3.1% (1.8–5.0) among people younger than 40. Additionally, the WHO global report on mortality attributable to tobacco (World Health Organization, 2012) reports tobacco-related mortality starting from 30 years of age and estimates that 39% of COPD deaths among people aged 30-44 are attributable to tobacco.

The only study which reported the incidence of COPD by age and sex was conducted in Japan (Kojima et al., 2007). Its findings were used to estimate incidence by age groups in Ukraine.
Table 3.4: Extract on the Incidence of COPD by Age and Sex in Japan (Kojima et al., 2007)

| Age (years) | Males: n | Males: number of incident cases of COPD | Males: incidence rate (per 100 person-years) | Females: n | Females: number of incident cases of COPD | Females: incidence rate (per 100 person-years) |
|---|---|---|---|---|---|---|
| Total | 11,160 | 387 | 0.81 | 5,946 | 79 | 0.31 |
| 25-29 | 94 | 2 | 0.62 | 36 | 0 | 0.00 |
| 30-34 | 625 | 7 | 0.31 | 181 | 1 | 0.16 |
| 35-39 | 1,609 | 24 | 0.35 | 712 | 4 | 0.13 |
| 40-44 | 1,973 | 45 | 0.47 | 1,161 | 10 | 0.18 |
| 45-49 | 2,153 | 65 | 0.61 | 1,279 | 12 | 0.19 |
| 50-54 | 1,879 | 90 | 1.05 | 1,157 | 21 | 0.42 |
| 55-59 | 1,729 | 74 | 1.25 | 1,002 | 12 | 0.35 |
| 60-64 | 745 | 39 | 1.67 | 264 | 9 | 1.02 |
| 65-69 | 252 | 24 | 2.75 | 109 | 7 | 1.69 |
| 70-74 | 101 | 17 | 4.95 | 45 | 3 | 2.05 |

Another study conducted in Japan (Fukuchi et al., 2004) reported the prevalence of COPD among adults: 10.9% altogether, 16.4% among men and 5.0% among women.

These data on incidence and prevalence were considered in order to obtain extrapolated estimates for Ukraine. However, the only disease occurrence indicator available for Ukraine is the prevalence of COPD from the WHO European Health for All database (HFA-DB), which reports a level of 3.7-3.9% in 2005-2015. This figure is likely to understate the true prevalence: studies aimed at COPD measurement conducted, for instance, in Norway (Johannessen, Omenaas, Bakke, & Gulsvik, 2005) found that about half of COPD cases remain undiagnosed. Another study (Nielsen, 2009) projected COPD prevalence to be 15-25% of the adult population. Yet the prevalence of COPD in Norway reported in HFA-DB is 0.2%.

Additionally, there is a recognized discrepancy in COPD prevalence across different countries and various studies. This is believed to be determined by the methods and definitions used to measure disease (Halbert, Isonaka, George, & Iqbal, 2003). Prevalence in most countries where proper measures were conducted was found to be between 4% and 10%.

Thus, there are no serious grounds to expect that the incidence of COPD in Ukraine should be lower than in Japan.
Data on COPD incidence from the Japan study (Kojima et al., 2007) are suggested for use.

Table 3.4: Suggested Incidence Rates for COPD

| Age groups | 25-29 | 30-34 | 35-39 | 40-44 | 45-49 | 50-54 | 55-59 | 60-64 | 65-69 | 70-74 |
|---|---|---|---|---|---|---|---|---|---|---|
| Men | 0.62 | 0.31 | 0.35 | 0.47 | 0.61 | 1.05 | 1.25 | 1.67 | 2.75 | 4.95 |
| Women | 0.00 | 0.16 | 0.13 | 0.18 | 0.19 | 0.42 | 0.35 | 1.02 | 1.69 | 2.05 |

3.2.2 Risk of COPD in smokers

The incidence rate (IR) (Terzikhan et al., 2016) was higher in current and former smokers than in never smokers (19.7/1000 PY, 95% CI 18.1–21.4 in current smokers, 8.3/1000 PY, 95% CI 7.6–9.1 in former smokers and 4.1/1000 PY, 95% CI 3.6–4.7, in never smokers). The IR of COPD in smoking men was 15.0/1000 PY (95% CI 13.9–16.2), compared to 8.6/1000 PY (95% CI 7.8–9.5) in smoking women. The age-specific IR of COPD in ever smokers ranged between 7.3 and 15.3/1000 PY. The IR was 6.0/1000 PY (95% CI 4.6–7.8) in never-smoking men and 3.7/1,000 PY (95% CI 3.1–4.3) in never-smoking women. The age-specific incidence of COPD in never smokers increased by age, but to a lesser extent than the incidence of COPD in ever smokers (Figure 3.5).

Figure 3.5: Age-Specific Incidence of COPD by Smoking Status, Extracted from (Terzikhan et al., 2016). [Plot: incidence of COPD per 1,000 PY (axis 0 to 30) by age category (45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, >80); series: all, current smokers, former smokers, never smokers.]

The abovementioned study from Norway found that adjusted odds ratios (OR) for current smokers and ex-smokers were 9.6 (95% CI 3.6-25.2) and 5.0 (95% CI 1.8-13.8), compared to never smokers (Johannessen et al., 2005). The comparisons of risks from these two studies were extracted for the model.
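As a rough illustration of how the Rotterdam Study incidence rates compare across smoking status, the crude incidence rate ratios relative to never smokers can be computed directly from the overall rates quoted above. This is a minimal sketch with hypothetical names; the ratios are unadjusted and are not the values adopted for the model, which also draws on the Norwegian odds ratios.

```python
# Overall COPD incidence rates per 1000 person-years by smoking status,
# as quoted from Terzikhan et al. (2016).
INCIDENCE_PER_1000_PY = {"current": 19.7, "former": 8.3, "never": 4.1}


def crude_rate_ratio(status: str, reference: str = "never") -> float:
    """Unadjusted incidence rate ratio of one smoking group versus the reference group."""
    return INCIDENCE_PER_1000_PY[status] / INCIDENCE_PER_1000_PY[reference]


for status in ("current", "former"):
    print(f"{status} vs never smokers: crude rate ratio {crude_rate_ratio(status):.1f}")
```

The crude ratios come out at roughly 4.8 for current smokers and 2.0 for former smokers, noticeably lower than the adjusted odds ratios of 9.6 and 5.0 from the Norwegian study.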
Table 3.5: Suggested Estimates for COPD RR in Smokers and Former Smokers, Compared with Never Smokers

| Age groups | 35-40 | 45-50 | 50-55 | 55-60 | 60-65 | 65-69 | 70-74 | 75+ |
|---|---|---|---|---|---|---|---|---|
| Smokers | 9.6 | 8.4 | 6.6 | 5.0 | 4.6 | 4.5 | 4.9 | 5.0 |
| Ex smokers | 5.0 | 4.7 | 4.2 | 3.8 | 4.0 | 4.5 | 4.8 | 5.0 |

The suggested RRs are equal for men and women. With regard to various tobacco-related diseases, some researchers have reported that the risk of contracting these diseases is greater in women than in men, while others found that the risk is identical. In Danish cohorts (Prescott, Bjerg, Andersen, Lange, & Vestbo, 1997), it was seen that risk associated with pack-years was higher in females than in males. As women smokers in Ukraine on average smoke fewer cigarettes, equal RRs for men and women can be considered grounded.

3.3 Hypertension

3.3.1.1 Effects of smoking

While some authors find an association between tobacco smoking and high blood pressure (Tesfaye, Byass, & Wall, 2009; Tesfaye, Byass, Wall, Berhane, & Bonita, 2008), it is necessary to distinguish between short-term, acute hypertensive effects and the long-term risk of developing chronic hypertension. Cigarette smoking acutely exerts a hypertensive effect, mainly through the stimulation of the sympathetic nervous system. As regards the impact of chronic smoking on blood pressure, available data do not provide evidence of a direct causal relationship between these two cardiovascular risk factors (Poulter, 2002), a concept supported by the evidence that lower blood pressure values have not been observed after chronic smoking cessation (Virdis, Giannarelli, Neves, Taddei, & Ghiadoni, 2010).
Though the prevalence of hypertension was higher in former smokers than in never smokers (13.5 versus 8.8%, P < 0.001) and the risk of hypertension was higher [odds ratio (OR) 1.31 (1.13-1.52), P < 0.001] in former smokers than in never smokers (Halimi et al., 2002), these findings were from a cross-sectional study, and no grounds for cause-and-effect association are found in this regard (Poulter, 2002).

3.3.1.1 Effects of quitting smoking

The effect of smoking cessation on the risk of developing hypertension (HPT) and on BP values was studied in a longitudinal study, with a follow-up period of 8 years, which included the participants of the Olivetti Heart Study. These were 430 untreated normotensive non-diabetic men with normal renal function (D’Elia et al., 2014). After 8 years of follow-up, BP changes (delta) were significantly lower in ex-smokers than in smokers (delta SBP/DBP: 12.6 +/- 13.4/7.9 +/- 8.1 vs. 16.0 +/- 14.9/10.3 +/- 10.1 mm Hg; P < 0.05; M +/- SD), also after adjustment for potential confounders. Moreover, at the last examination, the overall HPT prevalence was 33%, with lower values in ex-smokers than in smokers (25 vs. 38%, P = 0.01). After accounting for age, BP and BMI at baseline, and changes in smoking habits over the 8-year period, ex-smokers still had significantly lower risk of HPT than smokers (odds ratio 0.30, 95% confidence interval 0.15-0.58; P < 0.01).

Taking into account contradictory data on the impact of tobacco smoking on developing hypertension, we decided to exclude hypertension from the list of diseases modeled in the microsimulation of tobacco-related health impacts.

3.4 Lung Cancer

3.4.1 Relative risk

3.4.1.1 Effects of smoking

In the Seoul Male Cancer Cohort Study (SMCC), which included 14 272 men, cigarette smoking was associated with 4.18-fold risk of lung cancer in Korean men (Bae et al., 2007).
3.4.1.1 Women versus men

Analysis was conducted on the data of 279 214 men and 184 623 women from eight states in the USA, aged 50-71 years at study baseline, participating in the NIH-AARP Diet and Health study. Findings revealed that incidence rates were 20.3 (95% CI 16.3-24.3) per 100 000 person-years in men who had never smoked (99 cancers) and 25.3 (21.3-29.3) in women who had never smoked (152 cancers); for this group, the adjusted hazard ratio for lung cancer was 1.3 (1.0-1.8) for women compared with men. Smoking was associated with increased risk of lung cancer in men and women. The incidence rate of current smokers who smoked more than two packs per day was 1259.2 (1035.0-1483.3) in men and 1308.9 (924.2-1693.6) in women. In current smokers, in a model adjusted for typical smoking dose, the HR was 0.9 (0.8-0.9) for women compared with men. For former smokers, in a model adjusted for years of cessation and typical smoking dose, the HR was 0.9 (0.9-1.0) for women compared with men.

Incidence rates of adenocarcinoma, small-cell carcinoma, and undifferentiated tumors were similar in men and women; incidence rates of squamous tumors in men were about twice those in women. These findings suggest that women are not more susceptible than men to the carcinogenic effects of cigarette smoking in the lung. In smokers, incidence rates tended to be higher in men than women with comparable smoking histories, but differences were modest; smoking was strongly associated with lung cancer risk in both men and women (Freedman, Leitzmann, Hollenbeck, Schatzkin & Abnet, 2008).

3.4.2 Mortality

3.4.2.1 Effects of smoking

In the Japan Collaborative Cohort (JACC) Study, with 45 010 males and 55 724 females aged 40-79 years, 52.2% and 14.8% of lung cancer deaths in males were attributable to current and former cigarette smoking, respectively.
In females, the corresponding figures were 11.8% and 2.8%. Among current male smokers, the relative risk was strongly correlated with the intensity and duration of cigarette smoking. In contrast, the PAR was associated with an intermediate level of smoking except for the years of smoking: the largest PARs were observed in those with 20-29 cigarettes per day, 40-59 pack-years and 20-22 years old at smoking inception. Absolute risks were estimated to increase with age and duration of smoking and not to decrease even after cessation (Ando et al., 2003). 3.4.1.1 Effects of quitting smoking Pooled data from three large-scale cohort studies in Japan were used to evaluate the impact of smoking cessation on the decrease in risk of lung cancer death in male ex-smokers by age at quitting. For simplicity, subjects were limited to male never smokers and former or current smokers who started smoking at ages 18-22 years. 110 002 men aged 40-79 years at baseline were included. During the mean follow-up of 8.5 years, 968 men died from lung cancer. The mortality rate ratio compared to current smokers decreased with increasing attained age in men who stopped smoking before age 70 years. Among men who quit in their fifties, the cohort-adjusted mortality rate ratios (95% confidence interval) were 0.57 (0.40-0.82), 0.44 (0.29-0.66) and 0.36 (0.13-1.00) at attained ages 60-69, 70-79 and 80-89 years, respectively. The corresponding figures for those who quit in their sixties were 0.81 (0.44-1.48), 0.60 (0.43-0.82) and 0.43 (0.21-0.86). Overall, the mortality rate ratio for current smokers, relative to nonsmokers, was 4.71 (95% confidence interval 3.76-5.89) and those for ex-smokers who had quit smoking 0-4, 5-9, 10-14, 15-19, 20-24 and >= 25 years before were 3.99 (2.97-5.35), 2.55 (1.80-3.62), 1.87 (1.23-2.85), 1.21 (0.66-2.22), 0.76 (0.33-1.75) and 0.67 (0.34-1.32), respectively. 
Although earlier cessation of smoking generally resulted in a lower rate of lung cancer mortality in each group of attained age, the absolute mortality rate decreased appreciably after stopping smoking even in men who quit at ages 60-69 years (Wakai et al., 2007).

3.5 Peripheral arterial disease (PAD)

In a meta-analysis of the association between cigarette smoking and PAD, the pooled OR for current smokers was 2.71 (95% CI 2.28 to 3.21); for ex-smokers, the pooled OR was 1.67 (95% CI 1.54 to 1.81). The magnitude of the association is greater than that reported for coronary heart disease. The risk is lower among ex-smokers but, nonetheless, significantly increased compared with never smokers (Lu, Mackay, & Pell, 2014).

3.5.1 Any stroke

3.5.1.1 Effects of smoking

In a meta-analysis on the possible risks of stroke from cigarette smoking (Shinton & Beevers, 1989), the overall relative risk of stroke associated with cigarette smoking was 1.5 (95% confidence interval 1.4 to 1.6). Considerable differences were seen in relative risks among the subtypes: cerebral infarction 1.9, cerebral hemorrhage 0.7, and subarachnoid hemorrhage 2.9. An effect of age on the relative risk was also noted: less than 55 years 2.9, 55-74 years 1.8, and greater than or equal to 75 years 1.1. A dose response between the number of cigarettes smoked and relative risk was noted, and there was a small increased risk in women compared with men. Ex-smokers under the age of 75 seemed to retain an appreciably increased risk of stroke (1.5); for all ages, the relative risk in ex-smokers was 1.2.

In a prospective study (Wannamethee, Shaper, Whincup, & Walker, 1995) of cardiovascular disease and its risk factors, 7735 men aged 40 through 59 years were drawn at random from the age-sex registers of one general practice in each of 24 British towns from 1978 through 1980 (the British Regional Heart Study).
During the 12.75 years of follow-up, there were 167 major stroke events (43 fatal and 124 non-fatal) in the 7264 men with no recall of previous ischemic heart disease or stroke. After full adjustment for other risk factors, current smokers had a nearly fourfold relative risk (RR) of stroke compared with never smokers (RR, 3.7; 95% confidence interval [CI], 2.0 to 6.9). Ex-smokers showed lower risk than current smokers, but showed excess risk compared with never smokers (RR, 1.7; 95% CI, 0.9 to 3.3; P = .11); those who switched to pipe or cigar smoking showed a significantly increased risk (RR, 3.3; 95% CI, 1.6 to 7.1), similar to that of current light smokers. Primary pipe or cigar smokers also showed increased risk (RR, 2.2; 95% CI, 0.6 to 8.0), but the number of subjects involved was small. The benefit of giving up smoking completely was seen within five years of quitting, with no further consistent decline in risk thereafter, but this was dependent on the amount of tobacco smoked. Light smokers (< 20 cigarettes/d) reverted to the risk level of those who had never smoked. Heavy smokers retained a more than twofold risk compared with never smokers (RR, 2.2; 95% CI, 1.1 to 4.3). The age-adjusted RR of stroke in those who quit smoking during the first five years of follow-up (recent quitters) was reduced compared with continuing smokers (RR, 1.8; 95% CI, 0.7 to 4.6 vs. RR, 4.3; 95% CI, 2.1 to 8.8). The benefit of quitting smoking was observed in both normotensive and hypertensive men, but the absolute benefit was greater in hypertensive subjects. Thus, smoking cessation is associated with a considerable and rapid benefit in decreasing the risk of stroke, particularly in light smokers (< 20 cigarettes/d); a complete loss of risk is not seen in heavy smokers.
Switching to pipe or cigar smoking confers little benefit, emphasizing the need for complete cessation of smoking. The absolute benefit of quitting smoking on risk of stroke is most marked in hypertensive subjects. In the Japan Public Health Center-based Prospective Study on Cancer and Cardiovascular Disease (JPHC Study), relative risks (95% CIs) for current smokers compared with never-smokers, after adjustment for cardiovascular risk factors and public health center, were 1.27 (1.05 to 1.54) for total stroke, 0.72 (0.49 to 1.07) for intraparenchymal hemorrhage, 3.60 (1.62 to 8.01) for subarachnoid hemorrhage, and 1.66 (1.25 to 2.20) for ischemic stroke. The respective multivariate relative risks among women were 1.98 (1.42 to 2.77), 1.53 (0.86 to 4.25), 2.70 (1.45 to 5.02), and 1.57 (0.86 to 2.87). There was a dose-response relation between the number of cigarettes smoked and risks of ischemic stroke for men. A similar positive association was observed between smoking and risks of lacunar infarction and large-artery occlusive infarction, but not embolic infarction (Mannami et al., 2004). 3.5.1.1 Women versus men In a systematic review and meta-analysis which aimed to estimate the effect of smoking on stroke in women compared with men (Peters, Huxley, & Woodward, 2013), with data from 81 prospective cohort studies that included 3 980 359 individuals and 42 401 strokes, the pooled multiple-adjusted RRR indicated a similar risk of stroke associated with smoking in women compared with men (RRR, 1.06 [95% confidence interval, 0.99-1.13]). In a regional analysis, there was evidence of a more harmful effect of smoking in women than in men in Western populations (RRR, 1.10 [1.02-1.18]), but not in Asian populations (RRR, 0.97 [0.87-1.09]). Compared with never-smokers, the beneficial effects of quitting smoking on stroke risk among former smokers were similar between the sexes (RRR, 1.10 [0.99-1.22]). 
3.5.2 Ischemic stroke

Smoking is associated with an increased risk of ischemic stroke or CV death in the Atherosclerosis Risk in Communities (ARIC) Study, which comprised mostly middle-aged to young-old subjects (65-74 years), but not in the Cardiovascular Health Study (CHS), which comprised mostly middle-old or oldest-old (>= 75 years) adults with atrial fibrillation. Compared with never smokers, current smokers had a higher incidence of the composite endpoint in ARIC [HR: 1.65 (1.21-2.26)], but not in CHS [HR: 1.05 (0.69-1.61)] (Kwon et al., 2016).

3.5.2.1 Effects of quitting or reducing smoking

In a cohort of 475 734 Korean men aged 30 to 58 years, compared with non-reducing heavy smokers (>= 20 cigarettes/d), those who quit smoking showed significantly lower risks of ischemic stroke with hazard ratios (95% confidence intervals [CI]) of 0.66 (0.55 to 0.79). Compared with non-reducing heavy smokers, the risks of all strokes combined and MI among reducers tended to decrease, although the decrements were not statistically significant (Song & Cho, 2008).

3.5.3 Hemorrhagic stroke

3.5.3.1 Relative risk

3.5.3.1.1 Effects of smoking

In a study of incident cerebral microbleeds (CMBs), which are asymptomatic precursors of intracerebral hemorrhage, conducted among 2635 individuals aged 66 to 93 years from the population-based Age, Gene/Environment Susceptibility (AGES)-Reykjavik Study, relative risk for current smoking was 1.47 [95% CI, 1.11-1.94] (Ding et al., 2015).

3.5.1.1.1 Effects of quitting or reducing smoking

In a cohort of 475 734 Korean men aged 30 to 58 years, compared with non-reducing heavy smokers (>= 20 cigarettes/d), those who quit smoking showed significantly lower risks of subarachnoid hemorrhage with hazard ratios (95% confidence intervals [CI]) of 0.58 (0.38 to 0.90). For hemorrhagic stroke, quitters showed lower risk compared with heavy smokers, but the difference was not statistically significant (hazard ratio 0.82, 95% CI: 0.64 to 1.06).
The risks of subarachnoid hemorrhage in those who reduced from moderate to light smoking tended to be lower than in non-reducing moderate smokers (10 to 19 cigarettes/d) (Song & Cho, 2008).

3.5.1.2 Suggested relative risks

Based on the above literature on cumulative risk of all strokes, the RRs below are suggested for the model.

Table 3.6: Estimates of All Strokes Relative Risk for Smokers and Ex-Smokers Compared to Never Smokers for the Microsimulation Model by Age Group, Both Genders

| Age groups | 35-40 | 40-50 | 50-55 | 55-60 | 60-65 | >65 |
|---|---|---|---|---|---|---|
| Smokers | 1.7 | 1.7 | 1.5 | 1.5 | 1.2 | 1.2 |
| Ex smokers | 4.3 | 3.7 | 2.9 | 1.8 | 1.8 | 1.2 |

3.6 All-Cause Mortality

3.6.1.1 Effects of smoking

In a large community-based prospective cohort study comprising 6209 Beijing adults (aged >= 40 years) followed for approximately eight years (1991-1999), the multivariable-adjusted HRs for all-cause mortality were 2.7 (95% confidence interval [CI]: 1.56-4.69) in younger adult smokers (40-50 years) and 1.31 (95% CI: 1.13-1.52) in older smokers (>50 years) (Li et al., 2016). Mortality differences (per 10,000 person-years) were 15.99 (95% CI: 15.34-16.64) in the younger group and 74.61 (95% CI: 68.57-80.65) in the older group. Compared with current smokers, the HRs of all-cause death for former smokers in younger and older adults were 0.57 (95% CI: 0.23-1.42) and 0.96 (95% CI: 0.73-1.26), respectively.

Among 20 033 individuals participating in the Health Effects of Arsenic Longitudinal Study (HEALS) in Bangladesh, cigarette/bidi smoking among men was positively associated with all-cause mortality (HR 1.40, 95% CI 1.06-1.86) and cancer mortality (HR 2.91, 95% CI 1.24-6.80), and there was a dose-response relationship between increasing intensity of cigarette/bidi consumption and increasing mortality. An elevated risk of death from ischemic heart disease (HR 1.87, 95% CI 1.08-3.24) was associated with current cigarette/bidi smoking.
Among women, the corresponding HRs were 1.65 (95% CI 1.16-2.36) for all-cause mortality and 2.69 (95% CI 1.20-6.01) for ischemic heart disease mortality. Cigarette/bidi smoking accounted for about 25.0% of deaths in men and 7.6% in women (Wu et al., 2013).

3.6.1.1 Effects of quitting smoking

Effects of quitting smoking on all-cause mortality were measured in a cohort of 1 494 Chinese people (961 men, 533 women) who were followed for 18 years (1976-1994) to assess changes in smoking behavior and then for an additional 17 years (1994-2011) to examine the relationships of continuing to smoke and new quitting with mortality risk. Ever smokers had increased risks of lung cancer, coronary heart disease, thrombotic stroke, and COPD, with dose-response relationships. For all tobacco-related mortality, the relative risk for new quitters compared with continuing smokers was 0.68 (95% confidence interval: 0.46, 0.99) for those who had quit two to seven years previously and 0.56 (95% confidence interval: 0.37, 0.85) for those who had quit eight years or more previously. The corresponding relative risks were 0.69 and 0.45 for lung cancer, 0.78 and 0.51 for coronary heart disease, 0.76 and 0.84 for thrombotic stroke, and 0.89 and 0.61 for COPD, respectively (He et al., 2014).

In the Singapore Chinese Health Study, a cohort study of middle-aged and elderly Chinese in Singapore (n=48 251), compared with current smokers, the adjusted HR (95% CI) for total mortality was 0.84 (0.76 to 0.94) for new quitters, 0.61 (0.56 to 0.67) for long-term quitters, and 0.49 (0.46 to 0.53) for never-smokers. New quitters had a 24% reduction in lung cancer mortality (HR: 0.76, 95% CI 0.57 to 1.00), and long-term quitters had a 56% reduction (HR: 0.44, 95% CI 0.35 to 0.57).
The risk for coronary heart disease mortality was reduced in new quitters (HR: 0.84, 95% CI 0.66 to 1.08) and long-term quitters (HR: 0.63, 95% CI 0.52 to 0.77), although the result for new quitters was of borderline significance due to the relatively small number of cardiovascular deaths. The risk for chronic pulmonary disease mortality was reduced in long-term quitters but increased in new quitters. The authors concluded that significant reduction in the risk of total mortality, specifically for lung cancer mortality, can be achieved within five years of smoking cessation (Lim, Tai, Yuan, Yu, & Koh, 2013).

References

Ando, M., Wakai, K., Seki, N., Tamakoshi, A., Suzuki, K., Ito, Y., . . . Grp, J. S. (2003). Attributable and absolute risk of lung cancer death by smoking status: Findings from the Japan Collaborative Cohort Study. International Journal of Cancer, 105(2), 249-254. doi:10.1002/ijc.11043

Baba, S., Iso, H., Mannami, T., Sasaki, S., Okada, K., Konishi, M., . . . Grp, J. S. (2006). Cigarette smoking and risk of coronary heart disease incidence among middle-aged Japanese men and women: The JPHC Study Cohort I. European Journal of Cardiovascular Prevention & Rehabilitation, 13(2), 207-213. doi:10.1097/01.hjr.0000194417.16638.3d

Bae, J. M., Lee, M. S., Shin, M. H., Kim, D. H., Li, Z. M., & Ahn, Y. O. (2007). Cigarette smoking and risk of lung cancer in Korean men: The Seoul male cancer cohort study. Journal of Korean Medical Science, 22(3), 508-512.

Castelli, W. P. (1984). Epidemiology of coronary heart disease: The Framingham study. Am J Med, 76(2a), 4-12.

Critchley, J. A., & Capewell, S. (2003). Mortality risk reduction associated with smoking cessation in patients with coronary heart disease: A systematic review. Journal of the American Medical Association, 290(1), 86-97. doi:10.1001/jama.290.1.86

D’Elia, L., De Palma, D., Rossi, G., Strazzullo, V., Russo, O., Iacone, R., . . . Galletti, F. (2014). Not smoking is associated with lower risk of hypertension: Results of the Olivetti Heart Study. European Journal of Public Health, 24(2), 226-230. doi:10.1093/eurpub/ckt041

Ding, J., Sigurdsson, S., Garcia, M., Phillips, C. L., Eiriksdottir, G., Gudnason, V., . . . Launer, L. J. (2015). Risk factors associated with incident cerebral microbleeds according to location in older people: The Age, Gene/Environment Susceptibility (AGES)-Reykjavik Study. JAMA Neurology, 72(6), 682-688. doi:10.1001/jamaneurol.2015.0174

Freedman, N. D., Leitzmann, M. F., Hollenbeck, A. R., Schatzkin, A., & Abnet, C. C. (2008). Cigarette smoking and subsequent risk of lung cancer in men and women: Analysis of a prospective cohort study. Lancet Oncology, 9(7), 649-656. doi:10.1016/S1470-2045(08)70154-2

Fukuchi, Y., Nishimura, M., Ichinose, M., Adachi, M., Nagai, A., Kuriyama, T., . . . Zaher, C. (2004). COPD in Japan: The Nippon COPD Epidemiology Study. Respirology, 9(4), 458-465. doi:10.1111/j.1440-1843.2004.00637.x

Global Burden of Disease. (2016). Global Health Data Exchange. Retrieved from: http://ghdx.healthdata.org/gbd-results-tool

Halbert, R. J., Isonaka, S., George, D., & Iqbal, A. (2003). Interpreting COPD prevalence estimates: What is the true burden of disease? Chest, 123(5), 1684-1692.

Halbert, R. J., Natoli, J. L., Gano, A., Badamgarav, E., Buist, A. S., & Mannino, D. M. (2006). Global burden of COPD: Systematic review and meta-analysis. Eur Respir J, 28(3), 523-532. doi:10.1183/09031936.06.00124605

Halimi, J. M., Giraudeau, B., Vol, S., Caces, E., Nivet, H., & Tichet, J. (2002). The risk of hypertension in men: Direct and indirect effects of chronic smoking. Journal of Hypertension, 20(2), 187-193. doi:10.1097/00004872-200202000-00007

He, Y., Jiang, B., Li, L. S., Li, L. S., Sun, D. L., Wu, L., . . . Lam, T. H. (2014). Changes in smoking behavior and subsequent mortality risk during a 35-year follow-up of a cohort in Xi’an, China. Am J Epidemiol, 179(9), 1060-1070. doi:10.1093/aje/kwu011

Huxley, R. R., & Woodward, M. (2011). Cigarette smoking as a risk factor for coronary heart disease in women compared with men: A systematic review and meta-analysis of prospective cohort studies. Lancet, 378(9799), 1297-1305. doi:10.1016/s0140-6736(11)60781-2

Jensen-Urstad, M., Viigimaa, M., Sammul, S., Lenhoff, H., & Johansson, J. (2014). Impact of smoking: All-cause and cardiovascular mortality in a cohort of 55-year-old Swedes and Estonians. Scandinavian Journal of Public Health, 42(8), 780-785. doi:10.1177/1403494814550177

Johannessen, A., Omenaas, E., Bakke, P., & Gulsvik, A. (2005). Incidence of GOLD-defined chronic obstructive pulmonary disease in a general adult population. Int J Tuberc Lung Dis, 9(8), 926-932.

Kojima, S., Sakakibara, H., Motani, S., Hirose, K., Mizuno, F., Ochiai, M., & Hashimoto, S. (2007). Incidence of chronic obstructive pulmonary disease, and the relationship between age and smoking in a Japanese population. J Epidemiol, 17(2), 54-60.

Kwon, Y., Norby, F. L., Jensen, P. N., Agarwal, S. K., Soliman, E. Z., Lip, G. Y. H., . . . Chen, L. Y. (2016). Association of smoking, alcohol, and obesity with cardiovascular death and ischemic stroke in atrial fibrillation: The Atherosclerosis Risk in Communities (ARIC) Study and Cardiovascular Health Study (CHS). PLoS One, 11(1), 13. doi:10.1371/journal.pone.0147065

Lerner, D. J., & Kannel, W. B. (1986). Patterns of coronary heart disease morbidity and mortality in the sexes: A 26-year follow-up of the Framingham population. Am Heart J, 111(2), 383-390.

Li, K. B., Yao, C. H., Di, X., Yang, X. C., Dong, L., Xu, L., & Zheng, M. L. (2016). Smoking and risk of all-cause deaths in younger and older adults: A population-based prospective cohort study among Beijing adults in China. Medicine (Baltimore), 95(3), 5. doi:10.1097/md.0000000000002438

Lim, S. H., Tai, B. C., Yuan, J. M., Yu, M. M. C., & Koh, W. P. (2013). Smoking cessation and mortality among middle-aged and elderly Chinese in Singapore: The Singapore Chinese health study. Tobacco Control, 22(4), 235-240. doi:10.1136/tobaccocontrol-2011-050106

Lu, L., Mackay, D. F., & Pell, J. P. (2014). Meta-analysis of the association between cigarette smoking and peripheral arterial disease. Heart, 100(5), 414-423. doi:10.1136/heartjnl-2013-304082

Lubin, J. H., Couper, D., Lutsey, P. L., Woodward, M., Yatsuya, H., & Huxley, R. R. (2016). Risk of cardiovascular disease from cumulative cigarette use and the impact of smoking intensity. Epidemiology, 27(3), 395-404. doi:10.1097/ede.0000000000000437

Mannami, T., Iso, H., Baba, S., Sasaki, S., Okada, K., Konishi, M., . . . (2004). Cigarette smoking and risk of stroke and its subtypes among middle-aged Japanese men and women: The JPHC Study Cohort I. Stroke, 35(6), 1248-1253. doi:10.1161/01.STR.0000128794.30660.e8

Mons, U., Muezzinler, A., Gellert, C., Schottker, B., Abnet, C. C., Bobak, M., . . . Consortium, C. (2015). Impact of smoking and smoking cessation on cardiovascular events and mortality among older adults: Meta-analysis of individual participant data from prospective cohort studies of the CHANCES consortium. British Medical Journal, 350, 12. doi:10.1136/bmj.h1551

Moreira, G. L., Gazzotti, M. R., Manzano, B. M., Nascimento, O., Perez-Padilla, R., Menezes, A. M. B., & Jardim, J. R. (2015). Incidence of chronic obstructive pulmonary disease based on three spirometric diagnostic criteria in Sao Paulo, Brazil: A nine-year follow-up since the PLATINO prevalence study. Sao Paulo Medical Journal, 133(3), 245-251. doi:10.1590/1516-3180.2015.9620902

Nielsen, R. (2009). Present and future costs of COPD in Iceland and Norway: Results from the BOLD study. Eur Respir J, 34(4), 850-857. doi:10.1183/09031936.00166108

Notara, V., Panagiotakos, D. B., Kouroupi, S., Stergiouli, I., Kogias, Y., Stravopodis, P., . . . Investigators, G. S. (2015). Smoking determines the 10-year (2004-2014) prognosis in patients with Acute Coronary Syndrome: The GREECS observational study. Tobacco Induced Diseases, 13, 9. doi:10.1186/s12971-015-0063-6

Peters, S. A. E., Huxley, R. R., & Woodward, M. (2013). Smoking as a risk factor for stroke in women compared with men: A systematic review and meta-analysis of 81 cohorts, including 3 980 359 individuals and 42 401 strokes. Stroke, 44(10), 2821-2828. doi:10.1161/strokeaha.113.002342

Poulter, N. R. (2002). Independent effects of smoking on risk of hypertension: Small, if present. Journal of Hypertension, 20(2), 171-172. doi:10.1097/00004872-200202000-00002

Prescott, E., Bjerg, A. M., Andersen, P. K., Lange, P., & Vestbo, J. (1997). Gender difference in smoking effects on lung function and risk of hospitalization for COPD: Results from a Danish longitudinal population study. Eur Respir J, 10(4), 822-827.

Sandhu, R. K., Jimenez, M. C., Chiuve, S. E., Fitzgerald, K. C., Kenfield, S. A., Tedrow, U. B., & Albert, C. M. (2012). Smoking, smoking cessation, and risk of sudden cardiac death in women. Circulation-Arrhythmia and Electrophysiology, 5(6), 1091-1097. doi:10.1161/circep.112.975219

Shinton, R., & Beevers, G. (1989). Meta-analysis of relation between cigarette smoking and stroke. BMJ, 298(6676), 789-794.

Song, Y. M., & Cho, H. J. (2008). Risk of stroke and myocardial infarction after reduction or cessation of cigarette smoking: A cohort study in Korean men. Stroke, 39(9), 2432-2438. doi:10.1161/strokeaha.107.512632

Terzikhan, N., Verhamme, K. M. C., Hofman, A., Stricker, B. H., Brusselle, G. G., & Lahousse, L. (2016). Prevalence and incidence of COPD in smokers and non-smokers: The Rotterdam Study. European Journal of Epidemiology, 31(8), 785-792. doi:10.1007/s10654-016-0132-z

Tesfaye, F., Byass, P., & Wall, S. (2009). Population-based prevalence of high blood pressure among adults in Addis Ababa: Uncovering a silent epidemic. BMC Cardiovasc Disord, 9, 39. doi:10.1186/1471-2261-9-39

Tesfaye, F., Byass, P., Wall, S., Berhane, Y., & Bonita, R. (2008). Association of smoking and khat (Catha edulis Forsk) use with high blood pressure among adults in Addis Ababa, Ethiopia, 2006. Prev Chronic Dis, 5(3), A89.

Tolstrup, J. S., Hvidtfeldt, U. A., Flachs, E. M., Spiegelman, D., Heitmann, B. L., Balter, K., . . . Feskanich, D. (2014). Smoking and risk of coronary heart disease in younger, middle-aged, and older adults. American Journal of Public Health, 104(1), 96-102. doi:10.2105/ajph.2012.301091

Virdis, A., Giannarelli, C., Neves, M. F., Taddei, S., & Ghiadoni, L. (2010). Cigarette smoking and hypertension. Curr Pharm Des, 16(23), 2518-2525.

Wakai, K., Marugame, T., Kuriyama, S., Sobue, T., Tamakoshi, A., Satoh, H., . . . Tsugane, S. (2007). Decrease in risk of lung cancer death in Japanese men after smoking cessation by age at quitting: Pooled analysis of three large-scale cohort studies. Cancer Science, 98(4), 584-589. doi:10.1111/j.1349-7006.2007.00423.x

Wannamethee, S. G., Shaper, A. G., Whincup, P. H., & Walker, M. (1995). Smoking cessation and the risk of stroke in middle-aged men. JAMA, 274(2), 155-160.

World Health Organization. (2012). WHO global report on mortality attributable to tobacco.

Wu, F., Chen, Y., Parvez, F., Segers, S., Argos, M., Islam, T., . . . Ahsan, H. (2013). A prospective study of tobacco smoking and mortality in Bangladesh. PLoS One, 8(3), 11. doi:10.1371/journal.pone.0058516