Attention Please! Health Plan Choice and (In-)Attention

We study the role of inattention as a key source of inertia in health plan choices. Our structural model shows that more than 90% of the elderly in Switzerland are inattentive and thus stick to their previous plan. We estimate sizeable switching costs even conditional on attention explaining part of the observed choice persistence. Inattention leads to overspending and generates considerable welfare losses for most consumers. A policy simulation shows that eliminating financially dominated plans from the choice set yields welfare gains for two thirds of individuals.


INTRODUCTION
Consumers routinely need to make decisions between different alternatives and products in their everyday lives. For example, when entering the supermarket, they face the choice between numerous cereal brands, wine varieties and types of cheese. Similarly, they have to choose a broadband provider and find a suitable hairdresser as well as a family doctor. Although individuals could potentially engage in an active decision-making process every time they choose, more often than not they are "passive" decision-makers who simply stick to their earlier choices. Such consumer inertia or choice persistence has been documented in mobile phone and credit card provider choices (Burnham et al., 2003; Lee and Neale, 2012), in the market for retirement savings (Madrian and Shea, 2001; Chetty et al., 2014; Luco, 2019), the electricity market (Hortaçsu et al., 2017), and the car insurance market (Honka, 2014), and can also be observed in the brand choice of consumer goods (Dubé et al., 2010).
Inertial choice behavior also plays a crucial role in the context of health insurance as many healthcare systems exhibit remarkably low switching rates. For example, less than 10% of the elderly switch their drug prescription plan in the U.S. in Medicare Part D (Polyakova, 2016).
Similarly low switching rates have been documented in the Swiss, German and Dutch health insurance markets (e.g., Schut and Hassink, 2002; Frank and Lamiraud, 2009; Douven et al., 2017; Schmitz and Ziebarth, 2017; Bischof and Schmid, 2018). This observation is especially striking as buying health insurance often takes up a sizeable portion of household finances: American and Swiss households spend roughly 5% and 6% of pre-tax income on health insurance, respectively (Bureau of Labor Statistics, 2017; Federal Statistical Office, 2016). Furthermore, many insurance markets are modeled after the principles of managed competition, which crucially relies on the price sensitivity of consumers (Enthoven, 1978). The welfare gains of managed competition are jeopardized by inertia because insurance companies may exploit this sluggishness to avoid competition, allowing them to keep premiums high.
This paper studies the role of inattention as a major source of inertia in individual health plan choices. 1 Most of the existing literature on plan choice (e.g., Buchmueller and Feldstein, 1997; Cutler and Reber, 1998; Buchmueller, 2000; Beaulieu, 2002; Schut and Hassink, 2002; Strombom et al., 2002; Buchmueller, 2006; Handel and Kolstad, 2015; Abaluck and Gruber, 2011, 2016) assumes fully attentive decision-makers, i.e., consumers are assumed to consider all plans in the market and to pay attention to the characteristics (e.g., premiums, out-of-pocket costs) of all of them. If, however, some consumers do not pay attention to plan choice, such an assumption is problematic: while inattentive consumers can be thought of as being "asleep", ignoring plan characteristics and switching costs, consumers who pay attention factor these components in when making their decisions. Standard choice models that neglect "attention" thus tend to underestimate consumer responsiveness to plan attributes and overestimate switching costs.
We relax the assumption of fully attentive consumers by estimating the consideration set models recently proposed by Abaluck and Adams-Prassl (2021), which allow us to recover structural plan choice parameters conditional on consumer attention. By accounting for attention, we obtain more reliable estimates of the utility weights consumers attach to plan characteristics as well as more plausible switching costs.
Previous work has shown that consumers could save hundreds of dollars annually by switching away from their current health plan (e.g., Handel, 2013; Heiss et al., 2013; Abaluck and Gruber, 2016). Therefore, inert consumers leave significant sums of money on the table by sticking to financially dominated options, which may lead to substantial welfare losses (e.g., Bhargava et al., 2017). Switching costs in health plan choices may play an important role in explaining inertia (e.g., Handel and Kolstad, 2015; Polyakova, 2016; Yeo and Miller, 2018). The basic idea is that time and hassle costs (e.g., information acquisition time, paperwork, changing healthcare providers) may hinder individuals from switching to more suitable plans. 2 There is a fairly broad consensus that individuals face substantial switching costs when making their health plan decisions: for example, the estimates for switching costs of Part D enrollees range from $1,000 (Polyakova, 2016) to between $1,500 and $1,700, or 54-60% of annual prescription drug expenditures (Yeo and Miller, 2018). Subsequent analyses focused on other reasons for inertia such as salience (e.g., Ketcham et al., 2012, 2015) and, more recently, inattention (e.g., Heiss et al., forthcoming; Ho et al., 2017; Abaluck and Adams-Prassl, 2021). Heiss et al. (forthcoming) develop a two-stage model of plan choice that accounts for consumer attention. Specifically, their model comprises an attention stage and a plan-choice stage, both of which are allowed to be affected by unobserved consumer heterogeneity ("acuity") capturing (latent) factors such as health insurance literacy and "effort" of the decision maker. Identification is achieved by imposing exclusion restrictions, e.g., by assuming that changes in premiums or health shocks solely impact attention but not utility directly. The authors identify overspending in previous years as an important attention trigger and provide switching cost estimates ranging from $200 (13% of total costs) for consumers with high levels of acuity up to $2,700 (180% of total costs) for low-acuity individuals, even conditional on attention. Ho et al. (2017) analyze inattention and discuss attention triggers. Their model accounts for unobserved heterogeneity; however, it does not allow switching costs to be estimated. Abaluck and Adams-Prassl (2021) develop a choice-theoretic model that incorporates inattention in a discrete choice framework. 3 They find that only 15% of consumers are attentive and that accounting for inattention reduces the estimated switching costs from $1,200 to $290. All three papers analyze choices in Medicare Part D.
2 Switching costs are typically estimated using discrete choice models including a dummy for choosing the prior plan. The ratio of the coefficient of this dummy and the premium coefficient is then interpreted as the amount of premium dollars consumers are willing to give up to stay with the default.
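The switching-cost calculation described in footnote 2 can be illustrated with a minimal sketch. The function name and the coefficient values below are hypothetical and chosen only for illustration; they are not estimates from this paper or from the studies cited above.

```python
def implied_switching_cost(beta_default: float, beta_premium: float) -> float:
    """Premium amount a consumer would give up to stay with the default.

    beta_default: coefficient on the prior-plan (default) dummy.
    beta_premium: coefficient on the premium, expected to be negative.
    """
    if beta_premium >= 0:
        raise ValueError("the premium coefficient is expected to be negative")
    # Ratio of the default coefficient to the (absolute) premium coefficient.
    return -beta_default / beta_premium


# Hypothetical coefficients, for illustration only.
example_cost = implied_switching_cost(beta_default=3.0, beta_premium=-0.002)
print(round(example_cost, 2))
```

Because the measure is a ratio, any bias in either coefficient carries over directly: an overstated default coefficient or an understated premium sensitivity both inflate the implied switching cost.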
Our paper contributes to the existing literature in several ways. First, we examine how two key frictions, consumer inattention and switching costs, influence inertia in a health insurance market that differs in several important dimensions from Medicare Part D and the ACA marketplaces: in the Swiss managed competition setting, individuals gather experience with the insurance system throughout their entire lives; insurance is mandatory, so there is no opting out; and everyone has a default plan. The comprehensive coverage is standardized, which further reduces the differences between plans and should decrease the complexity of the choice situation. We thus provide novel evidence on the question of whether familiarity with the insurance system attenuates or reinforces frictions and, ultimately, whether "user experience" safeguards consumers against the costs of inertia. Second, we are able to show that not all premium increases are alike: consumer attention is not very sensitive to gradual premium increases, but sudden and above-average increases are important attention triggers. This finding is in line with Ho et al. (2017), who show that consumers primarily respond to health and premium "shocks." Third, we estimate the welfare implications of inattention. In previous work, Heiss et al. (forthcoming) analyze the impact of inattention but only focus on switching probabilities and overspending without a welfare analysis. Abaluck and Adams-Prassl (2021) perform a welfare analysis of a "smart default policy" but do not analyze the welfare effect of inattention per se. By contrast, we estimate the welfare costs of inattention and further analyze an interesting policy originating in a real-world proposal: abolishing financially dominated deductibles. Taking inattention into account in this exercise is important because consumers currently enrolled in such plans are forced to make a new choice.
We examine health plan choices of elderly individuals in Switzerland in 2012 and 2013.
The empirical analysis is based on claims data from a large Swiss health insurer that provides mandatory health insurance to around 1.2 million individuals each year (Swiss population: 8.5 million). In addition, we complement the individual data with publicly available information on all offered plans in the market. Our analysis gives rise to five main results. First, we find that the vast majority of the elderly are not making financially sound plan choices as the average consumer could save up to CHF 1,000 (22% of current total costs) per year by switching to other plans. 4 Second, our results reveal systematic asymmetries in the sensitivity of plan switching to characteristics of the default and non-default plans. For example, the switching response to an increase in the default premium by CHF 100 is almost twice as large as the reaction to a similar decrease in the premium of non-default plans. Third, our structural analysis shows that inattention accounts for a large part of the observed inertia. We find that more than 90% of the elderly do not pay attention to the plan choice problem and thus remain with their previous plan. Accounting for attention leads to a general increase in the estimated (utility) weight individuals give to plan features such as premiums and to a devaluation of the default. We further show that consumers face considerable switching costs of approximately CHF 1,300 (28% of current total costs), even conditional on attention. Fourth, the analysis shows that consumers are significantly more attentive when exposed to a premium shock and subsequently switch to more suitable plans. Fifth, our policy simulation shows that inattention leads to considerable welfare losses and increases the expected foregone savings for 70% of the elderly. We further analyze a policy that reduces the choice set by eliminating financially dominated plans. 
Based on our simulation, such a policy enhances welfare for two thirds of individuals leading to an increase in average consumer surplus by more than CHF 180 per person and year.
The remainder of this paper is structured as follows. Section 2 provides information on the institutional background. Section 3 describes our data. The empirical strategy and the underlying structural model are described in Section 4. Section 5 presents and discusses the empirical results, followed by a welfare analysis of inattention and a counterfactual policy simulation. Finally, Section 6 concludes.
4 1 CHF ≈ 1 $.

The Swiss Health Insurance Market
In Switzerland, all residents have been required to buy basic health insurance (BHI) since the Health Insurance Act of the early 1990s (see Schmid et al., 2018, for a comprehensive description).
The BHI system is based on Enthoven's (1978) managed competition model in which privately run insurance companies engage in price and quality competition to attract consumers. The set of medical services covered by BHI is regulated and ranges from out- and inpatient care, nursing home care and physiotherapy to pharmaceuticals and medical appliances. The private insurance companies in the market are obliged to offer the "basic plan" entailing the standard annual deductible of CHF 300 and free provider choice (see below for a detailed description). In addition to the deductible, consumers face a co-payment rate of 10% on costs above the annual deductible, up to a stop-loss of CHF 700 per year. Currently, 37 private insurance companies are operating in the BHI not-for-profit market, running an additional 15 subsidiaries.
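The cost-sharing rule just described can be sketched as a small function. This is an illustration under a stated assumption: the CHF 700 stop-loss is read here as a cap on the 10% co-payments only, paid in addition to the deductible. The function and parameter names are hypothetical.

```python
def out_of_pocket(hce: float, deductible: float = 300.0,
                  copay_rate: float = 0.10, copay_cap: float = 700.0) -> float:
    """Annual out-of-pocket cost (CHF) for gross healthcare expenditures `hce`.

    Assumption: the stop-loss caps the co-payments only, so the maximum
    out-of-pocket amount equals deductible + copay_cap.
    """
    if hce < 0:
        raise ValueError("healthcare expenditures cannot be negative")
    paid_deductible = min(hce, deductible)
    # Co-payment applies only to costs above the deductible, up to the cap.
    copay = min(copay_rate * max(hce - deductible, 0.0), copay_cap)
    return paid_deductible + copay


# Illustrative expenditure levels under the standard deductible of CHF 300.
for spending in (200.0, 1300.0, 20000.0):
    print(spending, out_of_pocket(spending))
```

Under this reading, a higher voluntary deductible shifts the kink point to the right but leaves the co-payment cap unchanged, which is why the deductible level is the main determinant of the financial risk borne by the individual.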
Premiums are community-rated within 42 geographic premium regions, but premiums may differ for three age groups (children aged 18 and below, young adults, and adults aged 26 and above). However, within age groups and regions, consumers pay exactly the same premium regardless of other risk factors (e.g. pre-existing conditions).
Standardization of plan coverage is a key element in the system as it essentially allows consumers to make their plan choice exclusively based on price and service quality: switching to cheaper plans in the BHI market is financially safe as consumers can be sure that lower-priced plans do not achieve their premium reductions through hidden coverage gaps (Enthoven, 1993).
Although the design of the BHI system allows consumers to compare plans directly across providers, the Swiss BHI market is characterized by large and persistent premium differentials within local markets and low switching rates on the side of consumers. In fact, the premium differential for adults aged over 26 amounts to more than CHF 1,900 per year for the basic plan in certain regions in 2017. 5 Despite these substantial premium differences, annual average switching rates are well below 10%, rendering inertia a defining feature of the Swiss health insurance system (FOPH, 2018).
5 The displayed number is the difference between the 95th and the 5th percentile for the canton of Basel. Numbers are based on the Statistics on compulsory health insurance by the Federal Office of Public Health (FOPH), Table T805d, https://www.bag.admin.ch/dam/bag/de/dokumente/kuv-aufsicht/stat/publications-aos/statistik-oblig-kv-2015-tabellen2.xlsx.
Since much of the related work focuses on the US (mostly Medicare Part D), we briefly discuss the two key dimensions along which the Swiss system differs. First, while the majority of working-age Americans purchase insurance through their employer and thus only have access to a limited number of plans (e.g., Dafny et al., 2013), insurees in Switzerland, irrespective of their age or working status, choose their insurance contract from a large choice set. Second, while most Americans transition from employer-sponsored insurance to Medicare upon retirement, individuals in Switzerland remain in the same insurance system throughout their entire life. Hence, by entering Medicare Part D, the elderly in the US are exposed to an entirely new insurance system that, potentially for the first time, allows them to choose among plans that differ in coverage and service quality. In sharp contrast, the elderly in Switzerland can be expected to be familiar with the insurance choice problem through their life-long experience with the system.

The Health Insurance Choice Problem
Although the standardization of coverage allows consumers to compare plans in the BHI market directly, they face a complex insurance choice problem: during an annual open enrollment window in November, consumers have the choice among a large number of insurance providers active in their local market, each of which offers an average of 12 plans during our observation period. 6 It is important to note that switching involves no fees and that insurance carriers are obliged to accept all individuals. In exchange for a premium rebate, consumers may opt for a higher voluntary deductible, or restrict their provider choice by including a managed care (MC) feature. The deductible level determines the financial risk of illness borne by the individual and, besides the standard deductible, five other levels may be offered, ranging up to CHF 2,500 p.a. 7 The available MC features can be broadly grouped into three categories: In the preferred provider organization model (PPO), consumers select a general practitioner (GP) from a pre-specified list of providers who serves as their first point of contact in case of any health-related issue. Alternatively, consumers may opt for a health maintenance organization (HMO), where the gatekeeper is not a single GP but instead a unified organization of physicians (GPs and specialists). These operate in a group practice or as a network of practices in different locations. 8 Finally, the most recently introduced TelMed model obliges insurees to contact a (medical) call center first before potentially being referred to a GP or specialist.
6 The number of offered plans ranges from 1 to 24. Some insurers additionally offer several distinct HMO plans. If we additionally distinguish between different HMO categories per insurer, we get an average of 16 plans offered, with a maximum of 56 plans. (Calculations based on https://www.priminfo.admin.ch/de/archiv).
7 To be precise, beyond the standard deductible of CHF 300, individuals may choose a higher voluntary deductible. The available levels for individuals above the age of 18 are CHF 500, 1,000, 1,500, 2,000 and 2,500.

Premium Shock at Retirement
The Swiss BHI system incorporates premium adjustments that affect only certain groups. In particular, the newly retired who leave the labor market permanently lose the previously employer-sponsored accident insurance, leading to a premium increase of 7%. 10 Since accident coverage is mandatory and insurers collect any contract-related adjustments, we observe these premium changes in our data. Such price hikes are likely to trigger attention, as they occur on top of the usual annual premium growth. We exploit this institutional feature to study whether the retirement premium shock "wakes up" the elderly and leads to adjustments in their plan choice (see Section 5.2.1 for details). 11
8 Note that in Switzerland, HMOs and insurers are typically not vertically integrated, as is, for example, common in the US.
9 The figure shows the premium trajectories for the premiums of CSS insurance (including its subsidiaries), one of the largest health insurance providers in Switzerland. Our main econometric analysis is based on claims data from CSS insurance (see Section 3).
10 The official retirement age is 65 for men and 64 for women. Note that employer-sponsored accident coverage is tied to working at least 8 hours per week so that the non-working and self-employed population pay for accident insurance themselves. Another premium shock occurs for individuals turning 26 (see Bischof and Schmid, 2018, for an analysis).
11 Our data does not include employment status. Thus, we use a proxy for entering retirement which is equal to 1 if, first, an individual newly included accident coverage during the year or will do so at (or shortly after) the turn of the year, and, second, the individual will keep the accident coverage up to the end of the sample period. Note that we cannot distinguish between individuals entering retirement and those entering long unemployment spells, i.e., an individual who enters a new unemployment spell and remains unemployed until the end of the sample also satisfies our criteria. However, due to the age of the sample, this should be a minor problem. Workers losing their job above the age of 50 commonly suffer from long unemployment spells that are often directly followed by (early) retirement (see, for example, SECO, 2018).

Data Sources and Sample Construction
Our analysis is based on register data of a large Swiss health insurer. The data set spans the period from 2012 to 2014 for individuals aged 59 to 70. 12 We observe all individuals who buy BHI from the CSS conglomerate, that is, either from the main carrier (CSS) or one of the subsidiaries. 13 Thus, each person is observed for up to three years. The data contains demographic and health-related characteristics (date of birth, gender, region of residence, language of correspondence, pharmaceutical cost groups (PCGs)), the annual health insurance contract (deductible level, insurance carrier, MC feature, accident coverage, contract duration, annual premium), and total healthcare expenditures (HCE). Further, we observe plan choices for the subsequent year for individuals who remain enrolled with one of the CSS carriers. 14 To construct the choice set for each consumer, we attach public information on the available health plans in each market (see Subsection 2.2 for details on the individual choice sets).
Our initial data set contains 142,574 individuals and 254,861 person-year observations. We [...]
12 To be precise: We observe women aged 59 to 69 and men between the ages of 60 and 70.
13 Between 2012 and 2014, the CSS Insurance conglomerate consisted of four insurance carriers, namely, CSS, INTRAS, Arcosana, and Sanagate.
14 We further identify consumers who switch to a health plan outside the CSS group. However, we do not observe any specifics on their plan choice.
15 By moving into another premium region, the premium level changes because premiums are community-rated on a regional level. Further, the choice set is also subject to change as some insurers only operate in certain regions, and they do not always offer the same options.

Descriptive Statistics
Table I reports summary statistics for the years 2012 and 2013. 16 The average age in the pooled sample is slightly above 64, i.e., our sample consists of individuals close to or already in retirement. The table further shows that the majority of individuals in that age group struggle with health problems as more than 50% of them suffer from chronic conditions. About 58% of the elderly choose the minimum deductible of CHF 300 and a further 21% choose CHF 500, effectively limiting the financial consequences of poor health. Moreover, the elderly prefer the most generous (free access) standard plan over plans with managed care features (pooled share: 39%). Those who switch, however, mostly switch from the standard to the PPO model, resulting in a small market share for the HMO and TelMed models in this age group. In addition, approximately 75% include accident coverage in their insurance contract, reflecting the institutional feature that accident coverage is no longer sponsored by the employer when entering retirement (see Subsection 2.3). In our sample, individuals on average spend roughly CHF 3,900 on premiums and pay CHF 650 out of pocket. The average gross healthcare spending amounts to approximately CHF 5,100. We observe high levels of inertia as only 6.5% of the elderly on average adjust their health insurance plan each year. This includes changes in the deductible, the insurance model, the carrier, exiting or any combination of these possibilities. The majority of adjustments (4.7%), however, are made with respect to plan attributes (i.e., deductible and model) within the current provider, and exits account for only a small share of the variation.
16 Note that Table I does not print numbers for individuals who drop out of the sample in the following year. Specifically, we exclude the year 2014 as well as the last yearly observation of individuals aging out of the panel.
-Insert Table I about here -

Foregone Savings
To evaluate the potential financial consequences resulting from inertial behavior, we compute the total costs for each plan in the individual choice set. Specifically, we calculate the total costs of a plan by adding up premiums and (expected) out-of-pocket (OOP) costs. 17 Similar to Ho et al. (2017) and Abaluck and Gruber (2016), we then compute the "foregone savings" as the difference in total costs between the chosen plan and the least-cost alternative in the individual choice set.
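The foregone-savings measure can be sketched in a few lines. The plan names and CHF amounts below are hypothetical and serve only to illustrate the definition; the `Plan` class and function name are not taken from the paper or its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    premium: float       # annual premium in CHF
    expected_oop: float  # expected annual out-of-pocket cost in CHF

    @property
    def total_cost(self) -> float:
        # Total cost = premium + (expected) out-of-pocket cost.
        return self.premium + self.expected_oop


def foregone_savings(chosen: Plan, choice_set: list[Plan]) -> float:
    """Total cost of the chosen plan minus that of the least-cost plan."""
    least_cost = min(plan.total_cost for plan in choice_set)
    return chosen.total_cost - least_cost


# Hypothetical choice set, for illustration only.
plans = [
    Plan("standard, deductible 300", premium=3900.0, expected_oop=650.0),
    Plan("PPO, deductible 500", premium=3300.0, expected_oop=900.0),
    Plan("HMO, deductible 1,500", premium=3000.0, expected_oop=1400.0),
]
savings = foregone_savings(plans[0], plans)
print(savings, round(100 * savings / plans[0].total_cost, 1))
```

By construction the measure is zero for anyone enrolled in the least-cost plan and positive otherwise, so it captures purely financial overspending and ignores any non-financial value of plan features.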
Based on the literature on financial literacy, we expect substantial savings potential. For example, Agarwal et al. (2009) show that financial mistakes follow a U-shaped pattern over the life cycle, which would imply relatively poor financial decisions by the elderly. Figure 2 shows the distribution of foregone savings (in % of total costs). 18
17 See Appendix A for more details on our OOP cost calculator.
18 The plotted savings are calculated from the observed costs averaged over the full sample period (2012-2014). Doing so is equivalent to using the perfect foresight measure of OOP costs described in Appendix A. Note, however, that the histograms of foregone savings for the other OOP cost measures are almost identical (see Figure A.1 in the Appendix).
-Insert Figure 2 about here -
The graph shows that a substantial fraction of the elderly could decrease their expected total costs by switching to other plans, as only 6.6% of individuals choose the least-cost alternative in their choice set. Moreover, the foregone savings range from CHF 533 to CHF 3,400 for 75% of the sample, implying substantial financial losses due to their current choices. Furthermore, switching to the least-cost plan could have saved on average approximately CHF 1,065, which corresponds to 22% of current total costs. In other words, the typical individual in our sample could save up to almost one quarter of total costs by choosing more adequate insurance.

Does Switching Reduce Overspending?
Given that the elderly leave substantial sums of money on the table by sticking to financially dominated plans, we study whether the switches we observe in the data lead to reductions in foregone savings. Our analysis shows that once individuals "wake up" and make deliberate plan choices, most of them reduce overspending. We find that plan switching yields average cost savings of CHF 380 per year (or 7.2% of total costs) compared to staying in the default plan. Despite cutting their expenditures, many switchers could save even more as only around 10% switch to the cost-minimizing plan (thereby reducing the foregone savings to zero). Interestingly, around 20% change to an overall more expensive plan. The median foregone savings of all switchers amount to CHF 656. These results provide first descriptive evidence of the (financial) benefits of a higher switching rate. Of course, consumers may also value non-financial aspects of plans, which are ignored in these calculations. Furthermore, the ex-ante comparison of healthcare costs ignores the fact that individuals face a choice under uncertainty, as they are not fully informed how their health status will evolve. Both points will be considered more carefully in the ensuing structural analysis.

EMPIRICAL STRATEGY
Our analysis is based on the default-specific consideration model (DSC) recently proposed by Abaluck and Adams-Prassl (2021). The model belongs to the class of consideration set models which generalize discrete choice models by relaxing the assumption that consumers consider all available goods in their choice set (Manski, 1977). Loosely speaking, in the DSC model not all individuals are attentive to the choice problem in each period. Attentive consumers consider all elements of their choice set so that the consideration set and the choice set coincide.
In that case, utility-maximizing agents choose optimally among all available options. On the other hand, inattentive individuals only consider the default option so that the consideration set is a subset of the choice set with exactly one element. In our setting, the default option is an individual's current health plan.
In the DSC model, the probability of being attentive is assumed to depend on the characteristics of the default exclusively. Put differently, consumers only reconsider their choice once their default option gets sufficiently bad. This assumption is consistent with a behavioral pattern noted in earlier studies (see e.g., Ho et al., 2017). Further, the rational inattention literature provides a direct microfoundation for the model: searching for a new plan is costly, and individuals are only willing to pay the search costs if their current plan (i.e., their default) has gotten sufficiently bad (see Gabaix, 2019, for an overview). In the remainder of this section, we first describe the structural model and then discuss how the model is estimated.

Structural Model
Choice Under Full Attention. We model the recurring choice problem of individuals i = 1, ..., N who at time t can choose exactly one option j from their choice set J. We take a random utility approach, where individual i's utility from choosing good j can be decomposed into a deterministic component (depending on the good's characteristics x_jt) and a random error term. Under the Daly-Zachary conditions 19 (Daly and Zachary, 1978) and assuming quasilinearity in the premium (x^q_ijt), Abaluck and Adams-Prassl (2021) characterize the probability s*_j(·) of choosing option j under full consideration (i.e., being attentive).
Choice Under Limited Attention. Next, we introduce a friction to the model by allowing for inattentive consumers. Formally, their consideration set contains only the default option. In contrast, attentive individuals consider the full choice set as described above. We assume that attention only depends on the characteristics of the default option, i.e., of the current plan. The underlying rationale is that consumers only re-evaluate their health insurance once their current plan becomes sufficiently unattractive.
Formally, the probability of being attentive, μ_it, is a function of the characteristics of person i's default plan d only. Thus, the observed probability s_ijt of individual i choosing good j at time t can be decomposed into two parts: with probability 1 - μ_it the individual is inattentive and remains in the default plan, and with probability μ_it the individual is attentive and chooses according to s*_j(·). As noted above, standard economic theory implies that with full attention there are no asymmetries in the sensitivity of market shares to characteristics of different goods. Therefore, the presence of any asymmetry of cross-derivatives in the data is evidence for imperfect consideration. 20 In the model, imperfect consideration is the only mechanism available for asymmetries to arise. Abaluck and Adams-Prassl (2021) show how the consideration set probabilities are identified from deviations from Slutsky symmetries in the observed market shares. The essential insight is that changes in consideration probabilities can be expressed as a function of observable differences in cross-derivatives and market shares.
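The decomposition of observed choice probabilities described in this subsection can be written compactly. The display below is a sketch assembled from the symbols used in this section (μ_it for the attention probability, s*_ijt for the choice probability under full attention, d for the default plan, x^q for the premium); it follows the general structure of a default-specific consideration model and should be read as an assumption about notation rather than a verbatim restatement of the original equations.

```latex
\begin{align*}
s_{idt} &= \bigl(1 - \mu_{it}\bigr) + \mu_{it}\, s^{*}_{idt}, \\
s_{ijt} &= \mu_{it}\, s^{*}_{ijt} \qquad \text{for } j \neq d, \\[4pt]
\frac{\partial s_{ijt}}{\partial x^{q}_{idt}}
  - \frac{\partial s_{idt}}{\partial x^{q}_{ijt}}
  &= \frac{\partial \mu_{it}}{\partial x^{q}_{idt}}\, s^{*}_{ijt}
  \qquad \text{for } j \neq d .
\end{align*}
```

The first two lines sum to one over all plans because the full-attention probabilities do. The third line uses the symmetry of cross-derivatives under full attention together with the assumption that μ_it depends only on the default's characteristics: the observable asymmetry is then proportional to the response of attention to the default premium.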
In contrast to most discrete choice models, the DSC model does not assume that all consumers consider all their options. Relaxing this assumption allows disentangling attention from utility parameters. Similar models have been applied previously in the health insurance context (Ho et al., 2017;Heiss et al., forthcoming). A common caveat of these earlier approaches is the necessity of imposing additional exclusion restrictions. These typically restrict which factors only affect consideration but not utility and vice versa. On the other hand, the DSC model identifies consideration probabilities from cross-derivative asymmetries. Thus, the DSC model offers an advantage in that we do not need to formulate any additional exclusion restrictions.
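The identification argument can be checked numerically in a stylized setting. The sketch below assumes a multinomial logit for choice under full attention and a logistic attention probability that depends only on the default premium; all function names and parameter values are hypothetical and chosen for illustration, not taken from the estimated model.

```python
import math


def full_attention_shares(premiums, alpha):
    """Multinomial logit shares with utility v_j = -alpha * premium_j."""
    v = [-alpha * p for p in premiums]
    v_max = max(v)  # subtract the maximum for numerical stability
    exp_v = [math.exp(x - v_max) for x in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def attention_prob(default_premium, g0, g1):
    """Logistic attention probability driven by the default premium only."""
    return 1.0 / (1.0 + math.exp(-(g0 + g1 * default_premium)))


def dsc_shares(premiums, alpha, g0, g1, default=0):
    """Observed shares: inattentive consumers stay in the default plan."""
    mu = attention_prob(premiums[default], g0, g1)
    shares = [mu * s for s in full_attention_shares(premiums, alpha)]
    shares[default] += 1.0 - mu
    return shares


def cross_derivative(share_fn, premiums, j, k, h=1e-5):
    """Central finite difference of share j with respect to premium k."""
    up, down = list(premiums), list(premiums)
    up[k] += h
    down[k] -= h
    return (share_fn(up)[j] - share_fn(down)[j]) / (2 * h)


# Hypothetical premiums (in CHF 1,000) and parameters; plan 0 is the default.
premiums = [3.9, 3.6, 3.3]
full = lambda p: full_attention_shares(p, alpha=1.0)
dsc = lambda p: dsc_shares(p, alpha=1.0, g0=-4.0, g1=0.5)

asym_full = cross_derivative(full, premiums, 1, 0) - cross_derivative(full, premiums, 0, 1)
asym_dsc = cross_derivative(dsc, premiums, 1, 0) - cross_derivative(dsc, premiums, 0, 1)
print(asym_full, asym_dsc)
```

In this stylized example the asymmetry vanishes (up to numerical error) under full attention, whereas under limited attention it equals the derivative of the attention probability with respect to the default premium times the full-attention share of the non-default plan. No exclusion restriction is needed to generate or detect it.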

Estimation
Parameterization and Functional Form. We parameterize the functions μ(·) and s*_j(·) such that we can write the probability of choosing alternative j as a function of the parameters θ = (β, γ). The log-likelihood in period t is built from these choice probabilities and the choice indicators y_ij, where y_ij = 1 if individual i chooses plan j. The parameters β, γ can be estimated by maximum likelihood. In particular, we make the following assumptions on the functional forms of s_ijt(·) and μ_it(·). In the attention equation, η_it is a logistic error term, δ_t is a time-specific constant, and z_idt are characteristics of the default good d (premium, mean and variance of expected OOP costs, deductible, a dummy for managed care and a constant). 21 All attributes are demeaned at the individual level to account for differences across years, regions, and age categories.
20 See Theorem 1 in Abaluck and Adams-Prassl (2021). Note that differences in cross-derivatives are the discrete choice analogue of Slutsky asymmetries.
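The log-likelihood and the attention equation described above can be sketched as follows. This is a standard form consistent with the verbal description (choice indicators y_ij, a logistic error η_it, a time-specific constant δ_t, default characteristics z_idt); the exact index structure and sign conventions are assumptions, not a reproduction of the original display.

```latex
\begin{align*}
\ell_t(\theta) &= \sum_{i=1}^{N} \sum_{j \in J} y_{ij}\, \log s_{ijt}(\theta), \\[4pt]
\mu_{it} &= \Pr\bigl(z_{idt}'\gamma + \delta_t + \eta_{it} > 0\bigr)
          = \frac{\exp\bigl(z_{idt}'\gamma + \delta_t\bigr)}
                 {1 + \exp\bigl(z_{idt}'\gamma + \delta_t\bigr)} .
\end{align*}
```

The second equality follows from the logistic distribution of η_it, so the attention stage takes the form of a binary logit in the characteristics of the default plan.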
We parameterize individual i's indirect utility from choosing plan j at time t as a linear function of plan characteristics.
Endogeneity. Regarding the identification of the structural parameters of interest, one might suspect that certain plan features are correlated with the error term, leading to endogeneity issues. For example, the premium of a plan is likely associated with demand factors (e.g., local healthcare utilization) and unobservable plan characteristics (e.g., service quality), biasing the corresponding coefficient estimates. To address such concerns, our model specifications include all publicly available information on plan features that are potentially choice relevant (premiums, deductibles, and MC features). This approach is standard in the literature (Handel and Kolstad, 2015; Heiss et al., forthcoming; Abaluck and Gruber, 2011, 2016). In addition, since we focus on plan choice within a specific insurance provider, otherwise unobservable dimensions such as service quality can be plausibly assumed to be identical across plans, so not including them should not introduce bias.
21 The vector z_idt may include all, or part of, the characteristics of the covariate vector x_idt and additional variables.
22 Note that choice under full attention satisfies the Daly-Zachary conditions and is consequently characterised by symmetric cross-derivatives (Slutsky symmetry) and an absence of nominal illusion. We refer to Abaluck and Adams-Prassl (2021) for a detailed discussion of the model assumptions and identification. Appendix B describes the estimation in more detail.
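The two-stage structure described under "Parameterization and Functional Form" can be illustrated with a short sketch. It assumes a logit attention stage and a conditional logit choice stage over all plans; the function names, the toy data and the parameter values are hypothetical, and the code is not the estimation routine used in the paper (numerical maximisation of the likelihood is left out).

```python
import numpy as np

def dsc_choice_probs(x, z_default, default_idx, beta, gamma):
    """Observed choice probabilities for one individual in one period.

    x           : (J, K) array of plan attributes
    z_default   : (M,) array of default-plan attributes (incl. a constant)
    default_idx : position of the default plan in x
    """
    # Attention stage: logit probability of paying attention.
    mu = 1.0 / (1.0 + np.exp(-(z_default @ gamma)))
    # Choice stage: conditional logit over all plans, given attention.
    v = x @ beta
    expv = np.exp(v - v.max())          # subtract max for numerical stability
    s_star = expv / expv.sum()
    # Inattentive individuals stay with the default plan.
    s = mu * s_star
    s[default_idx] += 1.0 - mu
    return s

def log_likelihood(observations, beta, gamma):
    """Sum of log probabilities of the plans that were actually chosen."""
    total = 0.0
    for x, z_default, default_idx, chosen_idx in observations:
        s = dsc_choice_probs(x, z_default, default_idx, beta, gamma)
        total += np.log(s[chosen_idx])
    return total

# Hypothetical toy data: three plans, two attributes, default is plan 0.
x = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, 0.0]])
z = np.array([1.0, 2.0])
beta = np.array([-1.0, 0.5])
gamma = np.array([-3.0, 0.2])
probs = dsc_choice_probs(x, z, 0, beta, gamma)
ll = log_likelihood([(x, z, 0, 0), (x, z, 0, 1)], beta, gamma)
```

With a low attention probability, almost all probability mass sits on the default plan even though the attentive choice probabilities favour other plans, which is the pattern a standard conditional logit would have to rationalise through a large default coefficient.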

RESULTS AND DISCUSSION
In this section, we present and discuss our estimation results. We first provide reduced-form evidence of excess sensitivity of plan switching to characteristics of the default plan, which motivates our model choice (see Subsection 5.1). 23 Subsection 5.2 then presents the results of the structural model, followed by a welfare analysis of inattention and a counterfactual policy simulation.

Reduced Form Evidence
First, we examine potential asymmetries in the response of consumers to changes in plan characteristics of their default or rival (non-default) plans. Observing such asymmetries is compatible with a model of consumer choice in which consumers do not actively re-evaluate their plan choice in every period unless the default worsens considerably so that some of them "wake up" and become attentive to the plan choice problem. Such a model would imply that consumers show a stronger response to changes in default attributes as opposed to characteristics of non-default options in their choice set. To assess these claims, we estimate fixed-effects specifications (Equation (6)) that relate a switching indicator to the attributes of the default plan, the attributes of the rival plans, and individual controls. The vector of default-plan attributes x_idt includes the premium, the expected OOP costs 24 and an indicator for MC features (which equals one for PPO, HMO and TelMed plans). Likewise, the vector x_irt contains the same set of plan attributes for the rival plans in the choice set, averaged across all non-default plans. Finally, we include time-varying individual characteristics z_it (age, indicators for chronic illnesses) and year fixed effects λ_t. We use the OLS within-estimator to estimate Equation (6).
23 The reduced-form analysis is based on 229,367 observations of 129,698 individuals. The sample size in the structural analysis is smaller as we need to exclude the leavers due to data restrictions. This sample is analyzed in Column 2 of the reduced form (Table II).
24 Expected OOP costs for period t+1 are defined as E(HCE_{t+1}) = HCE_t, i.e., in the main analysis we assume backward-looking agents who expect that next year's costs remain at the level of the current year. Results using different measures for the cost variable are shown in Appendix D (perfect foresight) and Appendix E (rational expectations).
-Insert Table II about here -
Table II shows that the elderly are substantially more responsive to adjustments in the attributes of their default plan as opposed to the characteristics of rival options.
We find that while the overall switching probability significantly increases by 2.2 percentage points for an increase in the default premium by CHF 100, an identical decrease of the rival premium yields an increase in the likelihood of switching by only 1.2 percentage points. Similarly, changes in the expected OOP costs of the status-quo imply a stronger response in the overall switching probability than identical changes in rival plans (1.5 vs. 1 percentage points). Moreover, the negative coefficient on the MC indicator reflects that those currently enrolled in MC plans are significantly less likely to switch their plan than those in the standard model. Overall, the same asymmetries emerge when exclusively focusing on pure plan adjustments 26 (column 2), the exit decision (column 3) and alternative definitions for the expected OOP costs (see Tables A.III and A.VII in the Appendix).
In summary, the switching behavior we observe in the data is consistent with a model in which consumers do not deliberately reconsider their plan choice every period, but instead pay attention once their default plan becomes sufficiently bad. Because the characteristics of the default good (i.e., the current plan) are relevant for the attention probability as well as utility, the attributes of the default good prompt greater responsiveness in such a model. Therefore, our findings further motivate the structural approach.
25 We also estimate the model for two additional outcomes within smaller subsamples. First, we only model switching within the CSS conglomerate using the estimation sample of the structural analysis in Section 5.2. Second, we model the exit decision on the subsample of stayers and leavers.
26 The "switching" indicator is one for any changes in the deductible, insurance model or insurance carrier, and zero for individuals who remain with their default plan.
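The OLS within-estimator used for the reduced-form specification can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the function name is hypothetical, the simulated coefficients merely mimic the magnitudes quoted in the text, and year fixed effects would enter as additional dummy columns of X.

```python
import numpy as np

def within_estimator(y, X, ids):
    """OLS on individual-demeaned data (fixed-effects within estimator)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(ids)
    y_dm = y.copy()
    X_dm = X.copy()
    for g in np.unique(ids):
        m = ids == g
        # Subtract individual-specific means to sweep out the fixed effect.
        y_dm[m] -= y[m].mean()
        X_dm[m] -= X[m].mean(axis=0)
    coef, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return coef

# Synthetic panel: 50 individuals observed twice (all values hypothetical).
rng = np.random.default_rng(0)
n, T = 50, 2
ids = np.repeat(np.arange(n), T)
X = rng.normal(size=(n * T, 2))           # default premium, rival premium
alpha = np.repeat(rng.normal(size=n), T)  # individual fixed effects
y = alpha + X @ np.array([0.022, -0.012])
coef = within_estimator(y, X, ids)
```

Because the simulated outcome contains no noise, the estimator recovers the two slope coefficients up to floating-point error, regardless of how the fixed effects are distributed.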

Structural Results
Next, we show and discuss the results from the structural estimation described in Section 4.
We estimate the DSC model in the pooled sample covering the years 2012-2013, and later add an indicator for the premium shock. As a benchmark, we compare the results to a standard conditional logit model including the same covariates (see Appendix Table A.I). 27 Table III shows the estimation results of the DSC model for the plan choice and the attention stage. The estimates indicate that the elderly prefer plans with lower premiums, expected out-of-pocket costs and deductibles, as the corresponding utility weights are all negative and statistically significant. Moreover, they dislike managed care features, potentially because these plans restrict free access to healthcare providers through "gatekeeping". Also, the elderly have a strong preference for the main brand and attach a large utility weight to their default plan. The default indicator consistently displays the largest coefficient among all plan attributes, reflecting the fact that the elderly are willing to spend a lot of money to remain with the plan they chose in the previous year. In monetary terms, our estimates translate to average switching costs of approximately CHF 1,200. 28 Thus, similar to evidence from the U.S. (e.g., Heiss et al., forthcoming; Polyakova, 2016), we find that the elderly face considerable switching costs, which at least in part explains why we observe low switching rates in the data (see Table I). 29 Moreover, recall that switching yields average cost savings of approximately CHF 400 (7.2% of total costs). In light of the estimated switching costs, the decision to remain with the status quo seems reasonable from a purely financial perspective.
However, it is conceivable that these costs are only perceived, for instance because of a misunderstanding of the insurance system. Many individuals have supplementary insurance, which does not cover conditions that already existed when the contract was signed. Therefore, most elderly want to keep their existing supplementary insurance. They might fear that they would lose their supplementary insurance when changing the mandatory insurance provider (which they would not). There may be other reasons why they overvalue their current choice without realizing that actual switching costs are much smaller than perceived switching costs.
27 Note that our preferred specification, the DSC model, fully nests the standard conditional logit model.
28 Switching costs are obtained by dividing the coefficient on the default plan by the premium coefficient and multiplying by 100 because of the scaling of the premium variable. In general, money-metric effects can be computed by dividing the coefficient of interest by the estimated premium parameter.
29 Based on their preferred specification that allows conditioning on consumer attention, Heiss et al. (forthcoming) report switching costs that range between $200 (13% of total costs) and $2,700 (180% of total costs).
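The money-metric conversion described in footnote 28 amounts to a single ratio. The sketch below uses hypothetical coefficients chosen only to reproduce the order of magnitude reported in the text; they are not the Table III estimates, and the sign convention (negative premium coefficient, positive willingness to pay) is an assumption.

```python
def money_metric(coef, premium_coef, premium_scale=100.0):
    """Convert a utility coefficient into CHF.

    premium_scale reflects the assumption that premiums enter the model
    in units of CHF 100, as the footnote on scaling suggests.
    """
    if premium_coef >= 0:
        raise ValueError("expected a negative premium coefficient")
    return -coef / premium_coef * premium_scale

# Hypothetical coefficients, for illustration only.
beta_default = 3.0    # utility weight on the default plan
beta_premium = -0.25  # utility weight on the premium (per CHF 100)
switching_cost_chf = money_metric(beta_default, beta_premium)
```

The same function applies to any other attribute, for instance to express the utility weight on the deductible in premium-equivalent CHF.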
Moreover, the switching cost estimates demonstrate why it is crucial to account for consumer attention when modeling plan choice. Table A.I in the Appendix shows the estimated parameters based on the standard conditional logit (CL) using the same data. The CL is based on the assumption that all consumers pay attention to the plan choice problem and the decision to stay with the status-quo (or any plan) is deliberate. Given that the vast majority of individuals actually stays with the default, the CL underestimates the response to plan characteristics and overstates the utility weight on the default in order to fit the data. In fact, when comparing the CL to the DSC estimates, we find that the coefficients on all plan attributes are much smaller in the CL and the utility weight on the default plan is inflated by a factor of almost three translating to unreasonably high switching costs of more than CHF 7,000.
Similar to previous work on plan choice (e.g., Abaluck and Gruber, 2011, 2016; Heiss et al., forthcoming), we also find that the elderly have inconsistent preferences. For example, they are willing to give up approximately CHF 130 in premiums to reduce their OOP costs by CHF 100. A rational decision-maker, on the other hand, would value a CHF 100 decrease in OOP costs equivalently to a premium reduction of the same amount as both decrease total costs by the same amount. Moreover, consumers are willing to spend CHF 59 in premiums for a CHF 100 reduction of the deductible. Because the financial consequences of a deductible are fully accounted for by the premium and the level and variance of OOP costs, the deductible should play no role in the decision. In that sense, consumers overweight the deductible level in their decision. This possibly results in inappropriately low deductible levels, which in turn unnecessarily drive up foregone savings and possibly reduce welfare.
Finally, our analysis shows that inattention is a defining feature of plan choice among the elderly. We estimate that merely 7% of individuals are attentive. Comparing this number to the observed switching rate of 4.8%, this finding implies that once consumers "wake up", most switch plans (about 70%). 30 On the other hand, more than nine in ten individuals remain in their previous plan because they are not paying attention. Accordingly, inattention explains a large part of the high degree of inertia.
Further, our estimates in the attention stage indicate that an increase in the expected OOP costs (e.g. triggered by a health shock) of the default plan tends to "wake up" individuals and makes them reconsider their plan choice. Moreover, we find that individuals with a managed care plan or high deductibles as their default are significantly more attentive than those with non-managed care or low deductible defaults. This result suggests that attention is linked to risk, as low risk types typically self-select into managed care plans and high deductibles.
30 To be specific: The switching rate in the non-leaver sample pooled over the years is equivalent to 0.047/(1 − 0.018) ≈ 4.8% (see Table I). The resulting share of switchers conditional on attention is 0.048/0.07 ≈ 70%.
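The arithmetic in footnote 30 can be retraced in a few lines. The inputs are the shares quoted in the text; the variable names are ours, and the interpretation of 0.018 as the share of leavers is an assumption based on the description of the non-leaver sample.

```python
# Shares quoted in the text (pooled over 2012-2013).
switch_share_all = 0.047   # switchers as a share of all individuals
leaver_share = 0.018       # assumed: share of individuals leaving the insurer
attention_prob = 0.07      # estimated probability of paying attention

# Switching rate among non-leavers.
switch_rate_nonleavers = switch_share_all / (1.0 - leaver_share)

# Share of attentive individuals who actually switch.
switch_given_attention = switch_rate_nonleavers / attention_prob

# Share of individuals who stay because they are inattentive.
inattentive_share = 1.0 - attention_prob
```

Without intermediate rounding the conditional switching share comes out slightly below the 70% obtained in the footnote from the rounded rate of 4.8%, which does not change the interpretation.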
In conclusion, we identify two important sources of inertia: First, even when accounting for attention, the elderly face non-negligible switching costs preventing them from switching. Second, the majority of individuals does not pay attention to plan choice in the first place resulting in low switching rates and thus inertia.
-Insert Table III about here -

The Role of Premium Shocks as Attention Triggers
As briefly discussed in Section 2.3, the Swiss regulation obliges the elderly to pay for accident insurance out-of-pocket once they retire. To analyze the choice implications of this premium shock, we re-estimate our structural model and include a retirement indicator capturing the corresponding price hike in the attention stage. Table IV shows the corresponding estimates.
The parameters in the plan choice stage are essentially unchanged by the inclusion of the retirement premium shock. More importantly, however, the premium shock is a strong attention trigger, as indicated by the comparatively large and positive coefficient. To illustrate the impact of the premium shock on attention, Table A.II in the Appendix shows the average predicted attention probability when the premium shock is switched on and off. We find that exposure to the retirement premium shock increases the attention probability from 7% to 9% and thus by almost 30%.
-Insert Table IV about here -
These results suggest that individuals are much more likely to "wake up" once they are hit with a sizeable shock. 31 Even though premiums regularly increase by around 4.5%, these smaller, more gradual increases are not sufficient to attract much attention to plan choice. This finding has direct implications for health insurance policy, as managed competition crucially relies on sufficiently price-sensitive consumers. While our results suggest that consumers are, in fact, sensitive to prices, as indicated by the negative utility weight on premiums, this sensitivity is not sufficient given that the majority remains inattentive. Perhaps attention could be triggered by requiring insurers to increase the visibility of price differences to the consumers (e.g., Schmitz and Ziebarth, 2017). For this purpose, the regulator could instruct insurers to print the cost difference to the least-cost plan on the information letter announcing next year's premium. Furthermore, switching costs could perhaps be reduced by providing personalized information (Kling et al., 2012).
31 Of course, retirement has many effects apart from having to include accident insurance. There is also a significant drop in income, and individuals have more time to think about health insurance. These factors may also influence the individual attention allocation.

Welfare Effect and Policy Simulation
Next, we use the estimates from the structural model to simulate the welfare implications of two counterfactual policies. First, we recover the welfare loss of inattention by comparing consumer choices to counterfactual choices under full attention (see Subsection 5.3.1). In that case, the choices are purely driven by consumer preferences, as everyone is forced to pay attention.
Second, we assess a policy that entails a modification of the choice environment in which financially dominated plans 32 are eliminated from the choice set (see Subsection 5.3.2). Such a choice-set-based policy forces those who choose these dominated plans to reconsider their plan choice. While this policy by design reduces overspending, the welfare implications for the forced switchers are ex ante unclear as they may switch to plans that are inferior in terms of other utility-relevant characteristics.
To evaluate the counterfactuals, we have to take a stand on what factors determine normative utility. We follow previous work in assuming that normative utility depends on costs, risk protection and observable quality indicators (i.e. premiums, out-of-pocket costs, financial risk and the inclusion of managed care mechanisms) (Abaluck and Gruber, 2011, 2016; Abaluck and Adams-Prassl, 2021; Heiss et al., 2007). As is standard in the choice literature, we transform the normative utility to money-metric normative utility by dividing the estimated coefficients on the relevant plan characteristics by the premium coefficient. Based on the money-metric normative utility, we calculate the expected consumer surplus for each consumer before and after implementing the policies, following Train (2009). In this calculation, μ_i is the probability that person i pays attention, s*_ij is the probability that the same person chooses plan j among the non-default plans (conditional on being attentive), u_ij is the money-metric normative utility 33 for plan j, and j = d marks the default plan. The expected foregone savings and plan characteristics are calculated analogously.
32 We consider plans as financially dominated if they never minimize expected out-of-pocket costs for any level of healthcare spending.
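One way to read the surplus calculation is as a probability-weighted average of money-metric normative utility: with probability 1 − μ_i the individual stays in the default, and with probability μ_i she chooses according to s*_ij. The sketch below implements this reading. The functional form is an assumption inferred from the variable definitions in the text, and all numbers are hypothetical.

```python
import numpy as np

def expected_surplus(mu, s_star, u, default_idx):
    """Expected money-metric normative utility of one individual.

    mu          : probability of paying attention
    s_star      : choice probabilities conditional on attention
    u           : money-metric normative utility of each plan (CHF)
    default_idx : position of the default plan
    """
    s_star = np.asarray(s_star, dtype=float)
    u = np.asarray(u, dtype=float)
    return (1.0 - mu) * u[default_idx] + mu * float(s_star @ u)

# Hypothetical individual with three plans; plan 0 is the default.
u = [-5000.0, -4600.0, -4800.0]
s_star = [0.3, 0.5, 0.2]

status_quo = expected_surplus(0.07, s_star, u, default_idx=0)
full_attention = expected_surplus(1.0, s_star, u, default_idx=0)
welfare_gain = full_attention - status_quo
```

Setting the attention probability to one reproduces the full-attention counterfactual; a choice-set policy would instead drop plans from u and s_star and renormalise the remaining probabilities.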

The Welfare Costs of Inattention
To quantify the welfare costs of inattention, we compare the expected foregone savings and consumer surplus under the status quo to a full attention counterfactual in which all consumers are forced to pay attention. Our analysis shows that inattention increases insurance costs for the typical consumer: average expected foregone savings fall from CHF 1,093 under the status quo to CHF 770 under full attention, a decrease of about 30% (equivalently, inattention raises foregone savings by roughly 40%). Moreover, full attention considerably reduces the variation in foregone savings we observe in the data (CHF 310 vs. CHF 648 under the status quo). Also, forced attention reduces the overspending for 70% of individuals. 34 In that sense, inattention seems to harm consumers financially.
On the one hand, inattention increases the financial burden because of lower market shares for managed care plans, which come at a lower price. On the other hand, the market shares for mid-deductible plans increase when consumers are attentive, mostly at the expense of the lowest-deductible plans. As will be discussed in Subsection 5.3.2, these mid-deductibles are never cost-minimizing and thus increase foregone savings. Thus, inattention may also guard some individuals from making financially dominated choices.
Focusing on cost savings alone, however, is insufficient to evaluate the welfare costs of inattention, as out-of-pocket costs are only one component of (normative) utility. Switching plans may have implications for premiums, risk protection, and the access to and quality of care that the insured receive; these have to be factored in when evaluating the welfare implications.
We proceed by analyzing how the policy affects the expected consumer surplus. The top panel in Figure 3 plots the expected consumer surplus under the new policy (y-axis) against the status quo (x-axis). Each dot represents the expected consumer surplus of one individual in the sample. While welfare has not changed for individuals on the 45-degree line, it has increased (decreased) for those above (below) the line. The figure shows that the majority of individuals would gain from being attentive, as is suggested by the large mass above the 45-degree line. The bottom panel in Figure 3 shows the distribution of the difference in expected welfare between the counterfactual and the status quo (inattention). Overall, inattention generates a welfare loss of CHF 125 per person and year. By eliminating inattention, 50% of the elderly could expect a gain of at least CHF 52 in consumer surplus. However, the probability mass to the left of zero indicates that welfare gains are unequally distributed; forcing attention reduces the (expected) consumer surplus for roughly 40% of individuals. This result suggests that inattention may actually guard some individuals from making even worse choices. Besides inattention, the remaining welfare losses arise from individuals putting weight on characteristics such as the deductible and brands that are irrelevant for normative utility. So even in the absence of inattention, a non-negligible fraction of consumers can be expected to overspend money on health insurance (see Figure A.2).
33 To fix ideas, u_ij = PREMIUM + γ_2 OOP + γ_3 V(OOP) + γ_4 MC, where γ = β/β_Premium.
34 See also Figure A.2 which plots the expected foregone savings under the status quo against those under the full-attention counterfactual. Each dot represents the expected foregone savings for a single observation.
-Insert Figure 3 about here -

Mid-Deductible Elimination
Consumers in the basic health insurance market in Switzerland have the choice between six deductible levels ranging from CHF 300 to CHF 2,500 per year (see Section 2). 35 Two deductibles are financially dominated within this choice set: the middle deductibles of CHF 1,000 and CHF 1,500 (hereafter mid-deductibles). They are financially dominated because consumers can always opt for another deductible level that provides lower total costs (out-of-pocket costs and premiums) no matter what their healthcare costs are. Figure 4 illustrates the total costs of a consumer as a function of health spending for a deductible of CHF 300, CHF 1,500, and CHF 2,500. 36 The graph shows that up to a level of CHF 2,000 in annual healthcare spending, the maximum deductible of CHF 2,500 (green line) minimizes total costs. Above that, consumers minimize total costs by choosing the minimum deductible of CHF 300 (blue line). In other words, from a purely financial perspective, consumers should never opt for the mid-deductibles as they are financially dominated for any level of healthcare expenditures.
35 The maximum premium rebate health insurers can offer to consumers for increasing their deductible level above CHF 300 amounts to 70% of the incremental deductible. For example, when moving from the minimum to the maximum deductible, the annual premium rebate cannot exceed CHF 1,540 (i.e. (2500 − 300) × 0.7).
36 Figure 4 is based on premiums for the standard model at CSS in 2012. Note that premiums are decreasing in the deductible.
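The dominance argument can be checked numerically under simplifying assumptions. The sketch below assumes that insurers grant exactly the maximum rebate of 70% of the incremental deductible, ignores any cost sharing above the deductible, and uses a hypothetical base premium; it compares the mid-deductibles only with the minimum and maximum deductible. It is therefore an illustration of the logic, not a reproduction of Figure 4, which is based on actual CSS premiums.

```python
MIN_DEDUCTIBLE = 300
MAX_DEDUCTIBLE = 2500

def total_cost(spending, deductible, base_premium=4000.0, rebate_rate=0.7):
    """Premium plus out-of-pocket costs for a given level of health spending.

    Assumptions: rebate at the regulatory cap, no cost sharing beyond the
    deductible, hypothetical base premium for the CHF 300 deductible.
    """
    premium = base_premium - rebate_rate * (deductible - MIN_DEDUCTIBLE)
    out_of_pocket = min(spending, deductible)
    return premium + out_of_pocket

def is_dominated(deductible, grid=range(0, 10001, 50)):
    """True if the minimum or the maximum deductible is strictly cheaper
    at every spending level on the grid."""
    return all(
        total_cost(h, deductible)
        > min(total_cost(h, MIN_DEDUCTIBLE), total_cost(h, MAX_DEDUCTIBLE))
        for h in grid
    )

mid_dominated = {d: is_dominated(d) for d in (1000, 1500)}
max_rebate = total_cost(0, MIN_DEDUCTIBLE) - total_cost(0, MAX_DEDUCTIBLE)
```

Under these assumptions the CHF 300 and CHF 2,500 deductibles break even at CHF 1,840 of annual spending, somewhat below the roughly CHF 2,000 visible in Figure 4; the difference stems from the simplifications made here.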
-Insert Figure 4 about here -
Based on this observation, we propose a "mid-deductible elimination" policy in which the choice set of consumers is reduced to low and high-deductible plans. Under this proposal, all enrollees in the mid-deductible plans are forced to pay attention and switch to either a lower (CHF 300 or 500) or higher deductible (CHF 2,000 or 2,500). In the years 2012-2013, more than 13% of the elderly chose a mid-deductible plan ("mid-deductible choosers"). For these mid-deductible choosers, we simulate the welfare implications of the policy and analyze their choices.
Our policy simulation shows that eliminating the mid-deductible plans leads to significant cost savings for the mid-deductible choosers. The expected foregone savings on average drop from CHF 1,017 in the status quo to CHF 827 under the new policy (-19%). Hence, the typical mid-deductible chooser could save about CHF 190 in total costs p.a. through the policy.
Moreover, the choice-set limitation leads to a substantial reduction in the standard deviation of overspending from almost CHF 570 to less than CHF 260, limiting maximum foregone savings to CHF 1,420 (pre-policy max: CHF 2,660).
Figure 5 depicts the change in expected consumer surplus. The majority of the probability mass is concentrated above zero, reflecting that most individuals would benefit from the mid-deductible elimination policy. The average welfare gains amount to slightly more than CHF 180, and 50% of individuals could increase their expected consumer surplus by CHF 150 or more.
Our simulation further shows that those who would benefit from the policy are individuals with comparatively high healthcare expenditures. They realize their welfare gains by choosing lower deductibles which drastically reduce their total costs. Moreover, they are more likely to opt for managed care plans which offer lower premiums. In contrast, the remaining third of mid-deductible choosers who are worse off after the policy tend to choose inappropriately low deductibles for their spending profile and do not include managed care features in their health plans, which makes them pay high premiums.

CONCLUSION
This paper analyzes the role of inattention as a major source of inertia in health plan choices. We contribute to the literature on plan choice by explicitly accounting for (in-)attention in decision-making in a healthcare market with mandatory insurance under managed competition. Applying the approach by Abaluck and Adams-Prassl (2021), we recover structural utility parameters and attention probabilities for the elderly population in Switzerland. Our results show that consumers systematically overspend and remain in their current health plan even when financially better options are available. We find that inattention explains large parts of consumer inertia: at least 9 in 10 individuals do not consider options beyond their default plan.
Compared to simpler discrete choice models, taking inattention into account reduces estimated switching costs by a factor of 5. Still, even conditional on attention, implied switching costs are sizeable and explain parts of the observed levels of choice persistence. Further, we find that premium shocks can operate as an attention trigger that induces consumers to make active and better health plan choices.
Because inert consumers are a threat to the success of managed competition, a better understanding of potential barriers to switching is necessary to improve the system. We have identified two main barriers: a high degree of inattention and switching costs which on average exceed the financial gains of switching. Our policy simulations show that under full attention, 70% of consumers would choose financially superior plans (in the sense of having lower expected foregone savings) and about 60% could increase their consumer surplus. Taken together, our findings suggest that the current system is highly inefficient. There are private welfare losses due to inattention, and there is a social welfare loss because managed competition does not work under these circumstances.
Switching costs are probably better interpreted as the willingness to pay for staying with the current plan because the actual monetary costs of switching are negligible in Switzerland.
There are several possible reasons for the high willingness to pay we find in our analysis. One may be a misunderstanding of the insurance system. Many individuals buy supplementary insurance, which does not cover conditions that already existed when the contract was signed.
Therefore, most elderly want to keep their existing supplementary insurance. They might fear losing their supplementary insurance when changing the mandatory insurance provider (which they would not). Bhargava et al. (2017) also provide evidence that the main reason for making bad choices is a lack of understanding of health insurance. In addition, Lamiraud and Stadelmann (2020) show that low-price supplementary products help attract consumers to basic contracts and discourage switching between health plans in basic insurance.
Switching to another insurance company then indeed creates indirect switching costs because the low price for the supplementary product is conditional on having basic insurance with the same company.
Both inattention and high willingness to pay due to misunderstanding may be addressed by personalized information that, for example, informs consumers about the cost-minimizing plan given their spending profile in the previous year. A more radical alternative would be to actually enroll individuals in their best plan (smart default) while giving them the option to switch back to the previous plan. It is, however, unclear who should provide the personalized information or implement the smart default policy. The insurance companies have no interest in increasing the switching rates because, for them, low switching avoids competition. An alternative would be the regulator (i.e. the Federal Office of Public Health). Still, the informational requirements would be large (the same as for the consumers), especially if the comparison is made across all insurance companies. Also, it may be problematic if the regulator sends out personalized information and recommends insurance companies. For these reasons, we are skeptical whether it is possible in the current framework to increase consumer awareness and switching rates.
There are several limitations to our study. First, we only analyze a subset of both the population (the elderly) and their choice set (only one insurance conglomerate). The advantage of analyzing the behavior of the elderly is that they already have 15 years of experience with the health insurance system after the major reform of 1996 introducing the current system.
Because this is the subpopulation with the lowest switching rates, our simulations provide a lower bound for the expected effect in the entire population. At the same time, we provide upper bounds for individual welfare losses due to inattention.
Another limitation is that we do not account for changes in premiums caused by limiting the choice set or by increasing the switching rate. The mid-deductibles are profitable for the insurance company because expected costs are below premium payments. However, as insurers are not allowed to make profits in the compulsory insurance system, companies use these gains to subsidize the premiums in the low and high deductible plans. If the mid-deductibles are eliminated, premiums in the remaining plans may increase. On the other hand, having higher switching rates may increase competition among insurance companies, which in turn may lead to lower premiums. So it is ex ante unclear how premiums will change when the reforms we discussed are implemented.
Finally, we assume that moral hazard is non-existent by keeping the (expected) expenditures constant across plans. In other words, healthcare expenditures are not a function of the chosen deductible level or any other plan feature. This assumption is certainly restrictive and may be relaxed in future work. 38
38 However, we allow for cost differences due to selection. Empirical evidence on the relative importance of these two mechanisms is mixed. Perhaps the most influential empirical estimate of health cost elasticity due to moral hazard is −0.2 from the RAND experiment (Aron-Dine et al., 2013). Other experimental evidence includes the Oregon HIE (Finkelstein et al., 2012). Estimates for the market under consideration (Switzerland) remain mixed: the estimated elasticities of healthcare expenditures due to moral hazard when changing a deductible range from −0.2 to 0 (Gerfin et al., 2015; Boes and Gerfin, 2016). Therefore, our assumption is in line with some of the empirical estimates for the market under consideration. If this is the relevant elasticity, relaxing the assumption is unlikely to drastically alter the findings of our analysis.
As previously mentioned, we only observe a single realization of OOP costs for each individual and year. To get a measure for the distribution of costs, we compare individuals who look ex-ante similar in observable characteristics (see subsection A.2 for details). Based on the standard deviation of expected OOP costs within this group of comparable individuals, we construct a plan-specific measure of variance.
To assess the sensitivity of our results to the assumptions on expectation formation, we consider two additional measures of OOP costs: perfect foresight and rational expectations.

A.1 Details on Calculating Out-of-Pocket Expenditures
In this section, we describe the construction of the out-of-pocket (OOP) expenditure measure in greater detail. Throughout the ensuing description, we model the expected OOP costs of an individual i who chooses a health plan for the year t + 1 at some point during the open enrolment window at the end of year t. The default plan is the plan the individual is currently enrolled in (year t).
We proceed in three steps. Based on the realized gross costs and certain individual characteristics, we calculate expected gross costs for each individual under three scenarios. Based on these imputed gross costs, and under the assumption of no moral hazard, we then calculate the expected OOP costs for each plan in an individual's choice set. Lastly, to account for the uncertainty in measuring these expectations, we also derive a measure of the expected variance in OOP costs.

A.2 Grouping Individuals into Distinct Cells
To form the rational expectation measure and the variance measure, individuals in the dataset are grouped into distinct cells based on their year-t characteristics, as follows: age group × cost percentile × gender × year.
We use one age group: the elderly (59 to 70). The cost percentiles 39 are based on the age group-specific distribution of total HCE in the current calendar year (year t, i.e. the year at the end of which the open enrolment window takes place). We exclude cells with fewer than 10 individuals. The average cell size is 3,418 (minimum: 2,130, maximum: 8,041).
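As an illustration, the cell construction described above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' code: the record fields (`age`, `gender`, `year`, `hce`), the function names, and the rank-based way of cutting the expenditure distribution into twenty 5% bins are assumptions made for the example.

```python
from collections import defaultdict

MIN_CELL_SIZE = 10  # cells with fewer individuals are excluded (from the text)
N_BINS = 20         # 5% percentile bins (from the text)


def age_group(age):
    """Return the label of the single age group used here, or None if outside it."""
    return "59-70" if 59 <= age <= 70 else None


def assign_cells(records):
    """Group record indices into cells: (age group, cost bin, gender, year).

    `records` is a list of dicts with the hypothetical keys
    'age', 'gender', 'year' and 'hce' (total expenditures in year t).
    """
    # Percentiles are specific to the age group and the calendar year.
    by_group_year = defaultdict(list)
    for idx, rec in enumerate(records):
        group = age_group(rec["age"])
        if group is not None:
            by_group_year[(group, rec["year"])].append(idx)

    cells = defaultdict(list)
    for (group, year), idxs in by_group_year.items():
        # Rank by expenditures; ties keep their input order (assumption).
        ranked = sorted(idxs, key=lambda i: records[i]["hce"])
        n = len(ranked)
        for rank, i in enumerate(ranked):
            cost_bin = rank * N_BINS // n  # integer in 0..N_BINS-1
            cells[(group, cost_bin, records[i]["gender"], year)].append(i)

    # Keep only cells that reach the minimum size.
    return {key: members for key, members in cells.items()
            if len(members) >= MIN_CELL_SIZE}
```

The returned dictionary maps each cell key to the list of record indices it contains, which is the input needed for the cell means and within-cell variances used later in this appendix.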

A.3 Calculation of Gross Cost Variables: Three Alternative Measures
Backwards-Looking (BL)
Individuals assume that current-year spending is their best forecast for next-year spending. They incorporate no private information beyond spending in their cost prediction. To get a measure of distribution, we use the within-cell variance for cell c, plan j and year t.

Perfect Foresight (PF)
Upon taking the decision about next year's health plan, every individual is perfectly informed about their (gross) healthcare expenditures for the next year. That is, individuals perfectly anticipate all health shocks. To get a measure of distribution, we use the within-cell variance for cell c, plan j and year t + 1.
39 5% percentiles are used.

Rational Expectations (RF)
Conditional on their previous spending pattern and certain characteristics, individuals anticipate the level of their next-year spending up to a nuisance term. Formally, $OOP_{i,t+1} = \alpha_{c,t+1} + \xi_{i,t+1}$, with $E_t[\xi_{i,t+1}] = 0$. To predict the level of gross spending in t + 1, we use the average expenditures by (ex-ante) similar individuals. To be precise, we use the average costs for cell c and year t + 1 (based on the realized gross cost in year t + 1). 40 Again, to get a measure of distribution, we use the within-cell variance for cell c, plan j and year t + 1.
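In summary, the three models predict gross cost in year t + 1 as follows: BL uses the individual's realized gross cost in year t, PF uses the individual's realized gross cost in year t + 1, and RF uses the average realized gross cost in year t + 1 within the individual's cell c.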
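The three measures can be illustrated with a short Python sketch. It assumes, hypothetically, that realized gross costs are stored in dictionaries keyed by a person identifier and that each person has already been assigned to a cell; the function name and the dictionary keys are illustrative and not taken from the paper.

```python
from statistics import mean


def gross_cost_measures(cost_t, cost_t1, cell_of):
    """Predicted gross cost for year t+1 under the three expectation models.

    cost_t, cost_t1: dicts mapping person id -> realized gross cost in t and t+1.
    cell_of: dict mapping person id -> cell label built from year-t characteristics.
    """
    # Rational expectations: average realized t+1 cost within each cell.
    by_cell = {}
    for person, cell in cell_of.items():
        by_cell.setdefault(cell, []).append(cost_t1[person])
    cell_mean = {cell: mean(values) for cell, values in by_cell.items()}

    return {
        person: {
            "backward_looking": cost_t[person],                  # current-year spending
            "perfect_foresight": cost_t1[person],                # realized next-year spending
            "rational_expectations": cell_mean[cell_of[person]],  # cell average in t+1
        }
        for person in cell_of
    }
```

Only the rational-expectations measure pools information across individuals; the other two use each person's own realized cost in year t or year t + 1.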

A.4 Out-of-Pocket Costs
Given (an estimate of) the total healthcare expenditures $C_i$ of individual i, the OOP costs of a health plan j with a deductible of $d_j$ are given by a function oop($C_i$, $d_j$) of total expenditures and the deductible.
40 Note that this is identical to predicting costs from a set of year t characteristics and spending in a fully saturated regression. Deriving the means from the cells allows us to recover an additional moment of the cost distribution.
Formally, the OOP costs for each plan under the three models of gross cost described above are obtained by applying the function oop(•, •) (described above), which takes two arguments: predicted gross cost and the deductible.
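To make the mapping from gross cost to OOP cost concrete, the following Python sketch implements one plausible version of oop(•, •). It is an assumption rather than the paper's exact expression: it follows the Swiss statutory cost-sharing rule for adults, in which the insured pays all costs up to the deductible and then a 10% coinsurance on costs above it, capped at CHF 700 per year. The constant names are hypothetical.

```python
COINSURANCE_RATE = 0.10  # assumption: statutory coinsurance rate above the deductible
COINSURANCE_CAP = 700.0  # assumption: annual coinsurance cap for adults, in CHF


def oop(gross_cost, deductible):
    """Out-of-pocket cost for a given gross cost and plan deductible (no moral hazard)."""
    below = min(gross_cost, deductible)        # paid in full up to the deductible
    above = max(gross_cost - deductible, 0.0)  # costs exceeding the deductible
    return below + min(COINSURANCE_RATE * above, COINSURANCE_CAP)
```

Under the no-moral-hazard assumption, the same predicted gross cost is passed to `oop` for every deductible in the choice set, so plans differ only through the second argument.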

A.5 A Measure of Variance
Given a prediction of the OOP costs, the variance of the cost distribution under each scenario is computed within each cell c for each plan j, with $n_c$ denoting the number of individuals in cell c.
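A minimal Python sketch of such a within-cell variance is given below. It assumes that predicted OOP costs for one plan and one scenario are stored in a dictionary keyed by a hypothetical person identifier, and it divides by $n_c$ (population variance); whether the original measure divides by $n_c$ or $n_c - 1$ is not stated here, so this choice is an assumption.

```python
from statistics import pvariance


def cell_plan_variance(oop_costs, cell_of):
    """Within-cell variance of predicted OOP costs for a single plan and scenario.

    oop_costs: dict mapping person id -> predicted OOP cost under the plan.
    cell_of: dict mapping person id -> cell label.
    Returns a dict mapping cell label -> variance (denominator n_c, an assumption).
    """
    by_cell = {}
    for person, cell in cell_of.items():
        by_cell.setdefault(cell, []).append(oop_costs[person])
    return {cell: pvariance(values) for cell, values in by_cell.items()}
```

Calling the function once per plan and per scenario yields the plan-specific variance measure for every cell.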

Appendix B Functional Form of the Empirical Approach
In the following, we provide further information on the functional form and the estimation of the DSC model, which is the foundation of our structural approach. We refer to Abaluck and Adams-Prassl (2021) for the modeling assumptions and a detailed proof.

B.1 Estimation
Functional Form and Parametrization. We model the functions $s_{ijt}(\cdot)$ and $\mu_i(\cdot)$ as follows.

Attention Probability:
The attention probability is specified with a logistic error term $\eta_{it}$, a time-specific constant $\delta_t$, and characteristics $z_{id}$ of the default good d. The vector $z_{id}$ may include all or part of the same characteristics as the covariate vector $x_{id}$. Additionally, it may contain other variables. In practice, we include premium, mean and variance of expected OOP costs, deductible, a dummy for the MC feature, and a year fixed effect.
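With a logistic error term, the attention probability takes the familiar logistic form in a linear index of the default plan's characteristics. The Python sketch below illustrates this under the assumption that the index is $\delta_t + z_{id}\gamma$, with γ the attention coefficients; the function and argument names are hypothetical.

```python
import math


def attention_probability(z_default, gamma, delta_t):
    """Logistic attention probability given default-plan characteristics.

    z_default: list of characteristics of the default plan (z_id).
    gamma: list of coefficients of the same length (assumed parametrization).
    delta_t: time-specific constant.
    """
    index = delta_t + sum(z * g for z, g in zip(z_default, gamma))
    # Evaluate the logistic function in a numerically stable way.
    if index >= 0:
        return 1.0 / (1.0 + math.exp(-index))
    exp_index = math.exp(index)
    return exp_index / (1.0 + exp_index)
```

A positive coefficient on, say, the default plan's premium would mean that a higher default premium makes attention more likely.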

Latent Choice Probability:
We assume that individual i's utility from choosing good j at time t is $u_{ijt} = v_{ij}(x_{jm}) + \epsilon_{ijt}$. We parametrize indirect utility as a linear function of the characteristics of option j, $v_{ij}(x_{jm}) = x_{ijt}\beta$, where $x_{ijt}$ is a vector of characteristics of option j (premium, mean and variance of expected OOP costs, deductible, MC feature), including a default dummy.
Further, we assume that $\epsilon_{ijt}$ is distributed i.i.d. Type I Extreme Value. It follows that the probability of choosing option j under full attention takes a conditional logit form.

Log Likelihood Function (General)
By parametrizing the functions $\mu(\cdot)$ and $s^*_j(\cdot)$, we can write the probability of choosing alternative j as a function of the parameters θ = (β, γ) and construct the likelihood from these probabilities and the choice indicators $y_{ij}$, where $y_{ij} = 1$ if individual i chooses good j. The parameters β, γ can then be estimated by maximum likelihood.

Individual contribution to the Log Likelihood.
To fix ideas, we can combine our likelihood function with the functional forms described above to get an expression for the individual contributions to the likelihood function, where $y_{ij}$ takes on value 1 if i chooses j.
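Combining the attention probability with the conditional logit shares, an individual's contribution to the log-likelihood can be sketched numerically as follows. The sketch assumes the usual default-specific consideration structure, in which an inattentive individual keeps the default plan and an attentive individual chooses according to the latent logit shares; the function names are hypothetical, and the utilities passed in are assumed to already contain the default dummy.

```python
import math


def logit_shares(utilities):
    """Conditional logit choice probabilities under full attention."""
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def choice_probabilities(utilities, default_index, mu):
    """Observed choice probabilities given attention probability mu.

    Non-default plans are chosen only when attentive; the default is chosen
    when inattentive, or when attentive and it has the highest utility draw.
    """
    probs = [mu * s for s in logit_shares(utilities)]
    probs[default_index] += 1.0 - mu
    return probs


def log_likelihood_contribution(utilities, default_index, mu, chosen_index):
    """Log of the probability of the plan the individual actually chose."""
    return math.log(choice_probabilities(utilities, default_index, mu)[chosen_index])
```

Summing `log_likelihood_contribution` over individuals and years gives the objective that would be maximized over β and γ; setting `mu` to one recovers the standard conditional logit.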
Successively plugging in for $s_{ijt}(\beta)$, $\mu_i(\cdot)$ and $s^*_j(\cdot)$ gives individual i's contribution to the log-likelihood, which reduces to one of two expressions depending on whether i chooses the default or not.

Notes: The table shows FE reduced form estimates for health plan switching and leaving as outcome variables. Column 1 includes all individuals, Column 2 includes stayers and switchers, and Column 3 includes stayers and leavers. The 'any change' indicator is one for changes in the deductible, the inclusion of a managed care feature, changing the carrier, or any combination of these possibilities. The 'switching' indicator is one for all listed changes within the conglomerate. The exit indicator is one for those changing to an outside carrier. The list of time-varying individual characteristics includes age categories and indicators for chronic diseases (PCGs). Standard errors clustered at the individual level in parentheses: ** p < 0.01, * p < 0.05.