Learning from failure in healthcare: Dynamic panel evidence of a physician shock effect

Procedural failures of physicians or teams in interventional healthcare may positively or negatively predict subsequent patient outcomes. We identify this effect by applying (non)linear dynamic panel methods to data from the Belgian transcatheter aorta valve implantation registry containing information on the first 860 transcatheter aorta valve implantation procedures in Belgium. We find that a previous death of a patient positively and significantly predicts subsequent survival of the succeeding patient. We find that these learning from failure effects are not long-lived and that learning from failure is transmitted across adverse events.


INTRODUCTION
Physician learning is an umbrella term covering multiple types of learning, forgetting and knowledge transfer. The literature on physician learning typically focuses on specific case studies and on the identification of different types of learning. Most notably, a distinction is made between physician experience, economies of scale and human capital depreciation (Hockenberry and Helmchen, 2014; Van Gestel et al., 2016). In Van Gestel et al. (2016), these three types of learning have been further investigated with a focus on patient subgroups. In this paper, we build on this work by looking at performance responses to a failure. An adverse event may, either positively or negatively, disrupt health provider performance. A failure may have an impact on performance through three primary channels. First, a failure might alarm and shock a provider.
As a result, in a subsequent procedure, the provider might be better prepared and/or better motivated to obtain a positive outcome. This response might arise because of, among others, loss aversion. Second, a failure may provide the physician with more information on specific aspects of the procedure. Lastly, subsequent to a failure, patients with different characteristics may be selected to undergo the procedure. Although clearly a selection effect, this might indicate that physicians learn to select patients more appropriately over time. Hence, the three channels may all represent certain aspects of provider learning from failure. Throughout the literature, endogeneity problems hamper inference of learning effects, with reverse causality and risk selection over time as the main validity threats (e.g. Gaynor, 2005; Hentschker and Mennicken, 2016). In this paper, our empirical strategy directly aims at addressing potential endogeneity issues when estimating learning from failure effects.
The role of the physician is widely acknowledged to be important in the production of health. Physician incentives and behavior, but also characteristics, are expected to be predictive of patient health outcomes. Aspects like education and specialization (Jollis et al., 1996), adherence to guidelines (Ward, 2006) and experience (Hockenberry and Helmchen, 2014) have all been shown to affect patients' health outcomes. At a more aggregated level, physician supply and contract types (contracted vs. municipality GPs in Aakvik et al. (2006)) have been shown to be correlated with mortality rates (Sundmacher et al., 2011; Iizuka, 2016).
As a result, a number of initiatives target provider performance. Firstly, these initiatives may raise physicians' information and skills through continued medical education (Cervero et al., 2015), traineeships, etc. (Haynes et al., 1995). Secondly, market-based interventions change incentives for patients and physicians which also influences subsequent performance. For example, publicly available report cards (Kolstad, 2013) and pay for performance programs (Li, 2014) target incentives through both intrinsic motivation and market-based stimuli. In this paper, we contribute to this literature by exploring if previous failures (e.g. adverse events like patient mortality and stroke) affect subsequent physician performance. This research may inform policy makers on the scope to introduce future information or awareness campaigns. In addition, this analysis can also serve as a starting point to further explore underlying reasons for such shock effects.
Learning from failure is important at all levels of the healthcare sector. Nation-wide failed health reforms should inform future reforms, and organizations are expected to foster failure-driven learning even though they might face barriers to doing so (Oberlander, 2007; Edmondson, 2004). At the provider level, learning from failure might be encouraged by the information provided by the failure itself (e.g. physicians may learn more from risky patients), by the incentive to avoid malpractice claims (Pänthofer, 2017) or more generally by loss aversion. This loss aversion may relate to income, patient break-ups because of low performance and/or adverse events (Rizzo, 2003; Hareli, 2007). However, failure may not only stimulate learning, it could also result in more failures. This result, typically found in the organizational literature, occurs when, for example, failed (business) projects negatively influence a team (Shepherd, 2013). Psychologically, personal goal failure may lead to negative affective states and may therefore translate into negative subsequent outcomes (Jones, 2013). Also, physician inertia may contribute to continued failure. A failure to respond to adverse outcomes may stem from habit formation and from the reluctance to adjust treatment practice because of sizeable search and learning costs (Janakiraman, 2008). Whereas the idea of learning from failure for physicians is related to research on physician inertia in pharmaceutical prescriptions, we apply the idea of previous experience and inertia to a more interventional setting.
The identification of the learning from failure effect on physician performance relies on previous experiences as patient mortality for a physician depends on mortality of the previous patient(s).
Estimation of such learning from failure effects imposes two main econometric challenges: First, in the context of binary response fixed effects (FE) dynamic panel data models, the well-known incidental parameter problem and Nickell bias lead to possibly inconsistent coefficient estimates (for an overview see Lancaster, 2000). Second, with irregular spacing of panel data, FE may not appropriately be accounted for in dynamic models (McKenzie, 2001). In our data, this irregular spacing occurs because the time span between Transcatheter Aorta Valve Implantation (henceforth TAVI) procedures differs within hospitals. That is, we only observe provider activity when a patient is actually treated in a given facility and therefore the time span between procedures is irregularly spaced between patients.

As a remedy to the incidental parameter problem and the Nickell bias, we apply the bias-corrected FE estimator proposed by De Vos et al. (2015), which has been shown to have superior small sample properties compared to classical GMM-type estimators and FE linear probability models. In addition to the bias-correction approach, we estimate the learning from failure effects using the split-panel jackknife FE estimator recently proposed by Dhaene and Jochmans (2015). We also discuss the irregular spacing of our data, in which the dependency between observations can be modelled as an AR(1) process.
In this paper, we provide evidence for substantial learning from failure effects. We find that if the previous patient died, the probability that the next patient dies within one month is associated with a decrease of 5 to 11 percentage points. Although different non-linear dynamic panel methods provide slightly different point estimates, all specifications lead to qualitatively similar results. We also find minor evidence for a transmission mechanism of shocks between adverse events. A previous stroke, a common complication during or shortly after the TAVI procedure, is correlated with a lower likelihood of dying.
In the remainder of this paper, we discuss the background of our application to TAVI and our data in section 2. In section 3, we focus on the methods to measure the effect of learning from failure. In section 4, we present our results, after which we discuss and conclude in section 5.

THE INTRODUCTION OF TRANSCATHETER AORTA VALVE IMPLANTATION (TAVI)
The application in this paper considers the introduction and evolution of the TAVI procedure in Belgium, where the first procedure in a Belgian hospital took place in 2007. In a TAVI procedure, the defunct aortic valve is replaced with an artificial replacement valve by means of a minimally invasive catheterization. In our sample period the procedure was not reimbursed and was therefore only financed by physicians and hospitals for patients who were anatomically inoperable and who were unable to bear the cost themselves. Later, in 2015, TAVI became reimbursed under certain conditions. We use register data for patients from 2007, including the first patient undergoing TAVI, to the beginning of 2012 in 20 hospitals. As such, our data describe the performance of physicians for the full introduction period of TAVI. In each hospital only one team performs the TAVI procedure, and during the sample period the total workload for TAVI was limited to about one day a week. Furthermore, team composition hardly varies over the sample period 2007-2012.
Information is available on a wide range of patient specific characteristics (see Table I for details) and we have access to hospital identifiers. However, the data does not contain information about additional hospital (e.g. hospital budget) or surgeon-team characteristics (e.g. years of experience).
Note that with the exception of "Ejection fraction", all variables in our data set are binary indicators. Approval was obtained from the institutional ethics committees for data collection at the participating hospitals. On average, 1-month mortality amounts to about 9%, with proportions ranging from 5 to 27% across hospitals. In addition, Table II gives an overview of the total number of patients treated over the different years and Table I shows the total number of patients treated at each of the different hospitals over the entire sample period.
---Insert Table I here ---
---Insert Table II here ---
Table III provides first descriptive evidence for the learning from failure effect for interventional care. Descriptive numbers are provided for four common adverse events: mortality, having a stroke, renal failure and pacemaker implantation. Cerebrovascular stroke is a common complication during the procedure because of, e.g., blood or calcium clots that are dislodged during the procedure (for a technical explanation see Ghelam et al., 2016). Also, renal failure and pacemaker implantation regularly take place during or shortly after the TAVI procedure (Eggebrecht et al., 2011). The estimates in the "Lag=0" columns contain the probabilities of the adverse event if the previous patient did not suffer from the corresponding adverse event. The column "Lag=1" shows the likelihood of the adverse event after an adverse event. For 1-month mortality and stroke, the probability of suffering from a stroke or dying within one month is substantially lower if the previous patient exhibited the same adverse event, pointing towards learning from failure effects.
For renal failure and pacemaker, point estimates have the opposite sign and are statistically insignificant. Throughout the next sections, we will thoroughly scrutinize these preliminary descriptive findings. Since only mortality and stroke exhibit interesting patterns and because only for mortality there are subsequent failures, we will focus our attention entirely on mortality.
---Insert Table III here ---
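The descriptive comparison behind the "Lag=0" and "Lag=1" columns can be sketched in a few lines. The sketch below is a minimal illustration, not the procedure used to produce Table III: the function names, the record layout (hospital identifier and binary outcome, in treatment order) and the toy records are assumptions, and the test shown is a standard pooled two-proportion z-test.

```python
import math

def lag_table(records):
    """records: list of (hospital_id, outcome) in treatment order, outcome 0/1.
    Returns {lag: (events, n)} where lag is the previous patient's outcome
    in the same hospital (hypothetical layout, for illustration only)."""
    counts = {0: [0, 0], 1: [0, 0]}
    previous = {}  # last outcome seen per hospital
    for hospital, outcome in records:
        if hospital in previous:  # the first patient of a hospital has no lag
            lag = previous[hospital]
            counts[lag][0] += outcome
            counts[lag][1] += 1
        previous[hospital] = outcome
    return {lag: tuple(c) for lag, c in counts.items()}

def two_proportion_z(events0, n0, events1, n1):
    """Pooled two-sided z-test for equality of two proportions."""
    p0, p1 = events0 / n0, events1 / n1
    pooled = (events0 + events1) / (n0 + n1)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n0 + 1 / n1))
    z = (p1 - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value

# Toy records, NOT taken from the registry.
toy = [("A", 0), ("A", 1), ("A", 0), ("A", 0),
       ("B", 1), ("B", 0), ("B", 0), ("B", 1)]
print(lag_table(toy))
```

A negative z-statistic from `two_proportion_z` would correspond to a lower event probability after a previous event, which is the pattern described for mortality and stroke.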

Empirical specification and learning curves
The literature on learning curves in health distinguishes between three types of learning (see e.g. Hockenberry and Helmchen (2014); Van Gestel et al. (2016)): cumulative experience (CE), economies of scale (EOS) and human capital depreciation (HCD). Learning from cumulative experience refers to the idea that treating an additional patient generally improves physician (or team) performance. When referring to economies of scale, we capture the fact that higher volume providers usually have better infrastructure (e.g. equipment, staff) and more standardized procedures. Lastly, the human capital depreciation hypothesis states that provider performance decreases with longer temporal distance to previous procedures. In its simplest form, this leads to the specification in equation (1), which can be estimated empirically using standard regression techniques.
In the empirical strategy, our data are considered a pseudo-panel dataset in which the individual patients are consecutive observations on provider and hospital performance. In equation (1), $i$, $h$ and $t$ denote individual $i$ in hospital $h$ in year $t$. The outcome variables in our analysis are binary mortality indicators. Furthermore, we control for a vector of background characteristics and comorbidities contained in $X_{i,h,t}$ and for hospital fixed effects $\alpha_h$. Lastly, $\varepsilon_{i,h,t}$ is a typical error term which, in the case of LPMs, is non-normally distributed. Consequently, we use heteroscedasticity-robust standard errors in all our model specifications.
In equation (2), we extend (1) by adding a lag of the outcome variable as an additional independent variable. Note that the outcome, say mortality, of patient $i$ in hospital $h$ in period $t$ is regressed on the mortality indicator of the previous patient $i-1$, who was treated just before patient $i$ in the same hospital $h$ in period $t$. The coefficient on this lagged outcome therefore captures the learning from failure effect. Furthermore, we include the learning variables from equation (1) to overcome the potential omitted variable bias resulting from a likely correlation between the lagged outcome and the learning indicators. In fact, without the inclusion of the typical learning effects, this coefficient would likely be an upper bound on the true effect.
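To make the notation concrete, specifications (1) and (2) can be written along the following lines. This is a sketch under assumed notation only: the coefficient symbols ($\beta_1$ to $\beta_3$, $\gamma_1$, $\delta$), the variable names CE, EOS and HCD, and the exact indexing of the learning variables are illustrative assumptions, and $y_{i-1,h,t}$ denotes the outcome of the patient treated just before patient $i$ in hospital $h$.

```latex
\begin{align}
y_{i,h,t} &= \beta_1 \mathit{CE}_{i,h,t} + \beta_2 \mathit{EOS}_{i,h,t} + \beta_3 \mathit{HCD}_{i,h,t}
            + X_{i,h,t}'\delta + \alpha_h + \varepsilon_{i,h,t} \tag{1} \\
y_{i,h,t} &= \gamma_1\, y_{i-1,h,t} + \beta_1 \mathit{CE}_{i,h,t} + \beta_2 \mathit{EOS}_{i,h,t} + \beta_3 \mathit{HCD}_{i,h,t}
            + X_{i,h,t}'\delta + \alpha_h + \varepsilon_{i,h,t} \tag{2}
\end{align}
```

In this notation, a negative $\gamma_1$ would indicate that a death of the previous patient is associated with a lower mortality probability for the current patient.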
Because of the dynamics in equation (2), there are several issues that may obstruct estimation and inference of the failure effect. First, the binary outcomes may require the estimation of a non-linear dynamic panel model with fixed effects. In this setting the fixed effects may induce a substantial incidental parameter problem and a Nickell bias. Second, by estimating the dependence between observations, the irregular spacing in the model may cause the within-estimator to be biased because the panel unit fixed effects are not fully accounted for. These estimation issues are further discussed throughout the following sections.

Dynamic panels and the failure effect
Since the dynamic effects for the adverse events are of primary interest in this paper, our main goal is to obtain consistent and efficient estimates of the potential failure effect. Specifically, we analyze whether the patient outcome of a team or physician procedure is correlated with previous (negative) experiences. In this section we describe potential improvements over the basic FE LPM shown in equation (2).

Incidental parameters
In the context of binary dependent variables and fixed effects, the incidental parameters problem frequently presents a hurdle to obtaining consistent and efficient estimates. Because of the incidental parameters problem, the fixed effects are inconsistently estimated and this inconsistency carries over to all other coefficients (Baltagi, 2008; Wooldridge, 2010). The inclusion of incidental parameters is even more problematic with lagged dependent variables because of the Nickell bias (Moon et al., 2015). To address these endogeneity issues, difference and system GMM-type estimators are commonly used in the applied literature. For example, Salge et al. (2016) apply dynamic instrumental variable panel methods in relation to quality of care. They find that infections decrease with better overall cleaning, training on infection control, hand hygiene and a favorable error-reporting environment. We refrain from applying GMM-type estimators as they regularly suffer from poor small sample properties due to weak instrument problems (see Pua, 2015; De Vos et al., 2012; Bruno, 2005). Instead, to address these econometric challenges, we run a series of alternative estimation techniques. As a starting point, we estimate the learning from failure effects using simple fixed effects linear probability models (henceforth FE LPMs) and fixed effects logit (henceforth FE logit) specifications. Although the FE LPM is widely used in the applied literature, Chernozhukov et al. (2013, p. 546) demonstrate that it provides inconsistent estimates of the average marginal effects in dynamic panel settings. Furthermore, nonlinear fixed effects models (e.g. FE logit/probit) have been shown to produce persistent (upward) bias in the estimated slope coefficients due to the incidental parameters problem. In fact, based on Monte Carlo evidence, the bias in the estimated coefficients increases the smaller the time horizon T in the data.
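The dependence of the bias on the time horizon can be illustrated with the well-known large-T approximation for the linear case. As a sketch under assumed notation (a linear AR(1) panel with unit effects, autoregressive parameter $\rho$ and $T$ usable time periods, which is simpler than the binary-outcome setting considered here), the within estimator $\hat{\rho}$ satisfies, as the number of units grows:

```latex
\begin{equation*}
\operatorname*{plim}_{N \to \infty} \left( \hat{\rho} - \rho \right) \;\approx\; -\,\frac{1+\rho}{T-1}
\qquad \text{for reasonably large } T .
\end{equation*}
```

The bias is negative for $\rho > 0$, does not vanish as $N$ grows, and shrinks only at rate $1/T$, which is why short panels are particularly affected.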
To address these econometric challenges, we apply in a first step the bias-corrected FE estimator proposed by De Vos et al. (2015). In a second step, we apply the split-panel jackknife for non-linear fixed effects models recently suggested by Dhaene and Jochmans (2015), which addresses the incidental parameters problem while at the same time accounting for the dynamics. The method is specifically designed for moderately large panels. We apply the jackknife-based approach because we do not include the typical time fixed effects as in Fernández-Val and Weidner (2016), where the analytical bias correction is preferred. Whereas most solutions to the above-mentioned difficulties are computationally complex and require analytical solutions, the jackknife is relatively straightforward to implement and performs similarly to or even better than other approaches. The drawback, however, is the difficulty of including time trends in non-linear fixed effects models, which makes it hard or even impossible to correct for the overall learning curve. With different dynamics in the subpanels, the estimates of all coefficients may differ. Consequently, we provide a range of estimates.

The intuition underlying the split-panel jackknife is to divide each panel into smaller subpanels of consecutive observations (footnote 7). Because the bias depends on the length T of the panel, using different panel lengths by generating subpanels and comparing the coefficient estimates of the subpanels with the estimate for the complete panel provides an estimate of the bias. Subtracting the estimated bias from the complete panel estimate generates the split-panel jackknife estimator (Dhaene and Jochmans, 2015, p. 998). One simple choice of subpanel lengths, also suggested in Dhaene and Jochmans (2015), is the half-panel jackknife, in which each panel is simply divided in two. In a dynamic setting (p. 1007), the jackknife may perform sub-optimally when the dynamics are very different in the half-panels. Subpanel estimates can be compared to test for sensitivity to differential dynamics.
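The half-panel idea can be sketched in code. The example below is an illustration under simplifying assumptions and not the estimator used in this paper: it applies the half-panel jackknife (twice the full-panel estimate minus the average of the two half-panel estimates) to a simulated linear AR(1) panel with a within estimator, rather than to a FE probit, and all function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_ar1_panel(n_units, n_pairs, rho, rng, burn_in=50):
    """Simulate y_it = alpha_i + rho * y_i,t-1 + eps_it after a burn-in.
    Returns an array of shape (n_units, n_pairs + 1)."""
    alpha = rng.normal(size=n_units)
    y = alpha / (1.0 - rho)                      # start at the unit-specific mean
    columns = []
    for t in range(burn_in + n_pairs + 1):
        y = alpha + rho * y + rng.normal(size=n_units)
        if t >= burn_in:
            columns.append(y)
    return np.column_stack(columns)

def within_ar1(y):
    """Fixed effects (within) estimate of rho from a (units x periods) array."""
    lagged = y[:, :-1] - y[:, :-1].mean(axis=1, keepdims=True)
    current = y[:, 1:] - y[:, 1:].mean(axis=1, keepdims=True)
    return float((lagged * current).sum() / (lagged ** 2).sum())

def half_panel_jackknife(estimator, y):
    """Half-panel jackknife: 2 * full-panel estimate minus the mean of the
    two half-panel estimates. Assumes an even number of lagged/current pairs."""
    half = (y.shape[1] - 1) // 2
    full = estimator(y)
    first = estimator(y[:, : half + 1])          # first half of the periods
    second = estimator(y[:, half:])              # second half of the periods
    return 2.0 * full - 0.5 * (first + second)

rng = np.random.default_rng(0)
panel = simulate_ar1_panel(n_units=2000, n_pairs=10, rho=0.5, rng=rng)
print("within estimate:   ", round(within_ar1(panel), 3))
print("jackknife estimate:", round(half_panel_jackknife(within_ar1, panel), 3))
```

Because the leading bias term is proportional to 1/T, the half-panel estimates carry roughly twice the bias of the full-panel estimate, so the combination removes the first-order term. Comparing the two half-panel estimates with each other gives a simple check for differing dynamics across subpanels.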

Irregularly spaced panel data
In our setting, patients are considered as consecutive observations on provider performance. As such, because periods between patients may vary substantially (across hospitals and patients), our panel can be considered as irregularly spaced. With irregularly spaced panel data, the fixed effects approach may not fully account for the fixed effects and, as a consequence, the within estimator may be biased (McKenzie, 2001; Tamm et al., 2007). One approach, which is discussed in more detail in the robustness section, is to model the data as an AR(1) process to better illustrate the dependence between patients. Although intuitively attractive, technical and practical considerations hinder the applicability of this method in our setting.
Footnote 7: We provide a brief and intuitive summary of the split-panel jackknife. For more detailed technical information, please see Dhaene and Jochmans (2015).

Baseline estimates
As a starting point, we estimate simple fixed effects linear probability (FE LPM) and fixed effects logit (FE logit) specifications before presenting the bias-corrected FE and the split-panel jackknife estimates. Table IV below presents the FE LPM and FE logit average discrete probability effects (ADPE) of our learning from failure effects for the outcomes of one- and 24-month mortality after the TAVI procedure. Both specifications include patient- and procedure-specific characteristics (see descriptive statistics in Table I), hospital fixed effects and the different learning effects described above. Note that, especially for 24-month mortality, the lagged outcome may not yet be available to the physician at the start of the next procedure. In this case, we make the implicit assumption that there is a latent variable indicating treatment success according to the physician that closely matches the one- or 24-month mortality.
Overall, we find highly significant negative coefficients on our lagged outcome variables for both mortality outcomes. In fact, our FE LPM estimates indicate that the predicted likelihood of dying in the first month after the TAVI procedure is associated with a decrease of about 7.8 percentage points if the previous patient passed away. Likewise, the probability of dying within two years after the procedure is predicted to decrease by about 7.4 percentage points, pointing again towards strong learning from failure effects. The ADPE estimates in the FE logit specifications are of similar magnitude and thus lead to the same conclusions. These preliminary findings suggest that there is substantial room for self-correction and personal improvement after failure.
Furthermore, in line with the findings in Van Gestel et al. (2016), our estimates provide evidence for a significant positive learning from cumulative experience effect, as treating an additional patient is associated with a decrease in 2-year mortality of about 0.2 percentage points. Although seemingly a small effect, this quickly becomes large for sizeable patient samples.

Bias-corrected FE LPM and split-panel jackknife FE probit estimates
As discussed in the methodology section, the dynamics in the model specifications above cause the errors to be correlated with the lagged dependent variables, inducing a Nickell bias on all coefficient estimates shown in Table IV. The preferable strategy therefore is to consider the dynamics and simultaneously address the incidental parameter problem. To this end, we apply the bias-corrected FE estimator of De Vos et al. (2015). Table V below shows the bias-corrected FE estimates of the learning from failure effects for both mortality outcomes. The specification is based on 250 bootstrap samples and we use the burn-in initialization scheme to set the initial values of the lagged dependent variables. Moreover, we allow for general heteroscedasticity of the error term using the wild bootstrap suggested by Liu (1988) and Mammen (1993) in the algorithm.
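The resampling step of a wild bootstrap can be sketched as follows. This is not the bias-correction algorithm of De Vos et al. (2015), which involves considerably more than what is shown here. It only illustrates, for a plain OLS regression and with hypothetical function names and toy data, how residuals are multiplied by mean-zero, unit-variance two-point weights of the kind proposed by Mammen (1993) so that heteroscedasticity is preserved in each bootstrap sample.

```python
import numpy as np

def mammen_weights(size, rng):
    """Two-point weights with mean 0 and variance 1 (Mammen, 1993)."""
    root5 = np.sqrt(5.0)
    low, high = -(root5 - 1) / 2, (root5 + 1) / 2
    p_low = (root5 + 1) / (2 * root5)
    return np.where(rng.random(size) < p_low, low, high)

def wild_bootstrap_se(X, y, n_boot, rng):
    """Wild bootstrap standard errors for OLS coefficients (illustrative)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ beta
    resid = y - fitted
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # Keep regressors fixed; perturb each residual with its own weight.
        y_star = fitted + resid * mammen_weights(len(y), rng)
        draws[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]
    return beta, draws.std(axis=0, ddof=1)

# Toy heteroscedastic data, NOT from the registry.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(size=500) * (1.0 + np.abs(x))
beta, se = wild_bootstrap_se(X, y, n_boot=250, rng=rng)
print(beta.round(3), se.round(3))
```

Because each residual stays attached to its own observation, the bootstrap samples inherit the observation-specific error variance, which is the property that motivates the wild bootstrap under general heteroscedasticity.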
Overall, the coefficients in Table V below are qualitatively and quantitatively similar to the results in Table IV: the likelihood that the current patient dies within one or 24 months after the TAVI procedure is associated with a decrease if the previous patient died. Although the estimate is no longer significant for one-month mortality, the estimates have the same sign and order of magnitude as the previous results.
In addition, we again find evidence for significant learning from cumulative experience effects for 2-year mortality and human capital depreciation. Note that in contrast to the FE LPM and logit estimates above, the bias-corrected FE estimates tend to be smaller in absolute value for the mortality indicators, thus pointing toward upward bias in the LPM estimates above (footnote 9).
---Insert Table V here ---
In a next step, we estimate the learning from failure effects using the split-panel jackknife FE probit estimator recently proposed by Dhaene and Jochmans (2015). Table VI below shows the estimated learning from failure effects for 1-month and 2-year mortality while controlling for the usual patient- and procedure-specific characteristics and also including hospital fixed effects. Overall, we again find significant evidence for the presence of a learning from failure effect. Specifically, our split-panel jackknife estimates suggest that the likelihood of survival is associated with an increase of about 11 percentage points (resp. 7 percentage points) if the previous patient passed away within one month (24 months) after the TAVI procedure.
In conclusion, although point estimates vary between methods, our different estimation approaches all hint that physician (or team) performance tends to respond positively to previous failures. We observe a significant decrease in the likelihood of both short- and long-term mortality for the next patient.
---Insert Table VI here ---
Footnote 9: To assess the relative performance of the bias-corrected FE estimator against the FE LPM, we ran Monte Carlo simulations showing that the bias-corrected FE estimator clearly outperforms the FE LPM in terms of bias. The results may, however, be biased with serially correlated errors (see Appendix A for details).

Nature and interpretation of the failure effect
The significant lagged effect may be interpreted in several ways. Firstly, we might expect that more can be learned from a failure than from a success. If this holds true, we would expect it to translate into a (slope) shift of the "typical" learning curve in Figure 1.
---Insert Figure 1 here ---
Secondly, we are interested in the persistence of the shock effect, i.e., whether longer lags of the dependent variable are still significant. However, in none of our analyses is the second lag statistically significant, which suggests that the shock effect is short-lived. Additionally, it would be of interest to investigate whether the second lag effect differs according to the value of the first lag. However, because there are almost no subsequent failures, the interaction would hold little information. In fact, in our data there are only three cases in which the two previous patients died.

Transmission of Risks
The LPM's in Table VII below (columns one and three) show evidence of a transmission of shocks between adverse events. Firstly, we find that when a patient suffers a stroke during a hospitalization, (s)he is also more likely to die. As such having a stroke is strongly correlated with a procedural failure. Secondly, given that the previous patient had a stroke, the probability of mortality is lower for the next patient. The same intuition holds for the effect of a previous mortality on a subsequent stroke. However, when correcting for incidental parameters with the bias correction of De Vos et al. (2015), the results become insignificant and thus no longer indicate spill-over effects from one adverse event to another.

ROBUSTNESS
To further test for the robustness of our results, we provide several additional tests and specifications throughout this section. Firstly, we test for the robustness of non-stationarity of regressors in the split-panel jackknife estimation. Secondly, the learning-related regressors are likely to have a non-linear relationship with patient health outcomes. We show that adding learning variables in different specifications does not qualitatively alter our results.

Irregular spacing
To account for the irregular spacing in our sample and similarly to McKenzie (2001), the dependence between observations in our pseudo-panel data set could be modelled in a linear probability model set-up as an AR(1) process with hospital effects $\alpha_h \sim \text{i.i.d.}(0, \sigma_\alpha^2)$ and errors $\varepsilon_{h,t} \sim \text{i.i.d.}(0, \sigma_\varepsilon^2)$. Writing our dynamic panel estimation in terms of observed time then results in equation (4). McKenzie (2001) shows that within this setup, the fixed effects may not be fully accounted for if the irregular spacing is not taken into account. Because of the nonlinear parameter restrictions in equation (4), estimation requires nonlinear least squares. Including the intercepts, the estimation equation becomes equation (5). The obvious drawback of this approach is that the effect is, by assumption, highly (multiplicatively) dependent on time. The time differences in the powers are the days between procedures in our application. In sum, although the specification in equation (5) is potentially preferred in theory because of the irregular spacing of patients, this best available alternative method also suffers from substantial shortcomings.
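The role of the time gap can be seen from a short derivation. As a sketch under assumed notation (the symbols $\rho$ and $s$ and the regularly spaced benchmark process are illustrative and are not taken from the estimation equations), iterate a regularly spaced AR(1) process with hospital effect $\alpha_h$ over a gap of $s$ periods:

```latex
\begin{align*}
y_{h,t} &= \alpha_h + \rho\, y_{h,t-1} + \varepsilon_{h,t} \\
        &= \rho^{s}\, y_{h,t-s} + \alpha_h \sum_{k=0}^{s-1} \rho^{k} + \sum_{k=0}^{s-1} \rho^{k}\, \varepsilon_{h,t-k} \\
        &= \rho^{s}\, y_{h,t-s} + \alpha_h\, \frac{1-\rho^{s}}{1-\rho} + \sum_{k=0}^{s-1} \rho^{k}\, \varepsilon_{h,t-k}.
\end{align*}
```

Both the coefficient on the lagged outcome and the multiplier of the hospital effect depend on the gap $s$. When $s$ varies across observations, a within transformation that assumes a constant hospital intercept no longer removes $\alpha_h$, and the power $\rho^{s}$ is what makes the parameter restriction nonlinear.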

Quadratic learning curves
As an additional robustness check, we allow for non-linearities in the relationship between the typical learning variables (cumulative experience, economies of scale and human capital depreciation) and patient mortality or having a stroke. The resulting FE LPM estimates of learning from failure effects can be found in Table VIII below. Overall, the estimated effects are in line with our previous findings: previous failures are negatively associated with both short- and long-term mortality of the next patient and with complications during the procedure. Non-linearities play only a minor role in explaining these patient outcomes, as all the coefficients on the squared learning variables are close to zero and therefore not economically significant.

CONCLUSION
Identifying different channels through which physicians affect patients' health may help policy makers to efficiently allocate resources to policy interventions. In this paper, we shed light on the question of how procedural failures of physicians (or teams) affect subsequent patient outcomes. We show that this "learning from failure effect" is an important source of physician learning besides the commonly identified factors such as economies of scale, learning from cumulative experience and human capital depreciation. To identify such learning from failure effects, we apply the recently developed bias-corrected fixed effects estimator by De Vos et al. (2015) and the split-panel jackknife estimator proposed by Dhaene and Jochmans (2015) to address the econometric challenges inherent to non-linear dynamic panel data settings.
Our findings for TAVI heart valve replacements provide evidence for a significant and sizeable negative effect of a previous failure on subsequent patient mortality. We find that a previous death is significantly associated with a decrease in the probability of a subsequent patient death of between 6 and 11 percentage points. However, our results suggest that these effects are only short-lived, and they do not

Our simulation results are based on the following data generating process:

(1) $y^*_{it} = \alpha_i + 0.5\, y^*_{i,t-1} + \varepsilon_{it}$
(2) $\varepsilon_{it} = \lambda_i f_t + u_{it}$

where $y^*_{it}$ is a latent continuous dependent variable, $\alpha_i \sim N(0, \sigma_\alpha^2)$ is an unobserved individual effect and $\varepsilon_{it}$ is a cross-sectionally dependent error term with $\lambda_i \sim U(1,4)$, $f_t \sim N(0,1)$ and $u_{it} \sim N(0,1)$. First, we explicitly allow for cross-sectional dependence in the errors. In our application this implies that across hospitals, similar patients (corrected for observed covariates) are treated at certain experience levels (footnote 11). Initial values for the dependent variable are randomly drawn from a standard normal. We subsequently dichotomize the latent dependent variable based on the following threshold mechanism: $y_{it} = \mathbb{1}(y^*_{it} \geq 0)$. To mimic the irregularly spaced TAVI data set, we generate different panel lengths ranging from T = 3 to 5 for the different observations. Moreover, we use N = 800 observations in each simulation sample to again closely resemble the data dimension in our actual data set used in the paper. Figure A1 below shows the result from our Monte Carlo experiment. Our results suggest that FE LPM estimates suffer from substantial finite sample biases as they are nowhere close to the true ADPE. On the other hand, the BCFE estimator is substantially less biased and clearly outperforms the FE LPM (footnote 12).
Footnote 11: Remember that our panel time dimension is based on experience.
Second, as an additional exercise, we check the relative performance of the two estimators in the case of autocorrelated errors. In our application, this means that error terms for different patients are correlated over time within a hospital, conditional on patient characteristics and hospital fixed effects. This could happen when providers select on unobserved characteristics in a time-dependent manner. Error terms were generated based on the following AR(1) process:

(3) $\varepsilon_{it} = \lambda_i f_t + \rho\, \varepsilon_{i,t-1} + u_{it}$

Initial values for $\varepsilon_{it}$ were drawn from an iid standard normal and we simulate based on two parametrizations for $\rho$: in the first setup, we implement a negative autocorrelation in the error of $\rho = -0.6$ and in the second, a positive one with $\rho = +0.6$.
The corresponding Monte Carlo experiment shows that the introduction of serial correlation in general induces quite substantial distortions in the estimated effects. This suggests that our results are not fully robust to autocorrelated errors, which may exist when providers select patients on unobserved characteristics and when this selection is time-dependent. Although certainly an interesting finding, the corresponding econometric issue would have to be carefully analyzed theoretically and empirically in a stand-alone future research project (footnote 13).
Figure A1: Monte Carlo simulation results - Cross-sectional dependence
Footnote 12: We have also run a similar Monte Carlo experiment with iid errors and the results are even more in favor of the BCFE estimator than in the above case.
Footnote 13: The corresponding simulation results are available upon request.
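A data generating process of this kind can be sketched as follows. The sketch rests on several assumptions that should be read as illustrative rather than as the exact simulation design: a latent AR(1) recursion with coefficient 0.5, a standard normal unit effect, a common-factor error with uniform loadings on [1, 4], a standard normal initial latent value, and dichotomization at zero. A single panel length is used for simplicity, and the function name is hypothetical.

```python
import numpy as np

def simulate_binary_panel(n_units, n_periods, rho, rng):
    """Latent AR(1) with unit effects and a common-factor error, dichotomized
    at zero. All distributional choices are assumptions for illustration."""
    alpha = rng.normal(size=n_units)                 # unobserved unit effect
    loading = rng.uniform(1.0, 4.0, size=n_units)    # unit-specific factor loading
    latent = rng.normal(size=n_units)                # initial latent value
    outcomes = np.empty((n_units, n_periods), dtype=int)
    for t in range(n_periods):
        factor = rng.normal()                        # common shock shared by all units
        error = loading * factor + rng.normal(size=n_units)
        latent = alpha + rho * latent + error        # latent recursion (assumed)
        outcomes[:, t] = latent >= 0                 # observed binary outcome
    return outcomes

rng = np.random.default_rng(0)
sample = simulate_binary_panel(n_units=800, n_periods=5, rho=0.5, rng=rng)
print(sample.shape, sample.mean().round(3))
```

The common factor makes errors in the same period correlated across units, which is the cross-sectional dependence the experiment is meant to capture. Estimators such as the FE LPM could then be applied to repeated draws of `sample` to compare their average estimates.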

TABLES AND FIGURES
The table shows the summary statistics for the observed outcomes, patient and procedure characteristics, as well as the total number of TAVI procedures per hospital over the years 2007-2012.
In columns Lag=0, the probability of the adverse event is provided for individuals for whom the previous patient did not encounter the adverse event. In the Lag=1 column, the previous patient did encounter the adverse event. The p-value refers to the test for equality of proportions between the Lag=0 and Lag=1 columns.
Learning from failure effects are estimated using the bias-corrected FE LPM proposed by De Vos et al. (2015) for the outcomes of one- and 24-month mortality. Bootstrapped standard errors in parentheses: *** p<0.01, ** p<0.05, * p<0.1.
The table shows the estimates of the learning from failure effects using the split-panel jackknife FE probit estimator proposed by Dhaene and Jochmans (2015). Standard errors in parentheses: *** p<0.01, ** p<0.05, * p<0.1. For 1-month mortality, the estimates do not converge when HCD is included. Overall, the coefficient of the lagged outcome is robust to leaving out the learning variables (i.e., leaving out all learning variables for 1-month mortality and leaving out HCD for 2-year mortality). Restricting the sample size to 730 instead of 761 for 24-month mortality does not change the results dramatically (the point estimate is -0.062 and statistically significant at the 10% level).