TECHNOLOGY SHOCKS AND HOURS WORKED: A CROSS-COUNTRY ANALYSIS

We reassess the evidence for (or against) a key implication of the basic real business cycle model: that aggregate hours worked increase in response to a positive technology shock. Two novel aspects are the scope (14 OECD countries) and the inclusion of data on both labor supply margins to analyze the key margin of adjustment in aggregate hours. The short-run response of aggregate hours to a positive technology shock is remarkably similar across countries, with an impact fall in 13 out of 14 countries. In contrast, its decomposition into intensive and extensive labor supply margins reveals substantial heterogeneity in labor market dynamics across OECD countries. For instance, movements in the intensive margin are the dominant channel of adjustment in aggregate hours in 6 out of 14 countries of our sample, including France and Japan.


INTRODUCTION
Fueled by the seminal contributions of Kydland and Prescott (1982) and Long and Plosser (1983), much attention has been devoted to estimating the impact of technology shocks on aggregate hours worked. A focal point of the empirical literature has been to test the key predictions of the basic real business cycle (RBC) model, namely that aggregate hours worked per capita rise after a permanent technology shock and that the correlation of the technology-induced components of aggregate hours and labor productivity is positive. Although the empirical evidence is not clear-cut, there is today a firm presumption that, at least for the USA, these predictions are rejected by the data. 1 For countries other than the USA, the empirical evidence stands on a more insecure footing. Data availability is a key difficulty: Until recently, consistent quarterly data on hours worked per employee (the intensive margin of labor supply) have not been available for many countries other than the USA.
Acknowledgments: We wish to thank Luca Benati and Fabrice Collard for guidance, and Andrea Raffo and Lee Ohanian for making the data available to us. We also thank the editor of the journal, two anonymous referees, Mario Forni, Jordi Galí, Christian Myohl, and participants of the Macro Workshop at the University of Bern and of the 2017 Spring Meeting of Young Economists for valuable comments. The views, opinions, findings, and conclusions or recommendations expressed in this paper are strictly those of the authors. They do not necessarily reflect the views of the State Secretariat for Economic Affairs (SECO). SECO does not take responsibility for any errors or omissions in, or for the correctness of, the information contained in this paper.
For lack of an alternative, the international empirical evidence on the role of technology for labor supply relies on employment (the extensive margin of labor supply) as a proxy for aggregate hours, including for instance the evidence on G7 and Eurozone countries provided by Galí (1999, 2004) or Dupaigne and Fève (2009). Addressing this lack of data, Ohanian and Raffo (2012) have published a homogeneous quarterly data set for 14 Organisation for Economic Co-operation and Development (OECD) countries on aggregate hours worked and its two components, hours per employee and employment. Their data show that in many countries (and in contrast to the USA), much of the cyclical variation in aggregate hours takes place in average hours per employee. This finding suggests that employment is potentially a poor proxy for aggregate hours and puts much of the existing international evidence on the role of technology shocks for aggregate hours into question.
In this paper, we use the homogeneous data by Ohanian and Raffo (2012) (and a subsequent update provided by the authors) and analyze how the response of aggregate hours to permanent technology shocks compares across industrialized countries. We estimate a separate structural vector autoregressive (VAR) model for each country. Importantly, we address the potential issues concerning data pretreatment and identification in a unified, comprehensive, and transparent manner. To shed further light on the dynamics of aggregate hours, we disentangle the aggregate hours response to technology shocks into adjustments along the intensive and extensive margins of labor supply. This novel exercise is particularly interesting in our cross-country set-up, as the countries we study feature a wide range of labor market institutions (LMIs) and legislation, which translates into different incentives to adjust labor along both margins.
In our baseline specification, aggregate hours enter in levels and are quadratically detrended to remove low- to medium-frequency movements. Labor productivity enters the system in first differences. Besides aggregate hours and labor productivity, the VAR further includes information on consumption-to-output and investment-to-output ratios, interest rates, and inflation. For the decomposition of the aggregate hours response into intensive and extensive margins of labor supply, we append information on employment to the system and use generalized least squares to estimate the VAR with linear constraints, following Lütkepohl (2005). Technology shocks are identified as the shocks which explain the maximum fraction of forecast error variance (maxFEV) of labor productivity at the 10-year horizon, a statistical approach associated with Uhlig (2003, 2004). Overall, our results confirm across a broad set of countries the findings of Galí (1999). On impact, we find a fall in aggregate hours worked in response to a positive technology shock in 13 out of 14 countries of our sample (Japan is the single exception). The technology-driven components of aggregate hours and labor productivity are negatively correlated in 11 countries. There is more variation in our findings when we decompose our results into intensive and extensive margins of labor supply. Employment is the dominant channel of adjustment in 8 out of 14 countries, including the Anglo-Saxon economies. The opposite holds for Austria, Finland, France, Japan, Korea, and Norway: In these economies, most of the aggregate hours responses to technology shocks are through movements in average hours per employee. More generally, our decomposition shows that the intensive margin is important for the short-run adjustment of aggregate hours. As the response of employment is typically slow and builds up over time, employment is overall an inadequate proxy for the short-run response of aggregate hours.
This finding also adds to recent evidence provided by Abbritti, Weber, et al. (2018) and Hantzsche, Savsek, and Weber (2018), who attribute an important role to LMIs for shaping business cycle fluctuations.
We also investigate the robustness of our main findings. In a first step, we disentangle neutral (N) and investment-specific (I) technological change. Following Fisher (2006), the simplifying assumption that technology shocks affect the production of all goods homogeneously may be problematic if N- and I-shocks have permanent effects on labor productivity, but different effects on aggregate hours. In addition, we also investigate if the relative importance of the two labor supply margins for the aggregate hours response varies depending on the type of technology shock. 2 Rather than simply taking ratios of consumption and investment deflators (as done for instance by Watanabe (2012) or Beaudry, Moura, and Portier (2015)), we construct for a subsample of five countries measures for the relative price of investment (RPI) of sufficiently high quality similar to the existing ones for the USA. The time series are based on specific national accounts statistics defined by the ratios of chain-weighted deflators for investment and consumption. With respect to the decomposition into neutral and investment-specific technical change, we find that the inclusion of investment-specific technical change has little effect on the qualitative features uncovered under the one-technology assumption. We also consider alternative detrending methods for aggregate hours. We show that accounting for low- to medium-frequency movements in hours is crucial to avoid substantial estimation bias when hours enter the VAR in levels or first differences, reemphasizing the same point by Fernald (2007), Canova, Lopez-Salido, and Michelacci (2010), or Chaudourne, Fève, and Guay (2014) across a broader set of countries.
The remainder is organized as follows. Section 2 describes the data and the empirical methodology applied. Section 3 reports our evidence on the role and importance of permanent technology shocks for the 14 OECD countries of our sample. We disentangle neutral and investment-specific technical change for a subsample of five countries in Section 4. Further robustness exercises are considered in Section 5. Section 6 concludes.

DATA AND METHODOLOGY
This section outlines our empirical framework. We first introduce the reduced-form VAR. Then, Section 2.2 introduces the data and motivates the data pretreatment. Section 2.3 outlines how the aggregate hours results are decomposed into movements in the intensive and extensive labor supply margins. Section 2.4 describes identification.

VAR Specification
For each country, we estimate the following VAR of order 4 on quarterly data:
$$Y_t = B_0 + B_1 Y_{t-1} + B_2 Y_{t-2} + B_3 Y_{t-3} + B_4 Y_{t-4} + u_t, \qquad (1)$$
where $Y_t$ is a vector of $n$ series observed at time $t$, $B_0$ is an $n \times 1$ vector of intercepts, the remaining $B$'s are $n \times n$ coefficient matrices, and $u_t$ is an $n \times 1$ vector of zero-mean innovations with covariance matrix $E[u_t u_t'] = \Sigma$. Our vector $Y_t$ consists of seven series: labor productivity ($lp_t$), aggregate hours expressed in per capita terms ($H_t$), the consumption-to-output ratio ($cy_t$), the investment-to-output ratio ($iy_t$), 3-month interest rates ($i_t$), inflation ($\pi_t$), and employment per capita ($n_t$). The first two series, labor productivity and aggregate hours, correspond to the minimum system necessary to identify the effect of technology shocks on aggregate hours. Four variables are added to this system to mitigate omitted variable bias. In particular, we include information on nominal interest rates and inflation to help control for the monetary policy setting, which in sticky price models matters for the transmission of technology shocks (see for instance Galí, López-Salido, and Vallés (2003)). We also include real consumption- and investment-to-output ratios. In general equilibrium models, these ratios are typically jointly determined with labor productivity and hours. 3 Finally, information on employment is added in the last position of the VAR to allow identifying the key margin of adjustment in aggregate hours. Importantly, this last variable is only included to decompose the aggregate hours response into its two margins and is treated in a special way. Section 2.3 provides further details.
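As an illustration only (not the authors' code), a reduced-form VAR with an intercept can be estimated equation by equation with ordinary least squares. The sketch below assumes the data are held in a NumPy array with one row per quarter and one column per series; the function name `estimate_var` and the degrees-of-freedom correction for the covariance matrix are choices made for this example.

```python
import numpy as np

def estimate_var(Y, p=4):
    """OLS estimate of Y_t = B_0 + B_1 Y_{t-1} + ... + B_p Y_{t-p} + u_t.

    Y is a (T, n) array. Returns B with shape (n, 1 + n*p), laid out as
    [B_0, B_1, ..., B_p], and the residual covariance matrix Sigma (n, n).
    """
    T, n = Y.shape
    # Regressors: a constant followed by lags 1..p of every series.
    X = np.column_stack([np.ones(T - p)] + [Y[p - j:T - j] for j in range(1, p + 1)])
    Y_dep = Y[p:]
    coef, *_ = np.linalg.lstsq(X, Y_dep, rcond=None)   # shape (1 + n*p, n)
    resid = Y_dep - X @ coef
    # Degrees-of-freedom adjusted covariance (an assumption of this sketch).
    Sigma = resid.T @ resid / (T - p - X.shape[1])
    return coef.T, Sigma
```

With seven series and four lags, each equation has 29 regressors, which is worth keeping in mind given samples of at most 188 quarterly observations.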

Data
The baseline data set covers quarterly series for 14 OECD countries. The sample covered is country-dependent, typically ranging from 1970Q1 to 2016Q4 (188 observations). There are five exceptions: Ireland (sample starts 1973Q1), Italy (1971Q1), Korea (1976Q4), Sweden (1980Q1), and the UK (1972Q1). 4 Data on aggregate hours and its two components (average hours per employee and employment) are obtained from Ohanian and Raffo (2012) and a subsequent update provided by the authors. Their measures combine information from various sources, including national statistical offices, establishment surveys, and household surveys. To ensure consistency, the series are adjusted for differences across countries such as paid vacation or sick days. Basic descriptive statistics for these series are summarized in Appendix Section A.2. Data on population aged 16-64, consumption, investment, and output are drawn from the OECD Quarterly National Accounts. Consumption-to-output and investment-to-output ratios correspond to the shares of real private consumption and real gross fixed capital formation (GFCF) of real GDP. We construct a measure of labor productivity based on information on aggregate hours, real output, and population aged 16-64, namely (in logs) real output per capita minus aggregate hours per capita. We use the OECD Main Economic Indicators' measure of 3-month nominal rates. For Germany, Ireland, and Korea, this series only starts in the early 1990s and 3-month money market rates are used to extend the sample. Finally, we obtain inflation as $\pi_t = \Delta \log(P_t)$, where $P_t$ corresponds to quarterly seasonally adjusted GDP deflators obtained from national statistical offices. 5
Data pretreatment is a key issue in any structural VAR analysis. We are particularly concerned with low- to medium-frequency movements in aggregate hours. Figure 1 displays the development in the raw series of aggregate hours (solid black lines) together with the estimated quadratic trend (solid red line) for all countries in the sample.
As also documented by Ohanian, Raffo, and Rogerson (2008), there are large differences in trend changes across OECD countries. For instance, Austria, France, Germany, and Japan show substantial declines in hours, while for Canada the trend is clearly upward sloping. For Korea and the USA, the trend line shows an inverse U-shaped pattern. The cross-country differences in the trend of hours worked are at least partly due to institutional, policy, and regulatory factors. For instance, Ramey (2006, 2009) show that low-frequency movements in aggregate hours arise from demographic changes, sectoral shifts, or trends in labor markets such as increasing female participation rates. Ohanian, Raffo, and Rogerson (2008) identify taxes as an important driver of changes in hours worked both over time and across countries.
There is an open debate on whether trends in aggregate hours should be removed prior to estimating the SVAR. On the one hand, over-detrending or over-differencing may distort the sign of the impact response of aggregate hours to technology shocks (e.g., Christiano, Eichenbaum, and Vigfusson (2004)). On the other hand, if low-frequency movements in the data are not removed, they can pollute high-frequency results (such as IRFs), as pointed out by Blanchard and Quah (1989), Fernald (2007), Canova, Lopez-Salido, and Michelacci (2010), or Chaudourne, Fève, and Guay (2014). 6 In our baseline specification, we follow Fernald (2007) and Canova, Lopez-Salido, and Michelacci (2010) and quadratically detrend aggregate hours to allow for intercept heterogeneity. Given the importance of this choice, we discuss various robustness exercises in Section 5. For instance, the IRFs displayed in Figure 2 also depict results for aggregate hours in levels or first differences.
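Quadratic detrending amounts to regressing the series on a constant, a linear time index, and its square, and keeping the residual. A minimal sketch, assuming the log of aggregate hours per capita is available as a one-dimensional array (the function name is hypothetical):

```python
import numpy as np

def quadratic_detrend(x):
    """Remove a fitted quadratic time trend from a 1-D series x."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x), dtype=float)
    # Trend regressors: constant, t, t^2.
    X = np.column_stack([np.ones_like(t), t, t ** 2])
    beta, *_ = np.linalg.lstsq(X, x, rcond=None)
    return x - X @ beta
```

By construction the detrended series has zero mean and is orthogonal to both trend terms, so any remaining low-frequency movement is of higher order than quadratic.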

Decomposition
To decompose the aggregate hours response into movements in the intensive and extensive margins of labor supply, we append information on employment $\hat{n}_t$ to the system. 7 The same data pretreatment as for aggregate hours is applied. Since we do not want this addition to affect the estimation results, we impose the linear restriction that information on $\hat{n}_t$ is not allowed to have any contemporaneous or lagged impact on all other variables of the system. With $\hat{n}_t$ ordered last in the VAR, this restriction amounts to imposing zeros in positions $B_j(1:n-1,\, n)$ for all $B_j$, $j > 0$. The VAR model subject to linear constraints is estimated by estimated generalized least squares (EGLS) following Lütkepohl (2005). Further details can be found in Appendix Section A.1.1.
To sum up, our baseline specification is defined by $Y_t = [\Delta lp_t, \hat{H}_t, cy_t, iy_t, i_t, \pi_t, \hat{n}_t]'$, where $\Delta \equiv 1 - L$ is the first-difference operator and a hat denotes quadratically detrended variables. All variables except $i_t$ are measured in logs.
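The zero restrictions on the lags of employment can be sketched with a textbook feasible GLS estimator for a VAR with linear constraints of the form vec(B) = R gamma. The code below is an illustration under stated assumptions and is not taken from the paper's appendix: the function names, the use of unrestricted OLS residuals for the first-step covariance estimate, and the single (non-iterated) GLS step are all choices made for this example.

```python
import numpy as np

def build_regressors(Y, p):
    """Return Y_dep (K, T) and Z (1 + K*p, T) for a VAR(p) with intercept."""
    T_full = Y.shape[0]
    rows = [np.ones((1, T_full - p))]
    for j in range(1, p + 1):
        rows.append(Y[p - j:T_full - j].T)        # lag j of every variable
    return Y[p:].T, np.vstack(rows)

def employment_last_mask(K, p):
    """True where a coefficient is free. Lags of the last variable are
    excluded from the first K-1 equations; the last equation is unrestricted."""
    free = np.ones((K, 1 + K * p), dtype=bool)
    for j in range(p):
        free[:K - 1, 1 + j * K + (K - 1)] = False
    return free

def egls_restricted_var(Y, p, free):
    """Feasible GLS estimate of B = [B_0, B_1, ..., B_p] with zeros where free is False."""
    Y_dep, Z = build_regressors(Y, p)
    T = Y_dep.shape[1]
    # Step 1: unrestricted OLS residuals provide an estimate of Sigma.
    B_ols = np.linalg.solve(Z @ Z.T, Z @ Y_dep.T).T
    U = Y_dep - B_ols @ Z
    Sigma_inv = np.linalg.inv(U @ U.T / T)
    # Step 2: GLS on the free coefficients only; vec() stacks columns.
    R = np.eye(free.size)[:, free.flatten(order="F")]
    lhs = R.T @ np.kron(Z @ Z.T, Sigma_inv) @ R
    rhs = R.T @ (Sigma_inv @ Y_dep @ Z.T).flatten(order="F")
    gamma = np.linalg.solve(lhs, rhs)
    return (R @ gamma).reshape(free.shape, order="F")
```

Because log aggregate hours per capita is the sum of log employment per capita and log hours per employee, the response of hours per employee can then be backed out as the aggregate hours response minus the employment response.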

Identification
We are interested in a linear mapping between the innovations $u_t$ and structural shocks $\varepsilon_t$, that is,
$$u_t = A \varepsilon_t. \qquad (2)$$
From equations (1) and (2) follows:
$$\Sigma = E[u_t u_t'] = A\, E[\varepsilon_t \varepsilon_t']\, A'.$$
To identify the coefficients in $A$, theoretical restrictions must be imposed to reduce the number of unknown structural parameters to be less than or equal to the number of estimated parameters of $\Sigma$. For our purposes, we will only identify the technology shock, either under a one-technology assumption (Section 3) or later distinguishing between embodied and disembodied technological change (Section 4). No additional assumptions are made to separately identify the remainder (the "non-technology shocks"). In our baseline case, identification is achieved via the maxFEV approach, pioneered by Uhlig (2003, 2004). 8 The idea of the maxFEV approach is to search for innovations that explain the maximum amount of forecast error variance (FEV) of a specified variable either at a target horizon or over a specified forecast horizon. In our baseline case, we identify technology shocks as the shocks which explain the maxFEV of labor productivity at 40 quarters. Appendix Section A.1.2 provides further information on the methodology.
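The maxFEV idea can be sketched as an eigenvalue problem. The code below is a simplified illustration rather than the procedure of the paper's appendix. It assumes structural shocks with unit variance, starts from the Cholesky factor of the innovation covariance matrix, and searches for the unit-length rotation vector that maximizes the FEV share of the target variable at the chosen horizon. Function names and the horizon convention (summing over steps 0 to `horizon`) are assumptions of this example. Since labor productivity enters the VAR in first differences, the responses are cumulated by default to obtain the response of the level.

```python
import numpy as np

def ma_coefficients(B_lags, horizon):
    """Moving-average matrices C_0..C_horizon from VAR lag matrices B_1..B_p."""
    n = B_lags[0].shape[0]
    C = [np.eye(n)]
    for l in range(1, horizon + 1):
        C.append(sum(C[l - j] @ B_lags[j - 1]
                     for j in range(1, min(l, len(B_lags)) + 1)))
    return C

def max_fev_shock(B_lags, Sigma, target=0, horizon=40, cumulate=True):
    """Impact vector of the shock maximizing the FEV share of `target`.

    Returns (impact, q, share): the impact column A q, the rotation vector q,
    and the FEV share of the target variable explained at `horizon`.
    """
    A_chol = np.linalg.cholesky(Sigma)
    resp = np.array([C_l @ A_chol for C_l in ma_coefficients(B_lags, horizon)])
    if cumulate:
        resp = resp.cumsum(axis=0)        # level responses for a differenced variable
    rows = resp[:, target, :]             # target responses to orthogonalized shocks
    S = rows.T @ rows                     # q' S q is the FEV explained by direction q
    eigval, eigvec = np.linalg.eigh(S)    # eigenvalues in ascending order
    q = eigvec[:, -1]
    impact = A_chol @ q
    if impact[target] < 0:                # normalize to a positive shock
        q, impact = -q, -impact
    return impact, q, eigval[-1] / np.trace(S)
```

The trace of `S` equals the total FEV of the target variable, so the returned share lies between zero and one.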

EVIDENCE
This section presents our cross-country evidence on the labor market response to permanent technology shocks. First, Section 3.1 discusses our results for aggregate hours. Then, Section 3.2 focuses on the decomposition of results into adjustments along intensive and extensive margins of labor supply.

Aggregate Hours
Figure 2 depicts the median dynamic response of aggregate hours to a one-standard-deviation positive technology shock together with 5th, 16th, 84th, and 95th percentiles. Solid black lines with shaded bands correspond to our baseline specification. For comparison purposes, the figure also depicts results of two variants: other things being equal, dashed red lines and dotted green lines have aggregate hours and employment in levels and first differences, respectively (without removing a quadratic trend). For our baseline specification, the median impact response of aggregate hours is negative in 13 out of 14 countries, and significantly so (at the 68% level) for 10 countries (see also first row of Table 1). The single exception is Japan, where the impact response is slightly positive, but insignificant. In most countries, the negative response is short-lived. In many instances, aggregate hours revert to zero within 10 quarters, sometimes after turning positive after the initial fall (e.g., Germany or the USA). As we discuss in more detail in Section 5, the negative impact responses hold for a number of alternative specifications considered.
We generalize the point raised by Fernald (2007), Canova, Lopez-Salido, and Michelacci (2010), and Whelan (2009) that for the USA, estimation results are sensitive to low-frequency movements in the data. Our findings confirm their evidence across a broad set of countries. When we compare our baseline results to the variants with aggregate hours in levels (red dashed lines) and in first differences (green dotted lines), two main observations stand out. First, the sign of the impact response appears sensitive to how we enter aggregate hours. For aggregate hours in levels, the impact response turns positive or less negative in seven countries and becomes more negative in five countries. For the first-difference specification, nine countries exhibit a stronger impact fall of aggregate hours. This is an indication of potential over-differencing or mis-specification. Second, in most instances, the shapes of the impact responses are unaffected. Only Australia and France show substantially different patterns. The provided evidence underpins the importance of properly accounting for low-frequency movements in the data prior to estimation. Table 2 summarizes median estimates for the fraction of FEV explained by permanent technology shocks. The table shows estimates at horizons 0 (the impact), 4, 8, 16, and 32 quarters. Overall, the table shows substantial heterogeneity in the role of technology shocks in accounting for fluctuations in aggregate hours across countries. On impact, results range from a low of 2% (Germany) to a high of 37% (Canada). On average, the fraction of FEV of aggregate hours explained by technology shocks remains roughly constant over time, decreasing from an average of 16% to 14% after 8 years. The fraction of FEV explained by technology shocks is higher for labor productivity than for aggregate hours. On impact, results range from a low of 34% (Sweden) to a high of 74% (UK). 
At the 2- to 8-year horizon, the average fraction of FEV of labor productivity explained by technology shocks is roughly 50%. The lowest numbers are found for Japan, Korea, Norway, and Sweden, where it is around 30%.
[Table 1 note: The table reports the impact response of aggregate hours and its components to a positive permanent technology shock. We use **/++ to indicate significance at 95% level and */+ to indicate significance at 68% level (based on 1000 bootstrap replications).]
In sum, we find similar impact responses of aggregate hours to positive, permanent technology shocks across countries, with an impact fall in 13 out of 14 countries of our sample. In most instances, we find a negative comovement between the technology-induced components of aggregate hours and labor productivity. At the same time, there is substantial cross-country heterogeneity in the quantitative importance of technology shocks for fluctuations in aggregate hours and labor productivity.

Intensive and Extensive Margins of Labor Supply
We now turn to the decomposition of our aggregate hours results into adjustments in intensive and extensive margins of labor supply. As a motivating observation, Table 3 reports (in the first three rows) the percentage of the variability in the cyclical component of aggregate hours that is accounted for by movement in employment $n_t$, hours per employee $h_t$, or their covariance. The sample period covered is as before (summarized in Appendix Table A1). Cyclical components are extracted based on the HP-filter with λ = 1600. The decomposition follows an exercise by Hansen (1985) for the USA. In line with Hansen's results, the table shows that for the USA, most of the variance of aggregate hours can be attributed to adjustments in employment per capita. Only 14% can be directly attributed to adjustments in hours per employee. The numbers are similar for the other Anglo-Saxon countries. Interestingly, the pattern is not the same for Austria, France, Japan, Korea, and Norway, where fluctuations in hours per employee account for a higher fraction of the overall variance in aggregate hours than fluctuations in employment.
Against this background, a natural question to ask is to what extent the two margins of labor supply can account for the aggregate hours response to technology shocks described in the last section. A few exceptions aside, we find that on impact of the technology shock, most of the adjustments in aggregate hours take place along the intensive margin. Further, the importance of the extensive margin picks up over time. In the Anglo-Saxon economies, Germany, and Sweden, the extensive margin plays overall the dominant role in explaining the aggregate hours adjustment to technology shocks. The opposite holds for Austria, Finland, France, Japan, Korea, and Norway: In these economies, most of the aggregate hours responses to technology shocks are through movements in average hours per employee.
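The variance decomposition in the first rows of Table 3 rests on the identity that the log of aggregate hours per capita equals the log of employment per capita plus the log of hours per employee. A minimal sketch of the calculation follows, assuming both components are available as log series of equal length. The dense-matrix HP filter and the function names are choices made for this illustration, not the authors' implementation.

```python
import numpy as np

def hp_cycle(y, lam=1600.0):
    """Cyclical component of y from the Hodrick-Prescott filter."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # Second-difference operator D with shape (T-2, T).
    D = np.zeros((T - 2, T))
    for i in range(T - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(T) + lam * D.T @ D, y)
    return y - trend

def hours_variance_shares(log_n, log_h, lam=1600.0):
    """Shares of var(cycle of log H) due to employment, hours per employee,
    and twice their covariance, using log H = log n + log h."""
    n_c, h_c = hp_cycle(log_n, lam), hp_cycle(log_h, lam)
    total = np.var(n_c + h_c)   # the HP filter is linear, so cycles add up
    cov = np.cov(n_c, h_c, ddof=0)
    return {
        "employment": cov[0, 0] / total,
        "hours_per_employee": cov[1, 1] / total,
        "covariance": 2.0 * cov[0, 1] / total,
    }
```

The three shares sum to one by construction. The covariance share is negative whenever the two margins move in opposite directions over the cycle.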
The first finding is summarized in Table 1, which reports the decomposition of median impact responses. The table shows that the intensive margin response is higher (in absolute terms) than the extensive margin response in all countries except Australia and Ireland. Stated differently, we find that in the very short run, the fall in aggregate hours depicted in Figure 2 (present for all countries except Japan) is mainly driven by a decrease in hours per employee. Similarly, Figure 3 shows the IRFs of our decomposition of aggregate hours reported in Figure 2. The figure shows the response of employment (solid black lines) and hours per employee (dashed red lines) to positive permanent technology shocks for the baseline specification. While the response of employment to technology shocks tends to build up slowly over time, the response of hours per employee reverts to zero rather quickly.
As to the finding of an increase in the importance of extensive margin adjustments over time, rows 4-6 in Table 3 repeat the variance decomposition of aggregate hours for the technology-driven components. Except for Austria, France, Japan, Korea, and Norway, we find that variation in employment accounts for the bulk of the adjustments in aggregate hours. Movements in hours per employee appear least important in Australia, the UK, and the USA.
Another interesting finding is that the technology-driven cyclical components of intensive and extensive margin adjustments are negatively correlated in France and Sweden, and positively correlated in all the remaining countries. France is the only country with a significantly positive response of employment to technology shocks. Yet, hours per employee fall significantly on impact. As shown in Figure 3, the IRFs are almost mirror images of each other, highlighting the negative correlation of the two margins.
The last rows in Table 2 report the FEV of the two margins. The table shows that for all countries except Ireland and Italy, the fraction of FEV of employment accounted for by technology shocks is increasing over time. On average, the fraction of FEV of employment increases from 6% to 13% after 8 years. For hours per employee, the average FEV remains at roughly 15%.
Our results highlight interesting differences between countries generally associated with high and low labor market rigidities, respectively. A few exceptions aside, countries in which the intensive margin is important in explaining aggregate hours fluctuations (as highlighted in Table 3) coincide with countries with high firing frictions, as summarized for instance by the OECD ranking of protection against dismissal (printed in Appendix Table A4). Consider the decomposition of the aggregate hours response in terms of the two labor supply margins, depicted in Figure 3, for instance France/Germany (high firing restrictions) and Canada/the USA (low firing restrictions). For the former two, the employment response is positive on impact and builds up over time, reaching its peak 5-10 quarters after the shock. By contrast, for the latter, employment significantly declines on impact and requires more time to build up. Our results provide evidence consistent with a higher flexibility of adjusting employment in Canada and the USA compared to Germany or France. However, we also want to emphasize that the evidence we present in this regard is not perfect. For instance, Italy ranks similarly to France and Germany on the OECD employment protection rankings but has important employment fluctuations.
Overall, our decomposition results show that the information from the intensive margin is especially important when focusing on the short-run response of aggregate hours. The fact that most short-run adjustment in aggregate hours takes place along the intensive margin highlights that working with employment as a proxy for aggregate hours can be misleading. Over longer horizons, however, employment becomes a better proxy. A second observation is that the observed delay in the reaction in the extensive margin to a positive technology shock appears consistent with the idea that (un)employment evolves as a stock, while hours evolve as a flow. The results are consistent with the idea of frictional labor markets as introduced by Mortensen and Pissarides (1994) and used by Merz (1995) or Andolfatto (1996) (among others) for studying economic fluctuations. In a nutshell, the idea of this approach is that workers take time to find a job and that it is costly for firms to find suitable workers, both generating a delay in the reaction of the extensive margin to technology shocks.

INVESTMENT-SPECIFIC TECHNOLOGICAL CHANGE
Up to this point, we have focused on technology shocks that affect the production of all goods homogeneously. However, Krusell (1997, 2000) and Fisher (2006) among others have shown that neutral shocks are not the only source of technological change. Accordingly, technical change specific to investment is a major source of economic growth and important for explaining business cycle fluctuations. 9 In contrast to disembodied technological change, technical progress specific to investment has no impact on the productivity of old capital goods. Rather, it makes new capital goods more productive or less expensive, thereby raising the real return to investment. Against this background, we review and expand our results from Section 3 by considering two distinct sources of technological change, namely disembodied (N) and embodied (I) technology. We focus on a subsample of five countries for which we are able to collect and construct data of sufficient quality.

Changes to the Previous Set-Up: Data and Identification
We start with the amendments to our previous set-up which allow us to disentangle N- and I-shocks. Following the empirical literature on investment-specific technological change, we identify I-shocks from data on the RPI. An econometric justification for this practice is provided by Schmitt-Grohé and Uribe (2011), who show that the technology transforming consumption into investment goods is approximately linear. 10 We follow Fisher (2006) and construct data series on the RPI from national accounts statistics as the ratios of chain-weighted deflators for investment and consumption. We define consumption as the sum of consumption in services and in non-durables. Depending on availability, we use either household or private consumption. Investment is the more challenging part to compute: We define it as the sum of consumption in durables, non-residential fixed investment (namely, investment in structures, machinery and equipment, and software) and residential investment (investment in structures and equipment). The exact categories differ across countries depending on availability. 11 We use chain-weighting to aggregate the subcategories into our measures of consumption and investment, as explained in more detail in Appendix Section A.2.3. The construction is based on quarterly data obtained from Statistics Canada, EUROSTAT, the Cabinet Office Japan, and the US Bureau of Economic Analysis, respectively (obtained via Datastream). As data prior to 1981 are not available apart from the USA, we choose a sample common to all countries spanning the 1982Q3-2016Q4 period. 12 Appendix Section A.2.3 reports selected business cycle moments of the series and depicts the data in (log)-levels and growth rates.
To estimate the VAR including our RPI series, we need to express all series in a common unit. As in Fisher (2006), we use consumption as the common numeraire. To adjust labor productivity, we first express the series in nominal units by multiplying it by its deflator. We then deflate nominal labor productivity by the consumption deflator. Aggregate hours, interest rates, consumption- and investment-to-output ratios, inflation, and the RPI do not need to be adjusted. In particular, the RPI is, by construction, expressed in consumption units, since the price of investment is deflated by the consumption deflator. We use the same baseline specification as before, but now include the RPI in the first position. We now have $Y_t = [\Delta RPI_t, \Delta lp_t, \hat{H}_t, cy_t, iy_t, i_t, \pi_t, \hat{n}_t]'$. Data pretreatment is as explained in Section 2. In particular, aggregate hours are quadratically detrended and enter in levels. Labor productivity and now also the RPI enter in first differences.
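The change of units described above is a simple rescaling in logs. The sketch below assumes that the deflator of labor productivity is the output (GDP) deflator and that all price indices are positive arrays of equal length; both function names are hypothetical.

```python
import numpy as np

def log_relative_price_of_investment(p_inv, p_cons):
    """Log RPI: investment deflator relative to the consumption deflator."""
    return np.log(p_inv) - np.log(p_cons)

def productivity_in_consumption_units(log_lp_real, p_output, p_cons):
    """Re-express log real labor productivity in consumption units.

    Multiplying by the output deflator gives nominal productivity; dividing
    by the consumption deflator expresses it in units of consumption goods.
    """
    return log_lp_real + np.log(p_output) - np.log(p_cons)
```

Both series would then be first-differenced before entering the VAR, in line with the specification above.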
Regarding identification, we follow Fisher (2006) and use the two identifying assumptions that (1) I-shocks are the sole source of permanent changes in the RPI and (2) only N- and I-shocks affect labor productivity in the long run. Fisher (2006) shows how these restrictions follow directly from a neoclassical growth model with investment-specific technology. The two identifying assumptions readily translate into the maxFEV set-up. Now, identification follows a two-step procedure: First, the I-shock is identified as the shock that explains the maxFEV of the RPI at a 10-year horizon (following the steps outlined in 2.1). Conditional on having identified the I-shock, the N-shock is identified as the orthogonal shock explaining the maxFEV of labor productivity at the 10-year horizon.
Table 4 reports the median impact response of aggregate hours to one-standard-deviation positive N-shocks (left column) and I-shocks (right column) together with its decomposition along the intensive and extensive margin of labor supply. Focusing first on N-shocks, the results for Canada, the UK, and the USA are similar to those obtained under the one-technology assumption. However, in the cases of France and Japan, the median impact response of aggregate hours to positive N-shocks turns significantly positive and negative, respectively. As for the response to I-shocks, except for Japan the median impact response of aggregate hours is positive and significantly so for France and the USA. 13 The evidence reported for Japan is in line with the results presented in Watanabe (2012). Regarding the decomposition into adjustment along the labor supply margins, several observations emerge. First, it still holds that the bulk of adjustment in aggregate hours on impact takes place along the intensive labor supply margin, irrespective of the type of technology shock.
Second, for France the response of hours per employee to an N-shock reverses its sign and is thus responsible for the increase in aggregate hours. Third, in the case of Japan all labor market variables fall significantly in response to both N-shocks and I-shocks.
Table 5 summarizes the fraction of the FEV of labor productivity, aggregate hours, and its decomposition explained by N- and I-shocks, respectively. While N-shocks are far more important in explaining the FEV of labor productivity than I-shocks (at any horizon and for any country), they are roughly equally important for aggregate hours. One exception is Japan, where I-shocks gain importance over time in explaining the FEV of aggregate hours. When it comes to employment and hours worked per employee, we observe for Canada that N-shocks have become more important in explaining the FEV of employment, while in France the opposite is the case. In Japan it is mostly the I-shock which explains the FEV of intensive and extensive labor supply margins.
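The two-step maxFEV identification of I- and N-shocks described above can be sketched as follows. This is an illustrative implementation under our own assumptions, not the authors' code: `ma_coefs` is a hypothetical list of reduced-form moving-average coefficient matrices (with the identity at lag zero), the FEV is accumulated over lags 0 to `horizon - 1`, and no sign normalization is imposed on the resulting shocks.

```python
import numpy as np


def fev_matrix(ma_coefs, chol, target, horizon):
    """Matrix S such that q' S q is the FEV of variable `target`
    (up to `horizon` steps) explained by the unit-variance shock with impact chol @ q."""
    n = chol.shape[0]
    S = np.zeros((n, n))
    for tau in range(horizon):
        row = ma_coefs[tau][target, :] @ chol  # response of the target at lag tau
        S += np.outer(row, row)
    return S


def max_fev_direction(S, basis=None):
    """Unit vector q maximizing q' S q, optionally restricted to span(basis)."""
    if basis is None:
        basis = np.eye(S.shape[0])
    _, vecs = np.linalg.eigh(basis.T @ S @ basis)  # eigenvalues in ascending order
    q = basis @ vecs[:, -1]                        # eigenvector of the largest eigenvalue
    return q / np.linalg.norm(q)


def identify_two_shocks(ma_coefs, sigma_u, horizon, rpi=0, lp=1):
    """Step 1: I-shock maximizes the FEV share of the RPI.
    Step 2: N-shock maximizes the FEV share of labor productivity
    among directions orthogonal to the I-shock."""
    chol = np.linalg.cholesky(sigma_u)
    q_i = max_fev_direction(fev_matrix(ma_coefs, chol, rpi, horizon))
    # Orthonormal basis of the orthogonal complement of q_i.
    _, _, vt = np.linalg.svd(q_i[None, :])
    complement = vt[1:].T
    q_n = max_fev_direction(fev_matrix(ma_coefs, chol, lp, horizon), complement)
    return chol @ q_i, chol @ q_n, q_i, q_n
```

With quarterly data, the 10-year horizon corresponds to `horizon=40`. The first two return values are the impact vectors of the I- and N-shock. Their signs are arbitrary in this sketch and would have to be normalized separately.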

Results
Finally, Table 6 decomposes the variability in the conditional components of aggregate hours in terms of hours per employee, employment, or their covariance. The results confirm the findings of the one-technology specification. Overall, variation in employment accounts for the bulk of the adjustments in aggregate hours conditional on both N-shocks and I-shocks. Further, the relative roles of intensive and extensive margins are largely similar across the two shocks (except for France). For France, the relative role of the two labor supply margins switches depending on the source of the shock: While hours per employee accounts for a higher fraction of cyclical fluctuations in aggregate hours conditional on N-shocks, employment is more important conditional on I-shocks. However, what is most striking from these results are the large cross-country differences in the overall role of intensive and extensive labor supply margins, as discussed in Section 3.2.
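A decomposition of this kind can be illustrated with a short sketch. It assumes, as our own simplification, that (log) aggregate hours equal the sum of (log) hours per employee and (log) employment, so that the variance of aggregate hours splits into two variances and twice the covariance. The function name is hypothetical, and the inputs would be the shock-conditional components of the two series.

```python
import numpy as np


def hours_variance_shares(hours_per_employee, employment):
    """Shares of var(H) due to the intensive margin, the extensive margin,
    and their covariance, assuming H = h + n (all in logs)."""
    h = np.asarray(hours_per_employee, dtype=float)
    n = np.asarray(employment, dtype=float)
    cov = np.cov(h, n, ddof=0)  # 2x2 covariance matrix of (h, n)
    total = cov[0, 0] + cov[1, 1] + 2.0 * cov[0, 1]
    return {
        "intensive": cov[0, 0] / total,
        "extensive": cov[1, 1] / total,
        "covariance": 2.0 * cov[0, 1] / total,
    }
```

The three shares sum to one by construction. A negative covariance share indicates that the two margins move in opposite directions conditional on the shock.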

FURTHER ROBUSTNESS CHECKS
The IRFs presented in Figure 2 show that our estimation results are somewhat sensitive to low-frequency movements in the data. While the nonpositive impact response of aggregate hours largely holds across specifications and the shapes of the IRFs remain similar, there are nonetheless interesting differences in results. In particular, impact responses under the level specification are somewhat higher than under quadratic detrending (due to a more positive response in employment), and somewhat lower under the first-difference specification (due to a more negative response in hours per employee). For the level specification, the impact response of aggregate hours turns positive in four instances: France, Japan, Korea, and Norway. These four countries are the only ones for which we also identify a strong comovement between labor productivity growth and aggregate hours: In each of these countries, the series display a similar high-low pattern over the sample studied, much as discussed by Fernald (2007) for the US. Hence, the instabilities in the sign of the median responses do not raise "red flags": Data are pretreated prior to estimation specifically because we know, e.g., from Fernald (2007) or Canova, Lopez-Salido, and Michelacci (2010), that estimation results are sensitive to low-frequency comovements in the data. Since estimated responses switch sign if this comovement is reduced (either by adjusting labor productivity, or aggregate hours, or both), our results reemphasize the importance of controlling for low-frequency patterns. While our results stress the importance of removing long cycles, the choice of quadratic detrending rather than, for instance, cubic or higher-order polynomials is somewhat arbitrary.
Our results are qualitatively robust to these alternative detrending methods. 14 We test the robustness of our results along many other dimensions. First, instead of identifying technology shocks by maxFEV, we also estimate our baseline VAR using standard long-run (LR) restrictions in the spirit of Galí (1999). While impact responses remain negative in most instances, they often turn insignificant. As shown by Francis, Owyang, Roush, and DiCecio (2014), models identified by LR restrictions may contain only limited information about short-run movements in hours, which is why the maxFEV approach is preferred in this exercise. A further robustness test concerns our decomposition methodology for aggregate hours, as the linear restriction that $\hat{n}_t$ has no contemporaneous or lagged impact on any other variable of the system may be considered overly restrictive. Without this restriction, we can estimate the VAR by simple OLS rather than by EGLS. Although this alternative specification is not in line with the related literature, it allows a better understanding of the decomposition into the two labor supply margins. Removing the restriction leaves results in terms of aggregate hours largely unaffected. The response is nonpositive in all countries and significantly negative in 8. Looking at the intensive and extensive margins of labor supply, however, all responses for the intensive margin become insignificant and employment dynamics dominate the results. The aggregate hours response is biased towards employment for two reasons. First, the high persistence of employment confounds the aggregate hours response. Second, employment is by definition already included in aggregate hours, and consequently also in labor productivity, thus over-specifying the system.
Further robustness issues concern omitted variable bias and sub-sample stability. Our baseline specification features four variables meant to represent the broad picture of the macroeconomic environment and the monetary policy setting in particular. Our results appear qualitatively robust to systems of lower order, yet there are indications of omitted variable bias. In particular, following an exercise by Canova, Lopez-Salido, and Michelacci (2010), we find that in a bivariate VAR with labor productivity and aggregate hours, the estimated technology shocks are correlated with the four variables added in our baseline specification. Specifically, for each country, at least one of the four variables is significantly correlated with the extracted technology shocks between four leads and four lags. As to sub-sample stability, we check the robustness of our results by excluding data prior to 1982Q3. The early 1980s are associated, in many industrialized countries, with important changes in the conduct of monetary policy and the macroeconomic environment more broadly. While these changes potentially matter for the propagation of shocks and hence for the identified technology shocks, results for the two samples appear overall similar.
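The lead-lag correlation check described above can be sketched as follows. The function is a hypothetical illustration of ours and only computes the sample correlations between the extracted shocks and one candidate omitted variable at shifts from four lags to four leads. Assessing their significance, as done in the text, would require an additional test that is not shown here.

```python
import numpy as np


def lead_lag_correlations(shocks, variable, max_shift=4):
    """Sample correlation between shock_t and variable_{t+j}
    for j = -max_shift, ..., max_shift (negative j: lags, positive j: leads)."""
    s = np.asarray(shocks, dtype=float)
    x = np.asarray(variable, dtype=float)
    if s.shape != x.shape:
        raise ValueError("shocks and variable must have the same length")
    T = s.size
    out = {}
    for j in range(-max_shift, max_shift + 1):
        if j >= 0:
            a, b = s[:T - j], x[j:]      # pair shock_t with variable_{t+j}
        else:
            a, b = s[-j:], x[:T + j]     # pair shock_t with variable_{t-|j|}
        out[j] = float(np.corrcoef(a, b)[0, 1])
    return out
```

Correlations that are large at any shift would suggest that the bivariate shocks pick up information contained in the omitted variable.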
Finally, we also consider robustness to changes in assumptions regarding the estimation of the VAR. According to standard lag-selection criteria, the optimal number of lags is either one or two (except for Sweden, for which it is three). We also consider systems of higher order, as the more generous lag lengths of 8 or 12 may offer a better approximation of the model's infinite-order representation. We only see a sign reversal for Sweden (for a lag order of 12) and Germany (for lag orders of 1, 2, or 12). Regarding the horizon of identification, we consider 20, 60, and 80 quarters over which the forecast error variance is maximized. The only country that displays a sign reversal in the impact response is Australia. Overall, these results reveal some instabilities in Australia, Germany, and Sweden. For the majority of countries, however, the negative impact response of aggregate hours to positive technology shocks is robust to the VAR's lag order or the horizon over which shocks are identified.

CONCLUDING REMARKS
Is the reaction of aggregate hours to technology shocks similar across OECD countries? Based on quarterly data for 14 OECD countries over the 1970Q1-2016Q4 period, we find that it is: Many qualitative features documented for the US hold across the 14 OECD countries of our sample. We observe a negative impact response of aggregate hours to positive permanent technology shocks in 13 out of 14 countries. While the dynamics in aggregate hours appear similar across countries, our decomposition analysis shows that the intensive and extensive labor supply margins respond differently to technology shocks. The reported evidence implies that using employment data as a proxy for aggregate hours, as for instance in Galí (1999, 2004) or Dupaigne and Fève (2009), can be misleading. In the Anglo-Saxon economies, variation in employment explains the bulk of the aggregate hours response to technology shocks. However, the opposite holds for Austria, France, Japan, Korea, and Norway.
Our results further emphasize the importance of labor market rigidities in accounting for labor market dynamics. Countries in which a relatively large share of the adjustment in total hours takes place along the intensive margin of labor supply are typically associated with high firing costs. Moreover, high flexibility in employment adjustment appears to be related to countries where the bulk of adjustment in aggregate hours is along the extensive margin. Whether this evidence can be replicated by theoretical business cycle models with frictional labor markets is left for future research.

NOTES
1. Prominent contributions to this debate include Galí (1999), Christiano, Eichenbaum, and Vigfusson (2004), Francis and Ramey (2005), Basu, Fernald, and Kimball (2006), Fisher (2006), and Fernald (2007).
2. This extension follows a long line of work including Greenwood, Hercowitz, and Huffman (1988), Gordon (1990), Greenwood, Hercowitz, and Krusell (2000), Fisher (1999, 2006), Smets and Wouters (2007), or Justiniano, Primiceri, and Tambalotti (2011).
3. For the US data, Christiano, Eichenbaum, and Vigfusson (2003) show that the omission of consumption- and investment-to-output ratios can cause specification error sufficiently large to qualitatively affect the inference about the effect of technology shocks. See also Erceg, Guerrieri, and Gust (2005) among others.
4. We also explore subsample robustness. For example, we consider the 1982Q3-2016Q4 subsample, because the early 1980s are associated, in many industrialized countries, with important changes in the conduct of monetary policy and the macroeconomic environment more broadly (see, e.g., Bernanke and Blinder (1992), Sims (1992), Taylor (1993), or Ireland (2000)).
5. In a robustness exercise, we deflate the data with the Consumer Price Index (CPI) instead of the GDP deflator. The qualitative nature of our results is unaffected by this change.
6. This clearly matters: It is a widely discussed feature of the US data that for many specifications, the estimated response of aggregate hours to permanent technology shocks changes its sign depending on how aggregate hours enter the VAR (see, e.g., Christiano, Eichenbaum, and Vigfusson (2003)). Fernald (2007) and Canova, Lopez-Salido, and Michelacci (2010) explain the sign-reversal by level shifts in the data. In the USA, high aggregate hours growth in the 1990s coincides with high labor productivity growth. If the level shifts are not accounted for, the positive comovement is partly identified as permanent technology shocks. Testing for stationarity does not resolve the matter. As aggregate hours are very persistent, the assumption of stationarity is usually rejected by unit root tests in small samples (see Appendix Table A3). Yet, from a theoretical viewpoint, one can easily argue that aggregate hours per capita are by definition level-stationary, as one cannot work more than 24 h a day.
7. Results are by definition identical when appending average hours per employee instead.
8. We choose the maxFEV over long-run (LR) restrictions (e.g., applied in Galí (1999)) since it does not suffer from small sample bias in computing impulse responses (e.g., Faust and Leeper (1997)). In particular, Francis, Owyang, Roush, and DiCecio (2014) show with Monte Carlo experiments that the maxFEV identification outperforms identification via LR restrictions as it reduces the bias in short-run IRFs and raises estimation precision.
10. A linear transformation technology is sufficient to ensure that the RPI is exogenous and only depends on investment-specific technology shocks. With nonlinear technology, the RPI would instead depend on the amount of resources devoted to investment.
11. For France, Japan, and the UK, we collect data for GFCF by type (namely dwellings, other buildings and structures, transport equipment, and other machinery and equipment). For the UK, price increases in residential investment in dwellings are over 400% between 1980 and 2010, and we choose to exclude it. For Canada, we collect the relevant GFCF data for businesses and government. For the USA, we use data on private residential and nonresidential investment and government investment.
12. Our obtained measures of the RPI likely underestimate the rate of investment-specific technological change due to lack of quality adjustment (see, e.g., Gordon (1990), Schmitt-Grohé and Uribe (2011) or Cummins and Violante (2002)). For the USA, this issue is likely small, as the share of quality-adjusted equipment goods in the national accounts series we use becomes relatively large after 1982. However, there is no reason to assume the situation to be similar in other countries. To our knowledge, information on quality adjustment procedures is not available at an international level, and there are no clear quality adjustment rules across national statistical offices. This drawback in our RPI series is the main reason we report results for the one-technology assumption separately, since the underlying data for Section 3 are likely of much better quality.
13. In contrast to the response to N-shocks, the response of aggregate hours to I-shocks is more sensitive to alternative specifications. For instance, we calculate an alternative measure of the RPI in a simple and relatively crude way. For each country, we obtain nominal and real series for private consumption and GFCF and compute the respective consumption and investment deflators by dividing nominal series by real ones. The RPI is then defined as the ratio of these two deflators (see for instance Watanabe (2012) or Beaudry, Moura, and Portier (2015)). Under such a specification, the median impact response of aggregate hours to I-shocks turns negative for the USA.
14. The evidence is summarized in Appendix Figure B1 and Tables B1 and B2.
15. To give an example, to compute the chain-weighted growth rates in 2000 prices, we compute two distinct growth rates with either 1999 or 2000 as base year and take their geometric average.

Lütkepohl (2005) shows how a generalized LS (GLS) estimator, $\hat{\gamma}$, can be calculated. The matrix $\Sigma_u$ is typically unknown and has to be estimated from the data. We use the consistent and unbiased estimator from the unrestricted VAR, $\hat{\Sigma}_u = \frac{1}{T - np - 1}\hat{U}\hat{U}'$, to obtain the EGLS estimator, $\hat{\hat{\gamma}}$, which has the same asymptotic properties as the GLS estimator $\hat{\gamma}$. Finally, the estimator for $\beta$ is given by $\hat{\beta} = R\hat{\hat{\gamma}} + r$.
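For orientation, the restricted GLS and EGLS estimators referred to above take the following textbook form in Lütkepohl (2005). The notation is an assumption on our part and need not coincide with the authors' own display: $y$ stacks the observations, $Z$ collects the regressors, the linear restrictions are written as $\beta = R\gamma + r$, and $z$ is the restriction-adjusted dependent variable.

```latex
\begin{align}
  y &= (Z' \otimes I_n)\,\beta + u, \qquad \beta = R\gamma + r, \qquad
  z \equiv y - (Z' \otimes I_n)\,r, \\
  \hat{\gamma} &= \left[ R' \left( ZZ' \otimes \Sigma_u^{-1} \right) R \right]^{-1}
                  R' \left( Z \otimes \Sigma_u^{-1} \right) z, \\
  \hat{\hat{\gamma}} &= \left[ R' \left( ZZ' \otimes \hat{\Sigma}_u^{-1} \right) R \right]^{-1}
                  R' \left( Z \otimes \hat{\Sigma}_u^{-1} \right) z .
\end{align}
```

The EGLS estimator simply replaces the unknown $\Sigma_u$ in the GLS formula with its estimate $\hat{\Sigma}_u$.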
APPENDIX A.1.2: Identification: Maximum fraction of FEV approach
The following explanations are drawn from Caldara, Fuentes-Albero, Gilchrist, and Zakrajšek (2016), Uhlig (2003), and Francis, Owyang, Roush, and DiCecio (2014). To provide some structure on the methodology, it is helpful to rewrite equation (1) into the moving-average representation or its equivalent representation in terms of the lag polynomial $\Phi(L)$, where $\Phi(L) \equiv I + \Phi_1 L + \Phi_2 L^2 + \dots$. The version of the maxFEV approach we consider identifies a shock by searching for the innovation that explains the maximum amount of FEV of a specified variable at some target horizon $k$. We are hence interested in the $k$-step-ahead forecast error of equation (A6). With labor productivity being ordered first in the SVAR, the part of this forecast error due to the technology shock is picked out by the selection vector $e_1$, which has dimension $1 \times n$ and takes the form $e_1 = [1\ 0\ \dots\ 0]$. Following Caldara, Fuentes-Albero, Gilchrist, and Zakrajšek (2016), what is needed to find the maximum fraction of FEV is an orthogonal matrix $Q$ such that $\tilde{A} = AQ$, where $A$ denotes the Cholesky decomposition of $\Sigma_u$. Then, finding the innovation that accounts for the maximum fraction of FEV of the first variable in $Y_t$ amounts to finding the first column of $Q$ (denoted by $q_1$) by solving the maximization problem in equation (A9) subject to $q_1'q_1 = 1$.
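As a hedged sketch in standard form, not necessarily the authors' exact notation, the optimization behind this approach can be written as follows. We assume $\Phi_\tau$ denotes the moving-average coefficient matrices with $\Phi_0 = I$, $A$ the Cholesky factor of the reduced-form covariance matrix $\Sigma_u$, $\varepsilon_t$ the orthonormal structural shocks, and $e_1$ the $1 \times n$ selection vector. The summation range $0, \dots, k-1$ and the definition of $S$ are our assumptions.

```latex
\begin{align}
  Y_{t+k} - E_t Y_{t+k} &= \sum_{\tau=0}^{k-1} \Phi_\tau\, u_{t+k-\tau},
      \qquad u_t = A Q\, \varepsilon_t, \\
  q_1^{*} &= \arg\max_{q_1}\; q_1' S\, q_1
      \quad \text{subject to} \quad q_1' q_1 = 1, \\
  S &\equiv \frac{\sum_{\tau=0}^{k-1} \left( e_1 \Phi_\tau A \right)' \left( e_1 \Phi_\tau A \right)}
                 {\sum_{\tau=0}^{k-1} e_1 \Phi_\tau \Sigma_u \Phi_\tau' e_1'} .
\end{align}
```

Under this normalization, $q_1' S q_1$ is the fraction of the $k$-step-ahead FEV of the first variable attributable to the shock, so the maximized value is an eigenvalue of $S$ lying between zero and one.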
The solution to equation (A9) corresponds to finding $q_1$, the eigenvector of $S$ associated with the largest eigenvalue $\lambda$.
Note: (i) ISO country codes; (ii) Neither theory nor visual inspection of the series provides clear guidance on whether the econometrician should include a time trend in the unit root test; (iii) As hours per capita are bounded above and below, the inclusion of a time trend is purely dependent on the considered time window; (iv) Statistic is $T(\hat{\rho} - 1)$, with $T$ being the sample length and $\hat{\rho}$ the OLS estimate of $\rho$ in $y_t = \rho y_{t-1} + u_t$; (v) OLS t-test statistic for the null hypothesis that $\rho = 1$. ***, **, and * imply that we fail to reject the null of a unit root at 1%, 5%, and 10% significance, respectively. p-values are computed based on 1,000 bootstrap replications.
An investment deflator is computed in an analogous way. The ratio between the investment and consumption deflators then gives a measure of the real price of investment.
Computing the chain-weighted growth in the consumption deflator involves making two calculations of growth for each period. The only difference is the base period used for the quantities: we compute the growth in prices once using the quantities of the period itself and once using the quantities of the preceding period. 15 With $Q$ denoting quantities and $P$ the deflator, we compute:
$$gr_1 = \frac{\sum_i P_{t,i} Q_{t,i}}{\sum_i P_{t-1,i} Q_{t,i}} = \frac{\sum_i C_{curr,t,i}}{\sum_i \frac{C_{curr,t-1,i}}{C_{con,t-1,i}} C_{con,t,i}}, \qquad gr_2 = \frac{\sum_i P_{t,i} Q_{t-1,i}}{\sum_i P_{t-1,i} Q_{t-1,i}} = \frac{\sum_i \frac{C_{curr,t,i}}{C_{con,t,i}} C_{con,t-1,i}}{\sum_i C_{curr,t-1,i}},$$
where the indices $i$ denote the different components of consumption used. The right-hand equations show the computations in terms of indices measured in constant prices ($C_{con}$, giving $Q$) and current prices ($C_{curr}$, corresponding to $PQ$). The chain-weighted growth in the consumption deflator is then computed as the geometric average of the two growth rates, namely $P^{CW} = (gr_1 \times gr_2)^{0.5} - 1$.
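The computation above can be illustrated with a short sketch. The function names and the array layout (rows are periods, columns are expenditure components) are assumptions made for illustration only. The inputs correspond to the current-price and constant-price series of the components.

```python
import numpy as np


def chain_weighted_deflator_growth(current, constant):
    """Chain-weighted growth of a deflator from component series.

    current:  array (T, m) of expenditure in current prices (P * Q)
    constant: array (T, m) of expenditure in constant prices (Q)
    Returns an array of length T - 1 with the growth rate for each period.
    """
    c_curr = np.asarray(current, dtype=float)
    c_con = np.asarray(constant, dtype=float)
    prices = c_curr / c_con  # implicit component deflators P_{t,i}

    # gr1: price growth evaluated at the quantities of the period itself.
    gr1 = c_curr[1:].sum(axis=1) / (prices[:-1] * c_con[1:]).sum(axis=1)
    # gr2: price growth evaluated at the quantities of the preceding period.
    gr2 = (prices[1:] * c_con[:-1]).sum(axis=1) / c_curr[:-1].sum(axis=1)

    # Geometric average of the two gross growth rates, expressed as a net rate.
    return np.sqrt(gr1 * gr2) - 1.0


def chain_index(growth, base=1.0):
    """Cumulate period growth rates into an index that starts at `base`."""
    growth = np.asarray(growth, dtype=float)
    return base * np.concatenate(([1.0], np.cumprod(1.0 + growth)))
```

Applying the same two functions to the investment components and dividing the resulting index by the consumption index would give a relative price of investment in the sense used here.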
Ideally, we would want chain-weighted indices instead of constant price series. However, such series are in most cases only available after the 1990s. The choice of a fixed base year means that one is using a price structure that becomes more and more remote from the current structure the further we move away from the base year. To alleviate this problem somewhat, we take several series for different base years and link them via growth rates into one index.
Business cycle moments: Table A5 reports selected business cycle moments for our RPI series. According to the table, the unconditional correlation between the RPI and GDP (measured in consumption units, as explained below) is positive in all countries. The correlation is lowest in the USA and highest in Japan. Apart from the UK, we find that the volatility in the RPI is lower than in GDP.
Note: Cyclical components are obtained based on the HP-filter with λ = 1600. Output is measured in terms of consumption units. All correlations ρ(·) are significant at the 5% level.
FIGURE A1. Relative price and quantity of investment. Note: The relative price of investment (RPI) is measured in log-levels.
Visual inspection: Figure A1 shows a clear negative trend in the RPI for all five countries. While the decline in the RPI is near monotone in Canada, France, Japan, and the USA, there is substantial variation in the UK series. The figure further shows that the large price declines coincide with large increases in the quantity of investment. Investment and its relative price are clearly negatively correlated.

APPENDIX B: ADDITIONAL RESULTS AND ROBUSTNESS
Figure B1 compares impact responses of the labor market variables across countries for our baseline specification with quadratic detrending to both the level and first-difference specification. For comparison purposes, here we take an agnostic stance on the relative relevance of the different specifications. The blue error bars correspond to the results presented in Table 1, while the green and red error bars disentangle the impact responses of aggregate hours depicted in Figure 2 along the two margins of labor supply.
The table reports the impact response of aggregate hours to positive permanent neutral (N) and investment-specific (I) technology shocks. We use **/++ to indicate significance at the 95% level and */+ to indicate significance at the 68% level (based on 1000 bootstrap replications). The specification Fisher (2006) corresponds to a trivariate VAR with hours in first differences and identification with LR restrictions. If not stated otherwise, the specifications are as in our baseline VAR with the single difference outlined in the first column of each row.