Which Factors Drive the Skill-Mix of Migrants in the Long-Run?

A pervasive, yet little acknowledged feature of international migration to developed countries is that newly arriving immigrants have become increasingly highly skilled since the 1980s. This paper analyses the determinants of changes in the skill composition of immigrants using a framework suggested by Grogger & Hanson (2011). We focus on Switzerland, which has continuously shown very high immigration rates and dramatic changes in the skill composition of immigrants. In addition, the integration of Switzerland into the European labour market in 2002 serves as a policy experiment which allows us to analyse how a reduction of immigration restrictions affected immigrants from European countries in comparison to those from other countries. Our findings suggest that changes in education supply in origin countries and shifts in the relative demand for education groups stand out as the two most important drivers. Yet, while supply alone predicts only a modest increase in the case of highly educated workers and a large increase in middle educated workers, one particular demand channel, the polarisation of labour demand induced by the adoption of computer capital, is crucial to explain the sharp increase in highly educated workers and the mere stabilisation of the share of middle educated immigrant workers. The abolition of quotas for EU residents played a smaller role, yet may have slightly reduced the high skill share among immigrants from the EU relative to immigrants from other countries.


Introduction
Which factors drive the skill composition of immigrants? A pervasive feature of international migration flows to developed countries in the last decades is that newly arriving immigrants are increasingly highly skilled. Between 1980 and 2010, the share of immigrants with a tertiary education increased by 15 percentage points on average for 20 OECD countries (Brücker H. & Marfouk, 2013). 1 Yet, the changes in the share of highly educated immigrants have been very uneven across countries with large gains in countries such as Australia, Canada, the UK and Switzerland and more modest changes in France or Germany (Docquier & Marfouk, 2005). These trends have gained more saliency in the light of an ongoing discussion among policy makers whether skilled immigration could serve as a palliative for increasing labour shortages of skills in developed countries. Yet, there seems to be little agreement on the actual drivers of skill scarcity and whether and how policy makers should respond by adapting immigration policies (Chaloff & Lemaitre, 2009;Stevens et al., 2009). From this perspective, it is surprising that the factors driving these trends have received relatively scant attention in the academic literature.
In this paper, we study the determinants of the changing skill composition among immigrants building on the empirical framework suggested by Grogger & Hanson (2011). According to this framework, the educational composition of immigrants from a certain origin country observed in a destination depends on (i) the wage differentials of education groups in the destination, (ii) the wage differentials in the origin country, (iii) the population shares of the education groups in the origin and (iv) a vector of education specific bilateral migration costs. While Grogger & Hanson (2011)'s analysis is static using a cross-section of destinations and origin countries, we analyse the importance of each of these factors from a long-run perspective. We focus on the period between 1980 and 2010, and on a single destination country, Switzerland.
Switzerland represents an interesting yet exemplary case for a number of reasons. First, together with a group of other countries (such as Australia, Canada and the U.S.), Switzerland has traditionally exhibited very high immigration rates (Peri, 2005). In 2011, the country's population share of foreign born stood at 27.2%, which was surpassed only by Luxembourg (OECD, 2014). Second, Switzerland has witnessed a strong change in the skill composition of newly immigrating workers during the past decades. In 1990, only 17% of the workers who had recently (within the last five years) immigrated into the country had a tertiary degree, whereas this share rose to 47% in 2010 (cf. Figure 1). 2 Third, the integration of Switzerland into the European labour market provides us with a rare policy experiment in which immigration restrictions were abolished for workers from the EU while immigration from third party countries remained subject to quotas. This allows us to study the effect of changing immigration restrictions using a difference-in-differences design. To analyse the role of the different driving forces of the skill composition of immigrants, we exploit the variation in the change of three education group shares (tertiary, secondary, compulsory or less education) among newly arriving immigrants from 30 different origin countries in 106 local labour markets in Switzerland between 1980 and 2010. The fundamental challenge to the econometrician is the likely endogeneity of educational wage differentials to the influx of immigrants with different educational backgrounds, a point which has not been sufficiently acknowledged in the literature on cross-country immigration flows. 3 We deal with this concern by using a proxy for shifts of local relative demand for workers with different educational backgrounds which is orthogonal to immigration inflows.
In particular, as suggested by Autor & Dorn (2013), we exploit the fact that local labour markets with a higher specialisation in routine occupations, owing to their industry structure in 1970, experienced stronger adoption of computer and automation capital in later decades. More technology adoption, in turn, led to a more pronounced polarisation of the wage and employment structure in these local labour markets.
The basic idea is that the adoption of computer capital substituted for workers in occupations with a high routine task content, lowering their wages and their employment, while increasing wages and employment of workers with complementary skills in high-skill abstract occupations and low-skill non-routine manual occupations. As Michaels et al. (2014) show, this polarisation of labour demand goes hand in hand with a polarisation of the demand for education, as broad occupation groups correspond closely with education levels. 4 Consequently, the share of routine employment of a local labour market in the pre-1980 era serves as a good proxy for exogenous changes to the relative demand for workers with different educational backgrounds during the computerisation era starting around 1980. The routine share has been widely employed in the literature on job polarisation as a proxy for relative demand shifts induced by technology. 5 In principle, the relative demand for workers with different educational backgrounds could also be driven by other demand shifters, e.g. offshoring.
Yet, the existing evidence suggests that technology has been the major source of wage changes in the developed world (Katz & Autor, 1999). 6 Our empirical analysis provides two main results. First, we find that two factors in the framework suggested by Grogger & Hanson (2011) stand out as the main drivers of the skill composition of newly arriving immigrants: Education supply in origin countries and the relative demand for skills in destinations. Our findings show, that a 1 percentage point increase in the share of a particular education group in the origin country leads to a close to 1:1 increase in the shares of highly and middle educated workers and a slightly lower increase in the share of low educated workers. However, education supply can only explain a fraction of the observed changes in destinations in the case of highly educated workers and mis-predicts the sign of the average change in the case of middle educated workers. This underscores the importance of accounting for the role of relative demand in destinations. Confirming our expectation, we estimate a positive effect of routinisation on the share of highly educated workers, a negative effect on the share of immigrants with a middle education whereas the point estimate for low educated immigrant workers cannot be distinguished from zero.
This highlights technological change as a particular source for demand driven immigration. Taken together, supply and demand slightly over-predict the observed change in the share of highly educated workers in Swiss destinations while explaining the small decrease in the share of middle educated workers and the larger loss of low educated workers relatively precisely. The estimates are very robust to controlling for a host of alternative explanations which might drive changes to the skill composition of immigrants. In particular, we show that accounting for origin country changes in educational wage differentials, changes to the income distribution and controlling for the general performance of the economy (proxied by changes to GDP per capita in PPP) does not affect our estimated coefficients.
Furthermore, we show that the effect of routinisation is also robust to controlling for ethnic networks, which are generally regarded as a powerful pull-driver of immigration (Card, 2001; Bartel, 1989). We find that ethnic networks are particularly important for low educated immigrants while they have no effect on highly educated workers. In addition, we show that relative demand shifts induced by routinisation are the most powerful demand factor explaining the change in the skill composition of immigrants, while alternative determinants of the relative demand, such as offshoring, are less important. This adds to the literature showing similar findings for natives or the general workforce (Michaels et al., 2014; Autor et al., 2013b; Goos et al., 2011).
4 As we show below, workers with tertiary education are overrepresented in non-routine abstract occupations and workers with low educational backgrounds are overrepresented in low skill service occupations while workers with a middle educational background overwhelmingly work in routine occupations.
5 See e.g. Goos et al. (2009), Goos et al. (2010) or Autor & Dorn (2013). Acemoglu & Autor (2011) provide an extensive overview of the relevant literature.
6 We document the importance of routinisation and other drivers of the relative demand for workers with different education background in the robustness section of our results.
As a second main finding, our results suggest that the integration of Switzerland into the European labour market after 2002 had, if anything, an adverse effect on the skill composition of migrants. That is, the increase in the share of highly educated workers from the EU was lowered relative to those from other countries. In contrast, the decrease of the share of low educated workers was attenuated for workers from the EU relative to those from other countries. To identify the effect of the labour market integration, we use a difference-in-difference estimator comparing changes in the education shares of immigrants from the EU to those of other countries for which immigration restrictions were not relaxed after 2002 while controlling for economic drivers as before. As may be expected, we find that the effect of lowering immigration restrictions on the skill composition was strongest in case of old EU member states, for which immigration quotas were phased-out completely already in 2007. 7 In contrast, the effect is not distinguishable from zero in case of new EU member states for which some quotas were kept in place until 2011. These estimates are robust to controlling for country group specific trends, and earlier country group specific immigration restrictions.
The related literature on the effect of a change in immigration restrictions on the skill composition of immigrants is rather scarce. Kato & Sparber (2013) show that the reduction of available H1B visas for skilled workers in 2003 had a negative effect on the quality of student applications to U.S. universities. They argue that the reduced working opportunities after graduation might have deterred high ability students more than lower ability students who would not have been able to apply for an H1B visa anyway. Huber & Bock-Schappelwein (2014), on the other hand, find that Austria's accession to the European Economic Area (EEA) in 1994 and the associated integration of the labour market reduced the share of poorly educated immigrants from the EU compared to other countries. As Huber & Bock-Schappelwein (2014) point out, immigrants to Austria prior to 1994 were negatively selected due to the low returns to education in Austria compared to other European countries. Thus, the liberalisation of labour market access had the strongest effect on middle skilled foreign workers, for whom the net benefits of immigration were close to zero before and positive thereafter. In Switzerland, in contrast, immigrants were already very positively selected prior to the accession to the EU labour market. Thus, net benefits of immigration were positive for highly skilled immigrants whereas the net benefits were close to zero for middle skilled and negative for poorly skilled immigrants. Consequently, the relaxation of immigration restrictions had the strongest effect on foreign workers at the lower tail of the skill distribution, which explains our findings. 8
7 The cautious interpretation of this effect has to acknowledge the fact that applicants for residency permits from non-EU countries have been subject to quotas and skill requirements since the early 1990s: "In deciding whether to grant residence permits, the professional qualifications of applicants and their professional and social adaptability, language skills and age must also indicate that there is a prospect of lasting integration in the Swiss job market and the social environment" (Bundesbehörden, 2014). Thus, this effect is the difference in the policy treatment of immigrant workers from EU origin countries, which changed in 2002, compared to the policy treatment of non-EU workers for which the policy did not change.
We build on a large literature on the selection of immigrants, i.e. which workers along the skill distribution find it most beneficial to migrate and how this affects the scale of immigration to destinations. In his seminal contribution, Borjas (1987, 1999) shows that immigrants are positively selected if the returns to education are higher in the destination than in the origin country and negatively selected in the opposite case. Then, changes to the general wage level, the return to education and changes to migration costs in the destination relative to the origin country affect which parts of the skill distribution of workers in the origin country find it beneficial to migrate. Based on this framework, Mayda (2010) and Ortega & Peri (2013) analyse the effect of economic drivers and changes to immigration restrictions on the general magnitude of immigration between different countries and to the U.S. (see also Clark et al., 2007). Our paper is most closely linked, however, to Grogger & Hanson (2011), who study the relative stock of immigrants with different educational backgrounds from various origin countries in a cross-section of destination countries. While they find that destinations with higher wage differentials experience more positive sorting, i.e. have a higher share of highly educated workers from a particular origin country, their analysis remains essentially silent on where higher wage differentials across destinations might originate and how sorting changes over time.
Our paper is also related to the literature on routinisation and directed technical change. We show that changes in long-run labour demand triggered by technological change may have major implications for international migration flows, in particular for their skill composition. This aspect has only been cautiously mentioned in the seminal contribution of Autor & Dorn (2013) and has not gained much attention otherwise. Furthermore, our findings show that long-run labour demand changes, such as routinisation, are very persistent and local in nature, which underscores Borjas (2001)'s critique of the part of the immigration literature which treats past settlement of immigrants as orthogonal to current changes in labour demand.
The remainder of this paper is organised as follows. Section 2 gives a short motivation of the theoretical framework and introduces the empirical strategy. Sections 3.1 and 3.2 discuss the data and the routinisation measure. Section 3.3 establishes a series of stylised facts about employment and wage polarisation in Switzerland and points to a set of interesting differences between native and immigrant workers. Subsequently, the findings of the empirical analysis are discussed in Section 4. Section 5 concludes.

Conceptual framework and empirical approach
In this section, we illustrate the forces driving the immigration decision of workers with different educational backgrounds using a simple model of immigrants' self-selection based on income maximisation in the fashion of Roy (1951). Using Grogger & Hanson (2011)'s adaptation of the Roy model to three education groups (low, middle and high), we derive a sorting equation which explains why different destinations receive immigrants with different educational backgrounds. Subsequently, we explain how we identify the different factors using variation across local labour markets, origin countries and time, and take our sorting equation to the data in Section 3.

Sorting of immigrants across local labour markets
We consider the stock of migrant workers from many origin countries in many destinations. The basic ingredient in Grogger & Hanson (2011)'s adaptation of the Roy model is a set of separate migration decisions for workers with primary, secondary and tertiary education. Specifically, worker $i$ with education $e$ from origin country $o$ evaluates the utility from migrating to destination $j$ based on the following linear utility function

$$U^e_{i,o,j} = \alpha \left( W^e_{i,j} - C^e_{i,o,j} \right) + \varepsilon^e_{i,o,j} \tag{1}$$

where $W^e_{i,j}$ and $C^e_{i,o,j}$ are education specific wages and migration costs, respectively, and $\varepsilon^e_{i,o,j}$ is an unobserved idiosyncratic term. The wage of worker $i$ is given by

$$W^e_{i,j} = \exp\left( \mu_j + \delta^2_j D^2_i + \delta^3_j D^3_i \right)$$

where $\exp(\mu_j)$ is the wage of a primary educated worker, $\delta^2_j$ ($\delta^3_j$) is the return to secondary (tertiary) education and $D^e_i = 1$ if the worker has education level $e$. The cost of migrating from origin country $o$ to destination $j$ is a function of fixed costs $f_{o,j}$ and education specific costs $g^e_{o,j}$:

$$C^e_{i,o,j} = f_{o,j} + g^1_{o,j} D^1_i + g^2_{o,j} D^2_i + g^3_{o,j} D^3_i$$

Grogger & Hanson (2011) take Equation (1) as a first-order approximation of a more general utility function with $\alpha > 0$ as the marginal utility of income. 9 Staying in the origin country is modelled as the migration costs being zero. We follow the literature (McFadden, 1974; Grogger & Hanson, 2011) in assuming that the errors, $\varepsilon^e_{i,o,j}$, follow an i.i.d. extreme-value distribution. 10 Then, assuming that agents decide whether and where to emigrate by maximising their utility, we can write the log odds of migrating to destination $j$ versus staying in the origin country $o$ for a worker with education level $e$ as

$$\ln \frac{L^e_{o,j}}{L^e_o} = \alpha \left( W^e_j - W^e_o \right) - \alpha f_{o,j} - \alpha g^e_{o,j} \tag{2}$$

where $L^e_{o,j}$ constitutes the population share of workers with education level $e$ from origin country $o$ in destination $j$ and $L^e_o$ is the population share of workers with education $e$ staying in $o$. Equation (2) characterises the scale of immigration, i.e. the number of workers with education $e$ who decide to emigrate to destination $j$ from origin country $o$. The scale of immigration depends positively on the skill-related wage difference net of migration costs. Now, the skill composition of immigrants in a destination $j$ from origin country $o$ is just the relative scale of immigration from this country of workers with different educational backgrounds. For concreteness, we can write down separate scale equations for workers with secondary ($e = M$) and tertiary education ($e = H$), take the difference and rearrange to

$$\ln \frac{L^H_{o,j}}{L^M_{o,j}} = \alpha \left( W^H_j - W^M_j \right) - \alpha \left( W^H_o - W^M_o \right) - \alpha \left( g^H_{o,j} - g^M_{o,j} \right) + \ln \frac{L^H_o}{L^M_o} \tag{3}$$

Equation (3) describes the relative sorting of tertiary versus secondary educated migrant workers from origin country $o$ across destinations $j$. The relative number of highly to middle educated workers increases (i) if the wage difference between education groups in the destination $j$ increases, (ii) if the same difference decreases in the origin country, (iii) if migration costs fall more for highly educated workers, or (iv) if the supply of highly educated workers increases in the origin country.
9 Grogger & Hanson (2011) also derive predictions using log-utility functions in the fashion of Borjas (1987, 1999). In their empirical analysis, however, they show that both linear and log-utility lead to very similar predictions of parameters in the sorting equation, on which we focus here. The reason is, as they argue, that sorting on log differences in wages is very similar to sorting on level differences in wages in a sample of destinations with similar labour productivity.
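The step from the scale equation to the sorting equation can be checked numerically. The following is a minimal sketch, assuming a multinomial logit choice model with linear utility as described above; all wages, costs, population counts and the value of `ALPHA` are made-up illustrative numbers, and the function names are hypothetical. It uses head counts rather than population shares, which leaves the log ratios in Equation (3) unchanged.

```python
import math

ALPHA = 0.5  # marginal utility of income (illustrative value, not an estimate)


def logit_shares(wage_origin, wages_dest, costs_dest, alpha=ALPHA):
    """Return [share staying, share in destination 0, share in destination 1, ...]
    implied by i.i.d. extreme-value errors (multinomial logit)."""
    utils = [alpha * wage_origin]  # staying: migration costs are zero
    utils += [alpha * (w - c) for w, c in zip(wages_dest, costs_dest)]
    denom = sum(math.exp(u) for u in utils)
    return [math.exp(u) / denom for u in utils]


# Hypothetical wages (arbitrary units) for middle (M) and highly (H) educated workers.
wage_origin = {"M": 2.0, "H": 3.0}
wages_dest = {"M": [3.0, 3.5], "H": [5.0, 4.5]}
fixed_cost = [0.5, 0.8]                          # f_{o,j}
edu_cost = {"M": [0.4, 0.4], "H": [0.2, 0.3]}    # g^e_{o,j}
pop_origin = {"M": 600.0, "H": 400.0}            # origin-born workers by education

# Stock of workers by education: index 0 = stayers, index j + 1 = destination j.
stock = {}
for e in ("M", "H"):
    costs = [f + g for f, g in zip(fixed_cost, edu_cost[e])]
    shares = logit_shares(wage_origin[e], wages_dest[e], costs)
    stock[e] = [pop_origin[e] * s for s in shares]


def sorting_lhs(j):
    """Log ratio of highly to middle educated migrants in destination j."""
    return math.log(stock["H"][j + 1] / stock["M"][j + 1])


def sorting_rhs(j):
    """Right-hand side of the sorting equation; the fixed cost f_{o,j} cancels."""
    return (ALPHA * (wages_dest["H"][j] - wages_dest["M"][j])
            - ALPHA * (wage_origin["H"] - wage_origin["M"])
            - ALPHA * (edu_cost["H"][j] - edu_cost["M"][j])
            + math.log(stock["H"][0] / stock["M"][0]))


for j in range(2):
    print(j, round(sorting_lhs(j), 6), round(sorting_rhs(j), 6))
```

Both columns of the printed output should coincide for each destination, which illustrates that the fixed migration cost drops out of the relative sorting of education groups.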

Empirical approach
The sorting Equation (3) makes predictions about how the number of immigrants with tertiary education relative to those with secondary education from a specific origin country $o$ varies across destinations $j$. To characterise the change in sorting, we can add time indices, $t$, representing decades, to Equation (3) and take first differences:

$$\Delta \ln \frac{L^H_{o,j,t}}{L^M_{o,j,t}} = \alpha \Delta \left( W^H - W^M \right)_{j,t} - \alpha \Delta \left( W^H - W^M \right)_{o,t} - \alpha \Delta \left( g^H - g^M \right)_{o,j,t} + \Delta \ln \frac{L^H_{o,t}}{L^M_{o,t}} \tag{4}$$

To take this expression to the data, we need information on wages in Swiss commuting zones, the destinations in our case, and in origin countries, as well as information on the skill supply in origin countries and relative migration costs. However, one concern is that the change of these wage measures, $\Delta (W^H - W^M)_{j,t}$, is likely to be endogenous to immigration (see e.g. Borjas, 2001). Although a large part of the literature investigates this particular relationship, the existing literature studying drivers of immigration has largely ignored this concern. 11 Instead of using local wage measures, we suggest a different route using direct proxies for local relative demand shifts which affect educational wage differentials but are not affected by immigration. A well established proxy for such local demand shifts is a region's "initial" share of routine employment, which we denote by $RSH_{j,t}$. 12 This measure was first introduced by Autor & Dorn (2013) for local labour markets, but the idea of routine intensity as a proxy for relative demand shifts affecting the wage differential of workers with different educational backgrounds and skills has found wide application in the literature on skill-biased technical change and job polarisation. 13 The basic intuition is simple. Computers (or automation capital, more generally) are a close substitute for workers employed in jobs with a large share of routine manual or routine cognitive tasks, such as assembly line workers or bank clerks. The continuously falling price of IT capital over the past decades has led firms to substitute computers for these workers and has driven down their wages relative to other workers. On the other hand, computer capital complemented workers employed in managerial or professional occupations engaged in abstract, problem-solving tasks. Consequently, the adoption of computers increased the demand for these workers, raising their wages and employment.
10 This specification of the disturbance term assumes that independence of irrelevant alternatives (IIA) applies among destinations. In our empirical application, we consider different local labour markets within Switzerland as destinations, thus we need only that IIA applies to the destinations in the sample (Grogger & Hanson, 2011). This assumption is supported by Borjas (2001), who shows that, conditional on having arrived in a certain country, immigrants pick the location which offers the highest reward for their particular skill. As Grogger & Hanson (2011) show, we can test this assumption by dropping one destination at a time in our regressions and investigating the stability of our estimated coefficients.
In this process, the demand for occupations at the bottom of the wage and education distribution, which are primarily engaged in non-routine manual tasks (such as waiters, cleaners or security guards), also increased, raising their wages and employment. Thus, labour demand polarised with wages and employment increasing at the top and at the bottom relative to the middle. 14 Indeed, Autor & Dorn (2013) show for the U.S. that regions with a larger share of employment in routine occupations at the beginning of their sample experience stronger polarisation subsequently. 15 Consequently, we expect that regions with a larger initial share of routine employment experience larger positive demand shifts for highly educated workers relative to middle educated workers, inducing their wage difference to increase. In contrast, we expect these regions to experience larger negative demand shifts for middle educated workers relative to low educated workers, inducing their relative wage difference to decrease. Thus, we can write down the empirical versions of the sorting equation (4) for highly relative to middle educated workers and for middle relative to low educated workers, where we substitute changes to wage differences in destinations with a region's routine share:

$$\Delta \ln \frac{L^H_{o,j,t}}{L^M_{o,j,t}} = \beta_1 RSH_{j,t} + \beta_2 \Delta \left( W^H - W^M \right)_{o,t} + \beta_3 \Delta \ln \frac{L^H_{o,t}}{L^M_{o,t}} + \alpha_c + \alpha_t + \alpha_o + \epsilon_{o,j,t} \tag{5}$$

$$\Delta \ln \frac{L^M_{o,j,t}}{L^L_{o,j,t}} = \beta_1 RSH_{j,t} + \beta_2 \Delta \left( W^M - W^L \right)_{o,t} + \beta_3 \Delta \ln \frac{L^M_{o,t}}{L^L_{o,t}} + \alpha_c + \alpha_t + \alpha_o + \epsilon_{o,j,t} \tag{6}$$

where $\Delta$ takes differences over decades, $t$, i.e. $\Delta x_t = x_{t+1} - x_t$. We include the following fixed effects to account for relative migration costs. Canton fixed effects, $\alpha_c$, control for fixed institutional backgrounds, language regions, taxes and fixed amenities on a higher regional level than $j$. Time fixed effects, $\alpha_t$, control for the fact that all regions and origin countries might face different circumstances in different decades. Finally, origin country fixed effects, $\alpha_o$, control for constant differences between origin countries such as the distance to Switzerland.
If the routine share of a location captures a relative demand shift for high and low educated workers relative to middle educated workers, we expect $\beta_1$ to be positive in Equation (5) and negative in Equation (6). In contrast, we expect that improving labour market conditions for highly (middle) educated workers in origin countries induce their migrating numbers to decrease relative to middle (low) educated workers ($\beta_2 < 0$). Finally, we expect the supply of each education group in origin countries to be positively correlated with its relative number among immigrants ($\beta_3 > 0$).
A considerable drawback of estimating Equations (5) and (6) is that the magnitude of the coefficients is hard to interpret. To make things more transparent, we estimate specifications similar to those of Autor & Dorn (2013):

$$\Delta EDUSH^E_{j,o,t} = \beta^E_1 RSH_{j,t} + \beta^E_2 \Delta X^E_{o,t} + \beta^E_3 \Delta EDUSH^E_{o,t} + \alpha_c + \alpha_t + \alpha_o + \epsilon^E_{j,o,t} \tag{7}$$

where $\Delta EDUSH^E_{j,o,t}$ is the decennial change in the share of education group $E$ in the total of immigrants from origin country $o$ in destination $j$ between decades $t$ and $t+1$. We consider low, middle and highly educated workers, i.e. $E \in \{L, M, H\}$. $\Delta X^E_{o,t}$ represents the change in the relative labour market conditions of education group $E$ in the origin countries and $\Delta EDUSH^E_{o,t}$ is the change in the education share in origin countries. This specification has the advantage that the coefficients now represent the percentage point change in the education share of immigrants from origin country $o$ in destination $j$ associated with a one unit change in each regressor. While $\beta^E_3$ should be positive for all education groups, we expect that $\beta^H_1, \beta^L_1 > 0$ and $\beta^M_1 < 0$ according to the reasoning above. However, Michaels et al. (2014) note that the effect on the poorly educated group may be ambiguous.
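To make the structure of such a share regression concrete, the following is a minimal sketch on synthetic data, assuming plain OLS with canton, decade and origin country dummies. The coefficient values in `true_beta`, the panel dimensions and all variable names are hypothetical; the sketch ignores weighting and standard errors, and it is not the estimation code used for the results in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical panel dimensions: commuting zones, cantons, origin countries, decades.
n_cz, n_canton, n_origin, n_decade = 20, 5, 10, 3
true_beta = np.array([0.8, -0.3, 1.0])  # made-up beta_1, beta_2, beta_3

cz, origin, decade = np.meshgrid(
    np.arange(n_cz), np.arange(n_origin), np.arange(n_decade), indexing="ij"
)
cz, origin, decade = cz.ravel(), origin.ravel(), decade.ravel()
canton = cz % n_canton  # each commuting zone belongs to one canton

# Synthetic regressors: routine share by CZ and decade, origin-country changes
# in labour market conditions and in education shares by origin and decade.
rsh = rng.uniform(0.2, 0.5, size=(n_cz, n_decade))
dx = rng.normal(size=(n_origin, n_decade))
dedu_origin = rng.normal(scale=0.1, size=(n_origin, n_decade))
X = np.column_stack([rsh[cz, decade], dx[origin, decade], dedu_origin[origin, decade]])

# Synthetic fixed effects and outcome (change in the immigrant education share).
fe = (rng.normal(size=n_canton)[canton]
      + rng.normal(size=n_decade)[decade]
      + rng.normal(size=n_origin)[origin])
y = X @ true_beta + fe + rng.normal(scale=0.005, size=len(cz))


def dummies(codes, n_levels):
    """Indicator columns for levels 1..n_levels-1 (level 0 is the omitted base)."""
    return (codes[:, None] == np.arange(1, n_levels)[None, :]).astype(float)


design = np.column_stack([
    X,
    np.ones(len(y)),            # intercept
    dummies(canton, n_canton),  # canton fixed effects
    dummies(decade, n_decade),  # time fixed effects
    dummies(origin, n_origin),  # origin country fixed effects
])
beta_hat = np.linalg.lstsq(design, y, rcond=None)[0][:3]
print(beta_hat)  # should be close to true_beta because the noise is small
```

Dropping one level per set of dummies and adding an intercept keeps the design matrix of full column rank, so the three slope coefficients are uniquely determined.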
We finish this section with a note on our proxy for relative demand shifts in the destinations j.
Certainly, employing the initial routine share of employment as suggested by Autor & Dorn (2013) is not the only way to proxy for local relative demand shifts. Alternative ways include exploiting local trade shocks as in Autor et al. (2013a,b), or more generally, exploiting a region's initial industrial structure in combination with exogenous national employment shifts as suggested by Bartik (1991). 16 We stick to the routine share as this illustrates one particular channel affecting the skill composition of immigrants and investigate alternative channels in the robustness part of section 4.

Data, measurement and stylized facts
In this section, we first provide summary information on how we combine data from the Swiss census and from origin countries, with further details deferred to the online Data Appendix. Secondly, we outline how we measure relative demand shifts at the level of local labour markets. Thirdly, we present a set of stylised facts on the polarisation of the Swiss labour market which underscores the relevance of using the routine intensity of local labour markets as proxies for relative demand shifts for skills.

Immigrants, education groups & destinations
We use data from the Swiss Census, which constitutes a complete inventory count of the population for the years 1970, 1980, 1990, 2000. For the years 2010 to 2012, we use the annually conducted structural survey which replaced the Census. This survey contains a representative, 3% sample of the total population. 17 As we break down this data into commuting zone, origin country and education 16 Recent applications of what is commonly known as Bartik instruments include Notowidigdo (2011) or education specific as Peri et al. (2014) and Moretti (2004). 17 This new "census" takes place annually with the 31 December as the day of reference (see Swiss Federal Statistical Office, 2011 for more). Due to this major change and some other redefinitions of variables (see the online data appendix for details), one has to compare aggregate statistics over time with some caution. Moreover, for some of the variables there were many missing observations which could not be included in the analysis. We compared many of the results with other datasets such as the Swiss Earnings Structure Survey (SESS) or the Swiss Labour Force Survey (SAKE) and group cells, we pooled the structural surveys from 2010 to 2012 to gain more accuracy. Our sample consists of individuals of age 16 to 64 who report nonzero working hours. Labour supply is measured in full time equivalents based on weekly hours worked. Workers in the structural surveys were weighted using the official sampling weights.
We classify individuals into natives and recent immigrants according to their country of birth.
Recent immigrants are non-Swiss born, having arrived in Switzerland not more then 5 years before the Census wave. 18 Among recent immigrants, we distinguish workers from 30 different origin countries based on their country of residence 5 years ago. 19 As the Swiss Census does not distinguish different places of origin for immigrants from Ex-Yugoslavia and the former Czechoslovakia prior to the 2010, we aggregate immigrants from all available countries of former Yugoslavia and aggregate immigrants from the Czech Republic and Slovakia in the Census 2010 to 2012 waves.
Workers were classified into three education groups using the International Standard Classification of Education (ISCED) following Peri (2005). Highly educated workers hold a tertiary degree (ISCED 5 and 6), whereas middle educated workers hold a degree from a secondary school (ISCED 3 and 4).
Poorly educated workers are those with compulsory education only or less (ISCED 0, 1 and 2).
For our destinations, we make use of a time-consistent definition of local labour markets provided by the Swiss Statistical Office which has been widely used in the applied literature. 20 The Swiss Statistical Office segments Swiss municipalities into 106 commuting zones (CZs) which are characterised by strong commuting-ties within CZs and weaker commuting ties across CZs. CZs represent internally homogenous labour markets with an orientation towards a centre and represent the closest approximation of functionally independent local labour markets employed in the theoretical model of Autor & Dorn (2013). An additional advantage of this definition is that these CZs may be aggregated into 16 larger labour markets to check robustness of our analysis.
Using these definitions, we collapse our dataset into year, CZ, country group and education group cells for recent immigrants. One non-negligible issue is the presence of zero or missing bilateral migration stocks. As Grogger & Hanson (2011) point out, based on the law of large numbers, theory would predict all bilateral stocks to be positive, though some might be very small. Yet, zero migration stocks might occur in finite populations if bilateral migration probabilities are very small. We deal with this by setting all empty cells to zero in the years 1970 to 2000, since for those years we have a full inventory of the resident population in Switzerland. If, however, for any CZ observations were missing for all education groups, the calculation of education shares is mathematically not defined and, hence, such a CZ was treated as a missing observation. Since we cannot rely on a full inventory count for t ≥ 2010, we treated all empty cells as missing. In section 4.4, we demonstrate that our results are robust to alternative treatments of empty migration cells.

17 (continued) therefore are confident that our dataset yields representative results.
18 In Censuses 2010 to 2012, the information on the year of arrival is missing in some entries. In this case, we classified foreign-born residents as recent immigrants if they had a short-term residency permit (B, L) and as earlier immigrants if they had a long-term permit (C).
19 Using the last residency country reflects more closely the immigration decision in the sense of Grogger & Hanson (2011) compared to just using the country of birth as origin. However, the correlation between the two classifications of origin is very high in our data.
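The treatment of empty cells can be illustrated with a short sketch. It assumes that an empty cell is represented by `None`, and that in the survey years shares are computed over the observed cells only; both are assumptions made for illustration, as are the names `fill_empty` and `education_shares`.

```python
GROUPS = ("high", "middle", "low")

def fill_empty(count, year):
    """Empty cells become zero in full-census years (1970-2000), stay missing afterwards."""
    if count is None and year <= 2000:
        return 0.0
    return count

def education_shares(counts, year):
    """Education shares for one CZ/origin/year cell group, or None if undefined.

    counts maps group name -> headcount; a missing key or None marks an empty cell.
    """
    raw = [counts.get(g) for g in GROUPS]
    if all(c is None for c in raw):
        return None  # no observation in any education group: shares undefined
    filled = [fill_empty(c, year) for c in raw]
    total = sum(c for c in filled if c is not None)
    if total == 0:
        return None  # guard against division by zero
    # Assumption: cells still missing after filling get a missing share.
    return {g: (None if c is None else c / total) for g, c in zip(GROUPS, filled)}
```

For a census year, `education_shares({"high": 30, "middle": 70}, 1990)` assigns a zero share to the empty low-education cell, whereas the same counts in 2010 leave that share missing.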

Origin country information
We complement the Swiss Census data with data on origin countries from various sources to control for origin country push drivers in our baseline sorting regression equation. We calculate the shares of education groups in origin countries using data from Barro & Lee (2013), who report the percentage of the population with some type of educational attainment (completed and uncompleted) for the population aged 15 or older. We define 'no schooling attainment' and 'primary schooling attainment' as poorly educated, 'secondary schooling attainment' as middle educated and 'tertiary schooling attainment' as highly educated. 21

We use data from the Luxembourg Income Study (LIS) 'Key Figures' (Version 3) for income-based Gini coefficients and to construct education-specific wage measures by origin country as follows. Since a number of comparison issues arise when working with educational attainment information directly using the LIS, we follow Grogger & Hanson (2011) and use the quantiles of a country's earnings distribution to gauge wages of different education groups. The LIS provides information on the ratio of the 90th percentile to the 10th percentile and of the 90th percentile to the median for various countries' earnings distributions in different years. 22 We approximate median income by origin country with GDP per capita from Heston et al. (2011) and use the ratios from the LIS to gauge incomes for the 90th and 10th percentile. We use the median wage as our wage measure for middle educated workers, and the 90th and 10th percentile as wage measures for highly and low educated workers, respectively. 23 Table A1 presents summary statistics of all variables used and Table A2 presents the list of origin countries ranked by the number of immigrants in Swiss local labour markets.
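The construction of the education-specific wage measures can be sketched as follows. The function name and argument names are hypothetical; the arithmetic only restates the description above, with GDP per capita standing in for the median and the two LIS percentile ratios scaling it.

```python
def education_wages(gdp_per_capita, p90_p10, p90_p50):
    """Approximate wages by education group for one origin country and year.

    gdp_per_capita proxies the median wage (middle educated);
    p90_p50 and p90_p10 are the LIS percentile ratios.
    """
    w_middle = gdp_per_capita
    w_high = w_middle * p90_p50   # 90th percentile = median * (P90 / P50)
    w_low = w_high / p90_p10      # 10th percentile = P90 / (P90 / P10)
    return {"high": w_high, "middle": w_middle, "low": w_low}
```

With a GDP per capita of 20000, a P90/P50 ratio of 2 and a P90/P10 ratio of 4, the sketch returns wages of 40000, 20000 and 10000 for the high, middle and low groups.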

Measuring routine intensities
A crucial ingredient in our analysis is a measure of routine task intensity as a proxy for relative demand shifts. We measure the routine task specialisation of a CZ using its occupational composition of employment. To this end, we merge job task requirements from the Dictionary of Occupational Titles (DOT 1977) to ISCO-88 occupations available in the Swiss Census in order to measure the routine, abstract and manual task content of each occupation. 24 We thereby assume that the skill requirements of occupations in Switzerland are similar to those of their U.S. counterparts. 25 The DOT provides an assessment of the skill requirements of each U.S. Census occupation assigned by experts on a zero to ten scale. Thus, each occupation comprises multiple task requirements at different levels of intensity. 26 Following Autor & Dorn (2013) we combine the three task measures to create a summary measure of routine intensity RTI by occupation:

\[ RTI_k = \ln\left(T^R_{k,1980}\right) - \ln\left(T^M_{k,1980}\right) - \ln\left(T^A_{k,1980}\right) \tag{8} \]

where $T^R_{k,1980}$, $T^M_{k,1980}$ and $T^A_{k,1980}$ are the routine, manual and abstract task inputs in each occupation k in 1970. 27 This measure is increasing in the importance of routine tasks in each occupation and declining in the importance of manual and abstract tasks. Table 1 reports the share of education groups, the task scores from the DOT (standardized to have mean zero and standard deviation one) and the routine intensity as defined by Equation (8). [...] These occupations typically employ workers with a middle level of education. 30 Service occupations and elementary occupations at the bottom of the wage distribution have low average education levels and low levels of routine task inputs in combination with rather high scores on the manual task. Table A3 shows the standardised task requirement score, averaged over all workers for each education group. Unsurprisingly, this table confirms that workers with a middle education level work in occupations with the highest routine task content, whereas low educated workers work in manual jobs and highly educated workers in occupations with a high abstract task content. Essentially, this finding confirms what Michaels et al. (2014) have found for several other OECD countries.

To measure routine intensity at the level of local labour markets, we proceed in the following way. First, we identify the set of occupations in the top employment-weighted third of routine task-intensity in 1970. 31 These occupations are subsequently referred to as routine-intensive occupations. Next, we calculate for each commuting zone j the employment share of these routine-intensive occupations, $RSH_{j,t}$, as:

\[ RSH_{j,t} = \left( \sum_{k} L_{jkt} \cdot 1\left[ RTI_k > RTI^{P66} \right] \right) \left( \sum_{k} L_{jkt} \right)^{-1} \]

where $L_{jkt}$ is the employment in occupation k in commuting zone j and decade t. 1[.] is an indicator function taking a value of one if occupation k is in the top employment-weighted third of routine task-intensity in 1970.

21 We use the population weighted means from Albania, Serbia, Croatia and Slovenia to calculate the education measure for 'Ex-Yugoslavia' and of the Czech Republic and Slovakia for the measure of 'Czechoslovakia'.
22 We linearly interpolate the ratios in missing years between available waves and extrapolate trends up to 10 years to minimise the loss of observations.
23 In the LIS only data from Slovenia is available for the group of Ex-Yugoslavian countries. As the absolute wage differences by education group might be important, as Grogger & Hanson (2011) point out, we calculated the weighted means of GDP per capita of Albania, Serbia, Croatia and Slovenia for Ex-Yugoslavian countries as the median income measure and then used the percentile ratios of Slovenia to gauge the wages by education groups in all Ex-Yugoslavian countries.
24 Autor & Dorn (2013) provide a measure for routine, abstract and manual task content for US 2000 census occupations (occ2000) from the Dictionary of Occupational Titles 1977. These three task aggregates were collapsed from originally five task measures first used in Autor et al. (2003). We use a crosswalk from the US National Crosswalk Service Center (NCSC) to match these variables to the International Standard Classification of Occupations (ISCO-88) available in the Swiss Census. See online data appendix for more details.
25 Knowing that both countries lie at the world technology frontier (e.g. Caselli & Coleman, 2006) we believe this assumption to be reasonably satisfied for most occupations. In a similar way, Goos et al. (2009) use task requirement information from the Occupational Information Network (O*NET), the successor of the DOT, to build a measure of routine intensity for ISCO-88 occupations in different European countries. We use task measures from the DOT as the information on the task content of occupations should be predetermined and we use 1980 as the first year of our baseline analysis. We checked the robustness of results using instead the task measures from the O*NET data base.
26 Autor & Dorn (2013) show, for instance, that cognitive abstract skills are most important in professional and managerial occupations, whereas manual skills are most important in in-person service occupations such as cleaning and health care. Routine task input is most dominant either in clerical occupations (for routine-cognitive tasks) or in machine operator or assembly occupations (for routine-manual tasks).
27 Each task is measured on a one to ten scale, with ten meaning that the task is most heavily used in this occupation.
28 We experimented with slightly different classification systems. Acemoglu & Autor (2011) and Autor (2010) for instance suggest allocating part of ISCO88 group 9 (elementary occupations) to ISCO88 group 8 (operators) while allocating the remaining occupations to ISCO88 group 5 (service occupations) according to the task content of these ISCO88 subgroups. For the sake of clarity, we decided to follow Goos et al. (2009) and report all results for ISCO88 main groups. Taking the classification of Acemoglu & Autor (2011) does not significantly alter the results. See the discussion in the online data appendix.
29 No wage data prior to that date for these occupations. For wages, we used the Swiss Labour Force Survey (SAKE).
30 Note that plant and machine operators have a relatively high share of low educated workers combined with a relatively high manual task input. The manual task requirement of this occupation group, however, is mainly of a routine-manual
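The two steps, computing an occupation-level routine intensity and then a local routine employment share, can be sketched in a few lines. This is an illustration under stated assumptions: task scores are strictly positive (the logarithm is undefined otherwise), the boundary rule for the "top employment-weighted third" (add occupations in descending RTI order until one third of employment is covered) is one possible convention, and all function names and occupation labels are hypothetical.

```python
import math

def rti(routine, manual, abstract):
    """Routine task intensity: ln(routine) - ln(manual) - ln(abstract).

    Assumes strictly positive task scores; math.log raises ValueError otherwise.
    """
    return math.log(routine) - math.log(manual) - math.log(abstract)

def routine_intensive_set(rti_by_occ, emp_by_occ, top_share=1 / 3):
    """Occupations in the top employment-weighted share of the RTI distribution.

    Assumption: occupations are added in descending RTI order until the
    cumulative employment share reaches top_share.
    """
    total = sum(emp_by_occ.values())
    selected, cumulative = set(), 0.0
    for occ in sorted(rti_by_occ, key=rti_by_occ.get, reverse=True):
        if cumulative >= top_share * total:
            break
        selected.add(occ)
        cumulative += emp_by_occ[occ]
    return selected

def rsh(emp_in_cz, routine_occs):
    """Employment share of routine-intensive occupations in one commuting zone."""
    total = sum(emp_in_cz.values())
    routine = sum(n for occ, n in emp_in_cz.items() if occ in routine_occs)
    return routine / total
```

With national employment of 40, 30 and 30 in three occupations ranked by RTI, only the first falls into the routine-intensive set; a commuting zone employing 10 of its 50 workers in that occupation then has a routine share of 0.2.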

Job and wage polarization in Switzerland
In our empirical analysis, we use the employment specialisation in routine occupations of a local labour market as a proxy for shifts to the relative demand for workers with different educational backgrounds.
A large literature has documented the association between initial specialisation in routine employment and the subsequent adoption of computer capital polarising the wage and employment distribution for most developed countries (Autor & Dorn, 2013; Michaels et al., 2014; Goos et al., 2009). Dustmann et al. (2009) point out, however, that there are potentially important cross-country differences, e.g. due to institutions, in how these technology shocks affect the occupational employment and wage distribution in detail. For Switzerland, these trends have not been documented satisfactorily over a long time period and using detailed information on occupations and tasks in accordance with the most recent academic literature. 32 In this subsection we document, first, that job and wage polarisation are also pervasive features of long-run trends in the Swiss labour market affecting both natives and recent immigrants on the national level. Second, we show that job polarisation also affects the skill composition in local labour markets depending on their initial specialisation in routine occupations.

Job and wage polarisation on the national level
Figure 2 shows the changes in the employment shares of our 1-digit ISCO groups, where we excluded occupations related to agriculture, separately for natives and recent immigrants (Table A4 provides details and Figure B1 shows the results for the total labour force). Occupations in each pane of Figure 2 are ranked by the median wage from the pooled Swiss Labour Force Surveys (SLFS) 1991 through 1993. 33 As can be seen, natives as well as recent immigrants are subject to polarisation. In fact, for recent immigrants the patterns seem to be even more pronounced than for natives. For example, the employment share of managers (ISCO 1) almost doubled on average for recent immigrants in every decade, growing from 2.7% in 1980 to almost 15% in 2010, whereas for native workers it grew from about 6% to 11%. On the other hand, the share of craftsmen (ISCO 7) fell from over 41% in 1980 to less than 16% in 2010 among recent immigrants. For natives, it changed from 23% to about 14%. Finally, the fraction of poorly paid workers in service and elementary occupations (ISCO 5 and 9) stayed at somewhat more than 20% among recent immigrants, and around 15% for natives throughout our time horizon (one reason why the pattern for recent immigrants is stronger may be that the group more closely resembles a flow of workers, rather than a stock as in the case of natives).

30 (continued) type which is also subject to automatisation.
31 We performed robustness checks using alternative cut-off levels for defining routine-intensity. None of our results crucially hinged on this choice.
32 Oesch & Menés (2011) compare job polarisation in Switzerland, the UK, Spain and Germany. For Switzerland, they rely on a relatively small sample from the Swiss Labour Force Survey and a relatively short time span between 1991 and 2008, i.e. when computerisation was already well underway. Splitting the employment distribution into earnings quintiles, they find employment growth only at the top of the earnings distribution. Favre et al. (2012) and Müller et al. (2013) find an increase in wage-inequality at the top of the wage distribution relative to the middle but do not rely on occupations for their analysis as we do here. Consequently, they do find very different results for wage changes at the bottom.
According to the routinization argument, we would expect a similar picture for occupational wages: relatively strong growth for abstract and service occupations and more modest growth for routine occupations. Figure 3 plots the change in the mean log hourly wages by occupation groups. As we have to rely on the small sample size of the SLFS, we aggregated ISCO main occupations according to their task content into four groups: non-routine occupations (service), routine manual occupations (craft workers & operators), routine cognitive occupations (clerks) and non-routine abstract occupations (managers, professionals and technicians). We ranked those groups again by their mean log wage in 1991. 34 Evidently, wage growth is most pronounced at the bottom, with gains of about 0.07 log points in real terms for manual service workers, and at the top, where wages of abstract workers increased by about 0.05 log points. In contrast, wage gains of craftsmen and operators (employed in routine manual jobs) and clerks (representing routine cognitive workers) were considerably more modest (0.005 and 0.015 log points in real terms, respectively).

33 Appropriate wages for ISCO categories are not available prior to 1991, the first year of the SLFS. We pool the SLFS of three years in order to get reasonably large numbers of observations for each two-digit ISCO category.
To sum up, paralleling trends in most other OECD countries, we find that in Switzerland, routine occupations show decreasing employment shares with losses most pronounced in routine manual occupations (operators and craft) but also in routine cognitive jobs (clerks). On the other hand, abstract occupations at the top of the wage distribution (managers, professionals and technicians) as well as [...]

[...] instance, the simple OLS prediction (without controls) would predict that Zurich, with an initial routine intensity of 42%, would experience a 12 percentage points higher increase in highly skilled labour (0.43 × 0.3 × 100) compared to Schwarzwasser, whose routine share was only 12% in 1980. 35 In contrast, initial routine intensity is negatively related to the subsequent change in the share of middle educated workers as Panel B of Figure 4 shows. Finally, Panel C shows a positive relationship between $RSH_{j,1980}$ and the change of the share of poorly educated workers. As mean growth of the share of poorly skilled workers is negative and positive in the case of middle skilled workers, this might seem puzzling at first sight. However, this reflects the international trend of skill upgrading: if the supply of skills becomes generally more biased towards highly skilled workers, this growth may offset relative demand shifts originating from technical change (see also Michaels et al., 2014). Nevertheless, Panels B and C suggest that regions with larger routine specialisation experience stronger (weaker) demand for poorly (middle) educated workers than regions with a small routine input in 1980.

34 Cf. the notes of the figure. As the SLFS samples are relatively small, we aggregated ISCO 5 and 9 into service occupations, ISCO 7 and 8 into craft/operators, ISCO 4 as clerks and ISCO 1 to 3 into managers/professionals/technicians.
In what follows, we document that the polarising skill demand offers a key explanation for the skill composition of new immigrants.

OLS estimates
While the last section provided some preliminary, graphical evidence about the effect of routinization on skill demand, this section takes our empirical counterpart of the sorting equation, Equation (7), directly to the data. Table 2 reports the results of our baseline specifications. The dependent variable in Panel A is the decennial change in the share of highly skilled recent immigrant workers. Panel B shows the results for the middle skilled and Panel C for the poorly skilled share among recent immigrant workers. [...] We confirm this conjecture in Table A5 in the Appendix which shows the regression results for equations (5) and (6) [...] Column (1) of Table 2 [...]

In column (2), we add controls for the changes in the relative skill supply and remuneration in origin countries. Note that this substantially reduces the number of observations, as we do not have both of these measures for all origin countries and decades. For changes in skill supply, measured as changes to education stocks in origin countries reported by Barro & Lee (2013), we find that a change of the highly skilled labour share abroad translates almost one to one into a higher educated immigrating labour force, as posited by the model of Grogger & Hanson (2011) [see Equation (4)]. The same holds true for middle skill employees, whereas the estimations point to a slightly lower reaction in the case of low-skill migrants. The latter might indicate a smaller labour market mobility of low-skill compared to high-skill workers, a fact which is well documented in the literature (see, for example, Bartel, 1989 or Malamud & Wozniak, 2012). The effects for relative demand shifts in origin countries, measured in changes to wage differences of education groups as suggested by Grogger & Hanson (2011), are all estimated to be close to zero and sometimes have the wrong sign. Although they have the right sign for highly educated immigrant workers, we would have expected a positive sign for low educated workers, a positive sign for $\Delta(w^H - w^M)_{o,\tau}$ and a negative sign for $\Delta(w^M - w^L)_{o,\tau}$ in the case of middle educated workers. These findings seem to suggest that relative demand shifts in origin countries are less important once we account for demand shifts in destinations and shifts to education supply. The minor importance of push drivers, especially income levels in origin countries, has also been acknowledged by Mayda (2010) for the general magnitude of migration flows.

36 These weights reflect the importance of each cell for the aggregate picture and also account for the likely inaccuracy of very small cells. We explore different weighting schemes in section 4.4.
Column (3) includes the change of per capita GDP in origin countries relative to the change of Swiss per capita GDP, in order to control for other omitted push and pull variables which influence the labour markets, in particular the business cycle. Higher GDP growth abroad tends to decrease the share of poorly skilled migrants whereas it has a smaller or insignificant effect on middle and highly skilled migrants.

2SLS estimates
One potential concern for the identification of our demand effect in destinations is that cyclical factors may affect a commuting zone's industrial composition, and hence its routine intensity, in the short run and at the same time influence the skill composition of immigrants. If this were the case, using the routine intensity at the beginning of the decade would lead to a biased estimate of the demand effect. This point has also been highlighted by Autor & Dorn (2013), from whom we borrow the identification strategy of relative demand shifts. To give an example in our context, note that a couple of very routine-intensive commuting zones, like Grenchen, La-Chaux-de-Fonds, La Vallee or Jura (see Figure 4), were all dominated by the Swiss watch making industry. Task inputs in this industry were highly routine intensive in the 1980s. During the 1970s and 1980s, Swiss watchmakers saw their global market share plummeting due to international competition. This may have released a large share of such a region's workforce out of routine jobs into jobs with a more abstract or service task content. To the degree this cyclical spike to final demand for watches also affected the subsequent skill demand in these regions, it confounds our identification strategy.
To purge our main regressor from this kind of variation, we follow Autor & Dorn (2013) and use an imputed measure for the routine share, $\widetilde{RSH}_j$, as an instrument. More specifically, by predicting the local routine intensity using the countrywide routine share of industries and the local industry composition, we obtain an exogenous measure of local changes to labour demand:

\[ \widetilde{RSH}_j = \sum_{i} L^{Natives}_{i,j,1970} \times RSH^{Natives}_{i,-j,1970} \]

$L^{Natives}_{i,j,1970}$ represents native employment in industry i as a share of total native employment in CZ j in 1970. $RSH^{Natives}_{i,-j,1970}$ represents the routine share of native workers in industry i in all CZs except j at the start of our time span, 1970. 37 Since we use the 1970 census to calculate our IV and start the sample for the regressions in 1980, our instrument should be uncorrelated with cyclical spikes in, for example, final demand. By using only native employment for the calculation of the IV, we gain additional confidence that the variation of our instrument is exogenous in the regressions for recent immigrants. 38 The last five columns in Table 2 show the resulting instrumental variable estimates. Column (4) corresponds to the OLS specification in column (1) and, again, confirms our expectation that routinization has a negative impact on the demand for middle-skill migrants. In contrast, we again find the opposite effect on the demand for highly and poorly skilled migrants.
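The leave-one-out construction of the instrument can be sketched as follows. The nested-dictionary layout, the function name `imputed_rsh` and the handling of industries that exist in only one commuting zone (they are skipped) are assumptions made for illustration, not details reported in the text.

```python
def imputed_rsh(cz, industry_emp, routine_emp):
    """Imputed routine share for one commuting zone (Bartik-style, leave-one-out).

    industry_emp[z][i]: native employment in industry i, commuting zone z, 1970.
    routine_emp[z][i]:  native employment in routine-intensive occupations
                        within industry i, commuting zone z, 1970.
    """
    local_total = sum(industry_emp[cz].values())
    value = 0.0
    for industry, emp in industry_emp[cz].items():
        # Routine share of the industry in all commuting zones except cz.
        other_total = sum(e.get(industry, 0) for z, e in industry_emp.items() if z != cz)
        other_routine = sum(r.get(industry, 0) for z, r in routine_emp.items() if z != cz)
        if other_total == 0:
            continue  # assumption: skip industries absent from all other zones
        value += (emp / local_total) * (other_routine / other_total)
    return value
```

Because the industry routine shares exclude the commuting zone itself, a local shock to one zone's own routine employment does not enter its predicted value.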
Column (5) again shows the results for our full baseline specification, given by equation (7) and column (6) adds relative GDP per capita growth to proxy for other changes on the labour market. All coefficients for our demand shifter, RSH, and the education supply measure prove stable and highly significant.
As the change in the wage differences induced from income percentile ranks in origin countries might be measured with some error, we replace it with the Gini index, similar to other studies (Mayda, 2010;Clark et al., 2007). Again, changes to the relative inequality in origin countries seem to be of minor importance for the change of the skill composition of immigrants once relative demand in destinations and education supply is accounted for.
Finally, to control for still other omitted push variables, or for potential deficiencies of our foreign wage measures, column (7) adds origin-country fixed effects interacted with time dummies. Reassuringly, our estimates for the RSH coefficient prove to be robust even to this very demanding specification. Furthermore, the F-statistics for all 2SLS estimates are well above the conventional threshold of 10, which makes us confident that our IV strategy works generally well (Stock et al., 2002).

37 More precisely, we take native workers employed by industry i in routine-intensive occupations as a share of total native employment in industry i.
38 We use only native employment to calculate the routine intensity in 1970 to address the potential problem that immigrants could be clustered in routine occupations. This could result in the problem that a high routine employment share in a region actually reflects high past immigrant employment, which could drive future immigrant inflows through ethnic networks (Card, 2001). We ran regressions with the routine intensity including and not including past immigrants without any impact on the results.

Notes: ***, **, * denote statistical significance at the 1%, 5% and 10% level, respectively. Robust standard errors (clustered by Canton and origin country) are given in parentheses. All models include fixed effects for Cantons, origin countries and decades. Regressions are weighted using the total number of recent immigrants from origin country o in destination j at the beginning of the decade as weight. $RSH_{j,t}$ is instrumented with the imputed $RSH_{j,1970}$. $\Delta EDUSH^E_{o,\tau}$ is the decennial change in the share of education group E ∈ {H, M, L} in origin country o and decade τ. $\Delta(w^H - w^M)_{o,\tau}$ and $\Delta(w^M - w^L)_{o,\tau}$ are the decennial changes in the wage differential between highly and middle and between middle and low educated workers in origin country o and decade τ, respectively. $\Delta GDPPC_{o,\tau}$ and $\Delta GINI_{o,\tau}$ represent the decennial change in GDP per capita and the Gini index in origin country o and decade τ. See section 3.1 for a more detailed description of variables.
To summarise, our results prove to be very robust and confirm the hypotheses posited by the selection model of Grogger & Hanson (2011) in combination with routinisation. While education supply affects the skill composition of recent immigrants almost one to one, we find that relative demand shifts in destinations are particularly important. There is a highly significant relation between the routine intensity of a CZ and the subsequent growth of the share of high skill immigrants. The opposite holds true for the share of middle skill labour. In the case of the low skill labour share, on the other hand, the results for our main regressor turn out to be insignificant and close to zero. These results for highly, middle and poorly skilled workers broadly correspond to the results that Michaels et al. (2014) have found for the wage bill shares across OECD countries.

39 As a check of the particular specification of our baseline regression model, we report results for equations (5) and (6)

Robustness to omitted pull factors in destinations
Routinization may not be the only factor driving the skill composition of immigrants. In this section, we analyse the influence of ethnic networks and offshoring, as another potential driver of the relative demand for workers with different skills.
The location choices of current immigrants may be strongly influenced by the location decisions of their compatriots who have immigrated earlier (see, for example, Card, 2001 or Bartel, 1989). Hence, if for some reason earlier immigrants settled in routine-intensive commuting zones at the beginning of our time horizon and this affected current inflows of their compatriots, the coefficient of our routine measure would be biased. We follow Cadena & Kovak (2013) and include the population share of immigrants from origin country o in destination j in 1970 in our regressions in order to measure the influence of ethnic networks. As can be seen in column (1) of Table 3, controlling for such network effects has no effect on the estimates of the relative demand shifts and education supply. Interestingly, ethnic networks seem to play an important role in the case of poorly educated workers, whereas for highly skilled individuals they have only minor effects. This finding is in line with the results of Bartel (1989), who finds for the U.S. that more educated immigrants are less likely to be found in cities with a high proportion of a similar ethnic group.
In the recent literature on wage inequality, task offshoring or the impact of trade exposure is considered one of the most important competing explanations for long-run changes to the relative demand for skills (Michaels et al., 2014). In particular, it could be the case that firms are actually not replacing workers by capital, but simply move routine-intensive tasks to countries with lower wages for routine labour. Several authors, see e.g. Autor & Dorn (2013) or Goos & Manning (2007) and the references therein, suspect that routine-intensive tasks are more offshorable than others (however, Blinder & Krueger, 2013 surprisingly find this not to be the case). As a result, regions with a more routine-intensive production would experience more offshoring and, therefore, their share of middle educated workers would decrease faster. Although Autor et al. (2013b) showed that trade exposure and technology shocks are essentially uncorrelated on the geographical level in the U.S., we control for offshoring here to gain more certainty that technology adoption really drives our results. In so doing, we matched several offshorability measures to our dataset (see online data appendix for details).
Column 2 in Table 3 shows the results for the offshorability of ISCO-occupations calculated by Goos et al. (2011). Column 3 adds a measure for the offshorability of skills provided by Blinder & Krueger (2013) and column 4 adds an offshorability measure initially calculated for U.S. occupations by Autor & Dorn (2013). As may be expected, the skill offshorability measure as well as the Autor-Dorn measure indicate that middle skill labourers are more prone to offshoring than high-skill workers.
More importantly, however, the coefficients for our main regressor, RSH, prove to be stable, and offshorability clearly plays a minor role. This confirms the results of Michaels et al. (2014), Autor & Dorn (2013) and Goos et al. (2011) who all find that routinization -not offshorability -is the key driver of labour market polarization.
Finally, we introduce origin-country fixed effects interacted with CZ-dummies to control for still other omitted factors on the level of destinations and origin countries (note that we already controlled for time-origin country fixed effects in column 8 of Table 2). Reassuringly, our results also prove robust against this very demanding specification.

4.3 The effect of the changes in immigration policy on the skill composition of immigrants
An important but difficult question to tackle is whether and to which degree immigration policy can influence the skill composition of migrants that a country attracts. From this perspective, Switzerland's integration into the European labour market serves as an interesting policy experiment in which immigration restrictions for newly arriving immigrants from EU countries were abolished while immigration restrictions for other countries were kept in place. In this subsection, we show how we can incorporate policy changes into our framework to analyse the effect of liberalisation on the skill composition of immigrants. We first give some additional background information on the changes to immigration policy in Switzerland since the 1980s.

Swiss Immigration Policy between 1980 and 2010
Swiss immigration policy throughout the 1980s was dominated by the desire to find a more balanced approach to govern immigration after several decades of low skill immigration. After the boom years in the aftermath of WWII, immigration was initially facilitated, yet global quotas put in place in the 1970s proved to be largely ineffective as major channels of immigration were de facto exempted (Sheldon, 2007). In the early 1990s policy makers decided to discriminate between immigrants from EU/EFTA countries and third party countries. The goal of this distinction was to facilitate immigration from EU/EFTA countries while immigration from other countries was subject to a stronger focus on highly educated workers with larger obstacles for the low skilled (Bundesrat, 1991). De facto, however, global quotas for immigrants from all origins were maintained. With the enactment of the Agreement on the Free Movement of Persons with the EU in 2002, the distinction between immigrants from EU and Non-EU countries became even more pronounced. While quotas for Non-EU citizens were kept in place, the integration of the Swiss and European labour market followed a specific schedule. In the public policy debate in Switzerland, it is often stipulated that the integration into the European labour market caused a major upgrading in the skill composition of immigrants (Economiesuisse, 2011). Yet, the causal link between the change in policy and the change in the skill composition has not been analysed rigorously so far. Interestingly, as Panel A of Figure 5 shows, the share of highly educated workers increased sharply both for EU and Non-EU immigrants in the 1990s but levelled off for immigrants from the EU in the last decade. On the other hand, the share of low educated workers decreased throughout for both groups while, again, this fall levelled off in the case of EU immigrants during the last decade.
An inspection of group specific growth rates presented in Figure B2 reveals that the increase in the share of highly educated workers in the 1990s is driven by the fact that this was the only education group experiencing non-negative growth for both EU and Non-EU countries whereas the other two education groups experienced a reduction in their number of workers. Between 2000 and 2010, however, only the number of highly and middle educated workers from Non-EU countries increased while the number of low educated workers decreased. For EU countries on the other hand, the number of workers increased in all education groups between 2000 and 2010, most interestingly also for low educated workers.

Measuring the effect of changes to immigration policy
The existing evidence on the response of the skill composition of immigrants to changes in immigration restrictions is rather scarce. With respect to the magnitude of immigration flows more generally, there is some agreement that changes in immigration restrictions affect immigration flows rather immediately and strongly (Ortega & Peri, 2013; Mayda, 2010) and may also influence the region immigrants originate from if restrictions are discriminatory (Clark et al., 2007). Among the only two studies investigating the effects of the policy on the skill compositions directly, Kato & Sparber (2013)
To analyse the effect of new immigration laws empirically, we exploit the fact that the policy distinguished between immigrants from EU countries and immigrants from other origin countries, and that the policy changed over time. As the liberalisation of the Swiss labour market for EU citizens after 2002 is the most far-reaching change in immigration policy, we start by investigating whether the skill composition of immigrants from European origin countries changed differentially after 2002 compared to Non-European immigrants, controlling for economic drivers of immigrant sorting. Specifically, we augment regression specification (7) in the following way:
where $EU^{2000}_{o,t}$ is one for all EU countries affected by the AFMP in the decade between 2000 and 2010 and zero otherwise. 43 $\beta^E_4$ then measures the degree to which education shares of immigrants from EU countries changed differentially between 2000 and 2010 compared to the change of EU and Non-EU immigrants prior to the AFMP, conditional on covariates. Hence, the policy effect, as we measure it here, is the deviation, attributable to the differential policy treatment, of the change in the education shares of the affected group (the EU countries) from a common trend shared by immigrants from all countries.
The differential policy treatment is the combination of abolishing the quotas for EU citizens while sustaining quotas, together with skill requirements, for non-EU citizens. Crucially, the identification of $\beta^E_4$ as the causal effect of the policy hinges on the assumption that no factors omitted from the regression could have led to differential trends in the change of the skill composition of immigrants from EU vs Non-EU countries after 2002. As we control for a large range of time-varying labour market characteristics in origin countries and for fixed origin country characteristics, we are confident that these concerns are already addressed to a large extent. 44 Yet, as the discussion of the change in immigration policy in Switzerland since the 1980s suggests, it is likely that changes to the skill composition of immigrants from the EU and other countries started to follow different trends already after 1990. To account for this possibility, we check the robustness of $\beta^E_4$ by controlling for separate trends in a second step.
43 $EU^{2000}_{o,t}$ is the product of two indicators, one for EU origin countries and one for the last decade, i.e. $EU^{2000}_{o,t} = [1(o \in EU) \cdot 1(\tau = 2000s)]_{o,t}$.
44 Using region specific education group shares for each origin country in 1980 to control for mean reversion delivers similar results. Results are available upon request.
45 Another important assumption for the interpretation of $\beta^E_4$ as a causal effect is the exogeneity of $EU^{2000}_{o,t}$. This essentially means that liberalisation measures were not introduced for immigrants of those countries whose skill composition showed a favourable trend from the perspective of the policy maker. We think that this assumption can be justified on the grounds that free movement of persons with the EU was not a sought-after goal of Swiss policy makers. To the contrary, the AFMP was requested by the European Union as part of a larger deal of bilateral packages (involving mostly trade agreements) between the EU and Switzerland.
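The logic of the policy indicator can be illustrated with a minimal sketch. It builds the indicator as the product of an EU-origin dummy and a last-decade dummy and then computes an unconditional two-by-two difference-in-differences of mean share changes. The country codes, the set of EU origins and all share changes below are invented for illustration only; the estimates in the paper additionally condition on covariates, fixed effects and weights, which this sketch omits.

```python
# Hypothetical toy data: country codes and share changes are made up.
EU_ORIGINS = {"DE", "FR"}  # assumed set of treated origin countries


def eu2000(origin: str, decade: str) -> int:
    """Policy indicator: 1(o in EU) * 1(decade = 2000s)."""
    return int(origin in EU_ORIGINS) * int(decade == "2000s")


# (origin, decade, change in the share of highly educated immigrants)
cells = [
    ("DE", "1990s", 0.10), ("DE", "2000s", 0.04),
    ("FR", "1990s", 0.12), ("FR", "2000s", 0.06),
    ("US", "1990s", 0.09), ("US", "2000s", 0.11),
    ("IN", "1990s", 0.11), ("IN", "2000s", 0.13),
]


def group_mean(is_eu: bool, decade: str) -> float:
    """Unweighted mean share change for one origin group and decade."""
    values = [dy for o, d, dy in cells
              if (o in EU_ORIGINS) == is_eu and d == decade]
    return sum(values) / len(values)


# Change over time for EU origins minus change over time for other origins.
did = ((group_mean(True, "2000s") - group_mean(True, "1990s"))
       - (group_mean(False, "2000s") - group_mean(False, "1990s")))

treated_cells = [(o, d) for o, d, _ in cells if eu2000(o, d) == 1]
print(f"treated cells: {treated_cells}")
print(f"difference-in-differences: {did:+.2f}")
```

In this toy example the indicator switches on only for EU origins in the 2000s, and the resulting contrast is negative because the invented EU share changes slow down while the non-EU ones do not.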
In the framework of Grogger & Hanson (2011), changes to the immigration policy alter, ceteris paribus, the relative costs of immigration in the sorting Equation (4), i.e. $(g^H - g^M)_{j,o,t}$ and $(g^M - g^L)_{j,o,t}$. Immigrant sorting is affected if costs change differentially for education groups. For instance, if migration costs fall more for middle educated than for highly educated workers, i.e., $\Delta (g^H - g^M)_{j,o,t} > 0$, the inflow of highly educated workers relative to middle educated workers will fall. It is natural to assume that the integration into the European labour market reduced migration costs for all education groups. Yet, the abolition of immigration quotas has most likely affected the net benefits of education groups differentially. As we lack direct information on immigration costs which vary across origin countries, education groups and over time, we can only approximate the effect of the changes in immigration policy indirectly by analysing the differential response of immigrants from affected and unaffected countries conditional on a large series of covariates. Thus, what we capture with $\beta^E_4$ is the differential change in net benefits of migration of education groups.
Table 4 shows the results of estimating versions of specification (11) for each education group. For comparison, column 1 repeats the baseline specification without origin country labour market controls.

Discussion of main results
Column 2 reports the effect of introducing the policy dummy, $EU^{2000}_{o,t}$. For EU countries between 2000 and 2010, the increase in the share of highly educated immigrants was about 12 percentage points lower than in the control group. The effect on middle educated immigrants is estimated to be around zero, whereas the effect for low educated immigrant workers is positive and significant (15 percentage points higher). Next, we analyse whether this effect is sensitive to controlling for labour market characteristics in origin countries by adding the change in educational wage differences (column 3) or the growth rate of real GDP per capita and the change in the Gini index (column 4).
Reassuringly, point estimates remain within the 95% confidence bands of the effect estimated in column 2. When we include all origin country controls in column 5, the estimated effect of the policy is still significant and negative in the case of highly educated workers and strongly significant and positive in the case of low educated workers. Although the AFMP eventually liberalised Swiss labour market access for all European countries, it is likely that the differential abolition of quotas for immigrants from EU17 and EU10 countries led to a heterogeneous response in these two cases. We account for this possibility by allowing for separate effects of the AFMP policy on the skill composition of immigrants from the two groups, EU17 and EU10, respectively. Column 6 shows that the estimated effect for all European countries is actually a weighted average of a slightly larger and significant effect for old European member states and an effect which is estimated around zero for new member states of the EU. This makes sense since old member states (EU17) have been "treated" with completely unrestricted access since 2007, whereas access for immigrants from new member states (EU10) was still subject to quotas until 2011. These findings suggest that the opening of the Swiss labour market had, if anything, an adverse effect on the skill composition of immigrants. In contrast, Huber & Bock-Schappelwein (2014) find that the liberalisation of immigration from European countries into Austria has led to a fall of low-skill immigration compared to other countries.
How can their findings be reconciled with ours? From a theoretical point of view, the answer is that the effect of changes in immigration restrictions on the skill composition of immigrants depends on the education type of the so-called 'marginal immigrant', i.e. the skill group for which immigration costs and benefits roughly equalise. 46 Huber & Bock-Schappelwein (2014) point out that Austria "had the third lowest return to education for men and the 13th lowest for women among 26 developed countries." Consequently, immigration to Austria was selected from the lower tail of the skill distribution in the pre-liberalisation period (i.e., the relative benefits from migration were higher for low-skill than for high-skill migrants). The subsequent reduction of immigration costs increased the net benefits of immigration for all education groups but changed them from negative to positive for more skilled workers in the middle of the ability distribution, whose net benefits were close to zero before. In the Swiss case, immigrants were already positively selected prior to the liberalisation. In addition, the immigration policy of the early 1990s had a general focus on restricting immigration for low skilled immigrants from Non-EU countries. Thus, the marginal immigrant for whom net benefits of migration to Switzerland were zero prior to the AFMP would most likely have been located towards the lower end of the skill distribution, whereas net benefits for highly educated foreign workers were clearly positive. Consequently, the fall of migration costs across the board for EU origin countries had a larger effect on the sign of the net benefits of migration for low educated workers.

Robustness of policy effect
As already mentioned, interpreting the estimations above as a causal effect of the AFMP hinges on the assumption that the evolution of the immigrants' skill composition from EU and Non-EU countries was subject to similar trends prior to the AFMP. The validity of this assumption may be questionable, especially since policy makers already started to discriminate between the two country groups in the 1990s. Therefore, in Table 5 we control for pre-trends in our difference-in-difference analysis in various ways. Again for comparison, column 1 repeats the specification in column 5 of Table 4. Column 2 allows for differential linear time trends of EU and Non-EU countries, which decreases (increases) the point estimate of the effect for highly (low) educated immigrants slightly, while the point estimate for middle educated workers is still zero. In column 3, we allow for an additional differential change in the skill composition of EU immigrants in the 1990s ($EU^{1990}_{o,t}$), when policy makers introduced the discrimination regime between EU and Non-EU countries for the first time.
Interestingly, the effect becomes only slightly smaller in absolute value for highly educated workers whereas the effect for low educated workers drops almost by half. Although both effects are not significant for each education group individually, the hypothesis that both effects are jointly zero can be rejected at the 10% level for highly and at the 5% level for low educated immigrants. In addition, we can reject the hypothesis that the differential change in the skill composition of EU immigrants was similar in both decades at the 5% level and at the 1% level, respectively. Unsurprisingly, analysing these pre-trends separately for EU17 and EU10 origin countries paints a similar picture, with effects for EU17 countries generally being larger in absolute magnitude whereas the effects for EU10 countries are clustered around zero. From this analysis, we cannot completely reject the hypothesis that changes to immigration policy had no effect on the skill composition of immigrants. However, immigration policy was clearly of secondary importance compared to economic drivers like the demand in destinations and education supply in origin countries.
46 The assumption here is that net benefits are monotone in skills, as in Borjas (1987).

Extensions
A. Benchmarking factors that drive the skill composition
Average changes in the skill composition
To gauge the economic magnitude of our estimates, we compare the observed change in the skill composition of immigrants with its change predicted by relative labour demand in destinations, the skill supply in origin countries and the potential effects of immigration policy. We illustrate these effects for an average commuting zone using our estimates from Table 4 (column 5).
Between 1980 and 2010, commuting zones in Switzerland experienced an average increase in the share of highly educated recent immigrants of 7.7 percentage points per decade (pp/d), and decreases of 0.6 pp/d and 7.2 pp/d in the shares of middle and low educated recent immigrants, respectively. 47 Clearly, part of these changes is driven simply by the fact that the education supply changed in the countries where these newly arriving immigrants originate. On average, the share of highly and middle educated workers in the origin countries increased by 3.1 and 8.9 pp/d, respectively, whereas the share of low skill workers decreased by 12 pp/d. Our coefficients of the supply measure imply that these changes would have translated almost 1:1 into the changes in the skill composition of newly arriving immigrants in Switzerland: the share of highly and middle educated would increase by 2.57 pp/d (0.83 × 0.031) and 9 pp/d (1 × 0.089), respectively, and the share of low educated would decrease by 8.86 pp/d (0.73 × −0.12). Thus, the changes in supply clearly underestimate the observed change of highly educated workers, massively overestimate the change in the share of middle educated (with the wrong sign) and are about right concerning the share of low educated immigrants.
This highlights the importance of accounting for changes to the relative demand for workers with different educational backgrounds. The coefficients of $RSH_{j,t}$ imply that an average commuting zone with a share of 0.33 in routine employment in 1980 would have experienced an increase in the share of highly educated recent immigrants of 12 pp/d (0.36 × 0.33) and a decrease in the share of middle educated immigrants of 10 pp/d (−0.29 × 0.33), whereas the impact on low educated workers cannot be distinguished from zero. Adding the effects of supply and of demand in destinations together, our estimations imply that the share of highly educated among recent immigrants increased by roughly 14.7 pp/d whereas the shares of middle and low educated decreased by 0.7 and 8.8 pp/d, respectively.
Accounting for the effect of the policy, the increase in the share of highly educated workers was substantially lower for EU immigrants between 2000 and 2010 and amounted to only 5.7 pp (14.7 − 8.98). On the other hand, the share of low educated workers from the EU decreased at a slower rate of 1.7 pp (−8.8 + 7.16).
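The benchmarking arithmetic above can be retraced with a short sketch. It multiplies the quoted coefficients by the quoted average changes and adds the supply, demand and policy components. All inputs are the rounded figures printed in the text, so the results can deviate from the reported values in the first decimal; setting the routine-share coefficient for low educated workers to exactly zero is an assumption of this sketch, motivated by the statement that this effect cannot be distinguished from zero.

```python
# Rounded coefficients and average changes as quoted in the text.
SUPPLY_COEF = {"high": 0.83, "middle": 1.00, "low": 0.73}
SUPPLY_CHANGE = {"high": 0.031, "middle": 0.089, "low": -0.12}  # per decade

# Routine employment share (RSH) coefficients; the value for "low" is an
# assumption of this sketch (reported as indistinguishable from zero).
RSH_COEF = {"high": 0.36, "middle": -0.29, "low": 0.0}
RSH_1980 = 0.33  # routine employment share of an average commuting zone

# Policy effects for EU immigrants, 2000-2010, in percentage points,
# as used in the text.
POLICY_PP = {"high": -8.98, "low": 7.16}


def to_pp(share_change: float) -> float:
    """Convert a change in a share (0-1 scale) to percentage points."""
    return 100.0 * share_change


supply_pp = {g: to_pp(SUPPLY_COEF[g] * SUPPLY_CHANGE[g]) for g in SUPPLY_COEF}
demand_pp = {g: to_pp(RSH_COEF[g] * RSH_1980) for g in RSH_COEF}
total_pp = {g: supply_pp[g] + demand_pp[g] for g in SUPPLY_COEF}
eu_pp = {g: total_pp[g] + POLICY_PP[g] for g in POLICY_PP}

for g in ("high", "middle", "low"):
    print(f"{g:>6}: supply {supply_pp[g]:6.2f}  "
          f"demand {demand_pp[g]:6.2f}  total {total_pp[g]:6.2f} pp/d")
for g, value in eu_pp.items():
    print(f"{g:>6}: EU immigrants 2000-2010 incl. policy {value:6.2f} pp")
```

The decomposition is additive by construction, which makes it easy to see that the demand channel accounts for most of the predicted increase in the high educated share and offsets the supply-driven increase in the middle educated share.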
47 These averages are calculated using the population weight of country groups in destinations at the beginning of the decade.
Regional heterogeneity
The importance of relative labour demand as a driver of immigrant skills can be illustrated nicely by contrasting regions which were exposed to very different demand shifts.
With shares of 40% and 26% of employment in routine intensive occupations in 1980, Basel and Zurich Oberland, the rural area outside of the city of Zurich, were at the 75th and 25th percentile rank of the routine employment distribution, respectively. Over the next three decades, the share of highly educated immigrants increased from 27% to 71% (14 pp/d) in Basel, while the shares of middle and low educated decreased from 25% to 20% (-1.6 pp/d) and from 48% to 8% (-13.3 pp/d), respectively. In Zurich Oberland, in the meantime, the shares of highly and middle educated immigrants increased from 8% to 34% (8 pp/d) and from 17% to 39% (7 pp/d), respectively, whereas the share of low educated fell from 74% to 26% (-15 pp/d).
Although both regions faced very similar changes in the educational supply of immigrants, their skill composition changed very differently. 48 The difference in the routine intensity, however, explains to some degree the differential changes. The 13 pp difference in routine intensity translates into a 5 pp/d higher increase in the share of highly educated workers (of the 6 pp/d difference observed) and a 4 pp/d lower increase in the share of middle educated (8 pp/d difference observed).
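The regional comparison can be retraced in the same way. The sketch below multiplies the quoted routine-share coefficients by the 13 pp difference in routine intensity stated in the text and relates the result to the observed differences between Basel and Zurich Oberland. The observed gaps are taken as quoted (6 and 8 pp/d), and the sign convention, Basel minus Zurich Oberland, is an assumption of this sketch.

```python
# Routine employment share coefficients as quoted in the text.
RSH_COEF = {"high": 0.36, "middle": -0.29}

# Difference in 1980 routine intensity between the two regions, as stated
# in the text (the rounded shares of 40% and 26% would give 0.14).
ROUTINE_GAP = 0.13

# Observed differences in pp/d, Basel minus Zurich Oberland, as quoted.
OBSERVED_GAP_PP = {"high": 6.0, "middle": -8.0}


def predicted_gap_pp(coef: float, routine_gap: float) -> float:
    """Predicted difference in share changes, in pp per decade."""
    return 100.0 * coef * routine_gap


predicted = {g: predicted_gap_pp(c, ROUTINE_GAP) for g, c in RSH_COEF.items()}
explained = {g: predicted[g] / OBSERVED_GAP_PP[g] for g in predicted}

for g in predicted:
    print(f"{g:>6}: predicted {predicted[g]:+.1f} pp/d, "
          f"observed {OBSERVED_GAP_PP[g]:+.1f} pp/d, "
          f"ratio {explained[g]:.2f}")
```

Under these inputs the routine-intensity difference accounts for a larger part of the observed gap for highly educated than for middle educated immigrants, in line with the statement that it explains the differential changes only to some degree.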

B. Zero or missing bilateral immigration stocks
As pointed out in Section 3.1, we set cells with missing information of recent immigrants to zero for the Censuses 1980 to 2000. In so doing, we include many cells which would have otherwise dropped out of the sample. To demonstrate that our results are not sensitive to this treatment, Table A7 shows our baseline specification (cf. Table 2) excluding these cells. Our results again prove to be remarkably stable in light of the much smaller number of observations we have with this sample.

C. Weighting of cells
Next, we explore the robustness of the regressions to different weighting schemes. In our baseline regressions, we weighted cells using the total number of immigrants at the beginning of the decade as weights. Regression coefficients should then reflect average changes in the immigrant population, yet the picture for average changes on the local level could be slightly different as these weights do not necessarily reflect the different sizes of local labour markets. To check the influence of weighting, we instead weight using the total number of workers in Table A6 in the Appendix. While the OLS coefficients are estimated less precisely, coefficients for our 2SLS results remain remarkably stable.
48 As both regions are German speaking, we abstract here from the fact that the two regions might attract immigrants from different origin countries and that the distribution of origin countries across regions might have some power in explaining observed changes in the skill composition.

Conclusion
A little acknowledged feature of international migration to rich countries is that newly arriving immigrants are increasingly highly educated. Since 1980, the share of immigrant workers with tertiary education rose on average by 15 percentage points in OECD countries, and educational upgrading soared especially in some countries, such as Canada, Australia, the United Kingdom or Switzerland.
In this paper, we analyse the determinants of the skill composition of newly arriving immigrants from a long-run perspective using a framework of immigrant selection and sorting suggested by Grogger & Hanson (2011). Applying this framework to one particular destination country, Switzerland between 1980 and 2010, we can analyse the importance of origin country push drivers, such as changes in the education supply and the relative demand for workers with different educational background in origin countries, as well as economic pull drivers, such as changing relative demand for education groups in destinations, and changes to immigration policy or other migration costs. We focus on Switzerland, which continuously showed very high immigration rates and exhibited dramatic changes in the skill composition of immigrants. Unlike other 'traditional' immigration countries, however, the recent integration of Switzerland into the European labour market in 2002 constitutes an interesting policy experiment in which immigration restrictions were abolished for immigrants from the EU but not for immigrants from other countries. This allows studying the effect of changing immigration restrictions on the selection of immigrants with different educational backgrounds using a difference-in-difference design.
Our findings suggest that changes of education supply in origin countries and shifts to the relative demand for education groups stand out as the two most important drivers. Yet, while supply alone predicts only a modest increase in the case of highly educated workers and a large increase of middle educated workers, one particular demand channel, the polarisation of labour demand induced by the adoption of computer capital, is crucial to explain the sharp increase in highly educated workers and the mere stabilisation of the share of middle educated immigrant workers. Furthermore, our analysis reveals that the abolition of quotas for immigrants from European origin countries had a small effect but slightly reversed the trends in the change of the skill composition. Between 2000 and 2010, the share of highly educated workers increased at a significantly lower rate among recent immigrants from EU countries compared to other countries and the share of low educated workers decreased at a significantly lower rate. In the discussion of our results we argue that this finding can be reconciled with a situation in which immigrants were already very positively selected prior to the change in immigration policy in 2002. Thus, the reduction of immigration restrictions for all immigrants from European countries increased the propensity to immigrate more for education groups at the lower end of the skill distribution than for highly educated immigrants, for whom immigration was already beneficial prior to the policy change.
Notes: $\Delta EDUSH^{E}_{j,o,\tau}$ is the change in the share of education group $E \in \{H, M, L\}$ of immigrants from origin country $o$ in destination $j$ in decade $\tau$. $RSH_{j,t}$ is the share of employment of commuting zone $j$ working in routine occupations as defined in Equation (9).
Notes: ***, **, * denote statistical significance at the 1%, 5% and 10% level, respectively. Robust standard errors (clustered by Canton and origin country) are given in parentheses. All models include fixed effects for Cantons, origin countries and decades. Regressions are weighted using the total number of recent immigrants from origin country $o$ in destination $j$ at the beginning of the decade as weight. The supply measure is the change in the log relative number of high to middle or middle to low educated workers in origin country $o$ and decade $\tau$, respectively. $\Delta (w^H - w^M)_{o,\tau}$ and $\Delta (w^M - w^L)_{o,\tau}$ are the decennial changes in the wage differential between highly and middle and between middle and low educated workers in origin country $o$ and decade $\tau$, respectively. $\Delta GDPPC_{o,\tau}$ and $\Delta GINI_{o,\tau}$ represent the decennial change in GDP per capita and the Gini index in origin country $o$ and decade $\tau$. See section 3.1 for a more detailed description of variables.
Notes (decades 1980-1990, 1990-2000, 2000-2010): Census 1980-2010. High educated workers have a tertiary degree, middle educated workers a secondary degree and low educated workers compulsory schooling or less. Share of tertiary educated workers among each nationality group.