Analysis of social mobility using the M index: Exercises

Workshop at Universidad de Sevilla, November 16, 2019

Instructors: Ben Jann and Simon Seiler, University of Bern

Compute M index from table

Open the dataset ESS6.dta and produce a two-way table of respondent's class (class) by father's class (fclass) using the tabulate command. Then compute the M index manually from the relative frequencies (percentages) reported in the table.

Open data:

. use ESS6.dta, clear

. describe

Contains data from ESS6.dta
  obs:        24,069                          
 vars:            14                          13 Nov 2019 18:30
------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------------------
country         byte    %18.0g     country    country
year            int     %10.0g                survey year (ESS Round)
female          byte    %9.0g      female     gender (1 = female)
age             byte    %8.0g                 age
birthyr         int     %10.0g                birth year (survey year - age)
educ            byte    %25.0g     educ       respondent's education
meduc           byte    %44.0g     peduc      mother's education
feduc           byte    %44.0g     peduc      father's education
class           byte    %9.0g      class      respondent's class
mclass          byte    %9.0g      class      mother's class
fclass          byte    %9.0g      class      father's class
mhome           byte    %9.0g                 mother was homemaker
fisei           byte    %8.0g                 father's ISEI
misei           byte    %8.0g                 mother's ISEI
------------------------------------------------------------------------------------------
Sorted by: country

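Step 1: Use the tabulate command to obtain the three quantities needed for respondent's class by father's class: the joint proportions p(y_k,x_j) (cell percentages), the conditional proportions p(y_k|x_j) (column percentages), and the marginal proportions p(y_k) (the Total column).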

. tabulate class fclass, column cell nofreq

+-------------------+
| Key               |
|-------------------|
| column percentage |
|  cell percentage  |
+-------------------+

respondent |          father's class
  's class |     upper     middle      lower |     Total
-----------+---------------------------------+----------
     upper |     56.88      30.76      19.44 |     31.92 
           |     13.62       9.54       8.76 |     31.92 
-----------+---------------------------------+----------
    middle |     32.28      46.11      40.23 |     40.15 
           |      7.73      14.29      18.12 |     40.15 
-----------+---------------------------------+----------
     lower |     10.84      23.12      40.34 |     27.94 
           |      2.60       7.17      18.17 |     27.94 
-----------+---------------------------------+----------
     Total |    100.00     100.00     100.00 |    100.00 
           |     23.95      30.99      45.05 |    100.00 

Step 2: Convert the percentages to proportions and use them to compute M = sum_j sum_k p(y_k,x_j) * (ln p(y_k|x_j) - ln p(y_k)).

. display ///
> /// row 1 (respondent: upper class)
>       0.1362 * (ln(0.5688) - ln(0.3192)) /// col 1 (father: upper class)
>     + 0.0954 * (ln(0.3076) - ln(0.3192)) /// col 2 (father: middle class)
>     + 0.0876 * (ln(0.1944) - ln(0.3192)) /// col 3 (father: lower class)
> /// row 2 (respondent: middle class)
>     + 0.0773 * (ln(0.3228) - ln(0.4015)) /// col 1 (father: upper class)
>     + 0.1429 * (ln(0.4611) - ln(0.4015)) /// col 2 (father: middle class)
>     + 0.1812 * (ln(0.4023) - ln(0.4015)) /// col 3 (father: lower class)
> /// row 3 (respondent: lower class)
>     + 0.0260 * (ln(0.1084) - ln(0.2794)) /// col 1 (father: upper class)
>     + 0.0717 * (ln(0.2312) - ln(0.2794)) /// col 2 (father: middle class)
>     + 0.1817 * (ln(0.4034) - ln(0.2794)) //  col 3 (father: lower class)
.06352723
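
Typing rounded percentages by hand is error-prone, and the rounding limits precision. As a possible alternative, the sketch below pulls the unrounded cell counts from tabulate via matcell() and evaluates the same formula in Mata. The names F, P, Pc, Pr, and M are illustrative choices, not part of the workshop materials, and the code assumes ESS6.dta is already in memory.

```stata
* Sketch: compute M from the unrounded cell counts of the two-way table.
* Assumes ESS6.dta is in memory; F, P, Pc, Pr, and M are illustrative names.
quietly tabulate class fclass, matcell(F)   // F = matrix of cell counts

mata:
    F  = st_matrix("F")
    P  = F :/ sum(F)               // joint proportions p(y_k, x_j)
    Pc = F :/ colsum(F)            // conditional proportions p(y_k | x_j)
    Pr = rowsum(F) :/ sum(F)       // marginal proportions p(y_k)
    M  = sum(P :* (ln(Pc) :- ln(Pr)))
    st_numscalar("M", M)           // pass the result back to Stata
end

display "M = " scalar(M)
```

Because this version works from unrounded counts, its result should differ slightly from the hand-typed figure above, which is based on percentages rounded to two decimals.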

Compute M index using model predictions

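Compute the M index using predictions from multinomial logit models (mlogit) and confirm that the result agrees with the manual computation above. Expect a small difference around the fourth decimal place, because the manual computation uses table percentages rounded to two decimals.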

To obtain Pr(Y_i), estimate a reduced model, which is simply a multinomial logit without predictors. To obtain Pr(Y_i|X_i), estimate an extended model that includes father's class as a categorical predictor. After each model, use predict to obtain the predicted probabilities. The one complication is that, for each observation, you need the predicted probability of the class that was actually observed for that observation. The easiest approach is to generate three variables, one for each class, and then select the appropriate probability for each observation.

Step 1: Estimate reduced model and generate predictions

. mlogit class

Iteration 0:   log likelihood = -26166.564  
Iteration 1:   log likelihood = -26166.564  

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -26166.564                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.2294242    .015286   -15.01   0.000    -.2593842   -.1994642
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.3626209   .0158811   -22.83   0.000    -.3937473   -.3314946
------------------------------------------------------------------------------

. predict p0 p0_2 p0_3, pr

. replace p0 = p0_2 if class == 2
(9,663 real changes made)

. replace p0 = p0_3 if class == 3
(6,724 real changes made)

. drop p0_2 p0_3
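
The selection step above hard-codes the class values 2 and 3. As a hedged alternative, the sketch below loops over whatever outcome levels are present and uses the outcome() option of predict to request one probability at a time. The variable names p0_alt and pk_tmp are hypothetical, and the code assumes ESS6.dta is in memory.

```stata
* Sketch: pick each observation's predicted probability of its observed class
* by looping over the outcome levels instead of hard-coding class == 2, 3.
* p0_alt and pk_tmp are illustrative names (assumptions, not workshop code).
quietly mlogit class                      // reduced model, no predictors
quietly levelsof class, local(levels)     // observed outcome values
generate double p0_alt = .
foreach k of local levels {
    quietly predict double pk_tmp, pr outcome(`k')   // Pr(class == k)
    quietly replace p0_alt = pk_tmp if class == `k'
    drop pk_tmp
}
summarize p0_alt
```

If p0 from the step above is still in memory, p0_alt should match it up to float precision. Type drop p0_alt afterwards to keep the dataset tidy.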

Step 2: Estimate extended model and generate predictions

. mlogit class i.fclass 

Iteration 0:   log likelihood = -26166.564  
Iteration 1:   log likelihood = -24674.085  
Iteration 2:   log likelihood = -24633.704  
Iteration 3:   log likelihood =  -24633.62  
Iteration 4:   log likelihood =  -24633.62  

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(4)        =    3065.89
                                                Prob > chi2       =     0.0000
Log likelihood =  -24633.62                     Pseudo R2         =     0.0586

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.9711631   .0396074   -24.52   0.000    -1.048792   -.8935341
      lower  |  -1.293616   .0393186   -32.90   0.000    -1.370679   -1.216553
             |
       _cons |   .5664245   .0290227    19.52   0.000     .5095411    .6233079
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .4008732    .054843     7.31   0.000     .2933829    .5083635
      lower  |   1.093865   .0509433    21.47   0.000     .9940178    1.193712
             |
       _cons |  -1.091118   .0462314   -23.60   0.000     -1.18173   -1.000506
------------------------------------------------------------------------------

. predict p1 p1_2 p1_3, pr

. replace p1 = p1_2 if class == 2
(9,663 real changes made)

. replace p1 = p1_3 if class == 3
(6,724 real changes made)

. drop p1_2 p1_3

Step 3: Generate observation-level M values from predictions; the mean of these values is the M index

. generate double m = ln(p1) - ln(p0)

. drop p0 p1

. mean m

Mean estimation                   Number of obs   =     24,069

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           m |   .0636896   .0022315      .0593157    .0680635
--------------------------------------------------------------

. drop m
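
A useful cross-check: the sum of ln(p1) over all observations is the log likelihood of the extended model, and the sum of ln(p0) is the log likelihood of the reduced model, so the mean of m equals (ll1 - ll0) / N. With the values reported above, (-24633.62 + 26166.564) / 24,069 gives about 0.06369, in line with the estimated mean. The sketch below does the same with stored results; the scalar names ll0 and ll1 are illustrative, and ESS6.dta is assumed to be in memory.

```stata
* Sketch: M index from the log likelihoods of the two models.
* ll0 and ll1 are illustrative scalar names (not from the workshop code).
quietly mlogit class               // reduced model
scalar ll0 = e(ll)
quietly mlogit class i.fclass      // extended model
scalar ll1 = e(ll)
display "M = " ((scalar(ll1) - scalar(ll0)) / e(N))
```

This shortcut returns the point estimate only. The mean command is still needed for the standard error and confidence interval.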

Optional: Compute M index by country using model predictions

Compute the M index by country (country) using predictions from multinomial logit models.

You can do this either (1) by repeating the estimation process individually for each country or (2) by estimating joint models including country as a categorical predictor (and interactions with father's class in the extended model). The second approach is easier (less code).

First, inspect the country variable to get an overview (if fre is not yet installed, first type ssc install fre):

. fre country

country -- country
-----------------------------------------------------------------------
                          |      Freq.    Percent      Valid       Cum.
--------------------------+--------------------------------------------
Valid   4  Switzerland    |       4331      17.99      17.99      17.99
        6  Czechia        |       3834      15.93      15.93      33.92
        10 Spain          |       3732      15.51      15.51      49.43
        13 United Kingdom |       3998      16.61      16.61      66.04
        14 Greece         |       4088      16.98      16.98      83.02
        26 Portugal       |       4086      16.98      16.98     100.00
        Total             |      24069     100.00     100.00           
-----------------------------------------------------------------------

Approach 1: repeat the computation of observation-level M values for each country using a loop; in each country, a reduced model and an extended model are estimated

. generate double m = .
(24,069 missing values generated)

. levelsof country
4 6 10 13 14 26

. foreach cntry in `r(levels)' {
  2.     mlogit class if country==`cntry'
  3.     predict p0 p0_2 p0_3, pr
  4.     replace p0 = p0_2 if class == 2 & country==`cntry'
  5.     replace p0 = p0_3 if class == 3 & country==`cntry'
  6.     drop p0_2 p0_3
  7.     mlogit class i.fclass if country==`cntry'
  8.     predict p1 p1_2 p1_3, pr
  9.     replace p1 = p1_2 if class == 2 & country==`cntry'
 10.     replace p1 = p1_3 if class == 3 & country==`cntry'
 11.     drop p1_2 p1_3
 12.     replace m = ln(p1) - ln(p0) if country==`cntry'
 13.     drop p0 p1
 14. }

Iteration 0:   log likelihood = -4389.3046  
Iteration 1:   log likelihood = -4389.3046  

Multinomial logistic regression                 Number of obs     =      4,331
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4389.3046                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
       _cons |  -.3144104   .0336063    -9.36   0.000    -.3802777   -.2485432
-------------+----------------------------------------------------------------
lower        |
       _cons |  -1.096232   .0436254   -25.13   0.000    -1.181736   -1.010728
------------------------------------------------------------------------------
(1,532 real changes made)
(701 real changes made)

Iteration 0:   log likelihood = -4389.3046  
Iteration 1:   log likelihood = -4159.3234  
Iteration 2:   log likelihood = -4149.0484  
Iteration 3:   log likelihood = -4148.9884  
Iteration 4:   log likelihood = -4148.9884  

Multinomial logistic regression                 Number of obs     =      4,331
                                                LR chi2(4)        =     480.63
                                                Prob > chi2       =     0.0000
Log likelihood = -4148.9884                     Pseudo R2         =     0.0548

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
      fclass |
     middle  |   .7586456   .0817074     9.28   0.000     .5985021    .9187892
      lower  |   1.046058   .0850619    12.30   0.000       .87934    1.212776
             |
       _cons |  -.8433046   .0554011   -15.22   0.000    -.9518888   -.7347204
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   1.228821   .1323972     9.28   0.000     .9693271    1.488315
      lower  |    2.27427     .12365    18.39   0.000     2.031921     2.51662
             |
       _cons |  -2.343099   .1026584   -22.82   0.000    -2.544306   -2.141893
------------------------------------------------------------------------------
(1,532 real changes made)
(701 real changes made)
(4,331 real changes made)

Iteration 0:   log likelihood = -4075.2209  
Iteration 1:   log likelihood = -4075.2209  

Multinomial logistic regression                 Number of obs     =      3,834
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4075.2209                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.5499335   .0392624   -14.01   0.000    -.6268864   -.4729807
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.5353772   .0390821   -13.70   0.000    -.6119767   -.4587778
------------------------------------------------------------------------------
(1,773 real changes made)
(1,038 real changes made)

Iteration 0:   log likelihood = -4075.2209  
Iteration 1:   log likelihood = -3920.0791  
Iteration 2:   log likelihood = -3913.8142  
Iteration 3:   log likelihood = -3913.8025  
Iteration 4:   log likelihood = -3913.8025  

Multinomial logistic regression                 Number of obs     =      3,834
                                                LR chi2(4)        =     322.84
                                                Prob > chi2       =     0.0000
Log likelihood = -3913.8025                     Pseudo R2         =     0.0396

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.9135061   .1034584    -8.83   0.000    -1.116281   -.7107313
      lower  |  -1.317428   .1128637   -11.67   0.000    -1.538637    -1.09622
             |
       _cons |   .2949776   .0854363     3.45   0.001     .1275255    .4624297
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .6398091   .1528019     4.19   0.000     .3403229    .9392953
      lower  |   1.079944   .1518972     7.11   0.000     .7822308    1.377657
             |
       _cons |   -1.31758   .1407448    -9.36   0.000    -1.593435   -1.041726
------------------------------------------------------------------------------
(1,773 real changes made)
(1,038 real changes made)
(3,834 real changes made)

Iteration 0:   log likelihood = -4048.6257  
Iteration 1:   log likelihood = -4048.6257  

Multinomial logistic regression                 Number of obs     =      3,732
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4048.6257                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |   -.405809   .0414699    -9.79   0.000    -.4870885   -.3245296
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.1050549   .0381012    -2.76   0.006    -.1797318    -.030378
------------------------------------------------------------------------------
(1,454 real changes made)
(1,309 real changes made)

Iteration 0:   log likelihood = -4048.6257  
Iteration 1:   log likelihood = -3838.9752  
Iteration 2:   log likelihood = -3832.5637  
Iteration 3:   log likelihood = -3832.5522  
Iteration 4:   log likelihood = -3832.5522  

Multinomial logistic regression                 Number of obs     =      3,732
                                                LR chi2(4)        =     432.15
                                                Prob > chi2       =     0.0000
Log likelihood = -3832.5522                     Pseudo R2         =     0.0534

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.9123146   .1096721    -8.32   0.000    -1.127268   -.6973613
      lower  |  -.9198791   .1030637    -8.93   0.000     -1.12188   -.7178779
             |
       _cons |   .2505425   .0785807     3.19   0.001     .0965272    .4045578
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .3678975   .1347786     2.73   0.006     .1037363    .6320587
      lower  |   1.290743   .1231006    10.49   0.000      1.04947    1.532016
             |
       _cons |  -.9624801   .1120854    -8.59   0.000    -1.182163   -.7427968
------------------------------------------------------------------------------
(1,454 real changes made)
(1,309 real changes made)
(3,732 real changes made)

Iteration 0:   log likelihood = -4216.7706  
Iteration 1:   log likelihood = -4216.7706  

Multinomial logistic regression                 Number of obs     =      3,998
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4216.7706                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
       _cons |  -.3448084   .0362927    -9.50   0.000    -.4159407    -.273676
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.7441243   .0411774   -18.07   0.000    -.8248305   -.6634182
------------------------------------------------------------------------------
(1,297 real changes made)
(870 real changes made)

Iteration 0:   log likelihood = -4216.7706  
Iteration 1:   log likelihood = -4096.0324  
Iteration 2:   log likelihood = -4094.9771  
Iteration 3:   log likelihood = -4094.9765  
Iteration 4:   log likelihood = -4094.9765  

Multinomial logistic regression                 Number of obs     =      3,998
                                                LR chi2(4)        =     243.59
                                                Prob > chi2       =     0.0000
Log likelihood = -4094.9765                     Pseudo R2         =     0.0289

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
      fclass |
     middle  |    .709111   .0901932     7.86   0.000     .5323357    .8858864
      lower  |   .8141949   .0881766     9.23   0.000      .641372    .9870179
             |
       _cons |   -.783219   .0578729   -13.53   0.000    -.8966478   -.6697902
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   1.074415   .1088361     9.87   0.000     .8611006     1.28773
      lower  |   1.396194   .1035503    13.48   0.000     1.193239    1.599149
             |
       _cons |  -1.530689   .0768426   -19.92   0.000    -1.681298    -1.38008
------------------------------------------------------------------------------
(1,297 real changes made)
(870 real changes made)
(3,998 real changes made)

Iteration 0:   log likelihood = -4331.2848  
Iteration 1:   log likelihood = -4331.2848  

Multinomial logistic regression                 Number of obs     =      4,088
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4331.2848                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.7075519   .0408182   -17.33   0.000    -.7875541   -.6275496
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.2800108   .0357471    -7.83   0.000    -.3500739   -.2099477
------------------------------------------------------------------------------
(1,818 real changes made)
(1,374 real changes made)

Iteration 0:   log likelihood = -4331.2848  
Iteration 1:   log likelihood = -4154.6206  
Iteration 2:   log likelihood = -4147.3138  
Iteration 3:   log likelihood = -4147.3054  
Iteration 4:   log likelihood = -4147.3054  

Multinomial logistic regression                 Number of obs     =      4,088
                                                LR chi2(4)        =     367.96
                                                Prob > chi2       =     0.0000
Log likelihood = -4147.3054                     Pseudo R2         =     0.0425

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.6794951   .1164341    -5.84   0.000    -.9077018   -.4512885
      lower  |   -.993138   .1068786    -9.29   0.000    -1.202616   -.7836598
             |
       _cons |  -.0117418   .0884763    -0.13   0.894    -.1851522    .1616686
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .3837685   .1521116     2.52   0.012     .0856353    .6819017
      lower  |   1.232826   .1370353     9.00   0.000     .9642417     1.50141
             |
       _cons |  -1.205271   .1299156    -9.28   0.000    -1.459901   -.9506408
------------------------------------------------------------------------------
(1,818 real changes made)
(1,374 real changes made)
(4,088 real changes made)

Iteration 0:   log likelihood = -4321.9888  
Iteration 1:   log likelihood = -4321.9888  

Multinomial logistic regression                 Number of obs     =      4,086
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4321.9888                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.7266826   .0414131   -17.55   0.000    -.8078507   -.6455145
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.2225847   .0354584    -6.28   0.000    -.2920819   -.1530876
------------------------------------------------------------------------------
(1,789 real changes made)
(1,432 real changes made)

Iteration 0:   log likelihood = -4321.9888  
Iteration 1:   log likelihood =  -4053.845  
Iteration 2:   log likelihood = -4031.6302  
Iteration 3:   log likelihood = -4031.1323  
Iteration 4:   log likelihood = -4031.1322  

Multinomial logistic regression                 Number of obs     =      4,086
                                                LR chi2(4)        =     581.71
                                                Prob > chi2       =     0.0000
Log likelihood = -4031.1322                     Pseudo R2         =     0.0673

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -1.259648   .1175696   -10.71   0.000     -1.49008   -1.029215
      lower  |  -1.860239   .1180938   -15.75   0.000    -2.091699   -1.628779
             |
       _cons |   .5299594   .0950068     5.58   0.000     .3437494    .7161693
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |    .338532   .1610124     2.10   0.036     .0229535    .6541105
      lower  |   1.080183   .1526311     7.08   0.000     .7810311    1.379334
             |
       _cons |  -1.011601   .1459686    -6.93   0.000    -1.297694   -.7255082
------------------------------------------------------------------------------
(1,789 real changes made)
(1,432 real changes made)
(4,086 real changes made)

. mean m, over(country)

Mean estimation                      Number of obs   =     24,069

-----------------------------------------------------------------
                |       Mean   Std. Err.     [95% Conf. Interval]
----------------+------------------------------------------------
    c.m@country |
   Switzerland  |   .0554875   .0049016        .04588    .0650949
       Czechia  |   .0421018   .0046022      .0330813    .0511224
         Spain  |   .0578975   .0054384      .0472378    .0685572
United Kingdom  |   .0304638   .0038253       .022966    .0379615
        Greece  |   .0450048   .0045675      .0360522    .0539573
      Portugal  |   .0711837   .0058591      .0596995    .0826679
-----------------------------------------------------------------

. drop m
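
The loop above prints a lot of model output. If only the point estimates are needed, a quieter variant can compute each country's M from the two log likelihoods as (ll1 - ll0) / N. For Switzerland, the values logged above give (-4148.9884 + 4389.3046) / 4,331, or about 0.0555, in line with the table. The sketch below is one possible way to do this; the names ll0 and cname are illustrative, and ESS6.dta is assumed to be in memory.

```stata
* Sketch: per-country M from log likelihoods, with model output suppressed.
* Assumes ESS6.dta is in memory; ll0 and cname are illustrative names.
quietly levelsof country, local(countries)
foreach c of local countries {
    quietly mlogit class if country == `c'            // reduced model
    scalar ll0 = e(ll)
    quietly mlogit class i.fclass if country == `c'   // extended model
    local cname : label (country) `c'                 // value label of this country
    display as text %-16s "`cname'" as result %9.7f ((e(ll) - scalar(ll0)) / e(N))
}
```

As with the pooled shortcut, this yields no standard errors, so it complements rather than replaces mean m, over(country).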

Approach 2: compute observation-level M values in one go based on joint models across countries (the reduced model contains country dummies; the extended model additionally contains interactions between country dummies and father's class; this is formally equivalent to estimating a separate model in each country)
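
One way to see the equivalence, sketched here with c indexing countries: because both joint models are fully interacted with country, their log likelihoods split into country-specific parts. The symbols \ell and N_c are notation introduced for this note, not taken from the workshop materials.

```latex
\begin{align}
\ell_0^{\text{joint}} &= \sum_c \ell_{0,c}, &
\ell_1^{\text{joint}} &= \sum_c \ell_{1,c}, \\
M_c &= \frac{\ell_{1,c} - \ell_{0,c}}{N_c}.
\end{align}
```

The output bears this out: the six reduced-model log likelihoods from Approach 1 sum to -25383.195, and the six extended-model log likelihoods sum to -24168.757. These are the log likelihoods of the two joint models estimated here.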

. mlogit class i.country

Iteration 0:   log likelihood = -26166.564  
Iteration 1:   log likelihood =  -25392.18  
Iteration 2:   log likelihood = -25383.197  
Iteration 3:   log likelihood = -25383.195  

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(10)       =    1566.74
                                                Prob > chi2       =     0.0000
Log likelihood = -25383.195                     Pseudo R2         =     0.0299

---------------------------------------------------------------------------------
          class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
upper           |
        country |
       Czechia  |  -.8643438   .0516809   -16.72   0.000    -.9656366    -.763051
         Spain  |  -.7202193   .0533773   -13.49   0.000    -.8248369   -.6156017
United Kingdom  |   .0303981   .0494626     0.61   0.539    -.0665467     .127343
        Greece  |  -1.021962   .0528726   -19.33   0.000     -1.12559   -.9183336
      Portugal  |  -1.041093   .0533332   -19.52   0.000    -1.145624   -.9365615
                |
          _cons |   .3144103   .0336063     9.36   0.000      .248543    .3802775
----------------+----------------------------------------------------------------
middle          |  (base outcome)
----------------+----------------------------------------------------------------
lower           |
        country |
       Czechia  |   .2464447   .0600557     4.10   0.000     .1287378    .3641516
         Spain  |   .6767671    .059422    11.39   0.000     .5603021    .7932321
United Kingdom  |    .382506   .0632433     6.05   0.000     .2585515    .5064605
        Greece  |   .5018112   .0579408     8.66   0.000     .3882494     .615373
      Portugal  |   .5592373   .0577631     9.68   0.000     .4460238    .6724508
                |
          _cons |  -.7818219   .0455991   -17.15   0.000    -.8711945   -.6924494
---------------------------------------------------------------------------------

. predict p0 p0_2 p0_3, pr

. replace p0 = p0_2 if class == 2
(9,663 real changes made)

. replace p0 = p0_3 if class == 3
(6,724 real changes made)

. drop p0_2 p0_3

. mlogit class i.fclass##i.country

Iteration 0:   log likelihood = -26166.564  
Iteration 1:   log likelihood = -24240.885  
Iteration 2:   log likelihood = -24169.315  
Iteration 3:   log likelihood = -24168.757  
Iteration 4:   log likelihood = -24168.757  

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(34)       =    3995.61
                                                Prob > chi2       =     0.0000
Log likelihood = -24168.757                     Pseudo R2         =     0.0763

----------------------------------------------------------------------------------------
                 class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
upper                  |
                fclass |
               middle  |  -.7586456   .0817074    -9.28   0.000    -.9187892   -.5985021
                lower  |  -1.046058   .0850619   -12.30   0.000    -1.212776     -.87934
                       |
               country |
              Czechia  |   -.548327   .1018266    -5.38   0.000    -.7479034   -.3487506
                Spain  |  -.5927621   .0961468    -6.17   0.000    -.7812064   -.4043178
       United Kingdom  |  -.0600856   .0801159    -0.75   0.453    -.2171099    .0969386
               Greece  |  -.8550464   .1043903    -8.19   0.000    -1.059648   -.6504451
             Portugal  |   -.313345   .1099799    -2.85   0.004    -.5289017   -.0977883
                       |
        fclass#country |
       middle#Czechia  |  -.1548605   .1318323    -1.17   0.240     -.413247     .103526
         middle#Spain  |   -.153669   .1367628    -1.12   0.261    -.4217191    .1143812
middle#United Kingdom  |   .0495346   .1217001     0.41   0.684    -.1889932    .2880624
        middle#Greece  |   .0791505   .1422427     0.56   0.578    -.1996402    .3579411
      middle#Portugal  |  -.5010021   .1431737    -3.50   0.000    -.7816174   -.2203868
        lower#Czechia  |  -.2713703   .1413285    -1.92   0.055     -.548369    .0056284
          lower#Spain  |   .1261792   .1336325     0.94   0.345    -.1357357    .3880941
 lower#United Kingdom  |   .2318633   .1225179     1.89   0.058    -.0082674    .4719939
         lower#Greece  |   .0529202   .1365963     0.39   0.698    -.2148036    .3206441
       lower#Portugal  |   -.814181   .1455393    -5.59   0.000    -1.099433   -.5289293
                       |
                 _cons |   .8433046   .0554011    15.22   0.000     .7347204    .9518888
-----------------------+----------------------------------------------------------------
middle                 |  (base outcome)
-----------------------+----------------------------------------------------------------
lower                  |
                fclass |
               middle  |   .4701738   .1374924     3.42   0.001     .2006937    .7396539
                lower  |   1.228211   .1268501     9.68   0.000     .9795889    1.476832
                       |
               country |
              Czechia  |   .1822128   .1776806     1.03   0.305    -.1660348    .5304603
                Spain  |   .5373131   .1559629     3.45   0.001     .2316314    .8429947
       United Kingdom  |   .7523233     .13753     5.47   0.000     .4827694    1.021877
               Greece  |   .2945225   .1692317     1.74   0.082    -.0371654    .6262105
             Portugal  |   .4881923   .1818464     2.68   0.007     .1317799    .8446047
                       |
        fclass#country |
       middle#Czechia  |   .1696353   .2055543     0.83   0.409    -.2332437    .5725143
         middle#Spain  |  -.1022763   .1925342    -0.53   0.595    -.4796364    .2750838
middle#United Kingdom  |  -.1048694   .1793743    -0.58   0.559    -.4564365    .2466977
        middle#Greece  |  -.0864053   .2050417    -0.42   0.673    -.4882796     .315469
      middle#Portugal  |  -.1316422   .2117289    -0.62   0.534    -.5466232    .2833389
        lower#Czechia  |  -.1482667   .1978983    -0.75   0.454    -.5361401    .2396068
          lower#Spain  |   .0625326   .1767617     0.35   0.724     -.283914    .4089793
 lower#United Kingdom  |  -.6462116   .1673126    -3.86   0.000    -.9741383   -.3182848
         lower#Greece  |   .0046153   .1867341     0.02   0.980    -.3613768    .3706074
       lower#Portugal  |  -.1480284   .1984621    -0.75   0.456    -.5370069    .2409501
                       |
                 _cons |  -1.499793   .1084495   -13.83   0.000     -1.71235   -1.287236
----------------------------------------------------------------------------------------

. predict p1 p1_2 p1_3, pr

. replace p1 = p1_2 if class == 2
(9,663 real changes made)

. replace p1 = p1_3 if class == 3
(6,724 real changes made)

. drop p1_2 p1_3

. generate double m = ln(p1) - ln(p0)

. drop p0 p1

. mean m, over(country)

Mean estimation                      Number of obs   =     24,069

-----------------------------------------------------------------
                |       Mean   Std. Err.     [95% Conf. Interval]
----------------+------------------------------------------------
    c.m@country |
   Switzerland  |   .0554875   .0049016        .04588    .0650949
       Czechia  |   .0421018   .0046022      .0330813    .0511224
         Spain  |   .0578975   .0054384      .0472378    .0685572
United Kingdom  |   .0304638   .0038253       .022966    .0379615
        Greece  |   .0450048   .0045675      .0360522    .0539573
      Portugal  |   .0711837   .0058591      .0596995    .0826679
-----------------------------------------------------------------

. drop m
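
The manual steps above can also be written more compactly as a do-file fragment. This is only a sketch: it assumes ESS6.dta is in memory and that class is coded 1, 2, 3 (as the replace statements above imply), and the variable names r1-r3, e1-e3, p_red, p_ext, and m_i are made up for this illustration. The cond() function picks the predicted probability of the class that was actually observed, which replaces the sequence of replace statements.

```stata
* Sketch only: assumes ESS6.dta is in memory and class is coded 1, 2, 3.
* All new variable names are hypothetical and are dropped again at the end.
quietly mlogit class i.country                    // reduced model
predict double r1 r2 r3, pr
generate double p_red = cond(class == 1, r1, cond(class == 2, r2, r3))

quietly mlogit class i.fclass##i.country          // extended model
predict double e1 e2 e3, pr
generate double p_ext = cond(class == 1, e1, cond(class == 2, e2, e3))

generate double m_i = ln(p_ext) - ln(p_red)       // observation-level log ratio
mean m_i, over(country)                           // country-specific M
drop r1 r2 r3 e1 e2 e3 p_red p_ext m_i
```

Because the predictions are stored in double precision here, the results may differ from the log above in the last displayed digits.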

Compute M index using command mindex

Compute the M index using command mindex, overall and by country, and confirm that the results are the same as above.

See help mindex for information on how to use mindex. There is a simple syntax and an advanced syntax. For now, focus on the simple syntax.

Overall M:

. mindex class i.fclass
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0636896   .0022315    28.54   0.000     .0593159    .0680632
------------------------------------------------------------------------------

M by country:

. mindex class i.fclass, over(country)
Estimating outcome models ............ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
---------------------------------------------------------------------------------

M by country and overall M:

. mindex class i.fclass, over(country) total
Estimating outcome models .............. done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
                |
          Total |   .0636896   .0022315    28.54   0.000     .0593159    .0680632
---------------------------------------------------------------------------------

Create a graph containing country-specific results

Present the country-specific results and the total in a graph. Optional: Apart from the default results, also include confidence intervals obtained by the bootstrap (stratify the bootstrap samples by country and use 100 replications).

To create the graph, we suggest using coefplot (type ssc install coefplot). To be able to plot the results using coefplot, they need to be stored using estimates store.

Bootstrap results can be obtained by applying option vce(bootstrap) to the mindex command. Use suboptions reps() to set the number of replications and strata() to request stratified resampling.

Estimate and store M using default standard errors/confidence intervals:

. mindex class i.fclass, over(country) total
Estimating outcome models .............. done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
                |
          Total |   .0636896   .0022315    28.54   0.000     .0593159    .0680632
---------------------------------------------------------------------------------

. estimates store default

Estimate and store M using bootstrap standard errors/confidence intervals:

. mindex class i.fclass, over(country) total vce(bootstrap, reps(100) strata(country))
(running mindex on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

M-index analysis

Number of strata   =         6                  Number of obs     =     24,069
                                                Replications      =        100

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
                |   Observed   Bootstrap                         Normal-based
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0041687    13.31   0.000      .047317    .0636579
       Czechia  |   .0421019   .0044463     9.47   0.000     .0333872    .0508165
         Spain  |   .0578975   .0052173    11.10   0.000     .0476717    .0681233
United Kingdom  |   .0304638   .0042599     7.15   0.000     .0221145     .038813
        Greece  |   .0450047   .0046572     9.66   0.000     .0358769    .0541326
      Portugal  |   .0711837   .0051462    13.83   0.000     .0610974      .08127
                |
          Total |   .0636896   .0024681    25.81   0.000     .0588523    .0685269
---------------------------------------------------------------------------------

. estimates store bootstrap

Display results in a graph:

. coefplot default bootstrap
[Graph: coefficient plot of the M index for Switzerland, Czechia, Spain, United Kingdom, Greece, Portugal, and Total; x-axis from .02 to .08; series: default, bootstrap]
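
To compare the two sets of standard errors numerically rather than visually, the stored results can also be listed side by side. A minimal sketch, assuming the results were stored under the names default and bootstrap as above and that mindex posts its results in the usual e(b) and e(V) matrices (which estimates store and coefplot rely on as well):

```stata
* List coefficients and standard errors of both stored result sets side by side
estimates table default bootstrap, b(%9.6f) se(%9.6f)
```

The point estimates are the same in both columns; only the standard errors differ.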

Optional: Compute a country-adjusted overall M index

As you will see in the graph, the overall M is rather high compared to the country-specific M values. This is because the overall M also picks up country differences in the class distributions. A more sensible approach for the overall M might therefore be to compute M based on models that include country fixed effects. Try to compute such an adjusted overall M.

Approach 1: manual computation based on model predictions; the trick is to include country dummies in both the reduced and the extended model (but omit interactions in the extended model)

. mlogit class i.country

Iteration 0:   log likelihood = -26166.564  
Iteration 1:   log likelihood =  -25392.18  
Iteration 2:   log likelihood = -25383.197  
Iteration 3:   log likelihood = -25383.195  

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(10)       =    1566.74
                                                Prob > chi2       =     0.0000
Log likelihood = -25383.195                     Pseudo R2         =     0.0299

---------------------------------------------------------------------------------
          class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
upper           |
        country |
       Czechia  |  -.8643438   .0516809   -16.72   0.000    -.9656366    -.763051
         Spain  |  -.7202193   .0533773   -13.49   0.000    -.8248369   -.6156017
United Kingdom  |   .0303981   .0494626     0.61   0.539    -.0665467     .127343
        Greece  |  -1.021962   .0528726   -19.33   0.000     -1.12559   -.9183336
      Portugal  |  -1.041093   .0533332   -19.52   0.000    -1.145624   -.9365615
                |
          _cons |   .3144103   .0336063     9.36   0.000      .248543    .3802775
----------------+----------------------------------------------------------------
middle          |  (base outcome)
----------------+----------------------------------------------------------------
lower           |
        country |
       Czechia  |   .2464447   .0600557     4.10   0.000     .1287378    .3641516
         Spain  |   .6767671    .059422    11.39   0.000     .5603021    .7932321
United Kingdom  |    .382506   .0632433     6.05   0.000     .2585515    .5064605
        Greece  |   .5018112   .0579408     8.66   0.000     .3882494     .615373
      Portugal  |   .5592373   .0577631     9.68   0.000     .4460238    .6724508
                |
          _cons |  -.7818219   .0455991   -17.15   0.000    -.8711945   -.6924494
---------------------------------------------------------------------------------

. predict p0 p0_2 p0_3, pr

. replace p0 = p0_2 if class == 2
(9,663 real changes made)

. replace p0 = p0_3 if class == 3
(6,724 real changes made)

. drop p0_2 p0_3

. mlogit class i.fclass i.country

Iteration 0:   log likelihood = -26166.564  
Iteration 1:   log likelihood = -24297.422  
Iteration 2:   log likelihood = -24245.108  
Iteration 3:   log likelihood = -24244.971  
Iteration 4:   log likelihood = -24244.971  

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(14)       =    3843.19
                                                Prob > chi2       =     0.0000
Log likelihood = -24244.971                     Pseudo R2         =     0.0734

---------------------------------------------------------------------------------
          class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
upper           |
         fclass |
        middle  |  -.8234312   .0406635   -20.25   0.000    -.9031301   -.7437322
         lower  |  -1.096431   .0405718   -27.02   0.000    -1.175951   -1.016912
                |
        country |
       Czechia  |  -.6958079   .0530956   -13.10   0.000    -.7998734   -.5917423
         Spain  |  -.5828481   .0547182   -10.65   0.000    -.6900938   -.4756025
United Kingdom  |  -.0006604   .0506764    -0.01   0.990    -.0999844    .0986635
        Greece  |  -.7965508   .0544689   -14.62   0.000    -.9033078   -.6897938
      Portugal  |  -.8180399   .0547832   -14.93   0.000    -.9254129   -.7106669
                |
          _cons |   .8821138   .0409238    21.56   0.000     .8019048    .9623229
----------------+----------------------------------------------------------------
middle          |  (base outcome)
----------------+----------------------------------------------------------------
lower           |
         fclass |
        middle  |   .3974989   .0554946     7.16   0.000     .2887315    .5062664
         lower  |   1.063228   .0518614    20.50   0.000     .9615811    1.164874
                |
        country |
       Czechia  |   .1962998   .0610913     3.21   0.001      .076563    .3160365
         Spain  |   .5542965   .0603611     9.18   0.000      .435991     .672602
United Kingdom  |   .3976265   .0640695     6.21   0.000     .2720526    .5232005
        Greece  |   .3054363   .0591274     5.17   0.000     .1895487    .4213238
      Portugal  |    .400215   .0588458     6.80   0.000     .2848793    .5155506
                |
          _cons |  -1.394522   .0613094   -22.75   0.000    -1.514686   -1.274358
---------------------------------------------------------------------------------

. predict p1 p1_2 p1_3, pr

. replace p1 = p1_2 if class == 2
(9,663 real changes made)

. replace p1 = p1_3 if class == 3
(6,724 real changes made)

. drop p1_2 p1_3

. generate double m = ln(p1) - ln(p0)

. drop p0 p1

. mean m

Mean estimation                   Number of obs   =     24,069

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           m |   .0472901   .0019488      .0434704    .0511098
--------------------------------------------------------------

. drop m
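
In formula terms, the adjustment changes what the log ratio conditions on. The following is a sketch in notation that is not taken from the handout: y_i is respondent's class, x_i is father's class, c_i is country, and \hat{p} denotes a predicted probability from the corresponding mlogit model.

```latex
\begin{align*}
M_{\text{overall}}  &= \frac{1}{N}\sum_{i=1}^{N}
  \bigl[\ln \hat{p}(y_i \mid x_i) - \ln \hat{p}(y_i)\bigr] \\
M_{\text{adjusted}} &= \frac{1}{N}\sum_{i=1}^{N}
  \bigl[\ln \hat{p}(y_i \mid x_i, c_i) - \ln \hat{p}(y_i \mid c_i)\bigr]
\end{align*}
```

Because country enters both terms of the adjusted version, country differences in the class distribution are already absorbed by the reduced model and are no longer attributed to father's class. With the data used here this yields .0472901 instead of .0636896.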

Approach 2 (simple mindex syntax): apply option controls()

. mindex class i.fclass, controls(i.country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class i.country
Extended model: mlogit class i.fclass i.country

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0472901   .0019488    24.27   0.000     .0434706    .0511096
------------------------------------------------------------------------------

Approach 3 (advanced mindex syntax): three sets of parentheses specify the extended model (first set), the reduced model (second set), and the model used to aggregate/analyze the observation-level M values (third set; in the current case this model only contains a constant and no predictors)

. mindex class (extended:i.fclass i.country) (reduced:i.country) (mindex:)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0000
                                                Adj R-squared     =     0.0000
                                                Root MSE          =     0.3023

Reduced model:  mlogit class i.country
Extended model: mlogit class i.fclass i.country

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0472901   .0019488    24.27   0.000     .0434704    .0511098
------------------------------------------------------------------------------

Compare M index to unidiff

Social mobility is often analyzed using the so-called unidiff model. Estimate a unidiff model to evaluate country differences in social origin effects (again using father's class as a predictor for respondent's class) and compare the results to the country-specific M indices above. Use Spain as the reference country for computing the unidiff parameters.

To estimate the unidiff model, use the udiff command (type ssc install udiff). udiff is easy to use and can be applied to individual-level data. Alternatively, if you are already familiar with the unidiff command by Pisati (2000, Stata Technical Bulletin 55:33–47), feel free to use this command (type net install sg142, from(http://www.stata.com/stb/stb55)). Note that, in this case, you first have to collapse the data (e.g. using command contract).

Estimate the unidiff model using udiff:

. udiff class i.fclass ib10.country, baselevel eform

fitting constant fluidity model ... done

Iteration 0:   log likelihood = -24244.971  
Iteration 1:   log likelihood = -24241.049  (not concave)
Iteration 2:   log likelihood = -24222.776  
Iteration 3:   log likelihood = -24203.263  
Iteration 4:   log likelihood = -24197.886  
Iteration 5:   log likelihood = -24197.154  
Iteration 6:   log likelihood = -24197.146  
Iteration 7:   log likelihood = -24197.146  

Unidiff model                                   Number of obs     =     24,069
                                                LR chi2(5)        =      95.65
Log likelihood = -24197.146                     Prob > chi2       =     0.0000

---------------------------------------------------------------------------------
          class |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
Phi             |
        country |
   Switzerland  |   .9714266    .068701    -0.41   0.682     .8456906    1.115857
       Czechia  |   .9994966   .0782658    -0.01   0.995     .8572901    1.165292
         Spain  |          1  (base)
United Kingdom  |    .638413   .0538429    -5.32   0.000     .5411435    .7531664
        Greece  |   1.008582   .0758624     0.11   0.910      .870335    1.168788
      Portugal  |   1.330639   .0915229     4.15   0.000     1.162823    1.522674
---------------------------------------------------------------------------------

. estimates store unidiff

The likelihood evaluator implemented in udiff models the unidiff parameters on a logarithmic scale. This is why option eform has been applied; the option only affects how the results are displayed (i.e. it transforms the parameters back to scaling factors, a representation that is more common in the literature). Option baselevel has been specified so that the reference country is included in the output.
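
As a reminder of what the Phi parameters scale, the unidiff (log-multiplicative layer effect) model can be sketched as follows. The notation is an assumption made for this illustration and is not taken from the udiff documentation: \theta_{jk|c} is the odds ratio between father's class j and respondent's class k in country c, \psi_{jk} is the association pattern shared by all countries, \phi_c is the country-specific scaling factor, and b_c is the coefficient that udiff estimates on the logarithmic scale.

```latex
\begin{align*}
\ln \theta_{jk \mid c} &= \phi_c \, \psi_{jk}, \qquad \phi_{\text{Spain}} = 1 \\
\phi_c &= \exp(b_c)
\end{align*}
```

A value of \phi_c below 1 (for example .638 for the United Kingdom) therefore means that all log odds ratios are uniformly weaker than in Spain, and a value above 1 (for example 1.331 for Portugal) means that they are uniformly stronger.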

Estimate the M index by country:

. mindex class i.fclass, over(country)
Estimating outcome models ............ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
---------------------------------------------------------------------------------

. estimates store mindex

Compare the results in a graph:

. coefplot unidiff, eform baselevel || mindex ||, byopts(xrescale) xlabel(, grid)
[Graph: two-panel coefficient plot by country (Switzerland, Czechia, Spain, United Kingdom, Greece, Portugal); panel unidiff with x-axis from .5 to 1.5, panel mindex with x-axis from .02 to .08]

The two methods lead to quite different results. Conclusions are similar for the UK (highest mobility) and Portugal (lowest mobility), but results differ for the other countries. Whereas the unidiff model indicates that these countries all have very similar levels of social mobility, the M index points to substantial variation (more on this in the next exercise).

For the sake of completeness, here is how the unidiff parameters could be estimated with the unidiff command by Pisati (2000):

. preserve

. keep class fclass country

. contract class fclass country

. unidiff _freq, row(fclass) col(class) lay(country) effect(mult) pattern(fi) refcat(3)


Iteration 1:     deviance =    56.2607
Iteration 2:     deviance =     4.2810
Iteration 3:     deviance =     0.1094
Iteration 4:     deviance =     0.0069
Iteration 5:     deviance =     0.0004




Analysis of differences in two-way associations



Table structure

-------------------------------------------------------------------------------
            Name      Label                                   N. of categories
-------------------------------------------------------------------------------
Row         fclass    father's class                                   3
Column      class     respondent's class                               3
Layer       country   country                                          6
-------------------------------------------------------------------------------


Model specification

-------------------------------------------------------------------------------
Layer effect:            multiplicative
R-C association pattern: full interaction
Additional variables:    none      
-------------------------------------------------------------------------------


Goodness-of-fit statistics

-------------------------------------------------------------------------------
Model               N    df      X2     p       G2     p     rG2     BIC     DI
-------------------------------------------------------------------------------
Cond. indep.     24069   24   2479.0  0.00   2428.9  0.00    0.0   2186.7  12.1
Null effect      24069   20    153.0  0.00    152.4  0.00   93.7    -49.3   2.8
Multipl. effect  24069   15     56.8  0.00     56.8  0.00   97.7    -94.6   2.0
-------------------------------------------------------------------------------


Phi parameters (layer scores)

---------------------------------------------------
       country |        Raw    Scaled 1    Scaled 2
---------------+-----------------------------------
   Switzerland |     4.1086      0.9714      0.3921
       Czechia |     4.2273      0.9995      0.4034
         Spain |     4.2295      1.0000      0.4036
United Kingdom |     2.7002      0.6384      0.2577
        Greece |     4.2658      1.0086      0.4071
      Portugal |     5.6279      1.3306      0.5371
---------------------------------------------------


Psi parameters (R-C association scores)

----------------------------------
father's  |   respondent's class  
class     |  upper  middle   lower
----------+-----------------------
    upper |   0.00    0.00    0.00
   middle |   0.00    0.20    0.31
    lower |   0.00    0.28    0.54
----------------------------------


Kappa indices

-----------------------
       country |  Kappa
---------------+-------
   Switzerland |   0.37
       Czechia |   0.38
         Spain |   0.38
United Kingdom |   0.24
        Greece |   0.39
      Portugal |   0.51
-----------------------



. restore

Look for column "Scaled 1" in table "Phi parameters (layer scores)". Results are the same as the ones obtained by udiff.
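
Judging from the numbers in the table, column "Scaled 1" appears to be the raw layer score divided by the raw score of the reference category (Spain, the third country, as requested by refcat(3)). A short worked check with two of the reported values:

```latex
\begin{align*}
\phi_c^{\text{Scaled 1}} &= \frac{\phi_c^{\text{Raw}}}{\phi_{\text{Spain}}^{\text{Raw}}} \\
\text{United Kingdom:}\quad \frac{2.7002}{4.2295} &\approx 0.6384 \\
\text{Portugal:}\quad \frac{5.6279}{4.2295} &\approx 1.3306
\end{align*}
```

These match the exp(b) values .638413 and 1.330639 reported by udiff.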

Decomposition of country differences

According to the unidiff model, social mobility in Spain is quite similar to social mobility in Switzerland, Czechia, and Greece. However, when looking at the M index, we see that the Czech and Greek societies appear to be more open than the Spanish society (Czechia and Greece have a lower M index than Spain). Compute a decomposition of the differences in the M index between Spain and the other countries into a part due to differences in the internal structure (association pattern between respondent's and father's class) and a part due to differences in the marginal distributions of respondent's and father's class.

Use option decompose to obtain the decomposition. Equation "Contrast" will display the country-differences in the M index; equation "Internal" will report, for each contrast, the part that is due to differences in internal structure (i.e. the association patterns); equation "Marginal" will contain the part that is due to differences in the marginal distributions.

. mindex class i.fclass, over(country) refgroup(10) decompose
Estimating outcome models ............ done.
Fitting counterfactual distributions .......... done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
Level           |
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
----------------+----------------------------------------------------------------
Contrast        |
        country |
   Switzerland  |  -.0024101   .0073206    -0.33   0.742    -.0167581     .011938
       Czechia  |  -.0157957   .0071236    -2.22   0.027    -.0297576   -.0018337
         Spain  |          0  (omitted)
United Kingdom  |  -.0274338   .0066483    -4.13   0.000    -.0404641   -.0144034
        Greece  |  -.0128928   .0071012    -1.82   0.069    -.0268109    .0010254
      Portugal  |   .0132862   .0079932     1.66   0.096    -.0023803    .0289526
----------------+----------------------------------------------------------------
Internal        |
        country |
   Switzerland  |  -.0007332          .        .       .            .           .
       Czechia  |  -.0006831          .        .       .            .           .
         Spain  |          0  (omitted)
United Kingdom  |  -.0355089          .        .       .            .           .
        Greece  |  -.0005447          .        .       .            .           .
      Portugal  |   .0311138          .        .       .            .           .
----------------+----------------------------------------------------------------
Marginal        |
        country |
   Switzerland  |  -.0016768          .        .       .            .           .
       Czechia  |  -.0151126          .        .       .            .           .
         Spain  |          0  (omitted)
United Kingdom  |   .0080751          .        .       .            .           .
        Greece  |  -.0123481          .        .       .            .           .
      Portugal  |  -.0178276          .        .       .            .           .
---------------------------------------------------------------------------------
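
The reported point estimates are consistent with an additive split of each contrast into the two components. A short worked check, where the subscript c denotes a country and the labels follow the equation names in the output:

```latex
\begin{align*}
\text{Contrast}_c &= M_c - M_{\text{Spain}} = \text{Internal}_c + \text{Marginal}_c \\
\text{United Kingdom:}\quad -.0274338 &= -.0355089 + .0080751 \\
\text{Czechia:}\quad -.0157957 &= -.0006831 + (-.0151126)
\end{align*}
```

Read this way, the lower M in Czechia relative to Spain comes almost entirely from the marginal distributions, whereas in the United Kingdom the internal component is larger than the contrast itself and is partly offset by the marginal component.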

Analytic standard errors are not supported with decompose. Use the bootstrap to obtain standard errors:

. mindex class i.fclass, over(country) refgroup(10) decompose vce(bootstrap, reps(100))
(running mindex on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

M-index analysis                                Number of obs     =     24,069
                                                Replications      =        100

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
                |   Observed   Bootstrap                         Normal-based
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
Level           |
        country |
   Switzerland  |   .0554875   .0046273    11.99   0.000     .0464181    .0645568
       Czechia  |   .0421019   .0047326     8.90   0.000     .0328261    .0513776
         Spain  |   .0578975   .0052912    10.94   0.000      .047527     .068268
United Kingdom  |   .0304638   .0043373     7.02   0.000     .0219629    .0389646
        Greece  |   .0450047   .0041803    10.77   0.000     .0368114    .0531981
      Portugal  |   .0711837   .0062111    11.46   0.000     .0590102    .0833572
----------------+----------------------------------------------------------------
Contrast        |
        country |
   Switzerland  |  -.0024101   .0071953    -0.33   0.738    -.0165127    .0116925
       Czechia  |  -.0157957   .0071166    -2.22   0.026     -.029744   -.0018473
         Spain  |          0  (omitted)
United Kingdom  |  -.0274338   .0063535    -4.32   0.000    -.0398863   -.0149812
        Greece  |  -.0128928   .0068096    -1.89   0.058    -.0262394    .0004538
      Portugal  |   .0132862   .0078379     1.70   0.090    -.0020758    .0286481
----------------+----------------------------------------------------------------
Internal        |
        country |
   Switzerland  |  -.0007332   .0070194    -0.10   0.917    -.0144911    .0130246
       Czechia  |  -.0006831   .0071564    -0.10   0.924    -.0147093    .0133431
         Spain  |          0  (omitted)
United Kingdom  |  -.0355089   .0063526    -5.59   0.000    -.0479598   -.0230579
        Greece  |  -.0005447     .00673    -0.08   0.935    -.0137353    .0126459
      Portugal  |   .0311138   .0078213     3.98   0.000     .0157844    .0464433
----------------+----------------------------------------------------------------
Marginal        |
        country |
   Switzerland  |  -.0016768   .0019865    -0.84   0.399    -.0055704    .0022167
       Czechia  |  -.0151126   .0019238    -7.86   0.000    -.0188832    -.011342
         Spain  |          0  (omitted)
United Kingdom  |   .0080751   .0016028     5.04   0.000     .0049336    .0112166
        Greece  |  -.0123481   .0017047    -7.24   0.000    -.0156892   -.0090069
      Portugal  |  -.0178276   .0025488    -6.99   0.000    -.0228232   -.0128321
---------------------------------------------------------------------------------
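
In the output above, the three contrast blocks are additive: for each country, the Contrast estimate equals the sum of the Internal and Marginal estimates. The following lines are a minimal check of that arithmetic, meant to be run from a do-file; the numbers are copied by hand from the table, and the display formats are arbitrary choices.

```stata
* Contrast = Internal + Marginal (values copied from the table above)
display "United Kingdom: " %10.7f (-.0355089 + .0080751)   // reported contrast: -.0274338
display "Czechia:        " %10.7f (-.0006831 - .0151126)   // reported contrast: -.0157957
display "Portugal:       " %10.7f ( .0311138 - .0178276)   // reported contrast:  .0132862
```

For the United Kingdom, for example, the lower M index relative to Spain is entirely due to the internal component; the marginal component works in the opposite direction.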

Here is an overview graph of the results. It displays the unidiff parameter and the M index in the upper part, and the decomposition results in the lower part. As we would expect, at least if the unidiff model fits the data, the pattern of country differences due to internal structure is very similar to the pattern of unidiff parameters.

. coefplot unidiff, eform baselevel ///
>       || ., keep(Level:)    bylabel(mindex)   ///
>       || ., keep(Internal:) bylabel(internal) ///
>       || ., keep(Marginal:) bylabel(marginal) ///
>       ||  , byopts(xrescale) xlabel(, grid)
[Graph: coefplot with four panels (unidiff, mindex, internal, marginal), each showing estimates for Switzerland, Czechia, Spain, United Kingdom, Greece, and Portugal]

Note that the "Marginal" component of the decomposition can be further subdivided into a part due to differences in the marginal distribution of the destination variable ("Marginal_Y") and a part due to differences in the marginal distribution of the origin variable ("Marginal_X") using the split option:

. mindex class i.fclass, over(country) refgroup(10) decompose split vce(bootstrap, reps(10
> 0))
(running mindex on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

M-index analysis                                Number of obs     =     24,069
                                                Replications      =        100

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
                |   Observed   Bootstrap                         Normal-based
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
Level           |
        country |
   Switzerland  |   .0554875   .0042302    13.12   0.000     .0471964    .0637785
       Czechia  |   .0421019   .0045425     9.27   0.000     .0331986    .0510051
         Spain  |   .0578975   .0053298    10.86   0.000     .0474514    .0683436
United Kingdom  |   .0304638    .003847     7.92   0.000     .0229237    .0380038
        Greece  |   .0450047   .0046086     9.77   0.000     .0359721    .0540374
      Portugal  |   .0711837   .0063437    11.22   0.000     .0587503    .0836171
----------------+----------------------------------------------------------------
Contrast        |
        country |
   Switzerland  |  -.0024101   .0066066    -0.36   0.715    -.0153587    .0105386
       Czechia  |  -.0157957   .0068574    -2.30   0.021    -.0292358   -.0023555
         Spain  |          0  (omitted)
United Kingdom  |  -.0274338   .0068502    -4.00   0.000    -.0408599   -.0140076
        Greece  |  -.0128928   .0076375    -1.69   0.091     -.027862    .0020764
      Portugal  |   .0132862   .0079853     1.66   0.096    -.0023648    .0289372
----------------+----------------------------------------------------------------
Internal        |
        country |
   Switzerland  |  -.0007332   .0067816    -0.11   0.914    -.0140249    .0125585
       Czechia  |  -.0006831   .0068848    -0.10   0.921    -.0141771    .0128109
         Spain  |          0  (omitted)
United Kingdom  |  -.0355089    .007119    -4.99   0.000    -.0494619   -.0215559
        Greece  |  -.0005447    .007351    -0.07   0.941    -.0149523     .013863
      Portugal  |   .0311138   .0084995     3.66   0.000     .0144551    .0477726
----------------+----------------------------------------------------------------
Marginal        |
        country |
   Switzerland  |  -.0016768   .0020541    -0.82   0.414    -.0057028    .0023491
       Czechia  |  -.0151126   .0017739    -8.52   0.000    -.0185894   -.0116357
         Spain  |          0  (omitted)
United Kingdom  |   .0080751   .0015607     5.17   0.000     .0050163     .011134
        Greece  |  -.0123481   .0016163    -7.64   0.000    -.0155159   -.0091803
      Portugal  |  -.0178276   .0022998    -7.75   0.000    -.0223351   -.0133202
----------------+----------------------------------------------------------------
Marginal_Y      |
        country |
   Switzerland  |  -.0130514   .0023644    -5.52   0.000    -.0176855   -.0084173
       Czechia  |  -.0060491   .0009611    -6.29   0.000    -.0079327   -.0041655
         Spain  |          0  (omitted)
United Kingdom  |  -.0028811   .0011642    -2.47   0.013    -.0051629   -.0005993
        Greece  |  -.0036376   .0009304    -3.91   0.000    -.0054611   -.0018141
      Portugal  |  -.0045025   .0010467    -4.30   0.000    -.0065539   -.0024511
----------------+----------------------------------------------------------------
Marginal_X      |
        country |
   Switzerland  |   .0113746   .0014617     7.78   0.000     .0085098    .0142394
       Czechia  |  -.0090635   .0014143    -6.41   0.000    -.0118355   -.0062914
         Spain  |          0  (omitted)
United Kingdom  |   .0109562   .0015189     7.21   0.000     .0079792    .0139333
        Greece  |  -.0087105   .0013197    -6.60   0.000    -.0112971   -.0061238
      Portugal  |  -.0133252   .0018751    -7.11   0.000    -.0170002   -.0096501
---------------------------------------------------------------------------------
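
The split is again additive in the output above: for each country, Marginal_Y and Marginal_X sum to the Marginal estimate. This is a short check with values copied by hand from the table, written for a do-file; the display format is an arbitrary choice.

```stata
* Marginal = Marginal_Y + Marginal_X (values copied from the table above)
display "Switzerland:    " %10.7f (-.0130514 + .0113746)   // reported marginal: -.0016768
display "Czechia:        " %10.7f (-.0060491 - .0090635)   // reported marginal: -.0151126
display "United Kingdom: " %10.7f (-.0028811 + .0109562)   // reported marginal:  .0080751
```

Switzerland is an instructive case: the two parts are both sizeable but have opposite signs, so they nearly cancel in the combined Marginal component.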

Local M index (class linkages)

Dig a bit deeper into the country differences between Spain and Czechia by analyzing the class-specific components of the M index, that is, by analyzing local class linkages in connection with the frequencies of the classes.

You can do this either from the perspective of respondent's class or from the perspective of father's class. You will need to use advanced syntax to obtain the estimates of local class linkages.

To subdivide the M index by classes we need to resort to advanced syntax. We first take the perspective of respondent's class. Note that the analytic standard errors for local class linkages appear rather unreliable, which is why we use the bootstrap. Furthermore, we use the estout command to create an overview table of the results (if it is not already installed, first type ssc install estout).

Local class linkages and class proportions for Spain:

. mindex class (extended:i.fclass) (reduced:) (mindex:ibn.class, nocons) ///
>     if country==10, vce(boot, reps(100))
(running mindex on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

M-index analysis                                Number of obs     =      3,732
                                                Replications      =        100
                                                R-squared         =     0.0417
                                                Adj R-squared     =     0.0410
                                                Root MSE          =     0.3302

Reduced model:  mlogit class
Extended model: mlogit class i.fclass

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       class |
      upper  |   .0900216   .0101249     8.89   0.000     .0701772     .109866
     middle  |   .0112208   .0029259     3.83   0.000      .005486    .0169555
      lower  |   .0859645   .0094486     9.10   0.000     .0674456    .1044835
------------------------------------------------------------------------------

. estimates store m_spain

. proportion class if country==10

Proportion estimation             Number of obs   =      3,732

--------------------------------------------------------------
             |                                   Logit
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       class |
      upper  |   .2596463   .0071769      .2458245    .2739629
     middle  |   .3896034   .0079826      .3740711     .405363
      lower  |   .3507503   .0078115      .3355929    .3662149
--------------------------------------------------------------

. estimates store p_spain

Local class linkages and class proportions for Czechia:

. mindex class (extended:i.fclass) (reduced:) (mindex:ibn.class, nocons) ///
>     if country==6, vce(boot, reps(100))
(running mindex on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

M-index analysis                                Number of obs     =      3,834
                                                Replications      =        100
                                                R-squared         =     0.0377
                                                Adj R-squared     =     0.0370
                                                Root MSE          =     0.2826

Reduced model:  mlogit class
Extended model: mlogit class i.fclass

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       class |
      upper  |   .0861695    .011065     7.79   0.000     .0644825    .1078564
     middle  |   .0032655   .0015994     2.04   0.041     .0001307    .0064003
      lower  |   .0650071    .009096     7.15   0.000     .0471792     .082835
------------------------------------------------------------------------------

. estimates store m_czechia

. proportion class if country==6

Proportion estimation             Number of obs   =      3,834

--------------------------------------------------------------
             |                                   Logit
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       class |
      upper  |   .2668232   .0071432      .2530541    .2810595
     middle  |   .4624413   .0080522      .4466971    .4782608
      lower  |   .2707355   .0071761      .2568981    .2850324
--------------------------------------------------------------

. estimates store p_czechia

Overview table:

. estout m_spain p_spain m_czechia p_czechia, collabels(none) cell(b(fmt(3)))

----------------------------------------------------------------
                  m_spain      p_spain    m_czechia    p_czechia
----------------------------------------------------------------
1.class             0.090        0.260        0.086        0.267
2.class             0.011        0.390        0.003        0.462
3.class             0.086        0.351        0.065        0.271
----------------------------------------------------------------

We see that in Czechia local class linkage is relatively weak for the middle class, but at the same time this class is relatively large in Czechia. This explains, at least in part, the increase in social mobility in Czechia, relative to Spain, once the marginal distribution of classes is taken into account.
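
To see how local linkages and class sizes combine, note that weighting each local class linkage by the corresponding class proportion and summing over classes reproduces the country's overall M index reported earlier. The lines below are a sketch of that calculation for a do-file, using the unrounded estimates copied by hand from the mindex and proportion output above.

```stata
* M index = sum over classes of (class proportion x local class linkage)
* Spain (reported M index: .0578975)
display "Spain:   " %9.7f (.2596463*.0900216 + .3896034*.0112208 + .3507503*.0859645)
* Czechia (reported M index: .0421019)
display "Czechia: " %9.7f (.2668232*.0861695 + .4624413*.0032655 + .2707355*.0650071)

* Contribution of the middle class alone
display "Middle class, Spain:   " %9.7f (.3896034*.0112208)
display "Middle class, Czechia: " %9.7f (.4624413*.0032655)
```

The middle-class terms show the mechanism described above: in Czechia a larger share of respondents sits in the class with the weakest linkage, which pulls the overall index down.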

We now repeat the analysis from the perspective of father's class:

. mindex class (extended:i.fclass) (reduced:) (mindex:ibn.fclass, nocons) ///
>     if country==10, vce(boot, reps(100))
(running mindex on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

M-index analysis                                Number of obs     =      3,732
                                                Replications      =        100
                                                R-squared         =     0.0515
                                                Adj R-squared     =     0.0508
                                                Root MSE          =     0.3285

Reduced model:  mlogit class
Extended model: mlogit class i.fclass

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      fclass |
      upper  |   .1551994   .0179685     8.64   0.000     .1199818     .190417
     middle  |   .0216725   .0061038     3.55   0.000     .0097092    .0336357
      lower  |   .0386212    .004261     9.06   0.000     .0302698    .0469725
------------------------------------------------------------------------------

. estimates store m_spain

. proportion fclass if country==10

Proportion estimation             Number of obs   =      3,732

--------------------------------------------------------------
             |                                   Logit
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      fclass |
      upper  |   .2057878   .0066177      .1931157    .2190656
     middle  |    .278135   .0073347      .2639854    .2927414
      lower  |   .5160772   .0081804      .5000276    .5320936
--------------------------------------------------------------

. estimates store p_spain

. mindex class (extended:i.fclass) (reduced:) (mindex:ibn.fclass, nocons) ///
>     if country==6, vce(boot, reps(100))
(running mindex on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

M-index analysis                                Number of obs     =      3,834
                                                Replications      =        100
                                                R-squared         =     0.0604
                                                Adj R-squared     =     0.0597
                                                Root MSE          =     0.2793

Reduced model:  mlogit class
Extended model: mlogit class i.fclass

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      fclass |
      upper  |   .1659646   .0196865     8.43   0.000     .1273797    .2045495
     middle  |   .0017258   .0011645     1.48   0.138    -.0005565    .0040082
      lower  |   .0367671   .0050966     7.21   0.000     .0267779    .0467563
------------------------------------------------------------------------------

. estimates store m_czechia

. proportion fclass if country==6

Proportion estimation             Number of obs   =      3,834

--------------------------------------------------------------
             |                                   Logit
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      fclass |
      upper  |   .1627543   .0059617      .1514014    .1747832
     middle  |   .4478352    .008031      .4321472    .4636276
      lower  |   .3894105    .007875      .3740863    .4049564
--------------------------------------------------------------

. estimates store p_czechia

. estout m_spain p_spain m_czechia p_czechia, collabels(none) cell(b(fmt(3)))

----------------------------------------------------------------
                  m_spain      p_spain    m_czechia    p_czechia
----------------------------------------------------------------
1.fclass            0.155        0.206        0.166        0.163
2.fclass            0.022        0.278        0.002        0.448
3.fclass            0.039        0.516        0.037        0.389
----------------------------------------------------------------

The results are similar to those above, but even more pronounced.

10  Include multiple origin variables

Up to now we only looked at the relation between respondent's class and father's class, but there are also other origin variables in the dataset that could be taken into account. Evaluate how results change once you add some of these variables. Restrict the sample to Spain, UK, and Greece and compute the following variants:

Father's class only:

. mindex class i.fclass if inlist(country,10,13,14), over(country)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |   .0578975   .0054379    10.65   0.000     .0472394    .0685557
United Kingdom  |   .0304638    .003825     7.96   0.000      .022967    .0379605
        Greece  |   .0450047   .0045671     9.85   0.000     .0360534    .0539561
---------------------------------------------------------------------------------

. estimates store father

Father's class and ISEI:

. mindex class i.fclass fisei if inlist(country,10,13,14), over(country)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass fisei
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |    .071539   .0060155    11.89   0.000     .0597489    .0833291
United Kingdom  |   .0340836   .0040505     8.41   0.000     .0261448    .0420224
        Greece  |    .052306   .0050094    10.44   0.000     .0424878    .0621242
---------------------------------------------------------------------------------

. estimates store isei

Father's class and ISEI as well as mother's class:

. mindex class i.fclass fisei i.mclass if inlist(country,10,13,14), over(country)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass fisei i.mclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |    .073715   .0061037    12.08   0.000     .0617521     .085678
United Kingdom  |   .0389382   .0042966     9.06   0.000      .030517    .0473595
        Greece  |   .0657745   .0055685    11.81   0.000     .0548604    .0766886
---------------------------------------------------------------------------------

. estimates store parents

Father's class and ISEI, mother's class, as well as father's and mother's education:

. mindex class i.fclass fisei i.mclass i.feduc i.meduc if inlist(country,10,13,14), over(c
> ountry)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass fisei i.mclass i.feduc i.meduc
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |   .0917888   .0067806    13.54   0.000     .0784991    .1050785
United Kingdom  |   .0557218    .005024    11.09   0.000      .045875    .0655686
        Greece  |   .0809756   .0060717    13.34   0.000     .0690752    .0928759
---------------------------------------------------------------------------------

. estimates store educ

Graph displaying an overview of the results:

. coefplot father isei parents educ, legend(cols(1) pos(3)) xlabel(,grid)
[Graph: coefplot of the M index for Spain, United Kingdom, and Greece under the four models (father, isei, parents, educ)]

For Spain we see that father's ISEI is quite relevant, but additionally taking account of mother's class does not change the result much. For Greece the story is a bit different: here the additional effect of mother's class is stronger than the additional effect of ISEI. In all three countries, taking account of education gives an additional boost to the M index.
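
The comparison can be made concrete by computing the increment in the M index at each step. The following do-file lines are a simple illustration for Spain and Greece, with estimates copied by hand from the four tables above; reading the differences as "additional effects" assumes the variables are added in the order shown.

```stata
* Increment in the M index from each added set of origin variables
* Spain
display "Spain,  + father's ISEI:     " %9.7f (.071539  - .0578975)
display "Spain,  + mother's class:    " %9.7f (.073715  - .071539)
display "Spain,  + parents' education:" %9.7f (.0917888 - .073715)
* Greece
display "Greece, + father's ISEI:     " %9.7f (.052306  - .0450047)
display "Greece, + mother's class:    " %9.7f (.0657745 - .052306)
display "Greece, + parents' education:" %9.7f (.0809756 - .0657745)
```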

11  Direct and indirect effects

Part of the effect of father's class on respondent's class might be mediated by respondent's educational attainment. Try to evaluate, for each country, how the “total” effect of father's class divides into such an “indirect” effect through education and a “direct” effect that is unrelated to education.

For the total effect, simply compute the M index of father's class as above. For the direct effect, compute the M index while controlling for respondent's education (educ). The indirect effect is equal to the difference between the total effect and the direct effect.

Total effect:

. mindex class i.fclass, over(country)
Estimating outcome models ............ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
---------------------------------------------------------------------------------

. estimates store total

Direct effect:

. mindex class i.fclass, over(country) controls(i.educ)
Estimating outcome models ............ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class i.educ
Extended model: mlogit class i.fclass i.educ
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0205909   .0031607     6.51   0.000      .014396    .0267859
       Czechia  |   .0139231   .0026664     5.22   0.000     .0086969    .0191492
         Spain  |   .0143113   .0027804     5.15   0.000     .0088619    .0197607
United Kingdom  |   .0060199   .0017407     3.46   0.001     .0026083    .0094315
        Greece  |   .0078507   .0019812     3.96   0.000     .0039676    .0117338
      Portugal  |   .0105322   .0023074     4.56   0.000     .0060099    .0150546
---------------------------------------------------------------------------------

. estimates store direct

Overview graph:

. coefplot total direct, vertical recast(bar) citop cirecast(rcap) barwidth(0.7) nooffset
[Graph: bar chart of the total and direct M index with confidence intervals for Switzerland, Czechia, Spain, United Kingdom, Greece, and Portugal]

We see that the largest part of the social origin effect operates indirectly through educational attainment. The direct effect is less than half of the total effect in each country.
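
The indirect effect and the share of the total that is direct follow from simple arithmetic on the two tables. This is a minimal sketch for a do-file, with point estimates copied by hand from the output above; it gives no standard errors for the differences, which would require additional steps such as bootstrapping both quantities jointly.

```stata
* indirect = total - direct; direct share = direct / total
display "Switzerland:    indirect = " %9.7f (.0554875 - .0205909) "  direct share = " %5.3f (.0205909/.0554875)
display "Czechia:        indirect = " %9.7f (.0421019 - .0139231) "  direct share = " %5.3f (.0139231/.0421019)
display "Spain:          indirect = " %9.7f (.0578975 - .0143113) "  direct share = " %5.3f (.0143113/.0578975)
display "United Kingdom: indirect = " %9.7f (.0304638 - .0060199) "  direct share = " %5.3f (.0060199/.0304638)
display "Greece:         indirect = " %9.7f (.0450047 - .0078507) "  direct share = " %5.3f (.0078507/.0450047)
display "Portugal:       indirect = " %9.7f (.0711837 - .0105322) "  direct share = " %5.3f (.0105322/.0711837)
```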

Here is how we could estimate the direct effect using advanced syntax:

. mindex class (extended: i.(fclass educ)##i.country) ///
>              (reduced:  i.educ##i.country)          ///
>              (mindex:   ibn.country, noconstant), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0069
                                                Adj R-squared     =     0.0067
                                                Root MSE          =     0.1582

Reduced model:  mlogit class i.educ i.country i.educ#i.country
Extended model: mlogit class i.fclass i.educ i.country i.fclass#i.country i.educ#i.country

---------------------------------------------------------------------------------
                |               Robust
        M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |    .020591   .0031611     6.51   0.000     .0143951    .0267868
       Czechia  |    .013923   .0026666     5.22   0.000     .0086962    .0191498
         Spain  |   .0143113   .0027807     5.15   0.000     .0088611    .0197616
United Kingdom  |   .0060199   .0017408     3.46   0.001     .0026078    .0094321
        Greece  |   .0078507   .0019815     3.96   0.000     .0039669    .0117345
      Portugal  |   .0105323   .0023076     4.56   0.000     .0060091    .0150554
---------------------------------------------------------------------------------

12  Change in social mobility across birth cohorts

Now we shift focus from country differences to changes across birth cohorts. Do the following:

To model time trends, you will need to resort to advanced syntax.

Generate groups of birth cohorts:

. egen cohort = cut(birthyr), at(1933,1946,1956,1966,1976)

. tabulate cohort

     cohort |      Freq.     Percent        Cum.
------------+-----------------------------------
       1933 |      5,113       21.24       21.24
       1946 |      6,610       27.46       48.71
       1956 |      7,323       30.43       79.13
       1966 |      5,023       20.87      100.00
------------+-----------------------------------
      Total |     24,069      100.00

M index over cohort groups:

. mindex class i.fclass, over(cohort)
Estimating outcome models ........ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           cohort

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cohort |
       1933  |   .0673485   .0049418    13.63   0.000     .0576628    .0770342
       1946  |   .0619595   .0042429    14.60   0.000     .0536437    .0702754
       1956  |   .0614739    .003949    15.57   0.000      .053734    .0692138
       1966  |   .0547618   .0045785    11.96   0.000     .0457881    .0637356
------------------------------------------------------------------------------

. test 1933.cohort = 1946.cohort = 1956.cohort = 1966.cohort

 ( 1)  1933bn.cohort - 1946.cohort = 0
 ( 2)  1933bn.cohort - 1956.cohort = 0
 ( 3)  1933bn.cohort - 1966.cohort = 0

           chi2(  3) =    3.56
         Prob > chi2 =    0.3128

. coefplot, recast(connect) cirecast(rcap) vertical
[Graph: connected line plot of the M index with confidence intervals for the cohort groups 1933, 1946, 1956, and 1966]

The pattern of the M index across birth cohorts indicates that social mobility increased over time (decreasing M index), but the overall test for differences between cohort groups is not significant.
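
Besides the joint test, a single contrast between the youngest and the oldest cohort group may be of interest. The sketch below assumes that the mindex results from over(cohort) are still the active estimates and that lincom accepts the same coefficient names that the test command above uses; neither is confirmed beyond what the test output shows.

```stata
* Difference in the M index between the 1966 and the 1933 cohort group
* (assumes the mindex ..., over(cohort) results are the active estimates)
lincom 1966.cohort - 1933.cohort
```

A negative estimate would indicate a lower M index, and hence more mobility, in the youngest group; unlike the joint test, this contrast ignores the two middle groups.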

We now model a linear time trend by birth year. We first transform variable birthyr such that its zero point is 1955 (which is about the average birth year in the data). The transformation does not change the model; it only changes the interpretation of the constant. Without the transformation the constant would reflect the value of the M index in year 0; after the transformation it reflects the value of the M index in year 1955.

. generate birthyr55 = birthyr - 1955

. mindex class (extended: i.fclass##c.birthyr55) ///
>              (reduced:  birthyr55)             ///
>              (mindex:   birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3397

Reduced model:  mlogit class birthyr55
Extended model: mlogit class i.fclass birthyr55 i.fclass#c.birthyr55

------------------------------------------------------------------------------
             |               Robust
     M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   birthyr55 |  -.0004325   .0002095    -2.07   0.039    -.0008431    -.000022
       _cons |   .0612618   .0021979    27.87   0.000     .0569537    .0655698
------------------------------------------------------------------------------

. estimates store linear

The effect of birth year is negative (increasing social mobility over time) and marginally significant (p = 0.039).
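The size of the trend is easier to judge from predicted values. The following sketch plugs the estimates from the table above into the linear equation. It also computes the constant that would result without centering, assuming that the regression of the M index on birth year is linear, as the output suggests. The scalar names are ours.

```stata
* Estimates copied from the linear-trend output above
* b_cons: M index at birth year 1955; b_year: change per birth year
scalar b_cons = .0612618
scalar b_year = -.0004325

* Predicted M index for birth years 1935 and 1975
scalar m1935 = b_cons + b_year*(1935 - 1955)
scalar m1975 = b_cons + b_year*(1975 - 1955)
display "M index 1935: " %6.4f m1935
display "M index 1975: " %6.4f m1975

* Implied constant without centering (extrapolation to year 0)
scalar cons0 = b_cons - b_year*1955
display "constant at year 0: " %6.4f cons0
```

The predicted M index falls from about 0.070 for birth year 1935 to about 0.053 for birth year 1975. The implied constant for year 0 (about 0.907) is an extrapolation far outside the data, which is why centering at 1955 makes the constant easier to interpret.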

To check for possible nonlinearities in the trend, we can include birth year squared in the model (parabolic time trend):

. mindex class (extended: i.fclass##c.birthyr55##c.birthyr55) ///
>              (reduced:  c.birthyr55##c.birthyr55)             ///
>              (mindex:   c.birthyr55##c.birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3397

Reduced model:  mlogit class birthyr55 c.birthyr55#c.birthyr55
Extended model: mlogit class i.fclass birthyr55 i.fclass#c.birthyr55 c.birthyr55#c.birthyr
> 55 i.fclass#c.birthyr55#c.birthyr55

-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
              birthyr55 |  -.0004476   .0002074    -2.16   0.031    -.0008542    -.000041
                        |
c.birthyr55#c.birthyr55 |   -.000011   .0000205    -0.54   0.592    -.0000512    .0000292
                        |
                  _cons |   .0624239   .0031801    19.63   0.000     .0561906    .0686571
-----------------------------------------------------------------------------------------

. estimates store quadratic

There is no evidence for a nonlinear trend (the coefficient on c.birthyr55#c.birthyr55 is not significant, p = 0.592).

We now add mother's class to see whether this changes the story. The labor market participation of women changed over time, so mother's class may have become more relevant for respondent's class. The model including mother's class is as follows:

. mindex class (extended: i.(fclass mclass)##c.birthyr55##c.birthyr55) ///
>              (reduced:  c.birthyr55##c.birthyr55)             ///
>              (mindex:   c.birthyr55##c.birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3595

Reduced model:  mlogit class birthyr55 c.birthyr55#c.birthyr55
Extended model: mlogit class i.fclass i.mclass birthyr55 i.fclass#c.birthyr55 i.mclass#c.b
> irthyr55 c.birthyr55#c.birthyr55 i.fclass#c.birthyr55#c.birthyr55 i.mclass#c.birthyr55#c
> .birthyr55

-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
              birthyr55 |  -.0004656   .0002234    -2.08   0.037    -.0009035   -.0000277
                        |
c.birthyr55#c.birthyr55 |   .0000133   .0000224     0.60   0.552    -.0000305    .0000571
                        |
                  _cons |   .0675265   .0033553    20.13   0.000     .0609499    .0741032
-----------------------------------------------------------------------------------------

. estimates store withmothers

There is still no evidence for a nonlinear time trend, but note that the sign of the coefficient on c.birthyr55#c.birthyr55 changed from negative to positive. It might thus be interesting to isolate the effect that mother's class has in addition to father's class. To do so, we include father's class in the reduced model:

. mindex class (extended: i.(fclass mclass)##c.birthyr55##c.birthyr55) ///
>              (reduced:  i.fclass##c.birthyr55##c.birthyr55)          ///
>              (mindex:   c.birthyr55##c.birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0004
                                                Adj R-squared     =     0.0003
                                                Root MSE          =     0.1257

Reduced model:  mlogit class i.fclass birthyr55 i.fclass#c.birthyr55 c.birthyr55#c.birthyr
> 55 i.fclass#c.birthyr55#c.birthyr55
Extended model: mlogit class i.fclass i.mclass birthyr55 i.fclass#c.birthyr55 i.mclass#c.b
> irthyr55 c.birthyr55#c.birthyr55 i.fclass#c.birthyr55#c.birthyr55 i.mclass#c.birthyr55#c
> .birthyr55

-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
              birthyr55 |   -.000018   .0000886    -0.20   0.839    -.0001916    .0001557
                        |
c.birthyr55#c.birthyr55 |   .0000243   9.11e-06     2.67   0.008     6.46e-06    .0000422
                        |
                  _cons |   .0051027    .001107     4.61   0.000     .0029329    .0072724
-----------------------------------------------------------------------------------------

. estimates store motherspartial

We see that the additional effect of mother's class is indeed nonlinear (U-shaped time trend): the coefficient on c.birthyr55#c.birthyr55 is positive and significant (p = 0.008).
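The location of the minimum of the U shape follows from the two trend coefficients. This is a sketch using the rounded coefficients from the table above, so the result is approximate; the scalar names are ours.

```stata
* Coefficients copied from the output above
* b1: birthyr55; b2: c.birthyr55#c.birthyr55
scalar b1 = -.000018
scalar b2 =  .0000243

* The vertex of b0 + b1*x + b2*x^2 lies at x = -b1/(2*b2);
* adding 1955 undoes the centering of birth year
scalar turn = 1955 - b1/(2*b2)
display "turning point (birth year): " %6.1f turn
```

The turning point lies at about birth year 1955, close to the center of the data. The additional effect of mother's class is therefore smallest for respondents born around the mid-1950s and larger for both older and younger cohorts.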

Here is a graph that plots the effect shapes for the different non-linear models:

. foreach m in quadratic withmothers motherspartial {
  2.     estimates restore `m'
  3.     quietly margins, at(birthyr55=(-20(2)20)) post
  4.     estimates store shape_`m'
  5. }
(results quadratic are active now)
(results withmothers are active now)
(results motherspartial are active now)

. coefplot shape_*, at recast(line) lwidth(*2) legend(cols(1) pos(2)) ///
>     ciopts(recast(rarea) pstyle(ci) color(%50) lcolor(%0)) ///
>     plotlabels("Fathers only" "Fathers + mothers" "Mothers' partial effect") ///
>     xlabel(-20 "1935" -10 "1945" 0 "1955" 10 "1965" 20 "1975") ///
>     xtitle("birth cohort")
[Graph: predicted M index (vertical axis, 0 to .1) by birth cohort (1935 to 1975) for three models: Fathers only, Fathers + mothers, Mothers' partial effect]

13  Explaining country-differences in social mobility

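Open dataset ESSplus.dta and evaluate whether country differences in social mobility can be “explained” by welfare state characteristics (the welfare state characteristics have been taken from Kuitto 2011, Journal of European Social Policy 21:348–364). Do the following: estimate the effects of welfare state clusters (variable Welfare) on the M index, and estimate the effect of social service expenditures (variable SocialService) on the M index.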

In both cases, do separate analyses by gender (female) and control for age of the respondent (age) and the ESS round (year).

You will need to resort to advanced syntax to estimate these models. Consider computing robust standard errors that are clustered by country.

Effects of welfare state clusters:

. use ESSplus, clear

. fre Welfare

Welfare -- Welfare cluster (Kuitto 2011)
---------------------------------------------------------------------------------------
                                          |      Freq.    Percent      Valid       Cum.
------------------------------------------+--------------------------------------------
Valid   1 Cluster 1: Continental European |      29670      36.08      36.08      36.08
        2 Cluster 2: Southern European    |       9925      12.07      12.07      48.15
        3 Cluster 3: Nordic               |      14939      18.17      18.17      66.32
        4 Cluster 4: Eastern European     |      11746      14.28      14.28      80.60
        5 Cluster 5: Mixed/unclassified   |      15951      19.40      19.40     100.00
        Total                             |      82231     100.00     100.00           
---------------------------------------------------------------------------------------

. mindex class (extended: i.fclass##i.country age i.year)   ///
>              (reduced:  i.country           age i.year)   ///
>              (mindex:   i.Welfare           age i.year)   ///
>              if female==0, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     39,599
                                                R-squared         =     0.0004
                                                Adj R-squared     =     0.0002
                                                Root MSE          =     0.3100

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                                          (Std. Err. adjusted for 23 clusters in country)
-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
                Welfare |
Cluster 2: Southern ..  |    .009026   .0117008     0.77   0.449    -.0152399    .0332918
     Cluster 3: Nordic  |  -.0035637   .0079001    -0.45   0.656    -.0199475    .0128202
Cluster 4: Eastern E..  |   .0047564   .0122648     0.39   0.702    -.0206791     .030192
Cluster 5: Mixed/unc..  |   .0142508   .0076794     1.86   0.077    -.0016753     .030177
                        |
                    age |   .0000846   .0001553     0.54   0.591    -.0002375    .0004068
                        |
                   year |
                  2004  |   .0024888   .0049655     0.50   0.621     -.007809    .0127866
                  2006  |   .0039574   .0032565     1.22   0.237    -.0027961    .0107108
                  2008  |  -.0007534   .0044513    -0.17   0.867    -.0099849    .0084781
                  2010  |  -.0013209   .0052614    -0.25   0.804    -.0122324    .0095907
                        |
                  _cons |   .0411939   .0089189     4.62   0.000     .0226972    .0596906
-----------------------------------------------------------------------------------------

. testparm i.Welfare

 ( 1)  2.Welfare = 0
 ( 2)  3.Welfare = 0
 ( 3)  4.Welfare = 0
 ( 4)  5.Welfare = 0

       F(  4,    22) =    1.36
            Prob > F =    0.2808

. mindex class (extended: i.fclass##i.country age i.year)   ///
>              (reduced:  i.country           age i.year)   ///
>              (mindex:   i.Welfare           age i.year)   ///
>              if female==1, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     42,632
                                                R-squared         =     0.0004
                                                Adj R-squared     =     0.0002
                                                Root MSE          =     0.2759

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                                          (Std. Err. adjusted for 23 clusters in country)
-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
                Welfare |
Cluster 2: Southern ..  |   .0108221   .0073571     1.47   0.155    -.0044354    .0260797
     Cluster 3: Nordic  |  -.0068114   .0033979    -2.00   0.057    -.0138581    .0002354
Cluster 4: Eastern E..  |   .0007086   .0070367     0.10   0.921    -.0138846    .0153018
Cluster 5: Mixed/unc..  |   .0030457   .0055945     0.54   0.592    -.0085566    .0146479
                        |
                    age |    .000154   .0001716     0.90   0.379     -.000202    .0005099
                        |
                   year |
                  2004  |   .0028996   .0039593     0.73   0.472    -.0053115    .0111108
                  2006  |   .0031251   .0055261     0.57   0.577    -.0083353    .0145856
                  2008  |   .0028193   .0041614     0.68   0.505     -.005811    .0114496
                  2010  |  -.0009233   .0049539    -0.19   0.854    -.0111971    .0093505
                        |
                  _cons |   .0299029   .0085008     3.52   0.002     .0122733    .0475324
-----------------------------------------------------------------------------------------

. testparm i.Welfare

 ( 1)  2.Welfare = 0
 ( 2)  3.Welfare = 0
 ( 3)  4.Welfare = 0
 ( 4)  5.Welfare = 0

       F(  4,    22) =    3.16
            Prob > F =    0.0342

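The M index is lowest in the Nordic cluster, followed by the Continental European cluster (the reference category); the remaining clusters have higher values. For women, the overall test for differences among clusters is significant (F(4, 22) = 3.16, p = 0.034); for men, it is not (F(4, 22) = 1.36, p = 0.281).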
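The overall tests above are F tests with 4 numerator and 22 denominator degrees of freedom. Four cluster indicators are tested, and with cluster-robust standard errors the denominator degrees of freedom equal the number of clusters (23 countries) minus one. As a sketch, the reported p-values can be reproduced from the rounded F statistics with Stata's Ftail() function; the scalar names are ours.

```stata
* F statistics copied from the two testparm outputs above
scalar F_men   = 1.36
scalar F_women = 3.16

* Upper-tail probabilities of the F distribution with 4 and 22 degrees of freedom
scalar p_men   = Ftail(4, 22, F_men)
scalar p_women = Ftail(4, 22, F_women)
display "p (men):   " %6.4f p_men
display "p (women): " %6.4f p_women
```

Because the F statistics are rounded to two decimals, the results may differ slightly from the p-values in the output in the last digit.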

Effect of social service expenditures:

. mindex class (extended: i.fclass##i.country age i.year)   ///
>              (reduced:  i.country           age i.year)   ///
>              (mindex:   SocialService       age i.year)   ///
>              if female==0, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     39,599
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3100

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                                (Std. Err. adjusted for 23 clusters in country)
-------------------------------------------------------------------------------
              |               Robust
      M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
SocialService |  -.0029345   .0023173    -1.27   0.219    -.0077403    .0018713
          age |   .0000846    .000153     0.55   0.586    -.0002328    .0004019
              |
         year |
        2004  |   .0025736   .0051553     0.50   0.623    -.0081179    .0132651
        2006  |   .0027581    .003576     0.77   0.449     -.004658    .0101743
        2008  |  -.0008205   .0048869    -0.17   0.868    -.0109554    .0093144
        2010  |  -.0012313   .0056003    -0.22   0.828    -.0128457    .0103831
              |
        _cons |   .0517761   .0089275     5.80   0.000     .0332616    .0702907
-------------------------------------------------------------------------------

. mindex class (extended: i.fclass##i.country age i.year)   ///
>              (reduced:  i.country           age i.year)   ///
>              (mindex:   SocialService       age i.year)   ///
>              if female==1, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     42,632
                                                R-squared         =     0.0003
                                                Adj R-squared     =     0.0002
                                                Root MSE          =     0.2759

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                                (Std. Err. adjusted for 23 clusters in country)
-------------------------------------------------------------------------------
              |               Robust
      M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
SocialService |  -.0031692   .0010928    -2.90   0.008    -.0054356   -.0009028
          age |   .0001432   .0001689     0.85   0.406    -.0002071    .0004935
              |
         year |
        2004  |   .0022577   .0039465     0.57   0.573    -.0059267    .0104422
        2006  |   .0024082   .0054851     0.44   0.665    -.0089671    .0137835
        2008  |   .0023257   .0044633     0.52   0.608    -.0069306    .0115819
        2010  |  -.0013749   .0049984    -0.28   0.786    -.0117409    .0089911
              |
        _cons |   .0386031   .0088161     4.38   0.000     .0203196    .0568866
-------------------------------------------------------------------------------

Higher social service expenditures go along with higher social mobility (lower M index) for both men and women. For women, the effect is statistically significant (p = 0.008); for men, it is not (p = 0.219).
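The test statistic and confidence interval for the women's SocialService coefficient can be reproduced by hand. This sketch assumes a t distribution with 22 degrees of freedom (23 country clusters minus one), as implied by the cluster-robust output above. It uses the rounded coefficient and standard error from the table; the scalar names are ours.

```stata
* Coefficient and robust standard error copied from the output for women
scalar coef_ss = -.0031692
scalar se_ss   =  .0010928

* Critical value of the t distribution with 22 degrees of freedom (two-sided 95%)
scalar tcrit = invttail(22, .025)

* t statistic, two-sided p-value, and 95% confidence interval
scalar t_ss  = coef_ss / se_ss
scalar p_ss  = 2*ttail(22, abs(t_ss))
scalar ci_lo = coef_ss - tcrit*se_ss
scalar ci_hi = coef_ss + tcrit*se_ss
display "t = " %5.2f t_ss "   p = " %5.3f p_ss
display "95% CI: " %10.7f ci_lo " to " %10.7f ci_hi
```

The results should match the table up to rounding. With only 23 clusters, the critical value (about 2.07) is noticeably larger than the 1.96 of the normal distribution, so the interval is wider than a normal approximation would suggest.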