Workshop at Universidad de Sevilla, November 16, 2019
Instructors: Ben Jann and Simon Seiler, University of Bern
mindex
Open dataset ESS6.dta and compute a two-way table of respondent's class (class) by father's class (fclass) using the tabulate command. Compute the M index manually using the frequencies reported in the table.
Open data:
. use ESS6.dta, clear

. describe

Contains data from ESS6.dta
  obs:        24,069
 vars:            14                          13 Nov 2019 18:30
------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------------------
country         byte    %18.0g     country    country
year            int     %10.0g                survey year (ESS Round)
female          byte    %9.0g      female     gender (1 = female)
age             byte    %8.0g                 age
birthyr         int     %10.0g                birth year (survey year - age)
educ            byte    %25.0g     educ       respondent's education
meduc           byte    %44.0g     peduc      mother's education
feduc           byte    %44.0g     peduc      father's education
class           byte    %9.0g      class      respondent's class
mclass          byte    %9.0g      class      mother's class
fclass          byte    %9.0g      class      father's class
mhome           byte    %9.0g                 mother was homemaker
fisei           byte    %8.0g                 father's ISEI
misei           byte    %8.0g                 mother's ISEI
------------------------------------------------------------------------------------------
Sorted by: country
Step 1: Calculate p(y_k,x_j), p(y_k|x_j), and p(y_k) of respondent's class by father's class using the tabulate command.
. tabulate class fclass, column cell nofreq

+-------------------+
| Key               |
|-------------------|
| column percentage |
|  cell percentage  |
+-------------------+

respondent |          father's class
  's class |     upper     middle      lower |     Total
-----------+---------------------------------+----------
     upper |     56.88      30.76      19.44 |     31.92
           |     13.62       9.54       8.76 |     31.92
-----------+---------------------------------+----------
    middle |     32.28      46.11      40.23 |     40.15
           |      7.73      14.29      18.12 |     40.15
-----------+---------------------------------+----------
     lower |     10.84      23.12      40.34 |     27.94
           |      2.60       7.17      18.17 |     27.94
-----------+---------------------------------+----------
     Total |    100.00     100.00     100.00 |    100.00
           |     23.95      30.99      45.05 |    100.00
Step 2: Use the results to compute M = sum_j sum_k p(y_k,x_j) * (ln p(y_k|x_j) - ln p(y_k)).
. display ///
>     /// row 1 (respondent: upper class)
>       0.1362 * (ln(0.5688) - ln(0.3192)) /// col 1 (father: upper class)
>     + 0.0954 * (ln(0.3076) - ln(0.3192)) /// col 2 (father: middle class)
>     + 0.0876 * (ln(0.1944) - ln(0.3192)) /// col 3 (father: lower class)
>     /// row 2 (respondent: middle class)
>     + 0.0773 * (ln(0.3228) - ln(0.4015)) /// col 1 (father: upper class)
>     + 0.1429 * (ln(0.4611) - ln(0.4015)) /// col 2 (father: middle class)
>     + 0.1812 * (ln(0.4023) - ln(0.4015)) /// col 3 (father: lower class)
>     /// row 3 (respondent: lower class)
>     + 0.0260 * (ln(0.1084) - ln(0.2794)) /// col 1 (father: upper class)
>     + 0.0717 * (ln(0.2312) - ln(0.2794)) /// col 2 (father: middle class)
>     + 0.1817 * (ln(0.4034) - ln(0.2794)) //  col 3 (father: lower class)
.06352723
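The manual result relies on percentages rounded to two decimals. As a cross-check, the same formula can be applied to the unrounded cell counts. The following is a minimal sketch, assuming ESS6.dta is in memory; the matrix name F and the scalar name M_exact are illustrative choices, not part of the original solution.

```stata
* Sketch: M index from exact cell counts instead of rounded percentages
* (F and M_exact are illustrative names; assumes no empty cells in the table)
quietly tabulate class fclass, matcell(F)    // F = matrix of cell counts
mata:
    F  = st_matrix("F")
    P  = F :/ sum(F)                         // joint proportions p(y_k, x_j)
    py = rowsum(P)                           // marginal p(y_k), one row per respondent class
    px = colsum(P)                           // marginal p(x_j), one column per father's class
    M  = sum(P :* (ln(P :/ px) :- ln(py)))   // sum over cells of p(y,x) * (ln p(y|x) - ln p(y))
    st_numscalar("M_exact", M)               // hand the result back to Stata
end
display scalar(M_exact)
```

With exact proportions the result should agree with the model-based value of .0636896 reported further below, which suggests that the small gap to .0635 comes from rounding the table entries.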
Compute the M index using predictions from multinomial logit models (mlogit) and confirm that the result is the same as above (up to the rounding of the table percentages used in the manual computation).
To obtain Pr(Y_i), you need to estimate a reduced model, which is simply a multinomial logit without predictors. To obtain Pr(Y_i|X_i) you need an extended model in which you include father's class as a categorical predictor. After each model, you can use predict to obtain predictions of the probabilities. Things are slightly complicated since you have to make sure that for each observation you predict the probability of the class that has actually been observed for this observation. The easiest approach is to generate three variables, one for each class, and then select the appropriate probability for each observation.
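The reason the prediction-based approach reproduces the table-based formula can be written out in a few lines. In this sketch, n_{kj} (the number of observations in cell (k, j)) and N (the sample size) are notation introduced here for illustration, and the models are assumed to reproduce the observed proportions exactly, which holds when father's class enters as a categorical predictor:

```latex
\begin{align*}
M &= \sum_j \sum_k p(y_k, x_j)\,\bigl[\ln p(y_k \mid x_j) - \ln p(y_k)\bigr] \\
  &= \sum_j \sum_k \frac{n_{kj}}{N}\,\bigl[\ln p(y_k \mid x_j) - \ln p(y_k)\bigr] \\
  &= \frac{1}{N} \sum_{i=1}^{N} \bigl[\ln \Pr(Y_i \mid X_i) - \ln \Pr(Y_i)\bigr]
\end{align*}
```

Each cell contributes its log difference once per observation in that cell, so the weighted sum over cells equals the sample mean of the observation-level log differences.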
Step 1: Estimate reduced model and generate predictions
. mlogit class

Iteration 0:   log likelihood = -26166.564
Iteration 1:   log likelihood = -26166.564

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -26166.564                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.2294242    .015286   -15.01   0.000    -.2593842   -.1994642
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.3626209   .0158811   -22.83   0.000    -.3937473   -.3314946
------------------------------------------------------------------------------

. predict p0 p0_2 p0_3, pr

. replace p0 = p0_2 if class == 2
(9,663 real changes made)

. replace p0 = p0_3 if class == 3
(6,724 real changes made)

. drop p0_2 p0_3
Step 2: Estimate extended model and generate predictions
. mlogit class i.fclass

Iteration 0:   log likelihood = -26166.564
Iteration 1:   log likelihood = -24674.085
Iteration 2:   log likelihood = -24633.704
Iteration 3:   log likelihood =  -24633.62
Iteration 4:   log likelihood =  -24633.62

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(4)        =    3065.89
                                                Prob > chi2       =     0.0000
Log likelihood =  -24633.62                     Pseudo R2         =     0.0586

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.9711631   .0396074   -24.52   0.000    -1.048792   -.8935341
      lower  |  -1.293616   .0393186   -32.90   0.000    -1.370679   -1.216553
             |
       _cons |   .5664245   .0290227    19.52   0.000     .5095411    .6233079
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .4008732    .054843     7.31   0.000     .2933829    .5083635
      lower  |   1.093865   .0509433    21.47   0.000     .9940178    1.193712
             |
       _cons |  -1.091118   .0462314   -23.60   0.000     -1.18173   -1.000506
------------------------------------------------------------------------------

. predict p1 p1_2 p1_3, pr

. replace p1 = p1_2 if class == 2
(9,663 real changes made)

. replace p1 = p1_3 if class == 3
(6,724 real changes made)

. drop p1_2 p1_3
Step 3: Generate observation-level M values from predictions; the mean of these values is the M index
. generate double m = ln(p1) - ln(p0)

. drop p0 p1

. mean m

Mean estimation                   Number of obs   =     24,069

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           m |   .0636896   .0022315      .0593157    .0680635
--------------------------------------------------------------

. drop m
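The three steps can also be condensed by looping over the outcome categories instead of predicting three variables and overwriting the first. The following is a sketch only: it assumes that class is coded 1, 2, 3 (consistent with the replace commands above) and uses predict's outcome() option after mlogit; ptmp is a scratch variable name chosen for illustration.

```stata
* Sketch: observation-level M values with a loop over outcome categories
* Assumes class takes the values 1, 2, 3; ptmp is a scratch variable
generate double p0 = .
generate double p1 = .

quietly mlogit class                      // reduced model
forvalues k = 1/3 {
    quietly predict double ptmp, pr outcome(`k')
    quietly replace p0 = ptmp if class == `k'
    drop ptmp
}

quietly mlogit class i.fclass             // extended model
forvalues k = 1/3 {
    quietly predict double ptmp, pr outcome(`k')
    quietly replace p1 = ptmp if class == `k'
    drop ptmp
}

generate double m = ln(p1) - ln(p0)
mean m                                    // the mean of m is the M index
drop p0 p1 m
```

The loop form generalizes more easily to outcomes with more than three categories, at the cost of one predict call per category.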
Compute the M index by country (country) using predictions from multinomial logit models.
You can do this either (1) by repeating the estimation process individually for each country or (2) by estimating joint models including country as a categorical predictor (and interactions with father's class in the extended model). The second approach is easier (less code).
First have a look at the country variable, to get an overview (if not already installed, first type ssc install fre):
. fre country

country -- country
-----------------------------------------------------------------------
                          |      Freq.    Percent      Valid       Cum.
--------------------------+--------------------------------------------
Valid   4  Switzerland    |       4331      17.99      17.99      17.99
        6  Czechia        |       3834      15.93      15.93      33.92
        10 Spain          |       3732      15.51      15.51      49.43
        13 United Kingdom |       3998      16.61      16.61      66.04
        14 Greece         |       4088      16.98      16.98      83.02
        26 Portugal       |       4086      16.98      16.98     100.00
        Total             |      24069     100.00     100.00
-----------------------------------------------------------------------
Approach 1: repeat the computation of observation-level M values for each country using a loop; in each country a reduced model and an extended model are estimated
. generate double m = .
(24,069 missing values generated)

. levelsof country
4 6 10 13 14 26

. foreach cntry in `r(levels)' {
  2.     mlogit class if country==`cntry'
  3.     predict p0 p0_2 p0_3, pr
  4.     replace p0 = p0_2 if class == 2 & country==`cntry'
  5.     replace p0 = p0_3 if class == 3 & country==`cntry'
  6.     drop p0_2 p0_3
  7.     mlogit class i.fclass if country==`cntry'
  8.     predict p1 p1_2 p1_3, pr
  9.     replace p1 = p1_2 if class == 2 & country==`cntry'
 10.     replace p1 = p1_3 if class == 3 & country==`cntry'
 11.     drop p1_2 p1_3
 12.     replace m = ln(p1) - ln(p0) if country==`cntry'
 13.     drop p0 p1
 14. }

Iteration 0:   log likelihood = -4389.3046
Iteration 1:   log likelihood = -4389.3046

Multinomial logistic regression                 Number of obs     =      4,331
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4389.3046                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
       _cons |  -.3144104   .0336063    -9.36   0.000    -.3802777   -.2485432
-------------+----------------------------------------------------------------
lower        |
       _cons |  -1.096232   .0436254   -25.13   0.000    -1.181736   -1.010728
------------------------------------------------------------------------------
(1,532 real changes made)
(701 real changes made)

Iteration 0:   log likelihood = -4389.3046
Iteration 1:   log likelihood = -4159.3234
Iteration 2:   log likelihood = -4149.0484
Iteration 3:   log likelihood = -4148.9884
Iteration 4:   log likelihood = -4148.9884

Multinomial logistic regression                 Number of obs     =      4,331
                                                LR chi2(4)        =     480.63
                                                Prob > chi2       =     0.0000
Log likelihood = -4148.9884                     Pseudo R2         =     0.0548

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
      fclass |
     middle  |   .7586456   .0817074     9.28   0.000     .5985021    .9187892
      lower  |   1.046058   .0850619    12.30   0.000       .87934    1.212776
             |
       _cons |  -.8433046   .0554011   -15.22   0.000    -.9518888   -.7347204
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   1.228821   .1323972     9.28   0.000     .9693271    1.488315
      lower  |    2.27427     .12365    18.39   0.000     2.031921     2.51662
             |
       _cons |  -2.343099   .1026584   -22.82   0.000    -2.544306   -2.141893
------------------------------------------------------------------------------
(1,532 real changes made)
(701 real changes made)
(4,331 real changes made)

Iteration 0:   log likelihood = -4075.2209
Iteration 1:   log likelihood = -4075.2209

Multinomial logistic regression                 Number of obs     =      3,834
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4075.2209                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.5499335   .0392624   -14.01   0.000    -.6268864   -.4729807
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.5353772   .0390821   -13.70   0.000    -.6119767   -.4587778
------------------------------------------------------------------------------
(1,773 real changes made)
(1,038 real changes made)

Iteration 0:   log likelihood = -4075.2209
Iteration 1:   log likelihood = -3920.0791
Iteration 2:   log likelihood = -3913.8142
Iteration 3:   log likelihood = -3913.8025
Iteration 4:   log likelihood = -3913.8025

Multinomial logistic regression                 Number of obs     =      3,834
                                                LR chi2(4)        =     322.84
                                                Prob > chi2       =     0.0000
Log likelihood = -3913.8025                     Pseudo R2         =     0.0396

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.9135061   .1034584    -8.83   0.000    -1.116281   -.7107313
      lower  |  -1.317428   .1128637   -11.67   0.000    -1.538637    -1.09622
             |
       _cons |   .2949776   .0854363     3.45   0.001     .1275255    .4624297
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .6398091   .1528019     4.19   0.000     .3403229    .9392953
      lower  |   1.079944   .1518972     7.11   0.000     .7822308    1.377657
             |
       _cons |   -1.31758   .1407448    -9.36   0.000    -1.593435   -1.041726
------------------------------------------------------------------------------
(1,773 real changes made)
(1,038 real changes made)
(3,834 real changes made)

Iteration 0:   log likelihood = -4048.6257
Iteration 1:   log likelihood = -4048.6257

Multinomial logistic regression                 Number of obs     =      3,732
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4048.6257                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |   -.405809   .0414699    -9.79   0.000    -.4870885   -.3245296
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.1050549   .0381012    -2.76   0.006    -.1797318    -.030378
------------------------------------------------------------------------------
(1,454 real changes made)
(1,309 real changes made)

Iteration 0:   log likelihood = -4048.6257
Iteration 1:   log likelihood = -3838.9752
Iteration 2:   log likelihood = -3832.5637
Iteration 3:   log likelihood = -3832.5522
Iteration 4:   log likelihood = -3832.5522

Multinomial logistic regression                 Number of obs     =      3,732
                                                LR chi2(4)        =     432.15
                                                Prob > chi2       =     0.0000
Log likelihood = -3832.5522                     Pseudo R2         =     0.0534

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.9123146   .1096721    -8.32   0.000    -1.127268   -.6973613
      lower  |  -.9198791   .1030637    -8.93   0.000     -1.12188   -.7178779
             |
       _cons |   .2505425   .0785807     3.19   0.001     .0965272    .4045578
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .3678975   .1347786     2.73   0.006     .1037363    .6320587
      lower  |   1.290743   .1231006    10.49   0.000      1.04947    1.532016
             |
       _cons |  -.9624801   .1120854    -8.59   0.000    -1.182163   -.7427968
------------------------------------------------------------------------------
(1,454 real changes made)
(1,309 real changes made)
(3,732 real changes made)

Iteration 0:   log likelihood = -4216.7706
Iteration 1:   log likelihood = -4216.7706

Multinomial logistic regression                 Number of obs     =      3,998
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4216.7706                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
       _cons |  -.3448084   .0362927    -9.50   0.000    -.4159407    -.273676
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.7441243   .0411774   -18.07   0.000    -.8248305   -.6634182
------------------------------------------------------------------------------
(1,297 real changes made)
(870 real changes made)

Iteration 0:   log likelihood = -4216.7706
Iteration 1:   log likelihood = -4096.0324
Iteration 2:   log likelihood = -4094.9771
Iteration 3:   log likelihood = -4094.9765
Iteration 4:   log likelihood = -4094.9765

Multinomial logistic regression                 Number of obs     =      3,998
                                                LR chi2(4)        =     243.59
                                                Prob > chi2       =     0.0000
Log likelihood = -4094.9765                     Pseudo R2         =     0.0289

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |  (base outcome)
-------------+----------------------------------------------------------------
middle       |
      fclass |
     middle  |    .709111   .0901932     7.86   0.000     .5323357    .8858864
      lower  |   .8141949   .0881766     9.23   0.000      .641372    .9870179
             |
       _cons |   -.783219   .0578729   -13.53   0.000    -.8966478   -.6697902
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   1.074415   .1088361     9.87   0.000     .8611006     1.28773
      lower  |   1.396194   .1035503    13.48   0.000     1.193239    1.599149
             |
       _cons |  -1.530689   .0768426   -19.92   0.000    -1.681298    -1.38008
------------------------------------------------------------------------------
(1,297 real changes made)
(870 real changes made)
(3,998 real changes made)

Iteration 0:   log likelihood = -4331.2848
Iteration 1:   log likelihood = -4331.2848

Multinomial logistic regression                 Number of obs     =      4,088
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4331.2848                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.7075519   .0408182   -17.33   0.000    -.7875541   -.6275496
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.2800108   .0357471    -7.83   0.000    -.3500739   -.2099477
------------------------------------------------------------------------------
(1,818 real changes made)
(1,374 real changes made)

Iteration 0:   log likelihood = -4331.2848
Iteration 1:   log likelihood = -4154.6206
Iteration 2:   log likelihood = -4147.3138
Iteration 3:   log likelihood = -4147.3054
Iteration 4:   log likelihood = -4147.3054

Multinomial logistic regression                 Number of obs     =      4,088
                                                LR chi2(4)        =     367.96
                                                Prob > chi2       =     0.0000
Log likelihood = -4147.3054                     Pseudo R2         =     0.0425

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -.6794951   .1164341    -5.84   0.000    -.9077018   -.4512885
      lower  |   -.993138   .1068786    -9.29   0.000    -1.202616   -.7836598
             |
       _cons |  -.0117418   .0884763    -0.13   0.894    -.1851522    .1616686
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |   .3837685   .1521116     2.52   0.012     .0856353    .6819017
      lower  |   1.232826   .1370353     9.00   0.000     .9642417     1.50141
             |
       _cons |  -1.205271   .1299156    -9.28   0.000    -1.459901   -.9506408
------------------------------------------------------------------------------
(1,818 real changes made)
(1,374 real changes made)
(4,088 real changes made)

Iteration 0:   log likelihood = -4321.9888
Iteration 1:   log likelihood = -4321.9888

Multinomial logistic regression                 Number of obs     =      4,086
                                                LR chi2(0)        =       0.00
                                                Prob > chi2       =          .
Log likelihood = -4321.9888                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
       _cons |  -.7266826   .0414131   -17.55   0.000    -.8078507   -.6455145
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
       _cons |  -.2225847   .0354584    -6.28   0.000    -.2920819   -.1530876
------------------------------------------------------------------------------
(1,789 real changes made)
(1,432 real changes made)

Iteration 0:   log likelihood = -4321.9888
Iteration 1:   log likelihood =  -4053.845
Iteration 2:   log likelihood = -4031.6302
Iteration 3:   log likelihood = -4031.1323
Iteration 4:   log likelihood = -4031.1322

Multinomial logistic regression                 Number of obs     =      4,086
                                                LR chi2(4)        =     581.71
                                                Prob > chi2       =     0.0000
Log likelihood = -4031.1322                     Pseudo R2         =     0.0673

------------------------------------------------------------------------------
       class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
upper        |
      fclass |
     middle  |  -1.259648   .1175696   -10.71   0.000     -1.49008   -1.029215
      lower  |  -1.860239   .1180938   -15.75   0.000    -2.091699   -1.628779
             |
       _cons |   .5299594   .0950068     5.58   0.000     .3437494    .7161693
-------------+----------------------------------------------------------------
middle       |  (base outcome)
-------------+----------------------------------------------------------------
lower        |
      fclass |
     middle  |    .338532   .1610124     2.10   0.036     .0229535    .6541105
      lower  |   1.080183   .1526311     7.08   0.000     .7810311    1.379334
             |
       _cons |  -1.011601   .1459686    -6.93   0.000    -1.297694   -.7255082
------------------------------------------------------------------------------
(1,789 real changes made)
(1,432 real changes made)
(4,086 real changes made)

. mean m, over(country)

Mean estimation                   Number of obs   =     24,069

-----------------------------------------------------------------
                |       Mean   Std. Err.     [95% Conf. Interval]
----------------+------------------------------------------------
    c.m@country |
   Switzerland  |   .0554875   .0049016        .04588    .0650949
       Czechia  |   .0421018   .0046022      .0330813    .0511224
         Spain  |   .0578975   .0054384      .0472378    .0685572
United Kingdom  |   .0304638   .0038253       .022966    .0379615
        Greece  |   .0450048   .0045675      .0360522    .0539573
      Portugal  |   .0711837   .0058591      .0596995    .0826679
-----------------------------------------------------------------

. drop m
Approach 2: compute observation-level M values in one go based on joint models across countries (the reduced model contains country dummies; the extended model additionally contains interactions between country dummies and father's class; this is formally equivalent to estimating a separate model in each country)
. mlogit class i.country

Iteration 0:   log likelihood = -26166.564
Iteration 1:   log likelihood =  -25392.18
Iteration 2:   log likelihood = -25383.197
Iteration 3:   log likelihood = -25383.195

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(10)       =    1566.74
                                                Prob > chi2       =     0.0000
Log likelihood = -25383.195                     Pseudo R2         =     0.0299

---------------------------------------------------------------------------------
          class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
upper           |
        country |
       Czechia  |  -.8643438   .0516809   -16.72   0.000    -.9656366    -.763051
         Spain  |  -.7202193   .0533773   -13.49   0.000    -.8248369   -.6156017
United Kingdom  |   .0303981   .0494626     0.61   0.539    -.0665467     .127343
        Greece  |  -1.021962   .0528726   -19.33   0.000     -1.12559   -.9183336
      Portugal  |  -1.041093   .0533332   -19.52   0.000    -1.145624   -.9365615
                |
          _cons |   .3144103   .0336063     9.36   0.000      .248543    .3802775
----------------+----------------------------------------------------------------
middle          |  (base outcome)
----------------+----------------------------------------------------------------
lower           |
        country |
       Czechia  |   .2464447   .0600557     4.10   0.000     .1287378    .3641516
         Spain  |   .6767671    .059422    11.39   0.000     .5603021    .7932321
United Kingdom  |    .382506   .0632433     6.05   0.000     .2585515    .5064605
        Greece  |   .5018112   .0579408     8.66   0.000     .3882494     .615373
      Portugal  |   .5592373   .0577631     9.68   0.000     .4460238    .6724508
                |
          _cons |  -.7818219   .0455991   -17.15   0.000    -.8711945   -.6924494
---------------------------------------------------------------------------------

. predict p0 p0_2 p0_3, pr

. replace p0 = p0_2 if class == 2
(9,663 real changes made)

. replace p0 = p0_3 if class == 3
(6,724 real changes made)

. drop p0_2 p0_3

. mlogit class i.fclass##i.country

Iteration 0:   log likelihood = -26166.564
Iteration 1:   log likelihood = -24240.885
Iteration 2:   log likelihood = -24169.315
Iteration 3:   log likelihood = -24168.757
Iteration 4:   log likelihood = -24168.757

Multinomial logistic regression                 Number of obs     =     24,069
                                                LR chi2(34)       =    3995.61
                                                Prob > chi2       =     0.0000
Log likelihood = -24168.757                     Pseudo R2         =     0.0763

----------------------------------------------------------------------------------------
                 class |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
upper                  |
                fclass |
               middle  |  -.7586456   .0817074    -9.28   0.000    -.9187892   -.5985021
                lower  |  -1.046058   .0850619   -12.30   0.000    -1.212776     -.87934
                       |
               country |
              Czechia  |   -.548327   .1018266    -5.38   0.000    -.7479034   -.3487506
                Spain  |  -.5927621   .0961468    -6.17   0.000    -.7812064   -.4043178
       United Kingdom  |  -.0600856   .0801159    -0.75   0.453    -.2171099    .0969386
               Greece  |  -.8550464   .1043903    -8.19   0.000    -1.059648   -.6504451
             Portugal  |   -.313345   .1099799    -2.85   0.004    -.5289017   -.0977883
                       |
        fclass#country |
       middle#Czechia  |  -.1548605   .1318323    -1.17   0.240     -.413247     .103526
         middle#Spain  |   -.153669   .1367628    -1.12   0.261    -.4217191    .1143812
middle#United Kingdom  |   .0495346   .1217001     0.41   0.684    -.1889932    .2880624
        middle#Greece  |   .0791505   .1422427     0.56   0.578    -.1996402    .3579411
      middle#Portugal  |  -.5010021   .1431737    -3.50   0.000    -.7816174   -.2203868
        lower#Czechia  |  -.2713703   .1413285    -1.92   0.055     -.548369    .0056284
          lower#Spain  |   .1261792   .1336325     0.94   0.345    -.1357357    .3880941
 lower#United Kingdom  |   .2318633   .1225179     1.89   0.058    -.0082674    .4719939
         lower#Greece  |   .0529202   .1365963     0.39   0.698    -.2148036    .3206441
       lower#Portugal  |   -.814181   .1455393    -5.59   0.000    -1.099433   -.5289293
                       |
                 _cons |   .8433046   .0554011    15.22   0.000     .7347204    .9518888
-----------------------+----------------------------------------------------------------
middle                 |  (base outcome)
-----------------------+----------------------------------------------------------------
lower                  |
                fclass |
               middle  |   .4701738   .1374924     3.42   0.001     .2006937    .7396539
                lower  |   1.228211   .1268501     9.68   0.000     .9795889    1.476832
                       |
               country |
              Czechia  |   .1822128   .1776806     1.03   0.305    -.1660348    .5304603
                Spain  |   .5373131   .1559629     3.45   0.001     .2316314    .8429947
       United Kingdom  |   .7523233     .13753     5.47   0.000     .4827694    1.021877
               Greece  |   .2945225   .1692317     1.74   0.082    -.0371654    .6262105
             Portugal  |   .4881923   .1818464     2.68   0.007     .1317799    .8446047
                       |
        fclass#country |
       middle#Czechia  |   .1696353   .2055543     0.83   0.409    -.2332437    .5725143
         middle#Spain  |  -.1022763   .1925342    -0.53   0.595    -.4796364    .2750838
middle#United Kingdom  |  -.1048694   .1793743    -0.58   0.559    -.4564365    .2466977
        middle#Greece  |  -.0864053   .2050417    -0.42   0.673    -.4882796     .315469
      middle#Portugal  |  -.1316422   .2117289    -0.62   0.534    -.5466232    .2833389
        lower#Czechia  |  -.1482667   .1978983    -0.75   0.454    -.5361401    .2396068
          lower#Spain  |   .0625326   .1767617     0.35   0.724     -.283914    .4089793
 lower#United Kingdom  |  -.6462116   .1673126    -3.86   0.000    -.9741383   -.3182848
         lower#Greece  |   .0046153   .1867341     0.02   0.980    -.3613768    .3706074
       lower#Portugal  |  -.1480284   .1984621    -0.75   0.456    -.5370069    .2409501
                       |
                 _cons |  -1.499793   .1084495   -13.83   0.000     -1.71235   -1.287236
----------------------------------------------------------------------------------------

. predict p1 p1_2 p1_3, pr

. replace p1 = p1_2 if class == 2
(9,663 real changes made)

. replace p1 = p1_3 if class == 3
(6,724 real changes made)

. drop p1_2 p1_3

. generate double m = ln(p1) - ln(p0)

. drop p0 p1

. mean m, over(country)

Mean estimation                   Number of obs   =     24,069

-----------------------------------------------------------------
                |       Mean   Std. Err.     [95% Conf. Interval]
----------------+------------------------------------------------
    c.m@country |
   Switzerland  |   .0554875   .0049016        .04588    .0650949
       Czechia  |   .0421018   .0046022      .0330813    .0511224
         Spain  |   .0578975   .0054384      .0472378    .0685572
United Kingdom  |   .0304638   .0038253       .022966    .0379615
        Greece  |   .0450048   .0045675      .0360522    .0539573
      Portugal  |   .0711837   .0058591      .0596995    .0826679
-----------------------------------------------------------------

. drop m
mindex
Compute the M index using command mindex, overall and by country, and confirm that the results are the same as above.
See help mindex for information on how to use mindex. There is a simple syntax and an advanced syntax. For now, focus on the simple syntax.
Overall M:
. mindex class i.fclass
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069

  Reduced model: mlogit class
 Extended model: mlogit class i.fclass

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0636896   .0022315    28.54   0.000     .0593159    .0680632
------------------------------------------------------------------------------
M by country:
. mindex class i.fclass, over(country)
Estimating outcome models ............ done.

M-index analysis                                Number of obs     =     24,069

  Reduced model: mlogit class
 Extended model: mlogit class i.fclass
           Over: country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
---------------------------------------------------------------------------------
M by country and overall M:
. mindex class i.fclass, over(country) total
Estimating outcome models .............. done.

M-index analysis                                Number of obs     =     24,069

  Reduced model: mlogit class
 Extended model: mlogit class i.fclass
           Over: country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
                |
          Total |   .0636896   .0022315    28.54   0.000     .0593159    .0680632
---------------------------------------------------------------------------------
Present the country-specific results and the total in a graph. Optional: Apart from the default results, also include confidence intervals obtained by the bootstrap (stratify the bootstrap samples by country and use 100 replications).
To create the graph, we suggest using coefplot (type ssc install coefplot). To be able to plot the results using coefplot, they need to be stored using estimates store.
Bootstrap results can be obtained by applying option vce(bootstrap) to the mindex command. Use suboptions reps() to set the number of replications and strata() to request stratified resampling.
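Bootstrap standard errors depend on the random resamples drawn, so they change from run to run unless the random-number seed is fixed. A minimal sketch, assuming the bootstrap draws on Stata's standard random-number generator; the seed value is arbitrary and chosen here only for illustration:

```stata
* Sketch: fix the seed so the bootstrap results can be reproduced
set seed 20191116    // arbitrary value, not taken from the original solution
mindex class i.fclass, over(country) total vce(bootstrap, reps(100) strata(country))
estimates store bootstrap
```

With a different seed (or none), the standard errors and confidence limits will differ slightly from those shown in this handout, while the point estimates stay the same.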
Estimate and store M using default standard errors/confidence intervals:
. mindex class i.fclass, over(country) total
Estimating outcome models .............. done.

M-index analysis                                Number of obs     =     24,069

  Reduced model: mlogit class
 Extended model: mlogit class i.fclass
           Over: country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
                |
          Total |   .0636896   .0022315    28.54   0.000     .0593159    .0680632
---------------------------------------------------------------------------------

. estimates store default
Estimate and store M using bootstrap standard errors/confidence intervals:
. mindex class i.fclass, over(country) total vce(bootstrap, reps(100) strata(country)) (running mindex on estimation sample) Bootstrap replications (100) ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 .................................................. 50 .................................................. 100 M-index analysis Number of strata = 6 Number of obs = 24,069 Replications = 100 Reduced model: mlogit class Extended model: mlogit class i.fclass Over: country --------------------------------------------------------------------------------- | Observed Bootstrap Normal-based M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- country | Switzerland | .0554875 .0041687 13.31 0.000 .047317 .0636579 Czechia | .0421019 .0044463 9.47 0.000 .0333872 .0508165 Spain | .0578975 .0052173 11.10 0.000 .0476717 .0681233 United Kingdom | .0304638 .0042599 7.15 0.000 .0221145 .038813 Greece | .0450047 .0046572 9.66 0.000 .0358769 .0541326 Portugal | .0711837 .0051462 13.83 0.000 .0610974 .08127 | Total | .0636896 .0024681 25.81 0.000 .0588523 .0685269 --------------------------------------------------------------------------------- . estimates store bootstrap
Display results in a graph:
. coefplot default bootstrap
As you will see in the graph, the overall M is rather high compared to the country-specific M values. This is because the overall M also picks up country differences in the class distributions. A more sensible approach for the overall M might therefore be to compute the M based on a model that includes country fixed-effects. Try to compute such an adjusted overall M.
Approach 1: manual computation based on model predictions; the trick is to include country dummies in both the reduced and the extended model (but omit interactions in the extended model)
. mlogit class i.country Iteration 0: log likelihood = -26166.564 Iteration 1: log likelihood = -25392.18 Iteration 2: log likelihood = -25383.197 Iteration 3: log likelihood = -25383.195 Multinomial logistic regression Number of obs = 24,069 LR chi2(10) = 1566.74 Prob > chi2 = 0.0000 Log likelihood = -25383.195 Pseudo R2 = 0.0299 --------------------------------------------------------------------------------- class | Coef. Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- upper | country | Czechia | -.8643438 .0516809 -16.72 0.000 -.9656366 -.763051 Spain | -.7202193 .0533773 -13.49 0.000 -.8248369 -.6156017 United Kingdom | .0303981 .0494626 0.61 0.539 -.0665467 .127343 Greece | -1.021962 .0528726 -19.33 0.000 -1.12559 -.9183336 Portugal | -1.041093 .0533332 -19.52 0.000 -1.145624 -.9365615 | _cons | .3144103 .0336063 9.36 0.000 .248543 .3802775 ----------------+---------------------------------------------------------------- middle | (base outcome) ----------------+---------------------------------------------------------------- lower | country | Czechia | .2464447 .0600557 4.10 0.000 .1287378 .3641516 Spain | .6767671 .059422 11.39 0.000 .5603021 .7932321 United Kingdom | .382506 .0632433 6.05 0.000 .2585515 .5064605 Greece | .5018112 .0579408 8.66 0.000 .3882494 .615373 Portugal | .5592373 .0577631 9.68 0.000 .4460238 .6724508 | _cons | -.7818219 .0455991 -17.15 0.000 -.8711945 -.6924494 --------------------------------------------------------------------------------- . predict p0 p0_2 p0_3, pr . replace p0 = p0_2 if class == 2 (9,663 real changes made) . replace p0 = p0_3 if class == 3 (6,724 real changes made) . drop p0_2 p0_3 . 
mlogit class i.fclass i.country Iteration 0: log likelihood = -26166.564 Iteration 1: log likelihood = -24297.422 Iteration 2: log likelihood = -24245.108 Iteration 3: log likelihood = -24244.971 Iteration 4: log likelihood = -24244.971 Multinomial logistic regression Number of obs = 24,069 LR chi2(14) = 3843.19 Prob > chi2 = 0.0000 Log likelihood = -24244.971 Pseudo R2 = 0.0734 --------------------------------------------------------------------------------- class | Coef. Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- upper | fclass | middle | -.8234312 .0406635 -20.25 0.000 -.9031301 -.7437322 lower | -1.096431 .0405718 -27.02 0.000 -1.175951 -1.016912 | country | Czechia | -.6958079 .0530956 -13.10 0.000 -.7998734 -.5917423 Spain | -.5828481 .0547182 -10.65 0.000 -.6900938 -.4756025 United Kingdom | -.0006604 .0506764 -0.01 0.990 -.0999844 .0986635 Greece | -.7965508 .0544689 -14.62 0.000 -.9033078 -.6897938 Portugal | -.8180399 .0547832 -14.93 0.000 -.9254129 -.7106669 | _cons | .8821138 .0409238 21.56 0.000 .8019048 .9623229 ----------------+---------------------------------------------------------------- middle | (base outcome) ----------------+---------------------------------------------------------------- lower | fclass | middle | .3974989 .0554946 7.16 0.000 .2887315 .5062664 lower | 1.063228 .0518614 20.50 0.000 .9615811 1.164874 | country | Czechia | .1962998 .0610913 3.21 0.001 .076563 .3160365 Spain | .5542965 .0603611 9.18 0.000 .435991 .672602 United Kingdom | .3976265 .0640695 6.21 0.000 .2720526 .5232005 Greece | .3054363 .0591274 5.17 0.000 .1895487 .4213238 Portugal | .400215 .0588458 6.80 0.000 .2848793 .5155506 | _cons | -1.394522 .0613094 -22.75 0.000 -1.514686 -1.274358 --------------------------------------------------------------------------------- . predict p1 p1_2 p1_3, pr . replace p1 = p1_2 if class == 2 (9,663 real changes made) . 
replace p1 = p1_3 if class == 3 (6,724 real changes made) . drop p1_2 p1_3 . generate double m = ln(p1) - ln(p0) . drop p0 p1 . mean m Mean estimation Number of obs = 24,069 -------------------------------------------------------------- | Mean Std. Err. [95% Conf. Interval] -------------+------------------------------------------------ m | .0472901 .0019488 .0434704 .0511098 -------------------------------------------------------------- . drop m
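The manual computation above averages ln(p1) - ln(p0) over all observations, where p0 and p1 are the predicted probabilities of the observed class under the reduced and the extended model. Because these log probabilities sum to the log likelihood of the respective mlogit model, the same number can be recovered from the two log likelihoods, provided both models are fit on the same estimation sample (as is the case here). The following is only a sketch of such a cross-check; the scalar names ll0 and ll1 are made up for illustration.

```stata
* Cross-check of Approach 1 (sketch; scalar names ll0 and ll1 are illustrative)
* Reduced model: country dummies only
quietly mlogit class i.country
scalar ll0 = e(ll)
* Extended model: father's class plus country dummies
quietly mlogit class i.fclass i.country
scalar ll1 = e(ll)
* Average gain in log likelihood per observation
display (ll1 - ll0) / e(N)
```

With the log likelihoods reported above, (-24244.971 - (-25383.195)) / 24069 is approximately .04729, in line with the mean of m.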
Approach 2 (simple mindex syntax): apply option controls()
. mindex class i.fclass, controls(i.country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069

 Reduced model: mlogit class i.country
Extended model: mlogit class i.fclass i.country

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0472901   .0019488    24.27   0.000     .0434706    .0511096
------------------------------------------------------------------------------
Approach 3 (advanced mindex syntax): three sets of parentheses specify the extended model (first set), the reduced model (second set), and the model used to aggregate/analyze the observation-level M values (third set; in the current case this model only contains a constant and no predictors)
. mindex class (extended:i.fclass i.country) (reduced:i.country) (mindex:)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0000
                                                Adj R-squared     =     0.0000
                                                Root MSE          =     0.3023

 Reduced model: mlogit class i.country
Extended model: mlogit class i.fclass i.country

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0472901   .0019488    24.27   0.000     .0434704    .0511098
------------------------------------------------------------------------------
Social mobility is often analyzed using the so-called unidiff model. Estimate a unidiff model to evaluate country differences in social origin effects (again using father's class as a predictor for respondent's class) and compare the results to the country-specific M indices above. Use Spain as the reference country for computing the unidiff parameters.
To estimate the unidiff model, use the udiff command (type ssc install udiff). udiff is easy to use and can be applied to individual-level data. Alternatively, if you are already familiar with the unidiff command by Pisati (2000, Stata Technical Bulletin 55:33–47), feel free to use this command (type net install sg142, from(http://www.stata.com/stb/stb55)). Note that, in this case, you first have to collapse the data (e.g. using command contract).
Estimate the unidiff model using udiff:
. udiff class i.fclass ib10.country, baselevel eform fitting constant fluidity model ... done Iteration 0: log likelihood = -24244.971 Iteration 1: log likelihood = -24241.049 (not concave) Iteration 2: log likelihood = -24222.776 Iteration 3: log likelihood = -24203.263 Iteration 4: log likelihood = -24197.886 Iteration 5: log likelihood = -24197.154 Iteration 6: log likelihood = -24197.146 Iteration 7: log likelihood = -24197.146 Unidiff model Number of obs = 24,069 LR chi2(5) = 95.65 Log likelihood = -24197.146 Prob > chi2 = 0.0000 --------------------------------------------------------------------------------- class | exp(b) Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- Phi | country | Switzerland | .9714266 .068701 -0.41 0.682 .8456906 1.115857 Czechia | .9994966 .0782658 -0.01 0.995 .8572901 1.165292 Spain | 1 (base) United Kingdom | .638413 .0538429 -5.32 0.000 .5411435 .7531664 Greece | 1.008582 .0758624 0.11 0.910 .870335 1.168788 Portugal | 1.330639 .0915229 4.15 0.000 1.162823 1.522674 --------------------------------------------------------------------------------- . estimates store unidiff
The likelihood evaluator implemented in udiff models the unidiff parameters on a logarithmic scale. This is why option eform has been applied; the option only affects how the results are displayed (i.e. it transforms the parameters back to scaling factors, a representation that is more common in the literature). Option baselevel has been specified so that the reference country is included in the output.
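As a small worked illustration of what eform does, take the estimate for the United Kingdom from the output above. The notation b_c for the log-scale parameter of country c is introduced here only for illustration, and the standard error conversion assumes the usual Stata convention that the displayed standard error of exp(b) equals exp(b) times the standard error of b.

```latex
\[
\phi_c = \exp(b_c), \qquad
b_{\mathrm{UK}} = \ln(0.638413) \approx -0.4488
\]
\[
\widehat{\mathrm{se}}(b_{\mathrm{UK}}) \approx \frac{0.0538429}{0.638413} \approx 0.0843,
\qquad
z \approx \frac{-0.4488}{0.0843} \approx -5.32
\]
```

The z statistic of -5.32 matches the one reported in the table, which is consistent with the test being carried out on the logarithmic scale (a scaling factor of 1 corresponds to b = 0).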
Estimate the M index by country:
. mindex class i.fclass, over(country) Estimating outcome models ............ done. M-index analysis Number of obs = 24,069 Reduced model: mlogit class Extended model: mlogit class i.fclass Over: country --------------------------------------------------------------------------------- M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- country | Switzerland | .0554875 .0049011 11.32 0.000 .0458814 .0650935 Czechia | .0421019 .0046017 9.15 0.000 .0330827 .051121 Spain | .0578975 .0054378 10.65 0.000 .0472396 .0685554 United Kingdom | .0304638 .0038249 7.96 0.000 .0229671 .0379604 Greece | .0450047 .004567 9.85 0.000 .0360536 .0539559 Portugal | .0711837 .0058585 12.15 0.000 .0597012 .0826662 --------------------------------------------------------------------------------- . estimates store mindex
Compare the results in a graph:
. coefplot unidiff, eform baselevel || mindex ||, byopts(xrescale) xlabel(, grid)
The two methods lead to quite different results. Conclusions are similar for the UK (highest mobility) and Portugal (lowest mobility), but for the other countries results differ. Whereas the unidiff model indicates that these countries all have very similar levels of social mobility, the M index points to substantial variation (more on this in the next exercise).
For the sake of completeness, here is how the unidiff parameters could be estimated using the unidiff command by Pisati (2000):
. preserve . keep class fclass country . contract class fclass country . unidiff _freq, row(fclass) col(class) lay(country) effect(mult) pattern(fi) refcat(3) Iteration 1: deviance = 56.2607 Iteration 2: deviance = 4.2810 Iteration 3: deviance = 0.1094 Iteration 4: deviance = 0.0069 Iteration 5: deviance = 0.0004 Analysis of differences in two-way associations Table structure ------------------------------------------------------------------------------- Name Label N. of categories ------------------------------------------------------------------------------- Row fclass father's class 3 Column class respondent's class 3 Layer country country 6 ------------------------------------------------------------------------------- Model specification ------------------------------------------------------------------------------- Layer effect: multiplicative R-C association pattern: full interaction Additional variables: none ------------------------------------------------------------------------------- Goodness-of-fit statistics ------------------------------------------------------------------------------- Model N df X2 p G2 p rG2 BIC DI ------------------------------------------------------------------------------- Cond. indep. 24069 24 2479.0 0.00 2428.9 0.00 0.0 2186.7 12.1 Null effect 24069 20 153.0 0.00 152.4 0.00 93.7 -49.3 2.8 Multipl. 
effect 24069 15 56.8 0.00 56.8 0.00 97.7 -94.6 2.0 ------------------------------------------------------------------------------- Phi parameters (layer scores) --------------------------------------------------- country | Raw Scaled 1 Scaled 2 ---------------+----------------------------------- Switzerland | 4.1086 0.9714 0.3921 Czechia | 4.2273 0.9995 0.4034 Spain | 4.2295 1.0000 0.4036 United Kingdom | 2.7002 0.6384 0.2577 Greece | 4.2658 1.0086 0.4071 Portugal | 5.6279 1.3306 0.5371 --------------------------------------------------- Psi parameters (R-C association scores) ---------------------------------- father's | respondent's class class | upper middle lower ----------+----------------------- upper | 0.00 0.00 0.00 middle | 0.00 0.20 0.31 lower | 0.00 0.28 0.54 ---------------------------------- Kappa indices ----------------------- country | Kappa ---------------+------- Switzerland | 0.37 Czechia | 0.38 Spain | 0.38 United Kingdom | 0.24 Greece | 0.39 Portugal | 0.51 ----------------------- . restore
Look for column "Scaled 1" in table "Phi parameters (layer scores)". Results are the same as the ones obtained by udiff.
According to the unidiff model, social mobility in Spain is quite similar to social mobility in Switzerland, Czechia, and Greece. However, when looking at the M index, we see that the Czech and Greek societies appear to be more open than the Spanish society (Czechia and Greece have a lower M index than Spain). Compute a decomposition of the differences in the M index between Spain and the other countries into a part due to differences in the internal structure (association pattern between respondent's and father's class) and a part due to differences in the marginal distributions of respondent's and father's class.
Use option decompose to obtain the decomposition. Equation "Contrast" will display the country differences in the M index; equation "Internal" will report, for each contrast, the part that is due to differences in internal structure (i.e. the association patterns); equation "Marginal" will contain the part that is due to differences in the marginal distributions.
. mindex class i.fclass, over(country) refgroup(10) decompose Estimating outcome models ............ done. Fitting counterfactual distributions .......... done. M-index analysis Number of obs = 24,069 Reduced model: mlogit class Extended model: mlogit class i.fclass Over: country --------------------------------------------------------------------------------- M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- Level | country | Switzerland | .0554875 .0049011 11.32 0.000 .0458814 .0650935 Czechia | .0421019 .0046017 9.15 0.000 .0330827 .051121 Spain | .0578975 .0054378 10.65 0.000 .0472396 .0685554 United Kingdom | .0304638 .0038249 7.96 0.000 .0229671 .0379604 Greece | .0450047 .004567 9.85 0.000 .0360536 .0539559 Portugal | .0711837 .0058585 12.15 0.000 .0597012 .0826662 ----------------+---------------------------------------------------------------- Contrast | country | Switzerland | -.0024101 .0073206 -0.33 0.742 -.0167581 .011938 Czechia | -.0157957 .0071236 -2.22 0.027 -.0297576 -.0018337 Spain | 0 (omitted) United Kingdom | -.0274338 .0066483 -4.13 0.000 -.0404641 -.0144034 Greece | -.0128928 .0071012 -1.82 0.069 -.0268109 .0010254 Portugal | .0132862 .0079932 1.66 0.096 -.0023803 .0289526 ----------------+---------------------------------------------------------------- Internal | country | Switzerland | -.0007332 . . . . . Czechia | -.0006831 . . . . . Spain | 0 (omitted) United Kingdom | -.0355089 . . . . . Greece | -.0005447 . . . . . Portugal | .0311138 . . . . . ----------------+---------------------------------------------------------------- Marginal | country | Switzerland | -.0016768 . . . . . Czechia | -.0151126 . . . . . Spain | 0 (omitted) United Kingdom | .0080751 . . . . . Greece | -.0123481 . . . . . Portugal | -.0178276 . . . . . ---------------------------------------------------------------------------------
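The three equations in the output above are linked by a simple accounting identity, which can be checked by hand. The symbols below are shorthand introduced here for illustration (M_c for the level of country c, Delta_c for its contrast to Spain); the numbers are taken from the rows for Czechia.

```latex
\[
\Delta_c \;=\; M_c - M_{\mathrm{Spain}} \;=\; \mathrm{Internal}_c + \mathrm{Marginal}_c
\]
\[
\text{Czechia:}\quad
0.0421019 - 0.0578975 = -0.0157956 \approx -0.0157957
= (-0.0006831) + (-0.0151126)
\]
```

For Czechia, almost the entire difference to Spain is therefore attributed to the marginal distributions, whereas the internal component is close to zero.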
Analytic standard errors are not supported with decompose. Use the bootstrap to obtain standard errors:
. mindex class i.fclass, over(country) refgroup(10) decompose vce(bootstrap, reps(100)) (running mindex on estimation sample) Bootstrap replications (100) ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 .................................................. 50 .................................................. 100 M-index analysis Number of obs = 24,069 Replications = 100 Reduced model: mlogit class Extended model: mlogit class i.fclass Over: country --------------------------------------------------------------------------------- | Observed Bootstrap Normal-based M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- Level | country | Switzerland | .0554875 .0046273 11.99 0.000 .0464181 .0645568 Czechia | .0421019 .0047326 8.90 0.000 .0328261 .0513776 Spain | .0578975 .0052912 10.94 0.000 .047527 .068268 United Kingdom | .0304638 .0043373 7.02 0.000 .0219629 .0389646 Greece | .0450047 .0041803 10.77 0.000 .0368114 .0531981 Portugal | .0711837 .0062111 11.46 0.000 .0590102 .0833572 ----------------+---------------------------------------------------------------- Contrast | country | Switzerland | -.0024101 .0071953 -0.33 0.738 -.0165127 .0116925 Czechia | -.0157957 .0071166 -2.22 0.026 -.029744 -.0018473 Spain | 0 (omitted) United Kingdom | -.0274338 .0063535 -4.32 0.000 -.0398863 -.0149812 Greece | -.0128928 .0068096 -1.89 0.058 -.0262394 .0004538 Portugal | .0132862 .0078379 1.70 0.090 -.0020758 .0286481 ----------------+---------------------------------------------------------------- Internal | country | Switzerland | -.0007332 .0070194 -0.10 0.917 -.0144911 .0130246 Czechia | -.0006831 .0071564 -0.10 0.924 -.0147093 .0133431 Spain | 0 (omitted) United Kingdom | -.0355089 .0063526 -5.59 0.000 -.0479598 -.0230579 Greece | -.0005447 .00673 -0.08 0.935 -.0137353 .0126459 Portugal | .0311138 .0078213 3.98 0.000 .0157844 .0464433 
----------------+---------------------------------------------------------------- Marginal | country | Switzerland | -.0016768 .0019865 -0.84 0.399 -.0055704 .0022167 Czechia | -.0151126 .0019238 -7.86 0.000 -.0188832 -.011342 Spain | 0 (omitted) United Kingdom | .0080751 .0016028 5.04 0.000 .0049336 .0112166 Greece | -.0123481 .0017047 -7.24 0.000 -.0156892 -.0090069 Portugal | -.0178276 .0025488 -6.99 0.000 -.0228232 -.0128321 ---------------------------------------------------------------------------------
Here is an overview graph of the results. It displays the unidiff parameter and the M index in the upper part, and the decomposition results in the lower part. As we would expect, at least if the unidiff model fits the data, the pattern of country differences due to internal structure is very similar to the pattern of unidiff parameters.
. coefplot unidiff, eform baselevel ///
> || ., keep(Level:) bylabel(mindex) ///
> || ., keep(Internal:) bylabel(internal) ///
> || ., keep(Marginal:) bylabel(marginal) ///
> || , byopts(xrescale) xlabel(, grid)
Note that the "Marginal" component of the decomposition can be further subdivided into a part due to differences in the marginal distribution of the destination variable ("Marginal_Y") and a part due to differences in the marginal distribution of the origin variable ("Marginal_X") using the split option:
. mindex class i.fclass, over(country) refgroup(10) decompose split vce(bootstrap, reps(100)) (running mindex on estimation sample) Bootstrap replications (100) ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 .................................................. 50 .................................................. 100 M-index analysis Number of obs = 24,069 Replications = 100 Reduced model: mlogit class Extended model: mlogit class i.fclass Over: country --------------------------------------------------------------------------------- | Observed Bootstrap Normal-based M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] ----------------+---------------------------------------------------------------- Level | country | Switzerland | .0554875 .0042302 13.12 0.000 .0471964 .0637785 Czechia | .0421019 .0045425 9.27 0.000 .0331986 .0510051 Spain | .0578975 .0053298 10.86 0.000 .0474514 .0683436 United Kingdom | .0304638 .003847 7.92 0.000 .0229237 .0380038 Greece | .0450047 .0046086 9.77 0.000 .0359721 .0540374 Portugal | .0711837 .0063437 11.22 0.000 .0587503 .0836171 ----------------+---------------------------------------------------------------- Contrast | country | Switzerland | -.0024101 .0066066 -0.36 0.715 -.0153587 .0105386 Czechia | -.0157957 .0068574 -2.30 0.021 -.0292358 -.0023555 Spain | 0 (omitted) United Kingdom | -.0274338 .0068502 -4.00 0.000 -.0408599 -.0140076 Greece | -.0128928 .0076375 -1.69 0.091 -.027862 .0020764 Portugal | .0132862 .0079853 1.66 0.096 -.0023648 .0289372 ----------------+---------------------------------------------------------------- Internal | country | Switzerland | -.0007332 .0067816 -0.11 0.914 -.0140249 .0125585 Czechia | -.0006831 .0068848 -0.10 0.921 -.0141771 .0128109 Spain | 0 (omitted) United Kingdom | -.0355089 .007119 -4.99 0.000 -.0494619 -.0215559 Greece | -.0005447 .007351 -0.07 0.941 -.0149523 .013863 Portugal | .0311138 .0084995 3.66 0.000 .0144551 .0477726 
----------------+---------------------------------------------------------------- Marginal | country | Switzerland | -.0016768 .0020541 -0.82 0.414 -.0057028 .0023491 Czechia | -.0151126 .0017739 -8.52 0.000 -.0185894 -.0116357 Spain | 0 (omitted) United Kingdom | .0080751 .0015607 5.17 0.000 .0050163 .011134 Greece | -.0123481 .0016163 -7.64 0.000 -.0155159 -.0091803 Portugal | -.0178276 .0022998 -7.75 0.000 -.0223351 -.0133202 ----------------+---------------------------------------------------------------- Marginal_Y | country | Switzerland | -.0130514 .0023644 -5.52 0.000 -.0176855 -.0084173 Czechia | -.0060491 .0009611 -6.29 0.000 -.0079327 -.0041655 Spain | 0 (omitted) United Kingdom | -.0028811 .0011642 -2.47 0.013 -.0051629 -.0005993 Greece | -.0036376 .0009304 -3.91 0.000 -.0054611 -.0018141 Portugal | -.0045025 .0010467 -4.30 0.000 -.0065539 -.0024511 ----------------+---------------------------------------------------------------- Marginal_X | country | Switzerland | .0113746 .0014617 7.78 0.000 .0085098 .0142394 Czechia | -.0090635 .0014143 -6.41 0.000 -.0118355 -.0062914 Spain | 0 (omitted) United Kingdom | .0109562 .0015189 7.21 0.000 .0079792 .0139333 Greece | -.0087105 .0013197 -6.60 0.000 -.0112971 -.0061238 Portugal | -.0133252 .0018751 -7.11 0.000 -.0170002 -.0096501 ---------------------------------------------------------------------------------
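The split option adds a second accounting identity that can be verified from the output above. The subscripted symbols are again shorthand introduced only for this illustration.

```latex
\[
\mathrm{Marginal}_c \;=\; \mathrm{Marginal\_Y}_c + \mathrm{Marginal\_X}_c
\]
\[
\text{Switzerland:}\quad (-0.0130514) + 0.0113746 = -0.0016768
\qquad
\text{Czechia:}\quad (-0.0060491) + (-0.0090635) = -0.0151126
\]
```

The Swiss case shows why the split can be informative: the destination and origin components are both clearly different from zero but have opposite signs, so they nearly cancel in the combined "Marginal" component.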
Dig a bit deeper into the country differences between Spain and Czechia by analyzing the class-specific components of the M index, that is, by analyzing local class linkages in connection with the frequencies of the classes.
You can do this either from the perspective of respondent's class or from the perspective of father's class. You will need to use advanced syntax to obtain the estimates of local class linkages.
To subdivide the M index by classes we will need to resort to advanced syntax. We first take the perspective of respondent's class. Note that the analytic standard errors for local class linkages appear rather unreliable, which is why we will use the bootstrap. Furthermore, we will use command estout to create an overview table of the results (if not already installed, first type ssc install estout).
Local class linkages and class proportions for Spain:
. mindex class (extended:i.fclass) (reduced:) (mindex:ibn.class, nocons) /// > if country==10, vce(boot, reps(100)) (running mindex on estimation sample) Bootstrap replications (100) ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 .................................................. 50 .................................................. 100 M-index analysis Number of obs = 3,732 Replications = 100 R-squared = 0.0417 Adj R-squared = 0.0410 Root MSE = 0.3302 Reduced model: mlogit class Extended model: mlogit class i.fclass ------------------------------------------------------------------------------ | Observed Bootstrap Normal-based M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- class | upper | .0900216 .0101249 8.89 0.000 .0701772 .109866 middle | .0112208 .0029259 3.83 0.000 .005486 .0169555 lower | .0859645 .0094486 9.10 0.000 .0674456 .1044835 ------------------------------------------------------------------------------ . estimates store m_spain . proportion class if country==10 Proportion estimation Number of obs = 3,732 -------------------------------------------------------------- | Logit | Proportion Std. Err. [95% Conf. Interval] -------------+------------------------------------------------ class | upper | .2596463 .0071769 .2458245 .2739629 middle | .3896034 .0079826 .3740711 .405363 lower | .3507503 .0078115 .3355929 .3662149 -------------------------------------------------------------- . estimates store p_spain
Local class linkages and class proportions for Czechia:
. mindex class (extended:i.fclass) (reduced:) (mindex:ibn.class, nocons) /// > if country==6, vce(boot, reps(100)) (running mindex on estimation sample) Bootstrap replications (100) ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 .................................................. 50 .................................................. 100 M-index analysis Number of obs = 3,834 Replications = 100 R-squared = 0.0377 Adj R-squared = 0.0370 Root MSE = 0.2826 Reduced model: mlogit class Extended model: mlogit class i.fclass ------------------------------------------------------------------------------ | Observed Bootstrap Normal-based M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- class | upper | .0861695 .011065 7.79 0.000 .0644825 .1078564 middle | .0032655 .0015994 2.04 0.041 .0001307 .0064003 lower | .0650071 .009096 7.15 0.000 .0471792 .082835 ------------------------------------------------------------------------------ . estimates store m_czechia . proportion class if country==6 Proportion estimation Number of obs = 3,834 -------------------------------------------------------------- | Logit | Proportion Std. Err. [95% Conf. Interval] -------------+------------------------------------------------ class | upper | .2668232 .0071432 .2530541 .2810595 middle | .4624413 .0080522 .4466971 .4782608 lower | .2707355 .0071761 .2568981 .2850324 -------------------------------------------------------------- . estimates store p_czechia
Overview table:
. estout m_spain p_spain m_czechia p_czechia, collabels(none) cell(b(fmt(3)))

----------------------------------------------------------------
                  m_spain      p_spain    m_czechia    p_czechia
----------------------------------------------------------------
1.class             0.090        0.260        0.086        0.267
2.class             0.011        0.390        0.003        0.462
3.class             0.086        0.351        0.065        0.271
----------------------------------------------------------------
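The class-specific coefficients are class means of the observation-level M values, so weighting them by the class proportions should give back the overall M index of each country. The lines below are a minimal sketch of that check, using the unrounded estimates from the mindex and proportion output above; they are plain arithmetic and do not rely on any mindex feature.

```stata
* Spain: class-specific M values weighted by the class proportions
display .0900216*.2596463 + .0112208*.3896034 + .0859645*.3507503
* Czechia: same computation
display .0861695*.2668232 + .0032655*.4624413 + .0650071*.2707355
```

The two results should be close to .0579 and .0421, the country-level M indices for Spain and Czechia reported earlier. The check makes explicit that a country's M index depends both on how strong the local linkages are and on how large the classes with strong or weak linkages are.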
We see that in Czechia local class linkage is relatively weak for the middle class, but at the same time this class is relatively large in Czechia. This explains, at least in part, the increase in social mobility in Czechia, relative to Spain, once the marginal distribution of classes is taken into account.
We now repeat the analysis from the perspective of father's class:
. mindex class (extended:i.fclass) (reduced:) (mindex:ibn.fclass, nocons) /// > if country==10, vce(boot, reps(100)) (running mindex on estimation sample) Bootstrap replications (100) ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 .................................................. 50 .................................................. 100 M-index analysis Number of obs = 3,732 Replications = 100 R-squared = 0.0515 Adj R-squared = 0.0508 Root MSE = 0.3285 Reduced model: mlogit class Extended model: mlogit class i.fclass ------------------------------------------------------------------------------ | Observed Bootstrap Normal-based M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- fclass | upper | .1551994 .0179685 8.64 0.000 .1199818 .190417 middle | .0216725 .0061038 3.55 0.000 .0097092 .0336357 lower | .0386212 .004261 9.06 0.000 .0302698 .0469725 ------------------------------------------------------------------------------ . estimates store m_spain . proportion fclass if country==10 Proportion estimation Number of obs = 3,732 -------------------------------------------------------------- | Logit | Proportion Std. Err. [95% Conf. Interval] -------------+------------------------------------------------ fclass | upper | .2057878 .0066177 .1931157 .2190656 middle | .278135 .0073347 .2639854 .2927414 lower | .5160772 .0081804 .5000276 .5320936 -------------------------------------------------------------- . estimates store p_spain . mindex class (extended:i.fclass) (reduced:) (mindex:ibn.fclass, nocons) /// > if country==6, vce(boot, reps(100)) (running mindex on estimation sample) Bootstrap replications (100) ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 .................................................. 50 .................................................. 
100 M-index analysis Number of obs = 3,834 Replications = 100 R-squared = 0.0604 Adj R-squared = 0.0597 Root MSE = 0.2793 Reduced model: mlogit class Extended model: mlogit class i.fclass ------------------------------------------------------------------------------ | Observed Bootstrap Normal-based M-index | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- fclass | upper | .1659646 .0196865 8.43 0.000 .1273797 .2045495 middle | .0017258 .0011645 1.48 0.138 -.0005565 .0040082 lower | .0367671 .0050966 7.21 0.000 .0267779 .0467563 ------------------------------------------------------------------------------ . estimates store m_czechia . proportion fclass if country==6 Proportion estimation Number of obs = 3,834 -------------------------------------------------------------- | Logit | Proportion Std. Err. [95% Conf. Interval] -------------+------------------------------------------------ fclass | upper | .1627543 .0059617 .1514014 .1747832 middle | .4478352 .008031 .4321472 .4636276 lower | .3894105 .007875 .3740863 .4049564 -------------------------------------------------------------- . estimates store p_czechia . estout m_spain p_spain m_czechia p_czechia, collabels(none) cell(b(fmt(3))) ---------------------------------------------------------------- m_spain p_spain m_czechia p_czechia ---------------------------------------------------------------- 1.fclass 0.155 0.206 0.166 0.163 2.fclass 0.022 0.278 0.002 0.448 3.fclass 0.039 0.516 0.037 0.389 ----------------------------------------------------------------
The results are similar to those above, but even more pronounced.
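As a plausibility check on how to read the two kinds of columns in the estout table, the class-specific M-index values can be weighted by the shares of father's class and summed. The sketch below does this by hand with the point estimates copied from the output above; nothing is re-estimated, and the match is a numerical observation on these rounded figures rather than a statement about the internals of mindex.

```stata
* Spain: class-specific M index weighted by the share of each father's class
display .2057878*.1551994 + .278135*.0216725 + .5160772*.0386212

* Czechia: same calculation
display .1627543*.1659646 + .4478352*.0017258 + .3894105*.0367671
```

The two sums come out at roughly .0579 and .0421, which agrees with the overall M index reported for Spain (.0578975) and Czechia (.0421019) in the by-country tables further below. This also shows why the middle class contributes so little in Czechia: its share is large (.448) but its class-specific value is close to zero.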
Up to now we only looked at the relation between respondent's class and father's class, but there are also other origin variables in the dataset that could be taken into account. Evaluate how results change once you add some of these variables. Restrict the sample to Spain, UK, and Greece and compute the following variants:
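- Father's class only
- Father's class and father's ISEI (fisei)
- Father's class and ISEI as well as mother's class (mclass)
- Father's class and ISEI, mother's class, as well as father's and mother's education (feduc and meduc)
Father's class only: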
. mindex class i.fclass if inlist(country,10,13,14), over(country)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |   .0578975   .0054379    10.65   0.000     .0472394    .0685557
United Kingdom  |   .0304638    .003825     7.96   0.000      .022967    .0379605
        Greece  |   .0450047   .0045671     9.85   0.000     .0360534    .0539561
---------------------------------------------------------------------------------

. estimates store father
Father's class and ISEI:
. mindex class i.fclass fisei if inlist(country,10,13,14), over(country)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass fisei
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |    .071539   .0060155    11.89   0.000     .0597489    .0833291
United Kingdom  |   .0340836   .0040505     8.41   0.000     .0261448    .0420224
        Greece  |    .052306   .0050094    10.44   0.000     .0424878    .0621242
---------------------------------------------------------------------------------

. estimates store isei
Father's class and ISEI as well as mother's class:
. mindex class i.fclass fisei i.mclass if inlist(country,10,13,14), over(country)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass fisei i.mclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |    .073715   .0061037    12.08   0.000     .0617521     .085678
United Kingdom  |   .0389382   .0042966     9.06   0.000      .030517    .0473595
        Greece  |   .0657745   .0055685    11.81   0.000     .0548604    .0766886
---------------------------------------------------------------------------------

. estimates store parents
Father's class and ISEI, mother's class, as well as father's and mother's education:
. mindex class i.fclass fisei i.mclass i.feduc i.meduc if inlist(country,10,13,14), over(c
> ountry)
Estimating outcome models ...... done.

M-index analysis                                Number of obs     =     11,818

Reduced model:  mlogit class
Extended model: mlogit class i.fclass fisei i.mclass i.feduc i.meduc
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
         Spain  |   .0917888   .0067806    13.54   0.000     .0784991    .1050785
United Kingdom  |   .0557218    .005024    11.09   0.000      .045875    .0655686
        Greece  |   .0809756   .0060717    13.34   0.000     .0690752    .0928759
---------------------------------------------------------------------------------

. estimates store educ
Graph displaying an overview of the results:
. coefplot father isei parents educ, legend(cols(1) pos(3)) xlabel(,grid)
For Spain we see that father's ISEI is quite relevant, but additionally taking account of mother's class does not change the result much. For Greece the story is a bit different: here the additional effect of mother's class is stronger than the additional effect of ISEI. In all three countries, taking account of education gives an additional boost to the M index.
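The comparison in the preceding paragraph rests on the step-by-step increase of the M index from one model to the next. A small sketch that tabulates these increments from the point estimates in the four tables above is given here; the matrix names M and D are only illustrative, and the numbers are copied from the output, not re-estimated.

```stata
* M index by model, copied from the four tables above
* (rows: Spain, UK, Greece; columns: the four stored models)
matrix M = (.0578975, .071539,  .073715,  .0917888 \ ///
            .0304638, .0340836, .0389382, .0557218 \ ///
            .0450047, .052306,  .0657745, .0809756)
matrix rownames M = Spain UK Greece
matrix colnames M = father isei parents educ

* Increment of each model over the previous one
matrix D = M[1..3, 2..4] - M[1..3, 1..3]
matrix colnames D = isei mclass educ
matrix list D, format(%6.4f)
```

For Spain the increment from adding ISEI (about .014) is much larger than the one from adding mother's class (about .002), whereas for Greece the order is reversed (about .007 versus .013). The increment from adding parental education lies between .015 and .018 in all three countries. Because the increments depend on the order in which variables are added, they describe this particular sequence of models only.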
A part of the effect of father's class on respondent's class might be mediated by respondent's educational attainment. Try to evaluate, for each country, how the “total” effect of father's class divides into such an “indirect” effect through education and a “direct” effect that is unrelated to education.
For the total effect, simply compute the M index of father's class as above. For the direct effect, compute the M index while controlling for respondent's education (educ). The indirect effect is equal to the difference between the total effect and the direct effect.
Total effect:
. mindex class i.fclass, over(country)
Estimating outcome models ............ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0554875   .0049011    11.32   0.000     .0458814    .0650935
       Czechia  |   .0421019   .0046017     9.15   0.000     .0330827     .051121
         Spain  |   .0578975   .0054378    10.65   0.000     .0472396    .0685554
United Kingdom  |   .0304638   .0038249     7.96   0.000     .0229671    .0379604
        Greece  |   .0450047    .004567     9.85   0.000     .0360536    .0539559
      Portugal  |   .0711837   .0058585    12.15   0.000     .0597012    .0826662
---------------------------------------------------------------------------------

. estimates store total
Direct effect:
. mindex class i.fclass, over(country) controls(i.educ)
Estimating outcome models ............ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class i.educ
Extended model: mlogit class i.fclass i.educ
Over:           country

---------------------------------------------------------------------------------
        M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |   .0205909   .0031607     6.51   0.000      .014396    .0267859
       Czechia  |   .0139231   .0026664     5.22   0.000     .0086969    .0191492
         Spain  |   .0143113   .0027804     5.15   0.000     .0088619    .0197607
United Kingdom  |   .0060199   .0017407     3.46   0.001     .0026083    .0094315
        Greece  |   .0078507   .0019812     3.96   0.000     .0039676    .0117338
      Portugal  |   .0105322   .0023074     4.56   0.000     .0060099    .0150546
---------------------------------------------------------------------------------

. estimates store direct
Overview graph:
. coefplot total direct, vertical recast(bar) citop cirecast(rcap) barwidth(0.7) nooffset
We see that the largest part of the social origin effect operates indirectly through educational attainment. The direct effect is less than half of the total effect in each country.
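The indirect effect itself is not printed by either command; following the definition given in the exercise, it is the total effect minus the direct effect. The sketch below computes it, together with the direct effect's share of the total, from the point estimates in the two tables above. The matrix names T and D and the short country labels are only illustrative, and no standard errors are obtained this way.

```stata
* Total and direct M index by country, copied from the two tables above
matrix T = (.0554875, .0421019, .0578975, .0304638, .0450047, .0711837)
matrix D = (.0205909, .0139231, .0143113, .0060199, .0078507, .0105322)
local names Switzerland Czechia Spain UK Greece Portugal

forvalues j = 1/6 {
    local c : word `j' of `names'
    * indirect = total - direct; share = direct / total
    display "`c'" _col(14) "indirect = " %6.4f (T[1,`j'] - D[1,`j']) ///
        "   direct share = " %5.3f (D[1,`j'] / T[1,`j'])
}
```

The direct share ranges from about 0.15 (Portugal) to about 0.37 (Switzerland), which is the basis for the statement that the direct effect is less than half of the total effect in each country.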
Here is how we could estimate the direct effect using advanced syntax:
. mindex class (extended: i.(fclass educ)##i.country) ///
> (reduced: i.educ##i.country) ///
> (mindex: ibn.country, noconstant), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0069
                                                Adj R-squared     =     0.0067
                                                Root MSE          =     0.1582

Reduced model:  mlogit class i.educ i.country i.educ#i.country
Extended model: mlogit class i.fclass i.educ i.country i.fclass#i.country i.educ#i.country

---------------------------------------------------------------------------------
                |               Robust
        M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        country |
   Switzerland  |    .020591   .0031611     6.51   0.000     .0143951    .0267868
       Czechia  |    .013923   .0026666     5.22   0.000     .0086962    .0191498
         Spain  |   .0143113   .0027807     5.15   0.000     .0088611    .0197616
United Kingdom  |   .0060199   .0017408     3.46   0.001     .0026078    .0094321
        Greece  |   .0078507   .0019815     3.96   0.000     .0039669    .0117345
      Portugal  |   .0105323   .0023076     4.56   0.000     .0060091    .0150554
---------------------------------------------------------------------------------
Now we shift focus from country differences to changes across birth cohorts. Do the following:
Group birth year (birthyr) into a number of groups (1933–1945, 1946–1955, 1956–1965, 1966–1975) and then compute the M index by group. To model time trends, you will need to resort to advanced syntax.
Generate groups of birth cohorts:
. egen cohort = cut(birthyr), at(1933,1946,1956,1966,1976)

. tabulate cohort

     cohort |      Freq.     Percent        Cum.
------------+-----------------------------------
       1933 |      5,113       21.24       21.24
       1946 |      6,610       27.46       48.71
       1956 |      7,323       30.43       79.13
       1966 |      5,023       20.87      100.00
------------+-----------------------------------
      Total |     24,069      100.00
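The at() breakpoints define intervals that include their lower bound and exclude the next breakpoint, and each group is coded by its lower bound, so the code 1933 stands for birth years 1933 to 1945 and the final value 1976 only closes the last group. A quick way to confirm this, and optionally to attach readable labels, is sketched below; the label name cohortlbl is made up for illustration.

```stata
* Check that each cohort code covers the intended range of birth years
tabstat birthyr, by(cohort) statistics(min max n)

* Optional: attach readable labels to the cohort codes
label define cohortlbl 1933 "1933-1945" 1946 "1946-1955" ///
                       1956 "1956-1965" 1966 "1966-1975"
label values cohort cohortlbl
```

With labels attached, tables would show the ranges instead of the codes 1933, 1946, and so on; the output reproduced in this document was created without such labels.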
M index over cohort groups:
. mindex class i.fclass, over(cohort)
Estimating outcome models ........ done.

M-index analysis                                Number of obs     =     24,069

Reduced model:  mlogit class
Extended model: mlogit class i.fclass
Over:           cohort

------------------------------------------------------------------------------
     M-index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cohort |
       1933  |   .0673485   .0049418    13.63   0.000     .0576628    .0770342
       1946  |   .0619595   .0042429    14.60   0.000     .0536437    .0702754
       1956  |   .0614739    .003949    15.57   0.000      .053734    .0692138
       1966  |   .0547618   .0045785    11.96   0.000     .0457881    .0637356
------------------------------------------------------------------------------

. test 1933.cohort = 1946.cohort = 1956.cohort = 1966.cohort

 ( 1)  1933bn.cohort - 1946.cohort = 0
 ( 2)  1933bn.cohort - 1956.cohort = 0
 ( 3)  1933bn.cohort - 1966.cohort = 0

           chi2(  3) =    3.56
         Prob > chi2 =    0.3128

. coefplot, recast(connect) cirecast(rcap) vertical
The pattern of the M index across birth cohorts indicates that social mobility increased over time (decreasing M index), but the overall test for differences between cohort groups is not significant.
We now model a linear time trend by birth year. We first transform variable birthyr such that its zero point is 1955 (which is about the average birth year in the data). The transformation does not change the model; it just changes the interpretation of the constant (without the transformation the constant would reflect the value of the M index in year 0; after the transformation it reflects the value of the M index in year 1955).
. generate birthyr55 = birthyr - 1955

. mindex class (extended: i.fclass##c.birthyr55) ///
> (reduced: birthyr55) ///
> (mindex: birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3397

Reduced model:  mlogit class birthyr55
Extended model: mlogit class i.fclass birthyr55 i.fclass#c.birthyr55

------------------------------------------------------------------------------
             |               Robust
     M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   birthyr55 |  -.0004325   .0002095    -2.07   0.039    -.0008431    -.000022
       _cons |   .0612618   .0021979    27.87   0.000     .0569537    .0655698
------------------------------------------------------------------------------

. estimates store linear
The effect of birth year is negative (increasing social mobility over time) and (marginally) significant.
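To get a feeling for the size of the trend, the two coefficients can be turned into predicted values of the M index for specific birth years. The sketch below does this by hand from the point estimates in the table above; the scalar names b0 and b1 are only illustrative, and the calculation yields no standard errors.

```stata
* Linear trend: M(birthyr) = _cons + slope * (birthyr - 1955)
* _cons: M index for birth year 1955
scalar b0 = .0612618
* slope: change in the M index per birth year
scalar b1 = -.0004325

display "1935: " %6.4f (b0 + b1*(1935 - 1955))
display "1975: " %6.4f (b0 + b1*(1975 - 1955))
```

The predicted M index falls from about 0.070 for birth year 1935 to about 0.053 for birth year 1975, which is in line with the cohort-group estimates reported earlier (.067 for the oldest group and .055 for the youngest).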
To check for possible nonlinearities in the trend, we can include birth year squared in the model (parabolic time trend):
. mindex class (extended: i.fclass##c.birthyr55##c.birthyr55) ///
> (reduced: c.birthyr55##c.birthyr55) ///
> (mindex: c.birthyr55##c.birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3397

Reduced model:  mlogit class birthyr55 c.birthyr55#c.birthyr55
Extended model: mlogit class i.fclass birthyr55 i.fclass#c.birthyr55 c.birthyr55#c.birthyr
> 55 i.fclass#c.birthyr55#c.birthyr55

-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
              birthyr55 |  -.0004476   .0002074    -2.16   0.031    -.0008542    -.000041
                        |
c.birthyr55#c.birthyr55 |   -.000011   .0000205    -0.54   0.592    -.0000512    .0000292
                        |
                  _cons |   .0624239   .0031801    19.63   0.000     .0561906    .0686571
-----------------------------------------------------------------------------------------

. estimates store quadratic
There is no evidence for a non-linear trend (the coefficient on c.birthyr55#c.birthyr55 is not significant).
We now add mother's class to see whether this changes the story. The labor market participation of women changed over time and, possibly, mother's class also became more relevant for respondent's class over time. The model including mother's class is as follows:
. mindex class (extended: i.(fclass mclass)##c.birthyr55##c.birthyr55) ///
> (reduced: c.birthyr55##c.birthyr55) ///
> (mindex: c.birthyr55##c.birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3595

Reduced model:  mlogit class birthyr55 c.birthyr55#c.birthyr55
Extended model: mlogit class i.fclass i.mclass birthyr55 i.fclass#c.birthyr55 i.mclass#c.b
> irthyr55 c.birthyr55#c.birthyr55 i.fclass#c.birthyr55#c.birthyr55 i.mclass#c.birthyr55#c
> .birthyr55

-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
              birthyr55 |  -.0004656   .0002234    -2.08   0.037    -.0009035   -.0000277
                        |
c.birthyr55#c.birthyr55 |   .0000133   .0000224     0.60   0.552    -.0000305    .0000571
                        |
                  _cons |   .0675265   .0033553    20.13   0.000     .0609499    .0741032
-----------------------------------------------------------------------------------------

. estimates store withmothers
There is still no evidence for a nonlinear time trend, but note that the sign of the effect of c.birthyr55#c.birthyr55 changed. It might thus be interesting to isolate the effect that mother's class has in addition to father's class. To do so we include father's class in the reduced model:
. mindex class (extended: i.(fclass mclass)##c.birthyr55##c.birthyr55) ///
> (reduced: i.fclass##c.birthyr55##c.birthyr55) ///
> (mindex: c.birthyr55##c.birthyr55), robust
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     24,069
                                                R-squared         =     0.0004
                                                Adj R-squared     =     0.0003
                                                Root MSE          =     0.1257

Reduced model:  mlogit class i.fclass birthyr55 i.fclass#c.birthyr55 c.birthyr55#c.birthyr
> 55 i.fclass#c.birthyr55#c.birthyr55
Extended model: mlogit class i.fclass i.mclass birthyr55 i.fclass#c.birthyr55 i.mclass#c.b
> irthyr55 c.birthyr55#c.birthyr55 i.fclass#c.birthyr55#c.birthyr55 i.mclass#c.birthyr55#c
> .birthyr55

-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
              birthyr55 |   -.000018   .0000886    -0.20   0.839    -.0001916    .0001557
                        |
c.birthyr55#c.birthyr55 |   .0000243   9.11e-06     2.67   0.008     6.46e-06    .0000422
                        |
                  _cons |   .0051027    .001107     4.61   0.000     .0029329    .0072724
-----------------------------------------------------------------------------------------

. estimates store motherspartial
We see that the additional effect of mother's class is indeed non-linear (U-shaped time trend).
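The location of the bottom of the U can be read off the two trend coefficients: a parabola b0 + b1*x + b2*x^2 has its vertex at x = -b1/(2*b2), and the vertex is a minimum when b2 is positive. The sketch below applies this to the point estimates in the table above; the scalar names are only illustrative.

```stata
* Vertex of b0 + b1*x + b2*x^2 lies at x = -b1/(2*b2), with x = birthyr - 1955
* b1: coefficient of birthyr55
scalar b1 = -.000018
* b2: coefficient of c.birthyr55#c.birthyr55 (positive, so the vertex is a minimum)
scalar b2 = .0000243

display "turning point (birth year): " %6.1f (1955 - b1/(2*b2))
```

The estimated minimum lies close to birth year 1955, that is, near the center of the observed range. Since the linear coefficient is estimated very imprecisely (standard error .0000886), the exact location should be read as a rough indication only.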
Here is a graph that plots the effect shapes for the different non-linear models:
. foreach m in quadratic withmothers motherspartial {
  2.     estimates restore `m'
  3.     quietly margins, at(birthyr55=(-20(2)20)) post
  4.     estimates store shape_`m'
  5. }
(results quadratic are active now)
(results withmothers are active now)
(results motherspartial are active now)

. coefplot shape_*, at recast(line) lwidth(*2) legend(cols(1) pos(2)) ///
> ciopts(recast(rarea) pstyle(ci) color(%50) lcolor(%0)) ///
> plotlabels("Fathers only" "Fathers + mothers" "Mothers' partial effect") ///
> xlabel(-20 "1935" -10 "1945" 0 "1955" 10 "1965" 20 "1975") ///
> xtitle("birth cohort")
Open dataset ESSplus.dta and evaluate whether country differences in social mobility can be “explained” by welfare state characteristics (the welfare state characteristics have been taken from Kuitto 2011, Journal of European Social Policy 21:348–364). Do the following:
- Evaluate the effects of welfare state clusters (Welfare).
- Evaluate the effect of social service expenditures (SocialService).
In both cases, do separate analyses by gender (female) and control for age of the respondent (age) and the ESS round (year).
You will need to resort to advanced syntax to estimate these models. Consider computing robust standard errors that are clustered by country.
Effects of welfare state clusters:
. use ESSplus, clear

. fre Welfare

Welfare -- Welfare cluster (Kuitto 2011)
---------------------------------------------------------------------------------------
                                          |      Freq.    Percent      Valid       Cum.
------------------------------------------+--------------------------------------------
Valid   1 Cluster 1: Continental European |      29670      36.08      36.08      36.08
        2 Cluster 2: Southern European    |       9925      12.07      12.07      48.15
        3 Cluster 3: Nordic               |      14939      18.17      18.17      66.32
        4 Cluster 4: Eastern European     |      11746      14.28      14.28      80.60
        5 Cluster 5: Mixed/unclassified   |      15951      19.40      19.40     100.00
        Total                             |      82231     100.00     100.00
---------------------------------------------------------------------------------------

. mindex class (extended: i.fclass##i.country age i.year) ///
> (reduced: i.country age i.year) ///
> (mindex: i.Welfare age i.year) ///
> if female==0, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     39,599
                                                R-squared         =     0.0004
                                                Adj R-squared     =     0.0002
                                                Root MSE          =     0.3100

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                                         (Std. Err. adjusted for 23 clusters in country)
-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
                Welfare |
Cluster 2: Southern ..  |    .009026   .0117008     0.77   0.449    -.0152399    .0332918
     Cluster 3: Nordic  |  -.0035637   .0079001    -0.45   0.656    -.0199475    .0128202
Cluster 4: Eastern E..  |   .0047564   .0122648     0.39   0.702    -.0206791     .030192
Cluster 5: Mixed/unc..  |   .0142508   .0076794     1.86   0.077    -.0016753     .030177
                        |
                    age |   .0000846   .0001553     0.54   0.591    -.0002375    .0004068
                        |
                   year |
                  2004  |   .0024888   .0049655     0.50   0.621     -.007809    .0127866
                  2006  |   .0039574   .0032565     1.22   0.237    -.0027961    .0107108
                  2008  |  -.0007534   .0044513    -0.17   0.867    -.0099849    .0084781
                  2010  |  -.0013209   .0052614    -0.25   0.804    -.0122324    .0095907
                        |
                  _cons |   .0411939   .0089189     4.62   0.000     .0226972    .0596906
-----------------------------------------------------------------------------------------

. testparm i.Welfare

 ( 1)  2.Welfare = 0
 ( 2)  3.Welfare = 0
 ( 3)  4.Welfare = 0
 ( 4)  5.Welfare = 0

       F(  4,    22) =    1.36
            Prob > F =    0.2808

. mindex class (extended: i.fclass##i.country age i.year) ///
> (reduced: i.country age i.year) ///
> (mindex: i.Welfare age i.year) ///
> if female==1, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     42,632
                                                R-squared         =     0.0004
                                                Adj R-squared     =     0.0002
                                                Root MSE          =     0.2759

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                                         (Std. Err. adjusted for 23 clusters in country)
-----------------------------------------------------------------------------------------
                        |               Robust
                M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
                Welfare |
Cluster 2: Southern ..  |   .0108221   .0073571     1.47   0.155    -.0044354    .0260797
     Cluster 3: Nordic  |  -.0068114   .0033979    -2.00   0.057    -.0138581    .0002354
Cluster 4: Eastern E..  |   .0007086   .0070367     0.10   0.921    -.0138846    .0153018
Cluster 5: Mixed/unc..  |   .0030457   .0055945     0.54   0.592    -.0085566    .0146479
                        |
                    age |    .000154   .0001716     0.90   0.379     -.000202    .0005099
                        |
                   year |
                  2004  |   .0028996   .0039593     0.73   0.472    -.0053115    .0111108
                  2006  |   .0031251   .0055261     0.57   0.577    -.0083353    .0145856
                  2008  |   .0028193   .0041614     0.68   0.505     -.005811    .0114496
                  2010  |  -.0009233   .0049539    -0.19   0.854    -.0111971    .0093505
                        |
                  _cons |   .0299029   .0085008     3.52   0.002     .0122733    .0475324
-----------------------------------------------------------------------------------------

. testparm i.Welfare

 ( 1)  2.Welfare = 0
 ( 2)  3.Welfare = 0
 ( 3)  4.Welfare = 0
 ( 4)  5.Welfare = 0

       F(  4,    22) =    3.16
            Prob > F =    0.0342
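The M index is lowest in the Nordic cluster; the continental European cluster (the reference category) comes next, and all other clusters have higher values. For women, the overall test for differences among clusters is significant (p = 0.034); for men it is not (p = 0.281).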
Effect of social service expenditures:
. mindex class (extended: i.fclass##i.country age i.year) ///
> (reduced: i.country age i.year) ///
> (mindex: SocialService age i.year) ///
> if female==0, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     39,599
                                                R-squared         =     0.0002
                                                Adj R-squared     =     0.0001
                                                Root MSE          =     0.3100

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                               (Std. Err. adjusted for 23 clusters in country)
-------------------------------------------------------------------------------
              |               Robust
      M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
SocialService |  -.0029345   .0023173    -1.27   0.219    -.0077403    .0018713
          age |   .0000846    .000153     0.55   0.586    -.0002328    .0004019
              |
         year |
        2004  |   .0025736   .0051553     0.50   0.623    -.0081179    .0132651
        2006  |   .0027581    .003576     0.77   0.449     -.004658    .0101743
        2008  |  -.0008205   .0048869    -0.17   0.868    -.0109554    .0093144
        2010  |  -.0012313   .0056003    -0.22   0.828    -.0128457    .0103831
              |
        _cons |   .0517761   .0089275     5.80   0.000     .0332616    .0702907
-------------------------------------------------------------------------------

. mindex class (extended: i.fclass##i.country age i.year) ///
> (reduced: i.country age i.year) ///
> (mindex: SocialService age i.year) ///
> if female==1, vce(cluster country)
Estimating outcome models .. done.

M-index analysis                                Number of obs     =     42,632
                                                R-squared         =     0.0003
                                                Adj R-squared     =     0.0002
                                                Root MSE          =     0.2759

Reduced model:  mlogit class i.country age i.year
Extended model: mlogit class i.fclass i.country i.fclass#i.country age i.year

                               (Std. Err. adjusted for 23 clusters in country)
-------------------------------------------------------------------------------
              |               Robust
      M-index |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
SocialService |  -.0031692   .0010928    -2.90   0.008    -.0054356   -.0009028
          age |   .0001432   .0001689     0.85   0.406    -.0002071    .0004935
              |
         year |
        2004  |   .0022577   .0039465     0.57   0.573    -.0059267    .0104422
        2006  |   .0024082   .0054851     0.44   0.665    -.0089671    .0137835
        2008  |   .0023257   .0044633     0.52   0.608    -.0069306    .0115819
        2010  |  -.0013749   .0049984    -0.28   0.786    -.0117409    .0089911
              |
        _cons |   .0386031   .0088161     4.38   0.000     .0203196    .0568866
-------------------------------------------------------------------------------
Higher social service expenditures go along with higher social mobility (lower M index) for both men and women. For women, the effect is statistically significant.