*** Table 6.1 in W&B book: ORDERED MULTINOMIAL LOGIT EXAMPLE: CHOICE OF SECONDARY SCHOOL
* This is an ordered multinomial problem. This program reproduces Table 6.1 in the
* W&B textbook. The regressors in this problem are case-specific only.
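* The brant and gologit2 commands used later in this program are user-written
* and do not ship with Stata. A minimal sketch of a check that both are
* installed follows; the install hints in the messages are suggestions and are
* not part of the original program.
capture which brant
if _rc != 0 {
    display as error "brant not found; it is part of the SPost package (try: search spost)"
}
capture which gologit2
if _rc != 0 {
    display as error "gologit2 not found (try: ssc install gologit2)"
}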
* Read in dataset and describe dependent variable and regressors.
use c:\data\school.dta, clear
describe
* Summarize dependent variable and regressors
summarize, separator(0)
* Tabulate the dependent variable
tabulate school
* Table of log of income by school
table school, contents(N linc mean linc sd linc)
* Table of mother's education by school
table school, contents(N motheduc mean motheduc sd motheduc)
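* Note, assuming a newer Stata may be in use: the contents() option above
* belongs to the table syntax of Stata 16 and earlier. In Stata 17 or later
* the two table commands can be run under version control
* (version 16: table ...), or the same group statistics can be obtained with
* tabstat, as sketched here.
tabstat linc motheduc, by(school) statistics(n mean sd)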
********** Table 6.1 in W&B ORDERED MULTINOMIAL LOGIT/PROBIT MODEL OF SCHOOL CHOICE
* Creation of year dummies
generate d95 = (year == 1995)
generate d96 = (year == 1996)
generate d97 = (year == 1997)
generate d98 = (year == 1998)
generate d99 = (year == 1999)
generate d00 = (year == 2000)
generate d01 = (year == 2001)
generate d02 = (year == 2002)
* Create full-time dummy for mother's employment: 1 = full-time employment,
* 0 = otherwise.
generate mothftime = (mothemp == 1)
* Create work dummy for whether the mother works at all: 1 = employed
* (either full-time or part-time), 0 = otherwise.
generate mothwork = (mothemp < 3)
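* A quick check of the two employment dummies, offered as a sketch:
* cross-tabulating the original coding against each dummy confirms the recode.
* Both expressions evaluate to 0, not missing, when mothemp is missing; the
* missing option makes any such observations visible.
tabulate mothemp mothftime, missing
tabulate mothemp mothwork, missing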
* The Ordered Probit results reported in Table 6.1 in W and B
* Remember the or option is not applicable to Ordered Probit
oprobit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02
* The Ordered Logit results reported in Table 6.1 in W and B
ologit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02
* The same Ordered Logit model reported as odds ratios (or option; not in
* Table 6.1), followed by the Brant command to test the Single Index (Parallel
* Regressions) assumption. The Brant test supports the Single Index assumption
* for these data. See also the Likelihood Ratio test of the Single Index
* assumption at the end of this program.
ologit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02, or
brant, detail
* Predict probabilities of each school choice and compare to actual frequencies
predict pologit1 pologit2 pologit3, pr
summarize pologit*, separator(3)
* List predicted values of alternatives for first 10 observations
list pologit* in 1/10
* Create Classification Table and get accuracy rate
egen pred_max = rowmax(pologit*)
generate pred_choice = .
forv i=1/3 {
replace pred_choice = `i' if (pred_max == pologit`i')
}
local school_label: value label school
label values pred_choice `school_label'
tabulate pred_choice school
* Accuracy rate = (107 + 46 + 215)/675 = 0.545
* In comparison, the accuracy rate that one would expect from naively classifying
* using the majority class (Gymnasium) would be 41.04% accuracy on average.
* See the previous tabulation result for the dependent variable - school.
* Thus, the current ologit classifier is providing a LIFT of 54.5/41.04 = 1.328.
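* The accuracy rate and lift above can also be computed rather than read off
* the classification table. A sketch, assuming school is coded 1, 2, 3 in the
* same order as the predicted probabilities; the names correct and acc_ologit
* are illustrative and not part of the original program.
generate byte correct = (pred_choice == school) if !missing(pred_choice, school)
quietly summarize correct
scalar acc_ologit = r(mean)
display "ologit accuracy rate = " %5.3f acc_ologit
* 0.4104 is the majority-class (Gymnasium) share taken from tabulate school
display "ologit lift          = " %5.3f acc_ologit/0.4104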
* For comparison, recall the lift of the unordered MNL model computed in the
* program Table5_1_WandB.do:
* Accuracy rate = (113 + 49 + 208)/675 = 0.548
* The naive majority-class (Gymnasium) classifier is again 41.04% accurate, so
* the mlogit classifier provides a LIFT of 54.8/41.04 = 1.335.
* The Lift Ratios are about the same whether the model is unordered or ordered.
* We have not generated a classification table using the generalized
* ordered logit (gologit2), but the Brant test indicates that this is not
* necessary.
* As an alternative to the Brant test we could do a likelihood ratio test
* using the generalized ordered logit model. If the single index restriction
* is appropriate, we should get a high probability value (p-value) for the
* Likelihood Ratio test.
gologit2 school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02
* With OR report
gologit2 school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02, or
* Now for the computation of the Likelihood Ratio test of the Single Index hypothesis.
* From the Ordered Logit model above we have a log likelihood value of -630.1549.
* This is the restricted model that imposes the Single Index assumption.
* From the above Generalized Ordered Logit model we obtain the log likelihood
* value of -622.32471. This represents the fit of the unrestricted model because
* we are not imposing the Single Index assumption. Then the likelihood ratio
* statistic is -2log(lambda) = -2(logl(restricted model) - logl(unrestricted model))
* = -2(-630.1549 -(-622.32471)) = -2(-7.83019) = 15.66038.
* The number of degrees of freedom of the chi-square test is 28 - 15 = 13
* where the number of parameters in the unrestricted model (gologit2) is 28
* while the number of parameters in the restricted model (ologit) is 15.
* It follows that the p-value associated with the observed statistic of
* 15.66038 is 0.267957 > 0.05. Therefore we fail to reject the null hypothesis
* of a Single Index. You can use the Excel function CHISQ.DIST.RT to obtain
* this probability value. Thus the Brant and Likelihood Ratio tests give the
* same result: the Single Index model (ologit) is preferred. That is, the
* Ordered Logit model, in this case, is preferred over the Generalized Ordered
* Logit model.
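* The same test can be computed inside Stata instead of by hand. A sketch: it
* re-estimates both models quietly, takes each log likelihood from e(ll), and
* uses the 13 degrees of freedom counted above. The scalar names ll_r, ll_u
* and lr are illustrative and not part of the original program.
quietly ologit school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02
scalar ll_r = e(ll)
quietly gologit2 school motheduc mothwork linc lsize parity d95 d96 d97 d98 d99 d00 d01 d02
scalar ll_u = e(ll)
scalar lr = -2*(ll_r - ll_u)
display "LR statistic = " %9.5f lr
display "p-value      = " %8.6f chi2tail(13, lr)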