For the churn data set [4], suppose that we categorize the customer service calls variable into a new variable, CSC, as follows:
• Zero or one customer service calls: CSC = Low
• Two or three customer service calls: CSC = Medium
• Four or more customer service calls: CSC = High
Then CSC is a trichotomous predictor. How will logistic regression handle this? First, the analyst will need to code the data set using indicator (dummy) variables and reference cell coding. Suppose that we choose CSC = Low to be our reference cell. Then we assign the indicator variable values to two new indicator variables, CSC-Med and CSC-Hi, given in Table 4.5. Each record will have assigned to it a value of 0 or 1 for each of CSC-Med and CSC-Hi. For example, a customer with 1 customer service call will have values CSC-Med = 0 and CSC-Hi = 0, a customer with 3 customer service calls will have CSC-Med = 1 and CSC-Hi = 0, and a customer with 7 customer service calls will have CSC-Med = 0 and CSC-Hi = 1.
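The reference cell coding just described can be sketched in a few lines of Python. This is only an illustration: the function name `csc_indicators` is our own and does not come from the churn data set or from Minitab.

```python
def csc_indicators(calls: int) -> tuple[int, int]:
    """Return (CSC-Med, CSC-Hi) for a count of customer service calls.

    Reference cell coding with CSC = Low (0 or 1 calls) as the reference
    cell, so a Low customer is coded (0, 0).
    """
    if calls < 0:
        raise ValueError("number of calls cannot be negative")
    csc_med = 1 if 2 <= calls <= 3 else 0   # two or three calls
    csc_hi = 1 if calls >= 4 else 0         # four or more calls
    return csc_med, csc_hi


# The three example customers from the text: 1, 3, and 7 calls.
examples = {calls: csc_indicators(calls) for calls in (1, 3, 7)}
print(examples)  # {1: (0, 0), 3: (1, 0), 7: (0, 1)}
```

Note that no record can have both indicators equal to 1, since the three categories are mutually exclusive.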
Table 4.6 shows a cross-tabulation of churn by CSC. Using CSC = Low as the reference class, we can calculate the odds ratios using the cross-products
INTERPRETING A LOGISTIC REGRESSION MODEL 167
TABLE 4.6 Cross-Tabulation of Churn by CSC

                       CSC=Low   CSC=Medium   CSC=High   Total
Churn=false, y=0        1664        1057         129      2850
Churn=true,  y=1         214         131         138       483
Total                   1878        1188         267      3333

as follows:
• For CSC = Medium: $\text{OR} = \dfrac{131(1664)}{214(1057)} = 0.963687 \approx 0.96$
• For CSC = High: $\text{OR} = \dfrac{138(1664)}{214(129)} = 8.31819 \approx 8.32$
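The cross-product calculation can be checked with a short Python sketch. The dictionary layout and the helper name `odds_ratio` are our own choices for illustration; only the cell counts come from Table 4.6.

```python
# Cell counts from Table 4.6, keyed by CSC level: (churn=false, churn=true).
counts = {"Low": (1664, 214), "Medium": (1057, 131), "High": (129, 138)}


def odds_ratio(level: str, reference: str = "Low") -> float:
    """Cross-product odds ratio of churning for `level` versus `reference`."""
    no_lvl, yes_lvl = counts[level]
    no_ref, yes_ref = counts[reference]
    return (yes_lvl * no_ref) / (yes_ref * no_lvl)


or_medium = odds_ratio("Medium")   # 131(1664) / [214(1057)]
or_high = odds_ratio("High")       # 138(1664) / [214(129)]
print(round(or_medium, 2), round(or_high, 2))  # 0.96 8.32
```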
The logistic regression is then performed in Minitab with the results shown in Table 4.7.
Note that the odds ratios reported by Minitab are the same as those we found using the cell counts directly. We verify the odds ratios given in Table 4.7, using equation (4.3):
• For CSC-Med: $\widehat{\text{OR}} = e^{b_1} = e^{-0.0369891} = 0.96$
• For CSC-Hi: $\widehat{\text{OR}} = e^{b_2} = e^{2.11844} = 8.32$
Here we have $b_0 = -2.051$, $b_1 = -0.0369891$, and $b_2 = 2.11844$. So the probability of churning is estimated as

$$\hat{\pi}(x) = \frac{e^{\hat{g}(x)}}{1 + e^{\hat{g}(x)}} = \frac{e^{-2.051 - 0.0369891(\text{CSC-Med}) + 2.11844(\text{CSC-Hi})}}{1 + e^{-2.051 - 0.0369891(\text{CSC-Med}) + 2.11844(\text{CSC-Hi})}}$$

with the estimated logit:

$$\hat{g}(x) = -2.051 - 0.0369891(\text{CSC-Med}) + 2.11844(\text{CSC-Hi})$$
TABLE 4.7 Results of Logistic Regression of Churn on CSC

Logistic Regression Table
                                                   Odds     95% CI
Predictor        Coef    SE Coef       Z      P   Ratio  Lower  Upper
Constant     -2.05100  0.0726213  -28.24  0.000
CSC-Med    -0.0369891   0.117701   -0.31  0.753    0.96   0.77   1.21
CSC-Hi        2.11844   0.142380   14.88  0.000    8.32   6.29  11.00

Log-Likelihood = -1263.368
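Because this model has a single categorical predictor, the coefficients and the log-likelihood in Table 4.7 can be reproduced in closed form from the cell counts of Table 4.6. The sketch below is not how Minitab fits the model, which uses iterative maximum likelihood. It relies on the fact that with one parameter per category the fitted probabilities equal the observed proportions. The variable names are ours.

```python
import math

# Cell counts from Table 4.6, keyed by CSC level: (churn=false, churn=true).
counts = {"Low": (1664, 214), "Medium": (1057, 131), "High": (129, 138)}


def log_odds(level: str) -> float:
    """Observed log odds of churning within one CSC level."""
    no, yes = counts[level]
    return math.log(yes / no)


# With CSC = Low as the reference cell, the constant is the log odds for Low
# and each indicator coefficient is a difference in log odds from Low.
b0 = log_odds("Low")
b1 = log_odds("Medium") - b0
b2 = log_odds("High") - b0

# Binomial log-likelihood evaluated at the observed proportions.
log_lik = 0.0
for no, yes in counts.values():
    p = yes / (no + yes)
    log_lik += yes * math.log(p) + no * math.log(1.0 - p)

print(f"b0={b0:.3f}  b1={b1:.4f}  b2={b2:.4f}  logL={log_lik:.2f}")
```

The printed values should agree, up to rounding, with the Coef column and the Log-Likelihood line of Table 4.7.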
SPH SPH
JWDD006-04 JWDD006-Larose November 23, 2005 14:51 Char Count= 0
168 CHAPTER 4 LOGISTIC REGRESSION
For a customer with low customer service calls, we estimate his or her probability of churning:

$$\hat{g}(x) = -2.051 - 0.0369891(0) + 2.11844(0) = -2.051$$

and

$$\hat{\pi}(x) = \frac{e^{\hat{g}(x)}}{1 + e^{\hat{g}(x)}} = \frac{e^{-2.051}}{1 + e^{-2.051}} = 0.114$$
So the estimated probability that a customer with low numbers of customer service calls will churn is 11.4%, which is less than the overall proportion of churners in the data set, 14.5%, indicating that such customers churn somewhat less frequently than the overall group. Also, this probability could have been found directly from Table 4.6:
$$P(\text{churn} \mid \text{CSC} = \text{Low}) = \frac{214}{1878} = 0.114$$
For a customer with medium customer service calls, the probability of churn is estimated as

$$\hat{g}(x) = -2.051 - 0.0369891(1) + 2.11844(0) = -2.088$$

and

$$\hat{\pi}(x) = \frac{e^{\hat{g}(x)}}{1 + e^{\hat{g}(x)}} = \frac{e^{-2.088}}{1 + e^{-2.088}} = 0.110$$

The estimated probability that a customer with medium numbers of customer service calls will churn is 11.0%, which is about the same as that for customers with low numbers of customer service calls. The analyst may consider collapsing the distinction between CSC-Med and CSC-Low. This probability could have been found directly from Table 4.6:

$$P(\text{churn} \mid \text{CSC} = \text{Medium}) = \frac{131}{1188} = 0.110$$
For a customer with high customer service calls, the probability of churn is estimated as

$$\hat{g}(x) = -2.051 - 0.0369891(0) + 2.11844(1) = 0.06744$$

and

$$\hat{\pi}(x) = \frac{e^{\hat{g}(x)}}{1 + e^{\hat{g}(x)}} = \frac{e^{0.06744}}{1 + e^{0.06744}} = 0.5169$$

Thus, customers with high levels of customer service calls have a much higher estimated probability of churn, over 51%, which is more than triple the overall churn rate. Clearly, the company needs to flag customers who make 4 or more customer service calls and intervene with them before they attrit. This probability could also have been found directly from Table 4.6:

$$P(\text{churn} \mid \text{CSC} = \text{High}) = \frac{138}{267} = 0.5169$$
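The three probability calculations above follow the same pattern, so they can be checked together. The following Python sketch evaluates the estimated logit for each CSC level and compares the result with the proportion read directly from Table 4.6. The names `churn_probability`, `CODING`, and `COUNTS` are illustrative only.

```python
import math

# Coefficients as reported in Table 4.7.
B0, B1, B2 = -2.051, -0.0369891, 2.11844

# Cell counts from Table 4.6: (churn=false, churn=true).
COUNTS = {"Low": (1664, 214), "Medium": (1057, 131), "High": (129, 138)}

# Indicator values (CSC-Med, CSC-Hi) for each level under reference cell coding.
CODING = {"Low": (0, 0), "Medium": (1, 0), "High": (0, 1)}


def churn_probability(csc_med: int, csc_hi: int) -> float:
    """Estimated probability of churn from the estimated logit g(x)."""
    g = B0 + B1 * csc_med + B2 * csc_hi
    return math.exp(g) / (1.0 + math.exp(g))


model = {level: churn_probability(*CODING[level]) for level in CODING}
direct = {level: yes / (no + yes) for level, (no, yes) in COUNTS.items()}

for level in CODING:
    print(f"{level:<6} model={model[level]:.4f}  direct={direct[level]:.4f}")
```

The two columns agree to within the rounding of the reported coefficients.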
Applying the Wald test for the significance of the CSC-Med parameter, we have $b_1 = -0.0369891$ and $SE(b_1) = 0.117701$, giving us

$$Z_{\text{Wald}} = \frac{-0.0369891}{0.117701} = -0.31426$$
as reported under Z for the coefficient CSC-Med in Table 4.7. The p-value is $P(|z| > 0.31426) = 0.753$, which is not significant. There is no evidence that the CSC-Med versus CSC-Low distinction is useful for predicting churn. For the CSC-Hi parameter, we have $b_2 = 2.11844$ and $SE(b_2) = 0.142380$, giving us

$$Z_{\text{Wald}} = \frac{2.11844}{0.142380} = 14.88$$
as shown for the coefficient CSC-Hi in Table 4.7. The p-value, $P(|z| > 14.88) \approx 0.000$, indicates that there is strong evidence that the distinction CSC-Hi versus CSC-Low is useful for predicting churn.
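Both Wald tests can be reproduced with the standard library alone. In the sketch below, the helper `wald_test` is our own name. It assumes a two-sided test against the standard normal distribution, for which $P(|Z| > |z|) = \operatorname{erfc}(|z|/\sqrt{2})$.

```python
import math


def wald_test(coef: float, se: float) -> tuple[float, float]:
    """Return the Wald statistic z = coef / se and its two-sided p-value."""
    z = coef / se
    # Two-sided standard normal tail probability: P(|Z| > |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Coefficients and standard errors from Table 4.7.
z_med, p_med = wald_test(-0.0369891, 0.117701)
z_hi, p_hi = wald_test(2.11844, 0.142380)

print(f"CSC-Med: z={z_med:.5f}, p={p_med:.3f}")
print(f"CSC-Hi:  z={z_hi:.2f}, p={p_hi:.3g}")
```

The p-value for CSC-Hi is far smaller than 0.0005, which is why Minitab displays it as 0.000.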
Examining Table 4.7, note that the odds ratios for both CSC = Medium and CSC = High are equal to those we calculated using the cell counts directly. Also note that the logistic regression coefficients for the indicator variables are equal to the natural log of their respective odds ratios:

$$b_{\text{CSC-Med}} = \ln(0.96) \approx \ln(0.963687) = -0.0369891$$
$$b_{\text{CSC-Hi}} = \ln(8.32) \approx \ln(8.31819) = 2.11844$$
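For example, the natural log of the odds ratio of CSC-Hi to CSC-Low can be derived using equation (4.4) as follows:

$$\begin{aligned}
\ln[\text{OR}(\text{High}, \text{Low})] &= \hat{g}(\text{High}) - \hat{g}(\text{Low}) \\
&= [b_0 + b_1(\text{CSC-Med} = 0) + b_2(\text{CSC-Hi} = 1)] - [b_0 + b_1(\text{CSC-Med} = 0) + b_2(\text{CSC-Hi} = 0)] \\
&= b_2 = 2.11844
\end{aligned}$$

Similarly, the natural log of the odds ratio of CSC-Medium to CSC-Low is given by

$$\begin{aligned}
\ln[\text{OR}(\text{Medium}, \text{Low})] &= \hat{g}(\text{Medium}) - \hat{g}(\text{Low}) \\
&= [b_0 + b_1(\text{CSC-Med} = 1) + b_2(\text{CSC-Hi} = 0)] - [b_0 + b_1(\text{CSC-Med} = 0) + b_2(\text{CSC-Hi} = 0)] \\
&= b_1 = -0.0369891
\end{aligned}$$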
Just as for the dichotomous case, we may use the cell entries to estimate the standard error of the coefficients directly. For example, the standard error for the logistic regression coefficient $b_1$ for CSC-Med is estimated as follows:

$$\widehat{SE}(b_1) = \sqrt{\frac{1}{131} + \frac{1}{1664} + \frac{1}{214} + \frac{1}{1057}} = 0.117701$$
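The standard error calculation is the square root of the sum of reciprocal cell counts from the relevant 2 × 2 portion of Table 4.6. A brief Python sketch follows. The function name is ours, and applying the same formula to CSC-Hi is our extension of the worked example for $b_1$, offered as a check against the SE Coef column of Table 4.7.

```python
import math


def se_log_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Standard error of a log odds ratio from the four cell counts involved."""
    return math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)


# CSC-Med versus CSC-Low: cells 131, 1664, 214, 1057 from Table 4.6.
se_med = se_log_odds_ratio(131, 1664, 214, 1057)

# CSC-Hi versus CSC-Low: cells 138, 1664, 214, 129 (our extension).
se_hi = se_log_odds_ratio(138, 1664, 214, 129)

print(f"SE(b1)={se_med:.6f}  SE(b2)={se_hi:.6f}")
```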
Also similar to the dichotomous case, we may calculate $100(1 - \alpha)\%$ confidence intervals for the odds ratios, for the $i$th predictor, as follows:

$$\exp\left[b_i \pm z \cdot \widehat{SE}(b_i)\right]$$
For example, a 95% confidence interval for the odds ratio between CSC-Hi and CSC-Low is given by:

$$\exp\left[b_2 \pm z \cdot \widehat{SE}(b_2)\right] = \exp\left[2.11844 \pm (1.96)(0.142380)\right] = (e^{1.8394}, e^{2.3975}) = (6.29, 11.0)$$
as reported in Table 4.7. We are 95% confident that the odds ratio for churning for customers with high customer service calls compared to customers with low customer service calls lies between 6.29 and 11.0. Since the interval does not include $e^0 = 1$, the relationship is significant with 95% confidence.
However, consider the 95% confidence interval for the odds ratio between CSC-Med and CSC-Low:

$$\exp\left[b_1 \pm z \cdot \widehat{SE}(b_1)\right] = \exp\left[-0.0369891 \pm (1.96)(0.117701)\right] = (e^{-0.2677}, e^{0.1937}) = (0.77, 1.21)$$
as reported in Table 4.7. We are 95% confident that the odds ratio for churning for customers with medium customer service calls compared to customers with low customer service calls lies between 0.77 and 1.21. Since this interval does include $e^0 = 1$, the relationship is not significant with 95% confidence. Depending on other modeling factors, the analyst may consider collapsing CSC-Med and CSC-Low into a single category.
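Both confidence intervals, and the check of whether each contains 1, can be sketched as follows. The function name `odds_ratio_ci` is our own, and the multiplier z = 1.96 is the usual value for 95% confidence.

```python
import math


def odds_ratio_ci(coef: float, se: float, z: float = 1.96) -> tuple[float, float]:
    """Confidence interval for an odds ratio: exp(coef -/+ z * se)."""
    return math.exp(coef - z * se), math.exp(coef + z * se)


# Coefficients and standard errors from Table 4.7.
ci_hi = odds_ratio_ci(2.11844, 0.142380)
ci_med = odds_ratio_ci(-0.0369891, 0.117701)

for name, (lower, upper) in (("CSC-Hi", ci_hi), ("CSC-Med", ci_med)):
    # An interval containing 1 means the odds ratio is not significant.
    contains_one = lower <= 1.0 <= upper
    print(f"{name}: ({lower:.2f}, {upper:.2f})  contains 1: {contains_one}")
```

The interval for CSC-Hi excludes 1 while the interval for CSC-Med includes it, matching the conclusions drawn from the Wald tests.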