(1)

Survey Analysis: Data Mining versus Standard Statistical Analysis for Better Analysis of Survey Responses

By Dean Abbott

Abbott Analytics

http://www.abbottanalytics.com

Salford Systems Data Mining 2006

March 27-31 2006

San Diego, CA

(2)

Acknowledgements

Work done under contract with Seer Analytics

Subcontractors: Tessar and Associates (now Mobile Foundry), Abbott Consulting (now Abbott Analytics)

Seer Analytics, LLC

518 North Tampa Street

Tampa, FL 33602

we help you see what's there.

(3)

About Abbott Analytics

Abbott Analytics

Founded in 1999, based in San Diego, CA

Dedicated to data mining consulting and training

Principal: Dean Abbott

Applied Data Mining for 19+ years in Direct Marketing, CRM, Survey Analysis, Tax Compliance, Fraud Detection, Predictive Toxicology, Biological Risk Assessment

Course Instruction

Public 2-day Data Mining Courses

Conference Tutorials

Customized Training and Knowledge Transfer

Data mining methodology (CRISP-DM)

Training services for software products, including CART, Clementine,

(4)

Talk Outline

Member survey

Survey description

Results using statistical modeling

Lessons learned

Employee survey

Survey description

Results using decision trees (CART)

Lessons learned

(5)

Problem Setup:

Member Survey

Question:

What are the characteristics of members who indicated the highest overall satisfaction with their Club?

Data:

32,811 records containing survey answers

No demographic data except what was on survey (marital status, children, age, gender)

Approach:

Create supervised learning models with target variable

“

(6)

Data Preparation

Begin with 57 candidate inputs to model

All survey questions are multiple choice

Treated as categories, not numbers

Typically 6 categories per question (1-5)

Unknown initially coded as “0”

No text comments fields included as inputs to model

Create new column for target variable

If overall_satisfaction = 1, variable value = 1, otherwise, variable value = 0

Data very clean with respect to missing data

Only needed to recode # children fields

Number missing: 11,006 children < 6; 10,701 children 6-12; 10,873 children 13-17; 4,936 children (overall)
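The target definition on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the dict-per-survey record layout, the function name `binarize_target`, and the sample values are hypothetical; only the rule "overall_satisfaction = 1 gives 1, otherwise 0" comes from the slide.

```python
def binarize_target(records, field="overall_satisfaction", top_value=1):
    """Return copies of the records with a 0/1 'target' column added."""
    labeled = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input records are left untouched
        new_rec["target"] = 1 if rec.get(field) == top_value else 0
        labeled.append(new_rec)
    return labeled


# Hypothetical responses; 0 is the survey's code for "unknown".
sample = [
    {"overall_satisfaction": 1},
    {"overall_satisfaction": 3},
    {"overall_satisfaction": 0},
]
targets = [rec["target"] for rec in binarize_target(sample)]
```

Note that under this rule an unknown answer (coded 0) falls into the "not highly satisfied" class along with ratings 2 through 5.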

(7)

Member Survey Question

Categories

(8)

Sampling

Begin with 32,811 responses

Set aside about half for validation (not used during modeling): 16,379 records

These records will be used to provide final summaries of the segments

16,433 records used in creating and scoring model

5,059 had overall satisfaction = 1 (30.8%)

Model 1 splits data into training and testing data: 2/3 for training (creating model), 1/3 for testing (scoring and ranking models)
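The sampling scheme above (hold out about half for validation, then split the remainder 2/3 and 1/3) might look like the following sketch. The function name, the fixed seed, and the use of exact halves are assumptions for illustration; the slide's actual partition sizes (16,379 and 16,433) show the real split was only approximately even.

```python
import random


def split_sample(records, seed=0):
    """Shuffle, hold out half for validation, then split the rest 2/3 : 1/3."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    validation, modeling = shuffled[:half], shuffled[half:]
    cut = (2 * len(modeling)) // 3
    training, testing = modeling[:cut], modeling[cut:]
    return validation, training, testing


# Record IDs stand in for the 32,811 survey responses.
validation, training, testing = split_sample(range(32811))
```

Keeping the validation half untouched until the end is what lets the final segment summaries be reported on records the models never saw.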

(9)

Relationship of Overall Satisfaction

to Recommend to Friends

[Figure: cross-tab plot; axes: Overall satisfaction and Recommend to Friend]

• Of the 4912 / 16739 (30.2%) with Overall Satisfaction = 1

• 86% have Recommend to Friends = 1

• Of the 8708 / 16739 (54%) with Recommend to Friends = 1

• 49% have Overall Satis. = 1

• 4227 / 16739 (26.0%) have overall satisfaction and recommend to friends both equal to 1

• This is the biggest bin of the cross tab, followed by

• Overall = 2 / recommend = 2 (24%; 3890 / 16739)

• Overall = 2 / recommend = 1 (22%; 3565 / 16739)

• No other bin greater than 5% of records
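The two conditional percentages quoted on this slide follow directly from the cross-tab counts. A short worked check, using only the counts given above (the variable names are illustrative):

```python
both = 4227           # overall satisfaction = 1 AND recommend to friends = 1
overall_top = 4912    # overall satisfaction = 1
recommend_top = 8708  # recommend to friends = 1

# Share of the most satisfied who would also strongly recommend
pct_recommend_given_overall = 100 * both / overall_top
# Share of strong recommenders who are also most satisfied
pct_overall_given_recommend = 100 * both / recommend_top
```

The asymmetry (roughly 86% one way, 49% the other) is why the two questions are closely related but not interchangeable as targets.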

(10)

Objective and Data

Challenges

Project Objective

Interpret results of survey for large health club (not a predictive model)

Challenges

Missing data (some questions either N/A or blank)

Solution: Impute values that least affect information communicated by question (not a mean or median!)

Answers (target variables) highly correlated with one another

Multicollinearity and interpretation of results problematic

Must reduce dimensionality without losing interpretation of results

Solution: Factor analysis

Target variable

Three questions pointed to the important actionable information (related to how satisfied members were)

(11)

Data Preprocessing Approach

Reduce input data (for understanding)

Use factor analysis to identify groupings of variables that are interesting.

Factors can be candidate inputs to models, but didn’t work as well on this data

Selected as inputs the variables with the highest loadings, as representatives of each factor

Also retained key questions in addition to the factor analysis representative questions

The effect is to remove questions “too highly” correlated with one another, while maintaining relevant information for modeling.
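Picking one representative question per factor, as described above, amounts to taking the question with the largest absolute loading on each factor. The sketch below assumes the loadings are already available as a nested dict; the loading values and the function name are hypothetical and do not come from the survey.

```python
def representatives(loadings):
    """Map each factor to the question with the largest absolute loading.

    loadings: {factor_name: {question: loading}}
    """
    return {
        factor: max(questions, key=lambda q: abs(questions[q]))
        for factor, questions in loadings.items()
    }


# Hypothetical loadings for two factors.
loadings = {
    "Factor1": {"Q2": 0.81, "Q3": 0.64, "Q4": 0.42},
    "Factor2": {"Q12": 0.35, "Q13": -0.77, "Q14": 0.58},
}
reps = representatives(loadings)
```

Using the original question rather than the factor score keeps the model inputs directly interpretable, which matches the stated goal of reducing inputs "for understanding".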

(12)

Predictive Modeling Approach

[Flow diagram]

60+ Survey Questions

Identify Key Questions: 3 questions with high association with target (3 key questions)

Factor Analysis: 10 factors, or variables that loaded highest on each factor

Regression Model: Find Significant Variables (13 fields down to 7)

Variable ranks

(13)

Factor Analysis:

[Figure: bar chart of loadings for Factor 1 through Factor 10; axis: Factor Loading]

[Figure: Top Question Loadings, Factor 1; questions Q2 through Q12; axis: Loading Value]

[Figure: Top Question Loadings, Factor 2; questions Q12 through Q20 and Q23; axis: Loading Values]

(14)

Member Survey Factor

Analysis Loadings

(15)

Reduce Variables using

Regression

Already beginning with only 13 variables

Question: how many of these are useful predictors?

Decided to retain 5 factors for final model

[Figure: Regression Rankings of Questions/Factors; x-axis: Question/Factor (Q44, Q22, Q25, factor3.2, factor3.9, factor3.1, factor3.4, factor3.3, factor3.8, factor3.10, factor3.6, factor3.5, factor3.7); y-axis: Regression Coefficient, 0 to 0.6]

(16)

Explaining Results Through

Visualization

Customer was not interested in “techno” solutions

Customer was interested in what actions could be taken as a result of the data mining models

Which characteristics are most correlated with best customers?

What do they like and dislike about the club?

Is it equipment? relationships? facility? staff?

Show key contributors, how each club compared with other club locations, and if club is improving

(17)

Key: Explaining Results

Visualization shows key variables in survey associated with “excellence”, and performance metrics for each club

How well did this club do?

What is the change over last year’s result?

Shows which attributes the club needs to improve to raise customer satisfaction.

[Figure: Drivers of Satisfaction; categories: relationships, facility, equipment, Staff 2, Staff 1, goals, value]

(18)

So What’s The Problem with That?

Regression, Neural Networks are “global” estimators

They operate over the entire data space

Descriptors of Regression represent average influence

Neither technique provides explicit localized characteristics

Customer would like actionable analytics

Clear characteristics of subgroups

Different strategies for subgroups

Conclusion: In Round 2 (Employee Survey), use another approach

(19)

Employee Survey Analysis

Problem Setup

Very similar to member survey

60+ questions

Few demographics

Attitudes toward the job

How to handle questions

They are ordinal, but CART® supports interval and nominal types

Treat as categorical, but make sure values aren’t split up

If you see a split on a question grouping values 1, 2, 4, rebuild as interval variable

Didn
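The check described on this slide (a categorical split that groups values such as 1, 2, 4 breaks the ordinal scale) can be automated. The following is a sketch under assumptions: the function name is hypothetical, the scale is taken to be 1 to 5, and the "unknown" code 0 is ignored. A split respects the order when, walking along the scale, the side a value is sent to changes at most once.

```python
def respects_order(left_values, scale=(1, 2, 3, 4, 5)):
    """True if a categorical split is equivalent to a single ordinal cut point."""
    left = set(left_values)
    sides = [value in left for value in scale]
    # Count how often the assigned side flips between neighboring scale values.
    flips = sum(1 for a, b in zip(sides, sides[1:]) if a != b)
    return flips <= 1


ok_split = respects_order({1, 2})      # same as "answer <= 2"
bad_split = respects_order({1, 2, 4})  # 3 is separated from its neighbors
```

A split that fails this check is the signal, per the slide, to rebuild the tree with that question treated as an interval variable.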

(20)

Employee Survey Question

Groupings

(21)

Employee Survey:

Target Variable Definition

Predict key attitudes that are consequents

Satisfaction

Recommend to a Friend

Intend to Work Next Year at Club

Club is Good Place to Work

Exclude these from each other’s models

They are highly correlated with each other

Models that predict a target variable with these as inputs are not actionable

Key Predictors, questions relating to:

Communications with management

Quality of supervisors

Training received

Effectiveness of club

Fairness of policies

Perceived member attitudes

(22)

Employee Satisfaction (=1) Model:

Data Information

File: modeling data with binarized dependents w missing.txt

Target Variable: Q1_1

Predictor Variables: Q66, Q67, Q68, Q69, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10, Q11, Q12, Q13, Q14, Q15, Q16, Q17, Q18, Q20, Q21, Q22, Q23, Q24, Q25, Q26, Q27, Q28, Q29, Q30, Q31, Q32, Q33, Q34, Q35, Q36, Q37, Q38, Q45, Q46, Q47, Q48, Q49, Q50, Q51, Q52, Q53, Q54, Q55, Q56, Q57, Q58, Q59, Q60, Q61, Q62, Q63, Q64, Q65

Class | N Cases | Pct Cases
0 | 4,645 | 76.0%
1 | 1,470 | 24.0%

(23)

Employee Satisfaction Model:

Performance

Node | Cases Target Class | % of Node Tgt. Class | % Target Class | Cum % Tgt. Class | Cum % Pop | % Pop | Cases in Node | Cum lift | Lift
8 | 859 | 60.75 | 58.44 | 58.44 | 23.12 | 23.12 | 1,414 | 2.53 | 2.53
4 | 95 | 43.58 | 6.46 | 64.90 | 26.69 | 3.57 | 218 | 2.43 | 1.81
7 | 201 | 42.23 | 13.67 | 78.57 | 34.47 | 7.78 | 476 | 2.28 | 1.76
3 | 30 | 17.44 | 2.04 | 80.61 | 37.29 | 2.81 | 172 | 2.16 | 0.73
5 | 92 | 14.38 | 6.26 | 86.87 | 47.75 | 10.47 | 640 | 1.82 | 0.60
6 | 14 | 13.86 | 0.95 | 87.82 | 49.40 | 1.65 | 101 | 1.78 | 0.58
2 | 124 | 10.12 | 8.44 | 96.26 | 69.44 | 20.03 | 1,225 | 1.39 | 0.42
1 | 55 | 2.94 | 3.74 | 100.00 | 100.00 | 30.56 | 1,869 | 1.00 | 0.12

Class | N Cases | N Misclassified | Pct. Misclass
0 | 4,645 | 953 | 20.52
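The Lift column in the node table is the node's rate of highly satisfied employees divided by the overall rate (1,470 of 6,115, from the class counts on the previous slide). A minimal sketch reproducing two of the table's values; the function name is illustrative:

```python
def lift(target_in_node, cases_in_node, target_total, cases_total):
    """Node target rate relative to the overall (base) target rate."""
    node_rate = target_in_node / cases_in_node
    base_rate = target_total / cases_total
    return node_rate / base_rate


TARGET_TOTAL = 1470        # class 1 cases
CASES_TOTAL = 4645 + 1470  # all cases

lift_node8 = lift(859, 1414, TARGET_TOTAL, CASES_TOTAL)  # best node
lift_node1 = lift(55, 1869, TARGET_TOTAL, CASES_TOTAL)   # worst node
```

A lift above 1 marks a segment that is richer in highly satisfied employees than the population as a whole; below 1, poorer.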

(24)

Employee Satisfaction Model:

Splits

• Q8: Feel Welcome

– Surrogate: Q27 (family friendly), Q28 (inclusive environment), Q18 (good working conditions)

– Q18: Good working conditions

– Surrogate: Q17 (necessary support/materials to do job)

• Q3: Feeling of accomplishment

– Surrogates: Q6 (responsibilities good fit with interests/skills)

– Q7: Staff Competent

– Surrogates: Q15 (supervisor lets know work is appreciated), Q33 (trust management to take interests into account), Q5 (good opportunities for professional growth)

[Figure: decision tree; splits on Q8, Q18, Q3, Q32, Q7, Q3, Q36; terminal nodes 1 through 8]

(25)

Employee Satisfaction:

Q8 Split (root node)

Rank | Competitor | Split | Improvement
winner | Q8 | 1 | 0.1174
1 | Q18 | 1 | 0.1169
2 | Q3 | 1 | 0.0998
3 | Q35 | 1 | 0.0957
4 | Q6 | 1 | 0.0951
5 | Q7 | 1,2 | 0.094

(26)

Employee Satisfaction:

Q18 Split (right side of root)

This is the best terminal

node for satisfaction

Strongly agree feel welcome

Rank | Competitor | Split | Improvement
Winner | Q18 | 1 | 0.0271
1 | Q3 | 1 | 0.0203
2 | Q35 | 1 | 0.0195
3 | Q6 | 1 | 0.0177
4 | Q14 | 1,5 | 0.0172
5 | Q13 | 1,5 | 0.0167

(27)

Employee Satisfaction Model:

Key Variables

Variable | Score
Q18 | 100
Q8 | 81.02
Q14 | 72.03
Q27 | 55.11
Q26 | 50.53
Q28 | 50.12
Q5 | 17.66
Q3 | 14.14
Q17 | 14.05
Q11 | 13.15
Q7 | 11.89
Q13 | 11.56
Q6 | 11.27
Q33 | 11.03
Q16 | 9.6

Primary splitters only

Variable | Score
Q8 | 100
Q18 | 23.11
Q3 | 17.46
Q7 | 14.68
Q36 | 2.88
Q32 | 2.68

• Q8: Feel Welcome

– Surrogate: Q27 (family friendly), Q28 (inclusive environment), Q18 (good working conditions)

– Q18: Good working conditions

– Surrogate: Q17 (necessary support/materials to do job)

• Q3: Feeling of accomplishment

– Surrogates: Q6 (responsibilities good fit with interests/skills)

– Q7: Staff Competent

– Surrogates: Q15 (supervisor lets know work is appreciated), Q33 (trust management to take interests into account), Q5 (good opportunities for professional growth)

(28)

Employee Satisfaction Model: Key Rules

/*Rules for terminal node 8*/
Matches
• 1,414 surveys (23.1%),
• 859 highly satisfied (60.8%),
• 58.4% of all highly satisfied
RULE:
If ( Q18 = 1 and Q8 = 1 ) Then Highly Satisfied
P(0) = 0.39; P(1) = 0.61; Lift 2.5
If strongly agree that there are good working conditions and strongly agree that member

/*Rules for terminal node 7 */
Matches
• 476 surveys (7.8%),
• 201 highly satisfied (42.2%),
• 13.7% of all highly satisfied
RULE:
If ( Q8 = 1 and Q18 <> 1 and Q3 = 1 and Q32 = 1 or 2 ) Then Highly Satisfied
P(0) = 0.58; P(1) = 0.42; Lift 1.8
If strongly agree that feel welcome and strongly agree working at the club gives feeling of personal accomplishment, and agree management will take

/*Rules for terminal node 4 */
Matches
• 218 surveys (3.6%),
• 95 highly satisfied (43.6%),
• 6.5% of all highly satisfied
RULE:
If ( Q8 <> 1 and Q7 = 1 or 2 and Q3 = 1 and Q36 = 1 or 2 ) Then Highly Satisfied
P(0) = 0.56; P(1) = 0.44; Lift 1.8
If agree that I’ll be recognized for doing a good job, and strongly agree working at the club gives feeling of personal accomplishment, and agree

(29)

Employee Satisfaction Model:

Unsatisfied Rules

/*Rules for terminal node 1*/
Matches
• 1,869 surveys (30.6%),
• 55 highly satisfied (2.9%),
• 3.7% of highly satisfied
• 39.0% of all not highly satisfied
RULE:
If ( Q8 <> 1 and Q7 <> 1 or 2 ) Then not highly satisfied
P(0) = 0.96; P(1) = 0.04; Lift 0.12
If don’t strongly agree that feel welcome and don’t agree that will be properly recognized for a good job, then not highly satisfied.

/*Rules for terminal node 5*/
Matches
• 640 surveys (10.5%),
• 92 highly satisfied (14.4%),
• 6.3% of all highly satisfied
• 11.8% of all not highly satisfied
RULE:
If ( Q8 = 1 and Q18 <> 1 and Q3 <> 1 ) Then not highly satisfied
P(0) = 0.86; P(1) = 0.14; Lift 0.58
If don’t strongly agree that there are good working conditions and work doesn’t give a feeling of accomplishment, even though strongly agree that feel welcome, then not highly satisfied.

/*Rules for terminal node 2 */
Matches
• 1,225 surveys (20.0%),
• 124 highly satisfied (10.1%),
• 8.4% of highly satisfied
• 23.7% of all not highly satisfied
RULE:
If ( Q8 <> 1 and Q7 = 1 or 2 and Q3 <> 1 ) Then not highly satisfied
P(0) = 0.90; P(1) = 0.10; Lift 0.42
If don’t strongly agree that feel welcome and work doesn’t give a feeling of accomplishment, even though I agree that I will be properly recognized for a good job, then not highly satisfied.
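Taken together, the satisfaction rules for terminal nodes 1, 2, 4, 5, 7, and 8 define a small decision procedure. The sketch below expresses them as one Python function; it is an illustration, not the CART® output itself. The function name and the dict-based answer format are assumptions, and the branches returning nodes 3 and 6 are inferred as the complements of the stated rules, since the slides do not list those two rules.

```python
AGREE = (1, 2)  # 1 = strongly agree, 2 = agree


def satisfaction_node(answers):
    """Return the terminal node (1-8) for one employee's answers."""
    if answers["Q8"] == 1:            # strongly agree: feel welcome
        if answers["Q18"] == 1:       # strongly agree: good working conditions
            return 8
        if answers["Q3"] != 1:        # no strong feeling of accomplishment
            return 5
        return 7 if answers["Q32"] in AGREE else 6  # node 6 is inferred
    if answers["Q7"] not in AGREE:
        return 1
    if answers["Q3"] != 1:
        return 2
    return 4 if answers["Q36"] in AGREE else 3      # node 3 is inferred


# Hypothetical respondent: feels welcome and has good working conditions.
best_case = satisfaction_node(
    {"Q8": 1, "Q18": 1, "Q3": 2, "Q32": 3, "Q7": 3, "Q36": 3}
)
```

Writing the tree this way makes the localized nature of the rules visible: each employee is described by one short path, rather than by a weighted sum over all questions.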

(30)

Recommend to Friend (=1)

Model: Data Information

File: modeling data with binarized dependents w missing.txt

Target Variable: Q44_1

Predictor Variables: Q66, Q67, Q68, Q69, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10, Q11, Q12, Q13, Q14, Q15, Q16, Q17, Q18, Q19, Q20, Q21, Q22, Q23, Q24, Q25, Q26, Q27, Q28, Q29, Q30, Q31, Q32, Q33, Q34, Q35, Q36, Q37, Q38, Q45, Q46, Q47, Q48, Q49, Q50, Q51, Q52, Q53, Q54, Q55, Q56, Q57, Q58, Q59, Q60, Q61, Q62, Q63, Q64, Q65

Class | N Cases | Pct
0 | 3,958 | 64.7%
1 | 2,157 | 35.3%

(31)

31 © Abbott Analytics, 2000-2006

Recommend to Friend Model

Performance

Class | N Cases | N Misclassified | Pct. Misclass
0 | 3,958 | 894 | 22.59
1 | 2,157 | 525 | 24.34

Node | Cases Target Class | % of Node Tgt. Class | % Target Class | Cum % Tgt. Class | Cum % Pop | % Pop | Cases in Node | Cum lift | Lift
10 | 1,113 | 71.90 | 51.60 | 51.60 | 25.32 | 25.32 | 1,548 | 2.04 | 2.04
9 | 110 | 58.51 | 5.10 | 56.70 | 28.39 | 3.07 | 188 | 2.00 | 1.66
5 | 198 | 56.57 | 9.18 | 65.88 | 34.11 | 5.72 | 350 | 1.93 | 1.60
4 | 128 | 49.81 | 5.93 | 71.81 | 38.32 | 4.20 | 257 | 1.87 | 1.41
8 | 83 | 45.36 | 3.85 | 75.66 | 41.31 | 2.99 | 183 | 1.83 | 1.29
3 | 215 | 29.49 | 9.97 | 85.63 | 53.23 | 11.92 | 729 | 1.61 | 0.84
7 | 36 | 24.83 | 1.67 | 87.30 | 55.60 | 2.37 | 145 | 1.57 | 0.70
2 | 132 | 15.60 | 6.12 | 93.42 | 69.44 | 13.84 | 846 | 1.35 | 0.44
6 | 12 | 14.12 | 0.56 | 93.97 | 70.83 | 1.39 | 85 | 1.33 | 0.40
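The per-class misclassification percentages on this slide are simple ratios of misclassified cases to class size. The sketch below recomputes them from the slide's counts and also derives an overall error rate, which the slide itself does not report; the function name is illustrative.

```python
def misclassification_pct(n_misclassified, n_cases):
    """Percentage of cases in a class that the tree assigns to the wrong class."""
    return 100.0 * n_misclassified / n_cases


pct_class0 = misclassification_pct(894, 3958)  # do not strongly recommend
pct_class1 = misclassification_pct(525, 2157)  # strongly recommend
# Overall rate across both classes (derived here, not shown on the slide).
pct_overall = misclassification_pct(894 + 525, 3958 + 2157)
```

The two class error rates are close to each other, so the tree is not trading accuracy on one class for accuracy on the other.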

(32)

Recommend to Friend Model

Splits

Q19: Treated with respect

Surrogates: Q18 (good working conditions) and Q8 (feel welcome)

Q37: Compensation practice is fair

Surrogates: Q36 (I am paid fairly)

Q45: How think members rate club

Surrogates: Q47, Q46, Q60 (member-cleanliness, enough equip., check on progress)

Q33: Trust management to take interests into account

Surrogates: Q32 (management keeps promises), Q34 (leaders remove roadblocks to inclusion)

Q5: Good opportunities for professional growth

Surrogates: Q4 (responsibilities good fit with interests), Q7 (appropriately recognized)

[Figure: decision tree; splits on Q19, Q37, Q45, Q35, Q50, Q33, Q45, Q5, Q8; labeled terminal nodes 1, 2, 5, 6, 9, 10]

(33)


Recommend to Friend Model

Key Variables

Variable | Score
Q8 | 100.0
Q19 | 99.1
Q18 | 97.4
Q15 | 64.5
Q16 | 63.1
Q14 | 61.3
Q33 | 39.6
Q35 | 33.8
Q32 | 24.7
Q34 | 23.9
Q31 | 23.9
Q9 | 21.5
Q7 | 15.4
Q45 | 14.8
Q37 | 12.9
Q5 | 10.0
Q36 | 9.7
Q4 | 4.3
Q38 | 4.0
Q22 | 1.6
Q50 | 1.4
Q26 | 1.0
Q48 | 0.8
Q47 | 0.7
Q28 | 0.6
Q46 | 0.6
Q11 | 0.3
Q51 | 0.3
Q60 | 0.1

Primary splitters only

Variable | Score
Q19 | 100
Q33 | 32.23
Q45 | 14.94
Q37 | 12.99
Q5 | 8.98
Q8 | 3.03
Q35 | 1.67
Q50 | 1.34

Q19: Treated with respect

Surrogates: Q18 (good working conditions) and Q8 (feel welcome)

Q37: Compensation practice is fair

Surrogates: Q36 (I am paid fairly)

Q45: How think members rate club

Surrogates: Q47, Q46, Q60 (member-cleanliness, enough equip., check on progress)

Q33: Trust management to take interests into account

Surrogates: Q32 (management keeps promises), Q34 (leaders remove roadblocks to inclusion)

Q5: Good opportunities for professional growth

Surrogates: Q4 (responsibilities good fit with interests), Q7 (appropriately recognized)

Q8: Feel welcome

Surrogates: Q7

(34)

Recommend to Friend Model:

Key Rules

/*Rules for terminal node 10*/
Matches
• 1,548 surveys (25.3%),
• 1,113 recommend (71.9%),
• 51.6% of all strong recommends
RULE:
If ( Q19 = 1 and Q37 = 1 or 2 ) Then Recommend = 1
P(0) = 0.281; P(1) = 0.719; Lift = 2.0
If strongly agree that supervisors treat me with respect, and agree that compensation practice is fair, then

/*Rules for terminal node 9*/
Matches
• 188 surveys (3.1%),
• 110 recommend (58.5%),
• 5.1% of all strong recommends
RULE:
If ( Q19 = 1 and Q37 <> 1 or 2 and Q45 = 1 ) Then Recommend = 1
P(0) = 0.415; P(1) = 0.585; Lift = 1.7
If strongly agree that supervisors treat me with respect, and believe that members strongly agree they are highly

/*Rules for terminal node 5*/
Matches
• 350 surveys (5.7%),
• 198 recommend (56.6%),
• 9.2% of all strong recommends
RULE:
If ( Q19 <> 1 and Q33 = 1 or 2 and Q45 = 1 ) Then Recommend = 1
P(0) = 0.434; P(1) = 0.566; Lift = 1.6
If agree that trust management will take my interests into account, and believe that members strongly agree they are highly satisfied, even though don’t

(35)


Recommend to Friend Model:

Rules for Not Recommending

/*Rules for terminal node 1 */
Matches
• 1,784 surveys (29.2%),
• 130 highly recommend (7.3%), 94% don’t highly rec.
• 6.0% of all highly recommend
RULE:
If ( Q31 <> 1 and Q22 <> 1 ) Then Don’t Strongly Recommend
P(0) = 0.94; P(1) = 0.06
If don’t strongly agree that supervisors treat me with respect, and don’t agree that management will take interests into account, then don’t strongly agree that will recommend to friend.

/*Rules for terminal node 2 */
Matches
• 846 surveys (13.84%),
• 132 highly recommend (15.6%), 84.4% don’t highly rec.
• 6.1% of all highly recommend
RULE:
If ( Q19 <> 1 and Q33 = 1 or 2 and Q45 <> 1 and Q5 <> 1 or 2 ) Then Don’t Strongly Recommend
P(0) = 0.84; P(1) = 0.16
If don’t strongly agree that supervisors treat me with respect, and don’t strongly believe that members are highly satisfied, and don’t agree that there are good opportunities for professional growth, then even though agree that management will take interests into account,

(36)

Intend to Continue Working at Club (=1)

Model: Data Information

File: modeling data with binarized dependents w missing.txt

Target Variable: Q39_1

Predictor Variables: Q66, Q67, Q68, Q69, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10, Q11, Q12, Q13, Q14, Q15, Q16, Q17, Q18, Q20, Q21, Q22, Q23, Q24, Q25, Q26, Q27, Q28, Q29, Q30, Q31, Q32, Q33, Q34, Q35, Q36, Q37, Q38, Q45, Q46, Q47, Q48, Q49, Q50, Q51, Q52, Q53, Q54, Q55, Q56, Q57, Q58, Q59, Q60, Q61, Q62, Q63, Q64, Q65

Class | N Cases | Pct
0 | 3,030 | 49.6%
1 | 3,085 | 50.4%

(37)


Intend to Continue Working at Club:

Model Performance

Class | N Cases | N Misclassified | Pct. Misclass
0 | 3,030 | 868 | 28.65
1 | 3,085 | 849 | 27.52

Node | Cases Target Class | % of Node Tgt. Class | % Target Class | Cum % Tgt. Class | Cum % Pop | % Pop | Cases in Node | Cum lift | Lift
10 | 1,099 | 80.81 | 35.62 | 35.62 | 22.24 | 22.24 | 1,360 | 1.60 | 1.60
9 | 486 | 69.63 | 15.75 | 51.38 | 33.66 | 11.42 | 698 | 1.53 | 1.38
5 | 349 | 67.38 | 11.31 | 62.69 | 42.13 | 8.47 | 518 | 1.49 | 1.34
8 | 100 | 65.36 | 3.24 | 65.93 | 44.63 | 2.50 | 153 | 1.48 | 1.30
4 | 202 | 53.87 | 6.55 | 72.48 | 50.76 | 6.13 | 375 | 1.43 | 1.07
7 | 75 | 43.86 | 2.43 | 74.91 | 53.56 | 2.80 | 171 | 1.40 | 0.87
2 | 224 | 35.33 | 7.26 | 82.17 | 63.93 | 10.37 | 634 | 1.29 | 0.70
3 | 43 | 33.59 | 1.39 | 83.57 | 66.02 | 2.09 | 128 | 1.27 | 0.67
6 | 65 | 30.23 | 2.11 | 85.67 | 69.53 | 3.52 | 215 | 1.23 | 0.60
1 | 442 | 23.73 | 14.33 | 100.00 | 100.00 | 30.47 | 1,863 | 1.00 | 0.47

(38)

Intend to Continue Working at Club

Model: Splitters

• Q8: Feel Welcome

– Surrogate: Q27 (family friendly place), Q28 (diverse environment), Q18 (good working conditions)

• Q69: Age

– Surrogate: Q66 (how long worked at Club), Q68 (education)

• Q18: Good Working Conditions

– Q17 (have necessary support and materials to do job)

• Q5: Good Opportunities for Professional Growth

– Q7, Q33 (Management will take my interests into account)

• Q7: Will be Recognized for Good Job

[Figure: decision tree; splits on Q8, Q69, Q18, Q5, Q6, Q5, Q7, Q66, Q56; labeled terminal nodes 1, 2, 5, 6, 9, 10]

(39)

Intend to Continue Working at Club

Model: Key Variables

Variable | Score
Q8 | 100
Q18 | 84.13
Q27 | 63.23
Q11 | 57.03
Q28 | 50.45
Q26 | 48.54
Q7 | 43.43
Q5 | 37.23
Q33 | 32.81
Q31 | 23.56
Q69 | 22.21
Q4 | 21.86
Q9 | 18.79
Q3 | 13.82
Q13 | 9.98
Q14 | 9.46
Q16 | 8.12
Q15 | 6.03
Q66 | 5.26
Q17 | 3.99
Q56 | 2.15
Q6 | 2.03
Q23 | 1.63
Q68 | 1.23

Primary splitters only

Variable | Score
Q8 | 100
Q5 | 37.07
Q69 | 17.48
Q7 | 11.24
Q18 | 10.7
Q66 | 5.19
Q56 | 2.15
Q6 | 2.03

• Q8: Feel Welcome

– Surrogate: Q27 (family friendly place), Q28 (diverse environment), Q18 (good working conditions)

• Q69: Age

– Surrogate: Q66 (how long worked at Club), Q68 (education)

• Q18: Good Working Conditions

– Q17 (have necessary support and materials to do job)

• Q5: Good Opportunities for Professional Growth

– Q7, Q33 (Management will take my interests into account)

• Q7: Will be Recognized for Good Job

(40)

Intend to Continue Working at Club

Model: Key Rules

/*Rules for terminal node 10 */
Matches
• 1,360 surveys (22.2%),
• 1,099 intend to continue (80.8%),
• 35.6% of all intend to continue
RULE:
If ( Q8 = 1 and Q69 >= 2.5 ) Then Intend to continue
P(0) = 0.19; P(1) = 0.81; Lift = 1.6
If strongly agree that feel welcome and am 35 years old or older, then strongly agree that

/*Rules for terminal node 9 */
Matches
• 698 surveys (11.4%),
• 486 intend to continue (69.6%),
• 15.8% of all intend to continue
RULE:
If ( Q8 = 1 and Q18 = 1 and Q69 <= 2.5 ) Then Intend to continue
P(0) = 0.30; P(1) = 0.70; Lift = 1.4
If strongly agree that feel welcome and strongly agree that there are good

/*Rules for terminal node 5 */
Matches
• 518 surveys (8.5%),
• 349 intend to continue (67.4%),
• 11.3% of all intend to continue
RULE:
If ( Q8 <> 1 and Q5 = 1 or 2 and Q7 = 1 or 2 and Q66 > 2.5 ) Then Intend to continue
P(0) = 0.32; P(1) = 0.68; Lift = 1.3
If I strongly agree that if I do a good job I’ll be recognized, and I strongly agree that there are good opportunities for professional growth, and I have worked at the club for more than 2

(41)


Intend to Continue Working at Club Model:

Rules for Don’t Strongly Intend to Continue

/* Rules for terminal node 1 */
Matches
• 1,863 surveys (30.5%),
• 442 strongly intend to continue working (23.7%),
• 14.3% of all strongly intend to continue working
• 46.9% of all not strongly intending to continue
RULE:
If ( Q8 <> 1 and Q5 <> 1 or 2 ) Then not strongly intending to continue working at club
P(0) = 0.76; P(1) = 0.24; Lift 0.47
If don’t strongly agree that feel welcome and don’t strongly agree that there are good opportunities for professional growth, then don’t strongly agree that intend to continue working at the club.

/*Rules for terminal node 2 */
Matches
• 634 surveys (10.4%),
• 224 strongly intend to continue working (35.3%),
• 7.3% of all strongly intend to continue working
• 13.5% of all not strongly intending to continue working
RULE:
If ( Q8 <> 1 and Q5 = 1 or 2 and Q7 <> 1 or 2 ) Then not strongly intending to continue working at club
P(0) = 0.65; P(1) = 0.35; Lift 0.70
If don’t strongly agree that feel welcome and don’t strongly agree that if I do a good job I’ll be recognized, even though I strongly agree that there are good opportunities for professional growth, then don’t strongly agree that intend to continue working at the club.

(42)

Satisfaction Model

Top two rules identify 65% of most satisfied

Top three rules identify 79% of most satisfied

Recommend to Friend

Top three rules identify 66% of most likely to recommend to friend

Intend to Keep Working at Club

Top three rules identify 63% of most likely to keep working
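The coverage figures above are running totals of the "% Target Class" column for the top-ranked terminal nodes in each model's performance table. A short sketch of that arithmetic, using the per-node shares from those tables (the variable and function names are illustrative):

```python
from itertools import accumulate

# Share of all target-class cases captured by each of the top three nodes.
satisfaction_shares = [58.44, 6.46, 13.67]  # nodes 8, 4, 7
recommend_shares = [51.60, 5.10, 9.18]      # nodes 10, 9, 5
intend_shares = [35.62, 15.75, 11.31]       # nodes 10, 9, 5


def cumulative_coverage(shares):
    """Running total of target-class coverage as rules are added."""
    return [round(total, 2) for total in accumulate(shares)]


satisfaction_cum = cumulative_coverage(satisfaction_shares)
recommend_cum = cumulative_coverage(recommend_shares)
intend_cum = cumulative_coverage(intend_shares)
```

Small differences in the second decimal from the tables' cumulative columns can arise because the per-node shares are themselves rounded.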

(43)

Summary of Results

Satisfaction keys:

Make an environment where employees feel welcome, and have a sense of purpose

Recommend to a Friend keys

Supervisors treat employees with respect and either good pay or it is perceived that members really like the club

Will work at club in a year’s time

For those under 35: feel welcome (relationships)

For those over 35 (or worked at club a long time): feel welcome and good working conditions

For those who don’t feel welcome, need good opportunities for professional growth

(44)

Conclusions

Trees can be used to provide concise summaries of behavioral tendencies from surveys

Regression shows global, average attitudes

Trees show specific, localized attitudes

Two or three rules can describe nearly 2/3 of all employee attitudes of interest

Rules make sense, and are easy to explain

Rules are actionable
