The Determinants of Student Success in University:
A Generalized Ordered Logit Approach
Philippe Cyrenne and Alan Chan
July 15, 2019
Abstract. The ability of Universities and Colleges to predict the future success of admitted students
continues to be a key concern of higher education officials. Apart from a desire to see a greater number of
students have successful academic careers, there is also the fiscal reality of greater tuition revenues
providing needed support for university budgets. Using administrative data, this paper introduces a
relatively new empirical approach to estimating the determinants of student success in post secondary
institutions. Using Ordered Logit and Generalized Ordered Logit estimators, we are able to estimate the
role a number of key factors have in influencing student success. An important feature of this approach is
that it can be conducted using readily available administrative data. While the results relate to one
institution, we feel there are useful lessons to be learned for other institutions facing similar issues.
1. Introduction
The ability of Universities and Colleges to retain previously admitted students continues to be a key concern of higher education officials. Apart from their desire to see a greater number of students have successful academic careers, there is also the fiscal reality of higher tuition revenues providing needed support for university budgets. This desire has spurred a large number of researchers to examine the factors that influence the performance of students in higher education, by examining issues related to graduation, stop out and withdrawal.
There are a number of approaches that researchers have taken to try to estimate the relationship between student characteristics, institutional support and student performance. One approach is to predict the retention or withdrawal rates of students using "Survival Analysis", also called an "event history study".
Survival analysis is the branch of statistics that examines time to a significant event; for example, in the medical literature it examines the factors that influence the time to a heart attack.
In higher education, the event of interest might be the time to withdrawal, either permanent or temporary, or time to graduation.
The type of survival analysis used depends largely on the nature and frequency of the data. Continuous time models are based on frequent observations, while discrete time methods are more appropriate when subjects are observed at relatively infrequent fixed intervals. Given that data on post secondary students is collected on a semester or annual basis, it has been suggested that discrete time methods are more appropriate for event studies of higher education. While addressing student performance using discrete time survival analysis can yield helpful insights, that approach requires significant data preparation and must address the alternative student outcomes in a competing risk framework.
Another approach higher education researchers have taken is to model the factors that determine the varying academic outcomes for post secondary students using a bivariate or multinomial choice model.
For example, choice models can relate the different student outcomes to a number of explanatory variables,
which include student characteristics as well as the level and nature of institutional support.
A key
shortcoming of bivariate or multinomial models of student outcomes is they ignore the ranking of
alternative student outcomes that can occur, for example graduation, continuation, or withdrawal. The fact
that multinomial models do not rank outcomes is at odds with the fact that both students and university officials have a sense that a student’s performance in university programs can be ranked. For example, graduation is preferable to continuation which is preferable to withdrawal from the university.
This paper builds on the latter approach by examining the factors that determine the relative success of students at one post secondary institution. The central research question is to determine the factors that influence the graduation, continuation, or withdrawal of previously admitted students. In doing so, we develop a relatively novel approach that, as far as we know, has not been widely used in the higher education literature. Specifically, we use both an Ordered Logit and Generalized Ordered Logit model to determine the factors which can lead to greater success in university or college. In defining relative success, we assume the student outcomes can be ranked ordinally. For example, we assume the highest ranked student outcome would be graduation, followed by continued enrollment, and finally permanent withdrawal as the lowest ranked outcome. Each of these outcomes is based on a five year observation period.
In order to address the issue of the factors influencing student success, we collected data on the subsequent performance at the University of Winnipeg of students from 84 Manitoba high schools which are grouped into 32 Manitoba school divisions.
By tracking the university performance of these students, we are able to determine the likelihood of success of students enrolled at the University of Winnipeg, based on their initial academic performance as well as a number of characteristics including their high school grades and the nature of their high school education and financial support.
Our paper differs from previous work in a number of ways. First, our data set has a number of
unique features. For example, we are able to track students from a number of entering classes over a five
year period.
This allows us to record whether students graduated, continued or withdrew from the
university within that period.
Second, our data set only involves students who graduated from Manitoba
high schools, which are governed by a standard province wide curriculum and are taught by teachers who are
provincially certified. This allows us to control for the relationship between a student’s high school
curriculum and subsequent performance in university.
Third, our data set includes a significant number
of covariates, which can assist us in identifying the types of students that may be more successful. For example, we include whether students received financial assistance, either a bursary or a scholarship, while attending the University of Winnipeg. In addition, our data allows us to examine the effect of not only a student’s high school grades on the likelihood of success, but also the nature of the student’s high school, whether private or public, as well as its funding level. Fourth, by using a Generalized Ordered Logit model, we are able to develop a rather simple framework for evaluating the determinants of student success requiring a relatively straightforward application of commonly collected administrative data.
We feel the Generalized Ordered Logit approach is a very flexible estimator that can be easily applied by university officials, even in settings where data on students is limited.
Fifth, in order to forecast the success of admitted students we only make use of data on the respective covariates one year after admittance, data which is readily collected by university officials.
We also examined the determinants of success for a number of student types enrolled at the University of Winnipeg. For example, we identified the relative success of students who were enrolled on a student Visa, students who were granted Permanent resident status in Canada, and students who self-identified as Aboriginal.
Our findings can be summarized as follows. First, we find that a student’s first year performance is a key indicator of overall student success. We also find that students pursuing an education degree, students who entered with better high school grades or took a larger number of full course equivalents in their first year, were more successful. We find little evidence of significant high school fixed effects on student success. In addition, we find that the parallel lines/proportional odds assumption of the standard Ordered Logit model, which assumes the effects of the covariates are the same across the respective student outcomes, does not hold for a number of covariates. To account for these varying effects, we re-estimate the model using the Generalized Ordered Logit model as discussed by Williams (2006, 2016).
We find that the odds ratios associated with a number of regressors differ significantly with respect to the different categories of success.
While the results obtained here apply to one university, we feel that in many ways the results
are applicable to a wider range of institutions and countries, that share similar academic structures and
similar information gathering. In particular, the results here we feel are representative of liberal arts colleges and universities which have relatively liberal admittance criteria and relatively strong faculty.
2. A Brief Literature Review
A number of researchers have examined the factors that influence the success of students in higher education institutions. An early paper by Dey and Astin (1993) used logit, probit and linear regression models to estimate the retention of college students. Using data from 947 first time full time college freshmen from 29 community colleges in 1987, Dey and Astin (1993) found few practical differences between analyses conducted using logit, probit, or linear regression. They found the strongest predictor of retention, that is returning in the second year to complete a vocational certificate, was a student’s high school grade average, with concern about finances, a desire to make more money, and preparation for graduate school also playing a significant role.
Montmarquette et al. (2001), using data from over 3000 students enrolled in 43 programs and a bivariate probit model with selectivity bias, find that strong academic performance in the first semester, as well as a non-traditional class size effect in first year mandatory courses, help explain the persistence of the university students. Singell Jr. (2004) takes a random utility approach to examine whether financial aid affects college retention. Using data on 10,560 in state and out of state freshmen, Singell Jr. examines the decision of college students to re-enroll after the first year. He finds that while need and merit based financial aid significantly increase retention, increasing reliance on unsubsidized and merit-based aid by the government and universities lowers the relative graduation rates of needy students.
Herzog (2005), using a multinomial logit model, examines the impact of high school preparation,
first year academic performance, multi-institution enrollment, and financial aid support, on second year
persistence. He finds that middle income students with financial challenges are more at risk of dropping out,
while first year math experience, second year financial aid, and simultaneous enrollment at another college
or university are important determinants of student retention. In contrast, Arulampalam et al. (2005) use a
binomial logit model to determine the factors that influence the probability that an individual withdraws
from their university degree course during their first year of study. Using a large data set of undergraduate students drawn from UK universities, they find that weaker students, both males and females, are more likely to drop out. They also find that better prepared students are less likely to leave highly ranked universities while weaker students at the same universities face significant pressure to do so.
Similar to Herzog (2005), Stratton et al. (2008) use a multinomial logit random utility model to examine the choice of continuous enrollment, short-term dropout, and long term dropout. Using data from the National Center for Education Statistics, they find significant differences associated with stopout and dropout behavior. They find that delayed matriculation, the type of first year aid as well as marital and personal status have a significant effect on whether a student’s academic career is interrupted. They find that a substantial fraction of withdrawals are temporary, and the nature of the financial aid has a significant effect on whether a student continues or drops out.
Recently, Danilowicz-Gösele et al. (2017) use a Probit model and administrative data from a German university to estimate the factors contributing to student success. They find that student success, which they define as graduation, differs significantly by faculty. In particular, they find that a student’s entering grade is a strong predictor of academic success. In contrast, measures of parental income or social background do not add significantly to the explanatory power of the model.
Perhaps the closest research to ours in terms of methodology is the ordered logit model of high school persistence undertaken by Heck et al. (2012).
In their model, the ordered outcomes are 0 if the student dropped out, 1 if the student was enrolled, and 3 if the student graduated from high school. Using data from 6,883 students from 934 high schools, they find that lower levels of socioeconomic status or high absenteeism raise the log of the odds of dropping out or remaining enrolled in high school.
It is useful to place our research in relation to the above. While academic research examining
higher education issues using administrative data is not new, a goal of our research is to help university
officials who are concerned with retention issues. While empirical approaches like survival analysis can
potentially provide institutions with helpful information to identify students at risk, the estimator requires a
great deal of data preparation and relatively highly skilled administrative staff to perform such exercises.
In contrast, the Generalized Ordered Logit approach taken here requires less sophistication in terms of data collection, data preparation and model estimation. In addition, we argue that a Generalized Ordered Logit approach can provide a reliable indicator of which type of students are at risk, while only using data from the student’s first year performance at the University. We feel this reduced administrative burden as well as model parsimony are likely to be very useful to higher education administrators concerned with student retention and performance.
3. The Estimation Procedure
The first step in the estimation procedure involves defining what is meant by student success in higher education. In doing so, we order the outcomes of students enrolled at the University of Winnipeg as follows. We assume the highest ranked outcome is graduation, followed by continued enrollment, and finally withdrawal as the lowest ranked outcome. Specifically, if the student graduates the outcome is coded 3, if the student is still enrolled the outcome is coded 2, with the outcome coded 1 if the student has withdrawn from the University, all within the 5 year period.
It is clear there could be a larger set of ordinal outcomes, involving a finer partition of states. For example, we could distinguish between students who graduated within 4 years from those who graduated in 5 years. While this may be of interest to some observers, we feel an expansion of the event space will make the interpretation of the empirical results somewhat unwieldy.
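The three-way coding described above can be illustrated with a minimal Python sketch. The names `Outcome` and `code_outcome`, and the two boolean flags, are hypothetical and chosen only for illustration; they are not taken from the administrative data set.

```python
from enum import IntEnum


class Outcome(IntEnum):
    """Ordinal ranking of student outcomes at the end of the five year window."""
    WITHDRAWN = 1
    CONTINUING = 2
    GRADUATED = 3


def code_outcome(graduated: bool, still_enrolled: bool) -> Outcome:
    """Assign the ordinal code; the two flags are assumed inputs."""
    if graduated:
        return Outcome.GRADUATED
    if still_enrolled:
        return Outcome.CONTINUING
    return Outcome.WITHDRAWN


# Because the codes are ordered integers, outcomes can be compared directly.
print(code_outcome(True, False) > code_outcome(False, True))
```

A finer partition, such as separating four year from five year graduates, would simply add members to the enumeration.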
In order to estimate the determinants of the above ordinal rankings of student outcomes we use both an Ordered Logit Model and a Generalized Ordered Logit model. An Ordered Logit model is designed to estimate the underlying score (the ranking of outcome) as a linear function of the independent variables and a set of cutpoints. Specifically, "the probability of observing outcome i corresponds to the probability that the estimated linear function, plus random error, is within the range of the cutpoints estimated for the outcome."
As outlined by Williams (2006), the standard ordinal logit model (ologit) can be written as
\[
P(Y_i > j) = g(X\beta) = \frac{\exp(\alpha_j + X_i\beta)}{1 + \exp(\alpha_j + X_i\beta)}, \qquad j = 1, 2, \ldots, M-1
\]
The standard model is sometimes called the parallel lines/proportional odds model. In contrast, the Generalized Ordered Logit model (gologit) can be written as
\[
P(Y_i > j) = g(X\beta_j) = \frac{\exp(\alpha_j + X_i\beta_j)}{1 + \exp(\alpha_j + X_i\beta_j)}, \qquad j = 1, 2, \ldots, M-1
\]
As can be seen, the difference is that in the standard Ordinal Logit model, the β’s are constrained to be equal across categories, that is, for each value of j.
As suggested by Williams (2006:60), the parallel lines assumption, or fixed β’s across categories, is often violated. In the case where the β’s vary across categories, the probabilities that the dependent variable Y will take on the respective values, 1,…,M are given as
\[
\begin{aligned}
P(Y_i = 1) &= 1 - g(X_i\beta_1) \\
P(Y_i = j) &= g(X_i\beta_{j-1}) - g(X_i\beta_j), \qquad j = 2, \ldots, M-1 \\
P(Y_i = M) &= g(X_i\beta_{M-1})
\end{aligned}
\]
Depending on the number of categories M: when M=2 we obtain the logistic regression model, while with M>2, as outlined by Williams, the model becomes a series of binary logistic regressions.
The Generalized Ordered Logit can be estimated using the estimator gologit2 written for Stata 8.2.
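To make the mapping from cumulative to category probabilities concrete, the following is a minimal Python sketch, not the gologit2 estimator itself. It takes the M-1 cutpoints α_j and coefficient vectors β_j as given and returns P(Y_i = 1), …, P(Y_i = M), using the fact that P(Y_i = M) is the same event as P(Y_i > M-1). The function names and the numerical inputs are assumptions made purely for illustration.

```python
import math


def g(z: float) -> float:
    """Logistic function exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def gologit_probs(x, alphas, betas):
    """Category probabilities for one observation.

    x      : list of covariate values
    alphas : M-1 cutpoints (alpha_j)
    betas  : M-1 coefficient vectors (beta_j), each the same length as x
    """
    # Cumulative probabilities P(Y > j) for j = 1, ..., M-1.
    cum = [
        g(a + sum(b_k * x_k for b_k, x_k in zip(b, x)))
        for a, b in zip(alphas, betas)
    ]
    probs = [1.0 - cum[0]]                                    # P(Y = 1)
    probs += [cum[j - 1] - cum[j] for j in range(1, len(cum))]  # P(Y = 2..M-1)
    probs.append(cum[-1])                                     # P(Y = M)
    return probs


# Hypothetical values, not estimates from this paper (M = 3 outcomes).
example = gologit_probs(
    x=[1.0, 2.0],
    alphas=[0.5, -1.0],
    betas=[[0.3, -0.1], [0.1, 0.2]],
)
print(example, sum(example))
```

Passing the same coefficient vector for every j reproduces the parallel lines (ologit) case, and with a single cutpoint the function collapses to a binary logit.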
4. The Data Set and Estimation Procedure
The data used in this paper involves a cross section of student cohorts who entered the University of
Winnipeg over a five year period. The first set of students in our sample entered the University of Winnipeg
in 1997.
The University of Winnipeg may be described as a typical Liberal Arts institution in Canada
similar in many ways to smaller state schools in the United States. The University of Winnipeg has
relatively liberal admission standards, with a significant range in student preparation, but a relatively strong
faculty overall. The University of Winnipeg is a primarily undergraduate institution, which in many ways
is similar in structure and mission to four year public colleges found in the United States. It is funded in
much the same way as state colleges in the United States, with the majority of operating funds coming from
the Province of Manitoba.
During the period we tracked the performance of students, the University of Winnipeg academic structure consisted of 3 principal faculties: the Faculty of Education, the Faculty of Science and the Faculty of Arts (including Humanities). Unlike most universities, the University of Winnipeg did not have a Faculty of Business, which was only created in 2008, several years after our sample period. Prior to that, business studies consisted largely of a set of recommended courses from a wide variety of departments, and was viewed as an interdisciplinary program in the social sciences, with a relatively small set of explicitly business type courses in existence.
It is important to outline how we arrived at the final data set.
In total we have 5,013 observations
in our final data set. There were 14,246 observations in our initial sample; however, a number of students
were dropped from this initial sample for a variety of reasons. First, we only considered students who
graduated from a Manitoba high school: this included both Canadian and Foreign students.
Second, a
number of students were dropped because they did not have a standard high school average – i.e. students
who might have a letter grade for Grade 12 (or equivalent) courses rather than a numerical score. Third,
some adult learners (older than 21) who did not graduate from high school (Manitoba or otherwise) but
were admitted to the University of Winnipeg under a Mature Student category were dropped. Fourth,
students for which family income data could not be estimated, or were missing school expenditure data
from their respective high schools were ruled out.
Finally, students who did not have a five year average
GPA at the University of Winnipeg were excluded from the sample. The above exclusions left us with 5187
observations, but a decision to restrict the sample by dropping high schools that sent fewer than 3 students
to the University of Winnipeg resulted in 5136 observations. We further excluded 123 students who
dropped out completely prior to the end of the first year, which resulted in 5013 students in the final data
set. Our data set tracked the subsequent performance of the entering classes of 1997 through to 2001. Once
admitted, we tracked the course registrations and university grades for each entering class over a five year
period. To summarize our data set, of the population of students who graduated from the University of
Winnipeg, the students that were not included in our final data set were excluded largely for recording
reasons.
Table 1 provides summary statistics on our sample of 5013 students from 84 Manitoba High Schools. Table 1 shows that the majority of students, roughly 97%, were classified as Canadian (canadian), with Males making up approximately 36% of the students (male). The mean age of students was 18.8 years (age). Overall, the High School Average (hsaverage) of incoming students over the 5 year period (1997-2002) was 78.4%. In our sample, we also indicate whether a student was enrolled at the University of Winnipeg on a student visa (visa), 77 students in total; was a Permanent resident in Canada (permanent), 86 students; or self-identified as Aboriginal (aboriginal), 93 in total.
Table 1: Summary Statistics – Incoming Students (5013)
variable              sum   mean    sd       min     median  max
success                     1.92    0.8943   1       2       3
canadian              4850  0.97    0.1774   0       1       1
aboriginal            93    0.02    0.1349   0       0       1
permanent             86    0.02    0.1299   0       0       1
visa                  77    0.02    0.1230   0       0       1
male                  1794  0.36    0.4794   0       0       1
deg4yrs               116   0.02    0.1504   0       0       1
degedu                813   0.16    0.3687   0       0       1
degnoinfo             2387  0.48    0.4995   0       0       1
private               882   0.18    0.3808   0       0       1
relig_private         500   0.10    0.2997   0       0       1
dUWbursary1           109   0.02    0.1459   0       0       1
dUWscholarship1       1444  0.29    0.4529   0       0       1
dMBloan1              551   0.11    0.3128   0       0       1
dMBbursary1           124   0.02    0.1553   0       0       1
hsaverage (%)               78.39   10.0231  51      79      100
gpayr1                      2.77    0.9160   0       2.75    4.5
fcesyr1                     3.54    1.2495   0.5     4       7.5
age                         18.83   2.4800   16      18      69
rincome (000's)             48.828  13.2919  18.400  48.744  91.999
rexpenditure (000's)        6.077   0.9031   3.436   5.921   13.840
Source: University of Winnipeg Administrative Data, Statistics Canada, Province of Manitoba.
We also recorded the academic plans of students, with 116 students indicating they were pursuing a 4 year degree (deg4yrs), 813 were pursuing an Education degree (degedu), while 2387 students had not declared their degree plans (degnoinfo).
In addition, we recorded the grade point average of students after their first year (gpayr1) and the number of full course equivalent courses students completed after their first year (fcesyr1). The mean GPA after year 1 was 2.77, and students completed 3.5 full course equivalents in their first year on average.
We also include as regressors, whether the student received financial assistance while enrolled. We include whether the student received an entrance scholarship and/or a bursary. Students at the University of Winnipeg are eligible for a University scholarship (dUWscholarship1) or University bursary (dUWbursary1) or a Province of Manitoba loan (dMBloan1) or Province of Manitoba bursary (dMBbursary1).
We record the receipt of each of the above forms of financial assistance as a dummy variable. The fraction of students who received either a University of Winnipeg or Province of Manitoba bursary was 4%, while 29% of students received a University of Winnipeg scholarship and 11% of students received a student loan from the Province of Manitoba.
We also recorded several additional characteristics for each student. For example, we recorded the high school from which the student graduated.
The sample includes students from 84 Manitoba High schools, with the top 10 high schools sending slightly more than 50% (2584) of the total number of students to the University of Winnipeg. The students in our sample come from 77 public schools and 7 private schools. Private high schools are assigned to the Funded Independent School Division and receive some public funding from the Province of Manitoba.
Regarding private high schools (private), they include a mix of secular and religious based high schools (relig_private). In terms of the mix of students in our sample, 18% of the students graduated from a private high school, while 10% graduated from a religious based high school.
We also collected school expenditures per pupil determined at the school division level.
Specifically, rexpenditure is the amount of spending per student for each year and for each school division based on data from Manitoba Education, Citizenship and Youth.
The 5013 students came from 32 school
divisions. The top 10 school divisions are represented by 4705 students or 94% of the students in our sample. During the 1997-2002 sample period, the largest number of first year enrolments came from the Winnipeg School Division (969) followed by Funded Independent Schools (882). The remaining students came from several school divisions in the City of Winnipeg and a large number of rural school divisions. As can be seen, there is substantial variation in real expenditure by school division, with the lowest real expenditure per student being $3,436 while the highest is $13,840, with a median real expenditure of $5,921.
While it would be helpful to have data on financial resources of students, unfortunately the University of Winnipeg does not record the financial background of students or the educational level of parents or guardians in their admission process. Given that educational attainment is often seen as related to family income, we use as a proxy for a student’s family income, the median family income associated with the respective postal code given as their permanent residence (rincome).
In addition, the variable rincome can also capture peer or neighborhood effects which can play a role in academic achievement.
Table 2 summarizes the subsequent performance of students admitted to the University of Winnipeg for the 1997-2002 entering classes.
Table 2: Status of the 5013 students (5 years after entry)
Cohort  Withdrawn (1)  Continuing (2)  Graduated (3)  Total
1997    367            131             242            740
1998    380            138             285            803
1999    317            150             306            773
2000    326            183             299            808
2001    372            183             337            892
2002    467            185             345            997
Total   2,229          970             1,814          5,013
Source: University of Winnipeg Administrative Data
Our sample of 5013 observations includes both students who graduated over the five year period (1814) and those that did not (3199).
As can be seen, the withdrawal rate is very high, ranging from a low of 40% for the class of 2000 to a high of 50% for the students admitted in 1997. The percentage of students graduating within 5 years ranged from a low of 33% for the 1997 entering class to a high of 40% for the 1999 class.
Much like the trend for other post secondary institutions, a number of students are taking longer to complete their degrees, although the fraction has stabilized at around 19% which is the percentage of students continuing their registration after 5 years.
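The cohort percentages quoted above follow directly from the counts in the cohort status table. As a check, a short Python sketch recomputes them; the counts are copied from that table, while the variable and function names are ours and purely illustrative.

```python
# Counts per entering cohort: (withdrawn, continuing, graduated).
STATUS = {
    1997: (367, 131, 242),
    1998: (380, 138, 285),
    1999: (317, 150, 306),
    2000: (326, 183, 299),
    2001: (372, 183, 337),
    2002: (467, 185, 345),
}


def shares(counts):
    """Return each outcome's share of the cohort as a fraction."""
    total = sum(counts)
    return tuple(c / total for c in counts)


rates = {cohort: shares(counts) for cohort, counts in STATUS.items()}
withdrawal = {cohort: r[0] for cohort, r in rates.items()}
graduation = {cohort: r[2] for cohort, r in rates.items()}

# Column totals across all cohorts, and the overall continuing share.
totals = [sum(col) for col in zip(*STATUS.values())]
overall_continuing = totals[1] / sum(totals)

for cohort, (w, c, g_) in rates.items():
    print(f"{cohort}: withdrawn {w:.1%}, continuing {c:.1%}, graduated {g_:.1%}")
print(f"overall continuing share: {overall_continuing:.1%}")
```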
Table 3 reports the summary statistics for three subsamples of our complete data set, which include self-identified Aboriginal students, Permanent residents and Visa students. Permanent residents are students who have received landed immigrant status in Canada. As can be seen, there is some variation in the summary statistics for the three groups, however, there are some general trends. For example, overall self- identified Aboriginal students tend to be quite similar to Visa students, in terms of incoming high school average (hsaverage), first year gpa (gpayr1) and student success. Visa students are more likely to have attended a private high school in Manitoba than Aboriginal students or Permanent residents.
Students who are Permanent residents tend to have a higher incoming high school average, have a higher first year GPA and take more courses in their first year and come from higher income neighborhoods than Visa or Aboriginal students. Aboriginal students tend to be somewhat older, more likely to pursue an education degree and more likely to be female than Visa or Permanent resident students.
Regarding financial support, 15% of self-identified Aboriginal students in our sample received a loan from the Province of Manitoba, 13% received a bursary from the Province of Manitoba, and 13%
received a scholarship from the University of Winnipeg. In contrast, 30% of Permanent resident students received
a loan from the Province of Manitoba, 22% received a University of Winnipeg Scholarship and 8% a
University of Winnipeg Bursary. Regarding Visa students, they are generally ineligible for financial
support from the Province of Manitoba, but 4% did receive a scholarship from the University of Winnipeg.
Table 3: Summary Statistics - Sub Samples
stat    success  male   deg4yrs  degedu  degnoinfo  private  dMBloan1  dMBbursary1  dUWbursary1  dUWscholar1  hsaverage  gpayr1  fcesyr1  age    rincome  rexpenditure
Aboriginal (N=93)
sum              20     1        19      38         14       14        12           2            12
mean    1.83     0.22   0.01     0.20    0.41       0.15     0.15      0.13         0.02         0.13         73.81      2.36    3.27     21.23  41.911   6.186
min     1        0      0        0       0          0        0         0            0            0            55         0       0.5      17     18.983   4.148
median  2        0      0        0       0          0        0         0            0            0            75         2.4     3        19     43.338   6.042
max     3        1      1        1       1          1        1         1            1            1            94.67      4.25    5.5      54     89.337   8.675
Permanent (N=86)
sum              40     2        5       42         14       26        3            7            19
mean    1.698    0.465  0.023    0.058   0.488      0.163    0.302     0.035        0.081        0.221        78.36      2.61    3.40     20.04  44.352   6.219
min     1        0      0        0       0          0        0         0            0            0            57.67      0       0.5      16     18.983   3.957
median  1        0      0        0       0          0        0         0            0            0            79         2.5     3.5      19     46.588   5.988
max     3        1      1        1       1          1        1         1            1            1            96.67      4.5     6        43     91.999   10.543
Visa (N=77)
sum              37     3        1       38         46       0         0            0            3
mean    1.896    0.481  0.039    0.013   0.494      0.597    0         0            0            0.039        71.99      2.23    2.71     20.04  36.370   5.705
min     1        0      0        0       0          0        0         0            0            0            52         0       0.5      17     18.400   3.522
median  2        0      0        0       0          1        0         0            0            0            71.33      2.17    2.5      19     37.263   5.912
max     3        1      1        1       1          1        0         0            0            1            89         4.25    7.5      32     53.789   11.143
4.1 Possible Hypotheses
There are a number of hypotheses regarding the expected signs of the regression parameters. Along with much of the literature, we assume that a student’s high school grades are a strong predictor of University performance.
Regarding the effect of gender, recent academic research (as well as anecdotal evidence) suggests that females are outperforming their male counterparts in recent years, both at the high school and post secondary level.
We also expect that certain types of students may face particular obstacles in their pursuit of higher education at the University of Winnipeg. For example, students here on student visas (visa) and students who are recent immigrants (permanent) may find adjusting to Canadian life challenging at the outset. It has been suggested that Aboriginal students (aboriginal) also face significant obstacles in acquiring higher education. For example, Richardson and Blanchet-Cohen (2000) provide a comprehensive analysis of the unique challenges facing aboriginal students and discuss a number of issues that need to be addressed in order to support aboriginal students in post secondary education programs.
Regarding the high school effect, it is possible the subsequent performance of Manitoba high school students is affected by the resources spent on their high school education (rexpenditure) as well as the non-pecuniary features of the high school (academic standards, discipline).
For example, a common perception is that students from private high schools perform better at the University of Winnipeg.
This is often separated from the relative differences in high school resources.
We attempt to separate these two effects by recording the differences in school expenditures as well as the pure private school effect by identifying whether the student graduated from a public or private school system, and whether the private school was religious based or secular.
Financial variables, which can include family income support, bursaries, and scholarships are also thought to play a significant role in helping students succeed in their post-secondary studies.
However, given that the University of Winnipeg does not record the student’s financial background upon admittance, we use as a proxy the median family income of the postal code that is listed as the student’s permanent address (rincome).
We also include some information regarding the choice of major by students. For example, we have a dummy variable to indicate whether a student did not declare an intended major. In addition, we include a dummy variable which indicates whether students have been admitted to the Education program prior to enrolling at the University of Winnipeg. These variables allow us to test whether students who are pursuing an Education degree, or who have not declared a major, are more likely to succeed at the University of Winnipeg. Regarding education students, we control for the relative abilities of these students, including their high school average (hsaverage), which leaves two possible hypotheses regarding the relative success of Education students. The first is that students in professional programs like Education are more committed to their degree program, since they may feel they have more at stake should they succeed.
The second is that graduating with an Education degree may be relatively easier than graduating from other programs. For both these reasons, it is suggested that students enrolled in Education are more likely to succeed.
4. Ordered Logit Estimation Results
Tables 4(a) and 4(b) summarize the estimation results based on equation (1). Table 4(a) reports the logit coefficients while Table 4(b) reports the corresponding odds ratios. Regarding Table 4(b), it is important to note how an odds ratio is read: a value greater than one suggests the covariate has a positive effect on the odds of success for a particular category, a value less than one suggests a negative effect on the respective odds, and a value equal to one implies no relationship between the particular covariate and the odds.
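The link between the two tables can be illustrated with a minimal Python sketch (an illustration only, not the software used for the estimation). It assumes nothing beyond the standard relationship that an odds ratio is the exponential of the logit coefficient; the function name `odds_ratio` is hypothetical, and the coefficients are copied from column (1) of Table 4(a).

```python
import math

# Column (1) logit coefficients copied from Table 4(a).
logit_coefficients = {
    "visa": 0.6992,
    "degedu": 0.6366,
    "gpayr1": 0.4702,
    "dMBloan1": -0.3356,
}


def odds_ratio(coefficient):
    """Convert a logit coefficient into an odds ratio (hypothetical helper)."""
    return math.exp(coefficient)


odds_ratios = {name: odds_ratio(b) for name, b in logit_coefficients.items()}

for name, ratio in odds_ratios.items():
    # A ratio above one raises the odds of success, below one lowers them.
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name}: odds ratio {ratio:.4f} ({direction} the odds of success)")
```

Up to rounding, the printed values should agree with the corresponding column (1) entries of Table 4(b).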
Regarding Table 4(a), column (1) reports the results for the Ordered Logit estimator excluding high school fixed effects while column (2) reports the results including the high school fixed effects. As can be seen, the results are very similar for the two specifications. We find limited evidence of high school fixed effects contributing to student success, as only six high school dummy variables were statistically significant.
To summarize our results, students attending the University of Winnipeg on a student visa (visa), pursuing an education degree (degedu), entering with a higher high school average (hsaverage), taking more courses in their first year (fcesyr1), or earning a higher first year University GPA (gpayr1), had greater success at the University of Winnipeg.
Table 4(a): Ordered Logit Results (Logit Coefficients)
Variable olbase olschool
(1) (2)
success
age -0.0058 0.0029
aboriginal 0.1506 0.1999
permanent -0.2746 -0.2676
visa 0.6992** 0.5574*
male 0.0418 0.0545
deg4yrs 0.1872 0.2059
degedu 0.6366*** 0.6730***
degnoinfo -0.0611 -0.0486
hsaverage 0.0078 0.0104*
gpayr1 0.4702*** 0.4685***
fcesyr1 0.4395*** 0.4502***
private -0.0929 0.2078
dMBloan1 -0.3356*** -0.3347**
dMBbursary1 -0.2382 -0.2308
dUWbursary1 -0.0426 -0.038
dUWscholarship1 0.0226 0.0139
rincome -0.0015 -0.0027
rexpenditure 0.0035 -0.0783
yr1998 0.0805 0.0653
yr1999 0.2872* 0.2827*
yr2000 0.2458* 0.2479*
yr2001 0.2395* 0.2438*
yr2002 0.0877 0.1231
High School Fixed Effects No Yes
cut1
_cons 3.2292*** 3.4112***
cut2
_cons 4.1651*** 4.3647***
Statistics
N 5013 5013
ll -4751.8701 -4694.8939
aic 9553.7401 9603.7878
bic 9716.7349 10301.4053
McFadden Pseudo R2 0.094 0.105
McKelvey and Zavoina R2 0.201 0.265
legend: *p<0.05; **p<0.01; ***p<0.001
Table 4(b): Ordered Logit Results (Odds Ratios)
Variable olbase1 olschool1
(1) (2)
success
age 0.9943 1.0029
aboriginal 1.1625 1.2213
permanent 0.7599 0.7652
visa 2.0122** 1.7462*
male 1.0427 1.056
deg4yrs 1.2059 1.2287
degedu 1.8901*** 1.9601***
degnoinfo 0.9408 0.9526
hsaverage 1.0079 1.0104*
gpayr1 1.6003*** 1.5977***
fcesyr1 1.5519*** 1.5686***
private 0.9113 1.231
dMBloan1 0.7149*** 0.7155**
dMBbursary1 0.788 0.7939
dUWbursary1 0.9583 0.9627
dUWscholarship1 1.0229 1.014
rincome 0.9985 0.9973
rexpenditure 1.0035 0.9247
yr1998 1.0839 1.0675
yr1999 1.3327* 1.3268*
yr2000 1.2786* 1.2813*
yr2001 1.2706* 1.2761*
yr2002 1.0917 1.131
High School Fixed Effects No Yes
cut1
_cons 25.2604*** 30.3016***
cut2
_cons 64.4017*** 78.6223***
Statistics
N 5013 5013
ll -4751.8701 -4694.8939
aic 9553.7401 9603.7878
bic 9716.7349 10301.4053
McFadden Pseudo R2 0.094 0.105
McKelvey and Zavoina R2 0.201 0.261
legend: * p<0.05; ** p<0.01; *** p<0.001
In contrast, students who received a loan from the Province of Manitoba (dMBloan1) were less successful. The other measures of student aid were not statistically significant for our sample.
Regarding the overall fit of the respective Ordered Logit models, a number of Pseudo R2 measures have been proposed.
We report two common measures: the McFadden Pseudo R2, which is 0.094 for model (1) and 0.105 for model (2), and the McKelvey and Zavoina R2, which is 0.201 and 0.261 respectively. Veall and Zimmermann (1996) argue that the McKelvey and Zavoina R2 has a number of desirable properties as a measure of goodness of fit for a number of limited dependent variable models.
Overall, the McKelvey and Zavoina measures suggest that the respective models are relatively good predictors of student success.
As a test of robustness we included a number of interaction terms: (relig_private = religious x private), to test whether the religious nature of the private high school affected success; (degedu_hsavg = degedu x hsavg), to test whether education students were more successful because they entered with higher high school averages; and an age squared term (agesq). Given that all these effects were statistically insignificant for specifications (1) and (2), these additional regressors were dropped from the respective models.
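The information criteria reported at the bottom of Tables 4(a) and 4(b) can be cross-checked from the log-likelihoods using the standard definitions AIC = 2k - 2ll and BIC = k ln(N) - 2ll. The Python sketch below is illustrative only. The parameter counts k are assumptions: 25 for column (1) (23 regressors plus 2 cut points) and 107 for column (2), the latter inferred from the reported AIC rather than stated in the text.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian (Schwarz) information criterion."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


N_OBS = 5013  # sample size reported in Table 4(a)

# name: (log-likelihood from Table 4(a), ASSUMED number of parameters)
models = {
    "olbase": (-4751.8701, 25),     # 23 regressors + 2 cut points
    "olschool": (-4694.8939, 107),  # count inferred from the reported AIC
}

criteria = {
    name: (aic(ll, k), bic(ll, k, N_OBS)) for name, (ll, k) in models.items()
}

for name, (a, b) in criteria.items():
    print(f"{name}: AIC = {a:.4f}, BIC = {b:.4f}")
```

Under these assumptions both criteria are lower for column (1), which is in line with the limited evidence of high school fixed effects noted earlier.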
As is customary, we also report the marginal effects associated with the Ordered Logit model specification given by the first column of Table 4(a). Table 4(c) shows that visa students, education majors, students entering with better high school grades, students taking more courses in the first year, and students earning a higher first year grade point average are both more likely to graduate and less likely to withdraw from the University of Winnipeg. The marginal effects are calculated for both the continuous variables and the dummy variables and indicate the effect of a unit increase in the variable on the probability of the respective outcome. For example, for Outcome 3 (graduating) in Table 4(c), taking one more full course equivalent in year 1 (fcesyr1) increases the probability of a student graduating by 8.75 percentage points, a one point higher first year gpa (gpayr1) increases the probability of graduating by 9.36 percentage points, Education students are 12.7 percentage points more likely to graduate, and visa students are 13.9 percentage points more likely.
Table 4(c): Average Marginal Effects
Variable Outcome 1 Outcome 2 Outcome 3
age 0.00120 -0.0000504 -0.00115
(0.00272) (0.000115) (0.00260)
aboriginal (d) -0.0313 0.00132 0.0300
(0.0429) (0.00183) (0.0411)
permanent (d) 0.0571 -0.00241 -0.0547
(0.0512) (0.00222) (0.0491)
visa (d) -0.145** 0.00613* 0.139**
(0.0525) (0.00246) (0.0504)
male (d) -0.00870 0.000367 0.00833
(0.0125) (0.000533) (0.0119)
deg4yrs (d) -0.0389 0.00164 0.0373
(0.0398) (0.00171) (0.0381)
degedu (d) -0.132*** 0.00558*** 0.127***
(0.0182) (0.00143) (0.0173)
degnoinfo (d) 0.0127 -0.000535 -0.0122
(0.0163) (0.000697) (0.0156)
hsaverage -0.00163 0.0000686 0.00156
(0.000895) (0.0000400) (0.000857)
gpayr1 -0.0977*** 0.00412*** 0.0936***
(0.00839) (0.000868) (0.00812)
fcesyr1 -0.0913*** 0.00385*** 0.0875***
(0.00495) (0.000746) (0.00496)
private 0.0193 -0.000814 -0.0185
(0.0162) (0.000698) (0.0155)
dMBloan1 (d) 0.0698*** -0.00294** -0.0668***
(0.0209) (0.00105) (0.0200)
dMBbursary1 (d) 0.0495 -0.00209 -0.0474
(0.0424) (0.00184) (0.0406)
dUWbursary1 (d) 0.00886 -0.000374 -0.00849
(0.0367) (0.00155) (0.0352)
dUWscholarship1 (d) -0.00470 0.000198 0.00450
(0.0178) (0.000755) (0.0170)
rincome 0.000322 -0.0000136 -0.000308
(0.000457) (0.0000196) (0.000437)
rexpenditure -0.000722 0.0000305 0.000692
(0.00668) (0.000282) (0.00640)
yr1998 (d) -0.0167 0.000706 0.0160
(0.0219) (0.000934) (0.0210)
yr1999 (d) -0.0597* 0.00252* 0.0572*
(0.0241) (0.00114) (0.0231)
yr2000 (d) -0.0511* 0.00215* 0.0489*
(0.0225) (0.00106) (0.0216)
yr2001 (d) -0.0498* 0.00210 0.0477*
(0.0232) (0.00107) (0.0222)
yr2002 (d) -0.0182 0.000769 0.0175
Standard errors in parentheses
* p<0.05; ** p<0.01; *** p<0.001
(d) for discrete changes of dummy variable from 0 to 1
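To make the structure of these marginal effects concrete, the following Python sketch computes the three outcome probabilities and the marginal effects of a continuous covariate in a standard three-category ordered logit. It is an illustration under stated assumptions, not a reproduction of Table 4(c): the cut points and the gpayr1 coefficient are taken from column (1) of Table 4(a), but the linear index `XB` is a hypothetical value for a single student, whereas the table reports averages over the sample. The function names are hypothetical.

```python
import math


def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """Density of the standard logistic."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


def outcome_probabilities(xb, cut1, cut2):
    """Return P(withdrawn), P(continuing), P(graduated) for linear index xb."""
    p1 = logistic_cdf(cut1 - xb)
    p2 = logistic_cdf(cut2 - xb) - p1
    p3 = 1.0 - logistic_cdf(cut2 - xb)
    return p1, p2, p3


def marginal_effects(beta, xb, cut1, cut2):
    """Derivative of each outcome probability with respect to a covariate."""
    d1 = -beta * logistic_pdf(cut1 - xb)
    d3 = beta * logistic_pdf(cut2 - xb)
    d2 = -(d1 + d3)  # the three effects must sum to zero
    return d1, d2, d3


CUT1, CUT2 = 3.2292, 4.1651  # cut points, Table 4(a), column (1)
BETA_GPA = 0.4702            # gpayr1 coefficient, Table 4(a), column (1)
XB = 3.5                     # hypothetical linear index (assumption)

probs = outcome_probabilities(XB, CUT1, CUT2)
effects = marginal_effects(BETA_GPA, XB, CUT1, CUT2)

print("probabilities:", [round(p, 4) for p in probs])
print("marginal effects of gpayr1:", [round(e, 4) for e in effects])
```

Because the probabilities sum to one, the marginal effects across the three outcomes sum to zero, which is also the pattern in each row of Table 4(c) up to rounding.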
5. Ordered Logit Results – Testing for the Parallel Regression Assumption
Regarding the Ordered Logit results, a key assumption is the parallel lines/proportional odds assumption, which implies the stability of the β coefficients across categories. The Brant test, whose results are listed in Tables 5 and 6, formally tests this restriction. Table 5 reports the coefficient estimates that result from a series of binary regressions across the categories. As can be seen, the respective coefficients, the βj, vary considerably for a number of regressors, specifically visa, degedu, gpayr1, fcesyr1, and rincome.
The formal test of these differences across categories is summarized in Table 6. The Brant test of the parallel lines/proportional odds assumption is based on a chi-square test; a significant test statistic provides evidence that the assumption has been violated. The differences across categories seen in Table 5 are confirmed by the formal Brant test in Table 6: visa and permanent students, students pursuing an education degree (degedu), students who have a higher first year gpa (gpayr1), students who have taken a larger number of full course equivalents in year 1 (fcesyr1), and students with higher family income (rincome) perform differently across categories of success.
Table 5: Brant Test - Estimated Coefficients from j-1 binary regressions
Variable y>1 y>2
age -0.004 -0.012
aboriginal 0.176 0.133
permanent -0.42 -0.04
visa 0.478 1.128
male 0.071 -0.002
deg4yrs 0.272 0.179
degedu 0.9 0.522
degnoinfo -0.067 -0.061
hsaverage 0.008 0.007
gpayr1 0.42 0.597
fcesyr1 0.361 0.59
private -0.111 -0.127
dMBloan1 -0.271 -0.433
dMBbursary1 -0.411 -0.046
dUWbursary1 0.089 -0.125
dUWscholarship1 -0.036 -0.021
rincome -0.005 0.003
rexpenditure 0.008 -0.023
yr1998 0.078 0.113
yr1999 0.335 0.272
yr2000 0.35 0.175
yr2001 0.304 0.218
yr2002 0.126 0.087
_cons -2.85 -4.925
Table 6: Brant Test of Parallel Regression Assumption
Variable chi2 p>chi2 df
All 179.46 0.000 23
age 0.260 0.610 1
aboriginal 0.030 0.853 1
permanent 4.180 0.041 1
visa 11.040 0.001 1
male 1.500 0.221 1
deg4yrs 0.220 0.638 1
degedu 15.460 0.000 1
degnoinfo 0.010 0.929 1
hsaverage 0.070 0.790 1
gpayr1 16.260 0.000 1
fcesyr1 63.350 0.000 1
private 0.050 0.823 1
dMBloan1 2.260 0.133 1
dMBbursary1 3.320 0.068 1
dUWbursary1 1.090 0.297 1
dUWscholarship1 0.040 0.850 1
rincome 13.140 0.000 1
rexpenditure 1.040 0.308 1
yr1998 0.130 0.717 1
yr1999 0.330 0.567 1
yr2000 2.470 0.116 1
yr2001 0.620 0.431 1
yr2002 0.130 0.721 1
Note: A significant test provides evidence that the parallel
regression assumption has been violated.
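The p-values for the individual regressors in Table 6 can be recovered from the chi-square statistics, since each test has one degree of freedom. The Python sketch below is illustrative and relies only on the standard identity that the upper tail of a one degree of freedom chi-square at x equals erfc(sqrt(x/2)); the function name is hypothetical and the statistics are copied from Table 6.

```python
import math


def chi2_pvalue_1df(statistic):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Selected Brant test statistics copied from Table 6.
brant_statistics = {
    "permanent": 4.18,
    "visa": 11.04,
    "degedu": 15.46,
    "dMBloan1": 2.26,
    "male": 1.50,
}

p_values = {name: chi2_pvalue_1df(x) for name, x in brant_statistics.items()}

for name, p in p_values.items():
    verdict = "violates" if p < 0.05 else "does not violate"
    print(f"{name}: p = {p:.3f} ({verdict} parallel lines at the 5% level)")
```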
Given this result, the possibilities discussed by Williams (2006:62) include doing nothing, reframing the model as one of multinomial choice, or re-estimating the model while incorporating the fact that, for a number of regressors, the β coefficients are approximately equal across categories. In the following section, we take the last approach, using the Generalized Ordered Logit model, which re-estimates the Ordered Logit model subject to restricting a number of the above β coefficients to be equal across categories.
6. Generalized Ordered Logit Results (gologit2)
Before presenting the results from the Generalized Ordered Logit estimator (gologit2), it is helpful to discuss the meaning of the empirical results. In doing so, we will frame the discussion in terms of the odds ratios with respect to the covariates. As discussed by Liu (2016:182), the Generalized Ordinal Logit model estimates the cumulative odds of being beyond a certain category versus being at or below that category.
Specifically,
\[ \mathrm{Odds}(Y>j) = \frac{p(Y>j)}{p(Y\le j)} \]
In our case there are three categories: category 3 is a student who graduated within 5 years, category 2 is a student who is still registered in year 5, and category 1 is a student who has withdrawn by year 5 (did not graduate and was no longer registered).
The odds that a student is above category 1, that is, is still registered or has graduated, are
\[ \mathrm{Odds}(Y>1) = \frac{p(Y>1)}{1-p(Y>1)} = \frac{p(2)+p(3)}{p(1)} \tag{1} \]
where p(j) is the probability of category j. Similarly, the odds that a student is above category 2, that is, has graduated, are
\[ \mathrm{Odds}(Y>2) = \frac{p(Y>2)}{1-p(Y>2)} = \frac{p(3)}{p(1)+p(2)} \tag{2} \]
Or in terms of the category names
\[ \mathrm{Odds}(Y>\text{withdrawn}) = \frac{p(Y>\text{withdrawn})}{1-p(Y>\text{withdrawn})} = \frac{p(\text{continuing})+p(\text{graduating})}{p(\text{withdrawn})} \tag{1'} \]
\[ \mathrm{Odds}(Y>\text{continuing}) = \frac{p(Y>\text{continuing})}{1-p(Y>\text{continuing})} = \frac{p(\text{graduating})}{p(\text{continuing})+p(\text{withdrawn})} \tag{2'} \]
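As a small numerical illustration of these cumulative odds, the Python sketch below computes Odds(Y>j) from a vector of category probabilities. The probabilities used are hypothetical and chosen only for arithmetic convenience; they are not estimates from the data, and the function name is likewise hypothetical.

```python
def cumulative_odds(probabilities):
    """Return Odds(Y>j) for j = 1, ..., J-1 given p(1), ..., p(J)."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("category probabilities must sum to one")
    odds = []
    for j in range(1, len(probabilities)):
        above = sum(probabilities[j:])        # p(Y > j)
        at_or_below = sum(probabilities[:j])  # p(Y <= j)
        odds.append(above / at_or_below)
    return odds


# Hypothetical probabilities: p(withdrawn), p(continuing), p(graduating).
example = [0.40, 0.10, 0.50]
odds_above_withdrawn, odds_above_continuing = cumulative_odds(example)

print(f"Odds(Y>withdrawn)  = {odds_above_withdrawn:.2f}")
print(f"Odds(Y>continuing) = {odds_above_continuing:.2f}")
```

With these hypothetical values, a student is 1.5 times as likely to be continuing or graduated as withdrawn, and exactly as likely to have graduated as not.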
Table 7 lists the Generalized Ordered Logit results for the model excluding high school fixed effects.
We choose to discuss the odds ratios associated with the parameter estimates; the corresponding logit coefficients are reported in Table 7(b) in the Appendix.
As the results indicate, there are restrictions imposed on the β’s for 17 of the 24 regressors, which includes restrictions on the 5 time fixed effects. As can be seen, these restrictions yield identical coefficients for success category 1 and success category 2.
In addition, once these restrictions are imposed, the Wald test of the parallel lines/proportional odds assumption for the model yields χ²(17) = 16.25 with Prob > χ² = 0.5065. As listed in the Stata output, an insignificant test statistic suggests the final model does not violate the parallel lines/proportional odds assumption. The McFadden Pseudo R2 is a respectable 0.1108.
As pointed out by Liu (2016:191), in discussing the results from the gologit2 estimator, it is necessary to distinguish the regressors that meet the parallel lines/proportional odds assumption from those that violate it. The regressors that meet the assumption include the 17 regressors that are constrained in the model. In contrast, the regressors that indicate the immigration status of the student (visa and permanent), the choice of major (degedu), the student’s first year performance (gpayr1), course load (fcesyr1), financial resources (rincome) and high school resources (rexpenditure) are not constrained. As can be seen, the regression coefficients for the former set, the β’s, are constrained to be equal for the varying categories of success, while the β’s for the latter set are allowed to vary across the categories.
Table 7: GOLOGIT2 Estimates (Odds Ratios)
Generalized Ordered Logit Estimates
Number of obs = 5013
Wald chi2(30) = 985.56
Prob > chi2 = 0.000
Log pseudolikelihood = -4662.8129
Pseudo R2 = 0.1108
( 1) [1]hsaverage - [2]hsaverage = 0
( 2) [1]yr1998 - [2]yr1998 = 0
( 3) [1]degnoinfo - [2]degnoinfo = 0
( 4) [1]private - [2]private = 0
( 5) [1]rexpenditure - [2]rexpenditure = 0
( 6) [1]dUWscholarship1 - [2]dUWscholarship1 = 0
( 7) [1]aboriginal - [2]aboriginal = 0
( 8) [1]deg4yrs - [2]deg4yrs = 0
( 9) [1]male - [2]male = 0
(10) [1]yr2002 - [2]yr2002 = 0
(11) [1]yr1999 - [2]yr1999 = 0
(12) [1]age - [2]age = 0
(13) [1]yr2001 - [2]yr2001 = 0
(14) [1]dUWbursary1 - [2]dUWbursary1 = 0
(15) [1]dMBloan1 - [2]dMBloan1 = 0
(16) [1]dMBbursary1 - [2]dMBbursary1 = 0
(Std. Err. adjusted for 5013 clusters)
success Odds Ratio Robust Std. Err. z P>|z| [95% Conf. Interval]
1 - withdrawn
age 0.99 0.01 -0.58 0.56 0.97 1.02
aboriginal 1.16 0.25 0.70 0.48 0.77 1.76
permanent 0.66 0.16 -1.73 0.08 0.41 1.06
visa 1.65 0.42 1.99 0.05 1.01 2.71
male 1.04 0.06 0.66 0.51 0.93 1.17
deg4yrs 1.23 0.23 1.07 0.28 0.84 1.79
degedu 2.45 0.25 8.65 0.00 2.00 3.00
degnoinfo 0.95 0.07 -0.69 0.49 0.81 1.10
hsaverage 1.01 0.00 1.61 0.11 1.00 1.02
gpayr1 1.51 0.07 9.38 0.00 1.39 1.65
fcesyr1 1.43 0.04 12.90 0.00 1.35 1.51
private 0.90 0.07 -1.38 0.17 0.77 1.05
dMBloan1 0.72 0.07 -3.22 0.00 0.59 0.88
dMBbursary1 0.80 0.16 -1.11 0.27 0.53 1.19
dUWbursary1 0.98 0.17 -0.14 0.89 0.69 1.38
dUWscholarship1 0.99 0.08 -0.09 0.93 0.84 1.17
rincome 1.00 0.00 -2.02 0.04 0.99 1.00
rexpenditure 1.00 0.03 0.00 1.00 0.94 1.06
yr1998 1.08 0.11 0.76 0.45 0.88 1.33
yr1999 1.32 0.15 2.42 0.02 1.05 1.66
yr2000 1.39 0.16 2.85 0.00 1.11 1.74
yr2001 1.27 0.14 2.15 0.03 1.02 1.58
yr2002 1.09 0.12 0.74 0.46 0.87 1.35
_cons 0.08 0.04 -5.38 0.00 0.03 0.19
2 - continuing
age 0.99 0.01 -0.58 0.56 0.97 1.02
aboriginal 1.16 0.25 0.70 0.48 0.77 1.76
permanent 0.97 0.24 -0.11 0.91 0.60 1.58
visa 2.69 0.72 3.70 0.00 1.59 4.54
male 1.04 0.06 0.66 0.51 0.93 1.17
deg4yrs 1.23 0.23 1.07 0.28 0.84 1.79
degedu 1.63 0.16 5.01 0.00 1.35 1.98
degnoinfo 0.95 0.07 -0.69 0.49 0.81 1.10
hsaverage 1.01 0.00 1.61 0.11 1.00 1.02
gpayr1 1.75 0.08 12.01 0.00 1.60 1.91
fcesyr1 1.82 0.06 18.17 0.00 1.70 1.94
private 0.90 0.07 -1.38 0.17 0.77 1.05
dMBloan1 0.72 0.07 -3.22 0.00 0.59 0.88
dMBbursary1 0.80 0.16 -1.11 0.27 0.53 1.19
dUWbursary1 0.98 0.17 -0.14 0.89 0.69 1.38
dUWscholarship1 0.99 0.08 -0.09 0.93 0.84 1.17
rincome 1.00 0.00 0.90 0.37 1.00 1.01
rexpenditure 1.00 0.03 0.00 1.00 0.94 1.06
yr1998 1.08 0.11 0.76 0.45 0.88 1.33
yr1999 1.32 0.15 2.42 0.02 1.05 1.66
yr2000 1.15 0.13 1.17 0.24 0.91 1.44
yr2001 1.27 0.14 2.15 0.03 1.02 1.58
yr2002 1.09 0.12 0.74 0.46 0.87 1.35
_cons 0.01 0.00 -10.39 0.00 0.00 0.02
We can summarize the results as follows. For simplicity, we focus on the results reported in the bottom panel of Table 7, which is listed as category 2 – continuing. As discussed earlier, the odds ratios reported there reflect the odds of being above category 2; that is, they capture the effect of an increase in the covariate on the odds of graduating, which is category 3.
The covariates which increase the odds of graduating are (i) being a visa student (2.69), (ii) being enrolled in education (1.63), (iii) having a higher first year gpa (1.75), and (iv) taking more full course equivalents in the first year (1.82). While it appears that being an aboriginal student and pursuing a 4 year degree (deg4yrs) increase the odds of graduating, these coefficients are not statistically significant. In contrast, the odds of graduating for a student who receives a loan from the Province of Manitoba are significantly lower (0.72) than for students not needing this type of financial assistance. While it appears that attending a private school lowers the odds of graduating, the effect is not statistically significant.
Most of these results accord with intuition. Factors such as full time attendance, reflected by
taking a larger number of full course equivalents, and having a clear degree goal in mind, for example,
seeking an education degree, raise the probability of graduating. In contrast, the odds of graduating, at
least within the five year period, are lower for students who need financial assistance.
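To give a sense of what these odds ratios mean in probability terms, the Python sketch below applies an odds ratio to a baseline probability of graduating. The odds ratios are copied from the category 2 panel of Table 7, but the baseline probability of 0.50 is a hypothetical value chosen for illustration, not an estimate from the data; the function name is also hypothetical.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios copied from Table 7 (panel "2 - continuing").
table7_odds_ratios = {
    "visa": 2.69,
    "degedu": 1.63,
    "gpayr1": 1.75,
    "fcesyr1": 1.82,
    "dMBloan1": 0.72,
}

BASELINE = 0.50  # hypothetical baseline probability of graduating (assumption)

implied = {
    name: apply_odds_ratio(BASELINE, ratio)
    for name, ratio in table7_odds_ratios.items()
}

for name, probability in implied.items():
    print(f"{name}: {BASELINE:.2f} -> {probability:.3f}")
```

The size of the implied change in probability depends on the baseline chosen, which is why the odds ratios and the marginal effects are reported separately.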
7. Conclusion
This paper has introduced a relatively novel approach to estimating the determinants of student success in post secondary institutions. Using both an Ordered Logit estimator and a Generalized Ordered Logit estimator, we are able to estimate the role a number of key factors have in influencing student success. An important feature of this approach is that it can be conducted using readily available administrative data. Furthermore, the data collection and data preparation required for estimation of these models is relatively modest.
The results from this study can be summarized as follows. First, we find, similar to other studies, that students who achieved a higher first year gpa (gpayr1) are more likely to succeed in university, where success is measured as graduating within a five year period. Second, we find that students who take a larger number of full course equivalents in their first year (fcesyr1) are more successful. This may be a measure of full time status, which could reflect the fact that these students have greater financial support to pursue their studies. Third, we find visa students (visa) and students enrolled in education programs (degedu) are more successful. Finally, we find that students who received a loan from the Manitoba government (dMBloan1) were less successful. We find that these results are independent of the inclusion of high school fixed effects.
In testing the robustness of the Ordered Logit results, we find that for a number of covariates the parallel lines/proportional odds assumption is violated. That is, the parameter estimates for a number of covariates differ across categories of success. To address this issue, we used a Generalized Ordered Logit estimator which imposes the equality of coefficients on those covariates that satisfy the parallel lines/proportional odds assumption, while allowing the coefficients on the other covariates to vary across the categories of success.
We feel our analysis helps to identify the challenges facing different student types, which can provide input into developing effective support programs. While the results are based on data from one institution, we think our approach is helpful to post secondary officials at other institutions in designing their data collection methods. We feel an important contribution of our study is to provide a relatively straightforward approach to predicting the success of a variety of students using data that is routinely collected by university officials. In addition, we urge university officials to enhance their data collection to capture as many characteristics of students and their backgrounds as possible to include in their ordered logit and generalized ordered logit models.
It is important to note there are varying measures of student success that can be considered
depending on the respective goals of university officials and policy makers. For example, our measure
suggests that students who continue their studies are more successful over a five year observation period
than students who withdraw. However, it may be the case that some students who do not complete their
degrees in a timely manner should perhaps reconsider their career plans. From the perspective of
university officials, students who continue their studies contribute to university budgets; however, from
the perspective of policy makers some of these students may be better suited to other types of training. In
conclusion, what constitutes success for post secondary students is clearly an issue that can benefit from
further discussion.
8.0 Appendix I:
Table 7(b): GOLOGIT2 Estimates (Logit Coefficients)