Multiple factors influence any given outcome, and multiple linear regression is a statistical way to examine the relationship among various explanatory factors and an outcome. Multiple linear regression is used to assess a question such as, “Did participation in my program influence Algebra I scores after accounting for past math performance and aspirations?” The explanatory factors are your program, each student’s past math performance, and each student’s aspirations. The outcome is the Algebra I score.
Addressing the question this way acknowledges that in addition to benefits from program participation, factors such as performance in past math classes and aspirations could influence Algebra I scores. A student who did well in prior math classes may have an easier time meeting challenges in Algebra I. Students who plan to go to college may put more effort into their studies than students who do not. Regression analysis permits examining those factors simultaneously to see whether the service in question influences the outcome beyond those other factors. Here, you can think of regression as constructing a formula for the math score that includes variables that may provide alternate explanations for Algebra I scores. Correlations and t-tests may help determine which measures to include. For example, finding that aspirations are correlated with mathematics performance suggests you should include aspirations in the analysis. As with other analyses, appropriate data are necessary. Answering this sample question requires having data about each student’s prior math performance and having survey data in which students reported their aspirations.
Analysis results might indicate that the relationship between program participation and the student’s Algebra I score is statistically significant, even when you control for the other factors. Accounting for other explanatory factors increases confidence in the conclusion that your program influences Algebra I scores. However, this result does not necessarily mean that the program caused the improved Algebra I score. Perhaps other factors that you could not include in the analysis contributed to the outcome. Even so, this result does show an association between program participation and the outcome.
Alternatively, this multiple regression analysis might indicate that once prior achievement is included in the model, the effect of participation in your program on Algebra I scores dissipates. In other words, students’ prior achievement might be the strongest predictor of Algebra I scores. This result suggests the need to target additional resources to students who have struggled with math. Perhaps students who have not performed well in middle school mathematics need a bridge program or other supports to help them succeed in Algebra I. By understanding the way different factors are associated with each other, you can develop or refine services to better meet student needs.
Logistic Regression
In multiple linear regression (described above), the outcome variables (Algebra I score) must be an interval or ratio, which means that differences between the levels of the variable are equal (as discussed in Chapter 3). Here, it makes sense to say an Algebra I score of 282 is two more than a score of 280, and a score of 270 is 5 less than a score of 275. Examining a yes/no outcome, such as high school dropout and college enrollment, requires a different approach. Here, logistic regression models are appropriate.
If you wanted to examine the question, “Did participation in my program influence college enrollment after accounting for past math performance and aspirations?” you would have a yes/no outcome. A logistic regression model yields the odds of college enrollment given program participation, even when accounting for other factors that might predict the outcome, such as prior math performance and aspirations. Results might indicate, for example, that participants were two times more likely to enroll in college than those who did not participate in the program, even when prior math performance and aspirations are included in the model.
As with the multiple linear regression example, you may find that including additional measures reduces the association between your program and the outcome. That information can highlight services you may want to expand or modify.
Summary
Data analyses can be resource intensive, particularly if you are acquiring, linking, analyzing, and reporting on new data. In deciding which methods to use, you need to be strategic in deciding how the project can benefit from the new information. Those involved with the program have the best insights about which kinds of additional knowledge can yield the greatest benefit. For example, it may be of the greatest importance to determine which services are most influential for a given outcome. Alternatively, when introducing an innovative program that departs from past practices, it may be most important to quickly obtain information that would help you modify it if needed. Regardless of whether the goal is to monitor a program or conduct an evaluation, it is crucial to have accurate data and use appropriate methods to guide your decision-making efforts.
If your study has a control group, you must ensure that the treatment and control groups are as similar as possible, except that students in the control group do not receive the service. With careful consideration in defining treatment and control groups, you are better able to understand the potential influence of a service on student outcomes.
Descriptive analyses allow researchers to examine program implementation. Techniques uch as correlations, t-tests, and ANOVA permit identifying a relationship between two items. On their own, these approaches do not meet criteria for evidence of promise or strong evidence of effectiveness. However, they are methods for conducting statistical hypothesis tests, which help identify key relationships between service recipients and outcomes. Multiple regression permits incorporating different factors to isolate the influence of the program on outcomes.