COIL Research Initiation Grant Proposal

Learning Analytics: Leveraging Big Data to Improve Learner Success

Core Team Members

College of Education, University Park
Simon Hooper, Professor, Department of Learning and Performance Systems

Information Technology Services, University Park
Chris Millet, Assistant Director, Education Technology Services
Bart Pursel, Faculty Programs Coordinator, Teaching and Learning with Technology
Gi-Woong Choi, Graduate Assistant, Education Technology Services

Office of Planning and Institutional Assessment, University Park
Nicholas Warcholak, Senior Planning and Research Associate

Administrative Information Systems, University Park
Andrew Fisher, Programmer and Analyst

Stakeholders

College of Science, University Park
James Hager, Assistant Director of Undergraduate Education, Mathematics
Stanley Smith, Associate Professor, Mathematics

Collaborators

College of Information Sciences and Technology, University Park
Marcela Borge, Lecturer, IST
Jim Jansen, Associate Professor, IST

College of the Liberal Arts, University Park
William Goffe, Senior Lecturer, Economics
Mark Fisher, Lecturer, Philosophy

Boise State University
Gary Hagerty, Director of Math Learning Center

Designated Principal Investigator (primary point of contact)
Bart Pursel - [email protected] - (814) 865-1843
Faculty Projects Coordinator
Teaching and Learning with Technology

Date of submission: May 15th, 2013
Amount requested: $49,055

Abstract

During the past year our team of interdisciplinary researchers has explored the emerging field of learning analytics, with the goal of leveraging student data to develop systems that adapt to the specific needs of students (Long & Siemens, 2011; Siemens, 2010).

We will develop learning analytics competencies at Penn State in a three-phase process: 1) data acquisition, cleaning and storage; 2) data analysis and modeling; and 3) prototype development and data visualization. Our project focuses on the resident and online sections of Math 110. We chose this course due to the high opportunity for success: The course enrolls approximately 1,000 students each semester yet up to a third of enrolled students receive a “D” or below. We hope to identify the behaviors that lead to success, predict student outcomes, and help teaching assistants and instructors focus specific resources to address student knowledge gaps.

Our long-term goal is to broaden our scope, working with identified collaborators in other disciplines to apply similar tools to support student success. This also provides opportunities for future research, exploring how analytics systems might differ across disciplines and whether the predictors of student success uncovered through Math 110 data are the same predictors found in other disciplines.

Project Description

During the past year a team of PSU learning technologists, data analysts, instructors and subject matter experts coalesced to explore the goal of developing a Learning Analytics (LA) system targeted at the individual needs of PSU students. That group is requesting support to design and develop a prototype LA tool targeted at Math 110, a high-enrollment, low-success course. Our tool will combine two analytics approaches. First, we will develop a predictive model of student performance based on personal and class history data. Second, we will develop a tool that will provide detailed course feedback at each of three levels: individual students, course TAs and the course instructor. The tool will allow course instructors to connect multiple-choice exams with learning objectives. The resulting system will provide formative evaluation data that can be used by students to diagnose individual learning difficulties, and by TAs and the course instructor to identify common student misconceptions and content difficulties.
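As an illustration of how exam items connected to learning objectives could yield feedback at the student and class levels, the following is a minimal sketch. The item identifiers, objective names and responses are all hypothetical, and the proposal does not specify a data format; the sketch only shows the kind of aggregation such a tool might perform.

```python
from collections import defaultdict

# Hypothetical mapping of multiple-choice exam items to learning objectives.
ITEM_OBJECTIVES = {
    "q1": "linear functions",
    "q2": "linear functions",
    "q3": "derivatives",
    "q4": "derivatives",
    "q5": "optimization",
}

# Hypothetical graded responses: student -> {item: True if answered correctly}.
RESPONSES = {
    "student_a": {"q1": True, "q2": True, "q3": False, "q4": False, "q5": True},
    "student_b": {"q1": True, "q2": False, "q3": False, "q4": True, "q5": False},
}


def objective_scores(answers, item_objectives):
    """Return the fraction of items answered correctly per learning objective."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, is_correct in answers.items():
        objective = item_objectives[item]
        total[objective] += 1
        correct[objective] += int(is_correct)
    return {obj: correct[obj] / total[obj] for obj in total}


def class_scores(responses, item_objectives):
    """Average the per-student objective scores across the whole class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for answers in responses.values():
        for obj, score in objective_scores(answers, item_objectives).items():
            sums[obj] += score
            counts[obj] += 1
    return {obj: sums[obj] / counts[obj] for obj in sums}


# Student-level view: where one student is struggling.
student_view = objective_scores(RESPONSES["student_a"], ITEM_OBJECTIVES)
# Class-level view: objectives that TAs or the instructor may need to revisit.
class_view = class_scores(RESPONSES, ITEM_OBJECTIVES)
print(student_view)
print(class_view)
```

In this sketch a low class-level score on an objective would flag a common difficulty, while the student-level view supports individual diagnosis.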

Our research will allow us to use data to modify the course design. For example, we anticipate modifying recitation sections to address student needs. Moreover, the ability to diagnose individual student need by performance on each learning objective creates the opportunity to customize access to learning modules and perhaps to allow exam re-takes or modifications to the current assessment system. The project includes three distinct phases:

Phase 1: Data acquisition, cleaning and storage

A data cleaning team will aggregate data from the Student Information System (SIS) and Course Management System (CMS) to produce a database that will be mined by the data analysis and modeling groups. The SIS and CMS contain unexplored data that could guide instructional decision-making. Raw data, as well as online math problem set data and survey data, will be cleaned, formatted and stored in a relational database.
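A minimal sketch of this cleaning and storage step, using Python's built-in sqlite3 module, might look as follows. The table names, columns and rows are hypothetical stand-ins; the actual SIS and CMS schemas are not described in this proposal, and the cleaning rules shown (dropping duplicates and incomplete rows) are assumptions chosen for illustration.

```python
import sqlite3

# In-memory database standing in for the project's relational store.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sis_students (
        student_id   TEXT PRIMARY KEY,
        prior_gpa    REAL,
        credits_done INTEGER
    );
    CREATE TABLE cms_activity (
        student_id TEXT,
        logins     INTEGER,
        quiz_avg   REAL
    );
    """
)

# Hypothetical raw extracts; the CMS extract has a duplicate and a missing value.
sis_rows = [("s001", 3.4, 30), ("s002", 2.1, 15), ("s003", 2.9, 45)]
cms_rows = [
    ("s001", 42, 0.88),
    ("s002", 7, 0.51),
    ("s002", 7, 0.51),
    ("s004", 12, None),
]


def clean(rows):
    """Drop exact duplicate rows and rows with missing values, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        if row in seen or any(value is None for value in row):
            continue
        seen.add(row)
        cleaned.append(row)
    return cleaned


conn.executemany("INSERT INTO sis_students VALUES (?, ?, ?)", clean(sis_rows))
conn.executemany("INSERT INTO cms_activity VALUES (?, ?, ?)", clean(cms_rows))

# Join the two sources into one analysis-ready record per student.
merged = conn.execute(
    """
    SELECT s.student_id, s.prior_gpa, s.credits_done, c.logins, c.quiz_avg
    FROM sis_students AS s
    JOIN cms_activity AS c ON c.student_id = s.student_id
    ORDER BY s.student_id
    """
).fetchall()
print(merged)
```

The inner join keeps only students present in both sources; whether unmatched students should instead be retained with missing values is a design decision the team would need to make.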

This group will also aggregate other data on individual student differences. For example, math department members have hypothesized that student self-efficacy correlates positively with success in Math 110. As part of our research, we plan on using a modified survey instrument (Zimmerman & Kitsantas, 2005) to collect self-efficacy data from Math 110 students in order to better understand how self-efficacy influences student success.
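One simple way to examine the hypothesized relationship is a Pearson correlation between survey scores and course results. The sketch below assumes a single numeric self-efficacy score per student and a final course percentage; both lists are invented for illustration and are not project data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: survey self-efficacy scores and final course percentages.
self_efficacy = [2.0, 3.5, 4.0, 5.0, 6.5]
final_percent = [55.0, 62.0, 70.0, 74.0, 90.0]

r = pearson_r(self_efficacy, final_percent)
print(round(r, 3))
```

A coefficient near +1 would be consistent with the department's hypothesis, although correlation alone would not establish that self-efficacy causes success.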

Phase 2: Data analysis and modeling

A data analysis group will develop predictive models of student success, i.e., predicting grades. Prior researchers examined SIS and CMS variables in conjunction with success, providing a list of variables to use in our analysis (Macfadyen & Dawson, 2010; Arnold & Pistilli, 2012).

Using statistical techniques (logistic regression, naive Bayes, etc.) the analysis group will establish the most accurate model to fit our data. The model creation process will be highly iterative, and we will test the model in the spring 2014 sections of Math 110.
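To make the modeling step concrete, here is a minimal sketch of logistic regression, one of the techniques named above, fitted by plain batch gradient descent. It is written without a statistics package so that it is self-contained; the two features (scaled prior GPA and homework completion rate), the pass/fail labels and the hyperparameters are all hypothetical, and a real analysis would more likely use an established library and the variables identified in prior work.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, weights, bias):
    """Predicted probability that a student with features x passes."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(features, labels, learning_rate=0.5, epochs=20000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_proba(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical training data: (prior GPA / 4, homework completion rate).
X = [
    (0.90, 0.90), (0.80, 0.95), (0.85, 0.70), (0.70, 0.80),  # passed
    (0.40, 0.30), (0.50, 0.20), (0.30, 0.50), (0.45, 0.40),  # did not pass
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic(X, y)

# Accuracy on the training data itself, so it overstates real-world performance.
predictions = [int(predict_proba(x, weights, bias) >= 0.5) for x in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(accuracy)

# Estimated pass probability for a hypothetical new student.
print(predict_proba((0.60, 0.55), weights, bias))
```

Because accuracy here is measured on the same records used for fitting, a fair comparison between candidate models would require held-out data, such as the spring 2014 sections mentioned above.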

Phase 3: Prototype development and data visualization

A visualization group will develop information visualization tools that address the two main project goals. First, the team will develop the software interfaces to be used by Math 110 students, TAs and faculty. Second, working with the data analysis and modeling group they will develop exploratory interfaces to visualize predictive data.

Research Questions

• What are the strongest predictors of success? How accurate is the model?

• Do success predictors differ in the resident and online versions of Math 110?

• Is self-efficacy a strong predictor of success in Math 110?

• Does model accuracy differ across disciplines such as Economics and Philosophy?

• Do the proposed prototypes lead to higher rates of student success?

Significance of Learning Analytics

LA is on the leading edge of educational technology research. Each year the Horizon Report describes technologies that its authors believe will have a large impact on society. For the past three years, LA has been listed as one of two technologies included in the 2-3 year horizon. Several funding organizations (NSF, Gates and MacArthur foundations) support learning analytics research and are interested in exploring how ‘big data’ can improve student outcomes.

Design Process and Research Design

We will follow an Interaction Design process (Lowgren & Stolterman, 2007) that involves the use of iterative cycles of theory, design, and implementation research and conduct design-based research (DBR) to modify and improve the software. DBR is an established empirical methodology for instructional designers interested in changing instructional outcomes and enhancing the learner experience through development of technologies that support learning in classroom environments (Design-Based Research Collective, 2003).

Timeline

Major Milestones | Start | End | Responsible
Identify/hire data scientist | July 1st | July 31st | ETS
Gather SIS and CMS data | July 1st | July 31st | ETS/AIS
Self-efficacy survey creation | July 17th | Aug. 2nd | Education/ETS
SIS/ANGEL data analysis & predictive modeling | Aug. 1st | Dec. 20th | OPIA
Prototype development | Aug. 5th | Nov. 31st | Education/ETS
Implement survey in Math 110 sections | Sept. 2nd | Sept. 13th | Education/Math/ETS
End user focus groups / prototype focus groups | Sept. 2nd | Dec. 20th | Education/ETS
Math 110 spring 2014 implementation plan | Nov. 4th | Dec. 20th | Education/ETS/Math
Math 110 prototype roll out | Jan. 2014 | May, 2014 | ETS/Math instructors & TAs
Focus groups with end users of prototype | Feb. 2014 | May, 2014 | Education/ETS
Collect/analyze data, determine prototype’s impact on student success | April, 2014 | Aug., 2014 | Education/ETS/OPIA
Prepare manuscript, identify 2nd round funding sources | Sept., 2014 | Ongoing | Education/ETS

The need for funding

After exploring LA and making incremental progress towards data acquisition, analysis, modeling and prototyping, we need dedicated resources in this area to move forward. Specifically, we require a data scientist, someone who can merge diverse data sets into a single, normalized and cohesive database environment. A data scientist can also take a research question, translate that into the necessary SQL code, and extract the specific data set a researcher needs for analysis. The team also needs a statistician who is experienced at examining large data sets representing various student data points and familiar with a variety of data analysis methods. The other personnel are volunteering time.

Future funding opportunities

Following the completion of the proposed project, we will be in position to submit an NSF Cyberlearning proposal. NSF seeks interdisciplinary project teams to study technology integration issues. Exploratory projects ‘…explore the feasibility of a technological innovation and to shed light on the answers to fundamental research questions related to learning with technology’. Teams must be operational entities and are expected to be familiar with ‘…challenges associated with assessment and evaluation, robustness and broader usability that they anticipate’. Projects must include research and development components and teams are expected to engage in ‘…iterative refinement of the design, implementation, or use of a technological innovation based on systematic analysis of formative data.’ The Gates and MacArthur Foundations represent additional funding opportunities.

References

Arnold, K., & Pistilli, M. (2012). Course signals at Purdue: Using learning analytics to increase student success. Presented at the Learning Analytics and Knowledge conference, Vancouver, B.C.

Design-Based Research Collective. (2003). Design-based research: An emerging paradigm for educational inquiry. Educational Researcher, 32(1), 5–8.

Long, P., & Siemens, G. (2011). Penetrating the fog: Analytics in learning and education. EDUCAUSE Review, 46(5).

Macfadyen, L., & Dawson, S. (2010). Mining LMS data to develop an “early warning system” for educators: A proof of concept. Computers & Education, 54(2), 588–599.

Siemens, G. (2010). What are learning analytics? Retrieved from http://www.elearnspace.org/blog/2010/08/25/what-are-learning-analytics/

Zimmerman, B. J., & Kitsantas, A. (2005). Homework practices and academic achievement: The mediating role of self-efficacy and perceived responsibility beliefs. Contemporary Educational Psychology, 30(4), 397–417.

Supporting Material

Core members of the proposal team are also part of Penn State’s Learning Analytics Working Group (http://sites.psu.edu/analytics/) and spent significant time extracting data from Penn State’s SIS (Pursel), CMS (A. Fisher) and assorted other data sources throughout 2012 and early 2013. Another member of the group (Warcholak) is currently exploring large data sets from Economics 104, using techniques described in the proposal. Other members of the team, guided by Simon Hooper, are actively designing prototypes and data visualizations, each designed for specific sets of end users (students, TAs, instructors, advisers and others). Below are several visualization examples created by the team.

Instructor-facing interface visualization:

Self-efficacy survey instrument and estimated budget are found in the Appendix

Qualifications of Key Personnel

The team we have assembled includes faculty, staff and graduate students who all have experience related to their specific tasks. One of the difficulties with undertaking a project related to learning analytics is the need for a variety of skillsets. The personnel on this proposal collectively possess skills in instructional design, software design and development, statistics, predictive modeling, educational psychology, system administration, pedagogy, research and assessment. In addition to the core team and stakeholders, we also have a variety of collaborators that will act as a sounding board, coming together bi-monthly to review progress, provide feedback and guide future directions for the application of learning analytics in different contexts.

Dr. Bart Pursel is the Faculty Projects Coordinator for Teaching and Learning with Technology (TLT), collaborating with faculty on various projects that reside at the intersection of technology and pedagogy. Together with Chris Millet, Dr. Pursel is a leader in Penn State’s current exploration of learning analytics, specifically focusing on data acquisition, analysis and modeling. He is also the data coordinator for Penn State’s partnership with Coursera, where he will work with Coursera on data acquisition and with Penn State researchers on various research projects examining student behaviors in MOOCs. Before arriving at TLT, Dr. Pursel worked for the Schreyer Institute for Teaching Excellence on a variety of institutional research projects focused on undergraduate student support. Dr. Pursel also has experience in the areas of game design and development, game-based learning, online course design and delivery, teaching, research and assessment.

Chris Millet is Assistant Director of Education Technology Services (ETS), a division of Teaching and Learning with Technology (TLT) and Information Technology Services (ITS) at Penn State University. Chris is responsible for helping establish the overall strategy for ETS, including the research and development of new technologies and technology-supported pedagogies with a university-wide scope. Chris also oversees Penn State’s Educational Gaming Commons and the Media Commons, and provides leadership to the learning analytics, digital badges, and mobile learning initiatives. Chris has a master’s degree in Instructional Systems from Penn State’s College of Education.

Nick Warcholak is a Senior Planning and Research Associate in Penn State's Office of Planning and Institutional Assessment. His main responsibilities include the analysis, modeling, and presentation of data to answer operational questions. He regularly employs statistical techniques to summarize trends in data and infer correlational and possible causal mechanisms. He is experienced with the use of nonlinear modeling, HLM, and categorical data analysis as well as assessment modeling frameworks such as SEM and IRT. He is currently working with data from several sections of Penn State's Econ 104 class as part of Penn State's learning analytics initiative. Nick has an MS in Educational Psychology from Penn State and is currently ABD in the same program.

Dr. Simon Hooper is a professor in Learning, Design, and Technology. During the past nine years he has served as PI on grants from the US Department of Education and from the NSF to develop electronic assessment systems and to design web-based education. He is a highly experienced Internet researcher, information visualization designer and multimedia developer, and has conducted studies in which data collection involved thousands of participants and millions of data records.

Andy Fisher has served for over five years as the Team Lead for the ANGEL Developers group. He has significant experience using SQL and SSIS to perform ETL processes for use inside an LMS as well as for reporting. In this role he has led efforts to implement new LMS integration standards, which have allowed far more users to seamlessly implement a wide variety of tools. Andy is also a member of the IMS Global Learning Analytics Leadership group that is tasked with determining how standards could help Learning Analytics.

Gi-Woong Choi is a Ph.D. student in the Learning, Design, and Technology program at Penn State and a graduate research assistant in Education Technology Services. He has a bachelor's degree in bio and brain engineering and a master's degree in cognitive science and engineering with a focus on human-computer interaction (HCI), and has worked as a user interface researcher and game user experience designer. In the summer of 2013, he will attend the 9th Annual LearnLab Summer School at Carnegie Mellon University in the educational data-mining track.

Our Stakeholders, Dr. James Hager and Dr. Stan Smith, are the key individuals in charge of Math 110 curriculum and teaching, in residence at University Park (Hager) and online through the World Campus (Smith). Both Dr. Hager and Dr. Smith are excited to work with us in exploring how learning analytics can help increase student success in Math 110.

Our collaborators from IST, Dr. Jim Jansen and Dr. Marcela Borge, both have experience in analytics. Dr. Jansen specializes in web and advertising analytics, and is currently consulting with us to identify different data analysis and predictive modeling methods. Dr. Borge is currently engaged in learning analytics exploration and prototyping with faculty from Carnegie Mellon University, specifically examining how text analysis and intelligent agents might inform future learning analytics tools.

Dr. Mark Fisher and Dr. William Goffe are instructors in Liberal Arts, and are working with us to explore learning analytics in their specific courses (Philosophy 12 and Economics 104). The outcomes of this proposed project directly impact how we work with their data, and how we can use analytics to support specific learning needs across disciplines.

Dr. Gary Hagerty, Director of the Math Learning Center at Boise State University, is actively researching self-efficacy in math education, and will guide our efforts in measuring self-efficacy, understanding how it relates to success in Math 110 and how it might be leveraged in any predictive models or instructional strategies.

Dissemination Plans

As learning analytics is a very diverse field, we intend to use a range of outlets to disseminate our work. Locally at Penn State, we will use the COIL website, as well as the TLT, OPIA and College of Education websites, to provide general updates on our work. Through local COIL symposia and the TLT Symposium, the team will share more detailed research results and the outcomes of implementing a learning analytics prototype in Math 110.

At the regional and national level, results will be shared at Educause, focusing on the technology leveraged and how the prototype impacted student learning and pedagogy. From a data analysis and modeling standpoint, results will be shared at the Northeast Association for Institutional Research (NEAIR).

The primary venue for this work will be the fourth annual Learning Analytics and Knowledge conference in Indianapolis during the spring of 2014. This conference brings together learning scientists, statisticians, programmers, data analysts, learning designers and instructors, providing a very diverse audience to share our results with.
