
Building on the analysis of primary concepts in the previous section, we next investigate student success by SQL concept. For each student, we computed the set of SQL concepts (from Table 5.10) for which the student submitted at least one correct response. We then applied the Apriori algorithm described by Agrawal et al. [1] to identify frequently occurring subsets of SQL concepts. (Such subsets are often referred to as “frequent itemsets.”) In this way, we identified areas where student success in certain concepts is associated with success in other concepts. In particular, we identified several frequent subsets that include both fundamental and advanced concepts.

[Figure 5.15 plot. Axes: Average Attempts per Student (0 to 50); % of Students who Submitted Correct Responses (70 to 100). Labeled concepts: Set Operations, Correlated Subquery, Self-Join, Relational Division, Full Outer Join.]

Figure 5.15: Average Attempts and Percent Success for the 18 Core SQL Concepts Listed in Table 5.10
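The frequent-itemset step described above can be sketched in a few lines of Python. This is a minimal illustration of the level-wise Apriori idea (join, prune, count), not the implementation used in the study. The function name `frequent_itemsets`, the student identifiers, and the four example concept sets are hypothetical and chosen only to make the sketch runnable.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support.

    Each transaction is the set of SQL concepts for which one student
    submitted at least one correct response.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        # Fraction of students whose concept set contains the whole itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single concepts.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        s = support(single)
        if s >= min_support:
            current[single] = s

    result = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        result.update(current)
        k += 1
    return result


# Hypothetical data: four students and the concepts each solved correctly.
students = {
    "s1": {"Grouping", "Grouping Restrictions", "Self-Join"},
    "s2": {"Grouping", "Grouping Restrictions"},
    "s3": {"Grouping", "Grouping Restrictions", "Self-Join"},
    "s4": {"Grouping", "Self-Join"},
}

itemsets = frequent_itemsets(students.values(), min_support=0.75)
for itemset, s in sorted(itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(", ".join(sorted(itemset)), f"{s:.0%}")
```

With the study's threshold, `min_support` would be 0.90. The prune step relies on the fact that a subset can only be frequent if all of its smaller subsets are frequent, which keeps the number of candidates small.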

Table 5.11 lists SQL concept subsets that occur for 90% or more of students, when considering only concepts for which a student submitted at least one correct response. This list highlights associations in student success across SQL concepts. Specifically, we note that the Grouping and Grouping Restrictions concepts appear in most of the frequent itemsets. Furthermore, these two concepts appear to be linked with several advanced concepts, including Self-Join. This seems counter-intuitive, since, on the surface, grouping has little to do with the self-join operation. We suspect that this association reveals a link from solid student understanding of unfamiliar relational concepts to mastery of SQL as a whole. Once a student has learned to successfully apply core concepts such as joins and grouping, we believe that the student is better able to learn and apply advanced SQL concepts. This observation, and the future analysis proposed in Section 7.2.2, offer key information related to the overall research questions we posed in this thesis.

Table 5.11: Frequently-Occurring (≥ 90% support) Subsets of SQL Concepts in Successful Student Responses. The “Support” column represents the percentage of students who successfully solved at least one exercise containing each concept in the listed subset.

SQL Concept Subset                                                          Support

Grouping, Grouping Restrictions                                             97.7%
Grouping, Parameter Distinct                                                97.7%
Parameter Distinct, Grouping Restrictions                                   97.7%
Grouping, Parameter Distinct, Grouping Restrictions                         97.7%
Grouping, Multi-Table                                                       93.1%
Multi-Table, Parameter Distinct, Grouping Restrictions                      93.1%
Grouping, Multi-Table, Parameter Distinct, Grouping Restrictions            93.1%
Grouping, Self-Join, Grouping Restrictions                                  93.1%
Self-Join, Multi-Table, Grouping Restrictions                               90.8%
Grouping, Multi-Table, Grouping Restrictions, Self-Join, Parameter Distinct
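The support values in Table 5.11 can be stated compactly. The notation here is an assumption introduced for illustration and does not appear in the thesis: S is the set of students, C_s is the set of concepts for which student s submitted at least one correct response, and X is a concept subset.

```latex
\[
  \operatorname{supp}(X) \;=\; \frac{\bigl|\{\, s \in S : X \subseteq C_s \,\}\bigr|}{|S|},
  \qquad
  X \subseteq Y \;\Longrightarrow\; \operatorname{supp}(X) \ge \operatorname{supp}(Y).
\]
```

The implication on the right is why adding a concept to a subset can never raise its support. For example, {Grouping, Grouping Restrictions} has support 97.7%, while the larger subset {Grouping, Self-Join, Grouping Restrictions} has support 93.1%.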

6 Threats to Validity

Internal Validity. Each student is free to complete exercises within a given lab in whatever order he or she chooses, raising the possibility that fatigue or other factors may play a role in student performance. In addition, since our experiment uses a repeated-measures design, we explicitly do not address possible effects related to the order in which students work on lab exercises.

The Lab 365 tool we developed could affect participants’ abilities to solve SQL exercises. Although the tool is designed as a minimal SQL interface, usability and user interaction decisions have inevitably been made, whether intentionally or not. These decisions have not been rigorously tested for impact on study results.

For most exercises in this study, students were permitted to view expected output at any time during SQL query development. This matches previous similar studies ([2], [35]), which also allowed students to view expected query output. In most real-world situations, however, a SQL developer does not have the ability to preview results. The ability to preview results may artificially simplify certain query tasks.

External Validity. We conducted our study using a specific RDBMS (MySQL) within a quarter-long database course which did not extensively cover database design topics such as the Entity-Relationship model or normalization. Our results may not generalize to courses in which a different RDBMS is used, or to courses in which students are exposed to additional database topics.

At the beginning of our study, students were informed that their interactions with the Lab 365 application would be recorded for analysis. This knowledge may alter student problem-solving behavior, introducing Hawthorne effects. For example, a student might devote an atypical amount of effort to manually inspecting SQL code before testing each query in an attempt to prevent the system from recording too many incorrect attempts.

7 Conclusions and Future Work

In this thesis, we originally set out to investigate the learning process in an introductory database course, and to quantitatively study troublesome SQL concepts and common errors. The database lab tool we developed proved effective. Lab 365 facilitated collection of a large volume of data related to the student problem-solving process, and promises to be a useful tool in the future.

Our results are largely consistent with similar previous studies. In concurrence with Ahadi et al. [2], we observe that SQL syntax errors are a significant source of student frustration. Analyzing the most difficult SQL concepts, we find that self-joins, correlated subqueries, and (to a lesser extent) grouping restrictions are most troublesome. These findings are similar to results reported by Taipalus et al. [36] and Ahadi et al. [2]. In comparison to these previous studies, we studied a larger collection of SQL exercises based on a wider variety of database structures and problem domains. We also investigated several SQL concepts that, to our knowledge, have not been previously studied in an educational context. Building on these extensions to previous studies, we performed an initial study of SQL concept combinations. Drawing from our analysis, we are well-positioned to suggest improvements to lab exercises and to validate the effect of these changes on the student learning process.
