Chapter 2 Literature review
2.4 Teaching introductory programming courses
2.4.3 Problems novices experience
2.4.3.2 What do novices find difficult
Much computing education research (CER) focuses on why students find learning to program difficult and which topics cause difficulties, as well as on factors that predict student success. This can be seen from the following themes revealed by surveys of CER:
• Theories and models (pedagogy) of teaching/learning (Clancy, Stasko, Guzdial, Fincher & Dale, 2001; Palumbo, 1990; Pears, Seidman, Eney, Kinnunen & Malmi, 2005; Pears, Seidman et al., 2007; Robins et al., 2003; Sheard et al., 2009; Valentine, 2004).
• Ability/aptitude/understanding, including the difference between novices and experts (Clancy et al., 2001; Palumbo, 1990; Robins et al., 2003; Sheard et al., 2009).
• Teaching/learning and assessment techniques (Clancy et al., 2001; Palumbo, 1990; Pears et al., 2005; Pears, Seidman et al., 2007; Robins et al., 2003; Sheard et al., 2009; Valentine, 2004).
• Tools to assist with teaching/learning and assessment (Clancy et al., 2001; Palumbo, 1990; Pears et al., 2005; Pears, Seidman et al., 2007; Robins et al., 2003; Sheard et al., 2009; Valentine, 2004).
For the purpose of this study, the topics that novices find difficult to understand are important, as the tutorial developed for this study aims to address some of those topics. A brief summary of the reasons why novices fail is presented before the topics that novices find difficult are expanded on. The list that follows does not claim to be comprehensive, but the literature gives the following reasons why novices fail to learn to program:
• fragile programming knowledge and in particular, fragile problem-solving strategies (Robins et al., 2003);
• lack of problem-solving skills (McCracken et al., 2001; Miliszewska & Tan, 2007);
• inability to organise programming knowledge efficiently (Wiedenbeck, 2005);
• lack of prior experience, such as previous abstraction experience, background deficiencies in particular kinds of mathematics such as discrete mathematics and logic (Kinnunen et al., 2007; Pears, East et al., 2007; Pears, Seidman et al., 2007);
• lack of a viable mental model of programming and the computer, possibly due to lack of programming experience (Kinnunen et al., 2007; Miliszewska & Tan, 2007; Robins et al., 2003);
• the difficulty of the curriculum and course content as well as the teaching methods (Jenkins, 2002; Kinnunen & Malmi, 2008; Kinnunen et al., 2007; Miliszewska & Tan, 2007; Pears, East et al., 2007);
• students’ attitude and motivation; behaviour, such as deep learning versus surface learning; and expectations, such as lack of self-efficacy² (Ala-Mutka, 2004; de Raadt, Hamilton, Lister, Tutty, Baker, Box, Cutts, Fincher, Hamer & Haden, 2005; Jenkins, 2002; Kinnunen & Malmi, 2008; Kinnunen et al., 2007; Pears, East et al., 2007; Wiedenbeck, 2005);
• intrinsic student abilities, such as natural level of ability and ways of thinking (Ala-Mutka, 2004; Jenkins, 2002; Kinnunen et al., 2007; Pears, East et al., 2007); and
• inability to read and understand program code (Lister et al., 2004; Mannila, 2006).
Some novices are effective and some ineffective. Effective novices learn to program without too much assistance; ineffective novices do not learn, despite disproportionate effort and personal attention. The difference in strategies, rather than knowledge, is probably the most important difference between effective and ineffective novices. Unsuccessful novices frequently stop when stuck instead of exploring different ideas; do not trace, or do so half-heartedly; try small changes instead of considering all of the code; and neglect to break problems down into parts, or err in doing so (Perkins et al., 1989).
2 Perceived self-efficacy is “beliefs in one’s capabilities to organise and execute the courses of action required to produce given attainments” (Bandura, 1997:3).
Many of the specific problems novices experience were already identified in the latter part of the previous century, as can be seen from Soloway and Spohrer’s (1989) compilation of several studies that report on novices’ understanding of specific language features and how they use them. Samurçay (1989) investigated novices’ concept of a variable and how they use it, and reports that initialisation is a complex cognitive operation and that novices understand reading (obtaining external input) better than initialisation. Initialisation is also more complex than updating and testing variables, which are on the same complexity level. Du Boulay (1989), too, points out various difficulties novices experience with the assignment statement, such as inappropriate analogies. Hoc (1989) found that certain kinds of abstractions can lead to errors in constructing conditional statements. In their analysis of bugs in simple Pascal programs, Spohrer, Soloway and Pope (1989) found that errors associated with loops and conditionals are more common than those with input, output, initialisation, update, syntax/block structure and planning. Soloway, Bonar and Ehrlich (1989) studied how novices use loops, and concluded that they preferred loops that first read and then process, rather than process/read loops, since this concurred with their natural strategy. Du Boulay (1989) points out that novices frequently have trouble with for loops because they fail to understand that the loop control variable is incremented in each loop cycle. He also notes the problems novices experience with using arrays, especially understanding the difference between subscripts and the values in the array cells (Du Boulay, 1989). Kahney (1989) showed that novices may acquire different models of recursion, of which most are incorrect. Kessler and Anderson (1989) found that novices were better able to write recursive functions after studying iterative functions first, but not vice versa. They concluded that novices’ difficulty in understanding flow of control is due to their inadequate mental models of the task.
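The two difficulties Du Boulay (1989) describes with for loops and arrays can be made concrete with a small sketch. The studies cited above concerned other languages (Pascal among them); Python is used here purely for illustration, and the list name marks and its contents are hypothetical.

```python
# Hypothetical data, chosen only for illustration.
marks = [70, 55, 90]

trace = []
for i in range(len(marks)):
    # The loop control variable i advances on every cycle: 0, then 1, then 2.
    # i is the subscript (a position); marks[i] is the value stored in that cell.
    trace.append((i, marks[i]))

print(trace)  # [(0, 70), (1, 55), (2, 90)]
```

Each recorded pair keeps the subscript and the cell value side by side, which is exactly the distinction novices are reported to blur.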
Despite these problems having been exposed for a quarter of a century, they persist, as can be seen from more recent research. In imperative-first courses students seem to experience problems with two core concepts: assignment and sequence, and recursion/iteration (Bornat et al., 2008). Simon (2011) found that many students in their first, second and even third programming courses appear not to understand the semantics of the assignment statement, or that a sequence of statements is executed sequentially, or possibly both. Jimoyiannis’ (2013) study confirms that many novice programmers have mathematical-like mental models about variables and the assignment statement. Sorva (2008), too, points out various misunderstandings students have about primitives and object variables. This may be due to common misconceptions about variables and confusion between assignment and equality based on their prior knowledge of algebra (Khalife, 2006; Sorva, 2008). Whatever the cause, it keeps them from understanding a standard swap operation and will make it more difficult to learn further concepts (Simon, 2011), as proposed by the LEM hypothesis (Robins, 2010). Without proper mental models for variables and assignments, students will also be unable to understand and implement object-oriented techniques (Ben-Ari, 2001; Farrell, 2006; Ma et al., 2011; Sorva, 2008).
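The swap operation mentioned above shows why both assignment semantics and sequential execution matter. The following is a minimal sketch in Python, not taken from any of the cited studies; the variable names and values are hypothetical.

```python
a, b = 3, 5  # hypothetical starting values

# Attempted swap by a student who reads "=" as algebraic equality.
x, y = a, b
x = y            # x now holds 5; the original 3 has been overwritten
y = x            # y receives 5 again, so nothing is swapped
naive = (x, y)   # (5, 5)

# Standard swap: a temporary variable preserves the value about to be overwritten.
x, y = a, b
temp = x         # remember 3
x = y            # x now holds 5
y = temp         # y now holds 3
swapped = (x, y)  # (5, 3)

print(naive, swapped)
```

The first attempt fails only because the statements run one after the other and each assignment destroys the previous content of its target, the two ideas the cited studies report as poorly understood.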
In an investigation focused on the topics in object-oriented programming that second-year students find difficult, Milne and Rowe (2002) conclude that students struggle mostly with pointer and memory topics, because they do not understand what happens in memory when their programs are executed. This will continue to be a problem unless they develop a clear mental model of how their program works in terms of how it is stored in memory and how the objects in memory relate to each other.
In an objects-first course, Ragonis and Ben-Ari (2005) found that novices experience difficulties in understanding program flow and sequential program execution, because they do not understand the difference between the integrated development environment (IDE) and the execution of a program. They struggled to understand that methods could be invoked more than once, where parameters receive values, and where the return value of a method goes. They also found it difficult to comprehend the need for input instructions, as well as the relationship between constructor declaration, invocation and execution (Ragonis & Ben-Ari, 2005). This confirms Milne and Rowe’s (2002) observation that students need a clear mental model of program-memory interaction to master object-oriented programming. It also ties in with program-memory interaction as a threshold concept (Rountree & Rountree, 2009).
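Three of the points reported by Ragonis and Ben-Ari (2005), namely that a method can be invoked more than once, where parameters receive their values, and where a return value goes, can be sketched as follows. This is an illustrative Python fragment under assumed names (area, calls), not material from their study.

```python
calls = []  # records the parameter values received on each invocation

def area(width, height):
    # The parameters receive their values at the moment of the call.
    calls.append((width, height))
    # The returned value goes back to the place where the call was written.
    return width * height

first = area(2, 3)   # first invocation: width = 2, height = 3
second = area(4, 5)  # the same function invoked again with new values

print(calls, first, second)
```

Reading the fragment line by line makes it visible that the function body runs once per call and that each result lands in the variable on the left of the calling statement.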
In a Delphi process, the 11 most important and difficult concepts in introductory computing courses were identified, in order of most difficult and most important, as follows (Goldman, Gross, Heeren, Herman, Kaczmarczyk, Loui & Zilles, 2008; Goldman, Gross, Heeren, Herman, Kaczmarczyk, Loui & Zilles, 2010):
• abstract pattern recognition and use;
• debugging, exception handling;
• functional decomposition, modularisation;
• conceptualise problems, design solutions;
• designing tests;
• inheritance;
• memory model, references, pointers;
• parameter scope, use in design;
• procedure design;
• recursion, tracing and designing; and
• issues of scope, local vs. global (Goldman et al., 2008; Goldman et al., 2010).
As part of the same study, Kaczmarczyk, Petrick, East and Herman (2010) identified the following four themes emerging from students’ misconceptions:
• Misunderstanding the relationship between language elements and underlying memory usage.
• Misunderstanding the process of while loop operation.
• Lacking a basic understanding of the object concept.
• Inability to trace code linearly.
Kaczmarczyk, Petrick, East and Herman (2010) elaborate on the first theme (misunderstanding the relationship between language elements and underlying memory usage) in Table 2.1 below. From the perceptions that students hold about memory allocation as identified in Table 2.1, it is clear that the misconceptions are based on inappropriate or absent mental models of variables and assignments.
Table 2.1 Misconceptions about the relationship between language elements and underlying memory usage (Kaczmarczyk, Petrick, East & Herman, 2010)
Language element | Underlying memory usage
Semantics to semantics | Student applies real-world semantic understanding to variable declarations.
All objects same size | Student thinks all objects are allocated the same amount of memory regardless of definition and instantiation.
Instantiated no memory allocation | Student thinks no memory is allocated for an instantiated object.
Uninstantiated memory allocation | Student thinks memory is allocated for an uninstantiated object.
Off by 1 array construction | Student thinks an array's construction goes from 0 to length, inclusive.
Primitive no default | Student thinks primitive types have no default value.
Primitives don't have memory | Student thinks primitives without a value have no memory allocated.
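The "off by 1 array construction" misconception in Table 2.1 can be illustrated with an analogous case in Python. This is only a sketch under assumed names (cells, valid_indices); it does not reproduce the language or materials used in the cited study.

```python
# Hypothetical three-element list used only for illustration.
cells = ["a", "b", "c"]

# Valid subscripts run from 0 up to length - 1, not up to length.
valid_indices = list(range(len(cells)))

try:
    cells[len(cells)]           # a subscript equal to the length is out of range
    index_at_length_ok = True
except IndexError:
    index_at_length_ok = False

print(valid_indices, index_at_length_ok)
```

A student holding the misconception would expect the subscript equal to the length to refer to a fourth cell; the raised IndexError shows that no such cell exists.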
Dropout students identified the most difficult topics as designing one’s own code, exception handling, handling files, arrays, identifying structures in given code, adopting the programming style required in the course and transferring one’s own thinking into a programming language (Kinnunen & Malmi, 2008). At Victoria University in Melbourne, Australia, novices found classes and methods, graphical user interfaces and event handling the most difficult, followed by iteration, selection, and input/output (Miliszewska & Tan, 2007). The technicalities of programming were also perceived as difficult. In Malaysia, novices without prior programming experience considered the most difficult topics, in decreasing order, to be designing a program to solve a certain task, dividing functionality into procedures, learning the programming language syntax, finding bugs in their own programs, understanding basic concepts of programming structures, using the IDE and gaining access to computers/networks (Tan, Ting & Ling, 2009). Butler and Morgan (2007) confirm that students find topics of a more conceptual nature more difficult both to understand and to implement. These include object-oriented concepts and design, program design and arrays.
Dale (2006) identified four categories of difficult topics, namely problem-solving and design, general programming topics, object-oriented constructs and student maturity (or rather, lack thereof). Based on the feedback related to the first and fourth categories, the results of her study suggest that students may not have developed the logical thought processes that are required for problem-solving. These higher-order skills are also required for the third category. In the second category, control structures, I/O, parameters and recursion with parameters, arrays, looping and files (in descending order of difficulty) were indicated as the most difficult topics. Concrete examples and better teaching tools are suggested to improve learning (Dale, 2006).
In an international survey, Lahtinen, Ala-Mutka and Järvinen (2005) identified the most difficult issues in programming as understanding how to design a program to solve a certain task; dividing functionality into procedures, functions and/or classes; and finding bugs in one’s own programs. All of these require understanding larger entities of a program instead of only some details. Recursion, pointers and references, abstract data types, error handling, and using language libraries were singled out as the most difficult programming concepts. Learning to apply basic concepts, rather than understanding them, appears to present the biggest problem. This work confirms the findings by Soloway and Spohrer (1989), Milne and Rowe (2002) and Dale (2006).
Based on the work by Milne and Rowe (2002), Lahtinen et al. (2005) and Tan, Ting and Ling (2009), Piteira and Costa (2012) summarised topics in introductory programming courses, from the highest level of student comprehension to the lowest, as: selection structures; variables (lifetime/scope); loop structures; operators and precedence; structured data types; abstract data types; recursion; pointers and references. This concurs to a large extent with the work by Ben-Ari (2001), Bornat et al. (2008), Butler and Morgan (2007), Dale (2006), Farrell (2006), Goldman et al. (2008), Goldman et al. (2010), Hertz and Ford (2013), Jimoyiannis (2013), Kaczmarczyk et al. (2010), Khalife (2006), Kinnunen and Malmi (2006), Lahtinen et al. (2005), Ma et al. (2011), Miliszewska and Tan (2007), Milne and Rowe (2002), Ragonis and Ben-Ari (2005), Simon (2011), Soloway and Spohrer (1989) and Sorva (2008), who investigated the topics that novices find difficult, as discussed above.
Schulte and Bennedsen (2006) investigated what teachers teach in introductory programming and came to the conclusion that the majority of the ‘classic’ topics (such as iteration and syntax) are taught at the same level, with the same difficulty and the same relevance regardless of which paradigm (imperative or OO) is followed. An interesting finding was that most teachers considered a mental model of the notional machine less important but, despite that, believed that a hierarchy of object-interaction, presented as the basis for a notional machine in OO, was important (Schulte & Bennedsen, 2006). Related to this is Hertz and Ford’s (2013) investigation into the importance an instructor ascribes to a topic and the time spent teaching it as underlying factors that influence introductory student learning. Little correlation was found between time spent on teaching a topic and students’ abilities related to that topic. In contrast, novices’ abilities on a topic correlate positively, to a large extent, with how important the instructor deems the topic to be, with the exception of control structures, subroutines/functions and types. This suggests that these are difficult topics to master, once again confirming previous research.
Considering the problems highlighted by the research presented above, it seems that students experience the same problems after the introduction of OO as they did before. Many of these problems originate in the lack of an appropriate mental model of a running program, which Khalife (2006) considers the first threshold facing introductory programming students. An inadequate comprehension of program-memory interaction is reflected in the problems students experience with memory topics such as variables, assignment, data structures, parameters, objects, pointers and references. Lack of understanding of the flow of control in a running program is reflected in problems with the execution sequence, control structures, procedures and recursion. Students also struggle with program design and debugging. The complexities of OO may even introduce more problems, since more material has to be covered in introductory courses, with less time allocated to basic concepts such as control structures. Event-handling and class methods also add a layer of complexity (McCauley, Fitzgerald, Lewandowski, Murphy, Simon, Thomas & Zander, 2008).
2.5 Tracing
One of the mechanisms to assist with clearing up misconceptions held by novices (see Section 2.4.3.2 What do novices find difficult) and developing mental models (as discussed in Section 2.4.2.1 Mental models) is to teach them tracing. In the following sections tracing itself is defined, the reasons why tracing is important are explained, and the difficulties novices experience with tracing, as well as the advantages of teaching novices to trace, are pointed out.
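As an informal illustration ahead of the definition, tracing can be thought of as recording the values of the variables after each step of execution. The sketch below builds such a record for a simple summing loop; the function name trace_sum and the input values are hypothetical and are not taken from the tutorial developed for this study.

```python
def trace_sum(values):
    """Return one row per loop cycle: (subscript, value read, running total)."""
    rows = []
    total = 0
    for index, value in enumerate(values):
        total = total + value          # update the running total
        rows.append((index, value, total))
    return rows

rows = trace_sum([4, 1, 7])
for row in rows:
    print(row)
# (0, 4, 4)
# (1, 1, 5)
# (2, 7, 12)
```

Each printed row corresponds to one line of a hand-drawn trace table, which is the form in which novices would typically perform the same exercise on paper.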