Program Comprehension - Learning to Program

The Process of Learning to Program

2.4 Learning to Program

2.4.1 Program Comprehension

The process of understanding a program is known as program comprehension (Ramalingam & Wiedenbeck 1997). Programmers need to understand programs written by themselves or by other programmers. Tasks such as debugging, modification and code reuse require that programs be comprehended, and comprehension is therefore an important skill for novice programmers to acquire (Navarro-Prieto & Cañas 2001). Pennington’s Two Stage mental model for program comprehension (Pennington 1987a, b; Wiedenbeck 1999; Navarro-Prieto & Cañas 2001) is based on the mental model approach to text understanding (van Dijk & Kintsch 1983). The Two Stage model states that when a program’s source code is read, a number of different representations, at varying levels of abstraction, are formed in a bottom-up fashion (Figure 2.4).

At the lowest level is a verbatim representation, which is identical or very similar to the program code itself. This representation contains no abstract information and consists only of the program text itself. Above the verbatim representation are more abstract representations of the program. Dual mental representations called the program model and the domain model, as well as a third kind of knowledge, state, are formed above the verbatim representation.

The program model (Figure 2.4) contains information about the program entities and relationships explicitly encountered in the program text. The model is made up of knowledge about operations and control flow within the program, both of which are at a low level of abstraction and can be extracted directly from the program code.

[Diagram: Program Model, Domain Model and State, positioned above the Verbatim Representation]

Figure 2.4: Pennington's Two Stage Mental Model for Program Comprehension

Operations knowledge is concerned with elementary operations performed in the program code. This knowledge typically corresponds to a few lines of text and is close to the verbatim representation of the program code. Displaying a line of text on screen or updating a variable are examples of elementary operational knowledge that might be extracted from a program’s text.

Control flow knowledge pertains to the order in which instructions are executed within the program. By default this order is sequential, but it may be modified by programming language constructs such as loops and function calls. Operations and control-flow information are extracted while reading the program code and do not require purposeful, detailed study to obtain.
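As an illustration only, the short Python sketch below annotates a small program with the two kinds of program-model knowledge described above. The function name average_mark and the sample marks are hypothetical and are not taken from Pennington's work; the comments mark which lines carry operations knowledge and which carry control-flow knowledge.

```python
# Hypothetical example: names and data are illustrative assumptions.
def average_mark(marks):
    total = 0                   # operation: initialise a variable
    for mark in marks:          # control flow: a loop changes the sequential order
        total = total + mark    # operation: update a variable
    if not marks:               # control flow: a branch selects what runs next
        return 0.0
    return total / len(marks)   # operation: compute and return a value


print(average_mark([60, 70, 80]))  # operation: display a line of text (prints 70.0)
```

Each annotation can be read straight off one or two lines of code, which is consistent with the claim that this information is available without detailed study.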

The domain model (Figure 2.4) contains knowledge about the problem domain (Figure 2.3) being referred to by the program, where functional relationships between disparate parts of the program are expressed in the language of problem domain entities. The domain model consists of data-flow and function knowledge.

For clarity, the problem domain is the situation in which the problem to be solved is found. The program model and domain model are mental representations of a program that a programmer forms by examining program code and performing appropriate actions. The domain model includes information from the problem domain, whereas the program model does not.

Data-flow knowledge describes the transformation of data as it flows through the program. The transformation of input data to output data is the purpose of the program, which links data-flow knowledge to the program's goals. Function knowledge is concerned with the goals of the program in terms of the problem it addresses. Data-flow and function information is not readily available from the program code and requires explicit comprehension tasks to extract.
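The difference between data-flow knowledge and what is visible on a single line can be sketched as follows. This is a minimal illustration, not a method from the cited literature: the three-statement program, the variable names and the helper inputs_reaching are all assumptions made for the example. The helper works out which original inputs flow into a given variable, something that no single statement shows.

```python
# Hypothetical straight-line program, written as (variable assigned, variables read):
#   total   = mark1 + mark2
#   average = total / 2
#   passed  = average >= 50
STATEMENTS = [
    ("total", ["mark1", "mark2"]),
    ("average", ["total"]),
    ("passed", ["average"]),
]


def inputs_reaching(target, statements):
    """Return the set of original inputs whose values flow into `target`."""
    sources_of = {}
    for assigned, reads in statements:
        sources = set()
        for name in reads:
            # A name never assigned earlier is treated as a program input.
            sources |= sources_of.get(name, {name})
        sources_of[assigned] = sources
    return sources_of.get(target, {target})


print(sorted(inputs_reaching("passed", STATEMENTS)))  # ['mark1', 'mark2']
```

In this sketch the data-flow knowledge is that passed depends on mark1 and mark2 through total and average. The corresponding function knowledge, stated in problem-domain terms, would be something like "decide whether a student passed on the basis of two marks".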

State (Figure 2.4) contains information about the state of all program aspects whenever a given action occurs within the program. Pennington (1987b) states that state information may be difficult to extract from program code, as there is no explicit highlighting in typical programming notations.
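To illustrate why state is hard to read from program text, the sketch below records the values of the variables each time one action (the update of a running total) occurs. The function average_with_trace and its data are hypothetical; the point is only that the source code never displays these values, so a reader must reconstruct them mentally or by tracing.

```python
# Hypothetical example: names and data are illustrative assumptions.
def average_with_trace(marks):
    """Return the average of `marks` and a snapshot of state after each update."""
    trace = []
    total = 0
    for index, mark in enumerate(marks):
        total += mark
        # Snapshot of the program state at the moment the update occurs.
        trace.append({"index": index, "mark": mark, "total": total})
    average = total / len(marks) if marks else 0.0
    return average, trace


average, trace = average_with_trace([60, 70, 80])
for snapshot in trace:
    print(snapshot)
```

The three snapshots show total taking the values 60, 130 and 210, none of which appears anywhere in the program text.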

The various representations of a program are constructed in a bottom-up fashion (Pennington 1987a, b). The program model is formed from the verbatim representation and is built up as the program code is read. The domain model is formed from the program model together with knowledge of the problem domain, and is constructed only under appropriate task conditions.

The type of comprehension task required of a programmer influences the mental model acquired. When required to perform read-to-do tasks, such as program modification and debugging, as opposed to read-to-recall tasks, for example program documentation, a different mental model construction method is employed (Détienne 1990). For simple tasks, such as reading program code, only a program model needs to be constructed. For more complex tasks, such as debugging and program modification, a program model and domain model need to be constructed, as both control flow and data flow information are required to complete such tasks.

There are a number of situational variables that affect when and how programmers build these models. Two of these are the notation used, both the programming language and secondary notation, and the notation’s visual characteristics (Gilmore & Green 1988). Textual programming languages, such as the general-purpose programming languages (C, C++, C#, Pascal, etc.), allow programmers to extract control flow information more readily, while visual programming languages allow programmers to extract data flow information more readily. Data flow information is accessed from textual programming languages only when programmers are required to perform a difficult task, such as debugging (Navarro-Prieto & Cañas 1999).

Programming languages affect not only the mental models constructed but also other aspects of the learning process. Section 2.4.2 discusses programming languages in more detail.

In document The evaluation of a pedagogical-program development environment for Novice programmers : a comparative study (Page 47-50)