Software Design
4.3 INTRODUCTION TO SOFTWARE DESIGN REPRESENTATIONS
Any notations, techniques, or tools that can help to understand systems or describe them should receive serious consideration from the person modeling the system. We will focus our attention in this section on techniques for modeling and will briefly discuss design notations.
Suppose that we cannot find any existing software that either solves our problem directly, or else is a candidate solution after it is modified. In this case, we are forced to design new software to solve the problem. How do we do this?
There are several ways to describe a software system:
• The software can be described by the flow of control through the system.
• The software can be described by the flow of data through the system.
• The software can be described by the actions performed by the system.
• The software can be described by the objects that are acted on by the system.
Each of these system descriptions, in turn, leads us to one or more design representations that have been found to be useful for describing and understanding these types of systems.
The first type of description of a software system leads us to the concept of a flow graph or flowchart. The earliest popular graphical design representations were called “flowcharts,” which were control-flow oriented. Control flow describes a system by means of the major blocks of code that control its operation. In the 1950s and 1960s, a flowchart was generally drawn by hand using a graphical notation in which transfer of control was represented by the edges of a directed graph that described the program. Plastic templates were used for consistency of notation.
The nodes of a control flow graph are boxes whose shape and orientation provide additional information about the program. For example, a rectangular box with sides either horizontal or vertical means that a computational process occurs at this step in the program. A diamond-shaped box, with its sides at a 45-degree angle with respect to the horizontal direction, is known as a “decision box.” A decision box represents a branch in the control flow of a program. Other symbols are used to represent commonly occurring situations in program behavior. An example of a flowchart for a hypothetical program is given in Figure 4.1. More information on flowcharts is given in Appendix C.
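As a rough illustration, a flowchart such as the one in Figure 4.1 might correspond to C code like the following. The exact edges of the figure are not reproduced here, so this sketch assumes that each decision box guards the step that follows it and that a “No” answer skips that step. The names run_flow(), step1(), step2(), and the two counters are hypothetical.

```c
/* Hypothetical counters so the effect of each step can be observed. */
int step1_count = 0;
int step2_count = 0;

/* Each rectangular "process" box becomes an ordinary function. */
void step1(void) { step1_count++; }
void step2(void) { step2_count++; }

/* Assumed reading of the flowchart:
   Start -> (A > 0?) -> Step 1 -> (B = 0?) -> Step 2 -> Exit,
   where each "No" branch bypasses the step that follows the decision. */
void run_flow(int a, int b)
{
    if (a > 0) {     /* first diamond-shaped decision box */
        step1();
    }
    if (b == 0) {    /* second decision box */
        step2();
    }
    /* Exit: control returns to the caller. */
}
```

Each decision box maps to an if statement and each process box to a function call, which is why control-flow notations translate so directly into procedural code.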
The second method of describing designs is appropriate for a data flow representation of the software. As was mentioned in Chapter 1, data flow representations of systems were developed somewhat later than control flow descriptions. The two books by Yourdon, one written alone and the other coauthored with Constantine, are probably the most accessible basic sources for information on data flow design (Yourdon and Constantine, 1979; Yourdon, 1989). Most software engineering books contain examples of the use of data flow diagrams in the design of software systems.
Since different data can move along different paths in the program, it is traditional for data flow design descriptions to include the name of the data along the arrows indicating the direction of data movement.
Data flow designs also depend on particular notations to represent different aspects of a system. Here, the arrows indicate a data movement. There are different notations used for different types of data treatment. For example, a node of the graph representing a transformation of input data into output data according to some rule might be represented by a rectangular box. A source of an input data stream such as an interactive terminal input would be represented by another notation, indicating that it is a “data source.” On the other hand, a repository from which data can never be recalled, such as a terminal screen, is described by another symbol, indicating that this is a “data sink.” See Appendix C.
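[Figure 4.1 diagram: a flowchart for a hypothetical program, with nodes Start, decision “A > 0” (No/Yes), Step 1, decision “B = 0” (No/Yes), Step 2, and Exit.]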
Typical data flow descriptions of systems use several diagrams at different “levels.” Each level of a data flow diagram represents a more detailed view of a portion of the system at a previously described, higher level. A very high-level view of the preliminary analysis of the data flow for a hypothetical program is shown in Figure 4.2. This simple diagram would probably be called a level 1 data flow diagram, with level 0 data flow diagrams simply representing input and output.
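A data flow view can also be suggested in code by naming the data that moves between processing steps. The sketch below is loosely modeled on the node and arrow labels of Figure 4.2. The routing of A and B, the transformation rules, and all function names are assumptions made only for illustration.

```c
/* The data items that travel along the arrows, named as in the diagram. */
struct input_data {
    int a;
    int b;
};

/* Data source: a hypothetical stand-in for the "Get input" node. */
struct input_data get_input(int raw_a, int raw_b)
{
    struct input_data in = { raw_a, raw_b };
    return in;
}

/* Transformation node: consumes A, emits a transformed A
   (the doubling rule is assumed). */
int transform_a(int a) { return 2 * a; }

/* Transformation node: consumes B, emits a transformed B
   (the add-one rule is assumed). */
int transform_b(int b) { return b + 1; }

/* Top-level view: data moves from the source through the two
   transformations, and the combined result leaves the system. */
int run_pipeline(int raw_a, int raw_b)
{
    struct input_data in = get_input(raw_a, raw_b);
    int a_out = transform_a(in.a);
    int b_out = transform_b(in.b);
    return a_out + b_out;
}
```

Here each function signature plays the role of a labeled arrow: it states which data enters a node and which data leaves it, without saying anything about the order in which decisions are made.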
The third method of representing a software system’s design is clearly most appropriate for a procedurally oriented view of the system. It may include either control flow or data flow, or even combined descriptions of the software. The notations used for this hybrid approach are not standard. Such a design may be as simple as having a link from, for example, the box labeled “Step 1” in Figure 4.2 to the flow chart described in Figure 4.1.
Finally, the fourth method is clearly most appropriate for an object-oriented view of the system. This is obviously a different paradigm from the previous ones. We have chosen to use a modeling representation known as Unified Modeling Language, or UML. UML is an attempt to describe the relationships between objects within a system, but does not describe the flow of control or the transformation of data directly.
[Figure 4.2 diagram: nodes Get input, Step 1, Step 2, and Exit, joined by arrows labeled A, B, and A, B.]
FIGURE 4.2 A level 1 data flow diagram (DFD) description of a hypothetical computer system.
Note that each of the boxes shown in Figure 4.3 represents an object for which the inheritance structure is clearly indicated by the direction of the arrow: the object of type class 1 is a superclass of the object of type class 3. In an actual design using the unified object model, the horizontal line would contain information about the type of relationship (one-to-one, many-to-one, one-to-many) and about the aggregation of multiple objects into a single object. We omit those details here.
[Figure 4.3 diagram: three class boxes.
Class 1: Attribute: type 1; Operation 1: (arg, type)
Class 2: Attribute: type 2; Operation 1: (arg, type); Operation 2: (arg, type)
Class 3: Attribute: type 3; Operation 3: (arg, type)]
FIGURE 4.3 An oversimplified object-oriented description of a hypothetical computer system.
It is often very difficult for a beginning software engineer to determine the objects that are appropriate for a software system’s design. Although there is little comparative research to support the hypothesis that good object-oriented design is harder to create than design using traditional procedurally oriented approaches, we do believe that considerable training is necessary to appreciate the subtleties of the object-oriented approach. The student unfamiliar with the object-oriented approach is encouraged to read this and related discussions throughout this book several times after he or she has become more familiar with software development in order to understand why certain design decisions were made.
A helpful approach is to think of a description of the system’s functionality in complete sentences. The verbs in the sentences are the actions in the system; the nouns represent the objects. We will use this simple approach when developing that portion of the design of our major software engineering project example for which the object-oriented approach makes sense.
It is natural for the beginning student of software engineering to ask why multiple design representation approaches are necessary. Of course some techniques evolved for historical reasons. However, many techniques and representations have survived because they provide useful, alternative views of software design.
The familiar problem of sorting an array of data provides a good illustration of the distinction among the four approaches. Consider the well-known quicksort algorithm developed by C. A. R. Hoare (1961).
The quicksort algorithm partitions an array into two subarrays of (perhaps) unequal size. The two subarrays are separated by a “pivot element,” which is at least as large as all elements in one subarray and is also no larger than the elements in the other subarray. Each of the subarrays is then subjected to the same process. The process terminates with a completely sorted array. Note that this algorithm description is recursive in nature. Think of the quicksort algorithm as being used in a design to meet some requirements for an efficient sort of a data set of unknown size. (Students wishing a more detailed description of the quicksort algorithm are advised to consult any of the excellent books on data structures, or any introductory book on algorithms. Hoare’s original paper (Hoare, 1961) is still well worth reading.)
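The recursive description above can be sketched in C as follows. This is a minimal illustration for arrays of int, not a tuned implementation. It uses the Lomuto partition scheme with the last element as the pivot, chosen here only for brevity; Hoare’s paper uses a different partitioning method. The function names are hypothetical.

```c
#include <stddef.h>

/* Exchange two array elements. */
static void swap_ints(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Partition arr[lo..hi] around the pivot arr[hi] (Lomuto scheme).
   On return, elements left of the returned index are <= pivot and
   elements right of it are > pivot. */
static size_t partition(int arr[], size_t lo, size_t hi)
{
    int pivot = arr[hi];
    size_t i = lo;
    for (size_t j = lo; j < hi; j++) {
        if (arr[j] <= pivot) {
            swap_ints(&arr[i], &arr[j]);
            i++;
        }
    }
    swap_ints(&arr[i], &arr[hi]);
    return i;
}

/* Recursively sort the subarray arr[lo..hi]. */
static void quicksort_range(int arr[], size_t lo, size_t hi)
{
    if (lo >= hi) {
        return;                             /* zero or one element */
    }
    size_t p = partition(arr, lo, hi);
    if (p > lo) {
        quicksort_range(arr, lo, p - 1);    /* guard: p - 1 must not wrap */
    }
    quicksort_range(arr, p + 1, hi);
}

/* Sort arr[0..n-1] in place. */
void quicksort_ints(int arr[], size_t n)
{
    if (n > 1) {
        quicksort_range(arr, 0, n - 1);
    }
}
```

A call such as quicksort_ints(data, count) sorts the array in place. With this pivot choice, an already sorted input produces the deepest recursion, which is one reason production sorts choose pivots more carefully.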
A control flow description of the algorithm would lead to a design that depends heavily on the decomposition of the program’s logic. Think of implementing the recursive quicksort algorithm in a language, such as early versions of FORTRAN or BASIC, that does not support recursion. The logic of the design would be very important. Even where recursion is available, too large a data set might mean too many levels of recursive function calls if a purely recursive algorithm is used. This control flow view might point out some limitations of the design of a simple quicksort algorithm.
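One way to see what the control flow view demands is to remove the recursion and manage the pending subarrays explicitly. The following C sketch keeps a stack of (lo, hi) index pairs in heap memory. The function names, the Lomuto partition scheme, and the simple stack size bound are choices made for this illustration only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Partition arr[lo..hi] around arr[hi] (Lomuto scheme) and
   return the final position of the pivot. */
static size_t partition_range(int arr[], size_t lo, size_t hi)
{
    int pivot = arr[hi];
    size_t i = lo;
    for (size_t j = lo; j < hi; j++) {
        if (arr[j] <= pivot) {
            int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
            i++;
        }
    }
    int tmp = arr[i]; arr[i] = arr[hi]; arr[hi] = tmp;
    return i;
}

/* Sort arr[0..n-1] without recursion.
   Returns 0 on success, -1 if the explicit stack cannot be allocated. */
int quicksort_iterative(int arr[], size_t n)
{
    if (n < 2) {
        return 0;
    }

    /* Pending subarrays are disjoint and hold at least two elements,
       so 2 * n slots is a conservative bound on the stack size. */
    size_t *stack = malloc(2 * n * sizeof *stack);
    if (stack == NULL) {
        return -1;
    }

    size_t top = 0;
    stack[top++] = 0;
    stack[top++] = n - 1;

    while (top > 0) {
        size_t hi = stack[--top];
        size_t lo = stack[--top];
        size_t p = partition_range(arr, lo, hi);

        if (p > lo + 1) {            /* left part has two or more elements */
            stack[top++] = lo;
            stack[top++] = p - 1;
        }
        if (p + 1 < hi) {            /* right part has two or more elements */
            stack[top++] = p + 1;
            stack[top++] = hi;
        }
    }

    free(stack);
    return 0;
}
```

The loop and the explicit stack make visible the bookkeeping that the recursive version leaves to the run-time system, and the allocation failure path shows where a very large data set would cause trouble.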
A data flow description of the algorithm would emphasize the data movement between arrays and the two smaller subarrays that are created at each iteration of the quicksort algorithm. Attention to the details of setting aside storage locations for these subarrays might lead to a consideration of what to do if there is no room in memory. Most courses in data structures ignore the effects of data sets that are too large to fit into memory. However, such sets occur often in applications, and efficient sorting programs must treat them carefully.
A third view can be obtained by considering a purely procedural solution. In this case, a standard library function can be used. For example, the standard C library has a function called qsort(), which takes an array argument and produces a sorted array in the same location. That is, the input array is overwritten by the output array.
A user of the qsort() function in the standard C library must provide a comparison function to do comparisons between array elements. The function prototype for this comparison function, compare(), is
int compare(const void *element1, const void *element2);
This user-defined comparison function takes two arguments (which point to arbitrary array elements) and returns 0 if the two elements compare as equal. The comparison function must return a negative value (conventionally –1) if the first argument is “less than” the second, and a positive value (conventionally 1) otherwise. The number of elements in the array and the size of an array element must also be provided to qsort().
The qsort() function in the standard C library is accessed by including the header file stdlib.h within a C program. This function has the syntax
void qsort(
void *base, size_t num_elts, size_t elt_size,
int (*compare)(const void *, const void *));
and the typical usage is
qsort(arr, num_elts, elt_size, compare);
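Because qsort() has a void return type, the call stands alone as a statement and the sorted data is found in the original array. A minimal sketch for an array of int follows. The names compare_ints() and sort_ints() are hypothetical; only qsort() itself comes from the standard library.

```c
#include <stdlib.h>

/* Comparison function with the signature that qsort() expects.
   Each argument points to one array element. */
int compare_ints(const void *element1, const void *element2)
{
    int x = *(const int *)element1;
    int y = *(const int *)element2;

    /* Yields -1, 0, or 1; avoids the overflow that x - y could cause. */
    return (x > y) - (x < y);
}

/* Sort an array of int in place using the library routine. */
void sort_ints(int arr[], size_t num_elts)
{
    qsort(arr, num_elts, sizeof arr[0], compare_ints);
}
```

The design choice here is that the caller supplies only the element size and the ordering rule. The library supplies the algorithm, which the C standard does not require to be quicksort despite the name.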
Finally, an object-oriented approach would most likely use a general class for the array of elements and invoke a standard member function for sorting. Such a function typically would be found in a class library for some abstract class. An example of this can be found in the class libraries that are provided with most C++ software development systems.
A general class in an object-oriented system will contain the methods that can be applied to the object represented by the class. In C++, the methods that are associated with an object are called the “member functions” of the class. In the case of C++, a class description is likely to contain what are called “templates,” or general classes in which the specific type of a member of the class is only relevant when the class is used. Thus a template class can refer to an abstract array of integers, character strings, or any other relevant type for which the class operations can make sense.
Provided that the methods of the template class can be implemented for the particular type of object in the class, the general method known as a sorting method can be invoked by simply calling a member function of an object as in
A.sort();
Here, the type of the object A is compatible with the template type and the sort() member function for that object is used.
Even for the simple case of a sorting algorithm, we have seen several different approaches that can be used for software system design. Each has advantages and disadvantages. We have not examined any of the disadvantages in detail. In particular, we have never considered the question of the efficiency of software written according to any of the designs given previously. For example, a slow sorting algorithm, or one that uses recursion, would be completely inadequate for many software applications.
There is one other commonly used method for describing a system. This involves pseudocode that provides an English-like (at least in English-speaking software development environments) description of the system. The pseudocode is successively refined until the implementation of the source code is straightforward, at least in theory. Note that pseudocode is a nongraphical notation. An example of pseudocode is shown in Example 4.1. The pseudocode describes a portion of an authentication system for password protection in a hypothetical computer system.
Example 4.1: A Pseudocode Description of a Hypothetical Computer System
GET login_id as input from keyboard
Compare Input to entries in login_id database
IF not a match THEN
    WAIT 10 seconds
    SET error_count to 1
    REPEAT
        PROMPT user for new login_id
        IF login_id matches database
            THEN PROMPT for password
            ELSE increment error_count
                 WAIT 10 seconds
        END IF
        IF error_count > 3 EXIT password PROMPT
    END REPEAT
ELSE
    GET password as input from keyboard
    IF error THEN EXIT
    ELSE
        BEGIN login process
    END IF
END IF
Pseudocode representations have two major advantages over the graphical ones: they can be presented as textual information in ASCII files, and pseudocode descriptions of large systems are no more complicated to represent than those of small systems. Of course, pseudocode representations may be so long and have so many levels of nesting in their statement outline that they are extremely difficult to understand.
You should note one other advantage of pseudocode representations when used for designs: they can produce instant internal documentation of source code files. We will address this point in the exercises.
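The following C sketch suggests how lines of pseudocode from Example 4.1 can survive as comments once the design is refined into source code. It is a simplification under stated assumptions: login attempts come from an array rather than the keyboard, the WAIT steps are omitted, and the names login_id_matches(), find_valid_attempt(), and MAX_ERRORS are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ERRORS 3    /* from "IF error_count > 3" in the pseudocode */

/* Compare Input to entries in login_id database */
int login_id_matches(const char *login_id,
                     const char *const database[], size_t num_entries)
{
    for (size_t i = 0; i < num_entries; i++) {
        if (strcmp(login_id, database[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Returns the index of the first attempt that matches the database,
   or -1 if too many errors occur or the attempts run out. */
int find_valid_attempt(const char *const attempts[], size_t num_attempts,
                       const char *const database[], size_t num_entries)
{
    int error_count = 0;

    /* REPEAT */
    for (size_t i = 0; i < num_attempts; i++) {
        /* IF login_id matches database THEN PROMPT for password */
        if (login_id_matches(attempts[i], database, num_entries)) {
            return (int)i;
        }

        /* ELSE increment error_count (WAIT 10 seconds omitted here) */
        error_count++;

        /* IF error_count > 3 EXIT password PROMPT */
        if (error_count > MAX_ERRORS) {
            return -1;
        }
    }
    /* END REPEAT */

    return -1;
}
```

Each comment is a line of the design, so a reader can check the code against the pseudocode statement by statement.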
The design representations described in this section by no means exhaust the number of representations available.
A more complex design representation is available when using the Department of Defense Architectural Framework (DoDAF) processes. The framework consists of seven different “Viewpoints,” each of which includes multiple views. The seven DoDAF Viewpoints (Department of Defense Architectural Framework, 2009) are
1. Capability Viewpoint
2. Data and Information Viewpoint
3. Operational Viewpoint
4. Project Viewpoint
5. Services Viewpoint
6. Standards Viewpoint
7. Systems Viewpoint
There is also a combined Viewpoint called “All Viewpoint.” Each of these Viewpoints consists of several elements. For example, the Operational Viewpoint requires at least seven distinct model descriptions. Keeping these Viewpoints consistent for a DoDAF representation for any nontrivial system clearly requires the use of a high-quality CASE tool, such as System Architect from Telelogic AG. Such CASE tools are expensive and geared to very large-scale systems. They are unlikely to be available for general instruction at most colleges or universities. Consult the references for more information.