10.1 Data modules


So far we have considered only three different kinds of program units, namely the main program, subroutines, and functions. Program execution always starts with a main program, from which subroutines and functions (collectively called “procedures”) may be called. Procedures may call on other procedures in turn. Whenever a procedure is called, a list of arguments is used to pass data in either direction between the two program units, the calling and the called. In the case of functions, of course, the function name itself transfers data back from the function. If a procedure is written to carry out a simple self-contained operation, like calculating a simple mathematical function, then data transmission by argument passing may be good enough, but in general it can be very useful, if not essential, to have a way in which program units can share larger sets of data in a more flexible and open way. Modules provide a method of doing this.

Normally a name given to a variable (or to a named constant) is meaningful only within a particular program unit. A declaration like

CHARACTER(80) :: line(60)

is only known to the main program or procedure in which it appears. If this statement were to be repeated in different program units, the processor would assume that it referred to different things, and distinct areas of memory would be allocated. In other words, the names of variables in Fortran are usually “local” entities.
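The locality of names can be seen in a minimal sketch like the one below. The program and subroutine names (Local_Names, Other) are invented for illustration; each unit declares its own array called line, and the two arrays are unrelated.

```fortran
PROGRAM Local_Names
   CHARACTER(80) :: line(60)        ! local to the main program
   line(1) = "set in the main program"
   CALL Other
   ! The subroutine changed only its own array, so this one is untouched
   PRINT *, TRIM(line(1))
END PROGRAM Local_Names

SUBROUTINE Other
   CHARACTER(80) :: line(60)        ! a different array with the same name
   line(1) = "set in the subroutine"
END SUBROUTINE Other
```

When run, this should print the text assigned in the main program, because the assignment inside Other affects a separate area of memory.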

Using modules, however, it becomes possible for the same sets of data to be accessible to a number of different program units.

Suppose, as an example, that a program needs to use an array of character strings representing a page of text together with some integer and logical data items related to it. The form of the data might be specified by the declaration statements

CHARACTER(80) :: line(60)

INTEGER :: linelength, linesperpage, numpage

LOGICAL :: checkin, checkascii, checkspell, checkout

To have access to this data in different procedures, these declaration statements could be encapsulated in another kind of program unit called a “module”, consisting simply of the statements above topped and tailed by MODULE and END MODULE statements:

MODULE Textpage

CHARACTER(80) :: line(60)

INTEGER :: linelength, linesperpage, numpage

LOGICAL :: checkin, checkascii, checkspell, checkout

END MODULE Textpage

The module can then be invoked in any other program unit simply by supplying the statement

USE Textpage

The name Textpage is of course arbitrary. The general form of the MODULE statement is simply the keyword MODULE followed by a name that may be chosen according to the usual Fortran rules, like the name of a variable or a procedure. The END MODULE statement is similar, just like an END FUNCTION or END SUBROUTINE statement, and it is not compulsory to repeat the module's name; in fact END by itself would suffice. As will be seen later, the USE statement can be a little more complicated than is indicated here, but basically it consists of the keyword USE followed by the name of a module.
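As a sketch of how this fits together, the following complete program shares the Textpage data between a subroutine and the main program. The names Fill_Page and Show_Page and the text stored in the array are assumptions made for the example.

```fortran
MODULE Textpage
   CHARACTER(80) :: line(60)
   INTEGER :: linelength, linesperpage, numpage
   LOGICAL :: checkin, checkascii, checkspell, checkout
END MODULE Textpage

SUBROUTINE Fill_Page                 ! hypothetical procedure
   USE Textpage
   line(1) = "First line of the page"
   linelength = LEN_TRIM(line(1))    ! length without trailing blanks
   linesperpage = 1
   numpage = 1
END SUBROUTINE Fill_Page

PROGRAM Show_Page                    ! hypothetical main program
   USE Textpage
   CALL Fill_Page                    ! no arguments: data travels via the module
   PRINT *, "Page", numpage, "holds", linesperpage, "line(s)"
   PRINT *, line(1)(1:linelength)
END PROGRAM Show_Page
```

Both units refer to the same line, linelength, linesperpage and numpage, so the values set in Fill_Page are visible in the main program without any argument list. In a single source file the module has to appear before the program units that USE it.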

A program may include many different modules as long as they have different names. A particular program unit may invoke a number of modules by having a series of USE statements. USE statements are non-executable and they must appear at the very beginning of a program unit immediately following the PROGRAM (or SUBROUTINE, etc.) statement and before any other non-executable statements. A module may itself invoke a further module, e.g. we could have

MODULE Textpage

USE Language

CHARACTER(80) :: line(60)

INTEGER :: linelength, linesperpage, numpage

LOGICAL :: checkin, checkascii, checkspell, checkout

END MODULE Textpage

but it is not permissible for a module to invoke itself, directly or indirectly. So, the module Language may not have a USE Textpage statement, or a circularity would be created.
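The chain of modules might be sketched as follows. The contents of Language are not given in the text, so the variable language_name and the program Show_Language are purely hypothetical.

```fortran
MODULE Language
   ! Hypothetical content: the text does not say what Language holds
   CHARACTER(12) :: language_name = "English"
END MODULE Language

MODULE Textpage
   USE Language                      ! one module invoking another
   CHARACTER(80) :: line(60)
   INTEGER :: linelength, linesperpage, numpage
   LOGICAL :: checkin, checkascii, checkspell, checkout
END MODULE Textpage

PROGRAM Show_Language
   USE Textpage
   ! language_name is reached through Textpage, which USEs Language
   PRINT *, TRIM(language_name)
END PROGRAM Show_Language
```

Because the entities of a module are accessible by default, a program unit that has USE Textpage can also see what Textpage obtained from Language. Language must be compiled first, which is one practical reason why a circular chain of USE statements cannot work.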

The next example is a data module containing information about metals, and it shows that a module can be used to hold a database of fairly complex design. It must be stressed that this module does not actually “do” anything. It has no executable statements. Its purpose is to declare the structure of a set of data that can subsequently be utilized in other program units.

MODULE Metals

INTEGER :: number_of_metals, namelengths(100)

CHARACTER(20) :: metal_name(100)

REAL :: weight(100), density_0(100), density_100(100), &

tempmelt(100), conductivity_0(100), &

conductivity_100(100), pricerange(2, 100)

LOGICAL :: data_has_been_read_in

END MODULE Metals

This could be invoked (by the statement USE Metals) in different subroutines to carry out tasks like reading the data from a disk file, modifying the data, rewriting the disk file, and making calculations that need access to this dataset. For example, to estimate the conductivity of a particular metal at a particular temperature by linear interpolation between its values at 0 and 100 degrees, we could write a function such as

FUNCTION Conductivity (metal, temperature)

USE Metals

REAL :: Conductivity

REAL, INTENT(IN) :: temperature

REAL :: c0, c100

CHARACTER(20), INTENT(IN) :: metal

INTEGER :: i ! loop index over the stored metals

Conductivity = -99.0 ! default function value for an

! unrecognized metal

DO i = 1, number_of_metals

IF (metal /= metal_name(i)) CYCLE

c0 = conductivity_0(i)

c100 = conductivity_100(i)

Conductivity = c0 + temperature*(c100 - c0)/100.0

EXIT

END DO

END FUNCTION Conductivity

This is obviously much simpler than transmitting all the data through a long list of arguments.
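To try the function out, a small driver along the following lines could be used. The program name Try_Conductivity, the metal name "Testium" and its conductivity figures are placeholders invented for the test, not real physical data. The module and function are repeated so that the example is complete in itself.

```fortran
MODULE Metals
   INTEGER :: number_of_metals, namelengths(100)
   CHARACTER(20) :: metal_name(100)
   REAL :: weight(100), density_0(100), density_100(100), &
           tempmelt(100), conductivity_0(100), &
           conductivity_100(100), pricerange(2, 100)
   LOGICAL :: data_has_been_read_in
END MODULE Metals

FUNCTION Conductivity (metal, temperature)
   USE Metals
   REAL :: Conductivity
   REAL, INTENT(IN) :: temperature
   CHARACTER(20), INTENT(IN) :: metal
   REAL :: c0, c100
   INTEGER :: i
   Conductivity = -99.0              ! value returned for an unknown metal
   DO i = 1, number_of_metals
      IF (metal /= metal_name(i)) CYCLE
      c0 = conductivity_0(i)
      c100 = conductivity_100(i)
      Conductivity = c0 + temperature*(c100 - c0)/100.0
      EXIT
   END DO
END FUNCTION Conductivity

PROGRAM Try_Conductivity             ! hypothetical driver
   USE Metals
   REAL, EXTERNAL :: Conductivity
   CHARACTER(20) :: name
   ! Placeholder data: one invented metal with made-up values
   number_of_metals = 1
   metal_name(1) = "Testium"
   conductivity_0(1) = 10.0
   conductivity_100(1) = 20.0
   data_has_been_read_in = .TRUE.
   name = "Testium"
   PRINT *, Conductivity(name, 50.0) ! halfway between 10.0 and 20.0
   name = "Unknownium"
   PRINT *, Conductivity(name, 50.0) ! not in the table
END PROGRAM Try_Conductivity
```

With these placeholder values the first call should give 15.0 and the second the default of -99.0. The function itself receives only the two arguments it needs, while the table of metals reaches it through the module.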

With a module, a large or complex set of data need only be designed and declared once. This saves memory, keeps programs shorter, avoids error, and avoids passing clumsy long lists of arguments to procedures. In fact subroutines need have no arguments at all. A main program could simply look like

PROGRAM Economic_Prediction

CALL Startup

CALL Calculate

CALL Display

END PROGRAM Economic_Prediction

with the associated subprograms being, say,

SUBROUTINE Startup

USE Basedata

USE Workspace

…

SUBROUTINE Calculate

USE Workspace

USE Results

…

SUBROUTINE Display

USE Results

USE Output_Formats

…

MODULE Basedata…

MODULE Workspace…

MODULE Results…

MODULE Output_Formats…

In this example the four modules could contain all the data declarations and the subroutines could contain only executable statements after the USE statements. The program as a whole consists of eight program units: the main program, three subroutines, and four modules. This sort of design can give the programmer great flexibility: for example, the module called Output_Formats could be written in two or three completely different versions to interface with different display devices that could be connected to the processor.
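One way of filling in the outline is sketched below. Everything inside the modules and subroutines (the variable names, the figures, the format string and the trivial calculation) is invented for illustration, since the text leaves those details open.

```fortran
MODULE Basedata
   ! Made-up starting figures
   REAL, PARAMETER :: base_value = 100.0, base_rate = 0.05
END MODULE Basedata

MODULE Workspace
   REAL :: start_value, growth_rate
   SAVE            ! keep values between calls; the main program has no USE
END MODULE Workspace

MODULE Results
   REAL :: predicted
   SAVE
END MODULE Results

MODULE Output_Formats
   CHARACTER(*), PARAMETER :: result_format = "(A, F8.2)"
END MODULE Output_Formats

SUBROUTINE Startup
   USE Basedata
   USE Workspace
   start_value = base_value          ! copy the base data into the workspace
   growth_rate = base_rate
END SUBROUTINE Startup

SUBROUTINE Calculate
   USE Workspace
   USE Results
   predicted = start_value*(1.0 + growth_rate)
END SUBROUTINE Calculate

SUBROUTINE Display
   USE Results
   USE Output_Formats
   WRITE (*, result_format) "Predicted value:", predicted
END SUBROUTINE Display

PROGRAM Economic_Prediction
   CALL Startup
   CALL Calculate
   CALL Display
END PROGRAM Economic_Prediction
```

With these figures the predicted value comes out at about 105. Swapping in a different version of Output_Formats would change how the result is displayed without touching Calculate or the main program.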

In a module, variables should be declared with the SAVE attribute if their values are intended to be preserved between calls to program units that USE the module (and if the module is not perpetually in USE through the main program). If a module is not perpetually in USE from somewhere or other, unSAVEd variables could become undefined.
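A minimal sketch of the situation is given below, with invented names throughout (Counter_Data, Reset_Count, Count_Call, Report_Count, Save_Demo). The main program has no USE statement, so the module is only in use while one of the subroutines is executing, and the SAVE statement is what guarantees that the counter survives from one call to the next.

```fortran
MODULE Counter_Data                  ! hypothetical module
   INTEGER :: ncalls
   SAVE                              ! applies to every variable in the module
END MODULE Counter_Data

SUBROUTINE Reset_Count
   USE Counter_Data
   ncalls = 0
END SUBROUTINE Reset_Count

SUBROUTINE Count_Call
   USE Counter_Data
   ncalls = ncalls + 1               ! relies on the value left by earlier calls
END SUBROUTINE Count_Call

SUBROUTINE Report_Count
   USE Counter_Data
   PRINT *, "Count_Call was invoked", ncalls, "times"
END SUBROUTINE Report_Count

PROGRAM Save_Demo
   INTEGER :: k
   CALL Reset_Count
   DO k = 1, 3
      CALL Count_Call
   END DO
   CALL Report_Count                 ! reports 3 calls
END PROGRAM Save_Demo
```

The same effect could be had by writing INTEGER, SAVE :: ncalls, or by adding USE Counter_Data to the main program so that the module is in use throughout execution.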

Modules encourage a systematic approach to the design, management, and use of a program or system of programs. When important data is kept in modules, the programmer’s work can become focused more on the careful design of the data structures rather than on procedures. Modules take on a life of their own, central to how we perceive the program, while procedures can be regarded as mere ancillaries that carry out operations on the data modules. In many application areas that depend on large data sets, this sort of programming style (“declarative” or “data-oriented” programming) is far more appropriate than the “stream-of-consciousness” style that concentrates on the flow of control between subroutines and functions.

This section was headed Data modules and we have seen that a module may contain a number of data declarations. The declarations may be type declaration statements of any kind. We shall see later that modules may also contain declarations of derived types (Chapter 12), procedure interfaces (Chapter 11) and namelist groups (Appendix A). They may also include procedures, as explained in the next section.

COMMON

There is an alternative to using a data module for “sharing” data between different program units. The alternative, an old construction that goes back to Fortran’s early days, is the “COMMON block”. An example of the syntax is

COMMON /Pool/ rnums(10), arr(3, 3, 3), intflag

Between slashes, Pool is the (arbitrary) name of the common block and can be used in different program units to refer to a single shared area of memory. Data are to be found there according to the ordered list of names and array shapes that follows: in this example, with the default implicit typing, two arrays of real numbers and one integer, with the local names rnums, arr and intflag. The data does not also need to appear in type declaration statements unless further attributes have to be specified. The data in a common block can have different local names in different program units, i.e. the names and array shapes that follow the slashes can vary, but they will everywhere refer to the same items of data.
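As an illustration, the sketch below refers to the block Pool from two program units under different local names. The names Common_Demo, Show_Pool, values, cube and iflag are invented for the example, and the types follow from implicit typing, as in the COMMON statement above.

```fortran
PROGRAM Common_Demo
   COMMON /Pool/ rnums(10), arr(3, 3, 3), intflag
   rnums = 1.5                       ! fills all ten elements
   arr = 0.0
   intflag = 7
   CALL Show_Pool
END PROGRAM Common_Demo

SUBROUTINE Show_Pool
   ! Same block, different local names; sizes, types and order match
   COMMON /Pool/ values(10), cube(3, 3, 3), iflag
   PRINT *, values(1), iflag         ! the values stored by the main program
END SUBROUTINE Show_Pool
```

The subroutine should print 1.5 and 7, since values and iflag occupy the same storage as rnums and intflag. Nothing checks that the two lists agree, so keeping them consistent is left to the programmer, whereas a module declares the data only once.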

The name of the common block can be omitted, in which case the statement refers to a special area of memory known as “blank common”. The syntax is like this:

COMMON // rnums(10), arr(3,3,3), intflag

10.2 Exercises 10.A

10.

Write a data module to contain in a convenient form the names of the days of the week, the names of the months, and the usual numbers of days in the months.

10.

Write a data module to contain the mathematical constants π and e and the first 20 powers of 2.

10.

Write a data module containing a two-dimensional “spreadsheet” of 30 rows and 10 columns, each cell containing data in the form of a string of 12 characters. An ancillary array is to contain single-character indicators to say which cells contain values and whether the values are to be interpreted simply as words or as integers or real numbers.

