Arrays and data manipulation
9.2 Vector subscripts
Section 4.4 dealt with “array sections”, i.e. the notation for picking out a subset of the elements of an array.
This was done with a colon notation, as in the simple rank-one example
decade(10:1:-1)
The three integers, separated by colons, form what is known as a “subscript triplet”, specifying a starting index, a final index, and a “stride” (Section 4.6). “Vector subscripts” are a further tool for forming array sections, permitting a set of indices to be specified in any order. The idea is that, instead of a subscript triplet, a rank-one integer array is given that contains the desired indices in the desired order. Thus, if we have an array
npick = (/10, 9, 8, 7, 6, 5, 4, 3, 2, 1/)
then the array section mentioned above could be specified equally well by
decade(npick)
In other words, an array may have an index that is itself a rank-one array. In decade(npick), if npick were a scalar integer then we would just have a single element of the array; but because npick is an array, so is decade(npick). Another example: with the vector subscript
u = (/4, 9, 1, 6/)
the array section x(u) consists of the elements x(4), x(9), x(1) and x(6), in that order.
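The two examples above can be checked with a minimal sketch along the following lines. The program name and the values placed in decade and x are assumptions made purely for illustration.

```fortran
PROGRAM vector_subscript_demo
  IMPLICIT NONE
  INTEGER :: decade(10), npick(10), u(4), k
  REAL :: x(10)

  ! Illustrative data (an assumption): decade(k) = 1990 + k, x(k) = k/10
  decade = (/ (1990 + k, k = 1, 10) /)
  x = (/ (REAL(k) / 10.0, k = 1, 10) /)

  npick = (/10, 9, 8, 7, 6, 5, 4, 3, 2, 1/)
  u = (/4, 9, 1, 6/)

  ! The subscript-triplet section and the vector-subscripted section
  ! contain the same elements in the same order, so this prints T.
  WRITE (*,*) ALL(decade(10:1:-1) == decade(npick))

  ! x(u) is a size-four array: x(4), x(9), x(1), x(6) in that order.
  WRITE (*,*) x(u)
END PROGRAM vector_subscript_demo
```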
A vector subscript may be any integer expression of rank one, as long as its elements are within the bounds of the array of which it is a subscript. With the arrays
INTEGER :: matrices(4, 4, 200), triad(3), mum(3, 3)
and if, say,
triad = (/196, 2, 34/)
then
mum = matrices(3:1:-1, 2, triad)
is the rank-two array with elements
mum(1, 1) = matrices(3, 2, 196)
mum(2, 1) = matrices(2, 2, 196)
mum(3, 1) = matrices(1, 2, 196)
mum(1, 2) = matrices(3, 2, 2)
mum(2, 2) = matrices(2, 2, 2)
mum(3, 2) = matrices(1, 2, 2)
mum(1, 3) = matrices(3, 2, 34)
mum(2, 3) = matrices(2, 2, 34)
mum(3, 3) = matrices(1, 2, 34)
As this example illustrates, it is possible to have an array section in which different dimensions have different forms of subscript, i.e. we can have a mixture of single subscripts, subscript triplets, and vector subscripts.
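A short sketch of this mixed-subscript section follows. The values stored in matrices are an assumption chosen so that each element records its own indices; only the declarations and the section itself come from the text.

```fortran
PROGRAM mixed_subscripts_demo
  IMPLICIT NONE
  INTEGER :: matrices(4, 4, 200), triad(3), mum(3, 3)
  INTEGER :: i, j, k

  ! Illustrative fill (an assumption): encode the three indices in the value
  DO k = 1, 200
    DO j = 1, 4
      DO i = 1, 4
        matrices(i, j, k) = 10000*i + 1000*j + k
      END DO
    END DO
  END DO

  triad = (/196, 2, 34/)

  ! Subscript triplet in dimension 1, single subscript in dimension 2,
  ! vector subscript in dimension 3: the section has rank two.
  mum = matrices(3:1:-1, 2, triad)

  ! Spot-check two of the elements listed in the text; both print T.
  WRITE (*,*) mum(1, 1) == matrices(3, 2, 196)
  WRITE (*,*) mum(3, 3) == matrices(1, 2, 34)

  ! The single subscript contributes no dimension: the shape is 3 by 3.
  WRITE (*,*) SHAPE(matrices(3:1:-1, 2, triad))
END PROGRAM mixed_subscripts_demo
```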
A multidimensional array may have more than one vector subscript. For example, if
isquare = RESHAPE(SOURCE=(/ (k**2, k=1, 64) /), &
SHAPE=(/8, 8/))
and we have the vector subscript
k3 = (/1, 2, 3/)
then
knine = RESHAPE(SOURCE=isquare(k3, k3), SHAPE=(/9/))
is the rank-one array (/1, 4, 9, 81, 100, 121, 289, 324, 361/). Here, isquare is an 8×8 array of integers, and isquare(k3, k3) is an array section with two vector subscripts that happen to be equal.
A vector subscript may have more than one element of the same value, in which case it forms what is called a “many-one” array section. For example, given the array of single characters
chars = (/("abcdefghijklmnopqrstuvwxyz_"(i:i), i=1, 27)/)
and the vector subscript given by
name = (/13, 1, 18, 20, 9, 14, 27, 10, 1, 13, 5, 19, 27, 3, 15, &
21, 14, 9, 8, 1, 14/)
then chars(name) is a many-one array section consisting of 21 single-character elements such that
WRITE (*,*) chars(name)
spells out
martin_james_counihan
If a vector subscript has many repeated elements, the resulting array section could be much greater in size than the array of which it is a section.
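As a minimal sketch, the many-one example can be run as follows. An explicit format is used here instead of list-directed output so that the characters are printed contiguously, and the final statement (an illustrative addition, not from the text) shows a section that is larger than its parent array.

```fortran
PROGRAM many_one_demo
  IMPLICIT NONE
  CHARACTER(1) :: chars(27)
  INTEGER :: name(21), i

  chars = (/ ("abcdefghijklmnopqrstuvwxyz_"(i:i), i = 1, 27) /)
  name = (/13, 1, 18, 20, 9, 14, 27, 10, 1, 13, 5, 19, 27, 3, 15, &
           21, 14, 9, 8, 1, 14/)

  ! Prints martin_james_counihan
  WRITE (*, "(21A1)") chars(name)

  ! A vector subscript made of 100 copies of the index 1 gives a
  ! size-100 section of the size-27 array.
  WRITE (*,*) SIZE(chars((/ (1, i = 1, 100) /)))
END PROGRAM many_one_demo
```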
There are just a few restrictions on the use of vector subscripts: in particular, a many-one array section may not appear as the variable on the left side of an assignment statement and an array with a vector subscript may not be used as the argument of a procedure that will redefine it (INTENT(IN) should be used). Also, a pointer assignment statement (explained later in this book) may not have as its target an array section with a vector subscript.
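The first two restrictions can be illustrated by a small sketch. The array values and the function total are hypothetical, introduced only for this example; the forbidden assignment is left commented out.

```fortran
PROGRAM vector_subscript_rules
  IMPLICIT NONE
  REAL :: x(5)
  INTEGER :: perm(3), dup(3)

  x = (/1.0, 2.0, 3.0, 4.0, 5.0/)
  perm = (/5, 1, 3/)      ! all indices distinct
  dup  = (/2, 2, 4/)      ! index 2 repeated: a many-one section

  ! Allowed: no element of x is assigned more than once.
  x(perm) = (/10.0, 20.0, 30.0/)
  WRITE (*,*) x           ! x is now 20, 2, 30, 4, 10

  ! Allowed: a many-one section may be referenced in an expression
  ! and passed to a procedure that does not redefine it.
  WRITE (*,*) total(x(dup))   ! 2 + 2 + 4 = 8

  ! Not allowed: a many-one section as the variable being assigned,
  ! since x(2) would receive two different values.
  ! x(dup) = (/1.0, 2.0, 3.0/)

CONTAINS

  ! Hypothetical helper: the dummy argument is INTENT(IN), so a
  ! vector-subscripted actual argument is acceptable.
  FUNCTION total(a)
    REAL, INTENT(IN) :: a(:)
    REAL :: total
    total = SUM(a)
  END FUNCTION total
END PROGRAM vector_subscript_rules
```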
There are other games we can play. An array may have itself as a vector subscript, so if, say,
z = (/1, 3, 4, 2/)
then z(z) is (/1, 4, 2, 3/) and z(z(z)) is (/1, 2, 3, 4/).
Also, a vector subscript does not have to take the form of an array constructor: it could be a named array variable or even a rank-one array-valued function result as in the expressions
r(locations_of_cells)
r(Indices_of_Positive_Values(r))
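The text does not say how Indices_of_Positive_Values is written. One possible sketch, assuming it returns the positions of the positive elements and using the PACK intrinsic, is given below. The use of COUNT in the declaration of the result assumes a Fortran 95 or later compiler.

```fortran
PROGRAM function_subscript_demo
  IMPLICIT NONE
  REAL :: r(6)

  r = (/-1.0, 2.5, 0.0, 4.0, -3.0, 0.5/)

  ! The function result is a rank-one integer array, so it can serve
  ! directly as a vector subscript. This selects r(2), r(4) and r(6).
  WRITE (*,*) r(Indices_of_Positive_Values(r))

CONTAINS

  ! Hypothetical implementation: the indices k for which a(k) > 0.
  FUNCTION Indices_of_Positive_Values(a) RESULT(idx)
    REAL, INTENT(IN) :: a(:)
    INTEGER :: idx(COUNT(a > 0.0))
    INTEGER :: k
    ! PACK keeps those entries of 1, 2, ..., SIZE(a) where the mask is true.
    idx = PACK((/ (k, k = 1, SIZE(a)) /), a > 0.0)
  END FUNCTION Indices_of_Positive_Values
END PROGRAM function_subscript_demo
```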
9.3 Bits
The Fortran language allows for a number of different data types, described in Section 8.4, including derived data types constructed out of the basic types real, integer, character, etc.
In addition, it is possible to manipulate data in the form of sequences of binary bits, each bit having the value 0 or 1. In a way, LOGICAL data is binary and a logical array is rather like a sequence of bits, but in fact a data element declared LOGICAL is not normally stored in a single memory bit. So, "bits" in Fortran have nothing directly to do with LOGICAL data. Instead, an item of data declared as INTEGER may be interpreted as a sequence of binary bits. One integer will contain a large number of bits: commonly 32 or 64.
The exact number for a particular processor can be ascertained by calling the intrinsic inquiry function BIT_SIZE.
When an integer is being regarded as a sequence of bits, ordinary integer arithmetic will generally be irrelevant. The integer’s value when it is taken as an ordinary integer will, in general, differ from its value when the bits are interpreted as a binary number. A Fortran INTEGER, of course, may be negative, but a sequence of binary digits read as a plain binary number cannot be negative. So, instead of the operations of ordinary integer arithmetic, a special set of intrinsic procedures exists so that bits (sometimes called “bitstream” data) can be manipulated. The details are set out in Appendix B.
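As a foretaste of Appendix B, here is a minimal sketch using a few of the standard bit intrinsics (BIT_SIZE, IBSET, BTEST, IAND and ISHFT). Bit positions are counted from 0 at the least significant end. The values chosen are illustrative, and the example stays clear of the topmost bit, whose interpretation is processor dependent.

```fortran
PROGRAM bit_demo
  IMPLICIT NONE
  INTEGER :: n, k

  ! Number of bits in a default integer (commonly 32)
  WRITE (*,*) BIT_SIZE(n)

  n = 0
  n = IBSET(n, 0)            ! set bit 0: binary ...0001
  n = IBSET(n, 3)            ! set bit 3: binary ...1001
  WRITE (*,*) n              ! 9

  WRITE (*,*) BTEST(n, 3), BTEST(n, 2)   ! T F
  WRITE (*,*) IAND(n, 12)    ! 1001 AND 1100 = 1000, i.e. 8
  WRITE (*,*) ISHFT(n, 2)    ! shift left two places: 100100, i.e. 36

  ! Print the lowest eight bits, most significant first: 00001001
  WRITE (*, "(8A1)") (MERGE("1", "0", BTEST(n, k)), k = 7, 0, -1)
END PROGRAM bit_demo
```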
9.4 Exercises 9.A
1 Given that r is an array declared by the TDS
REAL :: r(4)
which of the following expressions are arrays equal to r?
2 Using the information in Appendix B, write a function that will convert a length-32 logical array into the corresponding stream of 32 bits, assuming that BIT_SIZE(1)=32.
3 Write a function to convert a string of 32 bits (represented as an integer variable, and assuming BIT_SIZE(1)=32) into a positive hexadecimal number (in the form of a length-eight character string representing eight hexadecimal digits).
4 Use RESHAPE in an assignment statement setting the first 48 elements of an 8×8 matrix, in normal array element order, equal to the elements of a size-48 vector. The matrix is called chessboard and the vector is called six_rows. They are to be of logical type. The 16 elements of chessboard which are not filled by six_rows are to be filled with alternating .TRUE. and .FALSE. values.
5 If vowels = (/"a", "e", "i", "o", "u"/), khard = (/1, 4, 5/) and ksoft = (/2, 3/), what are the values of
(i) vowels(khard)
(ii) vowels(ksoft)
(iii) khard(ksoft)
(iv) vowels((/4, 2/))
(v) vowels(khard)//vowels(2)
9.5 Internal files; TRANSFER
Some parts of this section will not be completely understandable without reference to Appendix A, which deals with the input and output of data, and in particular to format specifiers (see Appendix A.2).
However, “internal” files are not really files at all: they are simply areas of the processor’s memory being treated as if they were sequential-access formatted files. Data can be written to or read from internal files using WRITE and READ statements, but there is no transfer of information outside the processor. Internal files can be very useful for changing the type interpretation of data: you can write to an internal file in one format, then read from it in another format.
Internal files can only be used with the READ and WRITE statements. All the other i/o statements are inapplicable to them, and “non-advancing” i/o is not possible (see Appendix A). An internal file is not only internal to the processor; it is internal to the program and cannot be used to transfer information between different programs or to save data from one program run to another. In fact, an internal file is a local entity that cannot even be used to transfer data between program units except by the usual methods of data association.
An internal file is no more than a character variable. To read or write an internal file, the unit number is replaced by the name of the character variable. An example should make the principle clear:
WRITE (UNIT=digit_string, FMT="(I6)") nref
This statement requires digit_string to have been previously declared in the usual way as a character variable of length six or more. The integer nref, assuming it has a defined value, is written into the string digit_string, taking up six character positions. If the length of digit_string is greater than six, it will be padded out with trailing blanks. The precise meaning of FMT="(I6)" is explained in Appendix A.
Subsequently,
READ (digit_string, "(5X, I1)") junits
will yield the last digit of the number. The statement
READ (digit_string, "(F6.0)") ref
will yield the original number but now as a variable of real type. If the integer nref had been, say, 524288, then digit_string would be given the value “524288”, junits would be the integer 8, and ref would be the real number 524288.0. Within the processor’s memory, the internal representations of nref, digit_string and ref would be quite different, since they are variables of different types.
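Put together as a complete program, the three statements might look like the sketch below. The declared length of eight for digit_string is an assumption, chosen to show the padding with trailing blanks.

```fortran
PROGRAM internal_file_demo
  IMPLICIT NONE
  CHARACTER(8) :: digit_string
  INTEGER :: nref, junits
  REAL :: ref

  nref = 524288

  ! Write the integer into the character variable, six positions wide;
  ! the remaining two positions are filled with blanks.
  WRITE (UNIT=digit_string, FMT="(I6)") nref
  WRITE (*,*) "[" // digit_string // "]"

  ! Skip five characters and read the sixth as a one-digit integer.
  READ (digit_string, "(5X, I1)") junits

  ! Read the same six characters as a real number.
  READ (digit_string, "(F6.0)") ref

  WRITE (*,*) junits      ! 8
  WRITE (*,*) ref         ! 524288.0, in a processor-dependent layout
END PROGRAM internal_file_demo
```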
The next example shows how an internal file WRITE statement can be used for concatenation:
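CHARACTER(10) :: title
CHARACTER(30) :: forename, midname, surname
CHARACTER(103) :: name
. . .
WRITE (name, "(A, 1X, A, 1X, A, 1X, A)") title, forename, &
midname, surname
This is equivalent to
name = title//" "//forename//" "//midname//" "//surname
Here name is given length 103 because the WRITE statement transfers 10+1+30+1+30+1+30 = 103 characters, and an internal-file record must be long enough to hold everything written to it; the assignment form, by contrast, would simply truncate a result that was too long.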
In the above examples, the internal files called digit_string and name were scalar variables regarded as having one record. If the internal file is an array, then each element of the array is regarded as a “record” of the internal file, taken in normal array element order. An internal file may be an array section, but not with a vector subscript.
If names is a character array, the statement
READ (names(1:2), "(A12)") forename, surname
reads a file of two records, each being a length-12 string. This statement involves “format reversion” and is equivalent to
forename = names(1); surname = names(2)
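except where the lengths involved differ from 12: the READ statement takes the first 12 characters of each record and, if forename or surname is shorter than 12, keeps the rightmost characters of that field, whereas the assignment statements always keep the leftmost characters. Moreover,
WRITE (names(1:2), "(A12)") forename, surname
is equivalent to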
names(1:2) = (/forename, surname/)
but the WRITE statement will give an error if the elements of names have lengths less than 12. Notice that, in our previous examples, it would be possible for digit_string, nref, junits, ref, title, forename etc. to be arrays as long as they are declared as such and as long as the internal files conform in shape with the data.
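A minimal sketch of an array used as an internal file follows, with all lengths set to 12 so that none of the length mismatches arise. The names stored are illustrative assumptions. The format (A12) holds one item, so format reversion starts a new record, i.e. moves to the next array element, for each item in the list.

```fortran
PROGRAM internal_array_demo
  IMPLICIT NONE
  CHARACTER(12) :: names(2), forename, surname

  forename = "Ada"
  surname  = "Lovelace"

  ! One record per array element: forename goes to names(1),
  ! surname to names(2).
  WRITE (names(1:2), "(A12)") forename, surname
  WRITE (*,*) names(1) // "|" // names(2) // "|"

  ! Read the two records back, again one item per record.
  forename = " "
  surname = " "
  READ (names(1:2), "(A12)") forename, surname
  WRITE (*,*) TRIM(forename), " ", TRIM(surname)
END PROGRAM internal_array_demo
```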
When data is input to a Fortran program (say, by reading a magnetic tape) it cannot normally be interpreted properly unless each data item is of a type known within the program. The data types that are read must correspond to the variables named in the input list. This can lead to difficulties when, as sometimes happens, the contents of a file are not known in advance. Suppose intin is a size-100 integer array and we have
READ(UNIT=7, FMT="(100I10)") intin
then what happens if in fact the file contains 100 real numbers? Or an unpredictable mixture of integer and real numbers? Fortunately Fortran provides an intrinsic function, TRANSFER (SOURCE, MOLD, SIZE), which can be used to change the type interpretation of a piece of data within the processor, i.e. when it has already been read in. For example, the above statement could be followed by
realin = TRANSFER(SOURCE = intin, MOLD = 0.0, SIZE = 100)
and this would put values into the real-valued array realin by reinterpreting the source data. It is very important to understand that TRANSFER does not convert data between types: the functions REAL, INT, etc. do that. TRANSFER leaves unchanged the bit-patterns within the processor's memory, but simply interprets them in a different way.
TRANSFER (SOURCE, MOLD, SIZE) gives a result whose internal representation is exactly the same as SOURCE, but whose type is that of MOLD. SOURCE may be an array. SIZE is an optional argument discussed below. TRANSFER works for any data type, including a derived type.
Normally if MOLD is scalar the result is a scalar, whereas if MOLD is an array the result will be a rank-one array of sufficient size to hold the contents of SOURCE. The result of TRANSFER is never an array of rank greater than one. For example,
TRANSFER(SOURCE=(1.0, 2.0), MOLD=(/0.0/))
produces a size-two, rank-one real array whose elements are the real and imaginary parts of the complex number (1.0,2.0).
The SIZE argument, if it is present, must be a scalar integer. Its effect is to ensure that the result is an array and to fix its size. If SIZE is specified and does not match the size of SOURCE, then in forming the function result it is possible that the trailing part of SOURCE may be lost or that the trailing part of the result may be left undefined. So,
TRANSFER(SOURCE=(/1.0, 2.0, 3.0, 4.0/), MOLD=(0.0, 0.0), &
SIZE=2)
produces a size-two complex array whose members are (1.0, 2.0) and (3.0, 4.0). The expression
TRANSFER((/-7.0, 5.1, 6.4/), (0.0, 0.0), 1)
has the value (/(-7.0, 5.1)/), while
TRANSFER((/-7.0, 5.1, 6.4/), (0.0, 0.0))
is a scalar with the value (-7.0, 5.1).
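The real and complex examples above can be collected into one short sketch. The variable names are assumptions; the TRANSFER references themselves are those of the text.

```fortran
PROGRAM transfer_demo
  IMPLICIT NONE
  REAL :: parts(2)
  COMPLEX :: z, zpair(2), zone(1)

  ! Array MOLD, no SIZE: a rank-one real array just big enough to
  ! hold the complex source, i.e. its real and imaginary parts.
  parts = TRANSFER(SOURCE=(1.0, 2.0), MOLD=(/0.0/))
  WRITE (*,*) parts

  ! Scalar MOLD with SIZE=2: a size-two complex array,
  ! (1.0, 2.0) and (3.0, 4.0).
  zpair = TRANSFER(SOURCE=(/1.0, 2.0, 3.0, 4.0/), MOLD=(0.0, 0.0), SIZE=2)
  WRITE (*,*) zpair

  ! SIZE=1 gives a size-one array; the trailing 6.4 is lost.
  zone = TRANSFER((/-7.0, 5.1, 6.4/), (0.0, 0.0), 1)
  WRITE (*,*) zone

  ! Scalar MOLD and no SIZE give a scalar result, (-7.0, 5.1).
  z = TRANSFER((/-7.0, 5.1, 6.4/), (0.0, 0.0))
  WRITE (*,*) z
END PROGRAM transfer_demo
```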
If the argument ch is of character type, TRANSFER(SOURCE=ch, MOLD=1) will give an integer-type result of processor-dependent value preserving the bit-pattern of ch; this could be used in conjunction with bit manipulation functions.
Incidentally, TRANSFER (SOURCE=array, MOLD=array) provides a neat way of replacing an array of any shape by the corresponding rank-one array containing the same elements. This could otherwise be done using the RESHAPE function.
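A brief sketch of this flattening trick, compared with the RESHAPE alternative, is given below. The array name grid and its contents are illustrative assumptions.

```fortran
PROGRAM flatten_demo
  IMPLICIT NONE
  INTEGER :: grid(2, 3), flat(6), k

  grid = RESHAPE(SOURCE=(/ (k, k = 1, 6) /), SHAPE=(/2, 3/))

  ! MOLD is an array and SIZE is absent, so the result is a rank-one
  ! array holding the six elements in array element order.
  flat = TRANSFER(SOURCE=grid, MOLD=grid)

  ! The same result obtained with RESHAPE; this prints T.
  WRITE (*,*) ALL(flat == RESHAPE(SOURCE=grid, SHAPE=(/6/)))
END PROGRAM flatten_demo
```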
The TRANSFER function means that a formatted READ statement can be made very flexible and the data unravelled later, e.g.
READ (UNIT=in, FMT="(A)") chararray
. . .
header = TRANSFER(SOURCE=chararray(1:5), MOLD="", &
SIZE=5)
length = TRANSFER(SOURCE=chararray(6:10), MOLD=1)
IF (length > 0 .AND. length < (recmax-7)) THEN
rdata = TRANSFER(SOURCE=chararray(11:), &
MOLD=(/0.0/), SIZE=length)
. . .
where header, length and rdata are of character, integer and real types respectively. A long program that reads records of different sorts from different devices could in this way use a single generalized READ statement.
Notice that
REAL :: r(1000)
INTEGER :: i(1000)
. . .
(read r from an external device)
. . .
i = TRANSFER(SOURCE=r, MOLD=1, SIZE=1000)
allows elements of the data read as real numbers into r to be interpreted instead as integers by referring to i instead of r. This code does a similar job to that sometimes done with the now obsolete EQUIVALENCE statement.
Initialization expressions may make reference to the TRANSFER function, so a declaration of the sort
INTEGER :: i = TRANSFER("counihan", 1)
is possible.
Finally, the point must be clearly understood that TRANSFER is not an alternative to the writing and reading of an internal file. TRANSFER works at a lower level and the effect of TRANSFERring an integer into a real number (say) is not easily predictable and will in general depend on the particular processor’s methods of representing integer and real data as sequences of bits. Manipulating internal files, on the other hand, will always translate an integer like 64 into the real number 64.0 because internal files are based on characters, not on bits. The result of TRANSFER will be processor-independent only when the type transfer is between real and complex types, since a complex number is defined in Fortran as an ordered pair of real numbers.
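The contrast can be seen in a small sketch that sends the same integer to a real variable by both routes. The variable names and the field width of 12 are assumptions; the value printed for the TRANSFER route depends on the processor's bit-level representations and is therefore not predicted here.

```fortran
PROGRAM transfer_vs_internal
  IMPLICIT NONE
  INTEGER :: n
  REAL :: via_file, via_transfer
  CHARACTER(12) :: buffer

  n = 64

  ! Route 1: through characters. The digits "64" are written and then
  ! read back as a real number, giving 64.0 on any processor.
  WRITE (buffer, "(I12)") n
  READ (buffer, "(F12.0)") via_file

  ! Route 2: through bits. The bit pattern of the integer 64 is simply
  ! reinterpreted as a real; the value is processor dependent.
  via_transfer = TRANSFER(SOURCE=n, MOLD=0.0)

  WRITE (*,*) via_file
  WRITE (*,*) via_transfer
END PROGRAM transfer_vs_internal
```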
9.6 Exercises 9.B
1 Write a function that reverses the digits of an integer.
2 Write a function that uses an internal file to convert between a character string and a real number, it being assumed that the string contains digits etc. representing a real number.
Modules
Modules can be of great importance in organizing the structure of any large Fortran program.
They enable data to be communicated between subprograms, they are useful for encapsulating sets of related subprograms, and they have a role in the writing of “generic” procedures, which can operate on arguments of different types. This chapter explains the use of modules for sharing data and for containing procedures. There is a discussion of the different forms of data association (argument, host, and USE), a detailed description of the USE statement, and finally an account of how modules can be used in relation to procedure interfaces.
10.1