Both tasks listed above can be carried out using if, but the statements can sometimes become very complicated and repetitive. Quantum therefore has an additional testing statement, require, designed specifically to make this checking more efficient.
☞
For more information on the if statement, see section 9.1, ‘Statements of condition – if’.
The require statement is used in three different ways:
• Column validation. Tests columns against a given set of characteristics and deals with records not meeting the requirements according to a specified action code.
• Testing the validity of a logical expression. Tests a logical expression and, if it is true, continues with the next statement. If the expression is false, the record is dealt with according to the given action code.
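• Testing the equivalence of logical expressions. Compares the logical value of a group of logical expressions. If all are true or all are false, the run continues with the next statement, but if some are true and others are false, the record is dealt with according to the given action code.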
The actions which are carried out when the stated conditions are violated are determined by an error action code defined either in the require statement itself or in a global statement placed at the start of the edit.
☞
For information about the error action code, see ‘The action code’ in the following section.
The require statement has three forms, depending upon the function it performs, and these are described in the subsequent sections. Each one must start with the word require, which may be abbreviated to r.
11.2 Column and code validation
Quick Reference
To validate columns and codes, type:
require [/code/] condition col1 [,col2 ...]
where code is the error action code, condition is the type of coding required, and col1 and col2 are the columns or fields to be tested.
This form of the require statement has four basic parts:
1. The word require or the letter r followed by a space.
2. An optional error action code enclosed in forward slashes.
3. A code defining the type of coding required.
4. The column or columns to be checked, separated by commas.
It looks like this:
require [/code/] condition ca [,cb, c(m,n)]
For example:
r /5/ nb c110, c125
Our example checks that columns 110 and 125 are not blank (nb). Any records in which this is not the case are written out to a new file and rejected from any tables that may be produced (/5/).
Let’s deal with each of these items separately.
The action code
Quick Reference
To define a default error action code, type:
rqd number
where number is a number between 0 and 7 inclusive.
The action code is a number between 0 and 7 which tells Quantum what to do with records that do not match the required conditions (for example, records which are blank but which should contain codes). The action code may either be entered as a parameter on each require statement or, if it is the same for all statements, on an rqd statement.
Action codes are:
0 Print a summary of errors only — records are not listed individually, but a count is kept of the number of records failing each require statement. This is printed out at the end of the run.
1 Reject the record from the tables.
2 Print the whole record in the print file, out2.
3 Print the record and reject it from the tables. This is the default.
4 Write the record to the output data file, punchout.q.
5 Write the record into the output data file, punchout.q and reject it from the tables.
6 Print the record in the print file, out2, and write it into the output data file, punchout.q.
7 Print, write and reject the record.
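The table above can be read as combinations of three basic actions. The following Python sketch is only an illustration of that reading and is not part of Quantum: the function name decode_action and the bit-flag interpretation (1 = reject, 2 = print, 4 = write) are our own assumptions, although they agree with every entry in the table.

```python
# Hypothetical helper, not a Quantum facility.
# Assumption: action codes combine three flags, 1 = reject, 2 = print, 4 = write.
REJECT, PRINT, WRITE = 1, 2, 4


def decode_action(code: int) -> dict:
    """Break an action code (0 to 7) into the actions listed in the table."""
    if not 0 <= code <= 7:
        raise ValueError("action code must be between 0 and 7 inclusive")
    return {
        "summary_only": code == 0,          # code 0: error counts only
        "reject": bool(code & REJECT),      # drop the record from the tables
        "print": bool(code & PRINT),        # list the record in the print file
        "write": bool(code & WRITE),        # copy the record to the output data file
    }
```

For example, decode_action(5) reports write and reject but not print, which matches the description of action code 5.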
To write a statement which would print out incorrect records but include them in the tables, we would write:
r /2/ ....
Similarly, to have all incorrect records printed in the print file, written into the output data file and rejected from the tables, we would write:
r /7/ ....
In both cases the action code is part of the individual require statement, but where the same action applies to all requires, it is quicker and more efficient to define the action code on an rqd statement at the beginning of the edit. For instance, if all erroneous records are to be written out and rejected we would write:
rqd 5
The default action is to print the record out and reject it from the tables:
r /3/ .... or rqd 3
and if no action code is defined, this will be assumed.
Checking type of coding
Checking with require can be as simple or complex as you like. In this section, we will start with the simplest checks and deal with each extra feature in turn. We will assume, unless otherwise stated, that the error action code is the default Print and Reject (code 3) and will omit it from most of the examples accordingly.
The most basic form of the require statement simply checks whether the column or field of columns contains the correct type of code; it does not check the individual codes themselves. Code types may be:
b Blank
nb Not blank (single-coded or multicoded)
sp Single-coded (literally, single-punched)
spb Single-coded or blank
One of these types must follow the word require since it tells Quantum what to check for.
All that remains is to say which columns are to be inspected; just list each column or field of columns at the end of the statement. If more than one column or field is defined, each one must be separated by a comma.
Here are some examples in which the record to be checked is:
----+----1----+----2----+----3----+----4----+
002411123481231&- *1927235537*&& 1 1 1
The statement:
r nb c10, c(25,35)
checks that columns 10, and 25 to 35 inclusive are not blank; they may contain any number of codes. This record satisfies both conditions so it passes on to the next statement in the edit.
The statement:
r sp c11, c15, c23, c41
looks to see whether columns 11, 15, 23 and 41 are single-coded. In our record they are, but if this were not the case (say c11’123’) the record would be printed out and rejected from any tables that may be produced. Additionally, Quantum would tell us ‘Column 11 is 123’.
✎
Be careful when using field specifications with require: the condition applies to each column individually, not to the field as a whole. For instance:
r sp c(1,4)
means that each of columns 1, 2, 3 and 4 must contain one code. It does not mean that the field must contain one code overall. To check that a field contains one code only, use numb.
☞
For further information about numb, see chapter 5, ‘Expressions’.
Very often some columns on the questionnaire are not used, so you might like to check that all such columns are blank in the data file. In our example, let’s say that columns 51 to 70 are not used. To check that there are no stray codes in these columns we would write:
r b c(51,70)
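To make the column-by-column behaviour concrete, here is a small Python model of the four code types. It is a sketch under our own assumptions rather than anything Quantum provides: a record is represented as a dictionary mapping column numbers to the string of codes found there, a missing entry or empty string stands for a blank column, and the names column_ok and failing_columns are invented for the illustration.

```python
# Hypothetical model of the code types b, nb, sp and spb.
# Assumption: a column is the string of codes in it; "" (or spaces) means blank.
def column_ok(condition: str, codes: str) -> bool:
    """Check one column against one code type."""
    n = len(set(codes.replace(" ", "")))  # number of distinct codes present
    if condition == "b":
        return n == 0
    if condition == "nb":
        return n >= 1
    if condition == "sp":
        return n == 1
    if condition == "spb":
        return n <= 1
    raise ValueError(f"unknown code type: {condition}")


def failing_columns(condition: str, record: dict, first: int, last: int) -> list:
    """Apply the condition to every column of a field separately."""
    return [
        col
        for col in range(first, last + 1)
        if not column_ok(condition, record.get(col, ""))
    ]
```

With record = {1: "1", 2: "3", 3: "12", 4: "5"}, failing_columns("sp", record, 1, 4) picks out column 3 only, because the test is made on each column in turn and not on the field as a whole.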
Comments with require
Quick Reference
To define a message to be printed when a record fails a test, type:
r [/err_code/ ] condition columns $message$
When incorrect records are printed out, require automatically prints a short text describing the error. Normally, it tells you what codes were found in the column which is wrong, but if this is not what you want, you may define your own error text by entering it enclosed in dollar signs at the end of the statement. This text will then be printed in place of the default text when errors are found.
For example, if c329 is multicoded when it should be single-coded, the statement:
r sp c329
will print the whole record and tell us which codes were found in that multicode:
Instead of being told which codes the column contains, you may prefer to see a message linking the error to a question on the questionnaire. In this case you will need to add your own error text as follows:
r sp c329 $q21a not sp$
These texts may be as long or short as you like.
Checking codes in columns
Quick Reference
To check for specific codes in a column, type:
r [/err_code/] condition col1’codes1’ [, col2’codes2’ ... ]
where codes1 are the codes to be tested for in column or field col1, and codes2 are the codes to be tested for in column or field col2.
Any codes which are present in col1 but are not listed in codes1 are ignored. The same applies to any other column and code pairs listed.
Sometimes it is not sufficient to check just the type of coding, and you will want to know whether the codes found are valid for that column. To do this, we use the information given in the previous section as a base, and add on our first ‘optional extra’.
To check whether a column or field of columns contains specific codes, follow the column specification with the codes to be checked, enclosed in single quotes. For example:
r /5/ sp c223’1/5’
tells us that column 223 should be single-coded within the range of codes 1 through 5. Any other codes in this column are ignored. Thus, a record in which c223’14’ is incorrect because it contains two of the listed codes, whereas a record in which c223’27’ is correct because it contains only a 2 from the range ‘1/5’. Of course, any record which does not contain a 1, 2, 3, 4 or 5 at all is also incorrect, regardless of whether or not it is single-coded: c223’9’ is just as wrong as c223’789&’.
Codes may also be defined with all other code types, thus:
r /3/ nb c156’2/6’
If c156 does not contain at least one of the codes 2 through 6 (regardless of anything else it may contain) the record is printed out. Column 156 may be multicoded as long as at least one of the codes is within the required range.
Thus c156’12’, c156’278’ and c156’258’ are valid, but c156’9’ is not because ‘9’ is not one of the listed codes.
Even though it checks for blanks, require b may be followed by columns and codes. You would do this when you are checking that a column is either blank or, if not blank, that it does not contain certain codes. Here’s an example to clarify this:
r b c134’1/8’
This statement tells Quantum that column 134 must never contain any of the codes 1 through 8: only ‘09-&’ or blank are acceptable. This is the opposite of r sp and r nb, both of which list valid codes. Any record failing this condition will be printed and rejected via the default action code 3.
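The rules for sp, nb and b with listed codes can be summarized in a short Python model. This is a sketch under stated assumptions, not a description of how Quantum is implemented: the names CODE_ORDER, expand_codes and codes_ok are ours, and we assume that a range such as ‘1/5’ runs through the codes in the order 1 to 9, then 0, - and &.

```python
# Hypothetical model of require with listed codes (no 'only' flag).
# Assumption: codes are ordered 1..9, 0, -, & when a range a/b is expanded.
CODE_ORDER = "1234567890-&"


def expand_codes(spec: str) -> str:
    """Expand a code specification such as '1/5' or '1/9-' into single codes."""
    out = []
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "/":
            start = CODE_ORDER.index(spec[i])
            end = CODE_ORDER.index(spec[i + 2])
            out.extend(CODE_ORDER[start:end + 1])
            i += 3
        else:
            out.append(spec[i])
            i += 1
    return "".join(out)


def codes_ok(condition: str, column: str, spec: str) -> bool:
    """Count only the listed codes; any other codes in the column are ignored."""
    listed = set(expand_codes(spec))
    n = sum(1 for code in set(column) if code in listed)
    if condition == "sp":
        return n == 1   # exactly one of the listed codes
    if condition == "nb":
        return n >= 1   # at least one of the listed codes
    if condition == "b":
        return n == 0   # none of the listed codes
    raise ValueError("only sp, nb and b are modelled here")
```

Under this model codes_ok("sp", "27", "1/5") is true and codes_ok("sp", "14", "1/5") is false, in line with the c223 examples in the text.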
Exclusive codes
Quick Reference
To check that a column or field contains no codes other than those listed, type:
r [/err_code/] condition col1’codes1’o
If col1 contains any codes other than those given in codes1, the test is false.
Now that you know how to check codes, the next thing to discuss is how to check that all other code positions are blank.
We have said that statements of the form:
r sp ca’p’
accept all records containing only one of the codes ‘p’ in column a, regardless of what other codes are also present. To check that a column contains only the listed codes and nothing else, follow the code specification with the letter o (for only) in upper or lower case. For example, to indicate that c356 must be single-coded in the range ‘1/5’ and that all other positions (‘6/&’) must be blank, you should type:
r sp c356’1/5’o
which is the same as:
if (c356’6/&’.or.numb(c356).ne.1) write; reject
Any of the following would cause the record to be printed and rejected:
c356’34’
c356’59’
c356’8’
c356’ ’
The require statement may define conditions for more than one column. Just follow each column with the code positions to be checked and separate each set with a comma:
r sp c164’12-’, c165’1/70’, c166’1/3’, c167’1/9-’, c168’1/5’
Here the columns to be checked are consecutive but have been listed separately because they each have different sets of valid codes. If all columns could be single-coded in the range 1 to 7 we might abbreviate this to:
r sp c(164,168)’1/7’ $q10a/e$
since this notation means that each column in the field must be single-coded within the given range rather than that the field as a whole may contain only one of those codes.
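The effect of the o flag on an sp check can be modelled in a few lines of Python. As before, this is only an illustrative sketch: the function name sp_ok is hypothetical, a column is represented by the string of codes it contains, and the listed codes are passed already expanded (for example "12345" for ‘1/5’).

```python
# Hypothetical model of 'r sp col'codes'' with and without the o (only) flag.
def sp_ok(column: str, listed: str, only: bool = False) -> bool:
    """Return True if the column passes the single-coded test."""
    punched = set(column.replace(" ", ""))   # codes actually present
    in_list = punched & set(listed)          # those that are among the listed codes
    if len(in_list) != 1:
        return False                         # none, or more than one, of the listed codes
    # Without 'o', extra unlisted codes are ignored; with 'o' they cause a failure.
    return not only or punched == in_list
```

Here sp_ok("59", "12345") is true because the 9 is ignored, whereas sp_ok("59", "12345", only=True) is false. The columns containing ‘34’, ‘8’ and blank fail in either case.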
Automatic error correction
Quick Reference
To define a correction code to be used as a replacement for codes which fail the required condition, type:
r [/err_code/] condition col1’codes1’ :’new_code’
new_code is the code or codes to be inserted in col1 if it fails the test condition. Any codes already in that column are overwritten.
As you know, records found to have errors are printed, coded and/or rejected according to the error action code. When the run is finished you will look at these records and, if possible, correct the errors by using the on-line edit or correction file facilities.
☞
For information about on-line editing and the corrections file, see chapter 12, ‘Data correction’.
Occasionally you will know in advance what to do with certain types of error; say, for instance, the respondent’s sex has been miscoded. You may decide or be told to recode this person as a ‘3’ in the appropriate column indicating that the sex was not known. The way to do all this in one go is to write the normal require statement that checks columns and codes, and to follow the code specification with a colon (:) and the replacement code (in this case ‘3’) enclosed in single quotes, thus:
r /2/ sp c106’12’ :’3’
Any record in which c106 is not single-coded with either a ‘1’ or a ‘2’ will have the contents of c106 overwritten with a ‘3’.
The equivalent using if and an assignment statement would be written:
if (numb(c106’12’).ne.1) c106’3’;
+write $c106 incorrect$
Once again, the require is shorter and quicker.
When working with fields, it is not possible to define replacement strings for the field as a whole.
You should, however, note that if a single replacement code is given for a field of columns, any incorrect columns in that field will be overwritten with the replacement code. The correct columns remain untouched.
If we have:
+----4----+
1927
and we write a require statement containing c(237,240)’1/5’ :’&’ we will have:
+----4----+
1&2&
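The column-by-column replacement can be pictured with the following Python sketch. It is a simplified model and not Quantum itself: we assume each column holds exactly one character, that a column passes only if it contains one of the listed codes, and the function name correct_field is our own.

```python
# Hypothetical model of a single replacement code applied to a field.
# Assumption: one character per column; a column is correct if its code is listed.
def correct_field(field: str, valid: str, replacement: str) -> str:
    """Overwrite each incorrect column; leave correct columns untouched."""
    return "".join(ch if ch in valid else replacement for ch in field)


# Columns 237 to 240 contain 1927; valid codes are 1 to 5; replacement is &.
corrected = correct_field("1927", "12345", "&")
```

In this model corrected is 1&2&: the 9 and the 7 are overwritten while the 1 and the 2 are kept.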
✎
If you use this facility, remember that the replacement code is an alteration to the data, and as such is operative only as long as each record is in the C array. If you want to save these modifications you must include a statement in your edit which will write records to another file. Statements which write out new data files are split and write. Alternatively, you can use one of the action codes which writes records to the output data file.
☞
For information about split, see section 12.4, ‘Creating clean and dirty data files’.
For information about write, see section 7.1, ‘Print files’.
Defaults in a require statement
Quick Reference
To define defaults for all columns or fields tested, type:
r type [’codes’][o] [:’new_code’] columns
The defaults may be overridden for an individual column by following the column with the required coding, only flag and replacement code as usual.
By now you will have guessed that require statements can become lengthy things, especially when specific codes have to be checked, replacement characters defined and error texts entered. In many cases some, if not all, of these items will be common to the majority of the columns listed in the statement; for instance, several non-consecutive columns may have the same set of valid codes.
When this happens you may enter these common items at the beginning of the require statement as defaults for that statement. There are several ways of doing this, so let’s take the statement:
r spb c127’0/9’o, c129’0/9’o, c131’0/9’o, c133’0/9’o
as an example. This can be more efficiently written as:
r spb ’0/9’o c127, c129, c131, c133
Both statements check whether columns 127, 129, 131 and 133 are single-coded in the range 0 to 9 or are blank. If the − or & codes appear in any of these columns, or if the columns are multicoded, the offending records will be printed and rejected.
Defaults defined at the start of a require may be overridden for an individual column or field by following that item with the new specification. For example:
r sp ’1/5’ c10, c12, c15, c20’1/3’
tells us that columns 10, 12 and 15 must be single-coded in the range 1 to 5 while column 20 must be single-coded in the range 1 to 3.
Here is another example which uses the Only operator:
r sp ’1/5’o c10, c12, c15, c20’1/7’, c24
This checks that columns 10, 12, 15 and 24 are single-coded in the range 1 to 5 and that none of the codes ‘6/&’ are present in those columns. Column 20 has its own code specification which overrides not only the default codes but also the Only operator. Quantum will check that c20 contains only one of the codes 1 to 7, but it will ignore anything it finds in the range ‘8/&’.
Finally, let’s look at one more statement:
r sp ’1/5’o :’&’ c10, c12, c20’1/7’, c24
This is exactly the same as the previous example except that we have added a replacement code to be used when errors are found. This code refers to all columns named with this require, even though column 20 has a different set of valid codes.
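The way defaults and overrides combine in these examples can be sketched in Python as follows. This is an illustration of the rules as described in this section, not Quantum’s own logic; the Check class, the resolve function and the tuple layout used for the column list are all hypothetical.

```python
# Hypothetical model of defaults in a require statement.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Check:
    column: int
    codes: str                  # code specification in force for this column
    only: bool                  # whether the o flag applies
    replacement: Optional[str]  # replacement code, if any


def resolve(default_codes, default_only, replacement, columns):
    """columns is a list of (column, own_codes or None, own_only) tuples."""
    checks = []
    for column, own_codes, own_only in columns:
        if own_codes is None:
            # No specification of its own: take the statement defaults.
            checks.append(Check(column, default_codes, default_only, replacement))
        else:
            # Own codes override both the default codes and the default o flag,
            # but the replacement code still applies to every column.
            checks.append(Check(column, own_codes, own_only, replacement))
    return checks


# Model of: r sp '1/5'o :'&' c10, c12, c20'1/7', c24
checks = resolve("1/5", True, "&",
                 [(10, None, False), (12, None, False),
                  (20, "1/7", False), (24, None, False)])
```

In the resulting list, column 20 carries the codes ‘1/7’ without the o flag, while columns 10, 12 and 24 keep ‘1/5’ with it; all four share the replacement code &.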