
Data file qualifiers

In document ASREML user guide release 3.0 (Page 91-95)

Table 5.2 lists the qualifiers relating to data input. Use the Index to check for examples or further discussion of these qualifiers.

Table 5.2: Qualifiers relating to data input and output

qualifier action

Frequently used data file qualifier

!SKIP n causes the first n records of the (non-binary) data file to be ignored. Typically these lines contain column headings for the data fields.

Other data file qualifiers

!CSV used to make consecutive commas imply a missing value; this is automatically set if the file name ends with .csv or .CSV (see Section 4.2).

Warning This qualifier is ignored when reading binary data.

!DATAFILE f specifies the datafile name, replacing the one obtained from the datafile line. It is required when different !PATHS (see !DOPATH in Table 11.3) of a job must read different files. The !SKIP qualifier, if specified, will be applied when reading the file.

!FILTER v [!SELECT n] enables a subset of the data to be analysed; v is the number or name of a data field. When reading data, the value in field v is checked after any transformations are performed. If !SELECT is omitted, records with zero in field v are omitted from the analysis. Otherwise, records with n in field v are retained and all other records are omitted. The argument n is typically an integer, which is compared with the numeric value of the field after any conversion of the input field performed by the !A or !I data field qualifiers. However, n may be a quoted string, in which case it is compared to the character value of the field as it is read and before any conversion to a numeric value.

Warning If the filter column contains a missing value, the value from the previous non-missing record is assumed in that position.
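The selection rule described for !FILTER can be sketched in Python. This is only an illustration of the behaviour stated above, not ASReml code: the function name, the zero-based field index and the use of None for a missing value are assumptions, and dropping a leading missing value that has no earlier record to borrow from is a choice made here because the text does not cover that case.

```python
def filter_records(records, field, select=None):
    """Return the records a !FILTER-like rule would retain.

    records: list of lists of numeric values, None marks a missing value.
    field:   zero-based index of the filter column (assumed convention).
    select:  None drops records with zero in the filter column;
             otherwise only records equal to `select` are kept.
    """
    kept = []
    previous = None  # last non-missing filter value seen so far
    for rec in records:
        value = rec[field]
        if value is None:
            value = previous  # missing: reuse the previous non-missing value
        else:
            previous = value
        if value is None:
            keep = False  # assumption: nothing to borrow, so drop the record
        elif select is None:
            keep = value != 0
        else:
            keep = value == select
        if keep:
            kept.append(rec)
    return kept


rows = [[1, 0], [2, 3], [3, None], [4, 0], [5, None]]
print(filter_records(rows, 1))            # no !SELECT: zeros are dropped
print(filter_records(rows, 1, select=0))  # !SELECT 0: only zeros are kept
```

Note how the third record is retained in the first call because its missing filter value inherits the 3 from the record before it.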

!FOLDER s (ASReml3) specifies an alternative folder for ASReml to find input files. This qualifier is usually placed on a separate line BEFORE the data filename line (and any pedigree/.giv/.grm filename lines). For example,

!FOLDER ../Data

data.asd !SKIP 1

is equivalent to

5 Command file: Reading the data 65


!FORMAT s supplies a Fortran-like FORMAT statement for reading fixed format files. A simple example is !FORMAT(3I4,5F6.2), which reads 3 integer fields and 5 floating point fields from the first 42 characters of each data line. A format statement is enclosed in parentheses and may include 1 level of nested parentheses, for example !FORMAT(4x,3(I4,f8.2)). Field descriptors are

rX to skip r character positions,

rAw to define r consecutive fields of w characters width,

rIw to define r consecutive fields of w characters width, and

rFw.d to define r consecutive fields of w characters width; d indicates where to insert the decimal point if it is not explicitly present in the field,

where r is an optional repeat count.

In ASReml, the A and I field descriptors are treated identically and simply set the field width. Whether the field is interpreted alphabetically or as a number is controlled by the !A qualifier.

Other legal components of a format statement are

the , character; required to separate fields. Blanks are not permitted in the format.

the / character; indicates the next field is to be read from the next line. However, a / on the end of a format to skip a line is not honoured.

the string BZ; the default action is to read blank fields as missing values (* and NA are also honoured as missing values). If you wish to read blank fields as zeros, include the string BZ.

the string BM; switches back to 'blank missing' mode.

the string Tc; moves the 'last character read' pointer to line position c so that the next field starts at position c+1. For example, T0 goes back to the beginning of the line.

the string D; invokes debug mode.

A format showing these components is

!FORMAT(D,3I4,8X,A6,3(2x,F5.2)/4x,BZ,20I1)

and is suitable for reading 27 fields from 2 data records such as

111122223333xxxxxxxxALPHAFxx 4.12xx 5.32xx 6.32
xxxx123 567 901 345 7890
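To see how such a format maps onto character positions, the following Python sketch expands a format string into the column range of each field. It is an illustration only and makes no claim about how ASReml parses formats internally: the function name is hypothetical, only the descriptors listed above are handled, BZ, BM and D are recognised but otherwise ignored, and no conversion of field contents is attempted.

```python
import re

GROUP = re.compile(r"(\d*)\(([^()]*)\)")                     # r(...) nested group
FIELD = re.compile(r"(\d*)([AIF])(\d+)(?:\.\d+)?$", re.IGNORECASE)
SKIP = re.compile(r"(\d*)X$", re.IGNORECASE)                 # rX
TAB = re.compile(r"T(\d+)$", re.IGNORECASE)                  # Tc


def field_slices(fmt):
    """For each data line of a record, list the (start, end) slice of every field."""
    body = fmt.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    # Expand one level of nesting: 3(2x,F5.2) -> 2x,F5.2,2x,F5.2,2x,F5.2
    body = GROUP.sub(lambda m: ",".join([m.group(2)] * int(m.group(1) or 1)), body)
    lines = []
    for part in body.split("/"):          # '/' moves on to the next data line
        pos, slices = 0, []
        for token in filter(None, part.split(",")):
            if token.upper() in ("BZ", "BM", "D"):
                continue                   # mode switches consume no characters
            if (m := TAB.match(token)):
                pos = int(m.group(1))      # next field starts at position c+1
            elif (m := SKIP.match(token)):
                pos += int(m.group(1) or 1)
            elif (m := FIELD.match(token)):
                width = int(m.group(3))
                for _ in range(int(m.group(1) or 1)):
                    slices.append((pos, pos + width))
                    pos += width
            else:
                raise ValueError(f"unsupported descriptor: {token!r}")
        lines.append(slices)
    return lines


layout = field_slices("(D,3I4,8X,A6,3(2x,F5.2)/4x,BZ,20I1)")
first = "111122223333xxxxxxxxALPHAFxx 4.12xx 5.32xx 6.32"
print([first[a:b] for a, b in layout[0]])
print(sum(len(s) for s in layout))  # total number of fields per record
```

Applied to !FORMAT(3I4,5F6.2), the last slice ends at column 42, which agrees with the 42 characters quoted for that example.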


!MERGE c f [!SKIP n] [!MATCH a b] (ASReml3) may be specified on a line following the datafile line. The purpose is to combine data fields from the (primary) data file with data fields from a secondary file (f). This !MERGE qualifier has been replaced by the much more powerful MERGE statement (see Chapter 12).

The effect is to open the named file (skipping n lines) and then insert the columns from the new file into field positions starting at position c. If !MATCH a b is specified, ASReml checks that the field a (0 < a < c) has the same value as field b. If not, it is assumed that the merged file has some missing records; missing values are inserted into the data record and the line from the MERGE file is kept for comparison with the next record.

It is assumed that the lines in the MERGE file are in the same order as the corresponding lines occur in the primary data file, and that there are no extraneous lines in the MERGE file.

For example, assuming the field definitions define 10 fields,

PRIMARY.DAT !skip 1

!MERGE 6 SECOND.DAT !SKIP 1 !MATCH 1 6

would obtain the first five fields from PRIMARY.DAT and the next five from SECOND.DAT, checking that the first field in each file has the same value.

Thus each input record is obtained by combining information from each file, before any transformations are performed.
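The matching behaviour of !MERGE can be sketched in Python as follows. This is a reading of the description above rather than ASReml's implementation: the function name and the list-based records are hypothetical, positions c, a and b are 1-based as in the qualifier, None stands for a missing value, and filling with missing values once the secondary file is exhausted is an assumption.

```python
def merge_records(primary, secondary, c, match=None):
    """Combine primary and secondary records in the manner described for !MERGE.

    primary, secondary: lists of records (lists of values), in the same order.
    c:     1-based field position where the secondary columns are inserted.
    match: optional (a, b) pair of 1-based positions, with a < c <= b.
    """
    width = len(secondary[0]) if secondary else 0
    merged, j = [], 0                      # j indexes the pending secondary line
    for rec in primary:
        head = list(rec[:c - 1])
        candidate = secondary[j] if j < len(secondary) else None
        if candidate is not None and match is not None:
            a, b = match
            if head[a - 1] != candidate[b - c]:
                candidate = None           # keep this line for the next record
        if candidate is None:
            merged.append(head + [None] * width)   # insert missing values
        else:
            merged.append(head + list(candidate))
            j += 1
    return merged


primary = [[1, "a"], [2, "b"], [3, "c"]]
secondary = [[1, "x"], [3, "z"]]           # no line for identifier 2
for row in merge_records(primary, secondary, c=3, match=(1, 3)):
    print(row)
```

The second primary record has no partner in the secondary file, so it is padded with missing values and the secondary line for identifier 3 is held until the third record is read.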

!READ n formally instructs ASReml to read n data fields from the data file. It is needed when there are extra columns in the data file that must be read but are only required for combination into earlier fields in transformations, or when ASReml attempts to read more fields than it needs to.

!RECODE is required when reading a binary data file with pedigree identifiers that have not been recoded according to the pedigree file. It is not needed when the file was formed using the !SAVE option but will be needed if formed in some other way (see Section 4.2).


!RREC [n] (ASReml2) causes ASReml to read n records, or to read up to a data reading error if n is omitted, and then process the records it has. This allows data to be extracted from a file which contains trailing non-data records (for example, extracting the predicted values from a .pvs file). Without this qualifier, ASReml aborts the job when it encounters a data error. See !RSKIP.

!RSKIP n [s] (ASReml2) allows ASReml to skip lines at the heading of a file down to (and including) the nth instance of string s. For example, to read back the third set of predicted values in a .pvs file, you would specify

!RREC !RSKIP 4 ' Ecode'

since the line containing the 4th instance of ' Ecode' immediately precedes the predicted values. The !RREC qualifier means that ASReml will read until the end of the predict table. The keyword Ecode, which occurs once at the beginning and then immediately before each block of data in the .pvs file, is used to count the sections.
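The combined effect of !RSKIP and !RREC can be pictured with a short Python sketch. It is a simplified illustration, not ASReml's reader: the function name is hypothetical, every data field is assumed to be numeric, a blank line is treated as a data reading error, and the sample lines are invented rather than copied from a real .pvs file.

```python
def read_block(lines, n, marker, max_records=None):
    """Skip down to and including the n-th line containing `marker`,
    then read numeric records until one fails to parse (or max_records is reached)."""
    seen, start = 0, None
    for i, line in enumerate(lines):
        if marker in line:
            seen += 1
            if seen == n:
                start = i + 1              # data begins on the following line
                break
    if start is None:
        raise ValueError(f"fewer than {n} lines contain {marker!r}")
    records = []
    for line in lines[start:]:
        if max_records is not None and len(records) >= max_records:
            break
        tokens = line.split()
        if not tokens:
            break                          # assumption: blank line ends the block
        try:
            records.append([float(tok) for tok in tokens])
        except ValueError:
            break                          # first unreadable line ends the block
    return records


sample = [
    "Ecode is explained here",   # first instance, in the heading
    "Ecode",                     # precedes the first block
    "1 2",
    "",
    "Ecode",                     # precedes the second block
    "3 4",
    "5 6",
    "SED: overall",              # trailing non-data line
]
print(read_block(sample, 3, "Ecode"))
```

Because the keyword appears once in the heading, the block wanted is always one instance later than its ordinal number, which is why the manual's example uses 4 to reach the third set of predictions.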

Combining rows from separate files

ASReml can read data from multiple files provided the files have the same layout. (ASReml2)

The file specified as the ’primary data file’ in the command file can contain lines of the form

!INCLUDE <filename> !SKIP n

where <filename> is the (path)name of the data subfile and !SKIP n is an optional qualifier indicating that the first n lines of the subfile are to be skipped. After reading each subfile, input reverts to the primary data file.

Typically, the primary data file will just contain !INCLUDE statements identifying the subfiles to include. For example, you may have data from a series of related experiments in separate data files for individual analysis. The primary data file for the subsequent combined analysis would then just contain a set of !INCLUDE statements.


If the subfiles have CSV format, they should all have it and the !CSV qualifier should be declared on the primary datafile line. This option is not available in combination with !MERGE.
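The way !INCLUDE lines splice subfiles into the primary data stream can be sketched in Python. This is an illustration under stated assumptions, not ASReml's reader: the function name is hypothetical, subfiles are supplied as an in-memory dictionary instead of being opened from disk, the tokens on an !INCLUDE line are assumed to be separated by blanks, and the file names are invented.

```python
def expand_includes(primary_lines, files):
    """Return the stream of data lines after resolving !INCLUDE statements.

    primary_lines: lines of the primary data file.
    files:         mapping of subfile name to its list of lines
                   (a stand-in for reading the files from disk).
    """
    out = []
    for line in primary_lines:
        tokens = line.split()
        if tokens and tokens[0].upper() == "!INCLUDE":
            name = tokens[1]
            skip = 0
            if len(tokens) >= 4 and tokens[2].upper() == "!SKIP":
                skip = int(tokens[3])      # drop the first n lines of the subfile
            out.extend(files[name][skip:])
            # input then reverts to the primary data file
        else:
            out.append(line)
    return out


files = {
    "exp1.dat": ["id y", "1 2.5"],
    "exp2.dat": ["id y", "2 3.5"],
}
primary = ["!INCLUDE exp1.dat !SKIP 1", "!INCLUDE exp2.dat !SKIP 1", "3 4.5"]
print(expand_includes(primary, files))
```

Each subfile contributes its lines in order, minus the skipped heading, so the result reads as a single file with one layout.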
