• No results found

Grouping records

By using the Open Query File (OPNQRYF) command, you can group records by like values of one or more fields and calculate aggregate functions, such as the minimum field value and average field value, for each group.

Summarizing data from database file records (grouping)

The group processing function of the Open Query File (OPNQRYF) command allows you to summarize data from existing database records.

You can specify:

• The grouping fields

• Selection values both before and after grouping

• A keyed sequence access path over the new records

• Mapped field definitions that allow you to do such functions as sum, average, standard deviation, and variance, as well as counting the records in each group

• The sort sequence and language identifier that supply the weights by which the field values are grouped You normally start by creating a file with a record format containing only the following types of fields:

• Grouping fields. Specified on the GRPFLD parameter that define groups. Each group contains a constant set of values for all grouping fields. The grouping fields do not need to appear in the record format identified on the FORMAT parameter.

• Aggregate fields. Defined by using the MAPFLD parameter with one or more of the following built-in functions:

%COUNT

Counts the records in a group

%SUMA sum of the values of a field over the group

%AVGArithmetic average (mean) of a field, over the group

%MAXMaximum value in the group for the field

%MINMinimum value in the group for the field

%STDDEV

Standard deviation of a field, over the group

%VARVariance of a field, over the group

• Constant fields. Allow constants to be placed in field values. The restriction that the Open Query File (OPNQRYF) command must know all fields in the output format is also true for the grouping function.

When you use group processing, you can only read the file sequentially.

Example: Summarizing data from database file records (grouping)

This example shows how to use the group processing function to summarize data from existing database records.

Assume that you want to group the data by customer number and analyze the amount field. Your database file is FILEA and you create a file named FILEAA containing a record format with the following fields.

FILEA FILEAA

Cust Cust

Type Count (count of records per customer)

Amt Amtsum (summation of the amount field)

Amtavg (average of the amount field)

Amtmax (maximum value of the amount field)

When you define the fields in the new file, you must ensure that they are large enough to hold the results.

For example, if the Amt field is defined as 5 digits, you might need to define the Amtsum field as 7 digits.

Any arithmetic overflow causes your program to end abnormally.

Assume that the records in FILEA have the following values.

Cust Type Amt

001 A 500.00

001 B 700.00

004 A 100.00

002 A 1200.00

003 B 900.00

001 A 300.00

004 A 300.00

003 B 600.00

You then create a program (PGMG) using FILEAA as input to print the records.

OVRDBF FILE(FILEAA) TOFILE(FILEA) SHARE(*YES) OPNQRYF FILE(FILEA) FORMAT(FILEAA) KEYFLD(CUST) + GRPFLD(CUST) MAPFLD((COUNT '%COUNT') + (AMTSUM '%SUM(AMT)') +

(AMTAVG '%AVG(AMT)') + (AMTMAX '%MAX(AMT)'))

CALL PGM(PGMG) /* Created using file FILEAA as input */

CLOF OPNID(FILEA) DLTOVR FILE(FILEAA)

Your program retrieves the following records.

Cust Count Amtsum Amtavg Amtmax

001 3 1500.00 500.00 700.00

002 1 1200.00 1200.00 1200.00

003 2 1500.00 750.00 900.00

004 2 400.00 200.00 300.00

Note: If you specify the GRPFLD parameter, the groups might not appear in ascending sequence. To ensure a specific sequence, you should specify the KEYFLD parameter.

Assume that you want to print only the summary records in this example in which the Amtsum value is greater than 700.00. Because the Amtsum field is an aggregate field for a given customer, use the GRPSLT parameter to specify selection after grouping. Add the GRPSLT parameter:

GRPSLT('AMTSUM *GT 700.00')

Your program retrieves the following records.

Cust Count Amtsum Amtavg Amtmax

001 3 1500.00 500.00 700.00

002 1 1200.00 1200.00 1200.00

003 2 1500.00 750.00 900.00

The Open Query File (OPNQRYF) command supports selection both before grouping (QRYSLT parameter) and after grouping (GRPSLT parameter).

Assume that you want to select additional customer records in which the Type field is equal to A. Because Type is a field in the record format for file FILEA and not an aggregate field, you add the QRYSLT statement to select before grouping as follows:

QRYSLT('TYPE *EQ "A" ')

Note: Fields used for selection do not have to appear in the format processed by the program.

Your program retrieves the following records.

Cust Count Amtsum Amtavg Amtmax

001 2 800.00 400.00 500.00

002 1 1200.00 1200.00 1200.00

Note: The values for CUST 001 changed because the selection took place before the grouping took place.

Assume that you want to arrange the output by the Amtavg field in descending sequence, in addition to the previous QRYSLT parameter value. You can do this by changing the KEYFLD parameter on the OPNQRYF command as:

KEYFLD((AMTAVG *DESCEND))

Your program retrieves the following records.

Cust Count Amtsum Amtavg Amtmax

002 1 1200.00 1200.00 1200.00

001 2 800.00 400.00 500.00

Final total-only processing

The Open Query File (OPNQRYF) command supports final-total-only processing. It is a special form of grouping in which you do not specify grouping fields. The output is only one record.

All of the special built-in functions for grouping can be specified. You can also specify the selection of records that make up the final total.

Example 1: Final total-only processing This example shows simple total processing.

Assume that you have a database file FILEA and decide to create file FINTOT for your final total record as follows.

FILEA FINTOT

Code Count (count of all the selected records)

Amt Totamt (total of the amount field)

Maxamt (maximum value in the amount field)

The FINTOT file is created specifically to hold the single record which is created with the final totals. You can specify:

OVRDBF FILE(FINTOT) TOFILE(FILEA) SHARE(*YES) OPNQRYF FILE(FILEA) FORMAT(FINTOT) +

MAPFLD((COUNT '%COUNT') +

(TOTAMT '%SUM(AMT)') (MAXAMT '%MAX(AMT)')) CALL PGM(PGMG) /* Created using file FINTOT as input */

CLOF OPNID(FILEA) DLTOVR FILE(FINTOT)

Example 2: Final total-only processing

This example shows total-only processing with record selection.

Assume that you want to change the previous example so that only the records where the Code field is equal to B are in the final total. You can add the QRYSLT parameter as follows:

OVRDBF FILE(FINTOT) TOFILE(FILEA) SHARE(*YES) OPNQRYF FILE(FILEA) FORMAT(FINTOT) +

QRYSLT('CODE *EQ "B" ') MAPFLD((COUNT '%COUNT') + (TOTAMT '%SUM(AMT)') (MAXAMT '%MAX(AMT)')) CALL PGM(PGMG) /* Created using file FINTOT as input */

CLOF OPNID(FILEA) DLTOVR FILE(FINTOT)

You can use the GRPSLT keyword with the final total function. The GRPSLT selection values you specify determines if you receive the final total record.

Example 3: Final total-only processing

This example shows total-only processing using a new record format.

Assume that you want to process the new file/format with a control language (CL) program. You want to read the file and send a message with the final totals. You can specify:

DCLF FILE(FINTOT)

DCL &COUNTA *CHAR LEN(7) DCL &TOTAMTA *CHAR LEN(9)

OVRDBF FILE(FINTOT) TOFILE(FILEA) SHARE(*YES)

OPNQRYF FILE(FILEA) FORMAT(FINTOT) MAPFLD((COUNT '%COUNT') + (TOTAMT '%SUM(AMT)'))

RCVFCLOF OPNID(FILEA) CHGVAR &COUNTA &COUNT CHGVAR &TOTAMTA &TOTAMT

SNDPGMMSG MSG('COUNT=' *CAT &COUNTA *CAT + ' Total amount=' *CAT &TOTAMTA);

DLTOVR FILE(FINTOT)

You must convert the numeric fields to character fields to include them in an immediate message.

Grouping field references

When the grouping function is used, all fields in the record format for the open query file (FORMAT

parameter) and all key fields (KEYFLD parameter) must be either grouping fields (specified on the GRPFLD parameter) or mapped fields (specified on the MAPFLD parameter).

Mapped fields are defined by using only grouping fields, constants, and aggregate functions.

The aggregate functions are %AVG, %COUNT, %MAX (using only one operand), %MIN (using only one operand), %STDDEV, %SUM, and %VAR. Group processing is required in the following cases:

• When you specify grouping field names on the GRPFLD parameter

• When you specify group selection values on the GRPSLT parameter

• When a mapped field that you specified on the MAPFLD parameter uses an aggregate function in its definition