
Working with Numbers

In document IBM SPSS Modeler 15 User s Guide (Page 124-127)

Numerous standard operations on numeric values are available in IBM® SPSS® Modeler, such as:

Calculating the sine of the specified angle: sin(NUM)

Calculating the natural log of numeric fields: log(NUM)

Calculating the sum of two numbers: NUM1 + NUM2

For more information, see the topic Numeric Functions in Chapter 8 on p. 138.

Working with Times and Dates

Time and date formats may vary depending on your data source and locale. The formats of date and time are specific to each stream and are set in the stream properties dialog box. The following examples are commonly used functions for working with date/time fields.

115 Building CLEM Expressions

Calculating Time Passed

You can easily calculate the time passed from a baseline date using a family of functions similar to the following one. This function returns the time in months from the baseline date to the date represented by the date string DATE, as a real number. This is an approximate figure, based on a month of 30.0 days.

date_in_months(Date)
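
As a rough illustration of the arithmetic only, the following Python sketch (not CLEM) reproduces the 30.0-day approximation described above. The baseline date used here is a hypothetical value picked for the example; in Modeler the baseline comes from the stream properties, and the function name is simply borrowed from the CLEM function for readability.

```python
from datetime import date

# Hypothetical baseline for illustration; Modeler reads it from the stream properties.
BASELINE = date(1900, 1, 1)


def date_in_months(d: date, baseline: date = BASELINE) -> float:
    """Approximate months elapsed from baseline to d, using 30.0-day months."""
    return (d - baseline).days / 30.0


# 2 March 1900 falls 60 days after the assumed baseline.
print(date_in_months(date(1900, 3, 2)))
```

Because every month is treated as exactly 30.0 days, the result drifts from calendar months over long spans, which is why the figure is described as approximate.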

Comparing Date/Time Values

Values of date/time fields can be compared across records using functions similar to the following one. This function returns a value of true if the date string DATE1 represents a date prior to that represented by the date string DATE2. Otherwise, this function returns a value of 0.

date_before(Date1, Date2)

Calculating Differences

You can also calculate the difference between two times and two dates using functions, such as:

date_weeks_difference(Date1, Date2)

This function returns the time in weeks from the date represented by the date string DATE1 to the date represented by the date string DATE2, as a real number. This is based on a week of 7.0 days. If DATE2 is prior to DATE1, this function returns a negative number.
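
The sign convention and the 7.0-day week can be checked with a small Python sketch (not CLEM). It assumes the inputs have already been parsed into date objects; the function name mirrors the CLEM function purely for readability.

```python
from datetime import date


def date_weeks_difference(date1: date, date2: date) -> float:
    """Weeks from date1 to date2 as a real number, based on 7.0-day weeks."""
    return (date2 - date1).days / 7.0


start = date(2024, 1, 1)
end = date(2024, 1, 15)

print(date_weeks_difference(start, end))  # 14 days forward
print(date_weeks_difference(end, start))  # same span with the arguments swapped
```

Swapping the arguments flips the sign, which matches the statement that a DATE2 earlier than DATE1 produces a negative number.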

Today’s Date

The current date can be added to the data set using the function @TODAY. Today’s date is added as a string to the specified field or new field using the date format selected in the stream properties dialog box. For more information, see the topic Date and Time Functions in Chapter 8 on p. 146.

Summarizing Multiple Fields

The CLEM language includes a number of functions that return summary statistics across multiple fields. These functions may be particularly useful in analyzing survey data, where multiple responses to a question may be stored in multiple fields. For more information, see the topic Working with Multiple-Response Data on p. 117.

Comparison Functions

You can compare values across multiple fields using the min_n and max_n functions, for example:

max_n(['card1fee' 'card2fee' 'card3fee' 'card4fee'])

You can also use a number of counting functions to obtain counts of values that meet specific criteria, even when those values are stored in multiple fields. For example, to count the number of cards that have been held for more than five years:

count_greater_than(5, ['cardtenure' 'card2tenure' 'card3tenure'])

To count null values across the same set of fields:

count_nulls(['cardtenure' 'card2tenure' 'card3tenure'])

Note that this example counts the number of cards being held, not the number of people holding them. For more information, see the topic Comparison Functions in Chapter 8 on p. 135.

To count the number of times a specified value occurs across multiple fields, you can use the count_equal function. The following example counts the number of fields in the list that contain the value Y.

count_equal("Y",[Answer1, Answer2, Answer3])

Given the following values for the fields in the list, the function returns the results for the value Y as shown.

Answer1   Answer2   Answer3   Count
Y         N         Y         2
Y         N         N         1
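
To make the per-record behavior of the counting functions concrete, here is a minimal Python sketch (not CLEM) that reproduces the counts in the table above. Each list stands for the field values of a single record. Representing nulls as None and skipping them in count_greater_than are assumptions of this sketch, and the tenure values are made up.

```python
from typing import Any, Optional, Sequence


def count_equal(value: Any, fields: Sequence[Any]) -> int:
    """Number of field values in one record that equal value."""
    return sum(1 for f in fields if f == value)


def count_greater_than(threshold: float, fields: Sequence[Optional[float]]) -> int:
    """Number of non-null field values above threshold (nulls skipped by assumption)."""
    return sum(1 for f in fields if f is not None and f > threshold)


def count_nulls(fields: Sequence[Any]) -> int:
    """Number of null (None) field values in one record."""
    return sum(1 for f in fields if f is None)


# The two rows from the table above.
print(count_equal("Y", ["Y", "N", "Y"]))  # 2
print(count_equal("Y", ["Y", "N", "N"]))  # 1

# Made-up tenure values in years for one customer; None marks a missing card.
tenures = [7.0, 2.5, None]
print(count_greater_than(5, tenures))  # 1
print(count_nulls(tenures))            # 1
```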

Numeric Functions

You can obtain statistics across multiple fields using the sum_n, mean_n, and sdev_n functions, for example:

sum_n(['card1bal' 'card2bal' 'card3bal'])

mean_n(['card1bal' 'card2bal' 'card3bal'])
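
The following Python sketch (not CLEM) illustrates that these functions work across the fields of one record, not down a column. The balances are made-up values, and null handling is deliberately left out because the text above does not describe it.

```python
from typing import Sequence


def sum_n(values: Sequence[float]) -> float:
    """Sum of the listed field values for one record."""
    return sum(values)


def mean_n(values: Sequence[float]) -> float:
    """Arithmetic mean of the listed field values for one record."""
    return sum(values) / len(values)


# One hypothetical record with three card balances.
record = {"card1bal": 120.0, "card2bal": 0.0, "card3bal": 60.0}
balances = [record[name] for name in ("card1bal", "card2bal", "card3bal")]

print(sum_n(balances))   # 180.0
print(mean_n(balances))  # 60.0
```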

For more information, see the topic Numeric Functions in Chapter 8 on p. 138.

Generating Lists of Fields

When using any of the functions that accept a list of fields as input, the special functions @FIELDS_BETWEEN(start, end) and @FIELDS_MATCHING(pattern) can be used as input. For example, assuming the order of fields is as shown in the sum_n example earlier, the following would be equivalent:

sum_n(@FIELDS_BETWEEN(card1bal, card3bal))

Alternatively, to count the number of null values across all fields beginning with “card”:

count_nulls(@FIELDS_MATCHING('card*'))
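
One way to picture what these two special functions select is the Python sketch below (not CLEM). It assumes that @FIELDS_BETWEEN includes both endpoints in stream order and that the pattern behaves like a simple wildcard match. The field list is hypothetical, and fnmatchcase is used only as a stand-in for Modeler's own matching rules.

```python
from fnmatch import fnmatchcase
from typing import List, Sequence


def fields_between(fields: Sequence[str], start: str, end: str) -> List[str]:
    """Field names from start to end inclusive, in their existing order."""
    i, j = fields.index(start), fields.index(end)
    return list(fields[i:j + 1])


def fields_matching(fields: Sequence[str], pattern: str) -> List[str]:
    """Field names that match a wildcard pattern such as 'card*'."""
    return [name for name in fields if fnmatchcase(name, pattern)]


# Hypothetical field order for one stream.
STREAM_FIELDS = ["id", "card1bal", "card2bal", "card3bal", "cardtenure"]

print(fields_between(STREAM_FIELDS, "card1bal", "card3bal"))
print(fields_matching(STREAM_FIELDS, "card*"))
```

Note that the range form depends on field order while the pattern form depends only on names, so reordering fields upstream can change what the first returns but not the second.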


Working with Multiple-Response Data

A number of comparison functions can be used to analyze multiple-response data, including:

value_at

first_index / last_index

first_non_null / last_non_null

first_non_null_index / last_non_null_index

min_index / max_index

For example, suppose a multiple-response question asked for the first, second, and third most important reasons for deciding on a particular purchase (for example, price, personal recommendation, review, local supplier, other). In this case, you might determine the importance of price by deriving the index of the field in which it was first included:

first_index("price", [Reason1 Reason2 Reason3])

Similarly, suppose you have asked customers to rank three cars in order of likelihood to purchase and coded the responses in three separate fields, as follows:

customer id   car1   car2   car3
101           1      3      2
102           3      2      1
103           2      3      1

In this case, you could determine the index of the field for the car they like most (ranked #1, or the lowest rank) using the min_index function:

min_index(['car1' 'car2' 'car3'])
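
As an illustration of the index logic, the Python sketch below (not CLEM) applies it to the rankings table above. It assumes positions are counted from 1, that 0 signals "not found" or "all values null", and that ties go to the leftmost field; treat those conventions as assumptions of this sketch and check them against the Comparison Functions reference.

```python
from typing import Any, Optional, Sequence


def first_index(item: Any, values: Sequence[Any]) -> int:
    """1-based position of the first value equal to item, or 0 if absent."""
    for position, value in enumerate(values, start=1):
        if value == item:
            return position
    return 0


def min_index(values: Sequence[Optional[float]]) -> int:
    """1-based position of the leftmost smallest non-null value, or 0 if none."""
    best = 0
    for position, value in enumerate(values, start=1):
        if value is None:
            continue
        if best == 0 or value < values[best - 1]:
            best = position
    return best


# Rankings from the table above: customer id -> [car1, car2, car3].
rankings = {101: [1, 3, 2], 102: [3, 2, 1], 103: [2, 3, 1]}
favourite = {cid: min_index(ranks) for cid, ranks in rankings.items()}
print(favourite)  # {101: 1, 102: 3, 103: 3}

# Price named as the second most important reason.
print(first_index("price", ["review", "price", "other"]))  # 2
```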

For more information, see the topic Comparison Functions in Chapter 8 on p. 135.

Referencing Multiple-Response Sets

The special @MULTI_RESPONSE_SET function can be used to reference all of the fields in a multiple-response set. For example, if the three car fields in the previous example are included in a multiple-response set named car_rankings, the following would return the same result:

min_index(@MULTI_RESPONSE_SET("car_rankings"))
