Select and Examine a Specific Model - Oracle Data Mining Hands On Lab

Using the analysis performed in the previous topic, the Decision Tree model is selected for further analysis.

Follow these steps to examine the Decision Tree model.

1. Back in the workflow pane, right-click the Class Build node again, and select View Models > CLAS_DT_1_1. (Note: The exact name of your Decision Tree model may be different.)

Result: A window opens that displays a graphical presentation of the Decision Tree.

2. The interface provides several ways to navigate the view:

• The Thumbnail tab provides a high-level view of the entire tree. For example, the Thumbnail tab shows that this tree contains five levels, although fewer of the nodes are visible in the primary display window.

• You can move the viewer box around within the Thumbnail tab to dynamically reposition your view in the primary window. You can also use the scroll bars in the primary display window to select a different location within the decision tree display.

• Finally, you can change the zoom percentage in the primary display window to increase or decrease the size of viewable content.

3. First, navigate to Node 4 and click it to select it.

Notes:

• At each level within the decision tree, an IF/THEN statement that describes a rule is displayed. As each additional level is added to the tree, another condition is added to the IF/THEN statement.

• For each node in the tree, summary information about that node is shown in the box.

• In addition, the IF/THEN rule appears in the Rule tab, as shown below, when you select a particular node.

• Commonly, a decision tree model would show many more levels, and more nodes within each level. However, the data set used for this lesson is significantly smaller than a normal data mining set, and therefore the decision tree is also small.

Notes:

• At this level, we see that the first split is based on the BANK_FUNDS attribute, and the second split is based on the CHECKING_AMOUNT attribute.

• Node 4 indicates that if BANK_FUNDS is greater than 225.5, and CHECKING_AMOUNT is less than or equal to 97, then there is a 62.29% chance that the customer will buy insurance.

• Using Data Miner 4.0, you can copy chart images from virtually anywhere in the UI and paste them into another document. For example, here we select Node 4 in the decision tree and copy it to the clipboard.

4. Next, select Node 6, at the bottom level in the tree.

Notes:

• At this bottom level in the tree, a final split is added for the MONEY_MONTHLY_OVERDRAWN attribute.

• This node indicates that if BANK_FUNDS is greater than 225.5, and CHECKING_AMOUNT is greater than 97, and MONEY_MONTHLY_OVERDRAWN is greater than 54.26, then there is a 60% chance that the customer will buy insurance.

5. Dismiss the Decision Tree display tab as shown here:

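The two IF/THEN rules quoted in the notes for Node 4 and Node 6 can be written out as nested conditions. The following Python sketch is only an illustration: the function name match_rule is hypothetical, the thresholds and probabilities are the ones stated in the notes, and each quoted rule is treated as if it were a final prediction. Branches that the lab does not describe return None rather than a guessed value.

```python
from typing import Optional, Tuple


def match_rule(bank_funds: float,
               checking_amount: float,
               money_monthly_overdrawn: float) -> Optional[Tuple[str, float]]:
    """Return (node label, probability of buying insurance) for the two
    rules quoted in the notes, or None for branches the lab does not show."""
    if bank_funds > 225.5:                      # first split: BANK_FUNDS
        if checking_amount <= 97:               # second split: CHECKING_AMOUNT
            return ("Node 4", 0.6229)
        if money_monthly_overdrawn > 54.26:     # final split: MONEY_MONTHLY_OVERDRAWN
            return ("Node 6", 0.60)
    return None                                 # rule not described in this lesson


# Hypothetical customers, used only to exercise the rules.
print(match_rule(300.0, 50.0, 0.0))
print(match_rule(300.0, 150.0, 60.0))
print(match_rule(100.0, 50.0, 0.0))
```

Notice how each deeper level adds one more condition to the rule, which mirrors what the Rule tab shows as you select nodes further down the tree.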

Apply the Model

In this topic, you apply the Decision Tree model and then create a table to display the results. You "apply" a model in order to make predictions - in this case to predict which customers are likely to buy insurance.

To apply a model, you perform the following steps:

1. First, specify the desired model (or models) in the Class Build node.

2. Second, add a new Data Source node to the workflow. (This node will serve as the "Apply" data.)

3. Third, add an Apply node to the workflow.

4. Next, connect both the Class Build node and the new Data Source node to the Apply node.

5. Finally, run the Apply node to create predictive results from the model.

Follow these steps to apply the model and display the results:

1. In the workflow, select the Class Build node. Then, using the Models section of the Properties tab, deselect all of the models except for the DT model.

To deselect a model, click the large green arrow in the model's Output column. This action adds a small red "x" to the column, indicating that the model will not be used in the next build. When you finish, the Models tab of the Property Inspector should look like this:

Note: Now, only the DT model will be passed to subsequent nodes.

2. Next, add a new Data Source node in the workflow.

Note: Even though we are using the same table as the "Apply" data source, you must still add a second data source node to the workflow.

A. From the Data category in the Components tab, drag and drop a Data Source node to the workflow canvas, as shown below. The Define Data Source wizard opens automatically.

B. In the Define Data Source wizard, select the CUST_INSUR_LTV_SAMPLE table, and then click FINISH.

Result: A new data source node appears on the workflow canvas, with the name CUST_INSUR_LTV_SAMPLE1.

3. Select the new data source node, and using the Details section of the Properties tab, change the Node Name to CUST_INSUR_LTV_APPLY, like this:

Result: The new node name is reflected in the workflow.

5. Drag and drop the Apply node to the workflow canvas, like this:

Note: The yellow exclamation mark in its border indicates that more information is required before the Apply node may be run.

6. Using the Details tab of the Property Inspector, rename the Apply node to Apply Model.

7. Using the techniques described previously, connect the Class Build node to the Apply Model node, like this:

8. Then, in the same way, connect the CUST_INSUR_LTV_APPLY node to the Apply Model node.

Notes:

• The yellow exclamation mark disappears from the Apply node border once the second link is completed.

• This indicates that the node is ready to be run.
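The readiness check described in these notes can be pictured with a small Python sketch. The ApplyNode class and its method names are hypothetical and are not part of Oracle Data Miner. The sketch only assumes what the steps above describe: an Apply node needs one link from a model node and one link from a data source node before it can run.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ApplyNode:
    """Toy stand-in for an Apply node on the workflow canvas (hypothetical)."""
    name: str
    linked_inputs: Set[str] = field(default_factory=set)

    def connect(self, input_kind: str) -> None:
        # input_kind is "model" (Class Build node) or "data" (Data Source node)
        self.linked_inputs.add(input_kind)

    def is_ready(self) -> bool:
        # Ready only when both required links exist; until then the
        # real UI shows a yellow exclamation mark on the node border.
        return {"model", "data"} <= self.linked_inputs


node = ApplyNode("Apply Model")
node.connect("model")        # link from Class Build
after_first_link = node.is_ready()
node.connect("data")         # link from CUST_INSUR_LTV_APPLY
after_second_link = node.is_ready()
print(after_first_link, after_second_link)
```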

9. Before you run the Apply Model node, consider the resulting output. By default, an apply node creates two columns of information for each customer:

• The prediction (Yes or No)

• The probability of the prediction

However, you really want to know this information for each customer, so that you can readily associate the predictive information with a given customer.

To get this information, you need to add a third column to the apply output: CUSTOMER_ID. Follow these instructions to add the customer id to the output:

A. Right-click the Apply Model node and select Edit.

Result: The Edit Apply Node window appears. Notice that the Prediction, Prediction Probability, and Prediction Cost columns are defined automatically in the Predictions tab.

B. Select the Additional Output tab, and then click the green "+" sign, like this:

C. In the Edit Output Data Column Dialog:

• Select CUSTOMER_ID in the Available Attributes list.

• Move it to the Selected Attributes list by using the shuttle control.

• Then, click OK.

Result: The CUSTOMER_ID column is added to the Additional Output tab, as shown here:

Also, notice that the default column order for output is to place the data columns first, and the prediction columns after. You can switch this order if desired.

D. Finally, click OK in the Edit Apply Node window to save your changes.
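To see why the extra column matters, the shape of the Apply output can be sketched in Python. This is only an illustration under stated assumptions: the apply_model function, the toy predictor, the customer IDs, and the input values are all hypothetical. The column names CUSTOMER_ID, CLAS_DT_1_1_PRED, and CLAS_DT_1_1_PROB follow the names used in this lesson, and the Prediction Cost column is left out because its name is not given here.

```python
def apply_model(rows, predict, extra_columns=("CUSTOMER_ID",)):
    """Build one output record per input row: the selected data columns
    first, then the prediction columns (the default order noted above)."""
    output = []
    for row in rows:
        prediction, probability = predict(row)
        record = {col: row[col] for col in extra_columns}   # data columns first
        record["CLAS_DT_1_1_PRED"] = prediction             # then the prediction
        record["CLAS_DT_1_1_PROB"] = probability            # and its probability
        output.append(record)
    return output


def toy_predict(row):
    # Placeholder predictor with made-up probabilities; NOT the real model.
    return ("Yes", 0.62) if row["BANK_FUNDS"] > 225.5 else ("No", 0.70)


# Hypothetical apply data.
apply_rows = [
    {"CUSTOMER_ID": "CU1", "BANK_FUNDS": 300.0},
    {"CUSTOMER_ID": "CU2", "BANK_FUNDS": 100.0},
]

results = apply_model(apply_rows, toy_predict)
for record in results:
    print(record)
```

Without CUSTOMER_ID in extra_columns, each record would hold only a prediction and a probability, with nothing to tie it back to a specific customer.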

10. Now, you are ready to apply the model. Right-click the Apply Model node and select Run from the menu.

Result: As before, the workflow document is automatically saved, and small green gear icons appear in each of the nodes that are being processed. In addition, the execution status is shown at the top of the workflow pane.

When the process is complete, green check mark icons are displayed in the border of all workflow nodes to indicate that the server process completed successfully.

11. Optionally, you can create a database table to store the model prediction results (the "Apply" results).

The table may be used for any number of reasons. For example, an application could read the predictions from that table, and suggest an appropriate response, like sending the customer a letter, offering the customer a discount, or some other appropriate action.
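As a rough illustration of such an application, the following Python sketch maps a stored prediction to a suggested response. Everything in it is an assumption made for the example: the function name suggest_action, the 0.6 probability cutoff, and the action labels are hypothetical and would be chosen by the business, not by Oracle Data Miner.

```python
def suggest_action(prediction: str, probability: float) -> str:
    """Pick a follow-up action for one row of prediction results.
    The 0.6 cutoff and the action names are illustrative assumptions."""
    if prediction == "Yes" and probability >= 0.6:
        return "offer discount"      # likely buyer, stronger prediction
    if prediction == "Yes":
        return "send letter"         # likely buyer, weaker prediction
    return "no action"               # predicted not to buy


# Hypothetical rows as (customer id, prediction, probability).
for customer_id, pred, prob in [("CU1", "Yes", 0.62), ("CU2", "Yes", 0.55), ("CU3", "No", 0.80)]:
    print(customer_id, suggest_action(pred, prob))
```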

To create a table of model prediction results, perform the following:

A. Using the Data category in the Components pane, drag the Create Table or View node to the workflow canvas, like this:

Result: An OUTPUT node is created.

C. To specify a name for the table that will be created (otherwise, Data Miner will create a default name), do the following:

1. Right-click the OUTPUT node and select Edit from the menu.

2. In the Edit Create Table or View window, change the default table name to DT_PREDICTIONS, as shown here:

3. Then, click OK.

Result: The workflow document is automatically saved when the process is executed. When complete, all nodes contain a green check mark in the border, like this:

Note: After you run the OUTPUT node (DT_PREDICTIONS), the table is created in your schema.

12. To view the results:

A. Right-click the DT_PREDICTIONS Table node and select View Data from the menu.

Result: A new tab opens with the contents of the table:

• The table contains four columns: three for the prediction data, and one for the customer ID.

• You can sort the table results on any of the columns using the Sort button, as shown here.

• In this case, the table will be sorted using:

    • First: the predicted outcome (CLAS_DT_1_1_PRED), in Descending order (meaning that the prediction of "Yes" for buying insurance is first).

    • Second: the prediction probability (CLAS_DT_1_1_PROB), in Descending order (meaning that the highest prediction probabilities are at the top of the table display).

B. Click Apply Sort to view the results:

Notes:

• Each time you run an Apply node, Oracle Data Miner takes a different sample of the data to display. With each Apply, both the data and the order in which it is displayed may change. Therefore, the sample in your table may be different from the sample shown here. This is particularly evident when only a small pool of data is available, which is the case in the schema for this lesson.

• You can also filter the table by entering a Where clause in the Filter box.

• The table contents can be displayed using any Oracle application or tool, such as Oracle Application Express, Oracle BI Answers, Oracle BI Dashboards, and so on.

C. When you are done viewing the results, dismiss the tab for the DT_PREDICTIONS table, and click Save All.
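The sort and filter used in step 12 correspond to an ORDER BY and a WHERE clause over the DT_PREDICTIONS table. The Python sketch below shows the idea with an in-memory SQLite table standing in for your Oracle schema, so it is an approximation rather than the query Data Miner actually issues. The rows and customer IDs are made up, the 0.61 filter value is arbitrary, and the Prediction Cost column is omitted because its name is not given in this lesson.

```python
import sqlite3

# In-memory SQLite table used as a stand-in for DT_PREDICTIONS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DT_PREDICTIONS ("
    "CUSTOMER_ID TEXT, CLAS_DT_1_1_PRED TEXT, CLAS_DT_1_1_PROB REAL)"
)
conn.executemany(
    "INSERT INTO DT_PREDICTIONS VALUES (?, ?, ?)",
    [("CU1", "No", 0.91), ("CU2", "Yes", 0.60),
     ("CU3", "Yes", 0.6229), ("CU4", "No", 0.55)],   # hypothetical rows
)

# Sort: prediction descending ("Yes" before "No"), then probability descending.
sorted_rows = conn.execute(
    "SELECT CUSTOMER_ID, CLAS_DT_1_1_PRED, CLAS_DT_1_1_PROB "
    "FROM DT_PREDICTIONS "
    "ORDER BY CLAS_DT_1_1_PRED DESC, CLAS_DT_1_1_PROB DESC"
).fetchall()

# Filter: the kind of condition you might type into the Filter box.
filtered_ids = conn.execute(
    "SELECT CUSTOMER_ID FROM DT_PREDICTIONS "
    "WHERE CLAS_DT_1_1_PRED = 'Yes' AND CLAS_DT_1_1_PROB > 0.61"
).fetchall()

for row in sorted_rows:
    print(row)
print(filtered_ids)
conn.close()
```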

Summary

In this lesson, you examined and solved a "Classification" prediction data mining business problem by using the Oracle Data Miner graphical user interface, which is included as an extension to SQL Developer, version 4.0.

In this tutorial, you have learned how to:

• Identify Data Miner interface components

• Create a Data Miner project

• Build a Workflow document that uses Classification models to predict customer behavior
