Using the mapping tool, you can connect a new data source to a preexisting stream. The mapping tool not only sets up the connection but also helps you specify how fields in the new source will replace those in the existing stream. Instead of re-creating an entire data stream for a new data source, you can simply connect to an existing stream.
The data mapping tool allows you to join two stream fragments and be sure that all of the (essential) field names match up properly. In essence, mapping data simply results in the creation of a new Filter node, which matches up the appropriate fields by renaming them. There are two equivalent ways to map data:
Select replacement node. This method starts with the node to be replaced. First, right-click the node to replace; then, using the Data Mapping > Select Replacement Node option from the pop-up menu, select the node with which to replace it.
Map to. This method starts with the node to be introduced to the stream. First, right-click the node to introduce; then, using the Data Mapping > Map To option from the pop-up menu, select the node to which it should join. This method is particularly useful for mapping to a terminal node. Note: You cannot map to Merge or Append nodes. Instead, you should simply connect the stream to the Merge node in the normal manner.
Figure 5-44
Selecting data mapping options
Data mapping is tightly integrated into stream building. If you try to connect to a node that already has a connection, you will be offered the option of replacing the connection or mapping to that node.
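The renaming performed by the generated Filter node can be pictured with a small sketch. This is plain Python, not SPSS Modeler scripting; the function `apply_field_mapping` and the field names `cust_id`, `left`, `region`, and `ID` are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration only: plain Python, not the SPSS Modeler API.

def apply_field_mapping(record, mapping):
    """Rename the keys of one record according to `mapping`.

    `mapping` goes from a field name in the new source to the field name
    that the existing stream expects. Fields not listed keep their names.
    """
    return {mapping.get(name, name): value for name, value in record.items()}


# Assumed example: the new source uses "cust_id" and "left", while the
# existing stream expects "ID" and "Churn".
new_source_to_stream = {"cust_id": "ID", "left": "Churn"}
row = {"cust_id": 17, "left": "yes", "region": "north"}

mapped_row = apply_field_mapping(row, new_source_to_stream)
print(mapped_row)
```

In this sketch only the field names change; the values are passed along untouched, which is the sense in which mapping amounts to renaming.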
Mapping Data to a Template
To replace the data source for a template stream with a new source node bringing your own data into IBM® SPSS® Modeler, you should use the Select Replacement Node option from the Data Mapping pop-up menu. This option is available for all nodes except Merge, Aggregate, and all terminal nodes. Using the data mapping tool to perform this action helps ensure that fields are matched properly between the existing stream operations and the new data source. The following steps provide an overview of the data mapping process.
Step 1: Specify essential fields in the original source node. In order for stream operations to run properly, essential fields should be specified. For more information, see the topic Specifying Essential Fields on p. 94.
Step 2: Add new data source to the stream canvas. Using one of the source nodes, bring in the new replacement data.
Step 3: Replace the template source node. Using the Data Mapping option on the pop-up menu for the template source node, click Select Replacement Node, then select the source node for the replacement data.
Figure 5-45
Selecting a replacement source node
Step 4: Check mapped fields. In the dialog box that opens, check that the software is mapping fields properly from the replacement data source to the stream. Any unmapped essential fields are displayed in red. These fields are used in stream operations and must be replaced with a similar field in the new data source in order for downstream operations to function properly. For more information, see the topic Examining Mapped Fields on p. 95.
After you use the dialog box to ensure that all essential fields are properly mapped, the old data source is disconnected and the new data source is connected to the stream using a Filter node called Map. This Filter node directs the actual mapping of fields in the stream. An Unmap Filter node is also included on the stream canvas. You can add the Unmap Filter node to the stream to reverse the field name mapping. It will undo the mapped fields, but note that you will have to edit any downstream terminal nodes to reselect the fields and overlays.
Figure 5-46
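The relationship between the Map and Unmap Filter nodes can be thought of as a rename table and its inverse. The following is a minimal sketch in plain Python, not SPSS Modeler scripting; `invert_mapping`, `rename_fields`, and the field names are hypothetical, and the sketch assumes that the mapping is one-to-one.

```python
# Hypothetical illustration only: plain Python, not the SPSS Modeler API.

def invert_mapping(mapping):
    """Return the reverse rename table; the mapping must be one-to-one."""
    inverse = {new: old for old, new in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not one-to-one, so it cannot be reversed")
    return inverse


def rename_fields(field_names, mapping):
    """Rename each field that appears in `mapping`; leave the rest alone."""
    return [mapping.get(name, name) for name in field_names]


# Assumed example tables standing in for the Map and Unmap filters.
map_table = {"cust_id": "ID", "left": "Churn"}
unmap_table = invert_mapping(map_table)

source_fields = ["cust_id", "left", "region"]
mapped = rename_fields(source_fields, map_table)   # names the stream expects
restored = rename_fields(mapped, unmap_table)      # original source names
print(mapped, restored)
```

Applying the inverse table restores the original names, but, as noted above, it does not reselect fields in downstream terminal nodes for you.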
Mapping between Streams
Similar to connecting nodes, this method of data mapping does not require you to set essential fields beforehand. With this method, you simply connect from one stream to another using Map to from the Data Mapping pop-up menu. This type of data mapping is useful for mapping to terminal nodes and for copying and pasting between streams. Note: Using the Map to option, you cannot map to Merge nodes, Append nodes, or any type of source node.
Figure 5-47
Mapping a stream from its Sort node to the Type node of another stream
To Map Data between Streams
1. Right-click the node that you want to use for connecting to the new stream.
2. On the menu, click:
   Data Mapping > Map to
3. Use the cursor to select a destination node on the target stream.
4. In the dialog box that opens, ensure that fields are properly matched and click OK.
Specifying Essential Fields
When mapping to an existing stream, essential fields will typically be specified by the stream author. These essential fields indicate whether a particular field is used in downstream operations. For example, the existing stream may build a model that uses a field called Churn. In this stream, Churn is an essential field because you could not build the model without it. Likewise, fields used in manipulation nodes, such as a Derive node, are necessary to derive the new field. Explicitly setting such fields as essential helps to ensure that the proper fields in the new source node are mapped to them. If mandatory fields are not mapped, you will receive an error message. If you decide that certain manipulations or output nodes are unnecessary, you can delete the nodes from the stream and remove the appropriate fields from the Essential Fields list.
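The check on essential fields can be sketched as follows. This is plain Python rather than SPSS Modeler scripting; the function names, the field names other than Churn, and the error text are hypothetical. The sketch also assumes that a source field whose name already matches an essential field counts as mapped.

```python
# Hypothetical illustration only: plain Python, not the SPSS Modeler API.

def unmapped_essential_fields(essential, mapping, source_fields):
    """Return the essential fields that no source field provides, in order.

    `mapping` goes from source field name to the name the stream expects.
    Assumption: a source field that is not in `mapping` keeps its own name.
    """
    provided = {mapping.get(name, name) for name in source_fields}
    return [field for field in essential if field not in provided]


def check_mapping(essential, mapping, source_fields):
    """Raise an error if any essential field is left unmapped."""
    missing = unmapped_essential_fields(essential, mapping, source_fields)
    if missing:
        raise ValueError("unmapped essential fields: " + ", ".join(missing))


# Assumed example: the stream needs Churn and Age.
essential = ["Churn", "Age"]
source_fields = ["left", "years", "region"]
incomplete = {"left": "Churn"}
complete = {"left": "Churn", "years": "Age"}

print(unmapped_essential_fields(essential, incomplete, source_fields))
check_mapping(essential, complete, source_fields)  # passes silently
```

With the `incomplete` table, `check_mapping` would raise an error naming the field that still needs a counterpart in the new source.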
To Set Essential Fields
1. Right-click the source node of the template stream that will be replaced.
2. On the menu, click:
   Data Mapping > Specify Essential Fields
Figure 5-48
Specifying essential fields
3. Using the Field Chooser, you can add or remove fields from the list. To open the Field Chooser, click the icon to the right of the fields list.
Examining Mapped Fields
Once you have selected the point at which one data stream or data source will be mapped to another, a dialog box opens for you to select fields for mapping or to ensure that the system default mapping is correct. If essential fields have been set for the stream or data source and they are unmatched, these fields are displayed in red. Any unmapped fields from the data source will pass through the Filter node unaltered, but note that you can map non-essential fields as well.
Figure 5-49
Selecting fields for mapping
Original. Lists all fields in the template or existing stream, that is, all of the fields that are present further downstream. Fields from the new data source will be mapped to these fields.
Mapped. Lists the fields selected for mapping to template fields. These are the fields whose names may have to change to match the original fields used in stream operations. Click in the table cell for a field to activate a list of available fields.
If you are unsure of which fields to map, it may be useful to examine the source data closely before mapping. For example, you can use the Types tab in the source node to review a summary of the source data.
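One way to picture the Original and Mapped columns is as rows that pair each original field with the source field chosen for it. The sketch below is plain Python, not SPSS Modeler scripting; `mapping_rows` and the field names are hypothetical, and the third element of each row simply stands in for the red highlighting of unmatched essential fields.

```python
# Hypothetical illustration only: plain Python, not the SPSS Modeler API.

def mapping_rows(original_fields, essential, mapping):
    """Build (original, mapped, needs_attention) rows.

    `mapping` goes from an original field name to the source field chosen
    for it. A row needs attention when an essential field has no choice yet.
    """
    rows = []
    for original in original_fields:
        mapped = mapping.get(original)  # None when nothing is selected
        needs_attention = mapped is None and original in essential
        rows.append((original, mapped, needs_attention))
    return rows


# Assumed example: Age is essential but has no source field selected yet.
original_fields = ["ID", "Churn", "Age"]
essential = {"Churn", "Age"}
chosen = {"ID": "cust_id", "Churn": "left"}

rows = mapping_rows(original_fields, essential, chosen)
for original, mapped, needs_attention in rows:
    print(original, mapped, "CHECK" if needs_attention else "")
```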