• No results found

5 Transforming Data

5.19 Output Type

Output Type is used to set parameters for use in the output tables when the flowgraph is activated.

Before using the Output Type node, you must have an existing table or table type created. You can create an output table type in application function modeler within SAP HANA Studio.

Note

If you use an Input Type node and select real-time processing, you must use the Output Type node.

General Properties

Option Description

Kind Table: Cannot be changed.

Related Information

Input Type [page 123]

5.20 Pivot

Creates a row of data from existing rows.

Use this node to combine data from several rows into one row by creating new columns. A pivot table can help summarize the data by placing it in an output data set. For each unique value in a pivot axis column, it produces a column in the output data set.

Note

The Pivot node is not available for real-time processing.

Table 100: Parameters

Parameter Description

Parameter Description

Display Name The name shown in the Palette pane.

Note

This option can only be changed when creating a template. It cannot be changed when us­ ing the node outside of a template.

Description (Optional.) Provides a comment about the operation. For example, "Employee Expenses."

Non-Pivot Attributes The list of columns from the source table that will appear in the target table without modifica­ tion.

Pivoted Attributes Set of columns whose values will be pivoted from rows into columns.

Default Value The value stored when a rotated column has no corresponding data. The default is "null" if you do not enter a value. Do not enter a blank.

Axis Attribute The column that determines which new columns are needed in the output table. At run time, a new column is created for each Pivoted Attribute and each unique axis value in the Axis Attrib­ ute.

Duplicate Value The action taken when a collision occurs. A collision occurs when there is more than one row with the same key and value in the Axis Attribute. In this case, you can store the first row or you can abort the transform process.

Axis value The value of the Axis Attribute column that represents a particular set of output columns. A set of pivoted columns is generated for each Axis value. There should be one Axis value for each unique value in the Pivot Attribute.

Attribute Prefix Text added to the front of the Pivot Attributes when creating new column names for the ro­ tated data. An underscore is added to separate the prefix name from the pivoted column name.

Mappings

The mappings tab shows how the input column names are mapped to output column names. If you have a large table, you can use Filter pattern to search for specific columns. See "Using the Mapping Editor" topic in the SAP HANA Developer Guide.

Input data

Select the input data General tab by clicking Input_<n>.

Table 101: General

Option Description

Name The name of the input source. You can rename this source.

Table 102: Signature

Option Description

Name The column name in the input source. This can be named the same as the output from the previous node.

Type The type of data contained in the column, for example, Nvarchar, Decimal, Date, and so on.

Length The number of characters allowed in the column.

Scale The number of digits to the right of the decimal point. This is used when the data type is a deci­ mal.

Nullable Indicates whether the column can be blank.

Use the Add, Remove, Up and Down buttons to edit the input fields accordingly.

Table 103: Fixed Content

Option Description

Fixed Content Enable to have the input table of the node saved with the flowgraph file. Otherwise, it is placed in a separate table connected to the node. For more information, see the SAP HANA Developer

Guide topic "Flowgraphs".

Output data

Only one data target is allowed.

Select the output data General tab by clicking Output_<n>.

Table 104: General

Option Description

Name The name of the output target. You can rename this target.

Kind Identifies the type of output target. Table 105: Signature

Option Description

Name The column name in the input source. This can be named the same as the output from the previous node.

Type The type of data contained in the column, for example, Nvarchar, Decimal, Date, and so on.

Length The number of characters allowed in the column.

Scale The number of digits to the right of the decimal point. This is used when the data type is a deci­ mal.

Nullable Indicates whether the column can be null.

Annotations

Create comments for users. For example, you might want to make a note of some particular settings in this flowgraph so that the administrator can schedule or understand certain customizations. The annotations are written to a table. See the "Application Function Modeler" section of the SAP HANA Developer Guide.

All

Shows all of the options in one screen. It includes, General, Mappings, and Annotations.

Example

Suppose that you have employee contact information, and you need to identify those records with missing data.

Employee_ID Contact_Type Contact_Name Contact_Address Phone

2178 emergency Shane McMillian 404 Walnut St. 555-1212 2178 home Bradly Smith 2168 Park Ave. S. 555-8168 2178 work Janet Garcia 801 Wall St. 555-7287 7532 emergency Adam Ellis 7518 Windmill Rd. 555-2165 7532 home Sarah La Fonde 2265 28th St. SW 555-1010 1298 work Ravi Rahim 801 Wall St. 555-7293

Because there are several rows for each employee, finding missing information may be difficult. Use the Pivot node to rearrange the data into a more searchable form without losing any category information. Set the properties as follows.

Option Value Notes

Non-pivot attributes Employee_ID Choose Employee_ID as a column that will not be pivoted. In this case, this ensures that this field is output in a single row. Pivoted attributes Contact_Name

Contact_Phone

Select these two fields so the names and numbers of the con­ tacts are output into a single row for each employee. Default value Null Enter "Null" so that you can identify those areas that are

empty.

Axis attribute Contact_Type Shows the order of the pivot.

Duplicate value First Row If a duplicate is found during processing, only the first record will be output, and processing continues. Choosing Abort causes the processing to fail.

Option Value Notes

Axis value emergency home work

This moves that data into additional columns. These are the values in the Contact_Type column in the source table.

Column prefix Emergency Home Work

This enters a prefix to the column headings. In this case, the column names will be:

● Emergency_Contact_Name ● Emergency_Phone ● Home_Contact_Name ● Home_Phone ● Work_Contact_Name ● Work_Phone

The output data set includes the Employee_ID (not pivoted) and the Contact_Name and Phone fields for each pivot Axis Value (emergency, home, and work).In cases where the data is empty in the source, the Pivot node stores a null value.

The result is a single row for each employee, which you can use to search for missing contact information.

Em­ ployee_ID Emergency_Con­ tact_Name Emer­ gency_Phone Home_Con­ tact_Name Home_Phone Work_Con­ tact_Name Work_Phone

2178 Shane McMillian 555-1212 Bradly Smith 555-8168 Janet Garcia 555-7287 7532 Adam Ellis 555-2165 Sarah La Fonde 555-1010 Null Null 1298 Null Null Null Null Ravi Rahim 555-7293

5.21 Procedure

Use procedures from the catalog in the flowgraph.

Prerequisites

You have added a Procedure node to the flowgraph.

Note

Procedure

1. Select the Procedure node.

2. This step applies only if you added the Procedure node from the Node Palette. In the General tab of the Properties view, select the drop-down menus for the Schema and the Procedure that is represented by the node.

Note

To activate the flowgraph, the database user _SYS_REPO needs EXECUTE object privileges for all procedures represented by Procedure nodes.

5.21.1 Procedure options

Description of options for the Procedure node.

Option Description

Name The name for the node. Display Name

Note

AFM only.

The name shown in the Palette pane.

Note

This option can only be changed when creating a template. It cannot be changed when using the node outside of a template.

Description

Note

AFM only.

(Optional.) Provides a comment about the node. For example, "Run schedule." Schema The location and definition of the procedure.

5.22 R-Script

Use the R-Script node for developing and analyzing statistical data.

R is an open-source programming language and software environment for statistical computing and graphics. The R code is embedded in SAP HANA SQL code in the form of a RLANG procedure. You can embed R-function definitions and calls within SQL Script and submit the code as part of a query to the database.

Note

The R-Script node is not available for real-time processing.

Related Information

SAP HANA R Integration Guide (PDF) SAP HANA R Integration Guide (HTML)

5.23 Row Generation

Creates a table column that contains a row ID.

The Row Generation operation by itself creates only one column that contains a row ID. You would typically follow it with a Query operation, with which you can add other columns or join with other tables. Then you can follow it with other operations such as join.

Note

The Row Generation node is available for real-time processing.

Table 106: Parameters

Parameter Description

Name The name of the output target for the row generation operation.

Display name The name shown in the Palette pane.

Note

This option can only be changed when creating a template. It cannot be changed when us­ ing the node outside of a template.

Description (Optional.) Provides a comment about the operation. For example, "Create a dataset with 5000 rows."

Parameter Description

Row Number Start Defines the integer value at which row numbering starts.

Row Count Defines the number of rows to add.

Mappings

The mappings tab shows how the input column names are mapped to output column names. If you have a large table, you can use Filter pattern to search for specific columns. See "Using the Mapping Editor" topic in the SAP HANA Developer Guide.

Output data

Only one data target is allowed.

Select the output data General tab by clicking Output_<n>.

Table 107: General

Option Description

Name The name of the output source. You can rename this source.

Kind Identifies the type of output target. Table 108: Signature

Option Description

Name

Type The type of data contained in the column, for example, Nvarchar, Decimal, Date, and so on. The column name in the input source. This can be named the same as the output from the previous node.

Length The number of characters allowed in the column.

Scale The number of digits to the right of the decimal point. This is used when the data type is a deci­ mal.

Nullable Indicates whether the column can be null.

Use the Add, Remove column name in the input source. This can be named the, Up and Down buttons to edit the input fields accordingly.

Annotations

Create comments for users. For example, you might want to make a note of some particular settings in this flowgraph so that the administrator can schedule or understand certain customizations. The annotations are written to a table. See the "Application Function Modeler" section of the SAP HANA Developer Guide.

All

Shows all of the options in one screen. It includes General, Mappings, and Annotations.

5.24 Sort

A Sort node represents a relational sort operation.

Prerequisites

You have added a Sort node to the flowgraph.

Context

The Sort node performs a sort by one or more attributes of the input.

Note

The Sort node is available for real-time processing.

Procedure

1. Select the Sort node.

2. In the Properties View, select the General tab to configure the sort order.

3. In the General tab, use the Table Editor to define the Attributes and the Sort Order by which the input is sorted. It is possible to specify several Attributes with descending priority.

5.24.1 Sort Options

Description of options for the Sort node.

Option Description

Name The name for the node.

Display Name

Note

AFM only.

The name shown in the Palette pane.

Note

This option can only be changed when creating a template. It cannot be changed when using the node outside of a template.

Description

Note

AFM only.

(Optional.) Provides a comment about the node. For example, "Sort as­ cending sales order."

Column/Attribute The column used for sorting.

Sort Type/Sort Order How to sort the data.

Ascending: When sorting numerical data, put the smallest number first. When sorting alphabetically, start with the first letter.

Descending: When sorting numerical data, put the largest number first. When sorting alphabetically, start with the last letter.

Add A row is configured to be used for sorting.

Remove The highlighted entry is deleted, so that it will not be used in sorting.

Up The entry is moved up so that it is sorted before any entries below it.

Down The entry is moved down so that it is sorted after any entries above it.

5.25 Table Comparison

Compares two tables and produces the difference between them as a dataset with rows flagged as INSERT, UPDATE, or DELETE.

The table comparison operation compares two datasets and produces the difference between them as a data set with rows flagged as INSERT, UPDATE, or DELETE. The operation generates an Op_Code to identify records to be inserted, deleted, or updated to synchronize the comparison table with the input table.

Note

The input to the Table Comparison node cannot contain any LOBs, text, or shorttext attributes, even if they are not in the list of attributes being compared.

Note

The Table Comparison node is available for real-time processing.

Table 109: Parameters

Parameter Description

Name The name for the output target for the table comparison node.

Display Name The name shown in the Palette pane.

Note

This option can only be changed when creating a template. It cannot be changed when us­ ing the node outside of a template.

Description (Optional.) Provides a comment about the operation. For example, "Compare Southwest re­ gion sales and Northeast region sales."

Comparison Table The table for comparison.

Generated Key Attribute

Provides the compare attributes and primary key.

Filter Condition Provides the filter condition to limit the comparison table, if any.

Deleted Rows Detection

Specifies whether the input table is considered a complete dataset and records in the com­ pare table are to be tagged for deletion if they do not exist in the input.

True: The input table is considered a complete dataset and records in the compare table are tagged for deletion if they do not exist in the input.

False: The input table is not considered a complete dataset and records in the compare table are not tagged for deletion.

More Than One Deleted Row with Same Key Value

If you choose to detect deleted rows, you can choose to detect all rows or detect the row with the largest key value.

Compare Attributes Provides the name and primary key of the compare attributes. You can Add or Remove attrib­ utes, or move an attribute Up or Down in the list.

Mappings

The mappings tab shows how the input column names are mapped to output column names. If you have a large table, you can use Filter pattern to search for specific columns. See "Using the Mapping Editor" topic in the SAP HANA Developer Guide.

Input data

Select the input data General tab by clicking Input_<n>.

Table 110: General

Option Description

Name The name of the input source. You can rename this source.

Kind Identifies the type of input source, For example, table, column, scalar. Table 111: Signature

Option Description

Name The column name in the input source. This can be named the same as the output from the previous node.

Type The type of data contained in the column, for example, Nvarchar, Decimal, Date, and so on.

Length The number of characters allowed in the column.

Scale The number of digits to the right of the decimal point. This is used when the data type is a deci­ mal.

Nullable Indicates whether the column can be blank.

Use the Add, Remove, Up and Down buttons to edit the input fields accordingly.

Table 112: Fixed Content

Option Description

Fixed Content Enable to have the input table of the node saved with the flowgraph file. Otherwise, it is placed in a separate table connected to the node. For more information, see the SAP HANA Developer

Guide topic "Flowgraphs".

Output data

Only one data target is allowed.

Select the output data General tab by clicking Output_<n>.

Table 113: General

Option Description

Name The name of the output source. You can rename this source.

Kind Identifies the type of output target. Table 114: Signature

Option Description

Name The column name in the input source. This can be named the same as the output from the previous node.

Option Description

Type The type of data contained in the column, for example, Nvarchar, Decimal, Date, and so on.

Length The number of characters allowed in the column.

Scale The number of digits to the right of the decimal point. This is used when the data type is a deci­ mal.

Nullable Indicates whether the column can be null.

Use the Add, Remove, Up and Down buttons to edit the input fields accordingly.

Annotations

Create comments for users. For example, you might want to make a note of some particular settings in this flowgraph so that the administrator can schedule or understand certain customizations. The annotations are written to a table. See the "Application Function Modeler" section of the SAP HANA Developer Guide.

All

Shows all of the options in one screen. It includes, General, Mappings, and Annotations.

5.26 Template File

The Template File is similar to a Data Sink node, and is used when you have a file that was converted using the SAP HANA smart data integration file adapter.

Prerequisites

You have added the Template File node to the flowgraph.

Context

The smart data integration file adapter is preinstalled with the SDI Agent. This file adapter converts any fixed-

Related documents