SAS_DI_Developer_-_Fast_Track.pdf
Full text
(2) SAS® Data Integration Studio: Fast Track Course Notes was developed by Linda Jolley, Kari Richardson, Eric Rossland, and Christine Vitron. Editing and production support was provided by the Curriculum Development and Support Department. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. SAS® Data Integration Studio: Fast Track Course Notes Copyright © 2009 SAS Institute Inc. Cary, NC, USA. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc. Book code E1477, course code DIFT, prepared date 31Jul2009.. DIFT_001. ISBN 978-1-60764-048-6.
(3) For Your Information. Table of Contents Course Description .................................................................................................................... viii Prerequisites ................................................................................................................................ ix Chapter 1. Introduction .......................................................................................... 1-1. 1.1. Exploring the Platform for SAS Business Analytics ....................................................... 1-3. 1.2. Introduction to Data Integration Applications ............................................................... 1-12 Demonstration: Using SAS Data Integration Studio............................................... 1-20 Demonstration: Using dfPower Explorer and dfPower Profile ............................... 1-55. 1.3. Introduction to Change Management............................................................................. 1-81 Demonstration: Exploring the Basics of Change Management .............................. 1-85. Chapter 2. Introduction to Course Data and Course Scenario ........................... 2-1. 2.1. Introduction to Classroom Environment and Course Data .............................................. 2-3. 2.2. Course Tasks .................................................................................................................... 2-8 Exercises.................................................................................................................. 2-14. 2.3. Solutions to Exercises .................................................................................................... 2-26. Chapter 3 3.1. Creating Metadata for Source Data ..................................................... 3-1. Setting Up the Environment............................................................................................. 3-3 Demonstration: Defining Custom Folders ................................................................ 3-5 Demonstration: Defining Metadata for a SAS Library ........................................... 3-10 Exercises.................................................................................................................. 3-17. 3.2. Registering Source Data Metadata ................................................................................. 3-18 Demonstration: Registering Metadata for SAS Source Tables ............................... 3-20 Demonstration: Registering Metadata for ODBC Data Sources ............................. 3-36 Demonstration: Registering Metadata for External Files ........................................ 3-65. iii.
(4) iv. For Your Information. Exercises.................................................................................................................. 3-82 3.3. Solutions to Exercises .................................................................................................... 3-84. Chapter 4 4.1. Creating Metadata for Target Data ...................................................... 4-1. Registering Target Data Metadata .................................................................................... 4-3 Demonstration: Defining the Product Dimension Table ........................................... 4-6 Exercises.................................................................................................................. 4-22. 4.2. Importing Metadata ........................................................................................................ 4-23 Demonstration: Importing Relational Metadata ...................................................... 4-26. 4.3. Solutions to Exercises .................................................................................................... 4-40. Chapter 5 5.1. Creating Metadata for Jobs ................................................................. 5-1. Introduction to Jobs and the Job Editor ........................................................................... 5-3 Demonstration: Populating Current and Terminated Staff Tables ............................. 5-6. 5.2. Using the SQL Join Transformation .............................................................................. 5-33 Demonstration: Populating the Product Dimension Table ............................ 5-41 Exercises.................................................................................................................. 5-68. 5.3. Working with Jobs ......................................................................................................... 5-69 Demonstration: Investigate Mapping and Propagation Functionality..................... 5-72 Demonstration: Chaining Jobs ................................................................................ 5-99 Demonstration: Investigating Performance Statistics ........................................... 5-105 Demonstration: Using the Reports Window ......................................................... 5-120. 5.4. Solutions to Exercises .................................................................................................. 5-129. Chapter 6 6.1. Orion Star Case Study.......................................................................... 6-1. Exercises .......................................................................................................................... 6-3 Define and Load the Customer Dimension Table...................................................... 6-6 Define and Load the Organization Dimension Table ................................................ 6-8 Define and Load the Time Dimension Table ........................................................... 6-11.
(5) For Your Information. 6.2. Solutions to Exercises .................................................................................................... 6-13 Define and Load the Customer Dimension Table.................................................... 6-13 Define and Load the Organization Dimension Table .............................................. 6-19 Define and Load the Time Dimension Table ........................................................... 6-29. Chapter 7 7.1. Working with Transformations ............................................................ 7-1. Introduction ...................................................................................................................... 7-3 Demonstration: Create Orion Reports Subfolders .................................................... 7-5. 7.2. Using Extract, Summary Statistics, and Loop Transformations ...................................... 7-7 Demonstration: Using the Extract and Summary Statistics Transformation ............. 7-9 Demonstration: Using the Loop Transformations ................................................... 7-34 Exercises.................................................................................................................. 7-54. 7.3. Establishing Status Handling ......................................................................................... 7-57 Demonstration: Working with Transformation and Job Status Handling ............... 7-61 Demonstration: Using the Return Code Check Transformation ............................. 7-71. 7.4. Using the Data Validation Transformation..................................................................... 7-77 Demonstration: Using the Data Validation Transformation .................................... 7-79 Exercises.................................................................................................................. 7-93. 7.5. Using Transpose, Sort, Append, and Rank Transformations ......................................... 7-95 Demonstration: Using the Transpose, Sort, Append, and Rank Transformations............................................................................. 7-99. 7.6. Basic Standardization with the Apply Lookup Standardization Transformation ......... 7-123 Demonstration: Using the Apply Lookup Standardization Transformation .......... 7-125 Exercises................................................................................................................ 7-138. 7.7. Solutions to Exercises .................................................................................................. 7-140. Chapter 8. Working with Tables and the Table Loader Transformation ............. 8-1. 8.1. Basics of the Table Loader Transformation ..................................................................... 8-3. 8.2. Load Styles of the Table Loader Transformation ............................................................. 8-8. v.
(6) vi. 8.3. For Your Information. Table Properties and Load Techniques of the Table Loader Transformation ................. 8-14. Chapter 9. Working with Slowly Changing Dimensions...................................... 9-1. 9.1. Defining Slowly Changing Dimensions .......................................................................... 9-3. 9.2. Using the SCD Type 2 Loader and Lookup Transformations ........................................ 9-15 Demonstration: Populate Star Schema Tables Using the SCD Type 2 Loader with the Surrogate Key Method .................................................... 9-29. 9.3. Introducing the Change Data Capture Transformations ................................................ 9-93. Chapter 10. Defining Generated Transformations ............................................... 10-1. 10.1 SAS Code Transformations............................................................................................ 10-3 10.2 Using the New Transformation Wizard ......................................................................... 10-8 Demonstration: Creating a Generated Transformation ......................................... 10-11 Exercises................................................................................................................ 10-35 10.3 Solutions to Exercises .................................................................................................. 10-40 Chapter 11. Implementing Data Quality Techniques (Self-Study) ...................... 11-1. 11.1 SAS and Data Quality .................................................................................................... 11-3 11.2 Working with the DataFlux IS Transformations .......................................................... 11-10 Demonstration: Confirming DataFlux Integration Server Is Running .................. 11-15 Demonstration: Configuring the DataFlux Integration Server Manager............... 11-18 Demonstration: Creating Jobs for Execution on DataFlux Integration Server ..... 11-20 Demonstration: Creating a Job to be Used as a Service ........................................ 11-63 Demonstration: Uploading Jobs and Services to DataFlux Integration Server ..... 11-69 Demonstration: Registering DataFlux Integration Server in SAS Management Console .................................................................. 11-76 Demonstration: Using the DataFlux IS Job Transformation ................................. 11-87 Demonstration: Using the DataFlux IS Service Transformation......................... 11-101 Exercises.............................................................................................................. 11-120.
(7) For Your Information. Chapter 12. Deploying Jobs ................................................................................... 12-1. 12.1 Overview of Deploying Jobs ......................................................................................... 12-3 12.2 Deploying Jobs for Scheduling ...................................................................................... 12-5 Demonstration: Scheduling Orion Jobs ................................................................ 12-13 12.3 Deploying Jobs as Stored Processes ............................................................................ 12-38 Demonstration: Creating SAS Stored Processes from Report Jobs ...................... 12-43 Chapter 13. Learning More ..................................................................................... 13-1. 13.1 SAS Resources ............................................................................................................... 13-3. vii.
(8) viii. For Your Information. Course Description This intensive training course provides accelerated learning for those students who will register sources and targets; create and deploy jobs; work with transformations; set up change management; work with slowly changing dimensions; and understand status handling and change data capture. This course is for individuals who are comfortable with learning large amounts of information in a short period of time. The &di1 and &di2 courses are available to provide the same type of information in a much more detailed approach over a longer period of time.. To learn more… For information on other courses in the curriculum, contact the SAS Education Division at 1-800-333-7660, or send e-mail to [email protected]. You can also find this information on the Web at support.sas.com/training/ as well as in the Training Course Catalog.. For a list of other SAS books that relate to the topics covered in this Course Notes, USA customers can contact our SAS Publishing Department at 1-800-727-3228 or send e-mail to [email protected]. Customers outside the USA, please contact your local SAS office. Also, see the Publications Catalog on the Web at support.sas.com/pubs for a complete list of books and a convenient order form..
(9) For Your Information. ix. Prerequisites Experience with SAS programming, SQL processing, and the SAS macro facility is required. This experience can be gained by completing the SAS® Programming 1: Esstentials, SAS® SQL 1: Essentials, and SAS® Macro Language 1: Essentials courses..
(10) x. For Your Information.
(11) Chapter 1 Introduction 1.1. Exploring the Platform for SAS Business Analytics ................................................... 1-3. 1.2. Introduction to Data Integration Applications............................................................ 1-12 Demonstration: Using SAS Data Integration Studio ............................................................ 1-20 Demonstration: Using dfPower Explorer and dfPower Profile ............................................. 1-55. 1.3. Introduction to Change Management ......................................................................... 1-81 Demonstration: Exploring the Basics of Change Management ........................................... 1-85.
(12) 1-2. Chapter 1 Introduction.
(13) 1.1 Exploring the Platform for SAS Business Analytics. 1.1 Exploring the Platform for SAS Business Analytics. Objectives . . Compare the two types of SAS installations. Define the architecture of the platform for SAS Business Analytics. Describe the SAS platform applications used for data integration, reporting, and analysis.. 3. Two “Flavors” of SAS With SAS®9 there are two different types of SAS installations. SAS The traditional SAS installation, which Foundation enables you to write SAS programs or use a point-and-click application such as SAS Enterprise Guide to assist with program creation Platform for SAS Business Analytics. 4. Enterprise software that utilizes multiple machines throughout the organization and consists of applications that help you accomplish the various tasks for accessing and creating information, as well as creating analysis and reporting. 1-3.
(14) 1-4. Chapter 1 Introduction. SAS Windowing Environment The SAS windowing environment can be used to develop and run SAS programs.. 5. SAS Enterprise Guide SAS programs can also be developed using the point-and-click interface in SAS Enterprise Guide.. 6.
(15) 1.1 Exploring the Platform for SAS Business Analytics. Platform for SAS Business Analytics The platform for SAS Business Analytics is enterprise software with components that exist on multiple machines throughout the organization.. 7. . The platform for SAS Business Analytics is also known as the SAS Enterprise Intelligence Platform and the SAS Intelligence Platform.. SAS Platform Architecture The platform for SAS Business Analytics consists of a multiple-tier environment that is typically represented by the following: client tier middle tier server tier data tier. 8. 1-5.
(16) 1-6. Chapter 1 Introduction. SAS Platform: Client Tier In the most basic terms, if an application is installed on the machine where the user is sitting, that machine is part of the client tier.. . SAS platform applications cannot execute SAS code on their own. They must request code submission and other services from a SAS server.. SAS Platform: Middle Tier The middle tier is where the Web applications reside and execute. The middle tier also contains the infrastructure that supports the execution of the Web browser applications, including a Java Servlet Container (or Web Application Server), the Java Software Development Kit (SDK), the SAS Content Server, and the SAS Services Application.. SAS Platform: Server Tier The server tier consists of one or more machines where the SAS servers are installed and accessed by the SAS platform applications. There are different types of SAS servers, including the metadata server, the workspace server, the stored process server, and the OLAP server.. SAS Platform: Data Tier The data tier contains the enterprise data sources, which might be in one or more of the following formats: • SAS data sets • RDBMS (relational database management systems) tables • OLAP (Online Analytical Processing) cubes • SAS SPD Server (Scalable Performance Data Server) files • ERP (Enterprise Resource Planning) data structures. SAS Platform: Metadata The SAS platform utilizes the metadata server and metadata repositories to manage information about the entire environment, including server definitions, data definitions, users and groups, security settings, and business intelligence content..
(17) 1.1 Exploring the Platform for SAS Business Analytics. SAS Platform Applications Instead of having one large client application that does everything for all people across the organization, there are several applications to accomplish these tasks. The SAS platform applications were created to organize the functions of various job roles into the different applications. Some of these applications are installed on each user’s machine; others are accessed using a Web browser.. 9. SAS Platform Applications The SAS platform applications include the following: SAS Information Map SAS Add-In for Studio Microsoft Office SAS Management Console SAS AppDev Studio Eclipse Plug-Ins SAS OLAP Cube Studio SAS BI Dashboard SAS Visual BI (JMP) SAS Data Integration SAS Web OLAP Viewer Studio SAS Web Report Studio SAS Enterprise Guide dfPower Studio SAS Information Delivery Portal. 10. SAS Add-In for Microsoft Office. The SAS Add-In for Microsoft Office enables business users to transparently leverage the power of SAS analytics, reporting, and data access directly from Microsoft Office via integrated menus and toolbars.. SAS AppDev Studio Eclipse Plug-Ins. SAS AppDev Studio is a comprehensive, stand-alone development environment for developing, deploying, and maintaining a wide variety of applications that leverage the full power of SAS.. 1-7.
(18) 1-8. Chapter 1 Introduction. SAS BI Dashboard. SAS BI Dashboard is a point-and-click dashboard development application that enables the creation of dashboards from a variety of data sources to surface information visually.. SAS Data Integration Studio. SAS Data Integration Studio enables a data warehouse developer to create and manage metadata objects that define sources, targets, and the sequence of steps for the extraction, transformation, and loading of data.. SAS Enterprise Guide. SAS Enterprise Guide provides a guided mechanism to exploit the power of SAS and publish dynamic results throughout the organization. SAS Enterprise Guide can also be used for traditional SAS programming.. SAS Information Delivery Portal. The SAS Information Delivery Portal is a Web application that can surface the different types of business analytic content such as information maps, stored processes, and reports.. SAS Information Map Studio. SAS Information Map Studio is used to build information maps, which shield business users from the complexities of the underlying data by organizing and referencing data in business terms.. SAS Management Console. SAS Management Console provides a single interface for managing the metadata of the SAS Platform. Specific administrative tasks are supported by plug-ins to the SAS Management Console.. SAS OLAP Cube Studio. SAS OLAP Cube Studio is used to create OLAP cubes, which are multidimensional structures of summarized data. The Cube Designer provides a point-and-click interface for cube creation.. SAS Visual BI (JMP). SAS Visual BI, powered by JMP software, provides dynamic business visualization, enabling business users to interactively explore ideas and information, investigate patterns, and discover previously hidden facts through visual queries.. SAS Web OLAP Viewer. The SAS Web OLAP Viewer provides a Web interface for viewing and exploring OLAP data. It enables business users to look at data from multiple angles, view increasing levels of detail, and add linked graphs.. SAS Web Report Studio. SAS Web Report Studio provides intuitive and efficient access to query and reporting capabilities on the Web.. dfPower Studio. dfPower Studio from DataFlux (a SAS company) combines advanced dataprofiling capabilities with proven data quality, integration, and augmentation tools for incorporating data quality into a data collection and management process..
(19) 1.1 Exploring the Platform for SAS Business Analytics. SAS Folder Structure SAS applications use a hierarchy of SAS folders to store metadata including the following: Channels. Cubes. Dashboards. Data explorations. Folders. Information maps. Jobs. Libraries. OLAP schema. Prompts. Reports. Stored processes. Tables. Table columns. 11. SAS Folders Tree The folders are arranged in a structure that: segregates system information from business information provides personal folders for individual users provides an area for shared data. 12. 1-9.
(20) 1-10. Chapter 1 Introduction. The initial folder structure includes the following main components: SAS Folders My Folder ( Products. is the root folder for the folder structure. This folder cannot be renamed, moved or deleted. It can contain other folders, but it cannot contain individual objects.. ). is a shortcut to the personal folder of the user who is currently logged on. contains folders for individual SAS products. These folders contain content that is installed along with the product. For example, some products have a set of initial jobs, transformations, stored processes, or reports which users can modify for their own purposes. Other products include sample content (for example, sample stored processes) to demonstrate product capabilities. Where applicable, the content is stored under the product's folder in subfolders that indicate the release number for the product.. Shared Data. During installation, the SAS Deployment Wizard enables the installer to assign a different name to this folder. Therefore, your Products folder might have a different name.. is provided for you to store user-created content that is shared among multiple users. Under this folder, you can create any number of subfolders, each with the appropriate permissions, to further organize this content.. . You can also create additional folders under SAS Folders in which to store shared content.. Follow these best practices when interacting with SAS folders: • Use personal folders for personal content, shared folders for content that multiple users need to view. • Use folders instead of custom repositories to organize content. • Do not delete or rename the Users folder. • Do not delete or rename the home folder or personal folder of an active user. • Do not delete or rename the Products or System folders or their subfolders. • Use caution when renaming the Shared Data folder. • When you create new folders, the security administrator should set permissions..
(21) 1.1 Exploring the Platform for SAS Business Analytics. Users, Groups, and Roles In order to control access to SAS platform content, SAS must know who is making each request and what type of functionality has been requested. Users. A user is an individual person or service identity.. Groups A group is a set of users. Groups provide an easy method to specify permissions for similar users. Roles. Roles are used to control access to application features. An application feature that is under role-based management is called a capability.. 13. Some important features of roles include: • not all applications support roles • not all application features are under role management • there are no negative capabilities and you cannot deny a capability to anyone • having a certain capability is not an alternative to meeting permission requirements • roles and groups serve distinct purposes. 1-11.
(22) 1-12. Chapter 1 Introduction. 1.2 Introduction to Data Integration Applications. Objectives . State the purpose of SAS Data Integration Studio. State the purpose of dfPower Studio. Explore the available interfaces.. 16. What Is SAS Data Integration Studio? SAS Data Integration Studio is a visual design tool for building, implementing, and managing data integration processes, regardless of data sources, applications, or platforms.. 17.
(23) 1.2 Introduction to Data Integration Applications. SAS Data Integration Studio and Metadata. 18. Through its metadata, SAS Data Integration Studio provides a single point of control for managing the following resources: data sources, from any platform that is accessible to SAS and from any format that is accessible to SAS data targets, to any platform that is accessible to SAS, and to any format that is supported by SAS processes that specify how data is extracted, transformed, and loaded from a source to a target jobs that organize a set of sources, targets, and processes (transformations) source code that is generated by SAS Data Integration Studio user-written source code. SAS Data Integration Studio Interface SAS Data Integration Studio has many components available in the desktop interface.. 19. 1-13.
(24) 1-14. Chapter 1 Introduction. Title Bar, Menu Bar, Toolbar, Status Bar Title Bar Menu Bar Toolbar. Status Bar. 20. The title bar shows the current version of SAS Data Integration Studio, as well as the name of the current connection profile. The menu bar provides access to the drop-down menus. The list of active options varies according to the current work area and the kind of object that you select. Inactive options are disabled or hidden. The Toolbar provides access to shortcuts for items on the menu bar. The list of active options varies according to the current work area and the kind of object that you select. Inactive options are disabled or hidden. The status bar displays the name of the currently selected object, the name of the default SAS Application Server if one has been selected, the login ID and metadata identity of the current user, and the name of the current SAS Metadata Server. To select a different SAS Application Server, double-click the name of that server to display a dialog box. If the name of the SAS Metadata Server turns red, the connection is broken. In that case, you can double-click the name of the metadata server to display a dialog box that enables you to reconnect..
(25) 1.2 Introduction to Data Integration Applications. 1-15. Tree View, Basic Properties Pane. Tree View. 21. The tree view provides access to the Basic Properties pane, Folders tree, Inventory tree, Transformations tree, and Checkouts tree. The Basic Properties pane displays the basic properties of an object selected in a tree view. To surface this pane, select View Ö Basic Properties from the desktop. The Folders tree organizes metadata into folders that are shared across a number of SAS applications. The Inventory tree displays metadata for objects that are registered on the current metadata server, such as tables and libraries. Metadata can be accessed in folders that group metadata by type, such as Table, Library, and so on. The Transformations tree displays transformations that can be dragged and dropped into SAS Data Integration Studio jobs. The Checkouts tree displays metadata that has been checked out for update, as well as any new metadata that has not been checked in. The Checkouts tree is not displayed in above view of SAS Data Integration Studio. The Checkouts tree automatically displays when you are working under change management..
(26) 1-16. Chapter 1 Introduction. Job Editor. Job Editor. 22. The Job Editor window enables you to create, maintain, and troubleshoot SAS Data Integration Studio jobs. The Diagram tab is used to build and update the process flow for a job. The Code tab is used to review or update code for a job. The Log tab is used to review the log for a submitted job. The Output tab is used to review the output of a submitted job. The Details pane is used to monitor and debug a job in the Job Editor.. SAS Data Integration Studio Options The Options window is used to specify global options for SAS Data Integration Studio.. 23.
(27) 1.2 Introduction to Data Integration Applications. dfPower Tools in SAS Data Integration Studio The Data Quality tab on the Options window has a setting for accessing DataFlux dfPower tools.. When set, the Tools menu adds a choice to access dfPower Tools.. 24. What Is dfPower Studio? dfPower Studio is a powerful, easy-to-use, suite of data cleansing and data integration applications. With dfPower Studio, you have access to various applications that can help eliminate data quality problems.. 25. 1-17.
(28) 1-18. Chapter 1 Introduction. dfPower Studio Application The components that comprise dfPower Studio include the following: dfPower Explorer. enables you to explore relationships at database, table, and column level.. dfPower Profile. enables you to analyze your data using a set of profiling metrics.. dfPower Customize. enables you to interact with the components of the Quality Knowledge Base (QKB).. dfPower Architect. enables you to define a sequence of operations and run them at one time.. 26. dfPower Studio Interface The dfPower Studio Navigator provides access to all dfPower Studio objects. This navigator allows you access to database connections, Quality Knowledge Bases (QKBs), management resources, reference sources, and more.. 27.
(29) 1.2 Introduction to Data Integration Applications. Navigation Tree View The navigation tree view displays a variety of options available to view in the dfPower Studio Navigator. Click any of the options that have a + next to the item to view a list of database connections, quality knowledge bases (QKBs), and more.. 28. Menus and Tools The various applications to which dfPower Studio enables access are listed as a tool in the tool bar or as a menu choice from the Tools menu.. For example, the Tools Ö Profile menu provides access to dfPower Profile's Configurator, dfPower Profile's Viewer, and dfPower Explorer.. 29. 1-19.
(30) 1-20. Chapter 1 Introduction. Using SAS Data Integration Studio This demonstration illustrates logging into SAS Data Integration Studio and investigating the interface by using predefined metadata objects. 1. Select Start Ö All Programs Ö SAS Ö SAS Data Integration Studio 4.2. 2. Log on using Bruno’s credentials. a. Verify that the connection profile is My Server.. b. Click. Do not click Set this connection profile as the default. to close the Connection Profile window and to open the Log On window.. c. Type Bruno as the value for the User ID field and Student1 as the value for the Password field.. d. Click. Do not click Save user ID and password in this profile. to close the Log On window..
(31) 1.2 Introduction to Data Integration Applications. SAS Data Integration Studio opens:. 1-21.
(32) 1-22. Chapter 1 Introduction. 3. If necessary, click the Folders tab in the tree view area.. Some folders in the Folders tree are provided by default, such as My Folder, Products, Shared Data, System, and Users. Three folders (and subfolders) were added by an administrator: Chocolate Enterprises, Data Mart Development, and Orion Star. 4. Click. in front of the Data Mart Development folder to expand the folder..
(33) 1.2 Introduction to Data Integration Applications. 5. Click. 1-23. in front of the DIFT Demo folder to expand the folder.. The DIFT Demo folder contains seven metadata objects: two library objects, four table objects, and one job object. Each metadata object has its own type of properties..
(34) 1-24. Chapter 1 Introduction. 6. Single-click on the DIFT Test Table – ORDER_ITEM table object. The Basic Properties pane displays basic information for this table object..
(35) 1.2 Introduction to Data Integration Applications. 7. Single-click on the DIFT Test Source Library library object. The Basic Properties pane displays basic information for this library object.. 1-25.
(36) 1-26. Chapter 1 Introduction. 8. Single-click on the DIFT Test Job – OrderFact Table Plus job object. The Basic Properties pane displays basic information for this library object..
(37) 1.2 Introduction to Data Integration Applications. 1-27. 9. Examine the properties of a table object in more detail. a. Right-click on DIFT Test Table – ORDER_ITEM and select Properties.. The name of the metadata table object is shown on the General tab, as well as the metadata folder location..
(38) 1-28. Chapter 1 Introduction. b. Click the Columns tab.. The Columns tab displays the column attributes of the physical table. Note that all columns are numeric..
(39) 1.2 Introduction to Data Integration Applications. 1-29. c. Click the Physical Storage tab.. The Physical Storage tab displays the type of table, the library object name, and the name of the physical table. d. Click. to close the properties window..
(40) 1-30. Chapter 1 Introduction. 10. Right-click on DIFT Test Table – ORDER_ITEM and select Open. The View Data window opens and displays the data for this table..
(41) 1.2 Introduction to Data Integration Applications. The functions of the View Data window are controlled by the View Data toolbar:. The View Data toolbar contains the following items: TOOL. EXPLANATION Specifies the number of the first row that is displayed in the table. Positions the data with the Go-to row as the first data line displayed. Navigates to the first record of data in the View Data window. Navigates to the last page of data in the View Data window. Switches to browse mode. Switches to edit mode. Displays the Search area. Refreshes view of the data. Displays the Search area. Displays the Sort By Column tab in the View Data Options window. Displays the Filter tab in the View Data Options window. Displays the Columns tab in the View Data Options window. Displays physical column names in the column headers. You can display any combination of column metadata, physical column names, and descriptions in the column headers. Displays optional descriptions in the column headers. Displays optional column metadata in the column headers. This metadata can be entered in some SAS Intelligence Platform applications, such as SAS Information Map Studio. Toggles between showing formatted and unformatted data in the View Data window.. 11. To close the View Data window, select File Ö Close (or click. ).. 1-31.
(42) 1-32. Chapter 1 Introduction. 12. Examine the properties of a library object in more detail. a. Right-click on DIFT Test Source Library and select Properties.. The name of the metadata table object is shown on the General tab, as well as the metadata folder location..
(43) 1.2 Introduction to Data Integration Applications. 1-33. b. Click the Options tab.. The Options tab displays the library reference and the location of the physical path of this library. c. Click. to close the properties window..
(44) 1-34. Chapter 1 Introduction. 13. Display the generated LIBNAME statement for this library object by right-clicking on DIFT Test Source Library and selecting View Libname.. 14. Click. to close the Display Libname window..
(45) 1.2 Introduction to Data Integration Applications. 1-35. 15. Access the Job Editor window to examine the properties of the job objects in more detail. a. Right-click on DIFT Test Job – OrderFact Table Plus and select Open.. This job joins two source tables and then loads the result into a target table. The target table is then used as the source for the Rank transformation, the result of the ranking is loaded into a target table, sorted, and then a report is generated based on the rankings..
(46) 1-36. Chapter 1 Introduction. b. Click the DIFT Test Table – ORDERS table object. Note that the Details area now has a Columns tab.. The Columns tab in the Details area displays column attributes for the selected table object. These attributes are fully editable in this location. Similarly, selecting any of the table objects in the process flow diagram (DIFT Test Table – ORDERS, DIFT Test Table – ORDER_ITEM, DIFT Test Target – Order Fact Table (in diagram twice), DIFT Test Target – Ranked Order Fact) displays a Columns tab for that table object..
(47) 1.2 Introduction to Data Integration Applications. 1-37. c. Click the SQL Join transformation. Note that the Details area now has a Mappings tab.. The full functionality of the Mappings tab from the SQL Join Designer window is found on this Mappings tab. Similarly, selecting any of the transformations in the process flow diagram (SQL Join, Table Loader, Rank, Sort, List Data) displays a Mappings tab for that transformation..
(48) 1-38. Chapter 1 Introduction. d. Run the job by clicking which node is executing.. . As the transformations execute, they are highlighted to denote. As each transformation finishes, the icon is decorated with a symbol to denote success or failure. Those transformations that had errors are also outlined in red.. Also, the Status tab in the Details area provides the status for each part of the job that executed..
(49) 1.2 Introduction to Data Integration Applications. e. Double-click the word Error under Status for the Table Loader. The Details area moves focus to Warnings and Errors tab. The error indicates that the physical location for the target library does not exist.. 1-39.
(50) 1-40. Chapter 1 Introduction. f.. Select DIFT Test Target Library found in the Data Mart Development Ö DIFT Demo folder on the Folders tab. The Basic Properties pane displays a variety of information, including the physical path location..
(51) 1.2 Introduction to Data Integration Applications. 1-41. g. Create the needed folder. 1) Open a Windows Explorer by selecting Start Ö All Programs Ö Accessories Ö Windows Explorer. 2) Navigate to S:\Workshop\dift. 3) Create a new folder by selecting File Ö New Ö Folder. 4) Type testdm as the name of the new folder and then press ENTER. h. Run the entire job again by clicking transformation completed successfully.. i.. . The Details area shows that all but the List Data. Double-click the word Error under Status for the List Data transformation. The Details area moves focus to Warnings and Errors tab. The error indicates that the physical file does not exist. However, because the file is to be created from the transformation, it is more likely that the location for the file does not exist..
(52) 1-42. Chapter 1 Introduction. j.. Create the needed folder. 1) If necessary, open a Windows Explorer by selecting Start Ö All Programs Ö Accessories Ö Windows Explorer. 2) Navigate to S:\Workshop\dift. 3) Create a new folder by selecting File Ö New Ö Folder. 4) Type reports as the name of the new folder and then press ENTER.. k. Run the List Data transformation. 1) Click the List Data transformation. 2) Click. (the Run Selected Transformations tool).. The Status tab of the Details pane shows the transformation completed successfully.. l.. Select File Ö Close (or click ) to close the job editor window. If any changes were made while viewing the job, the following window opens:. m. If necessary, click. to not save changes to the job..
(53) 1.2 Introduction to Data Integration Applications. 16. Investigate some of the options available for SAS Data Integration Studio by selecting Tools Ö Options….. The General tab of the Options window opens:. 1-43.
(54) 1-44. Chapter 1 Introduction. 17. Examine the Show advanced property tabs option (this option is on the General tab of the Options window). a. If Show advanced property tabs is de-selected …. … then tabs such as Extended Attributes and Authorization do not appear in the properties window for a specified object.. b. If Show advanced property tabs is selected…. … then tabs such as Extended Attributes and Authorization do appear in the properties window for a specified object..
(55) 1.2 Introduction to Data Integration Applications. 1-45. 18. Examine the Enable row count on basic properties for tables option (this option is on the General tab of the Options window). a. If Enable row count on basic properties for tables is de-selected…. … then the Number of Rows field displays Row count is disabled for a selected table object..
(56) 1-46. Chapter 1 Introduction. b. If Enable row count on basic properties for tables is selected…. … then the Number of Rows field displays the number of rows found for the selected table object..
(57) 1.2 Introduction to Data Integration Applications. 1-47. 19. Click the SAS Server tab in the Options window.. a. Click to establish and/or test the application server connection for SAS Data Integration Studio. An information window opens verifying a successful connection:. b. Click. . to close the Information window. The application server can also be set and tested via the status bar. For example, if the application server has not been defined, the status bar shows: . Double-clicking on this area opens the Default Application Server window. A selection can be made and tested..
(58) 1-48. Chapter 1 Introduction. 20. Click the Job Editor tab.. The options on this tab affect the Job Editor..
(59) 1.2 Introduction to Data Integration Applications. a. Verify the default selection in the Nodes area is Collapse.. This results in objects on the Diagram tab, such as the following:. If Expand is selected in the Nodes area…. …the resultant objects in the Diagram area are then drawn as the following:. 1-49.
(60) 1-50. Chapter 1 Introduction. b. Verify that the default selection in the layout area is Left To Right.. This results in process flow diagrams going horizontally, such as the following:. If Top To Bottom is selected in the layout area…. …the process flow diagrams resemble the following:.
(61) 1.2 Introduction to Data Integration Applications. 21. Click the View Data tab.. The options on this tab affect how data are displayed in the View Data window.. 1-51.
(62) 1-52. Chapter 1 Introduction. a. Verify the default selection for the Column headers area is Show column name in column header.. This results in column heading such as the following:. If Show column description in column header is selected in the Column headers area…. …the column headings resemble the following:. If both Show column name in column header and Show column description in column header are selected in the Column headers area…. …the column headings resemble the following:.
(63) 1.2 Introduction to Data Integration Applications. 22. Click the Data Quality tab. a. Verify that the following fields are set appropriately in the Data Quality area: Default Locale: ENUSA DQ Setup Location: C:\Program Files\SAS\SASFoundation\9.2\dquality\sasmisc\dqsetup.txt Scheme Repository Type: dfPower scheme (BFD) Scheme Repository: C:\Program Files\DataFlux\QltyKB\CI\2008A\scheme b. Verify the path specified for DataFlux Installation Folder under the DataFlux dfPower area. DataFlux Installation Folder: C:\Program Files\DataFlux\dfPower Studio\8.1. 1-53.
(64) 1-54. Chapter 1 Introduction. 23. Select the Tools menu. Note that there is an item, dfPower Tool, that provides direct access to many of the DataFlux dfPower Studio applications..
(65) 1.2 Introduction to Data Integration Applications. Using dfPower Explorer and dfPower Profile. Introducing dfPower Explorer 1. From SAS Data Integration Studio session, select Tools Ö dfPower Tool Ö dfPower Explorer. 2. Create a new project a. Click File Ö New Project…. b. Verify that DIFT Repository is the selected Repository.. 1-55.
(66) 1-56. Chapter 1 Introduction. . If the DIFT Repository has not been created, then follow these steps to create and initialize it: • From SAS Data Integration Studio, select dfPower Tool Ö dfPower Studio. • In the Navigation area, right-click on Repositories and select New Repository….. • •. Enter a name of DIFT Repository. Select File as the type.. •. Navigate to S:\Workshop\dift and create a new folder named diftrepos.. •. Navigate to S:\Workshop\dift\diftrepos and enter a filename of DIFTRepository.rps.. •. Click. .. •. Click. . The repository is initialized.. •. Click. ..
(67) 1.2 Introduction to Data Integration Applications. a. Click. 1-57. .. 1) Type DIFT Orion Detail as the value for the Description field. 2) Click. next to the Directory field. The Browse for Folder window opens.. 3) Navigate to S:\Workshop\OrionStar\ordetail. 4) Click to close the Browse for Folder window. The Directory field displays the selected path.. 5) Click. to close the Add SAS Data Set Directory window..
(68) 1-58. Chapter 1 Introduction. b. Expand DIFT Orion Detail..
(69) 1.2 Introduction to Data Integration Applications. 1-59. c. Double-click CUSTOMER, ORDER_ITEM, ORDERS and PRODUCT_LIST. The tables are moved to the Selected list.. d. Click. ..
(70) 1-60. Chapter 1 Introduction. e. Click Primary key metadata. f.. Click Foreign key metadata.. g. Click Index metadata. The final settings resemble the following:. h. Click. ..
(71) 1.2 Introduction to Data Integration Applications. i.. Type DIFT Orion Detail Project as the value for the Project name field.. j.. Type DIFT Orion Detail Project as the value for the Description field.. k. Click. .. 1-61.
(72) 1-62. Chapter 1 Introduction. The results are displayed in dfPower Explorer. Four tables were analyzed, and in the four tables there are thirty columns..
(73) 1.2 Introduction to Data Integration Applications. 2. Expand Databases in the Project Metadata panel. 3. Click DIFT Orion Detail.. The selected table, CUSTOMER, has one matching table, ORDERS.. 1-63.
(74) 1-64. Chapter 1 Introduction. 4. Click the ORDERS table in the Database area. There are two matching tables for the ORDERS table..
(75) 1.2 Introduction to Data Integration Applications. 5. Click the ORDER_ITEM table in the Matching Tables area. Having both tables selected displays the relationship between the two tables.. The Order_ID column could potentially link these tables.. 1-65.
(76) 1-66. Chapter 1 Introduction. 6. Click ORDER_ITEM table in the Database area.. Note that two tables are potentially matches..
(77) 1.2 Introduction to Data Integration Applications. 7. Click PRODUCT_LIST in the Matching Tables area. The Product_ID column could potentially link these tables.. 8. Select File Ö Exit to close dfPower Explorer.. 1-67.
(78) 1-68. Chapter 1 Introduction. Introducing dfPower Profile Before initiating any data warehousing project, it is important to first examine the data and identify any potential issues that may exist. 1. From dfPower Explorer, right-click on the CUSTOMER table in the Database area and select Add Table to Profile Task.. The table and all its columns get added to the Profile Job Definition & Notes area..
(79) 1.2 Introduction to Data Integration Applications. 1-69. 2. From dfPower Explorer, right-click on the ORDER_ITEM table in the Database area and select Add Table to Profile Task. 3. From dfPower Explorer, right-click on the PRODUCT_LIST table in the Database area and select Add Table to Profile Task. 4. Collapse the listing of columns for each of the tables in the Profile Job Definition & Notes area.. 5. Right-click on the Databases keyword and select Create Profile Job..
(80) 1-70. Chapter 1 Introduction. 6. Type DIFT Orion Detail Information as the value for the Name field.. 7. Click. . A message window confirms creation of the profile job.. 8. Click. to close the message window.. 9. Select File Ö Exit to close dfPower Explorer..
(81) 1.2 Introduction to Data Integration Applications. 10. From SAS Data Integration Studio session, select Tools Ö dfPower Tool Ö dfPower Profile (Configurator).. 11. Select File Ö Open. 12. Single-click DIFT Orion Detail Information.. 13. Click. .. 1-71.
(82) 1-72. Chapter 1 Introduction. . If a dfPower Profile job is not available (for instance, one was not created using dfPower Explorer), SAS data can be added by using the following steps: • Select Insert Ö SAS Data Set Directory…. • Type DIFT Orion Detail Data as the value for the Description field. • Click. next to the Directory field. The Browse for Folder window opens.. – Navigate to S:\Workshop\OrionStar\ordetail. – Click. to close the Browse for Folder window.. • The Directory field displays the selected path. • Click. to close the Insert SAS Data Set Directory window.. The link to the SAS Data Set Directory appears in the database listing. 14. Expand the DIFT Orion Detail data source. A list of available SAS tables is displayed. The one’s selected are the one’s added from dfPower Explorer..
(83) 1.2 Introduction to Data Integration Applications. 15. Select Job Ö Select Metrics…. The Metrics window opens. 16. Click Frequency distribution and Pattern frequency distribution. 17. Click Select/unselect all.. 18. Click. to close the Metrics window.. 1-73.
(84) 1-74. Chapter 1 Introduction. 19. Select Job Ö Run Job…. The Run Job dialog box opens.. . If you did not open an existing job in dfPower Profile (Configurator) and you attempt to run a job, a warning window opens.. Clicking. opens the Save As window. Typing a valid name and then clicking displays the Run Job window as above..
(85) 1.2 Introduction to Data Integration Applications. 20. Verify that Standard output is selected. 21. Accept the default name for the report that matches the job name (a good best practice). 22. Verify that Append to report (if already exists) is selected. 23. Verify that Launch viewer when processing is complete is selected.. 24. Click. to close the Run Job window. The Executor executes the job.. 1-75.
(86) 1-76. Chapter 1 Introduction. The results are displayed in a dfPower Profile (Viewer) window..
(87) 1.2 Introduction to Data Integration Applications. 1-77. 25. Select CUSTOMER in the Tables pane.. The columns from the CUSTOMER table are listed in the Tables area with a tabular view of the each column and its calculated statistics..
(88) 1-78. Chapter 1 Introduction. 26. Select the Country column. 27. Click the Frequency Distribution tab to display frequency counts for the Country column.. 28. Click the Pattern Frequency Distribution tab..
(89) 1.2 Introduction to Data Integration Applications. 29. Select Customer_ID column. 30. Locate the Uniqueness and Primary Key Candidate statistics.. 1-79.
(90) 1-80. Chapter 1 Introduction. 31. Select Tools Ö Visualize Table…. 32. In the Metrics area, select only the Data Length, Maximum Length and Minimum Length statistics. 33. In the Fields area, select only Customer_Address, Customer_FirstName, Customer_LastName and Customer_Name.. 34. Click. .. 35. Select File Ö Exit to close the dfPower Profile (Viewer) window. 36. Select File Ö Exit to close the dfPower Profile (Configurator) window..
(91) 1.3 Introduction to Change Management. 1.3 Introduction to Change Management. Objectives . Define the change management feature of SAS Data Integration Studio.. 35. SAS Data Integration Studio and Metadata SAS Data Integration Studio enables you to create metadata objects that define sources, targets, and the transformations that connect them. These objects are saved to one or more metadata repositories.. 36. 1-81.
(92) 1-82. Chapter 1 Introduction. Change Management The Change Management facility in SAS Data Integration Studio enables multiple SAS Data Integration Studio users to work with the same metadata repository at the same time without overwriting each other's changes. Change Management Metadata Repository. 37. Working with Change Management Under change management, most users are restricted from adding or updating the metadata in a changemanaged repository. Authorized users, however, can add new metadata objects and check them into the change-managed repository check out metadata objects to update them.. 38.
(93) 1.3 Introduction to Change Management. Checkouts Tree If you are authorized to work with a project repository, a Checkouts tree is added to the desktop of SAS Data Integration Studio. The Checkouts tree displays metadata in your project repository, which is an individual work area or playpen.. 39. Checked Out Objects To update a metadata object in the change-managed repository, the object needs to be checked out. When checked out, the object is locked in the change-managed repository, and a copy is placed in the Checkouts tree. Metadata that has been checked out for update has a check mark beside it, such as the three objects shown in the following display capture.. 40. 1-83.
(94) 1-84. Chapter 1 Introduction. New Metadata Objects New metadata objects can be added to a changemanaged repository as usual, with the new metadata added to the Checkouts tree. New metadata objects that have never been checked in do not have a check mark beside them, such as the table object shown here.. 41.
(95) 1.3 Introduction to Change Management. Exploring the Basics of Change Management. Access SAS Data Integration Studio Using Barbara’s Project Repository 1. Select Start Ö All Programs Ö SAS Ö SAS Data Integration Studio 4.2. 2. Log on using Barbara’s credentials to access her project repository. a. Select Barbara’s Work Repository as the connection profile.. b. Click. Do not click Set this connection profile as the default. to close the Connection Profile window and open the Log On window.. c. Type Barbara as the value for the User ID field and Student1 as the value for the Password field.. d. Click. Do not click Save user ID and password in this profile. to close the Log On window.. 1-85.
(96) 1-86. Chapter 1 Introduction. SAS Data Integration Studio opens..
(97) 1.3 Introduction to Change Management. 1-87. 3. Double-click on the application server area of the status bar to open the Default Application Server window.. The Default Application Server window opens.. 4. Verify that SASApp is selected as the value for the Server field. 5. Click. 6. Click 7. Click following:. .. to close the Information window. to close the Default Application Server window. The status bar updates to be the.
(98) 1-88. Chapter 1 Introduction. 8. Verify that the tree view area now has a Checkouts tab.. This tab displays metadata objects checked out of the parent repository, as well as any new objects that Barbara creates. 9. If necessary, click the Folders tab. 10. Expand the Data Mart Development Ö DIFT Demo folders.. 11. Select the DIFT Test Job – OrderFact Table Plus job, hold down the CTRL key, and select both DIFT Test Source Library and DIFT Test Table – ORDER_ITEM..
(99) 1.3 Introduction to Change Management. 12. Right-click on one of the selected items and select Check Out.. The icons for the three objects are decorated with a check (. ).. 1-89.
(100) 1-90. Chapter 1 Introduction. 13. Click the Checkouts tab. The three checked out objects are shown on the Checkouts tab.. 14. Right-click on DIFT Test Table – ORDER_ITEM and select Properties. 15. Type Testing out Change Management as the value for the Description field.. 16. Click. to close the Properties window..
(101) 1.3 Introduction to Change Management. 17. Right-click on DIFT Test Table – ORDER_ITEM and select Check In (optionally, select Check Outs Ö Check In with the table object selected). The Check In Wizard opens.. 1-91.
(102) 1-92. Chapter 1 Introduction. 18. Type Testing out Change Management as the value for the Title field. 19. Type Showing off features of Change Management – simply added a description to the table object as the value for the Description field.. 20. Click. ..
(103) 1.3 Introduction to Change Management. 21. Verify that DIFT Test Table – ORDER_ITEM is selected.. 22. Click. .. 1-93.
(104) 1-94. Chapter 1 Introduction. 23. Review the Summary window.. 24. Click. .. The Checkouts tab no longer has this table object..
(105) 1.3 Introduction to Change Management. 1-95. Access SAS Data Integration Studio Using Ole’s Project Repository 1. Select Start Ö All Programs Ö SAS Ö SAS Data Integration Studio 4.2. 2. Log on using Ole’s credentials to access his project repository. a. Select Ole’s Work Repository as the connection profile.. b. Click. Do not click Set this connection profile as the default. to close the Connection Profile window and open the Log On window.. c. Type Ole as the value for the User ID field and Student1 as the value for the Password field.. d. Click. Do not click Save user ID and password in this profile. to close the Log On window..
(106) 1-96. Chapter 1 Introduction. SAS Data Integration Studio opens..
(107) 1.3 Introduction to Change Management. 3. Verify that the default application server is set to SASApp.. 4. Verify that the tree view area now has a Checkouts tab.. 5. If necessary, click the Folders tab. 6. Expand Data Mart Development Ö DIFT Demo. Two objects are checked out (by Barbara).. 1-97.
(108) 1-98. Chapter 1 Introduction. 7. Right-clicking on DIFT Test Source Library (or on DIFT Test Job – OrderFact Table Plus) shows that the Check Out option is not available for this checked out object.. 8. Right-click on DIFT Test Source Library and select History.. Ole can tell that Barbara has the object checked out. 9. Select File Ö Close (or click. ) to close the History window..
(109) 1.3 Introduction to Change Management. 10. Right-click on DIFT Test Table – ORDER_ITEM and select History.. Ole can tell that Barbara had this object checked out and that it was checked back in. The title and description information filled in by Barbara in the Check In Wizard can give Ole an idea on what updates Barbara made to this metadata object. 11. Select File Ö Close (or click. ) to close the History window.. 12. Right-click on DIFT Test Table – ORDER_ITEM and select Check Out. 13. Click the Checkouts tab and verify that the table object is available for editing.. 1-99.
(110) 1-100. Chapter 1 Introduction. 14. Right-click on DIFT Test Table – ORDER_ITEM and select Properties. 15. Clear the text in the Description field.. 16. Click. to close the Properties window..
(111) 1.3 Introduction to Change Management. Access SAS Data Integration Studio Using Ahmed’s Credentials 1. Select Start Ö All Programs Ö SAS Ö SAS Data Integration Studio 4.2. 2. Log on using Ahmed’s credentials (an administrator) to access the Foundation repository. a. Select My Server as the connection profile. b. Click. to close the Connection Profile window and open the Log On window.. c. Type Ahmed as the value for the User ID field and Student1 as the value for the Password field. d. Click. to close the Log On window.. SAS Data Integration Studio opens.. Ahmed does not see a Checkouts tab.. 1-101.
(112) 1-102. Chapter 1 Introduction. 3. If necessary, click the Folders tab. 4. Expand Data Mart Development Ö DIFT Demo. Three objects are checked out (by Barbara and Ole)..
(113) 1.3 Introduction to Change Management. 1-103. 5. Select Check Outs Ö Clear Projects. As an administrator, Ahmed can clear project repositories for all team members.. Clearing a project repository unlocks checked out objects (any changes made to these checked out objects will not be saved) and deletes any new objects that may have been created in the project repository. 6. Select both repositories (that is, select Barbara’s Work Repository, hold down the CTRL key, and select Ole’s Work Repository).. 7. Click. ..
(114) 1-104. Chapter 1 Introduction. 8. Verify that the checked out objects are no longer checked out..
(115) 1.3 Introduction to Change Management. 9. Right-click on DIFT Test Table – ORDER_ITEM and select Properties. 10. Verify that the change Ole made to the Description field was not saved.. 11. Click. to close the Properties window.. 12. Select File Ö Exit to close Ahmed’s SAS Data Integration Studio session.. 1-105.
(116) 1-106. Chapter 1 Introduction. 13. Access Ole’s SAS Data Integration Studio session. This session has not been refreshed, so it looks like Ole still has an object checked out.. 14. Select View Ö Refresh. The Checkouts tab was active, so the metadata for the project repository is refreshed..
(117) 1.3 Introduction to Change Management. 1-107. 15. Click the Folders tab. 16. Select View Ö Refresh. The metadata are updated to show that no objects (in this folder) are checked out.. 17. Select File Ö Exit to close Ole’s SAS Data Integration Studio session..
(118) 1-108. Chapter 1 Introduction. 18. Access Barbara’s SAS Data Integration Studio session. The metadata in the Foundation repository has been refreshed so the Data Mart Development Ö DIFT Demo folder displays no checked out objects. However, the Checkouts tab, representing a different repository, still shows objects as being checked out.. 19. Click the Checkouts tab. 20. Select View Ö Refresh. The Checkouts tab was active, so the metadata for the project repository is refreshed.. 21. Select File Ö Exit to close Barbara’s SAS Data Integration Studio session..
(119) Chapter 2 Introduction to Course Data and Course Scenario 2.1. Introduction to Classroom Environment and Course Data ........................................ 2-3. 2.2. Course Tasks .................................................................................................................. 2-8 Exercises .............................................................................................................................. 2-14. 2.3. Solutions to Exercises ................................................................................................. 2-26.
(120) 2-2. Chapter 2 Introduction to Course Data and Course Scenario.
(121) 2.1 Introduction to Classroom Environment and Course Data. 2.1 Introduction to Classroom Environment and Course Data. Objectives . Define common job roles. Define the classroom environment. Explore the course scenario.. 3. SAS Platform Job Roles Most organizations have multiple types of job roles. The various components that make up the platform for SAS Business Analytics fall into one or more of the following job roles: BI Applications Developer Business Analyst Business User Data Integration Architect Data Integration Developer Platform Administrator Project Manager. 5. This class concentrates on the Data Integration Developer job role.. 2-3.
(122) 2-4. Chapter 2 Introduction to Course Data and Course Scenario. Data Integration Developer Data integration developers collect, cleanse, and store the data required for reporting and analysis. The software and applications primarily used by data integration developers include the following: SAS Data Integration Studio SAS Data Quality Solution SAS Scalable Performance Data Server. 6. Classroom Environment During this course, you will use a classroom machine on which the SAS platform has been installed and configured in a single machine environment. The single machine environment provides an easy way for each student to learn how to interact with the SAS platform without impacting each other. The classroom environment includes the following predefined elements in the SAS metadata: users for the various job roles groups metadata folders with data and report objects a basic security model 7.
(123) 2.1 Introduction to Classroom Environment and Course Data. Course Data The data used in the course is from a fictitious global sports and outdoors retailer named Orion Star Sports & Outdoors.. Orion Star has traditional stores, an online store, and a large catalog business. The corporate headquarters is located in the United States with offices and stores in many countries throughout the world.. 8. Course Data The Orion Star data used in the course consists of the following: data ranging from 2003 through 2007 employee information for the employees located in many countries as well as in the United States headquarters approximately 5,500 different sports and outdoor products approximately 90,000 customers worldwide approximately 750,000 orders 64 suppliers. 9. 2-5.
(124) 2-6. Chapter 2 Introduction to Course Data and Course Scenario. Orion Star Users and Groups In addition to the users and groups that are part of the SAS platform installation, users and groups for Orion Star have been defined in the metadata. The Orion Star users each belong to one or more job roles.. continued.... 10. Orion Star Users and Groups The Orion Star users have been organized into several metadata groups.. 11.
(125) 2.1 Introduction to Classroom Environment and Course Data. Orion Star Metadata Folders SAS metadata folders will be used to organize the metadata objects needed to development a data mart. You will work with a predefined folder (Data Mart Development) and will create subfolders under this grouping to further organize the metadata objects.. 12. Course Scenario During this course, you will have the opportunity to learn about SAS Data Integration Studio as a data integration developer. The course consists of follow-along demonstrations, exercises to reinforce the demonstrations, and a case study to practice what you have learned.. 13. 2-7.
(126) 2-8. Chapter 2 Introduction to Course Data and Course Scenario. 2.2 Course Tasks. Objectives . Define the tasks for the course scenario. Define the data model to be used for the data mart.. 16. Course Tasks There are several main steps you will accomplish during this class.. 17. Step 1:. Register metadata for source tables.. Step 2:. Register metadata for target tables.. Step 3:. Create jobs to load tables.. Step 4:. Investigate a variety of transformations.. Step 5:. Investigate table relationships.. Step 6:. Investigate slowly changing dimensions.. Step 7:. Develop user-defined transformations.. Step 8:. Deploy jobs..
(127) 2.2 Course Tasks. Course Tasks – Step 1 Step 1:. Register metadata for source tables.. There are many types of data sources that can be used to create tables in a data mart. The Register Tables wizards enable you to register one or more selected tables, based on the physical structure of the tables. For this step, you will investigate each of the following types of source data: SAS tables ODBC data source You will also use the External File wizards.. 18. Course Tasks – Step 2 Step 2:. Register metadata for target tables.. The New Tables wizard allows you to register metadata for a new table. The metadata can include the following: n Type of table to be constructed n Library for the new table n Name of the new table n Columns for the new table n Indexes for the new table For this step, you will define metadata such as the above several target tables.. 19. 2-9.
(128) 2-10. Chapter 2 Introduction to Course Data and Course Scenario. Course Tasks – Step 3 Step 3:. Create jobs to load tables.. The process flow to load or populate a target table from sources can be defined in the Job Editor window. The process flows will involve the metadata objects for the source tables, for the target tables and one or more transformations. The combination of the sources, targets and transformations will specify how the extraction, transformation and loading of the targets will occur.. 20. Course Tasks – Step 4 Step 4:. Investigate a variety of transformations.. There are many transformations available by default within SAS Data Integration Studio. For this step, you will investigate each of the following transformations: Extract Summary Statistics Loop/Loop End Return Code Check Data Validation Transpose Append Sort Rank Apply Lookup Standardization 21.
(129) 2.2 Course Tasks. Course Tasks – Step 5 Step 5:. Investigate table relationships.. Table maintenance is important for a data integration developer. Various properties can be defined for each table that can include the following: integrity constraints keys and indexes load techniques. 22. Course Tasks – Step 6 Step 6:. Investigate slowly changing dimensions.. During data modeling for a data warehouse, dimension tables may be defined as slowly changing dimensions. SAS Data Integration Studio provides transformations for loading and maintaining dimension and fact tables.. 23. 2-11.
(130) 2-12. Chapter 2 Introduction to Course Data and Course Scenario. Course Tasks – Step 7 Step 7:. Develop user-defined transformations.. While SAS Data Integration Studio provides a large library of transformations, it is possible to add to this library using the New Transformation Wizard.. 24. Course Tasks – Step 8 Step 8:. Deploy jobs.. SAS Data Integration Studio jobs can be deployed to run in a batch environment. Jobs can also be converted in to a SAS stored process. For this step you will investigate both of the deployment techniques.. 25.
(131) 2.2 Course Tasks. Orion Star Target Star Schema One goal of this class is to produce, from the Orion Star source data, a dimensional data model that is a star schema. Organization Dimension. Customer Dimension. Order Fact Table. Product Dimension. Time Dimension. 26. Data Modeling Technique There are many data modeling techniques that can be employed to develop meaningful tables for data warehouses. One simple technique is to gather a list of questions that need to be answered in each area that is to be a dimension (for example, the product area that is to be the product dimension). These questions can be used to help formulate what is needed column-wise for the data warehouse tables. A study can then be conducted using available source table columns to design the sources for the target dimension table, as well as any calculated columns.. 27. 2-13.
Related documents
Your doctor may prescribe nonsteroidal anti-inflammatory drugs or in some cases, a muscle relaxant, to relieve mild to moderate back pain that doesn't get better with
Typical SharePoint Deployment SQL Database Intranet SharePoint Server Fast Search Server Active Directory.. Primary
On the Select Server Roles page, in the Roles list, select the Active Directory Lightweight Directory Services check box, and then click Next.. Finish adding the AD LDS server
The paper first explained the research background and significance, summarized the related research at home and abroad, put forward the research target, contents, idea and
Although WBL programs are more often offered across higher education and vocational training, WBL has also demonstrated promise at the secondary school level as a model for
For example (1) TPKag93 and JFIwa98 shared the highest levels of identities in the RNA1 and Protein A sequences (Fig. 1), (2) SJNag93, TPKag93, and TNV shared similar lengths of
In the present study clinical characteristics such as inability to feed, cyanosis, lethargy and convulsion were found to be significantly associated with positive blood culture (p
The insurance market for the commercial space transportation industry is a global one, with satellite owners, satellite manufacturers, launch services providers, insurance