• No results found

X Task 6: Create a package for incremental employee data extraction

In document 10777A ENU TrainerHandbook Part4 (Page 73-76)

1. Maximize SQL Server Data Tools, and then in Solution Explorer, right-click the SSIS Packages folder and click New SSIS Package.

2. When the new package is created, in Solution Explorer, right-click Package1.dtsx and click Rename. Then rename the package to Extract Changed Employee Data.dtsx.

3. In the SSIS Toolbox, in the Other Tasks section, drag a CDC Control Task to the control flow surface of the Extract Changed Employee Data.dtsx package.

4. On the control flow surface, right-click CDC Control Task and click Rename. Then rename it to Get

Processing Range.

5. Double-click the Get Processing Range task, and in the CDC Control Task Editor dialog box, set the following properties. Then click OK.

Property Setting

SQL Server CDC database ADO.NET connection manager

localhost HumanResources ADO NET

CDC control operation Get processing range

Variable containing the CDC state Click New and then in the Add New Variable dialog box, click OK to create a variable named

CDC_State in the Extract Changed Employee Data container.

Automatically store state in a database table Selected Connection manager for the database

where the state is stored

localhost Staging ADO NET

Table to use for storing state [dbo].[cdc_states]

MCT USE ONL

Y. STUDENT USE PR

OHIBITED

L7-56 Module 7: Implementing an Incremental ETL Process

6. In the SSIS Toolbox, in the Favorites section, drag a Data Flow Task to the to the control flow surface of the Extract Initial Employee Data.dtsx package, and drop it under the Get Processing

Range task.

7. On the control flow surface, right-click Data Flow Task and click Rename. Then rename it to Extract

Changed Employee Data. Then drag a success precedence constraint from the Get Processing Range task to the Extract Changed Employee Data task.

8. Double-click the Extract Changed Employee Data task to view its data flow surface.

9. In the SSIS Toolbox, in the Other Sources section, drag a CDC Source to the to the data flow surface of the Extract Changed Employee Data task. Then on the data flow surface, right-click CDC Source, click Rename, and rename it to Employee Changes.

10. Double-click Employee Changes and in the CDC Source dialog box, set the following properties. Then click OK.

Property Setting

ADO.NET connection manager localhost HumanResources ADO NET

CDC enabled table [dbo].[Employee]

Capture instance dbo_Employee

CDC processing mode Net

Variable containing the CDC state User::CDC_State

11. In the SSIS Toolbox, in the Other Transforms section, drag a CDC Splitter to the data flow surface and drop it below the Employee Changes data source. Then click Employee Changes and drag the blue data flow path connection to the CDC Splitter transformation.

12. In the SSIS Toolbox, in the Other Destinations section, drag an ADO NET Destination to the data flow surface and drop it below the CDC Splitter transformation. Then on the data flow surface, right- click ADO NET Destination, click Rename, and rename it to Employee Inserts.

13. Click the CDC Splitter transformation and drag the blue data flow path connection to the Employee

Inserts destination. In the Input Output Selection dialog box, select the InsertOutput output and

click OK.

14. Double-click Employee Inserts and in the ADO.NET Destination Editor dialog box, set the following properties. Then click OK.

Property Setting

Connection manager localhost Staging ADO NET

Use a table or view "dbo"."EmployeeInserts"

Mappings On the Mappings tab, verify that all available input columns other

than _$start_lsn, _$operation, and _$update_mask are mapped to destination columns of the same name.

MCT USE ONL

Y. STUDENT USE PR

OHIBITED

Lab 7A: Extracting Modified Data L7-57

15. In the SSIS Toolbox, in the Other Destinations section, drag an ADO NET Destination to the data flow surface and drop it directly below and to the left of the CDC Splitter transformation. Then on the data flow surface, right-click ADO NET Destination, click Rename, and rename it to Employee

Updates.

16. Click the CDC Splitter transformation and drag the blue data flow path connection to the Employee

Updates destination. In the Input Output Selection dialog box, select the UpdateOutput output

and click OK.

17. Double-click Employee Updates and in the ADO.NET Destination Editor dialog box, set the following properties. Then click OK.

Property Setting

Connection manager localhost Staging ADO NET

Use a table or view "dbo"."EmployeeUpdates"

Mappings On the Mappings tab, verify that all available input columns other

than _$start_lsn, _$operation, and _$update_mask are mapped to destination columns of the same name.

18. In the SSIS Toolbox, in the Other Destinations section, drag an ADO NET Destination to the data flow surface and drop it below and to the right of the CDC Splitter transformation. Then on the data flow surface, right-click ADO NET Destination, click Rename, and rename it to Employee Deletes. 19. Click the CDC Splitter transformation and drag the blue data flow path connection to the Employee

Deletes destination. The DeleteOutput output should be selected automatically.

20. Double-click Employee Deletes and in the ADO.NET Destination Editor dialog box, set the following properties. Then click OK.

Property Setting

Connection manager localhost Staging ADO NET

Use a table or view "dbo"."EmployeeDeletes"

Mappings On the Mappings tab, verify that all available input columns other

than _$start_lsn, _$operation, and _$update_mask are mapped to destination columns of the same name.

21. Click the Control Flow tab of the Extract Changed Employee Data.dtsx package, and in the SSIS Toolbox, in the Other Tasks section, drag a CDC Control Task to the control flow surface and drop it below the Extract Changed Employee Data task.

22. On the control flow surface, right-click CDC Control Task and click Rename. Then rename it to Mark

Processed Range. Then drag a success precedence constraint from the Extract Changed Employee Data task to the Mark Processed Range task.

MCT USE ONL

Y. STUDENT USE PR

OHIBITED

L7-58 Module 7: Implementing an Incremental ETL Process

23. Double-click the Mark Processed Range task, and in the CDC Control Task Editor dialog box, set the following properties. Then click OK.

Property Setting

SQL Server CDC database ADO.NET connection manager

localhost HumanResources ADO NET

CDC control operation Mark processed range

Variable containing the CDC state User:: CDC_State

Automatically store state in a database table Selected Connection manager for the database where the

state is stored

localhost Staging ADO NET

Table to use for storing state [dbo].[cdc_states]

State name CDC_State

24. On the File menu, click Save All.

In document 10777A ENU TrainerHandbook Part4 (Page 73-76)