SQL Server Integration Services Design Patterns
Second Edition
Andy Leonard
Tim Mitchell
Matt Masson
Jessica Moss
Michelle Ufford
Apress
Contents
First-Edition Foreword xv
About the Authors xvii
About the Technical Reviewer xix
Chapter 1: Metadata Collection 1
About SQL Server Data Tools 1
A Peek at the Final Product 1
SQL Server Metadata Catalog 3
sys.dm_os_performance_counters 3
sys.dm_db_index_usage_stats 3
sys.dm_os_sys_info 3
sys.tables 3
sys.indexes 3
sys.partitions 4
sys.allocation_units 4
Setting Up the Central Repository 4
The Iterative Framework 6
Metadata Collection 14
Summary 26
Chapter 2: Execution Patterns 27
Building the Demonstration SSIS Package 27
Debug Execution 28
Command-Line Execution 29
Execute Package Utility 30
The SQL Server 2014 Integration Services Service 30
Integration Services Catalogs 30
Integration Server Catalog Stored Procedures 31
Scheduling SSIS Package Execution 53
Scheduling an SSIS Package 53
Scheduling a File System Package 54
Running SQL Server Agent Jobs with the Custom Execution Framework 55
Running the Custom Execution Framework with SQL Server Agent 56
Execute Package Task 57
Execution from Managed Code 58
The Demo Application 58
The frmMain Form 59
Conclusion 70
Chapter 3: Scripting Patterns 71
The Toolset 71
Should I Use Script? 72
The Script Editor 72
The Script Task 75
The Script Component 77
Script Maintenance Patterns 78
Code Reuse 78
Source Control 79
Scripting Design Patterns 79
Connection Managers and Scripting 80
Variables 82
Naming Patterns 85
Chapter 4: SQL Server Source Patterns 87
Setting Up a Source 87
Selecting a SQL Server Connection Manager and Provider 88
ADO.NET 89
ODBC 89
OLE DB 91
Creating a SQL Server Source Component 92
Writing a SQL Server Source Component Query 95
ADO.NET Data Access 95
OLE DB Data Access 96
Waste Not, Want Not 97
Data Translations 97
Source Assistant 97
Summary 99
Chapter 5: Data Correction with Data Quality Services 101
Overview of Data Quality Services 101
Using the Data Quality Client 102
Using DQS with SSIS 108
DQS Cleansing Transform 108
DQS Extensions on CodePlex 113
Cleansing Data in the Data Flow 114
Handling the Output of the DQS Cleansing Transform 114
Performance Considerations 117
Approving and Importing Cleansing Rules 121
Conclusion 123
Chapter 6: DB2 Source Patterns 125
DB2 Database Family 125
Selecting a DB2 Provider 126
Find the Database Version 126
Pick Provider Vendor 127
Connecting to a DB2 Database 127
Querying the DB2 Database 130
DB2 Source Component Parameters 131
DB2 Source Component Dynamic Queries 132
Summary 133
Chapter 7: Flat File Source Patterns 135
Flat File Sources 135
Moving to SSIS! 136
Strong-Typing the Data 138
Introducing a Data-Staging Pattern 140
Variable-Length Rows 143
Reading into a Data Flow 144
Splitting Record Types 145
Terminating the Streams 146
Header and Footer Rows 147
Consuming a Footer Row 148
Consuming a Header Row 150
Producing a Footer Row 152
Producing a Header Row 159
The Archive File Pattern 163
Summary 169
Chapter 8: Loading a PDW Region in APS 171
Massively Parallel Processing 171
APS Appliance Overview 172
Hardware Architecture 172
Software Architecture 173
Shared-Nothing Architecture 175
Clustered Columnstore Indexes 175
Loading Data 176
DWLoader vs. Integration Services 176
ETL vs. ELT 177
Data Import Pattern for PDW 178
Prerequisites 178
Preparing the Data 179
Package Overview 181
The Data Source 181
The Data Transformation 183
The Data Destination 184
Multithreading 189
Limitations 190
Summary 191
Chapter 9: XML Patterns 193
Using the XML Source 193
Dealing with Multiple Outputs 194
Making Things Easier with XSLT 200
Using a Script Component 203
Configuring the Script Component 203
Processing XML with XmlSerializer 209
Processing XML with XmlReader and LINQ to XML 210
Conclusion 212
Chapter 10: Expression Language Patterns 213
Getting to Know the Expression Language 213
What Is the Expression Language? 213
Why Use Expressions? 214
Language Essentials 215
Limitations 215
Putting the Expression Language to Work 216
Package Expressions 216
Variable Expressions 217
Connection Managers 217
Project-Level Connection Managers 219
Control Flow 219
Data Flow Expressions 222
Conclusion 226
Chapter 11: Data Warehouse Patterns 227
Incremental Loads 227
What Is an Incremental Load? 227
Why Incremental Loads? 228
The Slowly Changing Dimension 228
Incremental Loads of Fact Data 228
Incremental Loads in SSIS 228
Native SSIS Components 229
The Slowly Changing Dimension Wizard 232
The MERGE Statement 234
Change Data Capture (CDC) 237
Data Errors 242
Simple Errors 242
Missing Data 243
Coding to Allow Errors 246
Data Warehouse ETL Workflow 248
Dividing Up the Work 248
One Package = One Unit of Work 249
Conclusion 250
Chapter 12: OData Source 251
Understanding the OData Protocol 251
Data Type Mappings 252
Query Options 253
Configuring the OData Connection Manager 254
Enabling Microsoft Online Services Authentication 254
Configuring the Source Component 256
Overriding Data Types 259
Conclusion 260
Chapter 13: Slowly Changing Dimensions 261
The Slowly Changing Dimension Transform 261
Running the Wizard 262
Using the Transformations 267
Optimizing Performance 268
Third-Party SCD Components 269
Merge Pattern 270
Handling Type 1 Changes 271
Handling Type 2 Changes 272
Conclusion 272
Chapter 14: Loading the Cloud 275
Interacting with the Cloud 275
Incremental Loads to Azure SQL Database 276
Change Detection 276
New Rows (Only) 276
Building the Cloud Loader 277
Conclusion 280
Chapter 15: Logging and Reporting Patterns 281
Package Logging and Reporting 281
Setting Up Package Logging 281
Reporting on Package Logging 282
Design Pattern: Package Executions 283
Catalog Logging and Reporting 283
Setting Up Catalog Logging 283
Catalog Tables 285
Changing Logging Levels After the Fact 286
Design Patterns 287
Changing the Logging Level 287
Using the Existing Reports 289
Creating New Reports 290
Summary 291
Chapter 16: Parent-Child Patterns 293
Master Package Pattern 293
Assign the Child Package 294
Configure Parameter Binding 295
Dynamic Child Package Pattern 296
Child-to-Parent Variable Pattern 302
Conclusion 303
Chapter 17: Configuration 305
Parameters 305
Configuring Your Package Using Parameters 307
Using the Parametrize Dialog 309
Creating Visual Studio Configurations 310
Specifying Entry-Point Packages 312
Connection Managers 313
Parameter Configuration on the Server 313
Default Configuration 314
Server Environments 315
Default Parameter Values Using T-SQL 317
Package Execution Through the SSIS Catalog 317
Parameters with DTEXEC 320
Projects on the File System 320
Projects in the SSIS Catalog 321
Dynamic Configurations 322
Configuring from a Database Table 323
Setting Values Using a Script Task 326
Dynamic Package Executions 327
Conclusion 329
Chapter 18: Deployment 331
Project Deployment Model 331
SSIS Catalog 332
Deployment Methods 334
Deployment from the Command Line 335
Deployment Using Custom Code 336
Deployment Using PowerShell 337
Deployment Using SQL 338
Package Deployment Model 339
Conclusion 341
Chapter 19: Business Intelligence Markup Language 343
A Brief History of Business Intelligence Markup Language 343
Building Your First Biml File 344
Building a Basic Incremental Load SSIS Package 347
Creating Databases and Tables 347
Adding Metadata 349
Specifying a Data Flow Task 350
Adding Transforms 350
Testing the Biml 356
Using Biml as an SSIS Design Patterns Engine 360
Time for a Test 367
Conclusion 368
Chapter 20: Biml and SSIS Frameworks 369
Using Biml with an SSIS Framework 369
Adding SSIS Package Metadata to the Framework 369
Executing the Biml File 374
Generating the SSIS Command-Line 375
Summarizing 376
Appendix A: Evolution of an SSIS Framework 377
Starting in the Middle 377
Introducing SSIS Applications 387
A Note About Relationships 389
Retrieving SSIS Applications in T-SQL 392
Retrieving SSIS Applications in SSIS 396
Monitoring Execution 399
Building Application Instance Logging 399
Building Package Instance Logging 406
Building Error Logging 410
Reporting Execution Metrics 420
Conclusion 434
Index 435