
IBM Information Server Blue Book


Academic year: 2021


IBM Information Server

Information Server Introduction


Version 8.0

Note

Before using this information and the product that it supports, be sure to read the general information under “Notices and trademarks” on page 133.

© Copyright International Business Machines Corporation 2006. All rights reserved.

Contents

Chapter 1. Introduction

Chapter 2. Architecture and concepts
  Parallel processing in IBM Information Server
  Parallelism basics in IBM Information Server
  Scalability in IBM Information Server
  Support for grid computing in IBM Information Server
  Shared services in IBM Information Server
  Administrative services in IBM Information Server
  Reporting services in IBM Information Server

Chapter 3. Metadata services
  Metadata services introduction
  A closer look at metadata services in IBM Information Server
  WebSphere Business Glossary
  WebSphere Business Glossary tasks
  WebSphere Metadata Server
  Information resources for metadata services

Chapter 4. Service-oriented integration
  Introduction to service-oriented integration in IBM Information Server
  A closer look at service-oriented integration in IBM Information Server
  SOA components in IBM Information Server
  WebSphere Information Services Director tasks
  SOA and data integration
  Information resources for WebSphere Information Services Director

Chapter 5. WebSphere Information Analyzer
  WebSphere Information Analyzer capabilities
  A closer look at WebSphere Information Analyzer
  WebSphere Information Analyzer tasks
  Data profiling and analysis
  Data monitoring and trending
  Results of the analysis
  Information resources for WebSphere Information Analyzer

Chapter 6. WebSphere QualityStage
  Introduction to WebSphere QualityStage
  A closer look at WebSphere QualityStage
  WebSphere QualityStage tasks
  Investigate stage
  Standardize stage
  Match stages overview
  Survive stage
  Accessing metadata services
  Information resources for WebSphere QualityStage

Chapter 7. WebSphere DataStage
  Introduction to WebSphere DataStage
  A closer look at WebSphere DataStage
  WebSphere DataStage tasks
  WebSphere DataStage elements
  Overview of the Designer, Director, and Administrator clients
  Data transformation for zSeries
  WebSphere DataStage MVS Edition
  WebSphere DataStage Enterprise for z/OS
  Information resources for WebSphere DataStage

Chapter 8. WebSphere Federation Server
  Introduction to WebSphere Federation Server
  A closer look at WebSphere Federation Server
  The federated server and database
  Wrappers and other federated objects
  Query optimization
  Two-phase commit for federated transactions
  Rational Data Architect
  WebSphere Federation Server tasks
  Federated objects
  Cache tables for faster query performance
  Monitoring federated queries
  Federated stored procedures
  Information resources for WebSphere Federation Server

Chapter 9. Companion products
  WebSphere DataStage Packs
  A closer look at WebSphere DataStage Packs
  WebSphere DataStage Change Data Capture
  WebSphere Replication Server
  WebSphere Data Event Publisher
  Information resources for IBM Information Server companion products

Accessing information about IBM
  Contacting IBM
  Accessible documentation
  Providing comments on the documentation

Notices and trademarks
  Notices
  Trademarks

Chapter 1. Introduction

Most of today’s critical business initiatives cannot succeed without effective integration of information. Initiatives such as single view of the customer, business intelligence, supply chain management, and Basel II and Sarbanes-Oxley compliance require consistent, complete, and trustworthy information.

IBM® Information Server is the industry’s first comprehensive, unified foundation for enterprise information architectures, capable of scaling to meet any information volume requirement so that companies can deliver business results within these initiatives faster and with higher quality results.

IBM Information Server combines the technologies within the IBM Information Integration Solutions portfolio (WebSphere® DataStage®, WebSphere QualityStage, WebSphere ProfileStage, and WebSphere Information Integrator) into a single unified platform that enables companies to understand, cleanse, transform, and deliver trustworthy and context-rich information.

Over the last two decades, companies have made significant investments in enterprise resource planning, customer relationship management, and supply chain management packages. Companies also are leveraging innovations such as service-oriented architectures (SOA), Web services, XML, grid computing, and Radio Frequency Identification (RFID).

These investments have increased the amount of data that companies are capturing about their businesses. But companies encounter significant integration hurdles when they try to turn that data into consistent, timely, and accurate information for decision-making.

IBM Information Server helps you derive more value from complex, heterogeneous information. It helps business and IT personnel collaborate to understand the meaning, structure, and content of information across a wide variety of sources. IBM Information Server helps you access and use information in new ways to drive innovation, increase operational efficiency, and lower risk.

IBM Information Server supports all of these initiatives:

Business intelligence

IBM Information Server makes it easier to develop a unified view of the business for better decisions. It helps you understand existing data sources, cleanse, correct, and standardize information, and load analytical views that can be reused throughout the enterprise.

Master data management

IBM Information Server simplifies the development of authoritative master data by showing where and how information is stored across source systems. It also consolidates disparate data into a single, reliable record, cleanses and standardizes information, removes duplicates, and links records together across systems. This master record can be loaded into operational data stores, data warehouses, or master data applications such as WebSphere Customer Center. The record can also be assembled,

Infrastructure rationalization

IBM Information Server aids in reducing operating costs by showing relationships between systems and by defining migration rules to consolidate instances or move data from obsolete systems. Data cleansing and matching ensure high-quality data in the new system.

Business transformation

IBM Information Server can speed development and increase business agility by providing reusable information services that can be plugged into applications, business processes, and portals. These standards-based information services are maintained centrally by information specialists but are widely accessible throughout the enterprise.

Risk and compliance

IBM Information Server helps improve visibility and data governance by enabling complete, authoritative views of information with proof of lineage and quality. These views can be made widely available and reusable as shared services, while the rules inherent in them are maintained centrally.

Capabilities

IBM Information Server features a unified set of product modules that solve multiple types of business problems. Information validation, access, and processing rules can be reused across projects, leading to a higher degree of consistency, stronger control over data, and improved efficiency in IT projects.

As Figure 1 shows, IBM Information Server enables businesses to perform four key integration functions:

Understand your data

IBM Information Server can help you automatically discover, define, and model information content and structure and understand and analyze the meaning, relationships, and lineage of information. By automating data profiling and data-quality auditing within systems, organizations can achieve these goals:

- Understand data sources and relationships
- Eliminate the risk of using or proliferating bad data
- Improve productivity through automation
- Leverage existing IT investments

IBM Information Server makes it easier for businesses to collaborate across roles. Data analysts can use analysis and reporting functionality, generating integration specifications and business rules that they can monitor over time. Subject matter experts can use Web-based tools to define, annotate, and report on fields of business data. A common metadata foundation makes it easier for different types of users to create and manage metadata by using tools that are optimized for their roles.

Cleanse your information

IBM Information Server supports information quality and consistency by standardizing, validating, matching, and merging data. It can certify and enrich common data elements, use trusted data such as postal records for name and address information, and match records across or within data sources. IBM Information Server allows a single record to survive from the best information across sources for each unique entity, helping you to create a single, comprehensive, and accurate view of information across source systems.

Transform your data into information

IBM Information Server transforms and enriches information to ensure that it is in the proper context for new uses. Hundreds of prebuilt transformation functions combine, restructure, and aggregate information. Transformation functionality is broad and flexible, to meet the requirements of varied integration scenarios. For example, IBM Information Server provides inline validation and transformation of complex data types such as U.S. Health Insurance Portability and Accountability Act (HIPAA), along with high-speed joins and sorts of heterogeneous data. IBM Information Server also provides high-volume, complex data transformation and movement functionality that can be used for standalone extract/transform/load (ETL) scenarios, or as a real-time data processing engine for applications or processes.

Deliver your information

IBM Information Server provides the ability to virtualize, synchronize, or move information to the people, processes, or applications that need it. Information can be delivered through federation or time-based or event-based processing, moved in large bulk volumes from location to location, or accessed in place when it cannot be consolidated.

IBM Information Server provides direct, native access to a wide variety of information sources, both mainframe and distributed. It provides access to databases, files, services and packaged applications, and to content repositories and collaboration systems. Companion products allow high-speed replication, synchronization and distribution across databases, change data capture, and event-based publishing of information.

Chapter 2. Architecture and concepts

IBM Information Server provides a unified architecture that works with all types of information integration. Common services, unified parallel processing, and unified metadata are at the core of the server architecture.

The architecture is service oriented, enabling IBM Information Server to work within an organization’s evolving enterprise service-oriented architectures. A service-oriented architecture also connects the individual product modules of IBM Information Server.

By eliminating duplication of functions, the architecture efficiently uses hardware resources and reduces the amount of development and administrative effort that are required to deploy an integration solution.

Figure 2 on page 5 shows the five top-level components of the IBM Information Server architecture.

Unified parallel processing engine

Much of the work that IBM Information Server does takes place within the parallel processing engine. The engine handles data processing needs as diverse as performing analysis of large databases for WebSphere Information Analyzer, data cleansing for WebSphere QualityStage, and complex transformations for WebSphere DataStage. This parallel processing engine is designed to deliver:

- Parallelism and pipelining to complete increasing volumes of work in decreasing time windows
- Scalability by adding hardware (for example, processors or nodes in a grid) with no changes to the data integration design
- Optimized database, file, and queue processing to handle large files that cannot fit in memory all at once, or large numbers of small files

Common connectivity

IBM Information Server connects to information sources whether they are structured, unstructured, on the mainframe, or applications. Metadata-driven connectivity is shared across the product modules, and connection objects are reusable across functions.

Connectors provide design-time importing of metadata, data browsing and sampling, run-time dynamic metadata access, error handling, and high functionality and high performance run-time data access. Prebuilt interfaces for packaged applications called Packs provide adapters to SAP, Siebel, Oracle, and others, enabling integration with enterprise applications and associated reporting and analytical systems.

Unified metadata

IBM Information Server is built on a unified metadata infrastructure that enables shared understanding between business and technical domains. This infrastructure reduces development time and provides a persistent record that can improve confidence in information. All functions of IBM Information Server share the same metamodel, making it easier for different roles and functions to collaborate.

A common metadata repository provides persistent storage for all IBM Information Server product modules. All of the products depend on the repository to navigate, query, and update metadata. The repository contains two kinds of metadata:

Dynamic

Dynamic metadata includes design-time information.

Operational

Operational metadata includes performance monitoring, audit and log data, and data profiling sample data.

Because the repository is shared by all product modules, profiling information that is created by WebSphere Information Analyzer is instantly available to users of WebSphere DataStage and QualityStage, for example. The repository is a J2EE application that uses a standard relational database such as IBM DB2®, Oracle, or SQL Server for persistence (DB2 is provided with IBM Information Server). These databases provide backup, administration, scalability, parallel access, transactions, and concurrent access.

Common services

IBM Information Server is built entirely on a set of shared services that centralize core tasks across the platform. These include administrative tasks such as security, user administration, logging, and reporting. Shared services allow these tasks to be managed and controlled in one place, regardless of which product module is being used. The common services also include the metadata services, which provide standard service-oriented access and analysis of metadata across the platform. In addition, the common services layer manages how services are deployed from any of the product functions, allowing cleansing and transformation rules or federated queries to be published as shared services within an SOA, using a consistent and easy-to-use mechanism.

IBM Information Server products can access three general categories of service:

Design

Design services help developers create function-specific services that can also be shared. For example, WebSphere Information Analyzer calls a column analyzer service that was created for enterprise data analysis but can be integrated with other parts of IBM Information Server because it exhibits common SOA characteristics.

Execution

Execution services include logging, scheduling, monitoring, reporting, security, and Web framework.

Metadata

Using metadata services, metadata is shared “live” across tools so that changes made in one IBM Information Server product are instantly visible across all of the product modules. Metadata services are tightly integrated with the common repository and are packaged in WebSphere Metadata Server. You can also exchange metadata with external tools by using metadata services.

The common services layer is deployed on J2EE-compliant application servers such as IBM WebSphere Application Server, which is included with IBM Information Server.

Unified user interface

The face of IBM Information Server is a common graphical interface and tool framework. Shared interfaces such as the IBM Information Server console and Web console provide a common look and feel, visual controls, and user experience across products. Common functions such as catalog browsing, metadata import, query, and data browsing all expose underlying common services in a uniform way. IBM Information Server provides rich client interfaces for highly detailed development work and thin clients that run in Web browsers for administration.

Application programming interfaces (APIs) support a variety of interface styles that include standard request-reply, service-oriented, event-driven, and scheduled task invocation.

Parallel processing in IBM Information Server

Companies today must manage, store, and sort through rapidly expanding volumes of data and deliver it to end users as quickly as possible.

To address these challenges, organizations need a scalable data integration architecture that contains the following components:

- A method for processing data without writing to disk, in batch and real time.
- Dynamic data partitioning and in-flight repartitioning.
- Scalable hardware that supports symmetric multiprocessing (SMP), clustering, grid, and massively parallel processing (MPP) platforms without requiring changes to the underlying integration process.
- Support for parallel databases including DB2, Oracle, and Teradata, in parallel and partitioned configurations.
- An extensible framework to incorporate in-house and third-party software.

IBM Information Server addresses all of these requirements by exploiting both pipeline parallelism and partition parallelism to achieve high throughput, performance, and scalability.

Parallelism basics in IBM Information Server

The pipeline parallelism and partition parallelism that are used in IBM Information Server underlie its high-performance, scalable architecture.

Data pipelining

Data pipelining is the process of pulling records from the source system and moving them through the sequence of processing functions that are defined in the data-flow (the job). Because records are flowing through the pipeline, they can be processed without writing the records to disk, as Figure 3 shows.

Data can be buffered in blocks so that each process is not slowed when other components are running. This approach avoids deadlocks and speeds performance by allowing both upstream and downstream processes to run concurrently.

Without data pipelining, the following issues arise:

- Data must be written to disk between processes, degrading performance and increasing storage requirements and the need for disk management.
- The developer must manage the I/O processing between components.
- The process becomes impractical for large data volumes.
- The application will be slower, as disk use, management, and design complexities increase.
- Each process must complete before downstream processes can begin, which limits performance and full use of hardware resources.
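The record-at-a-time flow described above can be illustrated with Python generators. This is only a minimal sketch of the idea, not how the IBM Information Server engine is implemented: the stage names (`read_source`, `transform`, `load`) and the record layout are assumptions made for illustration, and generators run in a single thread, whereas the engine runs stages as concurrent, buffered processes.

```python
from typing import Iterable, Iterator

def read_source(rows: Iterable[dict]) -> Iterator[dict]:
    """Stage 1 (hypothetical): pull records from the source one at a time."""
    for row in rows:
        yield row

def transform(records: Iterator[dict]) -> Iterator[dict]:
    """Stage 2 (hypothetical): standardize the surname as each record arrives."""
    for rec in records:
        yield {**rec, "surname": rec["surname"].strip().title()}

def load(records: Iterator[dict]) -> list:
    """Stage 3 (hypothetical): a list stands in for the target table."""
    return [rec for rec in records]

def run_pipeline(rows: Iterable[dict]) -> list:
    # Each record passes through all stages in turn;
    # no intermediate data set is written between stages.
    return load(transform(read_source(rows)))

source = [{"id": 1, "surname": " smith "}, {"id": 2, "surname": "JONES"}]
result = run_pipeline(source)
```

Because each stage consumes records as the previous stage produces them, no stage needs the full data set to be staged before it can start work.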


Data partitioning

Data partitioning is an approach to parallelism that involves breaking the record set into partitions, or subsets of records. Data partitioning generally provides linear increases in application performance. Figure 4 shows data that is partitioned by customer surname before it flows into the Transformer stage.

A scalable architecture should support many types of data partitioning, including the following types:

- Hash key (data) values
- Range
- Round-robin
- Random
- Entire
- Modulus
- Database partitioning

IBM Information Server automatically partitions data based on the type of partition that the stage requires. Typical packaged tools lack this capability and require developers to manually create data partitions, which results in costly and time-consuming rewriting of applications or the data partitions whenever the administrator wants to use more hardware capacity.

In a well-designed, scalable architecture, the developer does not need to be concerned about the number of partitions that will run, the ability to increase the number of partitions, or repartitioning data.
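Four of the partitioning types listed above (hash, modulus, round-robin, and range) can be sketched in a few lines of Python. The function names, the record fields, and the use of CRC32 as the hash are assumptions chosen for illustration; they do not describe the partitioners that IBM Information Server actually uses.

```python
import bisect
import zlib
from itertools import cycle

def hash_partition(records, key, n):
    """Send each record to the partition chosen by a stable hash of its key."""
    parts = [[] for _ in range(n)]
    for rec in records:
        digest = zlib.crc32(str(rec[key]).encode("utf-8"))
        parts[digest % n].append(rec)
    return parts

def modulus_partition(records, key, n):
    """Use an integer key modulo the partition count."""
    parts = [[] for _ in range(n)]
    for rec in records:
        parts[rec[key] % n].append(rec)
    return parts

def round_robin_partition(records, n):
    """Deal records out in turn so that partitions stay evenly sized."""
    parts = [[] for _ in range(n)]
    for idx, rec in zip(cycle(range(n)), records):
        parts[idx].append(rec)
    return parts

def range_partition(records, key, upper_bounds):
    """Place each record in the first range whose upper bound exceeds its key."""
    parts = [[] for _ in range(len(upper_bounds) + 1)]
    for rec in records:
        parts[bisect.bisect_right(upper_bounds, rec[key])].append(rec)
    return parts

# Hypothetical customer records, partitioned two different ways.
customers = [
    {"account": 10, "surname": "Adams"},
    {"account": 11, "surname": "Jones"},
    {"account": 12, "surname": "Smith"},
    {"account": 13, "surname": "Young"},
]
by_surname = range_partition(customers, "surname", ["H", "P"])
by_account = modulus_partition(customers, "account", 2)
```

Hash and modulus partitioning keep records with the same key together, which matters for joins and aggregations; round-robin only balances partition sizes.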

Dynamic repartitioning

In the examples shown in Figure 4 and Figure 5 on page 10, data is partitioned based on customer surname, and then the data partitioning is maintained throughout the flow.

This type of partitioning is impractical for many uses, such as a transformation that requires data partitioned on surname but must then be loaded into the data warehouse by using the customer account number.

Dynamic data repartitioning is a more efficient and accurate approach. With dynamic data repartitioning, data is repartitioned while it moves between processes without writing the data to disk, based on the downstream process that data partitioning feeds. The IBM Information Server parallel engine manages the communication between processes for dynamic repartitioning.

Data is also pipelined to downstream processes when it is available, as Figure 6 shows.

Without partitioning and dynamic repartitioning, the developer must take these steps:

- Create separate flows for each data partition, based on the current hardware configuration.
- Write data to disk between processes.
- Manually repartition the data.
- Start the next process.

The application will be slower, disk use and management will increase, and the design will be much more complex. The dynamic repartitioning feature of IBM Information Server helps you overcome these issues.
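The surname-then-account-number case described above can be sketched as an in-memory redistribution. The helpers below (`partition_by`, `repartition`, and the two key functions) are hypothetical names, and the sketch is single-process; it only illustrates that records can move from one partitioning scheme to another without being written to disk in between.

```python
def partition_by(records, key_func, n):
    """Distribute records into n partitions using an integer key function."""
    parts = [[] for _ in range(n)]
    for rec in records:
        parts[key_func(rec) % n].append(rec)
    return parts

def repartition(partitions, key_func, n):
    """Redistribute already-partitioned records on a new key, in memory."""
    def stream():
        for part in partitions:
            yield from part
    return partition_by(stream(), key_func, n)

def surname_key(rec):
    # Assumption: partition on the first letter of the surname.
    return ord(rec["surname"][0].upper())

def account_key(rec):
    return rec["account"]

customers = [
    {"account": 10, "surname": "Adams"},
    {"account": 11, "surname": "Jones"},
    {"account": 12, "surname": "Smith"},
    {"account": 13, "surname": "Young"},
]

# The transformation runs on surname partitions...
by_surname = partition_by(customers, surname_key, 2)
transformed = [
    [{**rec, "surname": rec["surname"].upper()} for rec in part]
    for part in by_surname
]
# ...and the load step receives the same records partitioned on account number.
by_account = repartition(transformed, account_key, 2)
```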

Figure 5. Data partitioning and parallel execution - a less practical approach

Scalability in IBM Information Server

IBM Information Server is built on a highly scalable software architecture that delivers high levels of throughput and performance.

For maximum scalability, integration software must do more than run on Symmetric Multiprocessing (SMP) and Massively Parallel Processing (MPP) computer systems. If the data integration platform does not saturate all of the nodes of the MPP box or system in the cluster or grid, scalability cannot be maximized.

The IBM Information Server components fully exploit SMP, clustered, grid, and MPP environments to optimize the use of all available hardware resources.

For example, when you create a simple sequential data-flow graph by using the WebSphere DataStage and QualityStage Designer, you do not need to worry about the underlying hardware architecture or number of processors. A separate configuration file defines the resources (physical and logical partitions or nodes, memory, and disk) of the underlying multiprocessor computing system.

As Figure 7 on page 12 shows, the configuration provides a clean separation between creating the sequential data-flow graph and the parallel execution of the application. This separation simplifies the development of scalable data integration systems that run in parallel.
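The separation between the job design and the hardware configuration can be illustrated with a small sketch. Everything here is an assumption for illustration: the `config` dictionary is not the product's configuration file format, and `run_job` is a hypothetical runner that uses threads only to show that the same job definition runs unchanged on one node or on four.

```python
from concurrent.futures import ThreadPoolExecutor

def uppercase_surname(rec):
    """A single stage of the hypothetical job."""
    return {**rec, "surname": rec["surname"].upper()}

# The job is defined once, as a simple sequential list of stages.
JOB = [uppercase_surname]

def run_partition(job, records):
    """Apply every stage of the job, in order, to one partition."""
    out = []
    for rec in records:
        for stage in job:
            rec = stage(rec)
        out.append(rec)
    return out

def run_job(job, records, config):
    """Run the job with the degree of parallelism that the configuration defines."""
    n = len(config["nodes"])
    partitions = [records[i::n] for i in range(n)]  # round-robin split
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = pool.map(lambda part: run_partition(job, part), partitions)
        return [rec for part in results for rec in part]

people = [{"id": i, "surname": name}
          for i, name in enumerate(["adams", "jones", "smith", "young", "zane"])]

# Only the configuration changes between these two runs, not the job.
one_node = run_job(JOB, people, {"nodes": ["node1"]})
four_nodes = run_job(JOB, people, {"nodes": ["node1", "node2", "node3", "node4"]})
```

The output order differs between the two runs because records are split across partitions, but the set of results is the same.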

Without support for scalable hardware environments, the following problems can occur:

- Processing is slower, because hardware resources are not maximized.
- Application design and hardware configuration cannot be decoupled, and manual intervention and possibly redesign is required for every hardware change.
- Scaling on demand is not possible.

IBM Information Server leverages powerful parallel processing technology to ensure that large volumes of information can be processed quickly. This technology ensures that processing capacity does not inhibit project results and allows solutions to easily expand to new hardware and to fully leverage the processing power of all available hardware.

Support for grid computing in IBM Information Server

With hardware computing power a commodity, grid computing is a highly compelling option for large enterprises. Grid computing allows you to apply more processing power to a task than was previously possible.

Grid computing uses all of the low-cost computing resources, processors, and memory that are available on the network to create a single system image. Grid computing software provides a list of available computing resources and a list of tasks. When a computer becomes available, the grid software assigns new tasks according to appropriate rules.

A grid can be made up of thousands of computers. Grid-computing software balances IT supply and demand by letting users specify processor and memory requirements for their jobs, and then find available machines on a network to meet those specifications.
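Matching jobs to machines by processor and memory requirements can be sketched as a first-fit assignment. The `Machine` and `Job` classes and the first-fit rule are assumptions for illustration only; the source does not describe the rules that any particular grid scheduler applies.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    name: str
    cpus: int
    memory_gb: int
    busy: bool = False

@dataclass
class Job:
    name: str
    cpus: int
    memory_gb: int

def assign(jobs, machines):
    """First-fit: give each job the first idle machine that meets its requirements."""
    assignments = {}
    for job in jobs:
        for machine in machines:
            fits = machine.cpus >= job.cpus and machine.memory_gb >= job.memory_gb
            if not machine.busy and fits:
                machine.busy = True
                assignments[job.name] = machine.name
                break
        else:
            # No idle machine satisfies the request; the job must wait.
            assignments[job.name] = None
    return assignments

machines = [Machine("node-a", cpus=2, memory_gb=4),
            Machine("node-b", cpus=8, memory_gb=32)]
jobs = [Job("big", cpus=4, memory_gb=16),
        Job("small", cpus=1, memory_gb=2),
        Job("huge", cpus=16, memory_gb=64)]
plan = assign(jobs, machines)
```

A production scheduler would also release machines when jobs finish and queue the jobs that could not be placed; the sketch covers only the matching step.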

The parallel processing architecture of IBM Information Server leverages the computing power of grid environments and greatly simplifies the development of scalable integration systems that execute in parallel for grid environments.

IBM Information Server’s pre-bundled grid edition provides rapid out-of-the-box implementation of grid scalability. It includes an integrated grid scheduler and integrated grid optimization. These capabilities help you easily and flexibly deploy integration logic across a grid without impacting job design, which provides unlimited scalability.

Shared services in IBM Information Server

IBM Information Server provides extensive administrative and reporting facilities that use shared services and a Web application that offers a common look and feel for all administrative and reporting tasks.

Administrative services in IBM Information Server

IBM Information Server provides administrative services to help you manage users, roles, sessions, security, logs, and schedules. The Web console provides global administration capabilities that are based on a common framework.

The IBM Information Server console provides these services:

- “Security services”
- “Log services” on page 14
- “Scheduling services” on page 15

Security services

Security services support role-based authentication of users, access-control services, and encryption that complies with many privacy and security regulations. As Figure 8 on page 14 shows, the console helps administrators add users, groups, and roles and lets administrators browse, create, delete, and update operations within Information Server.

Directory services act as a central authority that can authenticate resources and manage identities and relationships among identities. You can base directories on IBM Information Server’s own internal directory or on external directories that are based on LDAP, Microsoft’s Active Directory, or UNIX®.

Users need only one credential to access all the components of Information Server. A set of credentials is stored for each user to provide single sign-on to the products registered with the domain.

Log services

Log services help you manage logs across all of the IBM Information Server suite components. The console provides a central place to view logs and resolve problems. Logs are stored in the common repository, and each IBM Information Server suite component defines relevant logging categories.

You can configure which categories of logging messages are saved in the repository. Log views are saved queries that an administrator can create to help with common tasks. For example, you might want to display all of the errors in DataStage jobs that ran in the past 24 hours.
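The idea of a log view as a saved query can be illustrated with a small filter object. The `LogEvent` and `LogView` classes and their field names are hypothetical; they are not the repository schema or any console API, only a sketch of the "errors in DataStage jobs in the past 24 hours" example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class LogEvent:
    timestamp: datetime
    component: str
    severity: str
    message: str

@dataclass(frozen=True)
class LogView:
    """A saved query: the criteria are stored once and rerun on demand."""
    component: str
    severity: str
    window: timedelta

    def run(self, events, now):
        cutoff = now - self.window
        return [e for e in events
                if e.component == self.component
                and e.severity == self.severity
                and e.timestamp >= cutoff]

# Hypothetical saved view: errors in DataStage jobs in the past 24 hours.
recent_datastage_errors = LogView("DataStage", "ERROR", timedelta(hours=24))

now = datetime(2024, 1, 2, 12, 0)
events = [
    LogEvent(datetime(2024, 1, 2, 9, 0), "DataStage", "ERROR", "job failed"),
    LogEvent(datetime(2023, 12, 31, 9, 0), "DataStage", "ERROR", "old failure"),
    LogEvent(datetime(2024, 1, 2, 10, 0), "DataStage", "INFO", "job started"),
    LogEvent(datetime(2024, 1, 2, 11, 0), "QualityStage", "ERROR", "match failed"),
]
matches = recent_datastage_errors.run(events, now)
```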

Figure 9 on page 15 shows the IBM Information Server Web console being used to configure logging reports. Logging is organized by server components. The Web console displays default and active configurations for each component.

Scheduling services

Scheduling services help plan and track activities such as logging and reporting and suite component tasks such as data monitoring and trending. Schedules are maintained using the IBM Information Server console, which helps you define schedules; view their status, history, and forecast; and purge them from the system.

Reporting services in IBM Information Server

Reporting services manage runtime and administrative aspects of reporting for IBM Information Server.

You can create product-specific reports for WebSphere DataStage, WebSphere QualityStage, and WebSphere Information Analyzer, and cross-product reports for logging, monitoring, scheduling, and security services.

All reporting tasks are set up and run from a single interface, the IBM Information Server Web console. You can retrieve and view reports and schedule reports to run at a specific time and frequency.

Figure 10 on page 16 shows the Web console.

You define reports by choosing from a set of predefined parameters and templates. You can specify a history policy that determines how the report will be archived and when it expires. Reports can be formatted as HTML, PDF, or Microsoft® Word documents.

Chapter 3. Metadata services

When moving to an enterprise integration strategy, large organizations often face a proliferation of software tools that are built to solve identical problems. Few of these tools work together, much less work across problem domains to provide an integrated solution.

Data profiling, data modeling, data transformation, data quality, and business intelligence tools play a key role in data integration. Integration can become a mature, manageable process if these tools are enabled to work across problem domains.

The consequences of the inability to manage metadata are many and severe:

- Changes that are made to source systems are difficult to manage and cannot match the pace of business change.
- Data cannot be analyzed across departments and processes.
- Metadata cannot be shared among products without manually retyping the metadata.
- Without business-level definitions, metadata cannot provide context for information.
- Documentation is out-of-date or incomplete, hampering change management and making it harder to train new users.
- Efforts to establish an effective data stewardship program fail because of a lack of standardization and familiarity with the data.
- Establishing an audit trail for integration initiatives is virtually impossible.

The metadata services components of IBM Information Server create a fully integrated suite, eliminating the need to manually transport metadata between applications or to provide a standalone metadata management application.

Metadata services introduction

Metadata services are part of the platform on which IBM Information Server is built. By using metadata services, you can access data and achieve data integration tasks such as analysis, modeling, cleansing, and transformation.

The major metadata services components of IBM Information Server are WebSphere Business Glossary, WebSphere Metadata Server, and WebSphere MetaBrokers and bridges.

WebSphere Business Glossary

WebSphere Business Glossary is a Web-based application that provides a business-oriented view into the data integration environment. By using WebSphere Business Glossary, you can view and update business descriptions and access technical metadata.

Metadata is best managed by those who understand the meaning and importance of the information assets to the business. Designed for collaborative authoring, WebSphere Business Glossary gives users the ability to share insights and experiences about data. It provides users with the following information about data resources:

- Business meaning and descriptions of data
- Stewardship of data and processes
- Standard business hierarchies
- Approved terms

WebSphere Business Glossary is organized and searchable according to the semantics that are defined by a controlled vocabulary, which you can create by using the Web console.

WebSphere Metadata Server

WebSphere Metadata Server provides a variety of services to other components of IBM Information Server:

- Metadata access
- Metadata integration
- Metadata import and export
- Impact analysis
- Search and query

WebSphere Metadata Server provides a common repository with facilities that are capable of sourcing, sharing, storing, and reconciling a comprehensive spectrum of metadata including business metadata and technical metadata.

Business metadata

Business metadata provides business context for information technology assets and adds business meaning to the artifacts that are created and managed by other IT applications. Business metadata includes controlled vocabularies, taxonomies, stewardship, examples, and business definitions.

Technical metadata

Technical metadata provides details about source and target systems, their table and field structures, attributes, derivations, and dependencies. Technical metadata also includes details about profiling, quality, and ETL processes, projects, and users.

WebSphere MetaBrokers and bridges

WebSphere MetaBrokers and bridges provide semantic model mapping technology that allows metadata to be shared among applications for all products that are used in the data integration lifecycle:

- Data modeling or case tools
- Business intelligence applications
- Data marts and data warehouses
- Enterprise applications
- Data integration tools

Customers who use these components can:

- Establish common data definitions across business and IT functions
- Drive consistency throughout the data integration lifecycle
- Provide enterprise visibility for change management
- Easily extend to new, existing, and homegrown metadata sources

Scenarios for metadata management

A comprehensive metadata management capability provides users of IBM Information Server with a common way to deal with descriptive information surrounding the use of data. The following scenarios describe uses of this capability.

Web-based education: Profiling your customer

A Web-based, for-profit education provider needed to retain more students. Business managers needed to analyze the student lifecycle from application to graduation and direct recruiting efforts at individuals with the best chance of success.

To meet this business imperative, the company designed and delivered a business intelligence solution that uses a data warehouse that contains a single view of student information that is populated from operational systems. The IT organization uses WebSphere Metadata Server to coordinate metadata throughout the project. Other tools that were used included Embarcadero ERStudio for data modeling and Brio for Business Intelligence.

The overall project time was reduced by providing metadata consistency and accuracy across every tool. The business users now have trustworthy metadata about the information in their Brio reports. WebSphere Business Glossary provided business definitions to WebSphere Metadata Server. The net result is more confident decision-making about students and better student-retention initiatives.

Financial Services: Measuring levels of service

The data warehousing division of a major financial services provider needed to provide internal customers with critical enterprise-wide data about levels of service that are specified by signed service level agreements (SLAs). The data warehousing group also needed to provide business definitions of each field, including metrics that detailed actual versus promised levels of service.

The organization uses IBM Information Server to create an enterprise data warehouse and data marts to satisfy each SLA. The division used metadata services within WebSphere Information Analyzer, WebSphere QualityStage, and WebSphere DataStage to collaborate in a multiuser environment. The data warehousing group was also able to provide HTML reports that outlined the statistics that are associated with the loading of the data mart to satisfy the SLA.

The division met its service-level agreements and was able to demonstrate its compliance to internal data consumers. Additionally, end users received important business definitions through business intelligence reports.

A closer look at metadata services in IBM Information Server

Metadata services encompass a wide range of functionality that forms the core infrastructure of IBM Information Server and also includes some separately packaged capabilities.


WebSphere Business Glossary

Managing business metadata effectively can ensure that the same data “language” applies throughout the organization. WebSphere Business Glossary gives business users the tools they need to author and own business metadata.

For example, one department refers to “revenues,” another to “sales.” Are they talking about the same activity? One subsidiary unit talks about “customers,” another about “users” or “clients.” Are these different classifications or different terms for the same classification?

WebSphere Business Glossary provides business users with a Web-based tool for creating and managing standard definitions of business concepts, called a controlled vocabulary. It also simplifies the building of a business-oriented classification system and the collaborative authoring of business metadata.

The tool simplifies the task of managing, browsing, and customizing the broad variety of metadata that is stored in the repository of WebSphere Metadata Server, metadata that includes details about tables, columns, models, schemas, operations, and other components of the data integration process.

The tool divides metadata into categories, each of which contains terms. You can use terms to classify other objects in the metadata repository based on the needs of your business. You can also designate users or groups as stewards for metadata objects.

WebSphere Business Glossary helps business users with the following tasks:

Developing a common vocabulary between business and technology

A common vocabulary allows multiple users of data to share a common view of the meaning of data. Users can assign categories and terms to data that are meaningful in a business context, and create a hierarchy of categories for ease of browsing.

Providing data governance and stewardship

Data assurance programs assign responsibility to business users (data stewards) for the management of data through its lifecycle.

Finding business information that is derived from metadata

Metadata helps business users to understand the meaning of the data, its currency, its lineage, and who is responsible for defining and producing the data. If a business user wants to know the definition of a term such as “corporate price,” the glossary will provide this insight.

Accessing metadata without complicated tooling and querying

Metadata objects can be arranged in a hierarchical fashion to simplify browsing of the data objects.

Providing collaborative enrichment of business metadata

Maintenance of business metadata is an ongoing process in which automated and manual data inputs evolve. Multiple business users can collaborate to add notes, annotations, categories, and synonyms to enrich business metadata.

For example, multiple systems may maintain tables of customer information; however, the business may uncover a requirement for the concept of “high-value” customers. The business needs a way to define what a high-value customer is and how to recognize one (for example, a high-value customer is a customer with combined account balances over $10,000). WebSphere Business Glossary provides a tool for recording these definitions and relating business concepts together into taxonomies. This records the business requirements in the same metadata foundation that the profiling and analysis process uses.
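To make the “high-value” customer definition concrete, the following minimal Python sketch applies the rule from the example (combined account balances over $10,000) to a few sample accounts. The `Account` fields, the sample data, and the function name are hypothetical and are not part of any IBM Information Server interface; the glossary records the definition itself, not this code.

```python
from dataclasses import dataclass

# Threshold taken from the example in the text; everything else is illustrative.
HIGH_VALUE_THRESHOLD = 10_000


@dataclass
class Account:
    customer_id: str  # hypothetical key shared by the systems that hold customer data
    balance: float


def high_value_customers(accounts):
    """Return the IDs of customers whose combined balances exceed the threshold."""
    totals = {}
    for account in accounts:
        totals[account.customer_id] = totals.get(account.customer_id, 0) + account.balance
    return sorted(cid for cid, total in totals.items() if total > HIGH_VALUE_THRESHOLD)


accounts = [
    Account("C1", 6_000), Account("C1", 4_500),  # combined balance of 10,500
    Account("C2", 9_999),
    Account("C3", 10_000),                       # exactly 10,000 is not "over" 10,000
]
qualifying = high_value_customers(accounts)
```

Recording the threshold once, next to the business definition, is what lets profiling and analysis apply the same rule that business users agreed on.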

WebSphere Business Glossary tasks

Major tasks in WebSphere Business Glossary include creating categories and terms, browsing and searching, enabling data stewardship, and annotating data for collaboration.

WebSphere Business Glossary is a browser-based application that you access by using Microsoft Internet Explorer.

Enabling data stewardship

Data stewardship is the management of data throughout its lifecycle. Stewardship includes making the data available to all those who are authorized to access it. It also includes the efficient management and integration with related data. Perhaps most importantly, stewardship includes the responsibility to ensure that data is properly defined, and that all users of the data clearly understand its meaning.

WebSphere Business Glossary supports the concept of data stewardship and helps you set and retrieve stewardship information for all data assets.

Administrators can designate a user or group as a steward. Administrators and authors can then specify that the steward is responsible for one or more metadata objects. When you view the browse page for an object that has a steward, you can link to contact information for the steward.

Creating categories and terms

Although you can use several methods to find metadata in WebSphere Metadata Server, business users often find searching data by category is the best strategy. Data must be organized into meaningful taxonomies to aid the navigation of a business glossary by category.

Figure 12 on page 22 shows the Create Category function in WebSphere Business Glossary. You create a business classification system or taxonomy that acts as the hierarchical browsing structure of the glossary Web site. You can also import structure from other tools or spreadsheets.

A term is a word or phrase that can be used to classify and group objects in the metadata repository. For example, you might use the term “South America Sales” to classify some of the tables and columns in the metadata repository, and the term “Asian Sales” to classify other tables and columns.

When you create or edit a term, you can specify properties and relationships among terms, including synonyms and related terms. You can also specify parent categories to group similar terms and can designate stewards who have the responsibility for maintaining terms. Custom attributes enable administrators to define any number of new attributes to be applied to terms, categories, or both.
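The relationship between categories, terms, synonyms, stewards, and classified objects can be pictured as a small tree. The Python sketch below is only an illustration of that structure; the class names, the steward ID, the synonym, and the column names are invented for the example and do not reflect the actual repository model.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    synonyms: set = field(default_factory=set)
    steward: str = ""                                        # user or group responsible for the term
    classified_objects: list = field(default_factory=list)   # for example, tables or columns


@dataclass
class Category:
    name: str
    terms: dict = field(default_factory=dict)
    subcategories: dict = field(default_factory=dict)

    def add_term(self, term):
        self.terms[term.name] = term
        return term

    def find_terms_for(self, obj):
        """Yield the names of terms in this category tree that classify obj."""
        for term in self.terms.values():
            if obj in term.classified_objects:
                yield term.name
        for sub in self.subcategories.values():
            yield from sub.find_terms_for(obj)


# Hypothetical hierarchy built around the two term names used in the text.
sales = Category("Sales")
regional = Category("Regional sales")
sales.subcategories[regional.name] = regional
south_america = regional.add_term(Term("South America Sales", steward="jdoe"))
asia = regional.add_term(Term("Asian Sales", synonyms={"APAC Sales"}))
south_america.classified_objects.append("ORDERS.SA_REVENUE")
asia.classified_objects.append("ORDERS.ASIA_REVENUE")
```

Browsing by category then amounts to walking this tree from a top-level category down to the terms and the objects they classify.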

Annotating data for collaboration

While data stewards are responsible for specific types of data, creating a business glossary is a collaborative effort that requires subject matter experts from different parts of the enterprise. WebSphere Business Glossary provides tools for subject matter experts and others to annotate existing data definitions, edit descriptions, and assign data objects to categories.

These annotations, or notes, help business users share insights about the information assets of the enterprise. For example, an analyst might discover that a database column for customer information also contains shipping information that does not belong in the column. The analyst could share that information by using the Notes® feature. Notes help you capture ideas in the form of unstructured metadata. This information might otherwise be unknown to a large portion of the enterprise.


Browsing the Business Glossary

You can start browsing the glossary structure from the Overview page, which displays the top-level categories that the glossary administrator has designated as most important for navigation in the metadata repository.

The browse by category function enables data stewards to find descriptions related to a type of data even though they may not know the exact name of the data items in question.

When you select an object, its browse page is displayed on the Browse Glossary tab, which lists the object’s name, class, steward, and other important properties. You can inspect its attributes, browse its relationships to other objects, and send feedback to the administrator. Administrators and authors can add and edit notes about the object.

WebSphere Metadata Server

IBM Information Server can operate as a unified data integration platform because of the shared capabilities of WebSphere Metadata Server.

Common repository

By storing all metadata in a shared repository, IBM Information Server enables metadata to be shared actively across all tools. The repository provides services for two types of data:

• Design metadata, which is created as a part of the development process and can be configured to be either private or shared by a team of users.
• Operational metadata, which is created from ongoing integration activity. This metadata is message-oriented and time-stamped to help track the sequence of events.

With a shared repository, changes that are made in one part of IBM Information Server will be automatically and instantly visible throughout the suite. The repository offers the following key features:

Active integration

Application artifacts are dynamically integrated across tools.

Multiuser development

Teams can collaborate in a shared workspace.

The common repository is an IBM WebSphere J2EE application. The repository uses standard relational database technology (such as DB2 or Oracle) for persistence. These databases provide backup, administration, scalability, transactions, and concurrent access.

Common model

Metadata for data integration projects comes from both IBM Information Server products and vendor products. The repository uses metadata models (metamodels) to describe the metadata from these sources. Metadata models provide a means for others to understand and share metadata between applications.

The common model is the foundation of IBM Information Server. Metadata elements that are common to all metadata sources are discovered and represented once, in a form and format that is accessible to all of the tools. The common model enables sharing and reuse of artifacts across IBM Information Server.

Shared metadata services

WebSphere Metadata Server exposes a set of metadata manipulation and analysis services for use across IBM Information Server components. These services enable metadata interchange, integration, management, and analysis. They eliminate the need for a standalone metadata management product or repository product by actively managing metadata in the background, and by providing metadata functionality in the context of your normal daily activities.

For example:

• A WebSphere DataStage user wants to understand the dependencies between stages in an ETL job. By using metadata services, she can perform an impact analysis from the Designer client canvas, never needing to leave the application for another interface.
• A data analyst who is working with WebSphere Information Analyzer can add business terms, definitions, and notes to data under analysis for use by a data modeler or architect.
• A WebSphere QualityStage user needs to better understand the business semantics that are associated with a data domain. By using metadata services, he can access the business description of the domain and any annotations that were added by business users.
• A WebSphere DataStage component developer wants to find a function that performs a particular data conversion. By using metadata services, she can perform an advanced search for the function.

WebSphere Metadata Server offers the following key metadata services:

• Metadata interchange
• Impact analysis
• Integrated find

Metadata interchange

WebSphere MetaBroker® and bridges enable you to access and share metadata with the best-of-class tools for modeling, data profiling, data quality, ETL, OLAP, and business intelligence.

Figure 13 on page 25 shows how MetaBrokers work. MetaBrokers convert metadata from one format to another by mapping the elements to a standard model called the hub model. The selected metadata is then imported and stored in the repository. The metadata exchange enables decomposition and recomposition of metadata into simple units of meaning.
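The hub-model idea can be sketched in a few lines of Python: each tool needs only one mapping to and from the hub model, rather than one mapping per pair of tools. The element names and both mapping tables below are invented for illustration; real metamodels and MetaBroker mappings are far richer, and this is not how a MetaBroker is implemented.

```python
# Hypothetical mappings between tool-specific (view) models and the hub model.
DESIGN_TOOL_TO_HUB = {"Entity": "Table", "Attribute": "Column"}
HUB_TO_REPORTING_TOOL = {"Table": "QuerySubject", "Column": "QueryItem"}


def decode(source_elements, to_hub):
    """Map (kind, name) pairs from a source (view) model into hub-model pairs.

    Elements with no hub equivalent are skipped in this sketch.
    """
    return [(to_hub[kind], name) for kind, name in source_elements if kind in to_hub]


def encode(hub_elements, from_hub):
    """Map hub-model pairs into a target tool's model."""
    return [(from_hub[kind], name) for kind, name in hub_elements if kind in from_hub]


source = [("Entity", "Customer"), ("Attribute", "Customer.Name"), ("Diagram", "Main")]
hub = decode(source, DESIGN_TOOL_TO_HUB)
target = encode(hub, HUB_TO_REPORTING_TOOL)
```

With N tools, a hub needs on the order of N mappings, whereas direct tool-to-tool conversion would need a mapping for every pair.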


IBM Information Server now supports more than 20 MetaBrokers and bridges to various technologies and partner products. You can use most MetaBrokers to import metadata from a particular tool, file, or database into the metadata repository of WebSphere Metadata Server.

Table 1 describes MetaBroker types and the different types of metadata that you can access.

Table 1. MetaBroker types

Type of MetaBroker               Type of metadata
Design tool                      CA ERwin, Oracle Designer, Rational® Data Architect, and the Unified Modeling Language (UML)
OLAP and business intelligence   Cognos PowerPlay, IBM Cube Views™, ReportNet, Business Objects, and Hyperion
Operational metadata             Metadata that describes operational events such as the time and date of integration process runs

Impact analysis

Impact analysis helps you manage the effects of changes to data by showing dependencies among objects. This type of analysis extends across multiple tools, helping you assess the cost of change. For example, a developer can predict the effects of a change to a table definition or business logic.

Figure 14 on page 26 shows the WebSphere DataStage and QualityStage Designer being used to select a table definition called ProdDim from the metadata repository to show where-used dependencies.

[Figure 13. How MetaBrokers work. Labels: MetaBroker, external tool, metadata interface, decoder, encoder, mapper, source (view) model, target (hub) model]

The Impact Analysis Path Viewer presents a graphical view of these relationships, as Figure 15 on page 27 shows.

The dependencies can also be shown in a textual view. You can also run an impact analysis report that can be viewed from the Web console.
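A where-used analysis is, in essence, a walk over a dependency graph. The Python sketch below assumes a hypothetical set of "used by" edges around the ProdDim table definition mentioned in the text; the job and report names are invented, and the product's own analysis works on the metadata repository rather than on a dictionary like this one.

```python
from collections import deque

# Hypothetical dependency edges: each key is used by the objects in its list.
USED_BY = {
    "table:ProdDim": ["job:LoadProdDim", "job:SalesMart"],
    "job:LoadProdDim": ["report:ProductSummary"],
    "job:SalesMart": ["report:ProductSummary", "report:RegionalSales"],
}


def where_used(obj, used_by):
    """Breadth-first walk that returns every direct or indirect dependent of obj."""
    seen = {obj}
    order = []
    queue = deque([obj])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


impacted = where_used("table:ProdDim", USED_BY)
```

The length of the returned list gives a rough first estimate of the cost of changing the table definition, since each entry is an object that may need review.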

Integrated find

Metadata services help you locate and retrieve objects from the repository by using either the quick find feature or the advanced find feature. The quick find feature locates an object based on a full or partial name or description. The advanced find feature locates objects based on the following attributes:

• Type
• Creation date
• Last modified
• Where it is used
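The difference between the two find features can be illustrated with a small in-memory example. In the Python sketch below, quick find is a case-insensitive substring match on name or description, and advanced find filters on the four attributes listed above. The `RepoObject` class, its field names, and the sample objects are assumptions made for the example, not the repository's real schema or search behavior.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RepoObject:
    name: str
    description: str
    object_type: str
    created: date
    modified: date
    used_in: tuple = ()


def quick_find(objects, text):
    """Match on a full or partial name or description, ignoring case."""
    needle = text.lower()
    return [o.name for o in objects
            if needle in o.name.lower() or needle in o.description.lower()]


def advanced_find(objects, object_type=None, created_on_or_after=None,
                  modified_on_or_after=None, used_in=None):
    """Match on any combination of type, creation date, last modified, and where used."""
    matches = []
    for o in objects:
        if object_type is not None and o.object_type != object_type:
            continue
        if created_on_or_after is not None and o.created < created_on_or_after:
            continue
        if modified_on_or_after is not None and o.modified < modified_on_or_after:
            continue
        if used_in is not None and used_in not in o.used_in:
            continue
        matches.append(o.name)
    return matches


repository = [
    RepoObject("ProdDim", "Product dimension table", "table",
               date(2006, 1, 10), date(2006, 3, 2), ("LoadProdDim",)),
    RepoObject("CustDim", "Customer dimension table", "table",
               date(2006, 2, 1), date(2006, 2, 1), ("LoadCustDim",)),
    RepoObject("LoadProdDim", "Job that loads product data", "job",
               date(2006, 1, 15), date(2006, 3, 5)),
]
```

For instance, `quick_find(repository, "prod")` matches by text alone, while `advanced_find(repository, object_type="table", used_in="LoadCustDim")` narrows the result by attributes.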


Information resources for metadata services

A variety of information resources can help you get started with IBM Information Server’s metadata services product modules.

WebSphere Business Glossary

The Getting Started pane that appears when you click the Glossary tab of the IBM Information Server console describes the purpose of the tab and how to get started. Each pane and tab on the console displays a line of context-sensitive instructional text.

The Help button links to online documentation for WebSphere Business Glossary in the IBM Information Server information center at http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r0/index.jsp. The Information Center also provides all planning, installation, and configuration details for IBM Information Server and other product modules.

The WebSphere Business Glossary Guide PDF is also available on the Quick Start CD.

WebSphere MetaBrokers

Online help is available for all WebSphere MetaBrokers and bridges. The Information Server Guide to WebSphere MetaBrokers and Bridges PDF is also available on the Quick Start CD.

IBM Information Server and product modules

Planning, installation, and configuration details are also available in the following PDFs on the Quick Start CD:

• Information Server Planning, Installation, and Configuration Guide


Chapter 4. Service-oriented integration

IBM Information Server simplifies the creation of shared data integration services by enabling integration logic to be used by any business process.

Invoking service-ready data integration tasks ensures that business processes such as quote generation, order entries, and procurement requests receive data that is correctly transformed, standardized, and matched across applications, partners, suppliers, and customers.

Many organizations are designing their next generation of infrastructure and applications as services. Implementing a service-oriented architecture (SOA) offers these benefits:

Adaptability

Functional components can be reassembled quickly and in new ways.

Consistency

Core rules for handling data and processes are reused across projects.

Reduced cost

Increased reuse and a single point of maintenance speed time to value and reduce development expense.

Federated ownership

Each service is owned and maintained independently by its own group.

IBM Information Server provides an SOA infrastructure that provides these capabilities by helping you create shared data integration services. A common services layer manages how services are deployed from any of the product modules. Cleansing and transformation rules or federated queries can be published as shared services by using a consistent and intuitive graphical interface, and managed after publication using the same interface.

Introduction to service-oriented integration in IBM Information Server

IBM Information Server provides standard service-oriented interfaces for enterprise data integration. The built-in integration logic of IBM Information Server can easily be encapsulated as service objects that are embedded in user applications.

These service objects have the following characteristics:

Always on

The services are always running, waiting for requests. This ability removes the overhead of batch startup and shutdown and enables services to respond instantaneously to requests.

Scalable

The services distribute request processing and stop and start jobs across multiple WebSphere DataStage servers, enabling high performance with large, unpredictable volumes of requests.

Standards-based

The services are based on open standards and can easily be invoked by standards-based technologies including enterprise application integration (EAI) and enterprise service bus (ESB) platforms, applications, and portals.

Flexible

You can invoke the services by using multiple mechanisms (bindings) and choose from many options for using the services.

Manageable

Monitoring services coordinate timely reporting of system performance data.

Reliable and highly available

If any WebSphere DataStage server becomes unavailable, the service routes requests to a different server in the pool.

Reusable

The services publish their own metadata, enabling them to be found and called across any network.

High performance

Load balancing and the underlying parallel processing capabilities of IBM Information Server create high performance for any type of data payload.
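The "reliable and highly available" characteristic describes routing requests away from a server that has become unavailable. A minimal Python sketch of that behavior follows; it assumes a simple round-robin policy and hypothetical server names, since the text does not state which load-balancing algorithm the product uses.

```python
class ServerPool:
    """Round-robin routing that skips servers marked unavailable (assumed policy)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.available = set(self.servers)
        self._next = 0

    def mark_unavailable(self, server):
        self.available.discard(server)

    def route(self):
        """Return the next available server, trying each server at most once."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.available:
                return server
        raise RuntimeError("no server available")


pool = ServerPool(["ds1", "ds2", "ds3"])      # hypothetical server names
before_failure = [pool.route() for _ in range(3)]
pool.mark_unavailable("ds2")
after_failure = [pool.route() for _ in range(4)]
```

After `ds2` is marked unavailable, requests continue to alternate between the remaining servers without the caller needing to know that a server dropped out of the pool.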

A data integration service is created by designing the data integration process logic in IBM Information Server and publishing it as a service. These services can then be accessed by external projects and technologies.

WebSphere Information Services Director provides a foundation for information services by allowing you to leverage the other components of IBM Information Server for understanding, cleansing, and transforming information and deploying those integration tasks as consistent and reusable information services.

As Figure 16 shows, service-ready data integration jobs can be used with process-centric technologies such as EAI, business process management (BPM), ESB, and application servers.

[Figure 16. A business process flow (steps such as Create Quote, Allocate Inventory, Calculate Discount, Calculate Quote, Process Credit Card, and Estimate Backlog) that calls enterprise data integration services (Request Data from System 1, Request Data from System 2, Match and Survive, Enhance (lookup), Transform to Target Format) over packaged applications, data warehouses, business partner data, legacy applications, and master data stores]

Scenarios for service-oriented integration

The following examples show how customers have used service-oriented architectures in IBM Information Server.

Pharmaceutical industry: Improving efficiency

A leading pharmaceutical company needed to include real-time data from clinical labs in its research and development reports. The company used WebSphere DataStage to define a transformation process for XML documents from labs. This process used SOA to expose the transformation as a Web service, allowing labs to send data and receive an immediate response.

Pre-clinical data is now available to scientific personnel earlier, allowing lab scientists to select which data to analyze. Now, only the best data is chosen, greatly improving scientists’ efficiency.

Insurance: Validating addresses in real time

An international insurance data services company uses IBM Information Server to validate and enrich property addresses through Web services. As insurance companies submit lists of addresses for underwriting, services standardize the addresses based on their rules, validate each address, match the addresses to a list of known addresses, and enrich the addresses with additional information that helps with underwriting decisions. The company now automates 80 percent of the process and eliminated most of the errors. The project was simplified by using the SOA capabilities of IBM Information Server and the standardization and matching capabilities of WebSphere QualityStage.

Where SOA fits in a business context

By enabling integration tasks as services, IBM Information Server becomes a critical component of the application development and integration environment. The SOA infrastructure ensures that data integration logic that is developed in IBM Information Server can be used by any business process.

SOA allows you to use both analytical and operational data. The best data is available at all times, to all people and to all processes. The following categories represent common uses of SOA in a business context:

Real-time data warehousing

Enables companies to publish their existing data integration logic as services that can be called in real time from any process. This type of warehousing enables users to perform analytical processing and loading of data based on transaction triggers, ensuring that time-sensitive data in the warehouse is completely current.

Matching services

Enables data integration logic to be packaged as a shared service that can be called by enterprise application integration platforms. This method allows reference data (such as customer, inventory, and product data) to be matched to and kept current with a master store with each transaction.

In-flight transformation

Enables enrichment logic to be packaged as shared services so that capabilities such as product name standardization, address validation, or data format transformations can be shared and reused across projects.

Enterprise data services

Enables the data access functions of many applications to be aggregated and shared in a common service layer. Instead of each application creating its own access code, these services can be reused across projects, simplifying development and ensuring a higher level of consistency.

As Figure 17 shows, one of the major advantages of using an SOA approach is that you can combine data integration tasks with the leading enterprise messaging, Enterprise Application Integration (EAI), and Business Process Management (BPM) products by using binding choices.

Since most middleware products support Web services, there are often multiple options for how this is done. For example, WebSphere integration products such as WebSphere Federation Server or WebSphere Business Integration Message Broker can invoke IBM Information Server services to access service-ready jobs.

A closer look at service-oriented integration in IBM Information Server

IBM Information Server provides an SOA infrastructure that uses data transformation processes that are created from new or existing WebSphere DataStage or WebSphere QualityStage jobs or federated queries that are created by WebSphere Federation Server and exposes them as a set of services and operations.

After an integration service is enabled, any enterprise application, .NET or Java™ developer, Microsoft Office, or integration software can invoke the service by using a binding protocol such as Web services.

The following features are central to the IBM Information Server SOA infrastructure:

Common administrative services

Host and publish service metadata, expose a choice of bindings for each service, and provide infrastructure services such as security management, session management, logging, and monitoring.

Foundation components for development

Provide a single set of data transformation rules for analytical and enterprise applications, business activity monitoring, federated data access, and business process integration.

Any-to-any connectivity

Provides technology independence for data transformation, standardization, matching, and legacy data access by using Web services (.NET and Java) or Enterprise JavaBeans™ (EJB) interface bindings.

Service-ready integration

A service-ready data integration job accepts requests from client applications, mapping request data to input rows and passing them to the underlying jobs. A job instance can include database lookups, transformations, data standardization and matching, and other data integration tasks that are supplied by IBM Information Server.

Figure 18 shows a WebSphere DataStage job with a service input and service output.

The design of a real-time job determines whether it is always running or runs once to completion. All jobs that are exposed as services process requests on a 24-hour basis. The SOA infrastructure supports three job topologies for different load and work style requirements:

Batch jobs

Topology I uses new or existing batch jobs that are exposed as services. A batch job starts on demand. Each service request starts one instance of the job that runs to completion. This job typically initiates a batch process from a real-time process that does not need direct feedback on the results. This topology is tailored for processing bulk data sets and is capable of accepting job parameters as input arguments.

Batch jobs with a Service Output stage

Topology II uses an existing batch job and adds an output stage. The Service Output stage is the exit point from the job, returning one or more rows to the client application as a service response. As Figure 19 on page 34 shows, these jobs typically initiate a batch process from a real-time process that requires feedback or data from the results. This topology is designed to process large data sets and can accept job parameters as input arguments.

[Figure 18. A WebSphere DataStage job with a Service Input stage, the rest of the DataStage job, and a Service Output stage]

Jobs with a Service Input stage and Service Output stage

In Topology III, jobs use both a Service Input stage and a Service Output stage. The Service Input stage is the entry point to a job, accepting one or more rows during a service request. These jobs are always running. This topology is typically used to process high volumes of smaller transactions where response time is important. It is tailored to process many small requests rather than a few large requests. Figure 20 shows an example of this topology.

[Figure 19. Batch jobs with a Service Output stage. Labels: CustomerDB, D1Orders, Order_Transformation, XML Output, Service Output, Rows, ReturnedRows]

[Figure 20. Example of Topology III. Labels: Service Input, XML Input, Transformer, ODBC, XML Output, Service Output, DSLink1 through DSLink5]
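The practical difference between the three topologies is when a job instance starts and whether rows come back to the caller. The Python sketch below models only that difference; the class names, the `load_orders` stand-in job, and the sample rows are hypothetical and do not correspond to DataStage stages or APIs.

```python
class BatchService:
    """Topologies I and II: each request starts one job instance that runs to completion."""

    def __init__(self, job, returns_rows):
        self.job = job
        self.returns_rows = returns_rows  # True models a Service Output stage (Topology II)
        self.instances_started = 0

    def request(self, **job_parameters):
        self.instances_started += 1
        rows = self.job(**job_parameters)
        return rows if self.returns_rows else None


class AlwaysRunningService:
    """Topology III: one long-lived instance maps request rows to response rows."""

    def __init__(self, transform):
        self.transform = transform
        self.instances_started = 1  # started once, then waits for requests

    def request(self, input_rows):
        return [self.transform(row) for row in input_rows]


def load_orders(region):
    # Stand-in for a bulk job; returns a one-row summary.
    return [{"region": region, "status": "loaded"}]


fire_and_forget = BatchService(load_orders, returns_rows=False)   # Topology I
with_output = BatchService(load_orders, returns_rows=True)        # Topology II
always_on = AlwaysRunningService(lambda row: {**row, "name": row["name"].upper()})

first = fire_and_forget.request(region="EMEA")
second = with_output.request(region="EMEA")
third = [always_on.request([{"name": "acme"}]) for _ in range(3)]
```

In this sketch the batch services pay a start-up cost on every request, which suits a few large requests, while the always-running service handles many small requests with a single instance.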


SOA components in IBM Information Server

The run-time components that enable service-oriented architectures are contained in the run-time environment of the common services of IBM Information Server.

These components are J2EE applications that distribute requests to WebSphere DataStage, WebSphere QualityStage, or WebSphere Federation Server based on load-balancing algorithms. Common core services include security and logging.

Threshold-balanced parallelism

The run-time environment combines parallel processing with load balancing and distribution to provide high performance data processing. It balances service requests by routing them to WebSphere Federation Server or WebSphere DataStage servers, each of which takes advantage of pipeline technology for parallel execution.

Threshold-balanced parallelism enables SOA platforms to automatically adjust resources based on thresholds that you set when you define services. The common services start and stop jobs in response to load conditions. The combination of these capabilities with parallel pipelining is unique to IBM Information Server and enables IBM Information Server to process data integration tasks faster than any other technology.
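One way to picture threshold-based adjustment is as a function from current load to a target number of job instances, bounded by limits that are set when the service is defined. The Python sketch below is a simplified assumption of such a policy; the parameter names and the specific rule (one instance per fixed number of pending requests) are illustrative, not the product's documented algorithm.

```python
def target_instances(pending_requests, min_instances, max_instances, max_pending_per_instance):
    """Return how many job instances should run for the current backlog.

    The result never drops below min_instances or exceeds max_instances.
    """
    needed = -(-pending_requests // max_pending_per_instance)  # ceiling division
    return max(min_instances, min(max_instances, needed))


# Hypothetical thresholds: keep 1 to 5 instances, each handling up to 10 pending requests.
idle = target_instances(0, 1, 5, 10)
moderate = target_instances(25, 1, 5, 10)
overloaded = target_instances(1000, 1, 5, 10)
```

A controller that compares this target with the number of running instances would start jobs when the target is higher and stop them when it is lower.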

Multiple binding support

Virtually any protocol can be made to adhere to SOA principles. IBM Information Server supports this approach, enabling the same service to support multiple protocol bindings, all defined within the WSDL file.

An SOA interface should be able to handle multiple mechanisms (bindings) for calling services. This improves the utility of services and therefore increases the likelihood of reuse and adoption across the enterprise.

Projects for which Web services are not a viable option because of performance or architectural requirements can still leverage the services by using an interface better suited to their requirements. WebSphere Information Services Director can publish the same service using different bindings:

Simple Object Access Protocol (SOAP) over HTTP (Web services)

Any application that complies with XML Web services can invoke a WebSphere Federation Server or WebSphere DataStage integration process as a Web service. These Web services support the generation of literal document-style and SOAP encoded RPC-style Web services.

Enterprise JavaBeans (EJB)

For Java-centric development, WebSphere Information Services Director can generate a J2EE-compliant EJB (stateless session bean) where each data transformation service is instantiated as a separate synchronous EJB method call.

The design does not depend on the binding choice. As logic is built in WebSphere DataStage and WebSphere QualityStage, the designer does not need to be aware of how it will be used. After the service is deployed, additional bindings can easily be implemented without changing the logic.
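The separation between integration logic and bindings can be sketched as one function wrapped by two thin adapters. In the Python example below, `standardize_name` stands in for deployed integration logic, `xml_binding` imitates a message-based call with plain XML (it is not real SOAP), and `direct_binding` imitates an in-process method call such as an EJB method. All names and message shapes are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def standardize_name(name):
    """Binding-independent logic: collapse whitespace and capitalize each word."""
    return " ".join(part.capitalize() for part in name.split())


def xml_binding(request_xml):
    """Adapter for an XML message exchange (simplified; not a SOAP envelope)."""
    name = ET.fromstring(request_xml).findtext("name")
    response = ET.Element("response")
    ET.SubElement(response, "name").text = standardize_name(name)
    return ET.tostring(response, encoding="unicode")


def direct_binding(name):
    """Adapter for an in-process call; it adds no logic of its own."""
    return standardize_name(name)


xml_result = xml_binding("<request><name>  jOHN   smith </name></request>")
direct_result = direct_binding("  jOHN   smith ")
```

Because both adapters delegate to the same function, adding a third binding later would not require any change to the logic itself.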


WebSphere Information Services Director tasks

WebSphere Information Services Director provides an integrated environment for designing services that enables you to rapidly deploy integration logic as services without assuming extensive development skills.

With a simple, wizard-driven interface, in a few minutes you can attach a specific binding and deploy a reusable integration service. WebSphere Information Services Director also provides these features:

• Load-balancing and administrator services for cataloging and registering services
• Shared reporting and security services
• A metadata services layer that promotes reuse of the information services by actually defining what the service does and what information it delivers.

Information providers

An information provider is both the server that contains units that you can expose as services and the units themselves, such as WebSphere DataStage and WebSphere QualityStage jobs or federated SQL queries.

Each information provider must be enabled. To enable these providers, you use WebSphere Information Services Director.

You use the Add Information Provider window to enable information providers that you installed outside of IBM Information Server, such as WebSphere DataStage servers or federated servers.

Creating a project

A project is a collaborative environment that you use to design applications, services, and operations.

All project information that you create by using WebSphere Information Services Director is saved in the common metadata repository so that it can easily be shared among other IBM Information Server components.

You can export a project to back up your work or share work with other IBM Information Server users. The export file includes applications, services, operations, and binding information.

Creating an application

An application is a container for a set of services and operations. An application contains one or more services that you want to deploy together as an Enterprise Archive (EAR) file on an application server.

All design-time activity occurs in the context of applications:

• Creating services and operations
• Describing how message payloads and transport protocols are used to expose a service
• Attaching a reference provider, such as a WebSphere DataStage job or an SQL

Creating an application is a simple task from the Develop navigator menu of the IBM Information Server console. You can also export services from an application before it is deployed and import the services into another application.

You can change the default settings for operational properties when you create an application or later, as Figure 21 shows.

Creating a service

An information service exposes results from processing by information providers such as DataStage servers and federated servers. A deployed service runs on an application server and processes requests from service client applications.

An information service is a collection of operations that are selected from jobs, maps, federated queries, or other information providers. You can group operations in the same information service or design them in separate services.

You create an information service for a set of operations that you want to deploy together. You select a project and an application within the project in the Select a View area, as Figure 22 on page 38 shows.

When you create a service, you specify such options as name, base package name for the classes that are generated during the deployment of the application, and optionally the home Web page and contact information for the service.

After you create the service, you attach a binding for the service:

Simple Object Access Protocol (SOAP) over HTTP

To expose an information service as a Web service, attach the SOAP over HTTP binding to the information service.

Enterprise JavaBeans (EJB) interface

If your service consumers want to access an information service through an EJB interface, attach the EJB binding to the information service.

Deploying applications and their services

You deploy an application on WebSphere Application Server to enable the information services that are contained in the application to receive service requests.

The Deploy Application window in WebSphere Information Services Director guides you through the process, as Figure 23 on page 39 shows.

You can exclude one or more services, bindings, and operations from the deployment, change runtime properties such as minimum number of job instances, or, for WebSphere DataStage jobs, set constant values for job parameters.

WebSphere Information Services Director deploys the Enterprise Archive (EAR) file on the application server.
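The containment hierarchy (an application holds services, which hold operations and bindings) and the deployment-time exclusions can be modeled in a few lines. The Python sketch below is purely illustrative: the class names, the `deployment_plan` function, the service and operation names, and the dictionary layout are assumptions, not the format that WebSphere Information Services Director actually produces.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    operations: list
    bindings: list


@dataclass
class Application:
    name: str
    services: list = field(default_factory=list)


def deployment_plan(app, exclude_services=(), exclude_bindings=(), min_job_instances=1):
    """Describe what would be deployed after applying the exclusions."""
    plan = {
        "archive": f"{app.name}.ear",           # one EAR file per application
        "min_job_instances": min_job_instances,  # example of a runtime property
        "services": {},
    }
    for service in app.services:
        if service.name in exclude_services:
            continue
        plan["services"][service.name] = {
            "operations": list(service.operations),
            "bindings": [b for b in service.bindings if b not in exclude_bindings],
        }
    return plan


app = Application("CustomerApp", [
    Service("AddressService", ["validate", "enrich"], ["SOAP over HTTP", "EJB"]),
    Service("QuoteService", ["calculate"], ["SOAP over HTTP"]),
])
plan = deployment_plan(app, exclude_services={"QuoteService"},
                       exclude_bindings={"EJB"}, min_job_instances=2)
```

Keeping exclusions as deployment-time inputs means the design-time definition of the application stays unchanged between deployments.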

SOA and data integration

Enabling an IBM Information Server job as a Web service enables the job to participate in various data integration scenarios.

Data integration enables users to federate heterogeneous data across several data sources. SOA allows WebSphere DataStage jobs to participate in federated queries by using WebSphere Federation Server.

Figure 24 on page 40 shows a business scenario in which a customer service manager needs to integrate information across multiple data stores to address new customer complaints. The manager needs to look at the actual invoice to compare recent shipment data in XML format plus the historical data in the warehouse to ensure that the data is accurate.
