IBM Information Server
Information Server Introduction
Version 8.0
Note
Before using this information and the product that it supports, be sure to read the general information under “Notices and trademarks” on page 133.
© Copyright International Business Machines Corporation 2006. All rights reserved.
Contents
Chapter 1. Introduction
Chapter 2. Architecture and concepts
  Parallel processing in IBM Information Server
  Parallelism basics in IBM Information Server
  Scalability in IBM Information Server
  Support for grid computing in IBM Information Server
  Shared services in IBM Information Server
  Administrative services in IBM Information Server
  Reporting services in IBM Information Server
Chapter 3. Metadata services
  Metadata services introduction
  A closer look at metadata services in IBM Information Server
  WebSphere Business Glossary
  WebSphere Business Glossary tasks
  WebSphere Metadata Server
  Information resources for metadata services
Chapter 4. Service-oriented integration
  Introduction to service-oriented integration in IBM Information Server
  A closer look at service-oriented integration in IBM Information Server
  SOA components in IBM Information Server
  WebSphere Information Services Director tasks
  SOA and data integration
  Information resources for WebSphere Information Services Director
Chapter 5. WebSphere Information Analyzer
  WebSphere Information Analyzer capabilities
  A closer look at WebSphere Information Analyzer
  WebSphere Information Analyzer tasks
  Data profiling and analysis
  Data monitoring and trending
  Results of the analysis
  Information resources for WebSphere Information Analyzer
Chapter 6. WebSphere QualityStage
  Introduction to WebSphere QualityStage
  A closer look at WebSphere QualityStage
  WebSphere QualityStage tasks
  Investigate stage
  Standardize stage
  Match stages overview
  Survive stage
  Accessing metadata services
  Information resources for WebSphere QualityStage
Chapter 7. WebSphere DataStage
  Introduction to WebSphere DataStage
  A closer look at WebSphere DataStage
  WebSphere DataStage tasks
  WebSphere DataStage elements
  Overview of the Designer, Director, and Administrator clients
  Data transformation for zSeries
  WebSphere DataStage MVS Edition
  WebSphere DataStage Enterprise for z/OS
  Information resources for WebSphere DataStage
Chapter 8. WebSphere Federation Server
  Introduction to WebSphere Federation Server
  A closer look at WebSphere Federation Server
  The federated server and database
  Wrappers and other federated objects
  Query optimization
  Two-phase commit for federated transactions
  Rational Data Architect
  WebSphere Federation Server tasks
  Federated objects
  Cache tables for faster query performance
  Monitoring federated queries
  Federated stored procedures
  Information resources for WebSphere Federation Server
Chapter 9. Companion products
  WebSphere DataStage Packs
  A closer look at WebSphere DataStage Packs
  WebSphere DataStage Change Data Capture
  WebSphere Replication Server
  WebSphere Data Event Publisher
  Information resources for IBM Information Server companion products
Accessing information about IBM
  Contacting IBM
  Accessible documentation
  Providing comments on the documentation
Notices and trademarks
  Notices
  Trademarks
Chapter 1. Introduction
Most of today’s critical business initiatives cannot succeed without effective integration of information. Initiatives such as single view of the customer, business intelligence, supply chain management, and Basel II and Sarbanes-Oxley compliance require consistent, complete, and trustworthy information.
IBM® Information Server is the industry’s first comprehensive, unified foundation for enterprise information architectures, capable of scaling to meet any information volume requirement so that companies can deliver business results within these initiatives faster and with higher quality results.
IBM Information Server combines the technologies within the IBM Information Integration Solutions portfolio (WebSphere® DataStage®, WebSphere QualityStage, WebSphere ProfileStage, and WebSphere Information Integrator) into a single unified platform that enables companies to understand, cleanse, transform, and deliver trustworthy and context-rich information.
Over the last two decades, companies have made significant investments in enterprise resource planning, customer relationship management, and supply chain management packages. Companies also are leveraging innovations such as service-oriented architectures (SOA), Web services, XML, grid computing, and Radio Frequency Identification (RFID).
These investments have increased the amount of data that companies are capturing about their businesses. But companies encounter significant integration hurdles when they try to turn that data into consistent, timely, and accurate information for decision-making.
IBM Information Server helps you derive more value from complex, heterogeneous information. It helps business and IT personnel collaborate to understand the meaning, structure, and content of information across a wide variety of sources. IBM Information Server helps you access and use information in new ways to drive innovation, increase operational efficiency, and lower risk.
IBM Information Server supports all of these initiatives:
Business intelligence
IBM Information Server makes it easier to develop a unified view of the business for better decisions. It helps you understand existing data sources, cleanse, correct, and standardize information, and load analytical views that can be reused throughout the enterprise.
Master data management
IBM Information Server simplifies the development of authoritative master data by showing where and how information is stored across source systems. It also consolidates disparate data into a single, reliable record, cleanses and standardizes information, removes duplicates, and links records together across systems. This master record can be loaded into operational data stores, data warehouses, or master data applications such as WebSphere Customer Center. The record can also be assembled,
Infrastructure rationalization
IBM Information Server aids in reducing operating costs by showing relationships between systems and by defining migration rules to consolidate instances or move data from obsolete systems. Data cleansing and matching ensure high-quality data in the new system.
Business transformation
IBM Information Server can speed development and increase business agility by providing reusable information services that can be plugged into applications, business processes, and portals. These standards-based information services are maintained centrally by information specialists but are widely accessible throughout the enterprise.
Risk and compliance
IBM Information Server helps improve visibility and data governance by enabling complete, authoritative views of information with proof of lineage and quality. These views can be made widely available and reusable as shared services, while the rules inherent in them are maintained centrally.
Capabilities
IBM Information Server features a unified set of product modules that solve multiple types of business problems. Information validation, access and processing rules can be reused across projects, leading to a higher degree of consistency, stronger control over data, and improved efficiency in IT projects.
As Figure 1 shows, IBM Information Server enables businesses to perform four key integration functions:
Understand your data
IBM Information Server can help you automatically discover, define, and model information content and structure and understand and analyze the meaning, relationships, and lineage of information. By automating data profiling and data-quality auditing within systems, organizations can achieve these goals:
• Understand data sources and relationships
• Eliminate the risk of using or proliferating bad data
• Improve productivity through automation
• Leverage existing IT investments
IBM Information Server makes it easier for businesses to collaborate across roles. Data analysts can use analysis and reporting functionality, generating integration specifications and business rules that they can monitor over time. Subject matter experts can use Web-based tools to define, annotate, and report on fields of business data. A common metadata foundation makes it easier for different types of users to create and manage metadata by using tools that are optimized for their roles.
Cleanse your information
IBM Information Server supports information quality and consistency by standardizing, validating, matching, and merging data. It can certify and enrich common data elements, use trusted data such as postal records for name and address information, and match records across or within data sources. IBM Information Server allows a single record to survive from the best information across sources for each unique entity, helping you to create a single, comprehensive, and accurate view of information across source systems.
Transform your data into information
IBM Information Server transforms and enriches information to ensure that it is in the proper context for new uses. Hundreds of prebuilt transformation functions combine, restructure, and aggregate information. Transformation functionality is broad and flexible, to meet the requirements of varied integration scenarios. For example, IBM Information Server provides inline validation and transformation of complex data types such as U.S. Health Insurance Portability and Accountability Act (HIPAA), along with high-speed joins and sorts of heterogeneous data. IBM Information Server also provides high-volume, complex data transformation and movement functionality that can be used for standalone extract/transform/load (ETL) scenarios, or as a real-time data processing engine for applications or processes.
Deliver your information
IBM Information Server provides the ability to virtualize, synchronize, or move information to the people, processes, or applications that need it. Information can be delivered through federation or time-based or event-based processing, moved in large bulk volumes from location to location, or accessed in place when it cannot be consolidated.
IBM Information Server provides direct, native access to a wide variety of information sources, both mainframe and distributed. It provides access to databases, files, services and packaged applications, and to content repositories and collaboration systems. Companion products allow high-speed replication, synchronization and distribution across databases, change data capture, and event-based publishing of information.
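The idea of a single surviving record per entity, mentioned under "Cleanse your information", can be illustrated with a small sketch. This is not the product's survivorship logic. The `survive` function, the field names, and the "keep the longest non-empty value" rule are all assumptions chosen only to make the concept concrete.

```python
from typing import Dict, List

Record = Dict[str, str]


def survive(records: List[Record], key: str) -> Dict[str, Record]:
    """Merge records that share a key into one surviving record per entity.

    Hypothetical rule: for each field, keep the longest non-empty value seen.
    """
    survivors: Dict[str, Record] = {}
    for rec in records:
        best = survivors.setdefault(rec[key], {})
        for field_name, value in rec.items():
            # Replace the current value only when the new one carries more text.
            if len(value) > len(best.get(field_name, "")):
                best[field_name] = value
    return survivors


# Two source systems describe customer C1 with different levels of detail.
sources = [
    {"id": "C1", "name": "J. Smith", "phone": ""},
    {"id": "C1", "name": "John Smith", "phone": "555-0100"},
    {"id": "C2", "name": "Ann Lee", "phone": "555-0199"},
]
golden = survive(sources, "id")
```

In this sketch, `golden["C1"]` combines the fuller name and the phone number from the second source. A real deployment would more likely choose values by source trust or recency than by length.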
Chapter 2. Architecture and concepts
IBM Information Server provides a unified architecture that works with all types of information integration. Common services, unified parallel processing, and unified metadata are at the core of the server architecture.
The architecture is service oriented, enabling IBM Information Server to work within an organization’s evolving enterprise service-oriented architectures. A service-oriented architecture also connects the individual product modules of IBM Information Server.
By eliminating duplication of functions, the architecture efficiently uses hardware resources and reduces the amount of development and administrative effort that is required to deploy an integration solution.
Figure 2 on page 5 shows the five top-level components of the IBM Information Server architecture.
Unified parallel processing engine
Much of the work that IBM Information Server does takes place within the parallel processing engine. The engine handles data processing needs as diverse as performing analysis of large databases for WebSphere Information Analyzer, data cleansing for WebSphere QualityStage, and complex transformations for WebSphere DataStage. This parallel processing engine is designed to deliver:
• Parallelism and pipelining to complete increasing volumes of work in decreasing time windows
• Scalability by adding hardware (for example, processors or nodes in a grid) with no changes to the data integration design
• Optimized database, file, and queue processing to handle large files that cannot fit in memory all at once, or large numbers of small files
Common connectivity
IBM Information Server connects to information sources whether they are structured, unstructured, on the mainframe, or applications. Metadata-driven connectivity is shared across the product modules, and connection objects are reusable across functions.
Connectors provide design-time importing of metadata, data browsing and sampling, run-time dynamic metadata access, error handling, and high functionality and high performance run-time data access. Prebuilt interfaces for packaged applications called Packs provide adapters to SAP, Siebel, Oracle, and others, enabling integration with enterprise applications and associated reporting and analytical systems.
Unified metadata
IBM Information Server is built on a unified metadata infrastructure that enables shared understanding between business and technical domains. This infrastructure reduces development time and provides a persistent record that can improve confidence in information. All functions of IBM Information Server share the same metamodel, making it easier for different roles and functions to collaborate.
A common metadata repository provides persistent storage for all IBM Information Server product modules. All of the products depend on the repository to navigate, query, and update metadata. The repository contains two kinds of metadata:
Dynamic
Dynamic metadata includes design-time information.
Operational
Operational metadata includes performance monitoring, audit and log data, and data profiling sample data.
Because the repository is shared by all product modules, profiling information that is created by WebSphere Information Analyzer is instantly available to users of WebSphere DataStage and QualityStage, for example. The repository is a J2EE application that uses a standard relational database such as IBM DB2®, Oracle, or SQL Server for persistence (DB2 is provided with IBM Information Server). These databases provide backup, administration, scalability, parallel access, transactions, and concurrent access.
Common services
IBM Information Server is built entirely on a set of shared services that centralize core tasks across the platform. These include administrative tasks such as security, user administration, logging, and reporting. Shared services allow these tasks to be managed and controlled in one place, regardless of which product module is being used. The common services also include the metadata services, which provide standard service-oriented access and analysis of metadata across the platform. In addition, the common services layer manages how services are deployed from any of the product functions, allowing cleansing and transformation rules or federated queries to be published as shared services within an SOA, using a consistent and easy-to-use mechanism.
IBM Information Server products can access three general categories of service:
Design
Design services help developers create function-specific services that can also be shared. For example, WebSphere Information Analyzer calls a column analyzer service that was created for enterprise data analysis but can be integrated with other parts of IBM Information Server because it exhibits common SOA characteristics.
Execution
Execution services include logging, scheduling, monitoring, reporting, security, and Web framework.
Metadata
Using metadata services, metadata is shared “live” across tools so that changes made in one IBM Information Server product are instantly visible across all of the product modules. Metadata services are tightly integrated with the common repository and are packaged in WebSphere Metadata Server. You can also exchange metadata with external tools by using metadata services.
The common services layer is deployed on J2EE-compliant application servers such as IBM WebSphere Application Server, which is included with IBM Information Server.
Unified user interface
The face of IBM Information Server is a common graphical interface and tool framework. Shared interfaces such as the IBM Information Server console and Web console provide a common look and feel, visual controls, and user experience across products. Common functions such as catalog browsing, metadata import, query, and data browsing all expose underlying common services in a uniform way. IBM Information Server provides rich client interfaces for highly detailed development work and thin clients that run in Web browsers for administration.
Application programming interfaces (APIs) support a variety of interface styles that include standard request-reply, service-oriented, event-driven, and scheduled task invocation.
Parallel processing in IBM Information Server
Companies today must manage, store, and sort through rapidly expanding volumes of data and deliver it to end users as quickly as possible.
To address these challenges, organizations need a scalable data integration architecture that contains the following components:
• A method for processing data without writing to disk, in batch and real time.
• Dynamic data partitioning and in-flight repartitioning.
• Scalable hardware that supports symmetric multiprocessing (SMP), clustering, grid, and massively parallel processing (MPP) platforms without requiring changes to the underlying integration process.
• Support for parallel databases including DB2, Oracle, and Teradata, in parallel and partitioned configurations.
• An extensible framework to incorporate in-house and third-party software.
IBM Information Server addresses all of these requirements by exploiting both pipeline parallelism and partition parallelism to achieve high throughput, performance, and scalability.
Parallelism basics in IBM Information Server
The pipeline parallelism and partition parallelism that are used in IBM Information Server underlie its high-performance, scalable architecture.
Data pipelining
Data pipelining is the process of pulling records from the source system and moving them through the sequence of processing functions that are defined in the data-flow (the job). Because records are flowing through the pipeline, they can be processed without writing the records to disk, as Figure 3 shows.
Data can be buffered in blocks so that each process is not slowed when other components are running. This approach avoids deadlocks and speeds performance by allowing both upstream and downstream processes to run concurrently.
Without data pipelining, the following issues arise:
• Data must be written to disk between processes, degrading performance and increasing storage requirements and the need for disk management.
• The developer must manage the I/O processing between components.
• The process becomes impractical for large data volumes.
• The application will be slower, as disk use, management, and design complexities increase.
• Each process must complete before downstream processes can begin, which limits performance and full use of hardware resources.
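The pipelining idea can be sketched with Python generators. This is only an illustration of records flowing through stages without intermediate files, not the parallel engine itself. The stage names and record fields are assumptions, and generators interleave the stages in a single process, whereas the engine described here runs stages concurrently.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, str]


def read_source(rows: Iterable[Record]) -> Iterator[Record]:
    """Pull records from the source one at a time."""
    for row in rows:
        yield row


def standardize(records: Iterable[Record]) -> Iterator[Record]:
    """Trim and upper-case the surname as each record flows past."""
    for rec in records:
        yield {**rec, "surname": rec["surname"].strip().upper()}


def keep_valid(records: Iterable[Record]) -> Iterator[Record]:
    """Drop records that have no account number."""
    for rec in records:
        if rec.get("account"):
            yield rec


def run_pipeline(rows: Iterable[Record]) -> List[Record]:
    # Stages are chained lazily: each record moves through every stage
    # in memory, and nothing is written to disk between stages.
    return list(keep_valid(standardize(read_source(rows))))


sample = [
    {"surname": " smith ", "account": "A1"},
    {"surname": "jones", "account": ""},
    {"surname": "Lee", "account": "A3"},
]
result = run_pipeline(sample)
```

Because each stage consumes records as soon as the upstream stage yields them, a downstream stage never waits for the whole data set to be produced.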
Data partitioning
Data partitioning is an approach to parallelism that involves breaking the record set into partitions, or subsets of records. Data partitioning generally provides linear increases in application performance. Figure 4 shows data that is partitioned by customer surname before it flows into the Transformer stage.
A scalable architecture should support many types of data partitioning, including the following types:
• Hash key (data) values
• Range
• Round-robin
• Random
• Entire
• Modulus
• Database partitioning
IBM Information Server automatically partitions data based on the type of partition that the stage requires. Typical packaged tools lack this capability and require developers to manually create data partitions, which results in costly and time-consuming rewriting of applications or the data partitions whenever the administrator wants to use more hardware capacity.
In a well-designed, scalable architecture, the developer does not need to be concerned about the number of partitions that will run, the ability to increase the number of partitions, or repartitioning data.
Dynamic repartitioning
In the examples shown in Figure 4 and Figure 5 on page 10, data is partitioned based on customer surname, and then the data partitioning is maintained throughout the flow.
This type of partitioning is impractical for many uses, such as a transformation that requires data partitioned on surname but must then be loaded into the data warehouse by using the customer account number.
Dynamic data repartitioning is a more efficient and accurate approach. With dynamic data repartitioning, data is repartitioned while it moves between processes without writing the data to disk, based on the downstream process that data partitioning feeds. The IBM Information Server parallel engine manages the communication between processes for dynamic repartitioning.
Data is also pipelined to downstream processes when it is available, as Figure 6 shows.
Without partitioning and dynamic repartitioning, the developer must take these steps:
• Create separate flows for each data partition, based on the current hardware configuration.
• Write data to disk between processes.
• Manually repartition the data.
• Start the next process.
The application will be slower, disk use and management will increase, and the design will be much more complex. The dynamic repartitioning feature of IBM Information Server helps you overcome these issues.
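The surname-then-account scenario described above can be sketched as follows. The helper names, the two-way surname range split, and the modulus rule for account numbers are assumptions for illustration. The point is only that records move from one partitioning scheme to another in memory, with no intermediate files.

```python
from typing import Dict, List

Record = Dict[str, object]


def partition_by_surname(records: List[Record]) -> List[List[Record]]:
    """Range partition: surnames A-M go to partition 0, N-Z to partition 1."""
    parts: List[List[Record]] = [[], []]
    for rec in records:
        first_letter = str(rec["surname"])[0].upper()
        parts[0 if first_letter <= "M" else 1].append(rec)
    return parts


def repartition(partitions: List[List[Record]], key: str, n: int) -> List[List[Record]]:
    """Redistribute records from the upstream partitions on a new integer key."""
    out: List[List[Record]] = [[] for _ in range(n)]
    for part in partitions:
        for rec in part:
            out[int(rec[key]) % n].append(rec)
    return out


customers = [
    {"surname": "Adams", "account": 11},
    {"surname": "Nguyen", "account": 12},
    {"surname": "Brown", "account": 13},
    {"surname": "Smith", "account": 14},
]
# The transformation step works on surname partitions...
by_surname = partition_by_surname(customers)
# ...and the load step receives the same records regrouped by account number.
by_account = repartition(by_surname, "account", 2)
```

The job logic never refers to files or to a fixed number of flows. Only the partitioning key changes between the two steps.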
Figure 5. Data partitioning and parallel execution - a less practical approach
Scalability in IBM Information Server
IBM Information Server is built on a highly scalable software architecture that delivers high levels of throughput and performance.
For maximum scalability, integration software must do more than run on Symmetric Multiprocessing (SMP) and Massively Parallel Processing (MPP) computer systems. If the data integration platform does not saturate all of the nodes of the MPP box or system in the cluster or grid, scalability cannot be maximized.
The IBM Information Server components fully exploit SMP, clustered, grid, and MPP environments to optimize the use of all available hardware resources.
For example, when you create a simple sequential data-flow graph by using the WebSphere DataStage and QualityStage Designer, you do not need to worry about the underlying hardware architecture or number of processors. A separate configuration file defines the resources (physical and logical partitions or nodes, memory, and disk) of the underlying multiprocessor computing system.
As Figure 7 on page 12 shows, the configuration provides a clean separation between creating the sequential data-flow graph and the parallel execution of the application. This separation simplifies the development of scalable data integration systems that run in parallel.
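The separation between job design and hardware configuration can be sketched as follows. The configuration here is a plain dictionary with a hypothetical `nodes` list. The actual configuration file format is not shown in this document, and threads stand in for processing nodes purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def job(partition: List[int]) -> int:
    """The job as the developer designs it: sum the values in one partition."""
    return sum(partition)


def run(job_fn: Callable[[List[int]], int], data: List[int],
        config: Dict[str, List[str]]) -> int:
    """Run the same job on as many partitions as the configuration lists nodes."""
    node_count = len(config["nodes"])
    # Round-robin split of the input, one partition per configured node.
    partitions = [data[i::node_count] for i in range(node_count)]
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partial_results = list(pool.map(job_fn, partitions))
    return sum(partial_results)


data = list(range(1, 101))
one_node = {"nodes": ["node1"]}
four_nodes = {"nodes": ["node1", "node2", "node3", "node4"]}

total_on_one = run(job, data, one_node)
total_on_four = run(job, data, four_nodes)
```

Switching from `one_node` to `four_nodes` changes the degree of parallelism without touching `job`, which mirrors the decoupling that the text describes.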
Without support for scalable hardware environments the following problems can occur:
• Processing is slower, because hardware resources are not maximized.
• Application design and hardware configuration cannot be decoupled, and manual intervention and possibly redesign is required for every hardware change.
• Scaling on demand is not possible.
IBM Information Server leverages powerful parallel processing technology to ensure that large volumes of information can be processed quickly. This technology ensures that processing capacity does not inhibit project results and allows solutions to easily expand to new hardware and to fully leverage the processing power of all available hardware.
Support for grid computing in IBM Information Server
With hardware computing power a commodity, grid computing is a highly compelling option for large enterprises. Grid computing allows you to apply more processing power to a task than was previously possible.
Grid computing uses all of the low-cost computing resources, processors, and memory that are available on the network to create a single system image. Grid computing software provides a list of available computing resources and a list of tasks. When a computer becomes available, the grid software assigns new tasks according to appropriate rules.
A grid can be made up of thousands of computers. Grid-computing software balances IT supply and demand by letting users specify processor and memory requirements for their jobs, and then find available machines on a network to meet those specifications.
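The matching of job requirements to available machines can be sketched with a simple first-fit rule. The `Machine` and `Job` classes and the first-fit policy are assumptions for illustration. Real grid schedulers apply richer rules, and this is not the scheduler that ships with the product.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Machine:
    name: str
    free_cpus: int
    free_memory_gb: int


@dataclass
class Job:
    name: str
    cpus: int
    memory_gb: int


def assign(jobs: List[Job], machines: List[Machine]) -> Dict[str, Optional[str]]:
    """First-fit: place each job on the first machine with enough free capacity."""
    placement: Dict[str, Optional[str]] = {}
    for job in jobs:
        placement[job.name] = None  # stays None if no machine can take the job
        for machine in machines:
            if machine.free_cpus >= job.cpus and machine.free_memory_gb >= job.memory_gb:
                # Reserve the capacity so later jobs see the reduced supply.
                machine.free_cpus -= job.cpus
                machine.free_memory_gb -= job.memory_gb
                placement[job.name] = machine.name
                break
    return placement


machines = [Machine("m1", 2, 4), Machine("m2", 8, 32)]
jobs = [Job("profile", 4, 16), Job("cleanse", 2, 4), Job("load", 8, 16)]
placement = assign(jobs, machines)
```

In this run the `load` job is left unplaced because no machine has eight free processors after the first two jobs are assigned. A scheduler would queue it until capacity is released.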
The parallel processing architecture of IBM Information Server leverages the computing power of grid environments and greatly simplifies the development of scalable integration systems that execute in parallel for grid environments.
IBM Information Server’s pre-bundled grid edition provides rapid out-of-the-box implementation of grid scalability. It includes an integrated grid scheduler and integrated grid optimization. These capabilities help you easily and flexibly deploy integration logic across a grid without impacting job design, which provides unlimited scalability.
Shared services in IBM Information Server
IBM Information Server provides extensive administrative and reporting facilities that use shared services and a Web application that offers a common look and feel for all administrative and reporting tasks.
Administrative services in IBM Information Server
IBM Information Server provides administrative services to help you manage users, roles, sessions, security, logs, and schedules. The Web console provides global administration capabilities that are based on a common framework.
The IBM Information Server console provides these services:
• “Security services”
• “Log services” on page 14
• “Scheduling services” on page 15
Security services
Security services support role-based authentication of users, access-control services, and encryption that complies with many privacy and security regulations. As Figure 8 on page 14 shows, the console helps administrators add users, groups, and roles and lets administrators browse, create, delete, and update operations within Information Server.
Directory services act as a central authority that can authenticate resources and manage identities and relationships among identities. You can base directories on IBM Information Server’s own internal directory or on external directories that are based on LDAP, Microsoft’s Active Directory, or UNIX®.
Users need only one credential to access all the components of Information Server. A set of credentials is stored for each user to provide single sign-on to the products registered with the domain.
Log services
Log services help you manage logs across all of the IBM Information Server suite components. The console provides a central place to view logs and resolve problems. Logs are stored in the common repository, and each IBM Information Server suite component defines relevant logging categories.
You can configure which categories of logging messages are saved in the repository. Log views are saved queries that an administrator can create to help with common tasks. For example, you might want to display all of the errors in DataStage jobs that ran in the past 24 hours.
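A log view as a saved query can be sketched like this. The `LogEvent` fields, the severity labels, and the `make_log_view` helper are assumptions for illustration. The console's actual log schema and query mechanism are not described in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List


@dataclass
class LogEvent:
    timestamp: datetime
    component: str
    severity: str
    message: str


def make_log_view(component: str, severity: str,
                  window: timedelta) -> Callable[[List[LogEvent], datetime], List[LogEvent]]:
    """Build a reusable query: one component, one severity, within a time window."""
    def view(events: List[LogEvent], now: datetime) -> List[LogEvent]:
        cutoff = now - window
        return [
            event for event in events
            if event.component == component
            and event.severity == severity
            and event.timestamp >= cutoff
        ]
    return view


now = datetime(2006, 10, 2, 12, 0)
events = [
    LogEvent(now - timedelta(hours=2), "DataStage", "ERROR", "Job load_customers aborted"),
    LogEvent(now - timedelta(hours=30), "DataStage", "ERROR", "Job nightly_sales aborted"),
    LogEvent(now - timedelta(hours=1), "DataStage", "INFO", "Job load_customers started"),
    LogEvent(now - timedelta(hours=3), "QualityStage", "ERROR", "Match stage failed"),
]

# Saved once, then reused whenever the administrator needs the same report.
recent_datastage_errors = make_log_view("DataStage", "ERROR", timedelta(hours=24))
matches = recent_datastage_errors(events, now)
```

Only the first event matches: the second is older than 24 hours, the third is not an error, and the fourth belongs to a different component.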
Figure 9 on page 15 shows the IBM Information Server Web console being used to configure logging reports. Logging is organized by server components. The Web console displays default and active configurations for each component.
Scheduling services
Scheduling services help plan and track activities such as logging and reporting and suite component tasks such as data monitoring and trending. Schedules are maintained using the IBM Information Server console, which helps you define schedules; view their status, history, and forecast; and purge them from the system.
Reporting services in IBM Information Server
Reporting services manage runtime and administrative aspects of reporting for IBM Information Server.
You can create product-specific reports for WebSphere DataStage, WebSphere QualityStage, and WebSphere Information Analyzer, and cross-product reports for logging, monitoring, scheduling, and security services.
All reporting tasks are set up and run from a single interface, the IBM Information Server Web console. You can retrieve and view reports and schedule reports to run at a specific time and frequency.
Figure 10 on page 16 shows the Web console.
You define reports by choosing from a set of predefined parameters and templates. You can specify a history policy that determines how the report will be archived and when it expires. Reports can be formatted as HTML, PDF or Microsoft® Word documents.
Chapter 3. Metadata services
When moving to an enterprise integration strategy, large organizations often face a proliferation of software tools that are built to solve identical problems. Few of these tools work together, much less work across problem domains to provide an integrated solution.
Data profiling, data modeling, data transformation, data quality, and business intelligence tools play a key role in data integration. Integration can become a mature, manageable process if these tools are enabled to work across problem domains.
The consequences of the inability to manage metadata are many and severe:
• Changes that are made to source systems are difficult to manage and cannot match the pace of business change.
• Data cannot be analyzed across departments and processes.
• Metadata cannot be shared among products without manually retyping the metadata.
• Without business-level definitions, metadata cannot provide context for information.
• Documentation is out-of-date or incomplete, hampering change management and making it harder to train new users.
• Efforts to establish an effective data stewardship program fail because of a lack of standardization and familiarity with the data.
• Establishing an audit trail for integration initiatives is virtually impossible.
The metadata services components of IBM Information Server create a fully integrated suite, eliminating the need to manually transport metadata between applications or to provide a standalone metadata management application.
Metadata services introduction
Metadata services are part of the platform on which IBM Information Server is built. By using metadata services, you can access data and achieve data integration tasks such as analysis, modeling, cleansing, and transformation.
The major metadata services components of IBM Information Server are WebSphere Business Glossary, WebSphere Metadata Server, and WebSphere MetaBrokers and bridges.
WebSphere Business Glossary
WebSphere Business Glossary is a Web-based application that provides a business-oriented view into the data integration environment. By using WebSphere Business Glossary, you can view and update business descriptions and access technical metadata.
Metadata is best managed by those who understand the meaning and importance of the information assets to the business. Designed for collaborative authoring, WebSphere Business Glossary gives users the ability to share insights and experiences about data. It provides users with the following information about data resources:
• Business meaning and descriptions of data
• Stewardship of data and processes
• Standard business hierarchies
• Approved terms
WebSphere Business Glossary is organized and searchable according to the semantics that are defined by a controlled vocabulary, which you can create by using the Web console.
WebSphere Metadata Server
WebSphere Metadata Server provides a variety of services to other components of IBM Information Server:
• Metadata access
• Metadata integration
• Metadata import and export
• Impact analysis
• Search and query
WebSphere Metadata Server provides a common repository with facilities that are capable of sourcing, sharing, storing, and reconciling a comprehensive spectrum of metadata including business metadata and technical metadata.
Business metadata
Business metadata provides business context for information technology assets and adds business meaning to the artifacts that are created and managed by other IT applications. Business metadata includes controlled vocabularies, taxonomies, stewardship, examples, and business definitions.
Technical metadata
Technical metadata provides details about source and target systems, their table and field structures, attributes, derivations, and dependencies. Technical metadata also includes details about profiling, quality, and ETL processes, projects, and users.
WebSphere MetaBrokers and bridges
WebSphere MetaBrokers and bridges provide semantic model mapping technology that allows metadata to be shared among applications for all products that are used in the data integration lifecycle:
• Data modeling or case tools
• Business intelligence applications
• Data marts and data warehouses
• Enterprise applications
• Data integration tools
Customers who use these components can achieve the following results:
• Establish common data definitions across business and IT functions
• Drive consistency throughout the data integration lifecycle
• Provide enterprise visibility for change management
• Easily extend to new, existing, and homegrown metadata sources
Scenarios for metadata management
A comprehensive metadata management capability provides users of IBM Information Server with a common way to deal with descriptive information surrounding the use of data. The following scenarios describe uses of this capability.
Web-based education: Profiling your customer
A Web-based, for-profit education provider needed to retain more students. Business managers needed to analyze the student lifecycle from application to graduation and direct recruiting efforts at individuals with the best chance of success.
To meet this business imperative, the company designed and delivered a business intelligence solution that uses a data warehouse that contains a single view of student information that is populated from operational systems. The IT organization uses WebSphere Metadata Server to coordinate metadata throughout the project. Other tools that were used included Embarcadero ERStudio for data modeling and Brio for Business Intelligence.
The overall project time was reduced by providing metadata consistency and accuracy across every tool. The business users now have trustworthy metadata about the information in their Brio reports. WebSphere Business Glossary provided business definitions to WebSphere Metadata Server. The net result is more confident decision-making about students and better student-retention initiatives.
Financial Services: Measuring levels of service
The data warehousing division of a major financial services provider needed to provide internal customers with critical enterprise-wide data about levels of service that are specified by signed service level agreements (SLAs). The data warehousing group also needed to provide business definitions of each field, including metrics that detailed actual versus promised levels of service.
The organization uses IBM Information Server to create an enterprise data warehouse and data marts to satisfy each SLA. The division used metadata services within WebSphere Information Analyzer, WebSphere QualityStage, and WebSphere DataStage to collaborate in a multiuser environment. The data warehousing group was also able to provide HTML reports that outlined the statistics that are associated with the loading of the data mart to satisfy the SLA.
The division met its service-level agreements and was able to demonstrate its compliance to internal data consumers. Additionally, end users received important business definitions through business intelligence reports.
A closer look at metadata services in IBM Information Server
Metadata services encompass a wide range of functionality that forms the core infrastructure of IBM Information Server and also includes some separately packaged capabilities.
WebSphere Business Glossary
Managing business metadata effectively can ensure that the same data “language” applies throughout the organization. WebSphere Business Glossary gives business users the tools they need to author and own business metadata.
For example, one department refers to “revenues,” another to “sales.” Are they talking about the same activity? One subsidiary unit talks about “customers,” another about “users” or “clients.” Are these different classifications or different terms for the same classification?
WebSphere Business Glossary provides business users with a Web-based tool for creating and managing standard definitions of business concepts, called a controlled vocabulary. It also simplifies the building of a business-oriented classification system and the collaborative authoring of business metadata.
The tool simplifies the task of managing, browsing, and customizing the broad variety of metadata that is stored in the repository of WebSphere Metadata Server, metadata that includes details about tables, columns, models, schemas, operations, and other components of the data integration process.
The tool divides metadata into categories, each of which contains terms. You can use terms to classify other objects in the metadata repository based on the needs of your business. You can also designate users or groups as stewards for metadata objects.
WebSphere Business Glossary helps business users with the following tasks:
Developing a common vocabulary between business and technology
A common vocabulary allows multiple users of data to share a common view of the meaning of data. Users can assign categories and terms to data that are meaningful in a business context, and create a hierarchy of categories for ease of browsing.
Providing data governance and stewardship
Data assurance programs assign responsibility to business users (data stewards) for the management of data through its life cycle.
Finding business information that is derived from metadata
Metadata helps business users to understand the meaning of the data, its currency, its lineage, and who is responsible for defining and producing the data. If a business user wants to know the definition of a term such as “corporate price,” the glossary will provide this insight.
Accessing metadata without complicated tooling and querying
Metadata objects can be arranged in a hierarchical fashion to simplify browsing of the data objects.
Providing collaborative enrichment of business metadata
Maintenance of business metadata is an ongoing process in which automated and manual data inputs evolve. Multiple business users can collaborate to add notes, annotations, categories, and synonyms to enrich business metadata.
For example, multiple systems may maintain tables of customer information; however, the business may uncover a requirement for the concept of “high-value” customers. The business needs a way to define what a high-value customer is, and how to recognize one (for example, a high-value customer is a customer with combined account balances over $10,000). WebSphere Business Glossary provides a tool for recording these definitions, and relating business concepts together into taxonomies. This records the business requirements in the same metadata foundation that the profiling and analysis process uses.
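The high-value customer rule in the example above can be written down as a small executable check. The following Python sketch is illustrative only: the `Account` type, the function name, and the sample data are assumptions made for this example and are not part of WebSphere Business Glossary, which records the definition as business metadata rather than as code.

```python
from dataclasses import dataclass

# Threshold taken from the example definition: combined balances over $10,000.
HIGH_VALUE_THRESHOLD = 10_000


@dataclass
class Account:
    """Hypothetical account record; one customer may hold several accounts."""
    customer_id: str
    balance: float


def high_value_customers(accounts, threshold=HIGH_VALUE_THRESHOLD):
    """Return the sorted IDs of customers whose combined balances exceed the threshold."""
    totals = {}
    for account in accounts:
        totals[account.customer_id] = totals.get(account.customer_id, 0.0) + account.balance
    return sorted(cid for cid, total in totals.items() if total > threshold)


# Hypothetical sample data drawn from two source systems.
SAMPLE_ACCOUNTS = [
    Account("C1", 6_000.0),
    Account("C1", 5_500.0),
    Account("C2", 9_999.0),
]
```

With the sample data, customer C1 qualifies because its two balances are combined before the comparison, which is the point of agreeing on one definition across systems.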
WebSphere Business Glossary tasks
Major tasks in WebSphere Business Glossary include creating categories and terms, browsing and searching, enabling data stewardship, and annotating data for collaboration.
WebSphere Business Glossary is a browser-based application that you access by using Microsoft Internet Explorer.
Enabling data stewardship
Data stewardship is the management of data throughout its life cycle. Stewardship includes making the data available to all those who are authorized to access it. It also includes the efficient management and integration with related data. Perhaps most importantly, stewardship includes the responsibility to ensure that data is properly defined, and that all users of the data clearly understand its meaning.
WebSphere Business Glossary supports the concept of data stewardship and helps you set and retrieve stewardship information for all data assets.
Administrators can designate a user or group as a steward. Administrators and authors can then specify that the steward is responsible for one or more metadata objects. When you view the browse page for an object that has a steward, you can link to contact information for the steward.
Creating categories and terms
Although you can use several methods to find metadata in WebSphere Metadata Server, business users often find searching data by category is the best strategy. Data must be organized into meaningful taxonomies to aid the navigation of a business glossary by category.
Figure 12 on page 22 shows the Create Category function in WebSphere Business Glossary. You create a business classification system or taxonomy that acts as the hierarchical browsing structure of the glossary Web site. You can also import structure from other tools or spreadsheets.
A term is a word or phrase that can be used to classify and group objects in the metadata repository. For example, you might use the term “South America Sales” to classify some of the tables and columns in the metadata repository, and the term “Asian Sales” to classify other tables and columns.
When you create or edit a term, you can specify properties and relationships among terms, including synonyms and related terms. You can also specify parent categories to group similar terms and can designate stewards who have the responsibility for maintaining terms. Custom attributes enable administrators to define any number of new attributes to be applied to terms, categories, or both.
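The relationships among categories, terms, synonyms, stewards, and custom attributes can be pictured as a small data structure. The following Python sketch is a conceptual model only; the class names, field names, and sample values (including the synonym, the steward ID, and the table name) are assumptions for illustration and do not reflect the actual repository schema.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A word or phrase used to classify objects in the metadata repository."""
    name: str
    synonyms: list = field(default_factory=list)
    related_terms: list = field(default_factory=list)
    steward: str = ""                                       # user or group responsible for the term
    classified_objects: list = field(default_factory=list)  # for example, table or column names
    custom_attributes: dict = field(default_factory=dict)


@dataclass
class Category:
    """A node in the browsing hierarchy; it holds terms and subcategories."""
    name: str
    terms: list = field(default_factory=list)
    subcategories: list = field(default_factory=list)

    def find_term(self, name):
        """Depth-first search for a term by its name or by one of its synonyms."""
        for term in self.terms:
            if name == term.name or name in term.synonyms:
                return term
        for sub in self.subcategories:
            found = sub.find_term(name)
            if found is not None:
                return found
        return None


# Hypothetical taxonomy built around the term used in the text.
sales = Category("Sales")
regional = Category("Regional Sales")
sales.subcategories.append(regional)
regional.terms.append(
    Term("South America Sales", synonyms=["LATAM Sales"], steward="jdoe",
         classified_objects=["DW.SALES_SA"])
)
```

Looking up a synonym returns the same term object as looking up the preferred name, which is one way a controlled vocabulary keeps different wordings pointed at a single definition.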
Annotating data for collaboration
While data stewards are responsible for specific types of data, creating a business glossary is a collaborative effort that requires subject matter experts from different parts of the enterprise. WebSphere Business Glossary provides tools for subject matter experts and others to annotate existing data definitions, edit descriptions, and assign data objects to categories.
These annotations, or notes, help business users share insights about the information assets of the enterprise. For example, an analyst might discover that a database column for customer information also contains shipping information that does not belong in the column. The analyst could share that information by using the Notes® feature. Notes help you capture ideas in the form of unstructured metadata. This information might otherwise be unknown to a large portion of the enterprise.
Browsing the Business Glossary
You can start browsing the glossary structure from the Overview page, which displays the top-level categories that the glossary administrator has designated as most important for navigation in the metadata repository.
The browse by category function enables data stewards to find descriptions related to a type of data even though they may not know the exact name of the data items in question.
When you select an object, its browse page is displayed on the Browse Glossary tab, which lists the object’s name, class, steward, and other important properties. You can inspect its attributes, browse its relationships to other objects, and send feedback to the administrator. Administrators and authors can add and edit notes about the object.
WebSphere Metadata Server
IBM Information Server can operate as a unified data integration platform because of the shared capabilities of WebSphere Metadata Server.
Common repository
By storing all metadata in a shared repository, IBM Information Server enables metadata to be shared actively across all tools. The repository provides services for two types of data:
v Design metadata, which is created as a part of the development process and can be configured to be either private or shared by a team of users.
v Operational metadata, which is created from ongoing integration activity. This metadata is message-oriented and time-stamped to help track the sequence of events.
With a shared repository, changes that are made in one part of IBM Information Server will be automatically and instantly visible throughout the suite. The repository offers the following key features:
Active integration
Application artifacts are dynamically integrated across tools.
Multiuser development
Teams can collaborate in a shared workspace.
The common repository is an IBM WebSphere J2EE application. The repository uses standard relational database technology (such as DB2 or Oracle) for persistence. These databases provide backup, administration, scalability, transactions, and concurrent access.
Common model
Metadata for data integration projects comes from both IBM Information Server products and vendor products. The repository uses metadata models (metamodels) to describe the metadata from these sources. Metadata models provide a means for others to understand and share metadata between applications.
The common model is the foundation of IBM Information Server. Metadata elements that are common to all metadata sources are discovered and represented once, in a form and format that is accessible to all of the tools. The common model enables sharing and reuse of artifacts across IBM Information Server.
Shared metadata services
WebSphere Metadata Server exposes a set of metadata manipulation and analysis services for use across IBM Information Server components. These services enable metadata interchange, integration, management, and analysis. They eliminate the need for a standalone metadata management product or repository product by actively managing metadata in the background, and by providing metadata functionality in the context of your normal daily activities.
For example:
v A WebSphere DataStage user wants to understand the dependencies between stages in an ETL job. By using metadata services, she can perform an impact analysis from the Designer client canvas, never needing to leave the application for another interface.
v A data analyst who is working with WebSphere Information Analyzer can add business terms, definitions, and notes to data under analysis for use by a data modeler or architect.
v A WebSphere QualityStage user needs to better understand the business semantics that are associated with a data domain. By using metadata services, he can access the business description of the domain and any annotations that were added by business users.
v A WebSphere DataStage component developer wants to find a function that performs a particular data conversion. By using metadata services, she can perform an advanced search for the function.
WebSphere Metadata Server offers the following key metadata services:
v Metadata interchange
v Impact analysis
v Integrated find
Metadata interchange
WebSphere MetaBroker® and bridges enable you to access and share metadata with the best-of-class tools for modeling, data profiling, data quality, ETL, OLAP, and business intelligence.
Figure 13 on page 25 shows how MetaBrokers work. MetaBrokers convert metadata from one format to another by mapping the elements to a standard model called the hub model. The selected metadata is then imported and stored in the repository. The metadata exchange enables decomposition and recomposition of metadata into simple units of meaning.
IBM Information Server now supports more than 20 MetaBrokers and bridges to various technologies and partner products. You can use most MetaBrokers to import metadata from a particular tool, file, or database into the metadata repository of WebSphere Metadata Server.
Table 1 describes MetaBroker types and the different types of metadata that you can access.
Table 1. MetaBroker types
Type of MetaBroker                Type of metadata
Design tool                       CA ERwin, Oracle Designer, Rational® Data Architect and the Unified Modeling Language (UML)
OLAP and business intelligence    Cognos PowerPlay, IBM Cube Views™, ReportNet, Business Objects, and Hyperion
Operational metadata              Metadata that describes operational events such as the time and date of integration process runs.
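The idea of converting tool-specific metadata by mapping its elements to a hub model can be sketched in a few lines. The following Python example is a conceptual illustration only: the element kinds, the mapping table, and the function name are hypothetical and do not describe the internal format or API of any MetaBroker.

```python
# Hypothetical mapping from one design tool's element kinds to hub model kinds.
DESIGN_TOOL_TO_HUB = {
    "Entity": "Table",
    "Attribute": "Column",
}


def to_hub(elements, mapping):
    """Translate (kind, name) pairs from a source model into hub model pairs."""
    hub_elements = []
    for kind, name in elements:
        if kind not in mapping:
            raise KeyError(f"no hub mapping for element kind {kind!r}")
        hub_elements.append((mapping[kind], name))
    return hub_elements


# Stand-in for the metadata repository: imported elements are simply appended.
repository = []
repository.extend(
    to_hub([("Entity", "Customer"), ("Attribute", "Customer.Name")], DESIGN_TOOL_TO_HUB)
)
```

A general property of mapping through a shared hub model is that each tool needs only one mapping to and from the hub, rather than one mapping for every pair of tools.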
Impact analysis
Impact analysis helps you manage the effects of changes to data by showing dependencies among objects. This type of analysis extends across multiple tools, helping you assess the cost of change. For example, a developer can predict the effects of a change to a table definition or business logic.
Figure 14 on page 26 shows the WebSphere DataStage and QualityStage Designer being used to select a table definition called ProdDim from the metadata repository to show where-used dependencies.
[Figure 13. MetaBroker diagram. Labels: External Tool, Metadata Interface, Decoder, Mapper, Encoder, Source (view) model, Target (hub) model]
The Impact Analysis Path Viewer presents a graphical view of these relationships, as Figure 15 on page 27 shows.
The dependencies can also be shown in a textual view. You can also run an impact analysis report that can be viewed from the Web console.
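A where-used analysis of this kind is, in general terms, a traversal of a dependency graph. The following Python sketch shows the idea with a breadth-first search. Apart from ProdDim, which is the table definition named in the text, the object names and the dependency edges are invented for illustration, and the function is not the product's implementation.

```python
from collections import deque

# Hypothetical dependency edges: each key is used by the objects in its list.
USED_BY = {
    "ProdDim": ["LoadProducts", "SalesMart"],
    "LoadProducts": ["NightlySequence"],
    "SalesMart": ["RevenueReport"],
}


def where_used(obj, used_by):
    """Return every object that depends on obj, directly or indirectly, in breadth-first order."""
    seen = {obj}
    order = []
    queue = deque([obj])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, []):
            if dependent not in seen:  # guard against cycles and duplicates
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order
```

Calling `where_used("ProdDim", USED_BY)` lists the direct dependents first and the indirect ones after them, which corresponds to reading a dependency path outward from the changed object.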
Integrated find
Metadata services help you locate and retrieve objects from the repository by using either the quick find feature or the advanced find feature. The quick find feature locates an object based on a full or partial name or description. The advanced find feature locates objects based on the following attributes:
v Type
v Creation date
v Last modified
v Where it is used
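The difference between the two find features can be illustrated with a toy in-memory repository. The following Python sketch is an assumption-laden illustration: the `RepoObject` fields, the function names and parameters, and the sample objects are hypothetical and are not the product's search API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RepoObject:
    """Hypothetical repository object carrying the attributes listed above."""
    name: str
    description: str
    obj_type: str
    created: date
    last_modified: date
    used_in: tuple = ()


def quick_find(objects, text):
    """Match a full or partial name or description, ignoring case."""
    needle = text.lower()
    return [o for o in objects
            if needle in o.name.lower() or needle in o.description.lower()]


def advanced_find(objects, obj_type=None, created_on_or_after=None,
                  modified_on_or_after=None, used_in=None):
    """Filter on type, creation date, last-modified date, and where used; None skips a filter."""
    matches = []
    for o in objects:
        if obj_type is not None and o.obj_type != obj_type:
            continue
        if created_on_or_after is not None and o.created < created_on_or_after:
            continue
        if modified_on_or_after is not None and o.last_modified < modified_on_or_after:
            continue
        if used_in is not None and used_in not in o.used_in:
            continue
        matches.append(o)
    return matches


# Hypothetical sample objects.
REPOSITORY = [
    RepoObject("ProdDim", "Product dimension table", "table definition",
               date(2006, 1, 10), date(2006, 3, 2), ("LoadProducts",)),
    RepoObject("CustDim", "Customer dimension table", "table definition",
               date(2006, 2, 1), date(2006, 2, 1), ("LoadCustomers",)),
    RepoObject("LoadProducts", "Job that loads product data", "job",
               date(2006, 2, 15), date(2006, 3, 5)),
]
```

Quick find trades precision for convenience by matching any substring, while advanced find combines attribute filters so that each added criterion narrows the result.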
Information resources for metadata services
A variety of information resources can help you get started with IBM Information Server’s metadata services product modules.
WebSphere Business Glossary
The Getting Started pane that appears when you click the Glossary tab of the IBM Information Server console describes the purpose of the tab and how to get started. Each pane and tab on the console displays a line of context-sensitive instructional text.
The Help button links to online documentation for WebSphere Business Glossary in the IBM Information Server information center at http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r0/index.jsp. The Information Center also provides all planning, installation, and configuration details for IBM Information Server and other product modules.
The WebSphere Business Glossary Guide PDF is also available on the Quick Start CD.
WebSphere MetaBrokers
Online help is available for all WebSphere MetaBrokers and bridges. The Information Server Guide to WebSphere MetaBrokers and Bridges PDF is also available on the Quick Start CD.
IBM Information Server and product modules
Planning, installation, and configuration details are also available in the following PDFs on the Quick Start CD:
v Information Server Planning, Installation, and Configuration Guide
Chapter 4. Service-oriented integration
IBM Information Server simplifies the creation of shared data integration services by enabling integration logic to be used by any business process.
Invoking service-ready data integration tasks ensures that business processes such as quote generation, order entries, and procurement requests receive data that is correctly transformed, standardized, and matched across applications, partners, suppliers, and customers.
Many organizations are designing their next generation of infrastructure and applications as services. Implementing a service-oriented architecture (SOA) offers these benefits:
Adaptability
Functional components can be reassembled quickly and in new ways.
Consistency
Core rules for handling data and processes are reused across projects.
Reduced cost
Increased reuse and a single point of maintenance speed time to value and reduce development expense.
Federated ownership
Each service is owned and maintained independently by its own group.
IBM Information Server provides an SOA infrastructure that provides these capabilities by helping you create shared data integration services. A common services layer manages how services are deployed from any of the product modules. Cleansing and transformation rules or federated queries can be published as shared services by using a consistent and intuitive graphical interface, and managed after publication using the same interface.
Introduction to service-oriented integration in IBM Information Server
IBM Information Server provides standard service-oriented interfaces for enterprise data integration. The built-in integration logic of IBM Information Server can easily be encapsulated as service objects that are embedded in user applications.
These service objects have the following characteristics:
Always on
The services are always running, waiting for requests. This ability removes the overhead of batch startup and shutdown and enables services to respond instantaneously to requests.
Scalable
The services distribute request processing and stop and start jobs across multiple WebSphere DataStage servers, enabling high performance with large, unpredictable volumes of requests.
Standards-based
The services are based on open standards and can easily be invoked by standards-based technologies including enterprise application integration (EAI) and enterprise service bus (ESB) platforms, applications, and portals.
Flexible
You can invoke the services by using multiple mechanisms (bindings) and choose from many options for using the services.
Manageable
Monitoring services coordinate timely reporting of system performance data.
Reliable and highly available
If any WebSphere DataStage server becomes unavailable, service requests are routed to a different server in the pool.
Reusable
The services publish their own metadata, enabling them to be found and called across any network.
High performance
Load balancing and the underlying parallel processing capabilities of IBM Information Server create high performance for any type of data payload.
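The routing behavior behind the scalable and highly available characteristics can be pictured with a simple server pool. The following Python sketch uses round-robin selection and skips servers that are marked unavailable. The class, its methods, and the server names are hypothetical, and round-robin is an assumption chosen for brevity; the text does not say which load-balancing algorithm the product uses.

```python
import itertools


class ServerPool:
    """Toy pool that hands out servers in round-robin order, skipping unavailable ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.available = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_unavailable(self, server):
        self.available.discard(server)

    def mark_available(self, server):
        if server in self.servers:
            self.available.add(server)

    def route(self):
        """Return the next available server, trying each server at most once."""
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.available:
                return server
        raise RuntimeError("no server available")
```

In this sketch a failed server is simply passed over on the next request, so callers keep receiving a working server for as long as at least one remains in the pool.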
A data integration service is created by designing the data integration process logic in IBM Information Server and publishing it as a service. These services can then be accessed by external projects and technologies.
WebSphere Information Services Director provides a foundation for information services by allowing you to leverage the other components of IBM Information Server for understanding, cleansing, and transforming information and deploying those integration tasks as consistent and reusable information services.
As Figure 16 shows, service-ready data integration jobs can be used with process-centric technologies such as EAI, business process management (BPM), ESB, and application servers.
[Figure 16. Diagram labels: Packaged Applications, Data Warehouses, Business Partner Data, Legacy Application, Master Data Stores; Enterprise Data Integration Services: Request Data from System 1, Request Data from System 2, Match and Survive, Enhance (lookup), Transform to Target Format; Business Process, Process Flow: Create Quote, Allocate Inventory, Calculate Discount, Calculate Quote, Get Customer, Request Ship Date, Process Credit Card, Estimate Backlog]
Scenarios for service-oriented integration
The following examples show how customers have used service-oriented architectures in IBM Information Server.
Pharmaceutical industry: Improving efficiency
A leading pharmaceutical company needed to include real-time data from clinical labs in its research and development reports. The company used WebSphere DataStage to define a transformation process for XML documents from labs. This process used SOA to expose the transformation as a Web service, allowing labs to send data and receive an immediate response.
Pre-clinical data is now available to scientific personnel earlier, allowing lab scientists to select which data to analyze. Now, only the best data is chosen, greatly improving scientists’ efficiency.
Insurance: Validating addresses in real time
An international insurance data services company uses IBM Information Server to validate and enrich property addresses through Web services. As insurance companies submit lists of addresses for underwriting, services standardize the addresses based on their rules, validate each address, match the addresses to a list of known addresses, and enrich the addresses with additional information that helps with underwriting decisions. The company now automates 80 percent of the process and eliminated most of the errors. The project was simplified by using the SOA capabilities of IBM Information Server and the standardization and matching capabilities of WebSphere QualityStage.
Where SOA fits in a business context
By enabling integration tasks as services, IBM Information Server becomes a critical component of the application development and integration environment. The SOA infrastructure ensures that data integration logic that is developed in IBM Information Server can be used by any business process.
SOA allows you to use both analytical and operational data. The best data is available at all times, to all people and to all processes. The following categories represent common uses of SOA in a business context:
Real-time data warehousing
Enables companies to publish their existing data integration logic as services that can be called in real time from any process. This type of warehousing enables users to perform analytical processing and loading of data based on transaction triggers, ensuring that time-sensitive data in the warehouse is completely current.
Matching services
Enables data integration logic to be packaged as a shared service that can be called by enterprise application integration platforms. This method allows reference data (such as customer, inventory, and product data) to be matched to and kept current with a master store with each transaction.
In-flight transformation
Enables enrichment logic to be packaged as shared services so that capabilities such as product name standardization, address validation, or data format transformations can be shared and reused across projects.
Enterprise data services
Enables the data access functions of many applications to be aggregated and shared in a common service layer. Instead of each application creating its own access code, these services can be reused across projects, simplifying development and ensuring a higher level of consistency.
As Figure 17 shows, one of the major advantages of using an SOA approach is that you can combine data integration tasks with the leading enterprise messaging, Enterprise Application Integration (EAI), and Business Process Management (BPM) products by using binding choices.
Since most middleware products support Web services, there are often multiple options for how this is done. For example, WebSphere integration products such as WebSphere Federation Server or WebSphere Business Integration Message Broker can invoke IBM Information Server services to access service-ready jobs.
A closer look at service-oriented integration in IBM Information Server
IBM Information Server provides an SOA infrastructure that uses data transformation processes that are created from new or existing WebSphere DataStage or WebSphere QualityStage jobs or federated queries that are created by WebSphere Federation Server and exposes them as a set of services and operations.
After an integration service is enabled, any enterprise application, .NET or Java™ developer, Microsoft Office or integration software can invoke the service by using a binding protocol such as Web services.
The following features are central to the IBM Information Server SOA infrastructure:
Common administrative services
Host and publish service metadata, expose a choice of bindings for each service, and provide infrastructure services such as security management, session management, logging, and monitoring.
Foundation components for development
Provide a single set of data transformation rules for analytical and enterprise applications, business activity monitoring, federated data access, and business process integration.
Any-to-any connectivity
Provides technology independence for data transformation, standardization, matching, and legacy data access by using Web services (.NET and Java) or Enterprise JavaBeans™ (EJB) interface bindings.
Service-ready integration
A service-ready data integration job accepts requests from client applications, mapping request data to input rows and passing them to the underlying jobs. A job instance can include database lookups, transformations, data standardization and matching, and other data integration tasks that are supplied by IBM Information Server.
Figure 18 shows a WebSphere DataStage job with a service input and service output.
The design of a real-time job determines whether it is always running or runs once to completion. All jobs that are exposed as services process requests on a 24-hour basis. The SOA infrastructure supports three job topologies for different load and work style requirements:
Batch jobs
Topology I uses new or existing batch jobs that are exposed as services. A batch job starts on demand. Each service request starts one instance of the job that runs to completion. This job typically initiates a batch process from a real-time process that does not need direct feedback on the results. This topology is tailored for processing bulk data sets and is capable of accepting job parameters as input arguments.
Batch jobs with a Service Output stage
Topology II uses an existing batch job and adds an output stage. The Service Output stage is the exit point from the job, returning one or more rows to the client application as a service response. As Figure 19 on page 34 shows, these jobs typically initiate a batch process from a real-time process that requires feedback or data from the results. This topology is designed to process large data sets and can accept job parameters as input arguments.
[Figure 18. A WebSphere DataStage job with a service input and service output. Diagram labels: Service Input Stage, Rest of DataStage Job, Service Output Stage]
Jobs with a Service Input stage and Service Output stage
In Topology III, jobs use both a Service Input stage and a Service Output stage. The Service Input stage is the entry point to a job, accepting one or more rows during a service request. These jobs are always running. This topology is typically used to process high volumes of smaller transactions where response time is important. It is tailored to process many small requests rather than a few large requests. Figure 20 shows an example of this topology.
Figure 19. Batch jobs with a Service Output stage
[Figure 19 diagram labels: Service Output, CustomerDB, D1Orders, Rows, ReturnedRows, XML Output, Order_Transformation]
[Figure 20 diagram labels: Service Input, Service Output, ODBC, DSLink1, DSLink2, DSLink3, DSLink4, DSLink5, XML Output, XML Input, Transformer]
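The practical difference among the three topologies is when a job instance starts and whether rows come back to the caller. The following Python sketch models that difference with ordinary functions standing in for jobs. All names here are hypothetical, and the code illustrates the calling pattern only, not how WebSphere DataStage runs jobs.

```python
def orders_for(customer_id):
    """Hypothetical batch job: takes a job parameter and produces output rows."""
    return [{"customer": customer_id, "order": n} for n in (1, 2)]


def run_batch(job, params):
    """Topology I: each request starts one job instance; no rows are returned."""
    job(**params)


def run_batch_with_output(job, params):
    """Topology II: each request starts one job instance and returns its output rows."""
    return list(job(**params))


class AlwaysOnJob:
    """Topology III: one long-running instance that serves many small requests."""

    def __init__(self, transform):
        self.transform = transform
        self.requests_handled = 0

    def handle(self, input_rows):
        """Accept one or more input rows and return the transformed rows."""
        self.requests_handled += 1
        return [self.transform(row) for row in input_rows]
```

In the first two patterns the cost of starting a job is paid on every request, which suits large, infrequent workloads. In the third pattern the job is already running, so many small requests reuse the same instance.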
SOA components in IBM Information Server
The run-time components that enable service-oriented architectures are contained in the run-time environment of the common services of IBM Information Server.
These components are J2EE applications that distribute requests to WebSphere DataStage, WebSphere QualityStage, or WebSphere Federation Server based on load-balancing algorithms. Common core services include security and logging.
Threshold-balanced parallelism
The run-time environment combines parallel processing with load balancing and distribution to provide high performance data processing. It balances service requests by routing them to WebSphere Federation Server or WebSphere DataStage servers, each of which takes advantage of pipeline technology for parallel execution.
Threshold-balanced parallelism enables SOA platforms to automatically adjust resources based on thresholds that you set when you define services. The common services start and stop jobs in response to load conditions. The combination of these capabilities with parallel pipelining is unique to IBM Information Server and enables IBM Information Server to process data integration tasks faster than any other technology.
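Starting and stopping jobs in response to thresholds can be sketched as a small scaling rule. In the following Python example, the threshold names (`requests_per_instance`, `min_instances`, `max_instances`), their default values, and the rule itself are assumptions made for illustration; the text states only that thresholds are set when services are defined and that jobs start and stop as load changes.

```python
import math


def target_instances(pending_requests, requests_per_instance,
                     min_instances, max_instances):
    """Return how many job instances the current load calls for, within the bounds."""
    needed = math.ceil(pending_requests / requests_per_instance)
    return max(min_instances, min(max_instances, needed))


def rebalance(running, pending_requests, requests_per_instance=10,
              min_instances=1, max_instances=5):
    """Return an (action, count) pair: how many instances to start or stop."""
    target = target_instances(pending_requests, requests_per_instance,
                              min_instances, max_instances)
    if target > running:
        return ("start", target - running)
    if target < running:
        return ("stop", running - target)
    return ("none", 0)
```

For example, with one instance running and 25 pending requests, this rule asks for two more instances. With no pending requests it scales back down but never below the configured minimum.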
Multiple binding support
Virtually any protocol can be made to adhere to SOA principles. IBM Information Server supports this approach, enabling the same service to support multiple protocol bindings, all defined within the WSDL file.
An SOA interface should be able to handle multiple mechanisms (bindings) for calling services. This improves the utility of services and therefore increases the likelihood of reuse and adoption across the enterprise.
Projects for which Web services are not a viable option because of performance or architectural requirements can still leverage the services by using an interface better suited to their requirements. WebSphere Information Services Director can publish the same service using different bindings:
Simple Object Access Protocol (SOAP) over HTTP (Web services)
Any application that complies with XML Web services can invoke a WebSphere Federation Server or WebSphere DataStage integration process as a Web service. These Web services support the generation of literal document-style and SOAP encoded RPC-style Web services.
Enterprise JavaBeans (EJB)
For Java-centric development, WebSphere Information Services Director can generate a J2EE-compliant EJB (stateless session bean) where each data transformation service is instantiated as a separate synchronous EJB method call.
The design does not depend on the binding choice. As logic is built in WebSphere DataStage and WebSphere QualityStage, the designer does not need to be aware of how it will be used. After the service is deployed, additional bindings can easily be implemented without changing the logic.
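The separation between integration logic and binding can be shown with one function and two thin adapters. The following Python sketch is a conceptual stand-in: `call_direct` plays the role of an in-process binding such as an EJB method call, and `call_xml` plays the role of an XML message binding. It is not real SOAP, and all function and element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def standardize_name(name):
    """Integration logic, written once with no knowledge of how it will be called."""
    return " ".join(part.capitalize() for part in name.split())


def call_direct(name):
    """Stand-in for an in-process binding: arguments in, return value out."""
    return standardize_name(name)


def call_xml(request_xml):
    """Stand-in for an XML message binding: parse a request, build a response document."""
    name = ET.fromstring(request_xml).findtext("name")
    response = ET.Element("response")
    ET.SubElement(response, "name").text = standardize_name(name)
    return ET.tostring(response, encoding="unicode")
```

Adding a third adapter would not require touching `standardize_name`, which mirrors the statement that bindings can be added after deployment without changing the logic.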
WebSphere Information Services Director tasks
WebSphere Information Services Director provides an integrated environment for designing services that enables you to rapidly deploy integration logic as services without assuming extensive development skills.
With a simple, wizard-driven interface, in a few minutes you can attach a specific binding and deploy a reusable integration service. WebSphere Information Services Director also provides these features:
v Load-balancing and administrator services for cataloging and registering services
v Shared reporting and security services
v A metadata services layer that promotes reuse of the information services by actually defining what the service does and what information it delivers.
Information providers
An information provider is both the server that contains units that you can expose as services and the units themselves, such as WebSphere DataStage and WebSphere QualityStage jobs or federated SQL queries.
Each information provider must be enabled. To enable these providers, you use WebSphere Information Services Director.
You use the Add Information Provider window to enable information providers that you installed outside of IBM Information Server, such as WebSphere DataStage servers or federated servers.
Creating a project
A project is a collaborative environment that you use to design applications, services, and operations.
All project information that you create by using WebSphere Information Services Director is saved in the common metadata repository so that it can easily be shared among other IBM Information Server components.
You can export a project to back up your work or share work with other IBM Information Server users. The export file includes applications, services, operations, and binding information.
Creating an application
An application is a container for a set of services and operations. An application contains one or more services that you want to deploy together as an Enterprise Archive (EAR) file on an application server.
All design-time activity occurs in the context of applications:
v Creating services and operations
v Describing how message payloads and transport protocols are used to expose a service
v Attaching a reference provider, such as a WebSphere DataStage job or an SQL
Creating an application is a simple task from the Develop navigator menu of the IBM Information Server console. You can also export services from an application before it is deployed and import the services into another application.
You can change the default settings for operational properties when you create an application or later, as Figure 21 shows.
Creating a service
An information service exposes results from processing by information providers such as DataStage servers and federated servers. A deployed service runs on an application server and processes requests from service client applications.
An information service is a collection of operations that are selected from jobs, maps, federated queries, or other information providers. You can group operations in the same information service or design them in separate services.
You create an information service for a set of operations that you want to deploy together. You select a project and an application within the project in the Select a View area, as Figure 22 on page 38 shows.
When you create a service, you specify such options as name, base package name for the classes that are generated during the deployment of the application, and optionally the home Web page and contact information for the service.
After you create the service, you attach a binding for the service:
Simple Object Access Protocol (SOAP) over HTTP
To expose an information service as a Web service, attach the SOAP over HTTP binding to the information service.
Enterprise JavaBeans (EJB) interface
If your service consumers want to access an information service through an EJB interface, attach the EJB binding to the information service.
Deploying applications and their services
You deploy an application on WebSphere Application Server to enable the information services that are contained in the application to receive service requests.
The Deploy Application window in WebSphere Information Services Director guides you through the process, as Figure 23 on page 39 shows.
You can exclude one or more services, bindings, and operations from the deployment, change runtime properties such as minimum number of job instances, or, for WebSphere DataStage jobs, set constant values for job parameters.
WebSphere Information Services Director deploys the Enterprise Archive (EAR) file on the application server.
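The containment relationships described in the preceding task sections (an application holds services, a service groups operations and carries bindings, and a deployment can exclude individual services) can be summarized in a small object model. The following Python sketch is conceptual: the class, method, and sample names are hypothetical, and only the two binding names come from the text.

```python
from dataclasses import dataclass, field

# The two bindings named in the text.
SUPPORTED_BINDINGS = {"SOAP over HTTP", "EJB"}


@dataclass
class Operation:
    name: str
    provider: str  # for example, a job or a federated query


@dataclass
class Service:
    name: str
    operations: list = field(default_factory=list)
    bindings: set = field(default_factory=set)

    def attach_binding(self, binding):
        """Attach one of the supported bindings to this service."""
        if binding not in SUPPORTED_BINDINGS:
            raise ValueError(f"unsupported binding: {binding}")
        self.bindings.add(binding)


@dataclass
class Application:
    name: str
    services: list = field(default_factory=list)

    def services_to_deploy(self, exclude=()):
        """Return the names of the services that a deployment would include."""
        return [s.name for s in self.services if s.name not in exclude]


# Hypothetical application with two services.
app = Application("CustomerServices")
address = Service("AddressService",
                  operations=[Operation("standardize", "DataStage job")])
address.attach_binding("SOAP over HTTP")
app.services.extend([address, Service("DraftService")])
```

Excluding a service at deployment time leaves the design untouched, so the same application can be deployed again later with a different selection.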
SOA and data integration
Enabling an IBM Information Server job as a Web service enables the job to participate in various data integration scenarios.
Data integration enables users to federate heterogeneous data across several data sources. SOA allows WebSphere DataStage jobs to participate in federated queries by using WebSphere Federation Server.
Figure 24 on page 40 shows a business scenario in which a customer service manager needs to integrate information across multiple data stores to address new customer complaints. The manager needs to look at the actual invoice to compare recent shipment data in XML format plus the historical data in the warehouse to ensure that the data is accurate.