Federated Service Oriented
Information Management
Ahmet Sayar
Introduction
n Aim: Develop a general, Grid-architecture-based approach to managing
distributed heterogeneous data, information and knowledge, provided by different repositories and producers, in an efficient and robust manner.
n Challenges in
¨ Representing,
¨ Transforming,
¨ Integrating and
¨ Displaying
of
¨ Data
¨ Information/knowledge
for decision makers in scientific application domains.
n Methodology:
¨ Create “Federated Service Oriented Information Management
architecture” for the GIS domain based on OGC (Open Geospatial Consortium) specifications.
¨ Determine the requirements for the generalization of the architecture for
Motivation
n SOA based on Grid or Web Services
n We use DIKW to describe the hierarchy of Data-Information-Knowledge-Wisdom that we are attempting to support
n “Filter Services” are Information Sources:
¨ A service inputs DIKW from other Grids or Services and outputs DIKW
– perhaps converting data to information etc.
¨ Web Services are easy to extend and federate.
¨ Easy to publish, locate and bind.
¨ Predictable input/output interfaces defined by metadata
n A repository or sensor has or gets DIKW from "outside Grid";
it outputs DIKW; they are “just” filters whose output is Grid compatible DIKW as messages or message streams
n Information management through ASIS (Application Specific
Information System) framework in Science Domains.
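The filter idea above can be illustrated with a minimal Python sketch. It assumes a filter is simply a function from one DIKW message to another; the `Message` type, the two example filters and the `chain` helper are hypothetical names introduced here for illustration and are not part of the architecture's specification.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    level: str       # "data", "information", "knowledge" or "wisdom"
    payload: object  # content carried by the message


# A filter inputs a DIKW message and outputs a DIKW message.
Filter = Callable[[Message], Message]


def parse_readings(msg: Message) -> Message:
    """Hypothetical filter: comma-separated text -> list of floats (still data)."""
    values = [float(x) for x in str(msg.payload).split(",")]
    return Message("data", values)


def summarize(msg: Message) -> Message:
    """Hypothetical filter: converts data into information (a small summary)."""
    values = msg.payload
    return Message("information", {"count": len(values), "max": max(values)})


def chain(filters: List[Filter]) -> Filter:
    """Compose filters so the output of one becomes the input of the next."""
    def run(msg: Message) -> Message:
        for f in filters:
            msg = f(msg)
        return msg
    return run


pipeline = chain([parse_readings, summarize])
result = pipeline(Message("data", "1.5,4.0,2.5"))
```

Because every filter shares the same input and output type, a repository, a sensor wrapper or a visualization step can be placed anywhere in the chain.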
GIS – OGC (Motivation Domain) (1)
n Geographic Information System (GIS) is a system for creating and managing spatial data and associated attributes.
n OGC (Open Geospatial Consortium): The goal is to make geographic information and services neutral and available across any network, application, or platform.
n Challenges (valid for any science domain)
¨ Distributed nature of geospatial data.
¨ Proprietary data formats and service methodologies.
¨ Lack of interoperable services.
¨ Assembling data from distributed sources
¨ Format conversions
GIS – OGC (Motivation Domain) (2)
n GML: Geography Markup Language
n WFS: Web Feature Server
¨ Provides vector data such as rivers, state and city boundaries in GML.
n WCS: Web Coverage Server
¨ Provides coverage (raster) data: gridded data, pixel info.
n WMS: Web Map Server
¨ Provides data in the form of jpeg, svg, png etc., as defined in its capabilities file.
n WMS’: Cascading Web Map Server
¨ Provides data in the form of layers in images. It is
Information Management Architecture
In GIS Domain (Sample Scenario)
[Figure: sample GIS scenario with WFS, WCS, WMS, cascading WMS’ and MD; labels: vector data, raster data, data, capability]
n Query : No Standard – Filter specification –
query on vector data by WFS using SQL
n Data Encodings : GML, images
n Metadata : Structured Capability doc in
XML.
n No event notification – WS-Context for
asynchronous run.
n Registry : WRS – we call it MD.
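As an illustration of a Filter-based query on vector data, the sketch below builds a WFS GetFeature request containing an OGC Filter with Python's standard library. The namespaces are the OGC WFS and Filter Encoding namespaces; the feature type `faults`, the property `name` and the literal value are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

WFS = "http://www.opengis.net/wfs"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("wfs", WFS)
ET.register_namespace("ogc", OGC)


def build_get_feature(type_name: str, prop: str, value: str) -> str:
    """Build a GetFeature request selecting features where prop equals value."""
    root = ET.Element(f"{{{WFS}}}GetFeature", {"service": "WFS", "version": "1.0.0"})
    query = ET.SubElement(root, f"{{{WFS}}}Query", {"typeName": type_name})
    flt = ET.SubElement(query, f"{{{OGC}}}Filter")
    # PropertyIsEqualTo plays the role of a SQL "WHERE prop = value" clause.
    equal = ET.SubElement(flt, f"{{{OGC}}}PropertyIsEqualTo")
    ET.SubElement(equal, f"{{{OGC}}}PropertyName").text = prop
    ET.SubElement(equal, f"{{{OGC}}}Literal").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical feature type and property, for illustration only.
request_xml = build_get_feature("faults", "name", "San Andreas")
```

The resulting XML would be sent to the WFS, which answers with the matching features encoded in GML.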
From Raw Data to Information /
Knowledge
n Raw Data → GML
(WFS in Filter - ASFS)
n GML → Map image
(WMS in Filter - ASVS)
n Each filter provides data in a consistent format.
n Formats should be consistent with the system's data model, GML
n Any Data → Common Data Model
n Data Model is XML based hierarchical data
¨ Portable across
n Languages
n Operating systems
Interactive Decision Support Tools
- Interactive query
- Interactive display, movie and animation
- Integration to Application Science Simulations
Application Use Domains
n ServoGrid Projects (GIS)
¨ Pattern Informatics (PI)
¨ GeoFest
¨ Virtual California (VC)
n Los Alamos National Labs (LANL)
¨ IEISS (The Interdependent Energy Infrastructure Simulation System)
n Models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems.
n Chemistry and Astronomy (Future)
¨ CML (Chemical Markup Language) representation of
Problem Recognition -cont
n Services like discovery and notification do not need to be made application specific.
n BUT if the domain changes, then:
¨ choices,
¨ database requirements,
¨ data format,
¨ core service requirements,
¨ attributes, and
¨ metadata context
CHANGE!
n What are the common concepts and characteristics for
¨ data,
¨ metadata,
¨ query language,
¨ services, and
¨ communication language,
Generalization of Service Oriented
Information Management Architecture
n GIS has some specifications based on standards such as OGC and ISO/TC 211, but many other domains do not.
n GIS → ASIS (Science Domain)
n GML → ASL (Representing)
n WFS → ASFS (Storing-Resource)
n WMS → ASVS (Displaying)
n Capa.xml → Metadata (Integrating)
Generalization - Overall Structure
Solution
n ASL : Application Specific Language. XML based
hierarchical data representation format.
¨ Cross language, platform and operating system
n ASVS : Application Specific Visualization System
¨ Last filter before the decision maker.
¨ Provides information/knowledge in human readable formats
n ASFS : Application Specific Feature Service.
¨ Stores and provides common data model (ASL)
n Treat binary and common data (in ASL) differently.
[Figure: filters ASFS, AS “Sensor”, AS Tool (generic), AS Service (user defined), AS Tool (generic), ASVS and Display]
ASFS and ASVS in SOA
Interfaces, querying, metadata and data
model
ASVS
Routines              Return types
GetCapability         Capability file (XML)
GetVis                Images, svg, png..
GetDataInformation    HTML, Text, XML

ASFS
Routines              Return types
GetCapability         Capability file (XML)
DescribeData          XML-schema
GetData               ASL
n Each routine is published in the WSDL, invoked based on predefined request schema and put into SOAP body.
<request>
…..<GetCapability> </request>
<SOAP:Envelope> …<SOAP:Body> …
…<request>
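A minimal sketch of placing a request into a SOAP body, using only Python's standard library. It assumes SOAP 1.1 (the `http://schemas.xmlsoap.org/soap/envelope/` namespace) and an un-namespaced `<request>` wrapper as on the slide; the actual request schema published in the WSDL is not shown in the source, so the element layout here is illustrative.

```python
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("SOAP", SOAP)


def wrap_in_soap(routine: str) -> str:
    """Wrap <request><routine/></request> in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP}}}Body")
    request = ET.SubElement(body, "request")
    ET.SubElement(request, routine)  # e.g. GetCapability, GetData, GetVis
    return ET.tostring(envelope, encoding="unicode")


soap_message = wrap_in_soap("GetCapability")
```

The same wrapper would carry any of the routines listed for ASVS and ASFS; only the content of `<request>` changes.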
Sample Capabilities File (too simplified) – GIS
Domain
<?xml version='1.0' encoding="UTF-8" standalone="no" ?>
<!DOCTYPE WMT_MS_Capabilities SYSTEM "http://toro.ucs.indiana.edu:8086/xml/capabilities.dtd">
<Capabilities version="1.1.1" updateSequence="0">
  <Service>
    <Name>CGL_Mapping</Name>
    <Title>CGL_Mapping WMS</Title>
    <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
    <ContactInformation> ….. </ContactInformation>
  </Service>
  <Capability>
    <Request>
      <GetCapabilities>
        <Format>WMS_XML</Format>
        <DCPType><HTTP><Get>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
        </Get></HTTP></DCPType>
      </GetCapabilities>
      <GetMap>
        <Format>image/GIF</Format>
        <Format>image/PNG</Format>
        <DCPType><HTTP><Get>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
        </Get></HTTP></DCPType>
      </GetMap>
    </Request>
    <Layer>
      <Name>California:Faults</Name>
      <Title>California:Faults</Title>
      <SRS>EPSG:4326</SRS>
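To show how a client might read such a capabilities file, here is a minimal Python sketch. Since the sample above is cut off, the code embeds a trimmed, properly closed copy of it; the `summarize_capabilities` helper and the shape of its result are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the sample capabilities document (illustrative).
CAPABILITIES = """<Capabilities version="1.1.1" updateSequence="0">
  <Service><Name>CGL_Mapping</Name><Title>CGL_Mapping WMS</Title></Service>
  <Capability>
    <Request>
      <GetMap><Format>image/GIF</Format><Format>image/PNG</Format></GetMap>
    </Request>
    <Layer>
      <Name>California:Faults</Name>
      <Title>California:Faults</Title>
      <SRS>EPSG:4326</SRS>
    </Layer>
  </Capability>
</Capabilities>"""


def summarize_capabilities(xml_text: str) -> dict:
    """Extract the service name, GetMap output formats and layer names."""
    root = ET.fromstring(xml_text)
    return {
        "service": root.findtext("Service/Name"),
        "formats": [f.text for f in root.findall("Capability/Request/GetMap/Format")],
        "layers": [layer.findtext("Name") for layer in root.iter("Layer")],
    }


summary = summarize_capabilities(CAPABILITIES)
```

A client tool could use such a summary to decide which layers and formats it may request from the service.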
Sample Scenario for ASIS
n Static linking of filters. The capability aggregation cycle runs through the “GetCapabilities” interfaces of the filters.
n Each filter publishes its data through its capability file.
n Client tools send a GetCapability request at startup. Later requests are created based on the returned aggregated capabilities.
n Client needs to visualize Data A and E and makes a GetVis request to ASVS with specific attributes for querying. GetVis is defined in a
n Successive requests are made without involving the user. These request chains are created based on filters' capabilities that published
[Figure: Interactive Tools and linked filters serving Data A to F; capabilities aggregate toward the tools (A, B, C, D, E, F); sample requests GetVis(A,E), GetData(A), GetVis(E)]
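The aggregation and request chaining described in this scenario can be sketched in Python as follows. This is a simplified model under stated assumptions: the class names, the in-memory data sets and the string-based "rendering" are hypothetical, and real filters would exchange capability documents and ASL over Web Services instead of calling each other directly.

```python
class FeatureService:
    """Stands in for an ASFS: stores data sets and lists them as capabilities."""

    def __init__(self, datasets):
        self.datasets = dict(datasets)

    def get_capability(self):
        return set(self.datasets)

    def get_data(self, name):
        return self.datasets[name]


class VisualizationService:
    """Stands in for an ASVS that federates several upstream feature services."""

    def __init__(self, sources):
        self.sources = list(sources)
        self.routes = {}  # data set name -> upstream service that provides it

    def get_capability(self):
        # Aggregation cycle: ask every upstream filter what it offers.
        self.routes = {}
        for source in self.sources:
            for name in source.get_capability():
                self.routes.setdefault(name, source)
        return set(self.routes)

    def get_vis(self, names):
        # Each requested data set triggers a GetData call on the right source;
        # "rendering" is reduced to string formatting in this sketch.
        return [f"image({self.routes[n].get_data(n)})" for n in names]


asfs_1 = FeatureService({"A": "data-A", "B": "data-B", "C": "data-C"})
asfs_2 = FeatureService({"E": "data-E", "F": "data-F"})
asvs = VisualizationService([asfs_1, asfs_2])

aggregated = asvs.get_capability()  # done once, at client startup
images = asvs.get_vis(["A", "E"])   # later request, chained to GetData calls
```

The client only talks to the ASVS; the routing table built during aggregation decides which upstream filter receives each GetData call.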
Overall Structure Solution -cont
n Common data (ASL) is kept in ASFS with query capability.
n In a given domain every filter speaks in ASL.
n Filters (ASVS, ASFS) keep their metadata locally.
n ASVS both visualizes information and provides a way of navigating ASFS and their underlying DB.
n ASVS can itself be federated and present an output interface.
n Dynamic metadata update via MD services or P2P metadata exchange.
n Utilizing data/information at the application level via filters
¨ ASFS provide ASL.
¨ ASVS provide human readable information such as text, graphs
(scalable vector (svg) or portable (png)) and images.
¨ Filters have common ports and interfaces
n Enable chaining for more complex data and information creation.
Applicability to
Different Science Domains
n How well do the service definitions in the proposed architecture match general science domains?
Domain      ASL             Filters: ASFS   Filters: ASVS              Metadata
GIS         GML             WFS             WMS                        capability.xml schema
Astronomy   VOTable, FITS   SkyNode         VOPlot, TopCat             VOResource
Chemistry   CML             NO              JChemPaint (NO standard)   NO
Research Issues (1)
n Requirements for the domain metadata in capability
¨ What must the capabilities provide to federate filters?
n Requirements for the ASL (such as CML, GML)
¨ What does ASL need to have to federate the filters?
n Concept of data (such as feature, coverage)
¨ Is a common representation possible? To what extent?
n A common information management framework which can be applied to any domain.
Research Issues (2)
n Application level data/information federation.
n Integrating the system with application science simulations.
n Creating interactive decision support tools utilizing integrated filter services.
¨ Tools for map animation, map movies, images
¨ Interactive query support to get further information on the image and/or animation.
n Enabling binding of services into pipelines with or without human intervention through metadata.
n Caching and load balancing to handle large
Related Work
SRB (Storage Resource Broker)
n SRB
¨ Uniform access to distributed heterogeneous data resources by attributes.
¨ Catalog service is MCAT (Metadata Catalog Service).
¨ Resource and data location transparency.
¨ Remote authentication and authorization – user groups.
¨ Not just for access, but also for transferring and replicating.
¨ Sample projects using SRB: BIRN and IVOA.
n Summary
¨ Other important digital library projects and the NGAS (Next Generation Archive System) from ESO.
¨ We will study these important activities further, identify key architecture ideas and incorporate lessons.
Related Work -cont
OGSA-DAI
n OGSA-DAI
¨ Open Grid Services Architecture – Data Access and Integration.
¨ Access to heterogeneous data via common interfaces on the grid.
¨ Catalog service is MCS (Metadata Catalog Service)
¨ OGSI-compliant Grid.
¨ Components are Grid services. Resources should be registered.
¨ Sample projects using OGSA-DAI: LEAD, MyGrid.
n Summary
¨ OGSA-DAI emphasizes the database layer whereas we are tackling the application specific DIKW.
Contributions
n Instructions on how to build ASL and metadata in capability for the application sciences.
n Instructions on how to build an application specific information system (ASIS) federating multiple filters speaking ASL.
n Information grid (ASIS) formalization through capabilities metadata, defining all the data/information sources as interacting Web Service filters with standard metadata service ports.
n Optimize and enhance the distributed
THANKS
Literature Survey
Discussions on SRB & OGSA-DAI
n SRB
¨ Monolithic – does too much
¨ MCAT dependent
¨ MCAT has limited support for application-level metadata
n Need different metadata for different domains, and extensions for applications
¨ Not standards based – not open source
¨ Does not handle data based on the DIKW hierarchy
n OGSA-DAI
¨ At the data and database level
¨ MCS dependent
¨ MCS has limited support for application-level metadata
n Need different metadata for different domains, and extensions for applications
¨ For Grid applications - GGF standards
Our Work Compared to SRB & OGSA-DAI (1)
n Each filter has its own metadata
¨ Distributed metadata handling
n Peer to peer
n Through MD services
n They provide heterogeneous data access and federation through central metadata services
¨ SRB MCAT and OGSA-DAI MCS
n Our main motivation is sharing and interpreting data and information, and extracting knowledge from them.
n Their motivation is storing, accessing and updating heterogeneous data.
n We leverage their power and usability in our federated service oriented information management architecture.
Our Work Compared to SRB & OGSA-DAI (2)
[Figure: Master SRB and OGSA-DAI GDSF over resources (R), shown alongside linked ASFS and ASVS filters]
Wisdom decisions, knowledge and information extraction by the user
- Reusable components: Filter Services with specific ports and interfaces
- Distributed DIKW abstraction
- Metadata in capability document
- Metadata aggregators
- New metadata for different domains
- Smart data querying
- Web Services based SOA (advantages).
Wisdom decisions, ready to use information and knowledge
- Central data access abstraction. Uniform access to heterogeneous data sources
- Metadata: SRB/MCAT, OGSA-DAI/MCS
- Both provide an extensible metadata architecture for different domains
- SRB has a “zone” concept that addresses similar issues but in
Wisdom decisions
Why are we different
Federated Service Oriented Information
Management
n SOA (Service Oriented Architecture)
¨ Easy to extend
¨ Reusable components
¨ Cross platform and language.
¨ XML based hierarchical data representation
n Easy data integration
n Easy querying
n Human readable information
n Easy to access data – no command line
¨ Interactive tools
¨ On the fly query creation.
n Not only accessing data but also transforming it along its path to end users.
n Ports to integrate application simulations to application
specific information system (ASIS)
¨ Integrating application simulation data/information with ASIS
An Example of Other Domains:
Astronomy Domain (IVOA Standards)
[Figure: filter services FS-1, FS-2 and FS-3 with their databases (DB)]
n FS-1 : VOPlot
¨ Integrating, Interacting
visualization tools
n FS-2 : SkyNode
¨ ADQL based SOAP interface returning VOTable based results
n FS-3 : SIA
¨ 2D sky projection, logically a grid of pixels encoded as a FITS image
n FS-4 : SSA
¨ URL-based returning a dataset "document" (VOTable)
n Query : ADQL – extension of SQL
n Data Encoding: VOTable, FITS
n Metadata : UCD, VOResource
n Event notification : VOEvent
n Registry : VORegistry
n QueryableData in : SSAP and SIAP,