CORBA and Life Sciences
Ulf Leser
Table of Contents
• CORBA in a nutshell
• The Life Science Research Domain Task Force
• The Genome Maps Standard
CORBA
• Common Object Request Broker Architecture
- A reference architecture, not an implementation
- Developed in a community process through OMG
- Object-oriented middleware
• Target
- Easier, more flexible RPCs
- Interoperability of applications over networks
- Language- and platform-independent
- Free specification of interfaces
• Main elements
Similar techniques
• RPC, RFC
- Calling a procedure/function on a remote machine
- Very old
• Enterprise Java Beans
- Also interface-centric
- Also object-oriented
- Pure JAVA
• DCOM
- Also interface-centric
- Somewhat object-oriented
- Pure Microsoft
Object Management Group (OMG)
• Over 800 member organizations - world’s largest software consortium
• Founded April 1989
• Small staff (30 full time); no internal development
• Dedicated to creating and popularizing object-oriented standards for application integration based on existing technology
Object Management Architecture
[Figure: Object Management Architecture; labels: Object Request Broker, Application Objects, Domain Objects, Object Services, Common Facilities, CORBA Object, Legacy Application, Wrapper]
Implementation with CORBA
Interface Definition
Client side:
- Client stub generation
- Bind stubs to (existing) application
- Bind ORB to application
Server side:
- Server skeleton generation
- Implementation of methods
- Start ORB
- Bind stubs to implementation
Code generation
[Figure: code generation. The OMG IDL specification is fed to the IDL compiler, which applies the language mapping and emits stub code for the client and skeleton code for the server. Further labels: Client code, Server code, ORB Library, Client ready to request, Server ready to serve, "Your parts"]
RPCs in CORBA
[Figure: Application - Stub - ORB - IIOP - ORB - Stub - Server - Database; the Request travels from client to server, the Result travels back]
Client:
- Obtain CORBA reference
- Transparent method invocation
Server:
- Manage CORBA objects
- Receive and execute RPCs
ORB:
- Server localisation
- Request propagation
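The request path on this slide can be imitated in a few lines. What follows is a toy, in-process sketch and not a real ORB: `ToyOrb`, `HelloStub` and `HelloServant` are hypothetical names, JSON stands in for IIOP, and object references are plain strings. The operation `get_text` is borrowed from the HelloWorld example in this deck.

```python
import json


class HelloServant:
    """Server side: the object that actually does the work."""

    def get_text(self):
        return "Hello World"


class ToyOrb:
    """Stand-in for an ORB: locates the target object and propagates the request."""

    def __init__(self):
        self._objects = {}  # object reference -> servant

    def register(self, reference, servant):
        self._objects[reference] = servant

    def invoke(self, wire_request):
        # Unmarshal the request, dispatch it to the servant, marshal the result.
        request = json.loads(wire_request)
        servant = self._objects[request["ref"]]
        result = getattr(servant, request["op"])(*request["args"])
        return json.dumps({"result": result})


class HelloStub:
    """Client side: looks like a local object, forwards every call to the ORB."""

    def __init__(self, orb, reference):
        self._orb = orb
        self._reference = reference

    def get_text(self):
        wire_request = json.dumps(
            {"ref": self._reference, "op": "get_text", "args": []}
        )
        return json.loads(self._orb.invoke(wire_request))["result"]


orb = ToyOrb()
orb.register("hello-1", HelloServant())  # server: manage objects
hello = HelloStub(orb, "hello-1")        # client: obtain a reference
text = hello.get_text()                  # client: transparent method invocation
```

The client code never sees the marshalling; that is the sense in which the invocation is "transparent".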
Interface Definition Language
• Defines an interface, not an implementation
• Object-oriented language
- Strongly typed
- Multiple inheritance
- Structs
• Language independent
• Mapped to programming languages:
- JAVA
- C++
- Approx. 20 more
Example: HelloWorld.idl
module Tutorial1 {
    interface HelloWorld {
        string get_text();
    };
};
module defines a naming context: names from this module used outside this module must be referenced as (for example): Tutorial1::HelloWorld
interface defines a CORBA object (class): these objects are available for clients and are implemented by a server
an available method: methods are called by clients to fulfill their requests
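To make the split between interface and implementation concrete, here is a minimal Python analogy of HelloWorld.idl. It is not the official IDL-to-Python mapping: the nested class standing in for the module and the name `HelloWorldImpl` are assumptions made for illustration only.

```python
from abc import ABC, abstractmethod


class Tutorial1:
    """Plays the role of the IDL module: a naming context."""

    class HelloWorld(ABC):
        """Plays the role of the IDL interface: operations only, no behaviour."""

        @abstractmethod
        def get_text(self) -> str:
            ...


class HelloWorldImpl(Tutorial1.HelloWorld):
    """A server-side implementation of the interface (hypothetical name)."""

    def get_text(self) -> str:
        return "Hello World"


# The interface alone cannot be instantiated; only an implementation can.
try:
    Tutorial1.HelloWorld()
    interface_is_instantiable = True
except TypeError:
    interface_is_instantiable = False

greeting = HelloWorldImpl().get_text()
```

Outside the "module", the interface is addressed as `Tutorial1.HelloWorld`, the Python counterpart of the scoped name Tutorial1::HelloWorld.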
CORBA Services
• CORBA Services support development and deployment:
- Naming service: object localization by name
- Trading service: service localization by properties
- Transaction service: ‘2-phase commit protocol’ management
- Query service
- Event service
- Relationship service
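The difference between the first two services, lookup by name versus lookup by properties, can be sketched as follows. Both classes are toy in-memory stand-ins with hypothetical method names; they do not reproduce the interfaces of the OMG Naming or Trading services, and all names and references are placeholders.

```python
class ToyNamingService:
    """Maps a name to exactly one object reference."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, reference):
        if name in self._bindings:
            raise KeyError(f"name already bound: {name}")
        self._bindings[name] = reference

    def resolve(self, name):
        return self._bindings[name]


class ToyTradingService:
    """Finds all offers whose properties match the requested ones."""

    def __init__(self):
        self._offers = []  # list of (reference, properties)

    def export(self, reference, properties):
        self._offers.append((reference, dict(properties)))

    def query(self, **wanted):
        return [
            ref
            for ref, props in self._offers
            if all(props.get(key) == value for key, value in wanted.items())
        ]


naming = ToyNamingService()
naming.bind("maps/source-a", "ref-a")

trader = ToyTradingService()
trader.export("ref-a", {"kind": "map-server", "species": "human"})
trader.export("ref-b", {"kind": "map-server", "species": "mouse"})
trader.export("ref-c", {"kind": "sequence-server", "species": "human"})

by_name = naming.resolve("maps/source-a")
by_properties = trader.query(kind="map-server", species="human")
```

A client that already knows which server it wants uses the name; a client that only knows what kind of service it needs uses the properties.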
The Life Science Research Domain Task Force
Domain Services: LSR
• OMG is organized in “domain services”
- Financial
- Automotive
- Life Science
- Etc.
• Life Science Research (LSR) Task Force since 1997 (www.omg.org/homepages/lsr)
• Task forces have working groups
• Task forces supervise definition and adoption of specifications (i.e., documents)
LSR working groups
• Biomolecular Sequence Analysis (adopted)
• Genomic Maps (adopted)
• Bibliographic Query Service (adopted)
• Macromolecular Structure
• Laboratory Equipment Control Interfaces
• Gene Expression
• Chemical Structure Access and Representation
Standardisation Process
RfI (Request for Information) → RfP (Request for Proposals) → LoI (Letter of Intent) → Proposals → Standard
• Time-consuming process
• OMG Architecture Board committed to:
- Orthogonality
- Cutting-edge technology
• Participation: relatively open
• Submitters must commit themselves to providing implementations
Genomic Maps
• Differences:
- Co-ordinate system
- Ordering
- Object types
• Different species, chromosomes, regions
Scope of “Genome Maps”
• Maps - no sequences
• Access - no calculation or comparison
• Retrieval - no writing
• An interface, not a data model:
- Easy to implement for providers
- Powerful enough for clients
- Covering most types of maps
First Proposal
[Figure: UML class diagram. Recoverable classes with their members: MapObject (database, name, id); Mappable (species, chromosome, type, getMaps()); Segment (length, unit); Point; Clone; Bin; Marker; Map (getNrOfElements(), getAllElements(), getRangeBetweenObjects(), getElementsInSegment()); CytogeneticElement; LinearMap (maxCoordinate, minCoordinate); MapElement; IntervalPosition; PointPosition; RangePosition. Further labels: rank, positionPrecision, leftEnd, rightEnd, frameWorkElement, position, leftFlankingObj, rightFlankingObj. Association labels: onMap, crossReferences, mappedObj. Multiplicities: 1..1, 1..*, 0..*]
Mappable Objects
[Figure: class diagram; labels: MapObject (database, name, id), Mappable (species, chromosome, type, getMaps()), Segment (length, unit), Point]
• Mappables are all objects that can be placed on a map
• Cross-linked to equivalent objects in other databases
• Segments have extent: clones, bands, maps, ...
• Points have no extent: marker, EST, STS, ...
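The class names and attributes on this slide translate naturally into a small type hierarchy. The sketch below uses Python dataclasses; the inheritance chain MapObject → Mappable → Segment / Point is an assumption read off the diagram labels, `cross_references` is a Python spelling of the crossReferences association, and all values are made-up placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MapObject:
    database: str
    name: str
    id: str


@dataclass
class Mappable(MapObject):
    species: str = ""
    chromosome: str = ""
    type: str = ""
    # The same biological object as stored in other databases.
    cross_references: List["Mappable"] = field(default_factory=list)


@dataclass
class Segment(Mappable):
    """A Mappable with extent."""
    length: float = 0.0
    unit: str = ""


@dataclass
class Point(Mappable):
    """A Mappable without extent."""


clone = Segment(database="db-a", name="clone-1", id="a:1",
                type="clone", length=40.0, unit="kb")
marker = Point(database="db-a", name="marker-1", id="a:2", type="marker")
marker_elsewhere = Point(database="db-b", name="marker-1", id="b:7", type="marker")
marker.cross_references.append(marker_elsewhere)
```

Only segments carry `length` and `unit`; a point is fully described by what it inherits from Mappable.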
Maps
• Maps are segments
• Maps can be placed on maps
• Two types:
- Linear maps have a co-ordinate system:
• physical maps
• genetic maps
- Bin maps have only ranges:
• Radiation-hybrid maps
[Figure: class diagram; labels: Segment (length, unit), Map (getNrOfElements(), getAllElements(), getRangeBetweenObjects(), getElementsInSegment()), LinearMap (maxCoordinate, minCoordinate, getScalarRange(), getAround()), Bin; multiplicity label: 1]
MapElement
• MapElement is the assignment of a Mappable to a Map with a Position in a Co-ordinate system
• n:m relationship between Map and Mappable
• Map, MapElement and Mappable could be on
different servers
[Figure: class diagram; labels: VaguePosition, OrderedPosition, MapElement, rank, positionPrecision, IntervalPosition, PointPosition, RangePosition, leftEnd]
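The role of MapElement as the link between a Map and a Mappable can be sketched as below. This is a simplified illustration under stated assumptions: only a point position on a linear map is modelled, `place` is a hypothetical helper, the semantics of `get_elements_in_segment` (closed interval, results in map order) are a guess at what getElementsInSegment() might do, and all coordinates are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mappable:
    name: str


@dataclass
class PointPosition:
    coordinate: float


@dataclass
class MapElement:
    """One placement of one Mappable on one Map."""
    mapped_obj: Mappable
    position: PointPosition


@dataclass
class LinearMap:
    name: str
    unit: str
    elements: List[MapElement] = field(default_factory=list)

    def place(self, mappable, coordinate):
        self.elements.append(MapElement(mappable, PointPosition(coordinate)))

    def get_elements_in_segment(self, left, right):
        """Return the elements positioned in [left, right], in map order."""
        hits = [e for e in self.elements
                if left <= e.position.coordinate <= right]
        return sorted(hits, key=lambda e: e.position.coordinate)


marker_1, marker_2 = Mappable("marker-1"), Mappable("marker-2")

# n:m: one Mappable appears on two maps, and one map holds several Mappables.
physical = LinearMap("physical-example", unit="kb")
genetic = LinearMap("genetic-example", unit="cM")
physical.place(marker_1, 120.0)
physical.place(marker_2, 480.0)
genetic.place(marker_1, 1.5)

names_in_window = [e.mapped_obj.name
                   for e in physical.get_elements_in_segment(100.0, 200.0)]
```

Because the position lives on the MapElement and not on the Mappable, the same marker can sit at 120 kb on one map and at 1.5 cM on another.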
First Proposal
[Figure: the UML class diagram of the first proposal, shown a second time]
Implementation: Wrapping IXDB
• Integrated database: > 30 data sources
• Many different maps available
Experiences - Semantic
• Different semantics:
- Relational database, object-oriented MapIDL
- IXDB.Locus does not exist in MapIDL
- Genes with or without extent
- Cardinalities: IXDB stores many values
- Synonyms
• Not all information in IXDB is representable in MapIDL
Experiences - Technical
• Transient versus persistent references
• Consistency
- Between client and server
- Between CORBA server and database
• Memory management
- Releasing objects
- Multi-copy objects
• Multi-threaded programming
⇒ A first shot is easy, but a “good” implementation is difficult
Interoperability
• Maps are stored in many data sources ...
- GDB
- RHdb
- CEPH
- Hugemap
- IXDB
- XACE
- ....
• Difficult to get an integrated view on all available data
If Standards were used ...
[Figure: a User works with a Map Comparison Application that offers "Choose source: GDB, MGD, RHdb, CEPH" and sends the same call, .getMaps(‘X’), through the ORB to the chosen source]
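The point of the scenario is that the client code does not change when the source changes. A minimal sketch, assuming that the argument of getMaps is a chromosome name: `MapSource`, `InMemorySource` and `maps_from` are hypothetical names, the sources are in-memory stand-ins for remote servers, and their contents are placeholders, not real database entries.

```python
from typing import Dict, List, Protocol


class MapSource(Protocol):
    """The one interface every source would implement (Python stand-in)."""

    def get_maps(self, chromosome: str) -> List[str]:
        ...


class InMemorySource:
    """Stand-in for a remote map server; a real one would sit behind an ORB."""

    def __init__(self, maps_by_chromosome: Dict[str, List[str]]):
        self._maps = maps_by_chromosome

    def get_maps(self, chromosome: str) -> List[str]:
        return list(self._maps.get(chromosome, []))


# Placeholder contents only.
sources: Dict[str, MapSource] = {
    "source-a": InMemorySource({"X": ["a-map-1", "a-map-2"]}),
    "source-b": InMemorySource({"X": ["b-map-1"], "1": ["b-map-2"]}),
}


def maps_from(chosen_source: str, chromosome: str) -> List[str]:
    """Same call, whichever source the user picked."""
    return sources[chosen_source].get_maps(chromosome)


comparison_input = {name: maps_from(name, "X") for name in sources}
```

Adding another source means adding one entry to `sources`; the comparison application itself stays untouched.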
Two Approaches to Interoperability
[Figure: three data sources behind heterogeneous access paths; labels: Data Source (x3), ORB, IDL, HTML (x3), JDBC, Mediator]
• Someone builds an integrating system:
- Typically laborious
- Requires understanding of source data
- Schema and interface evolution
[Figure: three data sources behind a uniform access path; labels: Data Source (x3), ORB, IDL 1 (x3), Mediator]
• Sources provide a standard access method:
- Fixed structure and semantics
- Most problems are shifted from
Integration Obstacles Removed ?
• Semantics & structure
- Documentation, MapIDL
• Data model
- CORBA (IDL->language mapping)
• Access mechanism
- CORBA (IIOP)
• Query capabilities
Obstacles Removed, cont’d
• Data conflicts
- Not resolved
• Data source autonomy
- Source implements and maintains server
• Fuzzy concepts
- Documentation
• Object identification
Conclusions and Open Questions
General Design Problem
• Clients:
- Typed access: no impedance mismatch, no parsing
- Homogeneous structure and semantics
- “Standard” canned queries
⇒ Make it powerful !
• Server:
- Install CORBA (ORB ...)
- Adopt standard semantics
- Implement interface
⇒ Make it simple !
Questions
• Designing a good interface is non-trivial
- Performance:
• Objects versus structs
• Navigation versus queries
- Complexity
• Do we need 5 different position types ?
• Hierarchies ?
- What are the specific needs of potential applications ?
• Map comparison
• Map integration
• Map visualisation
Questions cont’d
• Using CORBA services
- Availability ? For all clients at low cost ?
- Maturity ? Object-by-Value, MOF, POA ?
• Personal opinion
- Naming service: useful
- Query service: useless
- Collection service: too expensive
- Relationship service: too expensive
- Trading service: unclear
- Object-by-Value: wonderful
- POA: essential
Questions cont’d
• Ad-hoc queries ? Against what schema ...
- the IDL ?
• Not possible - IDL is not a data model, no query language
- the schema of the source ? “execQuery( in string query)”
• schema is possibly unknown
• varies from source to source
• sources might not have a schema at all
• sources may change schema
• What is the result ?
- Must be a programming language construct described in IDL
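One way to picture the result problem: a generic query operation can only return a generic container, and the client has to interpret it at run time. The sketch below is an illustration under assumptions, not a proposal from the deck: `Field`, `Row` and `column` are hypothetical, `exec_query` ignores its argument, and the rows are made-up placeholders.

```python
from dataclasses import dataclass
from typing import List, Union

Value = Union[str, int, float]


@dataclass
class Field:
    name: str
    value: Value


Row = List[Field]


def exec_query(query: str) -> List[Row]:
    """Stand-in server: ignores the query text and returns canned, made-up rows."""
    return [
        [Field("name", "marker-1"), Field("position", 120.0)],
        [Field("name", "marker-2"), Field("position", 480.0)],
    ]


def column(rows: List[Row], wanted: str) -> List[Value]:
    """Client-side work that a typed interface avoids: find fields by name at run time."""
    return [f.value for row in rows for f in row if f.name == wanted]


positions = column(exec_query("<source-specific query text>"), "position")
```

Nothing in the types tells the client that a "position" field exists or that it is numeric; a misspelled field name simply yields an empty list.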
Conclusions
• Trade-off: Comprehensiveness versus ease
- Standard as least common denominator ?
- Sufficient power for all applications ?
• Trade-off: Performance, comfort, usability
- Sufficient performance requires caching and structs / OBV
- Caching affects consistency
- Structs are less elegant
- OBV not yet commonly implemented
• Success ?
- Hype has gone: few implementations available
- Performance !
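The tension between caching and consistency named above fits in a few lines. This is a toy sketch with hypothetical classes: `RemoteMarker` stands in for a remote object and counts calls, `CachingProxy` is a naive client-side cache without any invalidation, and the positions are placeholders.

```python
class RemoteMarker:
    """Stand-in for a remote object; every get_position() counts as one remote call."""

    def __init__(self, position):
        self._position = position
        self.remote_calls = 0

    def get_position(self):
        self.remote_calls += 1
        return self._position

    def set_position(self, position):
        # Models a change made on the server side, e.g. a database update.
        self._position = position


class CachingProxy:
    """Client-side cache: fetches once, then answers locally."""

    def __init__(self, remote):
        self._remote = remote
        self._cached = None

    def get_position(self):
        if self._cached is None:
            self._cached = self._remote.get_position()
        return self._cached


remote = RemoteMarker(120.0)
proxy = CachingProxy(remote)

first = proxy.get_position()
second = proxy.get_position()   # served from the cache, no remote call
remote.set_position(125.0)      # server-side change
stale = proxy.get_position()    # the cache still returns the old value
```

The cache saves a remote call on every repeated read, and in exchange the client keeps seeing the old position until something invalidates it.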
Literature
• L. Wang, P. Rodriguez-Tome, N. Redaschi, P. McNeil, A. Robinson and P. Lijnzaad. Accessing and distributing EMBL data using CORBA (common object request broker architecture). Genome Biology, 1(5), 2000.
• G. Vossen. The CORBA Specification for Cooperation in Heterogeneous Information Systems. 1st Workshop on Cooperative Information Systems; LNCS 1202, Kiel, Germany, 1997.
• S. Baker. CORBA and Databases - Do you really need both ? Object Expert, May 1996.
• E. Barillot, U. Leser, P. Lijnzaad, C. Cussat-Blanc, K. Jungfer, F. Guyon, G. Vaysseix, C. Helgesen and P. Rodriguez-Tome. A Proposal for a Standard CORBA Interface for Genome Maps. Bioinformatics, 15(2), pp. 157-169.
• http://www.omg.org/lsr/
• http://corba.ebi.ac.uk/