Business Network Transformation and Semantic Interoperability
Summary
A company’s network of suppliers, distributors, customers, partners and employees is becoming an increasingly important source of competitive advantage. To optimize these networks, we must tackle the semantic interoperability challenge that drives up the costs of integrating computer systems.
Author: David Frankel
Company: SAP
Created on: 2 December 2007
Author Bio
David Frankel is Lead Standards Architect for Model-Driven Systems at SAP Labs. He has over 25 years of experience as a programmer, architect, and technical strategist. He is the author of the book Model-Driven Architecture®: Applying MDA® to Enterprise Computing.
He also is lead editor of the book The MDA Journal. He served several terms as a member of the Architecture Board of the Object Management Group (OMG), the body that manages the MDA standards such as UML and XMI, and he has co-authored a number of industry standards. Recently he has been publishing and speaking about the role of model-driven systems and semantic interoperability in enterprise SOA.
Table of Contents
Introduction
Crossing Subsystem Boundaries
Credit Card Networks
Corporate-Bank Payment Networks
Example: A Debit Transfer Message
The Integration Analyst
The Semantic Interoperability Problem
Semantic Metadata
The Beginnings of Progress
Copyright
Introduction
At SAPPHIRE ’07, SAP CEO Henning Kagermann pointed out a key macroeconomic trend: the growing importance of a company’s network of employees, suppliers, customers, partners, and distributors.
Transforming these relationships and linear value chains into optimized business networks is becoming a primary source of a company’s differentiation and competitive advantage. To support businesses in this new era, software vendors are making large investments in business process platforms based on enterprise SOA.
Crossing Subsystem Boundaries
One of the major challenges to efficient business networking is the staggering cost of integrating computer systems. A key contributor to these costs is the fact that, as information crosses subsystem boundaries while moving through supply chains and networks, information encoded in one message or data format often has to be translated to another format.
Credit Card Networks
Consider a credit card network, which Figure 1 depicts. When a credit card holder swipes the card on a merchant’s card reader or submits a credit card number for a purchase via the Web, information flows from the card reader or Web interface to a point of sale collection server, then to the merchant’s bank (called the “acquirer”), then to the credit card system (such as VisaNet), then to the bank that issued the credit card; and then the information flows back to the credit card network, the acquirer, the point of sale server, and finally back to the card reader or Web application. This round trip takes place typically in five seconds or less, and results in an authorization or a rejection.
This picture makes it evident that the authorization transaction entails information crossing multiple subsystem boundaries, and in both directions due to the round trip. Usually the crossing of a subsystem boundary requires transformation of data from one format to another. The picture reveals four subsystem boundaries, each of which is traversed once in each direction for an authorization.
It would thus seem that at most eight data transformations must be engineered for this credit card network. However, there are approximately 100,000,000 credit card terminals, about 30,000 point of sale servers, several thousand acquirer banks and several thousand banks that issue credit cards. The technologies in use across this large and far-flung community are diverse, requiring the various players in the network to maintain a multiplicity of gateways that can handle different message formats. Some banks maintain dozens of gateways, which consume a huge amount of engineering resources. Furthermore, each of the parties involved in the transaction (that is, the acquirer, the issuer, the credit card system, and so on) often consists not of one system, as the simplistic picture implies, but rather of multiple subsystems. Thus, the aggregate cost to the economy of the message format mapping task for the credit card network is much greater than the picture would immediately suggest.1
Furthermore, corporate credit card customers may also receive feeds of data from the credit card system into their corporate ERP and business intelligence systems to pre-populate expense reports and provide aggregate data that corporate purchasing agents can use to control costs and negotiate bulk discounts with frequent suppliers. This creates additional message mapping requirements.
1 This information about credit card networks is based on Joseph M. Bugajski, Office of the Global CIO, Visa International Service Association, “Response to Payments RFI,” OMG Document finance/04-08-07, 22 August 2004.
Figure 1: Credit Card Network. Components shown: Terminal or Customer-Facing Web Site; POS Server; Merchant’s Bank (“Acquirer”); Credit Card System (e.g. Visa); Credit Card Issuer’s Bank; Corporate Credit Cardholder (ERP Travel & Expense System, Business Intelligence System).
Corporate-Bank Payment Networks
Another financial example is an electronic payments network that allows corporate ERP systems to generate and receive payments via electronic communication with the payee’s and payer’s banks, as Figure 2 illustrates. Once again, the number of subsystem boundaries appears manageable from the picture, but in reality is more problematic.
A Global 1000 company typically has between eight and 15 relationships with various banking institutions.
The largest organizations have to work with over 100 banks, encompassing over 100 message formats and data structures. So corporate ERP users have to maintain multiple, costly bank interfaces. This situation substantially drives up the costs of corporate-bank communication for electronic payments and hampers straight-through processing.2
Figure 2: Corporate-Bank Payments Network. Components shown: Payer’s Corporate ERP System (Accounts Payable); Payer’s Bank; Payee’s Bank; Payee’s Corporate ERP System (Accounts Receivable).
2 Although the need to map disparate data and message formats is the focus of this discussion, it is by no means the only interoperability challenge for corporate-bank payment networks. Different institutions use different communications channels and security protocols. See “Getting Started with UNIFI” by David Frankel and Juergen Weiss, SDN, https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/4c42cc5d-0d01-0010-3f9a-ff2826def994
Example: A Debit Transfer Message
Continuing with our focus on electronic payment networks as an example of business networks, consider an industrial-strength debit transfer message that contains multiple blocks of data, including a block called something like Debtor Party. The Debtor Party block includes multiple data elements, including one that is named something along the lines of Identification_Issuer_Name_Proprietary.3 Conceptually, this data element represents the name of the issuer of a proprietary identification for a debtor party. The element incorporates a number of semantic concepts reflected by the terms identification, issuer, name, proprietary, debtor, and party. In addition to those fine-grained concepts, it incorporates the following coarser-grained concepts that combine the more granular concepts:
• Debtor party
• Proprietary identification
• Issuer name
• Identification issuer
Thus, the data element in question is based on a combination of these semantic concepts.
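One way to picture this decomposition is as a small data structure. The following is a minimal sketch only, not part of any ISO 20022 artifact: the SemanticElement class, the lowercase concept spellings, and the COARSE_CONCEPTS table are hypothetical names chosen for illustration. Splitting the names into terms works here only because the example name happens to be descriptive.

```python
from dataclasses import dataclass

# Hypothetical coarse-grained concepts, each defined as a combination of
# fine-grained concepts (the spellings are an assumption of this sketch).
COARSE_CONCEPTS = {
    "debtor party": {"debtor", "party"},
    "proprietary identification": {"proprietary", "identification"},
    "issuer name": {"issuer", "name"},
    "identification issuer": {"identification", "issuer"},
}


@dataclass(frozen=True)
class SemanticElement:
    """A data element together with the block that contains it."""
    block: str  # e.g. "Debtor Party"
    name: str   # e.g. "Identification_Issuer_Name_Proprietary"

    def fine_concepts(self) -> frozenset:
        # Split the block and element names into their individual terms.
        terms = self.block.split() + self.name.split("_")
        return frozenset(term.lower() for term in terms)

    def coarse_concepts(self) -> frozenset:
        # A coarse concept applies when all of its parts are present.
        fine = self.fine_concepts()
        return frozenset(
            label for label, parts in COARSE_CONCEPTS.items() if parts <= fine
        )


element = SemanticElement("Debtor Party", "Identification_Issuer_Name_Proprietary")
print(sorted(element.fine_concepts()))
print(sorted(element.coarse_concepts()))
```

In this sketch the element yields the six fine-grained terms and all four coarser-grained concepts listed above.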
Of course, there are multiple formats for debit transfers in use within the financial services industry.
When new message formats are issued, system managers do not rip out and replace the applications that use older formats, because doing so would be exceedingly costly. Instead, they build translators to mediate between the new formats and the formats that are native to their existing applications. Furthermore, various domains within the finance industry deal with debits, and each of their associated standards bodies defines different message formats. Thus, multiple message definitions come into play in a single financial transaction.
3 This example is loosely based on one of the ISO 20022 payment messages.
The Integration Analyst
An integration analyst takes on the task of mapping one message format to another. The formats that the analyst must map are often lengthy and complicated. Some messages have scores of complex fields.
Our computer systems provide quite limited help to the analyst in figuring out what the mapping should be.
State-of-the-art data mapping tools display both formats on the screen and allow the analyst to graphically draw connections and write expressions that specify which fields map to which, and according to what rules (see Figure 3). These tools are also good at generating translation code, and they embody a genuine advance over having to write transformation programs in lower-level code. Thus, once the analyst has figured out what the mapping should be and has entered the rules into the tool, the tools do useful things.
However, it is very time-consuming and error-prone for the analyst to figure out how to map one complex format to another. Consequently, subtle mistakes occur in data transformations at subsystem boundaries within the financial networks and cost the involved parties serious money. Even if the analyst has access to good documentation of the message formats, the size and complexity of the formats mean the process is fraught with opportunities for mistakes.
In current practice, application software developers hard-wire their understanding of semantic concepts into programs that construct and deconstruct a message such as a debit transfer. This semantic knowledge is buried deep in the code and is not manifest in any machine-readable metadata. Integration analysts mapping one format to another use whatever documentation is available to determine the semantics encoded into the programs. The mapping tools, however, see only the syntactic format definition and cannot divine the semantics buried in the code, so they can offer no help in steering the analyst toward correct mappings and away from costly errors.
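To make the limitation concrete, here is a minimal sketch of what a set of mapping rules amounts to once the analyst has entered it. All field names, both formats, and the rule layout are hypothetical; no real mapping tool's rule syntax is implied.

```python
from typing import Callable, Dict, Tuple

# Each rule: target field -> (source field, transformation expression).
# All field names below are invented for illustration.
Rule = Tuple[str, Callable[[str], str]]

MAPPING_RULES: Dict[str, Rule] = {
    "DbtrNm": ("Debtor_Name", str.strip),
    "DbtrIdIssr": ("Identification_Issuer_Name_Proprietary", str.upper),
    "Amt": ("Amount", lambda value: f"{float(value):.2f}"),
}


def translate(message: Dict[str, str]) -> Dict[str, str]:
    """Apply the rules mechanically; absent optional fields are skipped."""
    result = {}
    for target, (source, transform) in MAPPING_RULES.items():
        if source in message:
            result[target] = transform(message[source])
    return result


format_1_message = {
    "Debtor_Name": "  Example Corp ",
    "Identification_Issuer_Name_Proprietary": "Example Registry",
    "Amount": "1250.5",
}
print(translate(format_1_message))
```

Nothing in translate would object if the analyst had wired Debtor_Name to the wrong target field. The rules are purely syntactic, which is why errors of meaning pass through undetected.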
Figure 3: A Mapping Tool. Two side-by-side panes list the fields of Debit Transfer Format 1 and Debit Transfer Format 2 (Field 1, Field 2, Field 3, Field 4, …), with an Expression Editor below them.
The Semantic Interoperability Problem
Mistakes in mapping message and data formats are a consequence of a low degree of semantic interoperability. Semantic interoperability is the ability of a set of parties to coordinate their functioning based on a shared understanding of the meaning of the communications that flow among them.
In our debit transfer example, the communications involve the passing of messages, as well as the transformation of those messages from one format to another. The level of semantic interoperability is based on the degree to which various parties in the financial network have a common understanding of the semantics of the data that flows through the network.
The lower the degree of semantic interoperability, the more friction in the system. Studies indicate that a modest reduction in such friction in electronic payment systems could add a percentage point to global GDP.4 Even if such projections are too optimistic, it should still be evident from the enormous costs of integration that moderate gains in semantic interoperability could produce substantial returns.
Thus, in attacking this problem, we do not have to strive for full semantic interoperability. Complete automation of all mapping decisions is probably not attainable, certainly not in the foreseeable future.
Humans will have to be involved in these decisions. Even a goal of, say, 50 percent automation may be too high to shoot for at this time, given the technical challenges and the resulting high cost of achieving such a target. But we can certainly do better than we are doing today, and the evidence indicates that moderate improvements will be well worth the effort.
Semantic Metadata
To begin to understand how to get a handle on the semantic interoperability problem, consider again our state-of-the-art mapping tools. As far as these mapping tools are concerned, our example complex data element simply has an opaque name (Identification_Issuer_Name_Proprietary), a data type (Text), and a cardinality (zero to one). There is no metadata about the underlying semantic concepts that the mapping tool can grab onto in order to provide the integration analyst with guidance as to what should map to what. Message definitions in use in industry today have little if any machine-readable metadata that manifests the underlying semantic concepts. Therefore, whatever degree of semantic interoperability our data processing systems possess relies on hard-wired code, on written documentation if available, and on the best efforts of integration analysts who must wade through the complexity largely unassisted.
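The following minimal sketch illustrates how little such a definition gives a tool to work with. The ElementDefinition class, the compatibility check, and the two target field names are hypothetical and serve only to illustrate the point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementDefinition:
    name: str                  # opaque label as far as the tool is concerned
    data_type: str
    min_occurs: int
    max_occurs: Optional[int]  # None would mean unbounded


def syntactically_compatible(a: ElementDefinition, b: ElementDefinition) -> bool:
    """Compare only type and cardinality; names are opaque, so they are ignored."""
    return (a.data_type, a.min_occurs, a.max_occurs) == (
        b.data_type, b.min_occurs, b.max_occurs
    )


source = ElementDefinition("Identification_Issuer_Name_Proprietary", "Text", 0, 1)
# Hypothetical target fields in another format.
right_target = ElementDefinition("PrtryIdIssrNm", "Text", 0, 1)
wrong_target = ElementDefinition("CdtrStreetNm", "Text", 0, 1)

print(syntactically_compatible(source, right_target))  # True
print(syntactically_compatible(source, wrong_target))  # True
```

Both checks succeed, because every optional Text field looks alike at the syntactic level. The definition alone cannot tell the tool which target is the meaningful one.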
If message definitions for electronic commerce include the right kind of metadata, then each data element essentially carries a semantic map of itself that reveals, in a structured way, the underlying concepts on which the element is based. Tools can use this metadata to provide a reasonable degree of assistance to analysts trying to work out the right data mappings. Such tools can analyze the elements’ internal semantic maps and suggest (not dictate) to the analyst what the proper mappings to other elements might be, at least for a certain percentage of the elements. This approach semi-automates the mapping.
It also is helpful to the analyst to be able to read an element’s internal semantic map as she carries out the non-automated aspects of the mapping process.
This approach is more effective to the extent that data elements’ underlying semantic concepts are reused from libraries of semantic concepts that have been defined and registered by a duly constituted registration authority. The more often the same concepts show up in different elements’ internal semantic maps, the more probable mapping relationships tools can find. Of course, there are issues with getting different communities aligned around a common concept; that is one of the key reasons why, realistically, we strive for modest results at this stage.
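As a rough illustration of how shared concepts could drive suggestions, consider the sketch below. It rests on several assumptions: the element names and concept labels are hypothetical, the semantic maps are modeled simply as sets of registered concept labels, and the overlap score (Jaccard similarity) and the threshold are choices made for this sketch, not something prescribed here.

```python
from typing import Dict, FrozenSet, List, Tuple

ConceptMap = Dict[str, FrozenSet[str]]

# Hypothetical semantic maps: element name -> registered concepts it is based on.
FORMAT_1: ConceptMap = {
    "Identification_Issuer_Name_Proprietary": frozenset(
        {"debtor party", "proprietary identification", "issuer name"}
    ),
    "Debtor_Name": frozenset({"debtor party", "party name"}),
}
FORMAT_2: ConceptMap = {
    "PrtryIdIssr": frozenset(
        {"debtor party", "proprietary identification", "issuer name"}
    ),
    "DbtrNm": frozenset({"debtor party", "party name"}),
    "CdtrNm": frozenset({"creditor party", "party name"}),
}


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Share of concepts two elements have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest(source: ConceptMap, target: ConceptMap,
            threshold: float = 0.3) -> Dict[str, List[Tuple[str, float]]]:
    """Rank candidate target elements for each source element."""
    suggestions = {}
    for s_name, s_concepts in source.items():
        scored = [(t_name, jaccard(s_concepts, t_concepts))
                  for t_name, t_concepts in target.items()]
        # Keep only candidates above the threshold, best match first.
        ranked = sorted((c for c in scored if c[1] >= threshold),
                        key=lambda c: c[1], reverse=True)
        suggestions[s_name] = ranked
    return suggestions


for name, candidates in suggest(FORMAT_1, FORMAT_2).items():
    print(name, "->", candidates)
```

The output is a ranked list of candidates per element rather than a single answer, so the analyst still makes the final decision. In this data, Debtor_Name gets DbtrNm as the strongest candidate and CdtrNm as a weaker one that the analyst would reject.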
4 Joseph M. Bugajski, Office of the Global CIO, Visa International Service Association, “Response to Payments RFI,” OMG Document finance/04-08-07, 22 August 2004.
The Beginnings of Progress
The technologies exist to make limited yet useful advances in semantic interoperability. From the standpoint of technical feasibility, therefore, it is now a matter of engineering to work out how to use these technologies together in a way that scales and provides practical improvements. The UN/CEFACT Core Components Technical Specification (CCTS) initiative and Semantic Web technologies, along with modeling tools, give us the basic toolkit we need to build semantic metadata into our message and data definitions.
Service-oriented systems benefit as much as message-oriented systems from the semi-automation because they pass messages around too. Moreover, the financial domain on which our examples focus is only one of many e-commerce domains suffering from friction that could be partially alleviated.
For a number of reasons, industry is starting to recognize the pain of the semantic interoperability problem as it never has before. In future BPX blogs and papers I will discuss what is driving this new recognition and will look at how it is spawning serious projects in several industrial sectors to leverage new approaches to semantic interoperability. I’ll also examine the leadership role that SAP is taking in these endeavors, both in the standards arena and in its own software development.
Copyright
© Copyright 2007 SAP AG. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.