Digital Technical Journal
Digital Equipment Corporation
Cover Design
Transaction processing is the common theme for papers in this issue. The automatic teller machine on our cover represents one of the many businesses that rely on TP systems. If we could look behind the familiar machine, we would see the products and technologies - here symbolized by linked databases - that support reliable and speedy processing of transactions worldwide.
The cover was designed by Dave Bryant of Digital's Media Communications Group.
Editorial
Jane C. Blake, Editor
Kathleen M. Stetson, Associate Editor
Circulation
Catherine M. Phillips, Administrator
Suzanne J. Babineau, Secretary
Production
Helen L. Patterson, Production Editor
Nancy Jones, Typographer
Peter Woodbury, Illustrator
Advisory Board
Samuel H. Fuller, Chairman
Richard W. Beane
Robert M. Glorioso
Richard J. Hollingsworth
John W. McCredie
Alan G. Nemeth
Mahendra R. Patel
F. Grant Saviers
Robert K. Spitz
Victor A. Vyssotsky
Gayn B. Winters
The Digital Technical Journal is published quarterly by Digital Equipment Corporation, 146 Main Street MLO1-3/B68, Maynard, Massachusetts 01754-2571. Subscriptions to the Journal are $40.00 for four issues and must be prepaid in U.S. funds. University and college professors and Ph.D. students in the electrical engineering and computer science fields receive complimentary subscriptions upon request. Orders, inquiries, and address changes should be sent to The Digital Technical Journal at the published-by address. Inquiries can also be sent electronically to DTJ@CRL.DEC.COM. Single copies and back issues are available for $16.00 each from Digital Press of Digital Equipment Corporation, 12 Crosby Drive, Bedford, MA 01730-1493.
Digital employees may send subscription orders on the ENET to RDVAX::JOURNAL or by interoffice mail to mailstop MLO1-3/B68. Orders should include badge number, cost center, site location code, and address. All employees must advise of changes of address.
Comments on the content of any paper are welcomed and may be sent to the editor at the published-by or network address.
Copyright © 1991 Digital Equipment Corporation. Copying without fee is permitted provided that such copies are made for use in educational institutions by faculty members and are not distributed for commercial advantage. Abstracting with credit of Digital Equipment Corporation's authorship is permitted. All rights reserved.
The information in this journal is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this journal.
ISSN 0898-901X
Documentation Number EY-F588E-DP
The following are trademarks of Digital Equipment Corporation: DEC, DECforms, DECintact, DECnet, DECserver, DECtp, Digital, the Digital logo, LAT, Rdb/VMS, TA, VAX ACMS, VAX CDD, VAX COBOL, VAX DBMS, VAX Performance Advisor, VAX RALLY, VAX Rdb/VMS, VAX RMS, VAX SPM, VAX SQL, VAX 6000, VAX 9000, VAXcluster, VAXft, VAXserver, VMS.
IBM is a registered trademark of International Business Machines Corporation.
TPC Benchmark is a trademark of the Transaction Processing Performance Council.
Contents
Foreword
Carlos G. Borgialli
Transaction Processing, Databases, and Fault-tolerant Systems
DECdta - Digital's Distributed Transaction Processing Architecture
Philip A. Bernstein, William T. Emberton, and Vijay Trehan
Digital's Transaction Processing Monitors
Thomas G. Speer and Mark W. Storm
Transaction Management Support in the VMS Operating System Kernel
William A. Laing, James E. Johnson, and Robert V. Landau
Performance Evaluation of Transaction Processing Systems
Walter H. Kohler, Yun-Ping Hsu, Thomas K. Rogers, and Wael H. Bahaa-El-Din
Tools and Techniques for Preliminary Sizing of Transaction Processing Applications
William Z. Zahavi, Frances A. Habib, and Kenneth J. Omahen
Database Availability for Transaction Processing
Ananth Raghavan and T. K. Rengarajan
Designing an Optimized Transaction Commit Protocol
Peter M. Spiro, Ashok M. Joshi, and T. K. Rengarajan
Editor's Introduction
Jane C. Blake
Editor
Digital's transaction processing systems are integrated hardware and software products that operate in a distributed environment to support commercial applications, such as bank cash withdrawals, credit card transactions, and global trading. For these applications, data integrity and continuous access to shared resources are necessary system characteristics; anything less would jeopardize the revenues of business operations that depend on these applications. Papers in this issue of the Journal look at some of Digital's technologies and products that provide these system characteristics in three areas: distributed transaction processing, database access, and system fault tolerance.
Opening the issue is a discussion of the architecture, DECdta, which ensures reliable interoperation in a distributed environment. Phil Bernstein, Bill Emberton, and Vijay Trehan define some transaction processing terminology and analyze a TP application to illustrate the need for separate architectural components. They then present overviews of each of the components and interfaces of the distributed transaction processing architecture, giving particular attention to transaction management.
Two products, the ACMS and DECintact monitors, implement several of the functions defined by the DECdta architecture and are the twin topics of a paper by Tom Speer and Mark Storm. Although based on different implementation strategies, both ACMS and DECintact provide TP-specific services for developing, executing, and managing TP applications. Tom and Mark discuss the two strategies and then highlight the functional similarities and differences of each monitor product.
The ACMS and DECintact monitors are layered on the VMS operating system, which provides base services for distributed transaction management. Described by Bill Laing, Jim Johnson, and Bob Landau, these VMS services, called DECdtm, are an addition to the operating system kernel and address the problem of integrating data from multiple systems and databases. The authors describe the three DECdtm components, an optimized implementation of the two-phase commit protocol, and some VAXcluster-specific optimizations.
The next two papers turn to the issues of measuring TP system performance and of sizing a system to ensure a TP application will run efficiently. Walt Kohler, Yun-Ping Hsu, Tom Rogers, and Wael Bahaa-El-Din discuss how Digital measures and models TP system performance. They present an overview of the industry-standard TPC Benchmark A and Digital's implementation, and then describe an alternative to benchmark measurement: a multilevel analytical model of TP system performance that simplifies the system's complex behavior to a manageable set of parameters. The discussion of performance continues but takes a different perspective in the paper on sizing TP systems. Bill Zahavi, Fran Habib, and Ken Omahen have written about a methodology for estimating the appropriate system size for a TP application. The tools, techniques, and algorithms they describe are used when an application is still in its early stages of development.
High performance must extend to the database system. In their paper on database availability, Ananth Raghavan and T. K. Rengarajan examine strategies and novel techniques that minimize the effects of downtime situations. The two databases referenced in their discussion are the VAX Rdb/VMS and VAX DBMS systems. Both systems use a database kernel called KODA, which provides transaction capabilities and commit processing. Peter Spiro, Ashok Joshi, and T. K. Rengarajan explain the importance of commit processing relative to throughput and describe new designs for improving the performance of group commit processing. These designs were tested, and the results of these tests and the authors' observations are presented.
Biographies
Carlos Alonso A principal software engineer, Carlos Alonso is a team leader for the project to port the System-V operating system to the VAXft 3000. Previously, he was the project leader for various VAXft 3000 system validation development efforts. As a member of the research group, Carlos developed the test bed for evaluating concurrency control algorithms using the VMS Distributed Lock Manager, and he designed the prototype alternate lock rebuild algorithm for cluster transitions. He holds a B.S.E.E. (1979) from Tulane University and an M.S.C.S. (1980) from Boston University.
Wael Hilal Bahaa-El-Din Wael Bahaa-El-Din joined Digital in 1987 as a senior consultant to the Systems Performance Group, Database Systems. He has led a number of studies to evaluate the performance of database and transaction processing systems under response time constraints. After receiving his Ph.D. (1984) in computer and information science from Ohio State University, Wael spent three years as an assistant professor at the University of Houston. He is a member of ACM and IEEE, and he has written numerous articles for professional journals and conferences.
Philip A. Bernstein As a senior consultant engineer, Philip Bernstein is both an architectural consultant in the Transaction Processing Systems Group and a researcher at the Cambridge Research Laboratory. Prior to joining Digital in 1987, he was a professor at Wang Institute of Graduate Studies and at Harvard University, a vice president at Sequoia Systems, and a researcher at the Computer Corporation of America. He has published over 60 papers and coauthored two books. Phil received a B.S. (1971) in engineering from Cornell University and a Ph.D. (1975) in computer science from the University of Toronto.
William F. Bruckert William Bruckert is a consulting engineer who joined Digital in 1969 after receiving a B.S.E.E. degree from the University of Massachusetts. He received an M.S.E.E./C.E. degree from the same university in 1981. Beginning as a worldwide product support engineer, Bill later worked on a number of DECsystem-10/20 designs. He developed the cache, memory, and I/O subsystem for the VAX 8600 processor and was the system architect of the VAX 8650 processor. His most recent role was as the architect of the VAXft 3000 system. Bill currently holds seven patents.
William T. Emberton As a principal software engineer, William Emberton is currently involved in the development of Queue Management Architecture. He is also involved in X/Open and POSIX TP Standards work and is a member of the team that is developing the overall DECtp product architecture. Previously, he worked on the initial versions of the DECdta architecture. Before coming to Digital in 1987, Bill held positions as Director of Software Development at National Semiconductor and Manager of Systems Development for International Retail Systems at NCR. He was educated at London University.
Frances A. Habib Fran Habib is a principal software engineer involved with the development of transaction processing workload characterization and sizing tools. Previously, Fran worked at Data General and GTE Laboratories as a management science consultant. She holds an M.S. in operations research from MIT and a B.S. in engineering and applied science from Harvard. Fran is a full member of ORSA and belongs to ACM, IEEE, and the ACM SIGMETRICS special interest group on modeling and performance evaluation of computer systems.
Yun-Ping Hsu Yun-Ping is currently a principal software engineer in the Transaction Processing Systems Performance and Characterization Group. He joined Digital in October 1987, after receiving his master's degree in electrical and computer engineering from the University of Massachusetts at Amherst. In his position, Yun-Ping is responsible for performance modeling and benchmark measurement of both ACMS- and DECintact-based TP systems. He also participated in the TPC Benchmark A standardization activity during 1989. He is a member of ACM and IEEE.
James E. Johnson A consulting software engineer, Jim Johnson has worked for the VMS Engineering Group since joining Digital in 1984. He is currently a project leader for VMS Engineering in Europe. Prior to this work, Jim led the RMS project, and after relocating to the UK three years ago, he was responsible for much of the design and implementation of the DECdtm services. At the same time, Jim was an active participant in the transaction management architecture review group. He has applied for a patent pertaining to the two-phase commit protocol optimization currently used in DECdtm services.
TP benchmark standards activities. Before joining Digital in 1988, Walt was a visiting scientist and technical consultant to Digital and a professor of electrical and computer engineering at the University of Massachusetts at Amherst. He holds B.S., M.S., and Ph.D. degrees in electrical engineering, all from Princeton University. Walt recently received the IEEE/CS Meritorious Service Award, and he has published over 25 technical articles.
William A. Laing William Laing is a senior consultant engineer based in Newbury, England. He is the technical leader for production systems support for the VMS operating system. During five years spent in the U.S., Bill was responsible for the design and initial development of symmetrical multiprocessing support in the VMS system. He joined Digital in 1981, after doing research on operating systems at Edinburgh University for nine years. Bill holds a B.Sc. (1972) in mathematics and computer science and an M.Phil. (1976) in computer science, both from Edinburgh University.
Robert V. Landau Principal software engineer Robert Landau is a member of the VMS Engineering Group, based in Newbury, England. He is currently the project leader of a VMS advanced development team investigating a high-performance, transaction-based, flat file system. Before joining Digital in 1987, Bob worked for a variety of software houses specializing in database-related products. He studied botany at London University and, subsequently, obtained a teaching qualification from Hereford College.
James M. Melvin As a principal design engineer, Jim was responsible for the specification of hardware error-handling mechanisms in the VAXft system and is presently an engineering project leader for future VAXft systems. He also specified and led the implementation of the hardware system simulation platform and the hardware design verification test plan. Jim joined Digital in 1984 and holds a B.S.E.E. (1984) and an M.S. (1989) in engineering management from Worcester Polytechnic Institute. He holds three patents on the VAXft 3000 system, all related to error handling in a fault-tolerant system.
Kenneth J. Omahen A principal engineer, Kenneth Omahen is developing object-oriented queuing network solvers. He designed a variety of performance tools and performed design support studies which influenced a number of Digital products. Prior to joining Digital, Ken worked at Bell Telephone Laboratories, lectured at the University of Newcastle-Upon-Tyne, and was a faculty member at Purdue University. He received a B.S. degree in science engineering from Northwestern University and M.S. and Ph.D. degrees in information sciences from the University of Chicago.
Ananth Raghavan Since joining Digital in 1988, Ananth Raghavan has been a software engineer who has led projects for the KODA/Rdb Group. Previous to this position, he was a teaching assistant in the computer science department of the University of Wisconsin. Ananth holds a B.S. (1985) degree in mechanical engineering from the Indian Institute of Technology, Madras, and an M.S. (1987) degree in computer science from the University of Wisconsin, Madison. He has two patent applications pending for his work on undo and undo/redo database algorithms.
T. K. Rengarajan T. K. Rengarajan has been a member of the Database Systems Group since 1987 and works on the KODA software kernel for database management systems. He is involved in the support for WORM devices and global buffer management in the VAXcluster environment. His work in the areas of boundary element methods and database management systems is reported in several published papers and patent applications. Ranga holds an M.S. degree in computer-aided design from the University of Kentucky and an M.S. in computer science from the University of Wisconsin.
Thomas K. Rogers Thomas Rogers is a project leader for the Transaction Processing Systems Performance and Characterization Group. He is responsible for testing the VAX 9000 Model 210 system using the TPC Benchmark A standard. Prior to joining Digital in January 1988, Tom worked for Sperry Corporation as a technical specialist for the Northeast region. He received a bachelor of science degree in mathematical sciences in 1979 from Johns Hopkins University.
Thomas G. Speer As a principal software engineer in the DECtp/East Engineering Group, Thomas Speer is currently leading the DECintact V2.0 project. In this position, his major responsibility is defining the requirements for DECintact support of DECdtm services, client/server database access, and support for the DECforms product. Since joining Digital in 1981, Tom has worked on several development projects, including FORTRAN-10/20 and RMS-20. He holds degrees from Harvard University, Rutgers University, and Simmons College. He is a member of Phi Beta Kappa.
TP products for more than ten years. Currently, he is acting technical director for the East Coast Transaction Processing Engineering Group, as well as managing a small advanced development group. After joining Digital in 1976, Mark worked on COBOL compilers for the PDP-11 systems and developed the first native COBOL compiler for the VAX computer. He holds a B.S. (with honors) in computer science from the University of Southern Mississippi.
Vijay Trehan Since joining Digital in 1978, Vijay Trehan has contributed to several architecture projects. He is the technical director responsible for DECtp architecture, design, and standards work. Prior to this assignment, Vijay was the architect for the DECdtm protocol, architect for the DDIS data interchange format, and initiator of work on the DDIF document interchange format and compound document strategy. He holds a B.S. (1972) in mechanical engineering from the Indian Institute of Technology and an M.S. (1974) in operations research from Syracuse University.
William Z. Zahavi As an engineering manager, Bill is responsible for the design and development of predictive sizing tools for transaction processing applications. Before joining Digital in 1987, he was a technical consultant for Sperry Corporation, specializing in systems performance analysis and capacity planning. Bill received an M.B.A. from Northeastern University and a B.S. in mathematics from the University of Virginia. He is an active member of the Computer Measurement Group and frequently presents at CMG conferences.
Foreword
Carlos G. Borgialli
Senior Manager, DECtp Software Engineering
Transaction processing is one of the largest, most rapidly growing segments of the computer industry. Digital's strategy is to be a leader in transaction processing, and toward that end we are making technological advances and delivering products to meet the evolving needs of businesses that rely on transaction processing systems.
Because of the speed and reliability with which transaction processing systems capture and display up-to-date information, they enable businesses to make well-informed, timely decisions. Industries for which transaction processing systems are a significant asset include banking, laboratory automation, manufacturing, government, and insurance. For these industries and others, transaction processing is an information lifeline that supports the achievement of daily business objectives and in many instances provides a competitive advantage.

Many older transaction processing systems on which businesses rely are centralized and tied to a particular vendor. A great deal of money and time has been invested in these systems to keep pace with business expansion. As expansion continues beyond geographic boundaries, however, the centralized, single-vendor transaction processing systems are less and less likely to offer the flexibility needed for round-the-clock, reliable business operations conducted worldwide. Transaction processing technology therefore must evolve to respond to the new business environment and at the same time protect the investment made in existing systems.
Our research efforts and innovative products provide the transaction processing systems that businesses need today. The demand for distributed rather than centralized systems has focused attention on system management. Queuing services, highly available systems, heterogeneous environments, security services, and computer-aided software engineering (CASE) are a few examples of areas in which research and advanced development efforts have had and will continue to have a major impact on the capabilities of transaction processing systems.
Transaction processing solutions require the application of a wide range of technology and the integration of multiple software and hardware products: from desktop to mainframe; from presentation services and user interfaces to TP monitors, database systems, and computer-aided software engineering tools; from optimization of system performance to optimization of availability. Making all of this technology work well together is a great challenge, but a challenge Digital is uniquely positioned to meet.
Digital ensures broad application of its transaction processing technology by defining an architecture, the Digital Distributed Transaction Architecture (DECdta). DECdta, about which you will read in this issue, defines the major components of a Digital TP system and the way those components can form an integrated transaction processing system. The DECdta architecture describes how data and processing are easily distributed among multiple VAX processors, as well as how the components can interoperate in a heterogeneous environment.

The DECdta architecture is based on the client/server computing model, which allows Digital to apply its traditional strengths in networking and expandability to transaction processing system solutions. In the DECdta client/server computing model, the client portion interacts with the user to create processing requests, and the server portion performs the data manipulation and computation to execute the processing request. This computing model facilitates the division of a TP system into small components in three ways. It allows for distribution of functions among VAX processors; it partitions the work performed by one or more of the components to allow for parallel processing; or it replicates functions to achieve higher availability goals. These options permit the customer to purchase the configuration that meets present needs, confident that the system will allow smooth expansion in the future.
coordinated manner. It provides for the cooperation and interoperation of components implemented on different platforms, and it supports the expansion of customer applications to meet growth requirements. The DECdta architecture is designed to work with other Digital architectures such as the Digital Network Architecture (DNA), the network application services (NAS), and the Digital database architecture (DDA). Moreover, the DECdta architecture supports industry standards that enable the portability of applications and their interoperation in a heterogeneous environment, such as the standard application programming interfaces being developed by the X/Open Transaction Processing Working Group and the IEEE POSIX. Standard wire protocols that provide for systems interoperation in a multivendor, heterogeneous environment are being developed by the International Standards Organization as part of the Open Systems Interconnection activities.
Among the products Digital has developed specifically for TP systems are the TP monitors. These monitors provide the system integration "glue," if you will. Rather than act as their own systems integrators, customers who use Digital's TP monitors are able to spend more time on solving business problems and less time on solving software integration problems, such as how to make forms and database products work together smoothly.
Digital's TP monitors run on all types of hardware configurations, including local area networks (LANs), wide area networks (WANs), and VAXcluster systems. The DECdta client/server computing model provides the necessary flexibility to change hardware configurations, thus allowing reconfiguration without the need for any source code changes.
The two TP monitors, DECintact and VAX ACMS, integrate vital Digital technologies such as the Digital Distributed Transaction Manager (DECdtm) and products such as Digital's forms systems (DECforms) and our Rdb/VMS or VAX DBMS database products. DECdtm uses the two-phase commit protocol to solve the complex problem of coordinating updates to multiple data resources or databases.
Major developments in Digital's database products have enhanced the strengths of its overall product offerings. The two mainstream database products noted above, Rdb/VMS and VAX DBMS, layer on top of a database kernel called KODA, thus providing data access independent of any data model. The services made available by KODA, besides its high performance, allow Digital's database products to efficiently support TP applications as well as to provide rich functionality for general-purpose database applications.
For those TP systems that require u ser i nter faces, DECforms provides a device-independent, easy-to-use human interface and perm its t he sup port of mult iple devices and users within a single appl icat ion.
TP systems that require high availability or continuous operations are supported by the VAX family of hardware and software. The introduction of the fault-tolerant VAXft 3000 system, added to the successful VAXcluster system, allows for a high level of system availability. Performance needs also are being met by a combination of hardware resources, including the VAX 9000 system.
This combination of architecture, software, and hardware technology, and support for emerging industry standards places Digital in an excellent position to become the industry leader for distributed, portable transaction processing systems. The papers in this issue of the Journal provide a view of the key elements of Digital's distributed transaction processing technologies.
Many individuals, teams, organizations, and business partners are responsible for bringing Digital's TP vision to fruition. Their dedication, hard work, and creativity will continue to drive the development of new technologies that enhance our family of products and services.
Philip A. Bernstein William T. Emberton Vijay Trehan
DECdta - Digital's Distributed Transaction Processing Architecture
Digital's Distributed Transaction Processing Architecture (DECdta) describes the modules and interfaces that are common to Digital's transaction processing (DECtp) products. The architecture allows easy distribution of DECtp products. In particular, it supports client/server style applications. Distributed transaction management is the main function that ties DECdta modules together; it ensures that application programs, database systems, and other resource managers interoperate reliably in a distributed system.
Transaction processing (TP) is the activity of executing requests to access shared resources, typically databases. A computer system that is configured to execute TP applications is called a TP system.
A transaction is an execution of a set of operations on shared resources that has the following properties:
• Atomicity. Either all of the transaction's operations execute, or the transaction has no effect at all.
• Serializability. The set of all operations that execute on behalf of the transaction appears to execute serially with respect to the set of operations executed by every other transaction.
• Durability. The effects of the transaction's operations are resistant to failures.
A transaction terminates by executing the commit or abort operation. Commit tells the system to install the effect of the transaction's operations permanently. Abort tells the system to undo the effects of the transaction's operations.
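The commit and abort operations can be illustrated with a minimal sketch. The `Transaction` class and `transfer` function below are hypothetical names chosen for illustration and are not part of any DECtp product. The sketch assumes a single in-memory dictionary as the shared resource and shows atomicity only; it does not model serializability or durability.

```python
class Transaction:
    """Toy transaction over an in-memory dict (illustrative assumption only)."""

    def __init__(self, store):
        self.store = store    # the shared resource
        self.writes = {}      # buffered updates, not installed until commit
        self.done = False

    def read(self, key):
        # A transaction sees its own pending writes first.
        if key in self.writes:
            return self.writes[key]
        return self.store.get(key)

    def write(self, key, value):
        if self.done:
            raise RuntimeError("transaction already terminated")
        self.writes[key] = value

    def commit(self):
        # Commit: install the effects of all operations.
        self.store.update(self.writes)
        self.done = True

    def abort(self):
        # Abort: discard the effects of all operations.
        self.writes.clear()
        self.done = True


def transfer(store, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither."""
    txn = Transaction(store)
    balance = txn.read(src) or 0
    if balance < amount:
        txn.abort()
        return False
    txn.write(src, balance - amount)
    txn.write(dst, (txn.read(dst) or 0) + amount)
    txn.commit()
    return True
```

Buffering writes until commit is only one way to obtain the all-or-nothing effect; a system could equally apply updates in place and keep an undo log for abort.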
For enhanced reliability and availability, a TP application uses transactions to execute requests. That is, the application receives a request message (from a display, computer, or other device), executes one or more transactions to process the request, and possibly sends a reply to the originator of the request or to some other party specified by the originator.
TP applications are essential to the operation of many industries, such as finance, retail, health care, transportation, government, communications, and manufacturing. Given the broad range of applications of TP, Digital offers a wide variety of products with which to build TP systems.
DECtp is an umbrella term that refers to Digital's TP products. The goal of DECtp is to offer an integrated set of hardware and software products that supports the development, execution, and management of TP applications for enterprises of all sizes.
DECtp systems include software components that are specialized for TP, notably TP monitors such as the ACMS and DECintact TP monitors, and transaction managers such as the DECdtm transaction manager. DECtp systems also require the integration of general-purpose hardware products (processors, storage, communications, and terminals) and software products (operating systems, database systems, and communication gateways). These products are typically integrated as shown in Figure 1.
Figure 1 Layering of Products to Support a TP Application (layers, top to bottom: TP application; TP monitor, database systems, and forms manager; operating system and communication system)
Applications on DECtp systems can be designed using a client/server paradigm. This paradigm is especially useful for separating the work of preparing a request from that of running transactions. Request preparation can be done by a front-end system, that is, one that is close to the user, in which processor cycles are inexpensive and interactive feedback is easy to obtain. Transaction execution can be done by a larger back-end system, that is, one that manages large databases and may be far from the user. Back-end systems may themselves be distributed. Each back-end system manages a portion of the enterprise database and executes applications, usually ones that make heavy use of the database on that back end. DECtp products are modularized to allow easy distribution across front ends and back ends, which enables them to support client/server style applications. DECtp systems thereby simplify programming and reconfiguration in a distributed system.
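The division of labor between front ends and distributed back ends can be sketched as follows. The `FrontEnd` and `BackEnd` classes and the partitioning rule are assumptions made for this illustration; the architecture as described here does not prescribe how the enterprise database is divided among back ends.

```python
class BackEnd:
    """Manages one portion of the database and runs transactions on it."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def execute(self, account, delta):
        # Stand-in for a transaction that updates this back end's data.
        self.data[account] = self.data.get(account, 0) + delta
        return self.data[account]


class FrontEnd:
    """Prepares requests near the user and sends each to the owning back end."""

    def __init__(self, back_ends):
        self.back_ends = back_ends

    def route(self, account):
        # Hypothetical partitioning rule: a deterministic byte sum of the key,
        # modulo the number of back ends.
        return self.back_ends[sum(account.encode()) % len(self.back_ends)]

    def submit(self, account, delta):
        return self.route(account).execute(account, delta)
```

Because the routing rule lives only in the front end in this sketch, adding or moving back ends changes configuration in one place rather than in each application program.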
Digital's Distributed Transaction Processing Architecture (DECdta) defines the modularization and distribution structure that is common to DECtp products. Distributed transaction management is the main function that ties this structure together. This paper describes the DECdta structure and explains how DECdta components are integrated by distributed transaction management.
Current versions of DECtp p roducts imp lement most, but not all, modu les and inte rfaces in the DECdta architectur e . Gaps between the architec ture and products will be fi l led over time. D ECtp products that current ly imp lement DECd ta compo nents are referenced throughou t the paper.
TP Application Structure
By analyzing TP applications, we can see where the need arises for separate DECdta components. A typical TP application is structured as follows:
Step 1: The client application interacts with a user (a person or machine) to gather input, e.g., using a forms manager.
Step 2: The client maps the user's input into a request, that is, a message that asks the system to perform some work. The client sends the request to a server application to process the request.
A request may be direct or queued. If direct, the client expects a server to process the request right away. If queued, the client deposits the request in a queue from which a server can dequeue the request later.
Digital Technical Journal Vol. 3 No. 1 Winter 1991
Step 3: A server processes the request by executing one or more transactions. Each transaction may
a. Access multiple resources
b. Call programs, some of which may be remote
c. Generate requests to execute other transactions
d. Interact with a user
e. Return a reply when the transaction finishes
Step 4: If the transaction produces a reply, then the client interacts with the user to display that reply, e.g., using a forms manager.
Each of the above steps involves the interaction of two or more programs. In many cases, it is desirable that these programs be distributed. To distribute them conveniently, it is important that the programs be in separate components. For example, consider the following:
• The presentation service that operates the display and the application that controls which form to display may be distributed.
One may want to off-load presentation services and related functions to front ends, while allowing programs on back ends to control which forms are displayed to users. This capability is useful in Steps 1, 3d, and 4 above to gather input and display output. To ensure that the presentation service and application can be distributed, the presentation service should correspond to a separate DECdta component.
• The client application that sends a request and the server application that processes the request may be distributed. The applications may communicate through a network or a queue.
In Step 2, front-end applications may want to send requests directly to back-end applications or to place requests in queues that are managed on back ends. Similarly, in Step 3c, a transaction, T, may enqueue a request to run another transaction, where the queue resides on a different system than T. To maximize the flexibility of distributing request management, request management should correspond to a separate DECdta component.
• Two transaction managers that want to run a commit protocol may be distributed.
For a transaction to be distributed across different systems, as in Step 3b, the transaction management services must be distributed. To ensure that each transaction is atomic, the transaction managers on these systems must control transaction commitment using a common commit protocol. To complicate matters, there is more than one widely used protocol for transaction commitment. To the extent possible, a system should allow interoperation of these protocols.
To ensure that transaction managers can be distributed, the transaction manager should be a component of DECdta. To ensure that they can interoperate, their transaction protocol should also be in DECdta. To ensure that different commit protocols can be supported, the part of transaction management that defines the protocol for interaction with remote transaction managers should be separated from the part that coordinates transaction execution across local resources. In the DECdta architecture, the former is called a communication manager, and the latter is called a transaction manager.
Interoperation of transaction managers and resource managers, such as database systems, also affects the modularization of DECdta components. A transaction may involve different types of resources, as in Step 3a. For example, it may update data that is managed by different database systems. To control transaction commitment, the transaction manager must interact with different resource managers, possibly supplied by different vendors. This requires that resource managers be separate components of DECdta.
The DECdta Architecture
Having seen where the need for DECdta components arises, we are now ready to describe the DECdta architecture as a whole, including the functions of and interfaces to each component.
Most DECdta interfaces are public. Some of the public interfaces are controlled by official standards bodies and industry consortia; i.e., they are "open" interfaces. Others are controlled solely by Digital. DECdta interfaces and protocols will be published and aligned with industry standards, as appropriate.
DECdta components are abstract entities. They do not necessarily map one-to-one to hardware components, software components (e.g., programs or products), or execution environments (e.g., a single-threaded process, a multithreaded process, or an operating system service). Rather, a DECdta component may be implemented as multiple software components, for example, as several processes. Alternatively, several DECdta components may be implemented as a single software component. For example, an operating system or TP monitor typically offers the facilities of more than one DECdta component.
The following are the components of DECdta:
• An application program is any program that uses services of DECdta components.
• A resource manager manages resources that support transaction semantics.
• A transaction manager coordinates transaction termination (i.e., commit and abort).
• A communication manager supports a transaction communication protocol between TP systems.
• A presentation manager supports device-independent interactions with a presentation device.
• A request manager facilitates the submission of requests to execute transactions.
DECdta components are layered on services that are provided by the underlying operating system and distributed system platform, and are not specific to TP, as shown in Figure 2.
Application Program
We use the term application program to mean a program that uses the services provided by other DECdta components. An application program could be a customer-written program, a layered product, or a DECdta component.
In the DECdta architecture, we distinguish two special types of application program: request initiators and transaction servers. A request initiator is a DECdta component that prepares and submits a request for the execution of a transaction. To create a request, the request initiator usually interacts with a presentation manager that provides an interface to a device, such as a terminal, a workstation, a digital private branch exchange, or an automated teller machine.
A transaction server can demarcate a transaction, interact with one or more resource managers to access recoverable resources on behalf of the transaction, invoke other transaction servers, and respond to calls from request initiators.
For a simple request, a transaction server receives the request, processes it, and optionally returns a reply to the request initiator. A conversational request is like a simple request, except that while processing the request, the transaction server exchanges one or more messages with the user, usually through the request initiator.
Figure 2 DECdta Components and Interfaces
(Labels shown: application programs, comprising the request initiator and transaction server; TP services, comprising the request manager, presentation manager, resource manager, transaction manager, communication managers, and other; operating system and distributed system services, comprising the distributed name service, distributed time service, thread management service, UID service, and authentication service.)
In principle, a request initiator could also execute transactions (not shown in Figure 2). That is, the distinction between request initiators and transaction servers is for clarity only, and does not restrict an application from performing request initiation functions in a transaction. Architecturally, this amounts to saying that request initiation functions can execute in a transaction server.
Resource Manager
A resource manager performs operations on shared resources. We are especially interested in recoverable resource managers, those that obey transaction semantics. In particular, a recoverable resource manager undoes a transaction's updates to the resources if the transaction aborts. Other recoverable resource manager activities in support of transactions are described in the next section. In the rest of this paper, we use "resource manager" to mean "recoverable resource manager."
In a TP system, the most common kind of resource manager is a database system. Some presentation managers and communication managers may also be resource managers. A resource manager may be written by a customer, a third party, or Digital.
Each resource manager type offers a resource manager-specific interface that is used by application programs to access and modify recoverable resources managed by the resource manager. A description of these resource manager interfaces is outside the scope of DECdta. However, many of these resource manager interfaces have architectures defined by industry standards, such as SQL (e.g., the VAX Rdb/VMS product), CODASYL data manipulation language (e.g., the VAX DBMS product), and COBOL file operations (e.g., RMS in the VMS system).
One type of resource manager that plays a special role in TP systems is a queue resource manager. It manages recoverable queues, which are often used to store requests. It allows application programs to place elements into queues and retrieve them, so that application programs can communicate even though they execute independently and asynchronously. For example, an application program that sends elements can communicate with one that receives elements even if the two application programs are not operational simultaneously. This communication arrangement improves availability and facilitates batch input of elements.
Transact ion Processing, Databases, and Fault-tolerant Systems
A queue resource manager interface supports such operations as open-queue, close-queue, enqueue, dequeue, and read-element. The ACMS and DECintact TP monitors both have queue resource managers as components.
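As a rough illustration of the queue operations named above, the following Python sketch models a queue whose enqueues and dequeues take effect only when the enclosing transaction commits. The class name `RecoverableQueue`, its method names, and the in-memory storage are assumptions made for illustration; they are not the ACMS or DECintact interfaces, and a real queue resource manager would record its state on stable storage.

```python
from collections import deque


class RecoverableQueue:
    """Hypothetical in-memory stand-in for a queue resource manager."""

    def __init__(self):
        self._committed = deque()   # elements visible to every transaction
        self._pending_enq = {}      # tid -> elements enqueued, not yet committed
        self._pending_deq = {}      # tid -> elements dequeued, not yet committed

    def enqueue(self, tid, element):
        # The element stays invisible to others until tid commits.
        self._pending_enq.setdefault(tid, []).append(element)

    def dequeue(self, tid):
        element = self._committed.popleft()   # raises IndexError if empty
        self._pending_deq.setdefault(tid, []).append(element)
        return element

    def read_element(self):
        """Return the head of the queue without removing it (None if empty)."""
        return self._committed[0] if self._committed else None

    def commit(self, tid):
        self._committed.extend(self._pending_enq.pop(tid, []))
        self._pending_deq.pop(tid, None)

    def abort(self, tid):
        self._pending_enq.pop(tid, None)
        # Put dequeued elements back at the head, in their original order.
        for element in reversed(self._pending_deq.pop(tid, [])):
            self._committed.appendleft(element)
```

In this sketch, a request dequeued by a transaction that later aborts reappears at the head of the queue, which is the property that lets a sender and a receiver run at different times without losing requests.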
Transaction Manager
A transaction manager supports the transaction abstraction. It is responsible for ensuring the atomicity of each transaction by telling each resource manager in a transaction when to commit. It uses a two-phase commit protocol to ensure that either all resource managers accessed by a transaction commit the transaction or they all abort the transaction. To support transaction atomicity, a transaction manager provides the following functions:
• Transaction demarcation operations allow application programs or resource managers to start and commit or abort a transaction. (Resource managers sometimes start a transaction to execute a resource operation if the caller is not executing a transaction. The SQL standard requires this.)
• Transaction execution operations allow resource managers and communication managers to declare themselves part of an existing transaction.
• Two-phase commit operations allow resource managers and communication managers to change a transaction's state (to "prepared," "committed," or "aborted").
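The state changes mentioned in the last item can be pictured as a small state machine. The sketch below is an assumption-laden illustration, not part of DECdta: the names `TxState`, `ALLOWED`, and `change_state` are hypothetical, and the table assumes the general case in which a transaction must be prepared before it can be committed.

```python
from enum import Enum


class TxState(Enum):
    ACTIVE = "active"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


# Assumed legal transitions for the general two-phase case.
ALLOWED = {
    TxState.ACTIVE: {TxState.PREPARED, TxState.ABORTED},
    TxState.PREPARED: {TxState.COMMITTED, TxState.ABORTED},
    TxState.COMMITTED: set(),   # terminal
    TxState.ABORTED: set(),     # terminal
}


def change_state(current, new):
    """Return the new state, or raise ValueError if the change is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {new.value}")
    return new
```

A participant that tried to abort an already committed transaction would be rejected by `change_state`, which reflects the requirement that the commit decision, once made, is final.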
The serializability of transactions is primarily the responsibility of the resource managers. Usually, a resource manager ensures serializability by setting locks on resources accessed by each transaction, and by releasing the locks after the transaction manager tells the resource manager to commit. (The latter activity makes serializability partly the responsibility of the transaction manager.) If transactions become deadlocked, a resource manager may detect the deadlock and abort one of the deadlocked transactions.
The durability of transactions is a responsibility of transaction managers and resource managers. The transaction manager is responsible for the durability of the commit or abort decision. A resource manager is responsible for the durability of operations of committed transactions. Usually, it ensures durability by storing a description of each transaction's resource operations and state changes in a stable (e.g., disk-resident) log. It can later use the log to reconstruct transactions' states while recovering from a failure.
A detailed description of the DECdta transaction manager component appears in the Transaction Manager Architecture section.
Communication Manager
A communication manager provides services for communication between named objects in a TP system, such as application programs and transaction managers. Some communication managers participate in coordinating the termination of a transaction by propagating the transaction manager's two-phase commit operations as messages to remote communication managers. Other communication managers propagate application data and transaction context, such as a transaction identifier, from one node to another. Some do both.
A TP system can support multiple communication managers. These communication managers can interact with other nodes using different commit protocols or message-passing protocols, and may be part of different name spaces, security domains, system management domains, etc. Examples are an IBM SNA LU6.2 communication manager or an ISO-TP communication manager.
By supporting multiple communication managers, the DECdta architecture enhances the interoperability of TP systems. Different TP systems can interoperate by executing a transaction using different commit protocols.
A communication manager offers an interface for application programs to communicate with other application programs. Different communication managers may offer different communication paradigms, such as remote procedure call or peer-to-peer message passing.
A communication manager also has an interface to its local transaction manager. It uses this interface to tell the transaction manager when a transaction has spread to a new node and to obtain information about transaction commitment, which it exchanges with communication managers on remote nodes.
Presentation Manager
A presentation manager provides an application program with a record-oriented interface to a presentation device. Its services are used by application programs, usually request initiators. By using presentation manager services, instead of directly accessing a presentation device, application programs become device independent.
A forms manager is one type of presentation manager. Just as a database system supports operations to define, open, close, and access databases, a forms manager supports operations to define, enable, disable, and access forms. A form includes the definition of the fields (with different attributes) that make up the form. It also includes services to map the fields into device-independent application records, to perform data validation, and to perform data conversion to map fields onto device-specific frames.
One presentation manager is Digital's DECforms forms management product. The DECforms product is the first implementation of the ANSI/ISO Forms Interface Management Systems standard (CODASYL FIMS).
Request Manager
A request manager provides services to authenticate the source of requests (a user and/or a presentation device), to submit requests, and to receive replies from the execution of requests. It supports such operations as send-request and receive-reply. Send-request must provide the identity of the source device, the identity of the user who entered the request, the identity of the application program to be invoked, and the input data to the program.
A request manager can either pass the request directly to an application program, or it can store requests in a queue. In the latter case, another request manager can subsequently schedule the request by dequeuing the request and invoking an application program. The ACMS System Interface is an example of an existing request manager interface for direct requests. The ACMS Queued Transaction Initiator is an example of a request manager that schedules queued requests.
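The difference between direct and queued requests can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Request`, `RequestManager`, `send_request`, and `schedule_next` are hypothetical names rather than the ACMS interfaces, authentication is omitted, and the queue is an ordinary in-memory deque rather than a recoverable queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    # The four items that send-request must provide, per the text above.
    source_device: str
    user: str
    application: str
    input_data: dict


class RequestManager:
    """Hypothetical request manager supporting direct and queued requests."""

    def __init__(self, applications):
        self.applications = applications   # name -> callable taking a Request
        self.queue = deque()

    def send_request(self, request, queued=False):
        if queued:
            # Deposit the request; a scheduler will run it later.
            self.queue.append(request)
            return None
        # Direct: invoke the application program right away and return its reply.
        return self.applications[request.application](request)

    def schedule_next(self):
        """Dequeue the oldest request and invoke its application program."""
        request = self.queue.popleft()
        return self.applications[request.application](request)
```

For example, with `applications = {"debit": handler}`, a direct `send_request` returns the handler's reply immediately, while a queued one returns nothing until a later `schedule_next` call runs it.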
Transaction Manager Architecture
DECdta components are tied together by the transaction abstraction. Transactions allow application programs, resource managers, request managers (indirectly through queue resource managers), and communication managers to interoperate reliably. Since transactions play an especially important role in the DECdta architecture, we describe the transaction management functions in more detail.
The DECdta architecture includes interfaces between transaction managers and application programs, resource managers, and communication managers, as shown in Figure 3. It also includes a transaction manager protocol, whose messages are propagated by communication managers. This protocol is used by Digital's DECdtm distributed transaction manager.
Figure 3 Transaction Manager Architecture
From a transaction manager's viewpoint, a transaction consists of transaction demarcation operations, transaction execution operations, two-phase commit operations, and recovery operations.
• The transaction demarcation operations are issued by an application program to a transaction manager and include operations to start and either end or abort a transaction.
• Transaction execution operations are issued by resource managers and communication managers to a transaction manager. They include operations
- For a resource manager or communication manager to join an existing transaction
- For a communication manager to tell a transaction manager to start a new branch of a transaction that already exists at another node
• Two-phase commit operations are issued by a transaction manager to resource managers, communication managers, and through communication managers to other transaction managers, and vice versa. They include operations
- For a transaction manager to ask a resource manager or communication manager to prepare, commit, or abort a transaction
- For a resource manager or communication manager to tell a transaction manager whether it has prepared, committed, or aborted a transaction
- For a communication manager to ask a transaction manager to prepare, commit, or abort a transaction
- For a transaction manager to tell a communication manager whether it has prepared, committed, or aborted a transaction
• Recovery operations are issued by a resource manager to its transaction manager to determine the state of a transaction (i.e., committed or aborted).
In response to a start operation invoked by an application program, the transaction manager dispenses a unique transaction identifier for the transaction. The transaction manager that processes the start operation is that transaction's home transaction manager.
When an application program invokes an operation supported by a resource manager, the resource manager must find out the transaction identifier of the application program's transaction. This can happen in different ways. For example, the application program may tag the operation with the transaction identifier, or the resource manager may look up the transaction identifier in the application program's context. When a resource manager receives its first operation on behalf of a transaction, T, it must join T, meaning that it must tell a transaction manager that it is a subordinate for T. Alternatively, the DECdta architecture supports a model in which a resource manager may ask to be joined automatically to all transactions managed by its transaction manager, rather than asking to join each transaction separately.
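The join-on-first-operation rule can be illustrated with a short Python sketch. It assumes the first of the two options described above, in which the application program tags each operation with the transaction identifier. The class and method names (`TransactionManager.start`, `join`, `ResourceManager.write`) are hypothetical and chosen for illustration; they are not the DECdtm interface.

```python
class TransactionManager:
    """Hypothetical home transaction manager that tracks its subordinates."""

    def __init__(self):
        self._next_id = 0
        self.subordinates = {}   # tid -> list of joined participants

    def start(self):
        # Dispense a transaction identifier unique within this manager.
        self._next_id += 1
        tid = f"T{self._next_id}"
        self.subordinates[tid] = []
        return tid

    def join(self, tid, participant):
        if participant not in self.subordinates[tid]:
            self.subordinates[tid].append(participant)


class ResourceManager:
    """Hypothetical resource manager that joins a transaction on first use."""

    def __init__(self, name, tm):
        self.name = name
        self.tm = tm
        self._joined = set()
        self.updates = {}        # (tid, key) -> value, uncommitted updates

    def write(self, tid, key, value):
        # The caller tags the operation with the transaction identifier.
        if tid not in self._joined:     # first operation on behalf of tid
            self.tm.join(tid, self)
            self._joined.add(tid)
        self.updates[(tid, key)] = value
```

However many operations a transaction performs against the same resource manager, the transaction manager records that resource manager as a subordinate only once.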
A transaction, T, spreads from one node, Node 1, to another node, Node 2, by sending a message (through a communication manager) from an application program that is executing T at Node 1 to an application program at Node 2. When T sends a message from Node 1 to Node 2 for the first time, the communication managers at Node 1 and Node 2 must perform branch registration. This function may be performed automatically by the communication managers. Or, it may be done manually by the application program, which tells the communication managers at Node 1 and Node 2 that the transaction has spread to Node 2. In either case, the result is as follows: the communication manager at Node 1 becomes the subordinate of the transaction manager at Node 1 for T and the superior of the communication manager at Node 2 for T; and the communication manager at Node 2 becomes the superior of the transaction manager at Node 2 for T. This arrangement allows the commit protocol between transaction managers to be propagated properly by communication managers.
After the transaction is done with its application work, the application program that started transaction T may invoke an "end" operation at the home transaction manager to commit T. This causes the home transaction manager to ask its subordinate resource managers and communication managers to try to commit T. The transaction manager does this by using a two-phase commit protocol. The protocol ensures that either all subordinate resource managers commit the transaction or they all abort the transaction.
In phase 1, the home transaction manager asks its subordinates for T to prepare T. A subordinate prepares T by doing what is necessary to guarantee that it can either commit T or abort T if asked to do so by its superior; this guarantee is valid even if it fails immediately after becoming prepared. To prepare T,
• Each subordinate for T recursively propagates the prepare request to its subordinates for T
• Each resource manager subordinate writes all of T's updates to stable storage
• Each resource manager and transaction manager subordinate writes a prepare-record to stable storage
A subordinate for T replies with a "yes" vote if and when it has completed its stable writes and all of its subordinates for T have voted "yes"; otherwise, it votes "no." If any subordinate for T does not acknowledge the request to prepare within the timeout period, then the home transaction manager aborts T; the effect is the same as issuing an abort operation.
In phase 2, when the home transaction manager has received "yes" votes from all of its subordinates for T, it decides to commit T. It writes a commit record for T to stable storage and tells its subordinates for T to commit T. Each subordinate for T writes a commit record for T to stable storage and recursively propagates the commit request to its subordinates for T. A subordinate for T replies with an acknowledgment if and when it has committed the transaction (in the case of a resource manager subordinate) and has received acknowledgments from all subordinates for T. When the home transaction manager receives acknowledgments from all of its subordinates for T, the transaction commitment is complete.
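The two phases described above can be sketched as a recursive walk over the tree of subordinates. The following Python sketch is a simplification made under several assumptions: `Participant` and `end_transaction` are hypothetical names, a Python list stands in for each stable log, a `can_prepare` flag stands in for the success or failure of the stable writes, and timeouts and acknowledgment messages are left out.

```python
class Participant:
    """Hypothetical subordinate (resource manager or communication manager)."""

    def __init__(self, name, can_prepare=True, subordinates=()):
        self.name = name
        self.can_prepare = can_prepare        # stands in for the stable writes
        self.subordinates = list(subordinates)
        self.log = []                         # stands in for stable storage

    def prepare(self, tid):
        # Propagate the prepare request, then vote "yes" only if this node
        # and every one of its subordinates can prepare.
        votes = [sub.prepare(tid) for sub in self.subordinates]
        if not (self.can_prepare and all(votes)):
            return False
        self.log.append(("prepare", tid))
        return True

    def commit(self, tid):
        self.log.append(("commit", tid))
        for sub in self.subordinates:
            sub.commit(tid)

    def abort(self, tid):
        self.log.append(("abort", tid))
        for sub in self.subordinates:
            sub.abort(tid)


def end_transaction(tid, subordinates, tm_log):
    """Home transaction manager's handling of an "end" operation."""
    # Phase 1: collect a vote from every subordinate.
    votes = [sub.prepare(tid) for sub in subordinates]
    if all(votes):
        # Phase 2: record the commit decision first, then tell subordinates.
        tm_log.append(("commit", tid))
        for sub in subordinates:
            sub.commit(tid)
        return "committed"
    tm_log.append(("abort", tid))
    for sub in subordinates:
        sub.abort(tid)
    return "aborted"
```

A single "no" vote anywhere in the tree causes every participant to receive an abort, while a unanimous "yes" causes the commit decision to be logged before any subordinate is told to commit.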
To recover from a failure, all resource managers that participated in a transaction must examine their logs on stable storage to determine what to do. If the log contains a commit or abort record for T, then T completed. No action is required. If the log contains no prepare, commit, or abort record for T, then T was active. T must be aborted. If the log contains a prepare record for T, but no commit or abort record for T, T was between phases 1 and 2. The resource manager must ask its superior transaction manager whether to commit or abort the transaction.
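These three recovery cases amount to a small decision function over the log. The Python sketch below assumes a log represented as a list of `(record_kind, transaction_id)` pairs; that representation, the function name `recovery_action`, and the returned strings are illustrative assumptions, not a DECdta format.

```python
def recovery_action(log, tid):
    """Decide what a resource manager should do about tid after a failure."""
    kinds = {kind for kind, t in log if t == tid}
    if "commit" in kinds or "abort" in kinds:
        return "none"            # the transaction completed
    if "prepare" in kinds:
        return "ask-superior"    # in doubt: between phases 1 and 2
    return "abort"               # still active at the time of failure
```

For instance, a log holding only `("prepare", "T")` yields `"ask-superior"`, which is the in-doubt case in which the resource manager cannot decide on its own.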
An inherent problem in all two-phase commit protocols is that a resource manager is blocked between phases 1 and 2, that is, after voting "yes" and before receiving the commit or abort decision. It cannot commit or abort the transaction until the transaction manager tells it which to do. If its transaction manager fails, the resource manager may be blocked indefinitely, until either the transaction manager recovers or an external agent, such as a system manager, steps in to tell the resource manager whether to commit or abort.
A transaction T may spontaneously abort due to system errors at any time during its execution. Or, an application program (prior to completing its work) or a resource manager (prior to voting "yes") may tell its transaction manager to abort T. In either case, the transaction manager then tells all of its subordinates for T to undo the effects of T's resource manager operations. Subordinate resource managers abort T, and subordinate communication managers recursively propagate the abort request to their subordinates for T.
The two-phase commit protocol is optimized for those cases in which the number of messages exchanged can be reduced below that of the general case (e.g., if there is only one subordinate resource manager, if a resource manager did not modify resources, or if the presumed-abort protocol was used to save acknowledgments).
Summary
We have presented an overview of the DECdta architecture. As part of this overview, we introduced the components and explained the function of each interface. We also described the DECdta transaction management architecture in some detail. Over time, many interfaces of the DECdta model will be made public via product offerings or architecture publications.
Acknowledgments
This architecture grew from discussions with many colleagues. We thank them all for their help, especially Dieter Gawlick, Bill Laing, Dave Lomet, Bruce Mann, Barry Rubinson, Diogenes Torres, and the TP architecture group, including Edward Braginsky, Tony DellaFera, George Gajnak, Per Gyllstrom, and Yoav Raz.
References
1. T. Speer and M. Storm, "Digital's Transaction Processing Monitors," Digital Technical Journal, vol. 3, no. 1 (Winter 1991, this issue): 18-32.
2. W. Laing, J. Johnson, and R. Landau, "Transaction Management Support in the VMS Operating System Kernel," Digital Technical Journal, vol. 3, no. 1 (Winter 1991, this issue): 33-44.
3. P. Bernstein, V. Hadzilacos, and N. Goodman, Concurrency Control and Recovery in Database Systems (Reading, MA: Addison-Wesley, 1987).
4. P. Bernstein, M. Hsu, and B. Mann, "Implementing Recoverable Requests Using Queues," Proceedings 1990 ACM SIGMOD Conference on Management of Data (May 1990).
5. FIMS Journal of Development (Norfolk, VA: CODASYL FIMS Committee, July 1990).
6. C. Mohan, B. Lindsay, and R. Obermarck, "Transaction Management in the R* Distributed Database Management System," ACM Transactions on Database Systems, vol. 11, no. 4 (December 1986).
Digital's Transaction Processing Monitors
Thomas G. Speer Mark W. Storm
Digital provides two transaction processing (TP) monitor products: ACMS (Application Control and Management System) and DECintact (Integrated Application Control). Each monitor is a unified set of transaction processing services for the application environment. These services are layered on the VMS operating system. Although there is a large functional overlap between the two, both products achieve similar goals by means of some significantly different implementation strategies. Flow control and multithreading in the ACMS monitor are managed by means of a fourth-generation language (4GL) task definition language. Flow control and multithreading in the DECintact monitor are managed at the application level by third-generation language (3GL) calls to a library of services. The ACMS monitor supports a deferred task model of queuing, and the DECintact monitor supports a message-based model. Over time, the persistent distinguishing feature between the two monitors will be their different application programming interfaces.
Transaction processing is the execution of an application that performs an administrative function by accessing a shared database. Within transaction processing, processing monitors provide the software "glue" that ties together many software components into a transaction processing system solution.
A typical transaction processing application involves interaction with many terminal users by means of a presentation manager or forms system to collect user requests. Information gathered by the presentation manager is then used to query or update one or more databases that reflect the current state of the business. A characteristic of transaction processing systems and applications is many users performing a small number of similar functions against a common database. A transaction processing monitor is a system environment that supports the efficient development, execution, and management of such applications.
Processing monitors are usually built on top of or as extensions to the operating system and other products such as database systems and presentation services. By so doing, additional components can be integrated into a system and can fill "holes" by providing functions that are specifically needed by transaction processing applications. Some examples of these functions are application control and management, transaction-processing-specific execution environments, and transaction-processing-specific programming interfaces.
Digital provides two transaction processing monitors: the Application Control and Management System (ACMS) and the DECintact monitor. Both monitors are built on top of the VMS operating system. Each monitor provides a unified set of transaction-processing-specific services to the application environment, and a large functional overlap exists between the services each monitor provides. The distinguishing factor between the two monitors is in the area of application programming styles and interfaces: fourth-generation language (4GL) versus third-generation language (3GL). This distinction represents Digital's recognition that customers have their own styles of application programming. Those that prefer 4GL styles should be able to build transaction processing applications using Digital's TP monitors without changing their style. Similarly, those that prefer 3GL styles should also be able to build TP applications using Digital's TP monitors without changing their style.
The ACMS monitor was first introduced by Digital in