2.1.1 Orthogonality in object persistence
Persistence has been studied by many researchers over the last decades. The main goal of this research has been to solve the well-known “impedance mismatch” problem between object-oriented applications and databases. That goal has proved so difficult to achieve that it has already been referred to as the Vietnam of Computer Science [Neward, 2006][Dearle et al., 2010].
Orthogonal persistence is an approach to solving that long-standing problem. Atkinson et al. [1983] conceptualised this kind of data persistence, first introducing the concept and later adapting it to object-oriented contexts [Atkinson & Morrison, 1995][Atkinson, 2000]. By following three principles, programmers can abstract completely from how the data in their objects is persisted, which enables code reuse and lets them focus on application logic and data consistency. These three principles are desirable characteristics of systems that manage persistent data. In short, they are:
• Persistence Independence - The same code should be applicable to both persistent and non-persistent objects.
• Type Orthogonality - All objects can be persistent or non-persistent irrespective of their type, size or any other property.
• Persistence Identification or Reachability - A given object is persistent because it is reachable, directly or through other objects, from a persistent root.
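The reachability principle can be illustrated with a minimal Java sketch. All names below (ReachabilitySketch, Person, reachableFrom, rename) are hypothetical and do not correspond to the API of any system cited in this chapter. The sketch only computes which objects would count as persistent; it stores nothing. It is also restricted to a single class for brevity, whereas type orthogonality requires the same treatment for objects of every type.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not the API of any cited system.
final class ReachabilitySketch {

    /** A plain application class; nothing in it mentions persistence. */
    static final class Person {
        String name;
        final List<Person> friends = new ArrayList<>();

        Person(String name) {
            this.name = name;
        }
    }

    /** Persistence independence: the same code serves persistent and transient objects. */
    static void rename(Person p, String newName) {
        p.name = newName;
    }

    /** Returns every object reachable from the root, compared by identity. */
    static Set<Person> reachableFrom(Person root) {
        Set<Person> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Person> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Person p = pending.pop();
            if (seen.add(p)) {          // first visit: follow its references
                pending.addAll(p.friends);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Person root = new Person("root");   // plays the role of the persistent root
        Person alice = new Person("alice");
        Person bob = new Person("bob");
        Person temp = new Person("temp");   // never linked from the root

        root.friends.add(alice);
        alice.friends.add(bob);
        bob.friends.add(alice);             // cycles are handled by the identity set

        rename(bob, "robert");              // reachable, hence persistent
        rename(temp, "scratch");            // unreachable, hence transient

        Set<Person> persistent = reachableFrom(root);
        System.out.println(persistent.size());          // 3
        System.out.println(persistent.contains(bob));   // true
        System.out.println(persistent.contains(temp));  // false
    }
}
```

Note that no object is ever marked as persistent explicitly: `temp` would become persistent simply by being added to the `friends` list of any reachable object.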
2.1.1.1 Orthogonality consequences
Despite the freedom from persistence concerns that it offers programmers, this persistence paradigm also poses several technical challenges to the designers of orthogonal systems. In fact, it makes the design problem considerably more complex and introduces performance issues, especially regarding the system’s main memory management and concurrency.
Considerable research work has been done regarding the cost of orthogonality.
Atkinson & Jordan [1999] discuss issues raised by three years of research work on developing PJama, a Java-based orthogonal persistent system. During this research work, the authors identified several issues that posed many design problems: achieving orthogonality with classes that have a special relationship with the Java Virtual Machine (JVM); the treatment of static variables and of the transient keyword; concurrency; schema evolution; and performance.
Nettles [1993], Nettles & O’Toole [1996], Zigman & Blackburn [1998], Marquez et al. [2000], Blackburn & Zigman [1999] and Atkinson & Jordan [1999] have researched concurrency and transactional issues in the context of orthogonal persistent systems.
Atkinson & Jordan [1999] proposed that each Java Virtual Machine execution should act as a single, flat transaction. Within the context of a Java Virtual Machine, each transaction commit moves the data checkpoint forward, but does not release control of the resources it is using. Furthermore, the concurrency model of programming languages such as Java does not properly serve the requirements of orthogonal persistence in terms of transaction isolation, which suggests the need to add transaction support to programming languages. Such support would also make it possible to undo changes.
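The idea of a commit that moves a checkpoint forward without ending the session, together with the ability to undo, can be sketched as a toy model. This is not PJama's mechanism: the class CheckpointSketch and its methods are hypothetical, and the sketch assumes that the whole state is a small map that can be copied at each checkpoint.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of checkpoint-and-undo; not the mechanism of any cited system.
final class CheckpointSketch {
    private final Map<String, Integer> live = new HashMap<>();
    private Map<String, Integer> lastCheckpoint = new HashMap<>();

    void put(String key, int value) {
        live.put(key, value);
    }

    Integer get(String key) {
        return live.get(key);
    }

    /** Moves the checkpoint forward; the store stays open for further work. */
    void checkpoint() {
        lastCheckpoint = new HashMap<>(live);
    }

    /** Undo: discards every change made since the last checkpoint. */
    void rollback() {
        live.clear();
        live.putAll(lastCheckpoint);
    }

    public static void main(String[] args) {
        CheckpointSketch store = new CheckpointSketch();
        store.put("balance", 100);
        store.checkpoint();          // state {balance=100} is now the recovery point

        store.put("balance", 40);    // uncommitted changes
        store.put("pending", 1);
        store.rollback();            // back to the recovery point

        System.out.println(store.get("balance")); // 100
        System.out.println(store.get("pending")); // null
    }
}
```

Copying the full state at every checkpoint is only viable for a toy example; the sketch also ignores isolation between concurrent threads, which is precisely the issue raised in the paragraph above.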
Blackburn & Zigman [1999] argue that the aversion to transactional approaches within the orthogonal persistence community stems from the challenge that the transaction model poses to orthogonality, since that model implicitly assumes two distinct worlds: the application environment and the persistent data. Those two worlds do not exist in orthogonal persistent systems. Thus, most of these systems follow “open” approaches [Blackburn & Zigman, 1999], which are based on checkpointing and explicit synchronization.
Schema evolution and instance adaptation are further challenges in the design of orthogonal systems. Section 2.3 discusses the database evolution issue, with particular focus on orthogonal persistence.
The aforementioned issues motivated some authors, such as Cooper & Wise [1996], to criticize orthogonal persistence and to advocate an alternative, less restrictive model named Type-Orthogonal Persistence, in opposition to the model of Atkinson & Morrison [1995]. Analyzing Cooper and Wise’s arguments, we conclude that they essentially reflect performance issues and not restrictions imposed on the programmer or the language.
2.1.1.2 Orthogonality benefits
Despite the many challenges posed by this paradigm, its advantages are many, not only in terms of code reuse, as already mentioned, but also in other respects such as data type safety checking, fewer coding errors, better code organization and easier refactoring of applications. Atkinson & Morrison [1995] summarized the benefits of orthogonal persistence as:
• improving programming productivity from simpler semantics;
• avoiding ad hoc arrangements for data translation and long-term data storage;
• providing protection mechanisms over the whole environment;
• supporting incremental evolution; and
• automatically preserving referential integrity over the entire computational environment for the whole lifetime of a Persistent Application System (PAS).
Other concepts, like Safe Queries [Cook & Rai, 2005] and Native Queries [Cook & Rosenberger, 2005], may also provide a better understanding of orthogonality in persistence and of its potential in terms of code quality and productivity.
Despite the now widespread use of persistence frameworks, which provide some orthogonality, the orthogonal persistent programming paradigm is still a strange concept for many programmers, novice or senior. Indeed, more experienced programmers, whose thinking has been shaped by the “input, process, output” model and by mappings between “internal” and “external” data structures, have some difficulty in understanding orthogonality [Atkinson & Jordan, 1999].
2.1.1.3 Conventional persistence approaches
Atkinson [2000] surveyed the persistence mechanism options for Java. Despite the article’s age and its narrow technological scope, its findings can be generalized to today’s most common programming practices and technologies. Based on that earlier study, Balzer [2005] categorized the conventional persistence approaches for object-oriented programming. That categorization is presented next, with some adjustments that we consider relevant.
• Object Serialization: This mechanism encodes object graphs into, and decodes them from, binary or other (for example XML) representations. It serializes the whole object graph that is transitively reachable from a root object. The most common object serialization implementations, such as Java’s, require objects to implement a Serializable interface, a requirement that prevents a large number of Java core classes from being serialized. Furthermore, serialization does not preserve previously shared sub-structures; it only provides navigational access to the serialized objects, starting from the root object; and it does not scale well. Thus, object serialization breaks the principles of orthogonal persistence and is only suited to a limited number of cases, such as remote method invocation, where sharing of sub-structures is not desired. Object serialization is a valid complement to a persistence mechanism, but not a replacement for it.
• Relational Database Interface: This two-tiered architecture, based on an object-oriented programming language and a relational database management system, suffers from the impedance mismatch between the object model of the programming language and the relational model of the database. Consequently, the programmer must manually maintain all the complex mapping code between those two worlds. The programmer performs that mapping through a well-defined Application Programming Interface (API) provided by the programming language. This API offers methods to connect to databases and methods to store, update, and retrieve the data contained in the application’s objects.
• Object Database Interface: Object database interfaces do not suffer from the impedance mismatch as relational interfaces do. Despite the easier mapping, however, object database interfaces expose persistence-related operations, such as deletion or transaction control, that defeat persistence independence.
• Persistence Frameworks: Persistence frameworks provide a wide selection of persistence facilities, such as access to a variety of heterogeneous data sources in the case of Java Data Objects (JDO), distributed persistence in the case of Enterprise Java Beans (EJB), and Object-Relational Mapping tools that move the object-relational mapping concern out of the code and into specialized XML files. Although the majority of those frameworks provide an object database interface, they do not comply with persistence independence.
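Two of the limitations attributed to object serialization in the list above can be observed with the standard java.io serialization API. In the following sketch the classes Customer and Address, the field sessionToken and the helper roundTrip are illustrative assumptions; only the java.io classes are part of the Java platform. Two objects that share a sub-structure are serialized separately, and the sharing is lost on the way back; a transient field is not stored at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative sketch; Customer, Address and roundTrip are hypothetical names.
final class SerializationSketch {

    static final class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        String city;

        Address(String city) {
            this.city = city;
        }
    }

    static final class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        Address address;               // may be shared with other customers
        transient String sessionToken; // skipped by serialization

        Customer(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    /** Serializes one customer to its own byte stream and reads it back. */
    static Customer roundTrip(Customer c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(c);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Customer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Address shared = new Address("Vigo");
        Customer ann = new Customer("Ann", shared);
        Customer ben = new Customer("Ben", shared);
        ann.sessionToken = "abc";

        Customer ann2 = roundTrip(ann);
        Customer ben2 = roundTrip(ben);

        System.out.println(ann.address == ben.address);   // true: one shared Address
        System.out.println(ann2.address == ben2.address); // false: each copy has its own
        System.out.println(ann2.sessionToken);            // null: transient field dropped
    }
}
```

Sharing is preserved only among objects written to the same stream in a single graph, which is why serialization works as a transport format but not as a general persistence mechanism.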
2.1.1.4 Orthogonal Persistent Systems
Orthogonal persistence was applied to the presented prototype [Pereira & Perez-Schofield, 2010, 2011, 2012, 2014a,b] and to other implementations, which in some cases were not fully compliant with the concept. PS-Algol [Atkinson et al., 1983], PJama [Atkinson & Jordan, 1999], OPJ [Marquez et al., 2000], Visual Zero [Perez-Schofield et al., 2008], and Thor [Liskov et al., 1996] are examples of those systems.
In Grasshopper [Dearle et al., 1994], the file system and memory management components of the operating system were unified in order to provide a seamless orthogonal environment. Orthogonal persistence is provided by the operating system to the programming language. The authors of that work argue that the approach overcomes the problems of implementing orthogonal persistence at the programming language level. Furthermore, this approach theoretically enables any programming language to provide orthogonal persistence. Following that direction, the same authors have ported the framework to several programming languages [Dearle et al., 1996].
Some object-oriented databases, as well as some object-relational mapping tools, also implement some level of orthogonality.