3.3 Transformations in Traditional Software Development
3.3.2 Object-Relational Mapping and Migration Frameworks
The definition of SQL scripts is an efficient solution for the problem of manipulating the database’s definition – or its mapping to MDE, the metamodel – and ensuring that existing data (the model) is kept consistent. Nevertheless, this approach presents an additional problem, as different DBMSs usually have differences in the supported SQL dialect(s). Typical examples of such differences are that (1) Microsoft’s SQL Server supports the Transact-SQL dialect, while PostgreSQL and Oracle’s databases support the PL-SQL dialect, or that (2) PostgreSQL supports the automatic increment of a primary key’s value, while Oracle requires that the database designer explicitly define a sequence to provide such values. Such differences present significant difficulties when developing applications that support various databases, like some CMS systems.
To address this issue, developers sometimes use the Strategy design pattern [GHJV 95] (or similar) to decouple the application’s logic from its data access operations; Web-
Comfort8 [SS 08 b] uses this approach to provide support for the Microsoft SQL Server, PostgreSQL, and Oracle DBMSs. Another tactic is the dynamic rewriting of the SQL scripts that are used when accessing the database; this tactic is used by Drupal9 [BBH+ 08],
which only issues MySQL statements, to enable support for databases like Microsoft SQL Server (although, at the time of the writing of this dissertation, Drupal’s developers do not recommend the usage of this feature). All of these strategies, although solving the problem, lead to a loss of productivity when developing applications with support for various kinds of databases. This is because the same kind of work – establish mappings between source code (or domain concepts) and the relational model, namely the SQL dialect for each supported DBMS – must be performed (either manually or by using template-based data access code generators) by developers whenever any changes to the domain occur.
A strategy that effectively addresses the mismatch between object-oriented program- ming language concepts and the relational model is the use of an Object-Relational Mapping (ORM) framework. Each ORM framework presents its own tactic to (as the name suggests) establish a mapping between a certain language’s – or framework – pro- gramming concepts and the relational model. A brief list of examples of ORM frameworks (and their respective tactics) follows:
8
http://www.siquant.pt/WebComfort (accessed on March 15th, 2012)
9
• The ActiveRecord library10 of Ruby on Rails (RoR)11 uses the ActiveRecord ar-
chitectural pattern [Fow 03] to define Ruby classes that map to SQL tables. De- velopers are not required to specify SQL statements: they only have to specify Ruby classes (which must inherit from a predefined class provided by ActiveRecord, ActiveRecord::Base), and the library will handle the mapping to SQL. This tactic is also followed by Django12, in which Python [Lut 09] classes inherit from the django.db.models.Model class13;
• Hibernate14 and NHibernate15 allow developers to map Java and .NET classes
(respectively) to SQL tables. These frameworks require that developers either (1) create an XML mapping file that defines the database schema and the classes that they correspond to, or (2) decorate their source code classes with annotations that will guide the mapping procedure;
• Microsoft ADO.NET Entity Framework (EF)16 also contemplates a number of
mapping approaches: (1) in the Database First approach, a database D is created first, and then an Entity Data Model is automatically generated from D’s schema; (2) in the Model First approach, the Entity Data Model is specified first, and
afterward the database and source code classes are generated; and (3) in the Code First approach, the source code classes and a special DbContext class are defined first, and the mapping to SQL is automatically performed by EF (this approach relies on the usage of a set of annotations, as well as the developer’s adherence to a set of naming conventions).
However, some of these ORM frameworks do not provide a mechanism to handle change to the database’s schema itself. Frameworks such as Django or Microsoft’s Entity Framework require that developers make manual changes to both the database and the source code or mapping files. On the other hand, frameworks such as Hibernate do consider this issue, although in a limited manner17, as its schema upgrade mechanism is based on
the definition of increments to the database’s schema; this mechanism only allows the addition of tables and columns, but not their removal.
Another strategy to address this schema change issue is the use of Migrations (which, in turn, are typically based on the ActiveRecord pattern). Migrations consist of com-
10
http://rubyforge.org/projects/activerecord (accessed on March 15th, 2012)
11
http://rubyonrails.org (accessed on March 15th, 2012)
12
http://www.djangoproject.com (accessed on March 15th, 2012)
13
“Chapter 5: Models”, The Django Book, <http://www.djangobook.com/en/2.0/chapter05> (ac- cessed on March 15th, 2012)
14
http://www.hibernate.org (accessed on March 15th, 2012)
15
http://nhforge.org (accessed on March 15th, 2012)
16
http://msdn.microsoft.com/ef (accessed on March 15th, 2012)
17
Daniel Ostermeier, “Incremental schema upgrades using Hibernate”, <http://www.alittlemadness. com/2006/08/28/incremental-schema-upgrades-using-hibernate> (accessed on March 15th, 2012)
3.3. TRANSFORMATIONS IN TRADITIONAL SOFTWARE DEVELOPMENT
bining ActiveRecord with the Command design pattern [GHJV 95], and are supported by frameworks such as Ruby on Rails’ ActiveRecord (via the ActiveRecord::Migration class), PyMigrate18 for Python, or Microsoft Entity Framework’s Migration mechanism
(which is still under development as of the writing of this dissertation).
The definition of a Migration in the aforementioned ORM frameworks typically involves: (1) considering the database’s definition as a set of classes (via the ActiveRecord pattern or similar); (2) applying the Command design pattern [GHJV 95] to define database transformation unit classes, which provide Execute and Undo methods. These methods are specified in normal source code, which takes advantage of the classes provided by the ActiveRecord implementation. Such implementations typically provide a set of operations over a database, namely to edit its definition (by creating, editing, or removing tables and columns) or even to change the data contained in the database.
Listing 3.10 provides a very simple example of a Migration for the Ruby on Rails framework. More specifically, this Migration adds a boolean column, named ssl enabled and with default value true, to the accounts table. The Undo method (down) simply removes the added column.
Listing 3.10: Example of a Ruby on Rails Migration (extracted fromhttp://api. rubyonrails.org/classes/ActiveRecord/Migration.html).
1 class AddSsl < ActiveRecord::Migration 2 def up
3 add column :accounts, :ssl enabled, :boolean, :default => 1
4 end
5
6 def down
7 remove column :accounts, :ssl enabled
8 end
9 end
Finally, because of their nature, Migrations that involve loss of data (such as the removal of a column) cannot provide an Undo method (down, in the case of Ruby on Rails, see line 6 of Listing 3.10). Such Migrations typically just return an error (or throw an exception, depending on the framework used), signaling that undoing changes is impossible.