A database management system consists of a database (a collection of information, usually organized into records with component fields) and facilities for adding, updating, retrieving, manipulating, and reporting on data.
d
atabases
tructureIn the early days of computing, a database generally con- sisted of a single file that was divided into data blocks that in turn consisted of records and fields within records. The COBOL language was (and is) particularly suited to read- ing, processing, and writing data in such files. This flat file database model is still used for many simple applica- tions including “home data managers.” However, for more complex applications where there are many files containing interrelated data, the flat file model proves inadequate.
In 1970, computer scientist E. F. Codd proposed a rela- tional model for data organization. In the relational model, data is not viewed as files containing records, but as a set of tables, where the columns represent fields and the rows individual entities (such as customers or transactions).
A field (column) that two tables have in common (called the key) can be used to link the two. For example, consider a table of customer information (name, customer number, address, current balance, and so on) and a table of trans- action information (product number, quantity, customer number of purchaser, and so on).
To find all the items purchased by a particular customer, the relational database uses the common field (the customer account number) to join the two tables. A query can then select all records in the transaction file whose customer number field matches the current customer in the customer file. (Notice that the validity of a key field depends on its being unique: If each customer doesn’t have one [and only one] customer number, any report of purchases will not be dependable.)
A procedure called normalization is often used to create a set of tables from a set of data files and records, such that no fields contain duplicate information. This is necessary in order to ensure that a piece of information can be updated and the update “propagated” to the entire database without missing any instances.
Relational databases usually also enforce referential integ- rity. This means preventing changes to the database from causing inconsistencies. For example, if table A and table B are linked and a record is deleted from table A, any links to that record from records in table B must be removed. Simi- larly, if a change is made in a linked field in a table, records in a linked table must be updated to reflect the change.
During the 1980s, the dBase relational database pro- gram became the most popular DBmS on personal comput- ers. microsoft Access is now popular on Windows systems,
Because both the Customer Record and the Transaction Record include the Customer Number field, it is easy to pull information from both databases into a single report, such as a summary of pur- chases for each customer.
and Oracle is prominent in the UNIx world. Beginning in the 1980s, SQL (Structured Query Language) became a widely used standard for querying and manipulating data tables, and most DBmS implement SQL (see sql).
t
rendsThe embracing of object-oriented programming principles starting in the 1980s has led to development of object-ori- ented database structures (see object-oRientedpRogRam-
ming). In this approach tables, queries, views, and other components of the DBmS are treated as objects that present their functionality through interfaces (much in the way a class in an object-oriented program does). This approach can improve data integrity, flexibility (such as through the ability to define new operations), and the development of new capabilities derived from predecessor objects. Object models are also helpful in dealing with a networked world in which data tables are often stored on separate computers.
As important as changes in the architecture of databases have been, the impact of a changing environment has prob-
ably been even more significant. In particular, Web sites of all kinds are increasingly being driven by databases (such as for inventory and order processing for e-commerce). In turn, many databases of all sizes and types are now accessible and searchable via the Web. This has meant a new emphasis on rapid development of database programs, particularly using scripting languages, as well as fast and efficient Web-based database processing (see also ajax). While the traditional high-end corporate database systems such as Oracle and SQL Server are still vital for the enterprise, open-source alternatives (particularly mySQL) are in widespread use for many applications including wikis and content-management systems. The use of flexibly structured data (see xml and
semantic Web) to link and transform databases has also expanded database concepts in the Web-centric world. Further Reading
Allen, Christopher, Catherine Creary, and Simon Chatwin. Intro- duction to Relational Databases. Berkeley, Calif.: mcgraw-Hill Osborne, 2003.
Microsoft Access is a popular relational database program for personal computers. It can be used for both simple (“flat file”) databases and for complex databases with many interrelated files.
Hellerstein, Joseph S., and michael Stonebreaker, eds. Readings in Database Systems. 4th ed. Cambridge, mass.: mIT Press, 2005.
Hoffer, Jeffrey A., mary Prescott, and Fred mcFadden. Modern Database Management. 8th ed. Upper Saddle River, N.J.: Pren- tice Hall, 2006.
Powell, gavin. Beginning XML Databases. Indianapolis: Wrox, 2006.
“Web Programming: Databases.” Available online. URL: http:// w w w.webreference.com /programming /databases.html. Accessed July 8, 2007.
Williams, Hugh E., and David Lane. Web Database Applications with PHP and MySQL. 2nd ed. Sebastapol, Calif.: O’Reilly media, 2004.