We believe that the best way to learn is by doing — so, let’s start building a database. We will be using SQL Server 7.0, but you can use anything you like. The important thing to take away from this session is not only how to build a SQL Server database, but rather how to
Designing a Database
S E S S I O N
design andbuild a database. Use the RDBMS with which you are most comfortable. If you are a novice, try using Microsoft Access. Access is generally not the RDBMS of choice as the back end for a high traffic Web site, but it’s a good program to start with.
In the remainder of this session, we will show you how to build a music catalog database — band names, band members, albums, and so on. For this purpose, you need to know what information, or data elements, your database will store. For example, you may want to gather the following information about a band:
1. Band Title
2. Music Type Title (the type of music a band plays, for example, Rock & Roll)
3. Record Company Title
4. Albums
5. Band Members
Next, you need to determine the data typefor each element. The data type specifies the kind of information (numeric, character, and so on) and how much space that information takes up in a particular column. Though each RDBMS handles data types differently, you are likely to encounter at least three — characters, integers, and dates — as described in Table 3-1. Table 3-1 RDBMS Data Type Categories
Data Type SQL Server Data Type Comments
Character char(n), varchar Stores character information such as a contact’s first name.
Integer int, smallint, tinyint Stores integer values such as a contact’s age. Integers can also be used to store foreign key values. We’ll get to that later.
Date datetime, smalldatetime Store dates and times such as time stamp information.
Next, you have to decide which data type to assign to each band item. At the same time, you need to determine which elements are optional and which are required (see Table 3-2) and how much space each element will occupy in the database. For example, you can rea- sonably assume that a band’s name won’t exceed fifty characters in length. So you specify in the database design that the column should not accept data elements that are longer than 50 characters in length. Data types represent the kind of data that can be stored in a particular column. For example, if you want to store a band’s name, you will store character data. If you want to store the number of members in the band, you would store the data as a number or integer. Each RDMS supports different data types. SQL Server, for instance, provides VARCHARand CHARamong other data types to store character data, and INTand FLOATamong others to store numeric data.
Refer to your RDMS’s documentation for supported data types.
Table 3-2 Optional and Required DB Elements
Contact Element Data Type Size Optional/Required Band Title VARCHAR 50 Required
Music Type Title VARCHAR 25 Required Record Company Title VARCHAR 25 Required Band Members VARCHAR 200 Required Albums VARCHAR 500 Optional
Table 3-2 does not present an optimal table definition. It is merely a starting point from which to begin discussion. Database normalization techniques will be explored later in this session.
Armed with this information, you can now create your table. Follow a standard naming convention when creating table and column names. For example, you may choose to always name your tables using the following convention:
t_[plural object descriptor]
Since the table you are creating will contain a record, or row, for each band, choose t_bandsas the table name. Use the following convention to name columns:
[singular object descriptor]_[column descriptor]
You don’t need to follow our naming convention, but we highly recommend that you use some sort of object naming convention — it will save you time later.
Figure 3-1 shows the design of your t_bandstable.
Figure 3-1 t_bands table
You’ll notice that we have created a field for each of the data elements we defined earlier. The first field, band_idis our unique identifier. Without going into SQL Server details, we created the unique identifier by selecting the column’s identity field.
Note Note
Your RDBMS should be structured to create a unique identifier for each record. For exam- ple, the first band could be assigned a unique identifier of 1, the second, 2, and so on. In Oracle, this is called a sequence, in SQL Server it is referred to as an identity. You should generally make this unique identifier the table’s primary key. A primary key is a field or group of fields that uniquely identifies a record.
Take great care when defining a table’s primary key. If there’s even the slightest possibility of having duplicate information in a primary key or of your requirements changing in such a way that your primary key is no longer valid, use something else or a sequence number. Good candidates for primary keys might be social security numbers or e-mail addresses, but you can never be too careful. That’s why it’s a good idea to use sequences as primary keys; the RDBMS ensures that this field will be unique.
OK, now let’s talk constraints. A constraint is a mechanism for enforcing the integrity of the data in your table. There are several types of constraints. Among these are primary key constraints, foreign key constraints, unique constraints, and check constraints. Check con- straintsensure that the data entered in a column follows a set of rules. A unique constraint ensures that the data inserted into a column, or group of columns, is not duplicated in the table. A foreign keyreferences the primary key of another table and ensures the data in the foreign key column is present in the referenced table.
The implementation of constraints differs drastically by RDBMS, so instead of actually going through the process of creating a constraint in SQL Server, you need to consider where constraints might be useful in your t_bandstable. Because band titles are generally unique, it’s a good idea to place a unique constraint on the band_titlecolumn.