Database pages - Microsoft SQL Server 2000 Performance Optimization and Tuning Handbook Ken E

Database pages are used for a variety of tasks. Database pages that are used to hold table rows and index entries are known as data pages and index pages, respectively. If the table contains columns of the data type TEXT or IMAGE, then these columns are usually implemented as structures of Text/Image pages (unless the TEXT/IMAGE data is stored in the row). There are other types of pages also, namely Global Allocation Map (GAM) pages, Page Free Space (PFS), and Index Allocation Map (IAM) pages. We will discuss these types of pages shortly. First, though, let us take out the magnifying glass and take a closer look at a typical page structure. The most common database page we are likely to meet is a data page, so we will use a data page as an example.

Figure 2.14: The basic structure of a database page

There is a fixed 96-byte page header, which contains information such as the page number, pointers to the previous and next page (if used), and the object ID of the object to which the page belongs. The pointers are needed, because pages are linked together, as shown in Figure 2.15. However, this only happens in certain circumstances, as we shall see in Chapter 3.

Figure 2.15: Pages linked in a chain

What does a data page look like inside? The internal structure of a data page is shown in

Figure 2.16. We can see the data rows, but there is also another structure called a row offset table. The row offset table contains two byte entries consisting of the row number and the offset byte address of the row in the page. The first row in our page is at byte offset 96, because of the 96-byte page header. Our row (plus overhead) is 20 bytes in length, so the next row is at byte offset 116 and so on. The row offset table basically gives us a level of

nonclustered indexes may contain pointers to data rows in their leaf-level index pages. Such a pointer is known as a Row ID and is made up of a File ID, database page number, and a row number. The File ID and database page number (a Page ID) take SQL Server to an individual page in a file and the row number and then takes SQL Server to an entry in the row offset table. In our example, the Row ID of the row nearest the fixed page header would consist of the page number, 23, and the row number, 0.

Figure 2.16: The internals of a data page

Entry 0 in the row offset table contains byte offset address 96. SQL Server can then use this offset to retrieve the row. Because the Row ID is implemented this way, we can see that a row can change position in the table without the Row ID having to change. All that has to change is the offset address in the row offset table entry. Why would a row change position in a page? In Figure 2.16, if row 1 were deleted, row 2 may move up to row 0 in order to keep the free space in the page contiguous if a new row needed to be inserted. The Row ID for row 2 would not change.

Note SQL Server 2000 will not shuffle rows like this for the sake of it. It will only do so to accommodate new inserts on the page.

What does a data row look like inside? Data rows contain columns of data, as you would expect, but they also contain overhead. The amount of overhead depends on whether the row contains all fixed-length columns or whether there are also variable-length columns. In Figure 2.17 we have the structure of the Accounts table row in our BankingDB database. The

Accounts table has five fixed-length columns. The first three columns are of type integer, the fourth column is of type money, and the last column is of type char(400).

The first two bytes are used for status bits. The first status byte holds information that tells SQL Server, for example, whether the row is a primary data row or a forwarded row (described in Chapter 3). A status bit in this byte also specifies whether there is variable- length data in the row. In our example there are no variable-length data.

The next two bytes hold a number representing the length of the fixed data in the row. This number is the number of bytes of data plus the two status bytes and these two bytes

themselves.

The fixed-length data now follow. Finally, there are two bytes holding a number that

represents the number of columns in the row and a variable number of bytes holding a NULL bitmap. This contains one bit for every column with a bit set to show whether the column contains a NULL value. (See Figure 2.17.)

The shaded area represents the overhead. Our Account row, which we expected to be 420 bytes in length, has turned out to be 424 bytes in length—and that does not include the fields holding the number of columns and the NULL bitmap.

Suppose the last column in our Accounts table was not a char(400) data type but a

varchar(400). The structure of our row containing variable length data is shown in Figure 2.18.

Figure 2.18: A row containing fixed- and variable-length columns

The structure shown in Figure 2.18 assumes that the account_notes column does indeed contain 400 characters. If it contains less, then less bytes will be used to hold the account notes. We can immediately see two differences between the structure of a row containing only fixed-length columns and a row that also contains variable-length columns. First, the fixed- length columns are grouped together separate from the variable-length columns, which are also grouped together. Second, there are more overhead bytes.

Looking at the structure, the first status byte will now have a bit set to specify that variable- length columns are present in the row. After the two status bytes the next two bytes hold a number representing the length of the fixed data in the row followed by the fixed data, the two-byte field holding the number of columns, and the NULL bitmap. Now we find extra fields. A two-byte field holds the number of variable-length columns followed by a field known as the column offset array, which contains a two-byte cell for each variable-length column used to hold information that SQL Server uses to find the position of the variable- length data.

We can see that the order of the columns in a row that contains variable-length columns is not the same order as the table definition.

Note SQL Server 2000 also allows small amounts of TEXT/IMAGE data to be held inside the row. Normally, TEXT/IMAGE data is held outside the row.

In document Microsoft SQL Server 2000 Performance Optimization and Tuning Handbook Ken England pdf (Page 34-38)