Introduction to DB2 Objects - DB2 Handout v1.0

Learning Objectives

After completing the session, you will be able to:

Identify the different DB2 objects

Explain the relation between DB2 objects

Describe how the data is being stored in DB2 physically

Database

A database is a collection of data stored in some organized fashion. The simplest way to think of it is to imagine a database as a filing cabinet. The filing cabinet is simply a physical location to store data, regardless of what that data is or how it is organized.

You will see about Database in detail in the later part of this session.

Database: A container (usually a file or set of files) to store organized data.

DBMS

A Database Management System (DBMS) is a set of programs designed for the purpose of managing databases.

Typical examples of DBMS include:

DB2

Oracle

Microsoft Access

Microsoft SQL Server Database Model

A database model refers to the way a DBMS organizes information internally. It is a theory or specification, describing how a database is structured and used.

Common models include the following:

Hierarchical model

Network model

Relational model

Object model

RDBMS

Relational Database Management System (RDBMS) is a type of database management system (DBMS) that stores data in the form of related tables.

Most popular RDBMS includes:

DB2

Oracle

Microsoft SQL Server DB2

DB2 or Database 2 is a Relational Database Management System offered by IBM that runs on IBM Mainframe, AS/400 and on PC. A DB2 database can grow from a small single-user application to a large multi-user system.

Storage Group

A DB2 storage group (STOGROUP) is a set of volumes on direct access storage devices (DASD).

The volumes hold the VSAM data sets in which tables and indexes are actually stored.

Max no of volumes per Storage group is 133 (Ideally 3 or 4). All volumes of a given storage group must have the same device type (3380, 3390, and so no.). But, parts of a single database can be stored in different storage groups. If the volumes in a storage group are different types or if any volume is not mounted or is otherwise invalid, then an error will occur when you try to create a table space or index. Try to assign frequently accessed objects (indexes, for instance) to fast devices and seldom-used tables to slower devices; that choice of storage groups improves performance.

After you define a storage group, DB2 stores information about it in the DB2 catalog. (This catalog is not the same as the integrated catalog facility catalog that describes DB2 VSAM data sets). The catalog table SYSIBM.SYSSTOGROUP has a row for each storage group and

SYSIBM.SYSVOLUMES has a row for each volume.

At installation, the system default storage group is defined. This storage group is named

SYSDEFLT. If you do not explicitly manage your storage, then DB2 uses the default storage group to allocate space.

Following figure represents the physical representation of storage group.

Database in detail

A database is a collection of related data tables or entities. For example, a typical database for an organization would consist of a customer, an order, and order details tables. All these tables are related to one another in some way. In this example, customers have orders and orders have order details. Even though each table exists on its own, collectively the tables comprise a database.

Database is a group of logically related Tablespaces and Indexspaces, which in turn contain tables and indexes respectively.

The default database, DSNDB04, is predefined in the DB2 installation process.

Table Space

Data is actually stored in a structure known as a table space.

Each table space correlates to one or more individual physical VSAM data sets in the DASD volumes of storage Group.

Each table space contains one or more tables.

There are three different types of table space:

Simple table space

Segmented table space

Index Space

An index space is the underlying storage structure for index data.

Each index space correlates to one or more individual physical VSAM data sets in the DASD volumes of storage Group.

It is automatically created by DB2 whenever an index is created.

There can only be one index in an index space.

Table

Tables are logical structures maintained by the database manager.

When you store information in your filing cabinet you do not just toss it in a drawer. Rather, you create files within the filing cabinet, and then you file related data in specific files.

In the database world, that file is called a table. A table is a structured file that can store data of a specific type. A table might contain a list of customers, a product catalog, or any other list of information.

Each table has a name, and within a table, each column has a name. No particular ordering is maintained among the rows of a table, but rows can be retrieved in an order determined by values in their columns. The data in a table is logically related. All table data is assigned to table spaces.

A table consists of data logically arranged in columns and rows.

Column

Tables are made up of columns. A column contains a particular piece of information within a table.

Column is a single field in a table. All tables are made up of one or more columns.

The best way to comprehend this is to envision database tables as grids, somewhat like spreadsheets. Each column in the grid contains a particular piece of information. In a customer table, for example, one column contains the customer number, another contains the customer name, and the address, city, state, and zip are all stored in their own columns.

Row

Data in a table is stored in rows.

Row is a record in a table.

Again, envisioning a table as a spreadsheet style grid, the vertical columns in the grid are the table columns, and the horizontal rows are the table rows.

For example, a customer table might store one customer per row. The number of rows in the table is the number of records in it.

Table Terminology

Following table will give the terminology used in table context:

In this Document Formal Terms Many Database Manuals

Table Relation Table

Column Attribute Field

Row Tuple Record

Types of Table

Base Table: A base table is created with the CREATE TABLE statement and is used to hold persistent user data.

Result Table: A result table is a set of rows that the database manager selects or generates from one or more base tables to satisfy a query.

Summary Table: A summary table is a table defined by a query that is also used to determine the data in the table. Summary tables can be used to improve the performance of queries. If the database manager determines that a portion of a query can be resolved using a summary table, the database manager can rewrite the query to use the summary table.

Temporary Table: A declared temporary table is created with a DECLARE GLOBAL TEMPORARY TABLE statement and is used to hold temporary data on behalf of a single application. This table is dropped implicitly when the application disconnects from the database

Index

An index is a data access aid that can be created on a table. It is an ordered set of pointers to rows in a table.

An index can serve the following purposes:

Improve performance. In most cases, access to data is faster with an index. Provides a fast way to find rows in a table based on their values in the key columns.

Enforces uniqueness rules by defining a column or group of columns as a unique index or primary key.

Provides a logical ordering of the rows of a table based on the key column values.

Clusters the rows of a table in physical storage according to the order of the defined index.

View

A view provides a different way of looking at the data in one or more tables.

View is a virtual table consisting of a SQL SELECT statement that accesses data from one or more tables or views.

A view never stores data. When you access a view, the SQL statement that defines it is executed to derive the requested data.

A view has columns and rows just like a base table. All views can be used just like base tables for data retrieval.

You can use views to control access to sensitive data, because views allow multiple users to see different presentations of the same data. For example, several users may be accessing a table of data about employees. A manager sees data about her employees but not employees in another department. A recruitment officer sees the hire dates of all employees, but not their salaries; a financial officer sees the salaries, but not the hire dates. Each of these users work with a view derived from the base table. Each view appears to be a table and has its own name.

When the column of a view is directly derived from the column of a base table, that view column inherits any constraints that apply to the base table column. For example, if a view includes a foreign key of its base table, insert and update operations using that view are subject to the same referential constraints as is the base table. Also, if the base table of a view is a parent table, delete and update operations using that view are subject to the same rules as are delete and update operations on the base table.

A view can become inoperative (for example, if the base table is dropped); if this occurs, the view is no longer available for SQL operations.

The best way to recognize views is to look at an example. (You will study about the SQLs in detail further. This example is just to make you understand the concept of views).

You have following three tables:

Customers

Orders

OrderDetails

You need to retrieve the customers who had ordered a specific product.

The column details are as follows.

Table Column

AND OrderDetails.order_num = Orders.order_num AND prod_id = 'RGAN01';

Anyone needing this data would have to understand the table structure, as well as how to create the query and join the tables. To retrieve the same data for another product (or for multiple products), the last WHERE clause would have to be modified.

Now imagine that you could wrap that entire query in a virtual table called ProductCustomers. You could then simply do the following to retrieve the same data:

SELECT cust_name, cust_contact FROM ProductCustomers

WHERE prod_id = 'RGAN01';

This is where views come into play. ProductCustomers is a view, and as a view, it does not contain any columns or data. Instead it contains a query—the same query used above to join the tables properly.

Why Use Views:

Here are some common uses of views:

To draw data from multiple tables.

To reuse SQL statements.

To simplify complex SQL operations. After the query is written, it can be reused easily, without having to know the details of the underlying query itself.

To expose parts of a table instead of complete tables.

To secure data. Users can be given access to specific subsets of tables instead of to

To change data formatting and representation. Views can return data formatted and presented differently from their underlying tables.

For the most part, after views are created, they can be used in the same way as tables. You can perform SELECT operations, filter and sort data, join views to other views or tables, and possibly even add and update data. There are some restrictions on this last item. The important thing to remember is views are just that, views into data stored elsewhere. Views contain no data

themselves, so the data they return is retrieved from other tables. When data is added or changed in those tables, the views will return that changed data.

There are some performance issues with views because views contain no data, any retrieval needed to execute a query and that must be processed every time the view is used. If you create complex views with multiple joins and filters, or if you nest views, you may find that performance is dramatically degraded. Be sure you test execution before deploying applications that use views extensively.

Synonym

1. An alternative, private name for a table or view.

2. A synonym can be used only by the individual who creates it.

When a table or view is dropped, all synonyms defined on it are also dropped Alias

1. A locally defined name for a table or view in the same local DB2 subsystem or in a remote DB2 subsystem. Aliases give DB2 location independence because an alias can be created for a table at a remote site, thereby freeing the user from specifying the site that contains the data. Aliases can be used also as a type of global synonym because they can be accessed by anyone, not only by their creator.

2. When a table/view is dropped, all aliases defined on it are NOT dropped

3. Use Synonyms for Program Development, use Aliases for Distributed Applications and use Views for security and joining.

Following table gives the difference between “Synonym” and “Alias”

Synonym Alias

An alternative name for a table or view which should reside in the local DB2 subsystem

An alternative name for a table or view which can reside in the local or remote DB2 subsystem Can be used only by its creator Can be used by anyone including its creator Is dropped when it’s corresponding Table/View is

dropped

Not dropped even when it’s corresponding Table/View is dropped

Physical Storage of Data

The following figure represents how the data is physically stored in DB2 system.

Hierarchy of DB2 Objects

Following figure represents the hierarchy of DB2 objects:

Buffer Pool

Buffer pools are areas of virtual storage in which DB2 temporarily stores pages of table spaces or indexes.

Buffer pools improve database performance. If a needed page of data is already in the buffer pool, that page is accessed faster than if that page had to be read directly from disk. The database manager has agents whose tasks are to retrieve data pages from disk and place them in the buffer pool (pre-fetchers), and to write modified data pages from the buffer pool back to disk (page cleaners).

The reading and writing of data pages to and from disk is called disk input/output (I/O). Avoiding the wait associated with disk I/O is the primary way to improve the performance of the database.

How you create the buffer pool, and configure the database manager and the agents associated with the buffer pool, controls the performance of the database.

In DB2, you have the option of allocating 80 Buffer Pools:

50 4K buffer pools and

10 8K, 16K and 32K buffer pools.

The default buffer pool is BP0.

Summary

Storage Group is a set of volumes on DASD which contains VSAM datasets.

Database is a collection of related data elements.

Table space represents one or more tables which correlate to one or more VSAM data sets in the DASD volumes of storage Group.

Index Space represents an index correlates to one or more VSAM data sets in the DASD volumes of storage Group.

Table consists of data logically arranged in columns and rows.

Index is an ordered set of pointers to rows in a base table.

View is a virtual table that accesses data from one or more tables or views.

Synonym is an alternative name for a table or view which should reside in the local DB2 subsystem.

Alias is an alternative name for a table or view, which can reside in the local or remote DB2 subsystem.

Buffer Pools are areas of virtual storage in which DB2 temporarily stores pages of table spaces or indexes.

Test Your Understanding 1. What is a storage group?

2. What is a database?

3. What is a table space?

4. What is an index space?

5. What is a table?

6. What is an index?

7. What is a View?

8. What is the difference between Synonym and Alias?

9. State the logical and physical objects among the DB2 objects and their hierarchy.

10. How the data is being stored physically?

11. What is a buffer pool?

Session 03: Database Design

Learning Objectives

After completing the session, you will be able to:

Design a DB2 database

Why Should You Be Concerned with Database Design?

You will look at the concept of database design with the real life example.

Say your database is like a custom home and that you are going to have one built for us. What is the first thing you are going to do? Certainly you are not going to hire a contractor immediately and let him build our home however he wishes. Surely you will first engage an architect to design your new home and then hire a contractor to build it. The architect will express your needs as a set of blueprints, recording decisions about size and shape, and requirements for various systems (structural, mechanical, electrical). Next the contractor will procure the labor and materials, including the listed systems, and then assemble them according to the drawings and specifications.

Now let’s return to your database perspective and think of the logical database design as the architectural blueprints and the physical database implementation as the completed home. The logical database design describes the size, shape, and necessary systems for a database; it addresses the informational needs and operational needs of your business. You then build the physical implementation of the logical database design, using your RDBMS software program.

Once you have created your tables, set up table relationships, and established the appropriate levels of data integrity, our database is complete. Now you are ready to create an application that allows interacting easily with the data stored in the database and you can be confident that these applications will provide with timely and above all, accurate information.

It is possible to implement a poor design in an RDBMS, but a well designed database will yield accurate information, store data more efficiently and effectively, and will be easier to manage and maintain.

If a database is designed improperly, users will have difficulty in retrieving certain types of information, and there is the added risk that searches will produce inaccurate information.

Inaccurate information is probably the most detrimental result of improper database design. It can adversely affect the bottom line of a business. In fact, if the data kept and used in a database is going to affect the way a business performs its day-to-day operations or if it's going to influence the future direction of the business, database design must be a concern.

Activities involved in Database Design Designing the database involves:

Logical Database Design

Physical Database Design

Implementing and Altering Database Design

Logical Database Design

Logical database design involves three phases:

Requirements analysis phase

Data modeling phase

Normalization phase

Requirements Analysis Phase

The requirements analysis phase involves examining the business being modeled, interviewing users and management to assess the current system and to analyze future needs, and

determining information requirements for the business as a whole. This process is relatively straightforward.

Data Modeling Phase

The data modeling phase involves modeling the database structure itself. This involves using a

In document DB2 Handout v1.0 (Page 12-40)