• No results found

DB2 Handout v1.0

N/A
N/A
Protected

Academic year: 2021

Share "DB2 Handout v1.0"

Copied!
228
0
0

Loading.... (view fulltext now)

Full text

(1)

Handout: DB2

Version: DB2/Handout/0308/1.0

Date: 20-03-08

Cognizant 500 Glen Pointe Center West Teaneck, NJ 07666 Ph: 201-801-0233 www.cognizant.com

(2)

TABLE OF CONTENTS

Introduction ... 11 

About this Module ... 11 

Target Audience ... 11 

Module Objectives ... 11 

Pre-requisite ... 11 

Session 02: Introduction to DB2 Objects ... 12 

Learning Objectives ... 12  Database ... 12  DBMS ... 12  Database Model ... 12  RDBMS ... 13  DB2 ... 13  Storage Group ... 13  Database in detail ... 14  Table Space ... 14  Index Space ... 15  Table ... 15  Column ... 15  Row ... 15  Table Terminology ... 16  Types of Table ... 16  Index ... 16  View ... 17  Synonym ... 19  Alias ... 19 

Physical Storage of Data ... 20 

Hierarchy of DB2 Objects ... 20 

Buffer Pool ... 21 

Summary ... 21 

(3)

Activities involved in Database Design ... 23 

Logical Database Design ... 24 

Physical database design ... 36 

Implementing and Altering Database Design ... 37 

Advantage of DB2 over VSAM ... 37 

Summary ... 38 

Test Your Understanding ... 38 

Exercises ... 39 

Session 07: Data Integrity... 40 

Learning Objectives ... 40 

Introduction to Data Integrity ... 40 

Entity Integrity ... 40 

Referential Integrity ... 41 

Rules ensuring RI: ... 41 

Domain Integrity ... 42 

Summary ... 45 

Test Your Understanding ... 46 

Session 08: Interaction with DB2 ... 47 

Learning Objectives ... 47 

Interaction with DB2: Overview ... 47 

DB2I ... 48 

SPUFI ... 49 

QMF ... 53 

Summary ... 58 

Test Your Understanding ... 59 

Session 09: SQL ... 60 

Learning Objectives ... 60 

SQL: Overview ... 60 

DDL ... 61 

DDL – Create Storage Group ... 61 

DDL – Alter Storage Group ... 61 

Try It Out ... 62 

DDL – Create Database ... 62 

Try It Out ... 63 

DDL – Alter Database ... 63 

(4)

Try It Out ... 66 

DDL – Alter Tablespace ... 67 

Summary ... 68 

Test Your Understanding ... 68 

Session 10: SQL ... 69  Learning Objectives ... 69  DDL – Create Table ... 69  Try It Out ... 71  Try It Out ... 72  Try It Out ... 73  DDL – Alter Table ... 73  Try It Out ... 74  DDL – Create Index ... 74  DDL – Alter Index ... 78  Try It Out ... 79  DDL – Create View ... 80  DDL – Alter View ... 80  DDL – Create Alias ... 80  DDL – Create Synonym ... 80  DDL – DROP ... 81  Summary ... 81 

Test Your Understanding ... 82 

Exercises ... 82  Session 11: SQL ... 83  Learning Objectives ... 83  DML – Insert ... 83  DML – Update ... 84  DML – Delete ... 84  DML – Simple Retrieval ... 85 

DML – Retrieving Multiple Columns ... 85 

DML – Retrieving All Columns ... 85 

DML – Sorting Retrieved data ... 85 

DML – Sorting by Multiple Columns ... 86 

DML – Sorting by Column Position... 86 

(5)

DML – Advanced Filtering ... 88 

DML – Wildcard Filtering ... 89 

DML – Removing Duplicates ... 90 

DML – Concatenating Fields ... 90 

DML – Using Aliases ... 90 

DML – Performing Mathematical Calculations ... 91 

DML – Functions... 91 

DML – Aggregate Functions ... 91 

DML – Text Functions ... 92 

DML – Date and Time Manipulation Functions ... 92 

DML – Numeric Functions ... 93 

Summary ... 93 

Test Your Understanding ... 94 

Exercises ... 94 

Session 12: SQL ... 96 

Learning Objectives ... 96 

DML – Grouping Data ... 96 

DML – Filtering Groups ... 97 

DML – Grouping and Sorting ... 97 

DML – Subquery ... 97  DML – Join ...100  DML – Union ...102  DCL ...103  DCL - GRANT ...103  DCL - REVOKE ...103  Summary ...104 

Test Your Understanding ...104 

Exercises ...104 

Session 21: Introduction to Simple COBOL DB2 Application Program ...106 

Learning Objectives ...106  Embedded SQL ...106  DCLGEN ...106  Host Variables ...111  SQLCA ...112  SQLCODE ...113  Try It Out ...113  Summary ...115 

(6)

Test Your Understanding ...115 

Exercises ...115 

Session 26: Program Preparation and Execution ...116 

Learning Objectives ...116 

Steps involved in Program Preparation ...116 

Need for Precompilation ...116 

Tasks of Precompiler ...116 

Bind ...118 

Plan ...119 

Need of Package ...119 

Package ...120 

Understanding Plan and Package more clearly ...120 

Version ...120 

Advantage of Version ...120 

Advantages of Package ...120 

Collection ...121 

Advantages of Collection ...121 

Packages and Plans Storage ...125 

Compile ...125 

Link Edit ...126 

Execution ...127 

Try It Out ...128 

Summary ...131 

Test Your Understanding ...132 

Exercises ...132 

Session 31: Error Handling ...133 

Learning Objectives ...133 

Error Handling - Introduction ...133 

SQLCA ...133 

DSNTIAR ...134 

WHENEVER ...135 

Try It Out ...136 

Summary ...139 

Test Your Understanding ...139 

(7)

Learning Objectives ...140 

COMMIT ...140 

Rollback ...141 

Try It Out ...141 

Summary ...144 

Test Your Understanding ...144 

Session 35: Cursor ...145  Learning Objectives ...145  Cursor - Introduction ...145  Cursor Processing ...145  Cursor Declaration ...146  Cursor Open ...146  Cursor Fetch ...146  Cursor Close ...147 

Using Cursor for Data Modification ...147 

Try It Out ...148 

Summary ...151 

Test Your Understanding ...151 

Exercises ...151 

Session 42: Handling Null ...152 

Learning Objectives ...152 

Null - Introduction...152 

Null Indicator ...152 

Try It Out ...153 

Summary ...161 

Test Your Understanding ...161 

Exercises ...161 

Session 46: Handling VARCHAR ...162 

Learning Objectives ...162 

VARCHAR - Introduction ...162 

VARCHAR - Behavior ...162 

Try It Out ...163 

Summary ...171 

Test Your Understanding ...171 

(8)

Session 51: Locks ...172 

Learning Objectives ...172 

Lock - Introduction ...172 

Table Space - Recap ...172 

Simple Table Space ...172 

Segmented Table Space ...173 

Partitioned Table Space ...173 

Lock Size ...173 

Lock Escalation ...174 

Lock Duration ...174 

Summary ...175 

Test Your Understanding ...176 

Session 52: Locks ...177  Learning Objectives ...177  Lock Mode ...177  Suspension ...179  Timeout ...179  Deadlock ...179 

Constructs that affect locking ...180 

Summary ...181 

Test Your Understanding ...181 

Session 54: Miscellaneous ...182 

Learning Objectives ...182 

Bind Parameters ...182 

Common SQL Errors ...183 

Application Programming Guidelines ...184 

Using EXPLAIN ...184 

Summary ...185 

Test Your Understanding ...185 

Session 55: Miscellaneous ...186 

Learning Objectives ...186 

Table-Based Infrastructure of DB2 ...186 

DB2 Catalog ...186 

(9)

Session 56: Miscellaneous ...191 

Learning Objectives ...191 

Utilities - Introduction ...191 

Data Consistency Utilities ...191 

CHECK Utility ...191 

REPAIR Utility ...192 

REPORT Utility ...195 

DIAGNOSE Utility ...195 

Backup and Recovery Utilities ...196 

COPY Utility ...196 

MERGECOPY Utility ...197 

QUIESCE Utility ...198 

RECOVER Utility ...199 

REBUILD Utility ...200 

REPORT RECOVERY Utility ...201 

Summary ...201 

Test Your Understanding ...201 

Session 57: Miscellaneous ...202 

Learning Objectives ...202 

Data Organization Utilities ...202 

LOAD Utility ...202 

REORG Utility ...203 

Catalog Manipulation Utilities ...204 

CATMAINT Utility ...204  MODIFY Utility ...204  RUNSTATS Utility ...204  STOSPACE Utility ...205  DB2 Commands ...206  Summary ...209 

Test Your Understanding ...209 

Session 58: Miscellaneous ...210 

Learning Objectives ...210 

Dynamic SQL - Introduction ...210 

Dynamic SQL - Types ...210 

When to use Dynamic SQL ...210 

Execute Immediate SQL ...210 

(10)

Parameter marker ...213 

Fixed-list select ...213 

Varying-list select SQL ...214 

Summary ...214 

Test Your Understanding ...214 

Session 59: Miscellaneous ...215 

Learning Objectives ...215 

Stored Procedure - Introduction ...215 

Stored Procedure - Development ...215 

Creating Stored Procedures ...216 

Managing Stored Procedures ...217 

Executing a Stored Procedures ...217 

Summary ...218 

Test Your Understanding ...218 

Glossary ...219 

References ...227 

Websites ...227 

Books ...227 

(11)

Introduction

About this Module

This module deals with the fundamentals of DB2. Target Audience

This module is specifically aimed at entry level trainees. Module Objectives

After completing this module, you will be able to:

‰ Explain the introduction to DB2 Objects

‰ Explain Database Design

‰ Describe Data Integrity

‰ Identify the Interaction with DB2

‰ Describe SQL

‰ Explain the introduction of simple COBOL DB2 application program

‰ Identify Error Handling

‰ Explain Commit and Rollback

‰ Explain Cursor

‰ Handle NULL

‰ Handle VARCHAR

‰ Define Locks

‰ State Application Programming Guidelines

Pre-requisite

To follow the course successfully, a trainee needs to have a prior theoretical as well as hands-on knowledge of the following:

‰ TSO/ISPF

‰ COBOL

‰ JCL

(12)

Session 02: Introduction to DB2 Objects

Learning Objectives

After completing the session, you will be able to:

‰ Identify the different DB2 objects

‰ Explain the relation between DB2 objects

‰ Describe how the data is being stored in DB2 physically

Database

A database is a collection of data stored in some organized fashion. The simplest way to think of it is to imagine a database as a filing cabinet. The filing cabinet is simply a physical location to store data, regardless of what that data is or how it is organized.

You will see about Database in detail in the later part of this session.

Database: A container (usually a file or set of files) to store organized data.

DBMS

A Database Management System (DBMS) is a set of programs designed for the purpose of managing databases.

Typical examples of DBMS include:

‰ DB2

‰ Oracle

‰ Microsoft Access

‰ Microsoft SQL Server

Database Model

A database model refers to the way a DBMS organizes information internally. It is a theory or specification, describing how a database is structured and used.

Common models include the following:

‰ Hierarchical model

‰ Network model

‰ Relational model

(13)

RDBMS

Relational Database Management System (RDBMS) is a type of database management system (DBMS) that stores data in the form of related tables.

Most popular RDBMS includes:

‰ DB2

‰ Oracle

‰ Microsoft SQL Server

DB2

DB2 or Database 2 is a Relational Database Management System offered by IBM that runs on IBM Mainframe, AS/400 and on PC. A DB2 database can grow from a small single-user application to a large multi-user system.

Storage Group

A DB2 storage group (STOGROUP) is a set of volumes on direct access storage devices (DASD). The volumes hold the VSAM data sets in which tables and indexes are actually stored.

Max no of volumes per Storage group is 133 (Ideally 3 or 4). All volumes of a given storage group must have the same device type (3380, 3390, and so no.). But, parts of a single database can be stored in different storage groups. If the volumes in a storage group are different types or if any volume is not mounted or is otherwise invalid, then an error will occur when you try to create a table space or index. Try to assign frequently accessed objects (indexes, for instance) to fast devices and seldom-used tables to slower devices; that choice of storage groups improves performance.

After you define a storage group, DB2 stores information about it in the DB2 catalog. (This catalog is not the same as the integrated catalog facility catalog that describes DB2 VSAM data sets). The catalog table SYSIBM.SYSSTOGROUP has a row for each storage group and

SYSIBM.SYSVOLUMES has a row for each volume.

At installation, the system default storage group is defined. This storage group is named

SYSDEFLT. If you do not explicitly manage your storage, then DB2 uses the default storage group to allocate space.

(14)

Following figure represents the physical representation of storage group.

Database in detail

A database is a collection of related data tables or entities. For example, a typical database for an organization would consist of a customer, an order, and order details tables. All these tables are related to one another in some way. In this example, customers have orders and orders have order details. Even though each table exists on its own, collectively the tables comprise a database.

Database is a group of logically related Tablespaces and Indexspaces, which in turn contain tables and indexes respectively.

The default database, DSNDB04, is predefined in the DB2 installation process. Table Space

Data is actually stored in a structure known as a table space.

Each table space correlates to one or more individual physical VSAM data sets in the DASD volumes of storage Group.

Each table space contains one or more tables. There are three different types of table space:

‰ Simple table space

(15)

Index Space

An index space is the underlying storage structure for index data.

Each index space correlates to one or more individual physical VSAM data sets in the DASD volumes of storage Group.

It is automatically created by DB2 whenever an index is created. There can only be one index in an index space.

Table

Tables are logical structures maintained by the database manager.

When you store information in your filing cabinet you do not just toss it in a drawer. Rather, you create files within the filing cabinet, and then you file related data in specific files.

In the database world, that file is called a table. A table is a structured file that can store data of a specific type. A table might contain a list of customers, a product catalog, or any other list of information.

Each table has a name, and within a table, each column has a name. No particular ordering is maintained among the rows of a table, but rows can be retrieved in an order determined by values in their columns. The data in a table is logically related. All table data is assigned to table spaces. A table consists of data logically arranged in columns and rows.

Column

Tables are made up of columns. A column contains a particular piece of information within a table. Column is a single field in a table. All tables are made up of one or more columns.

The best way to comprehend this is to envision database tables as grids, somewhat like spreadsheets. Each column in the grid contains a particular piece of information. In a customer table, for example, one column contains the customer number, another contains the customer name, and the address, city, state, and zip are all stored in their own columns.

Row

‰ Data in a table is stored in rows.

‰ Row is a record in a table.

‰ Again, envisioning a table as a spreadsheet style grid, the vertical columns in the grid are the table columns, and the horizontal rows are the table rows.

(16)

For example, a customer table might store one customer per row. The number of rows in the table is the number of records in it.

Table Terminology

Following table will give the terminology used in table context:

In this Document Formal Terms Many Database Manuals

Table Relation Table

Column Attribute Field

Row Tuple Record

Types of Table

Base Table: A base table is created with the CREATE TABLE statement and is used to hold

persistent user data.

Result Table: A result table is a set of rows that the database manager selects or generates from

one or more base tables to satisfy a query.

Summary Table: A summary table is a table defined by a query that is also used to determine the

data in the table. Summary tables can be used to improve the performance of queries. If the database manager determines that a portion of a query can be resolved using a summary table, the database manager can rewrite the query to use the summary table.

Temporary Table: A declared temporary table is created with a DECLARE GLOBAL TEMPORARY

TABLE statement and is used to hold temporary data on behalf of a single application. This table is dropped implicitly when the application disconnects from the database

Index

An index is a data access aid that can be created on a table. It is an ordered set of pointers to rows in a table.

(17)

An index can serve the following purposes:

‰ Improve performance. In most cases, access to data is faster with an index. Provides a fast way to find rows in a table based on their values in the key columns.

‰ Enforces uniqueness rules by defining a column or group of columns as a unique index or primary key.

‰ Provides a logical ordering of the rows of a table based on the key column values.

‰ Clusters the rows of a table in physical storage according to the order of the defined index.

View

A view provides a different way of looking at the data in one or more tables.

View is a virtual table consisting of a SQL SELECT statement that accesses data from one or more tables or views.

A view never stores data. When you access a view, the SQL statement that defines it is executed to derive the requested data.

A view has columns and rows just like a base table. All views can be used just like base tables for data retrieval.

You can use views to control access to sensitive data, because views allow multiple users to see different presentations of the same data. For example, several users may be accessing a table of data about employees. A manager sees data about her employees but not employees in another department. A recruitment officer sees the hire dates of all employees, but not their salaries; a financial officer sees the salaries, but not the hire dates. Each of these users work with a view derived from the base table. Each view appears to be a table and has its own name.

When the column of a view is directly derived from the column of a base table, that view column inherits any constraints that apply to the base table column. For example, if a view includes a foreign key of its base table, insert and update operations using that view are subject to the same referential constraints as is the base table. Also, if the base table of a view is a parent table, delete and update operations using that view are subject to the same rules as are delete and update operations on the base table.

A view can become inoperative (for example, if the base table is dropped); if this occurs, the view is no longer available for SQL operations.

The best way to recognize views is to look at an example. (You will study about the SQLs in detail further. This example is just to make you understand the concept of views).

You have following three tables:

‰ Customers

‰ Orders

(18)

You need to retrieve the customers who had ordered a specific product. The column details are as follows.

Table Column Customers cust_id cust_name cust_contact Orders order_num cust_id OrderDetails order_num prod_id

The query is as follows:

SELECT cust_name, cust_contact

FROM Customers, Orders, OrderDetails WHERE Customers.cust_id = Orders.cust_id

AND OrderDetails.order_num = Orders.order_num AND prod_id = 'RGAN01';

Anyone needing this data would have to understand the table structure, as well as how to create the query and join the tables. To retrieve the same data for another product (or for multiple products), the last WHERE clause would have to be modified.

Now imagine that you could wrap that entire query in a virtual table called ProductCustomers. You could then simply do the following to retrieve the same data:

SELECT cust_name, cust_contact FROM ProductCustomers

WHERE prod_id = 'RGAN01';

This is where views come into play. ProductCustomers is a view, and as a view, it does not contain any columns or data. Instead it contains a query—the same query used above to join the tables properly.

Why Use Views:

Here are some common uses of views:

‰ To draw data from multiple tables.

‰ To reuse SQL statements.

‰ To simplify complex SQL operations. After the query is written, it can be reused easily, without having to know the details of the underlying query itself.

‰ To expose parts of a table instead of complete tables.

(19)

‰ To change data formatting and representation. Views can return data formatted and presented differently from their underlying tables.

For the most part, after views are created, they can be used in the same way as tables. You can perform SELECT operations, filter and sort data, join views to other views or tables, and possibly even add and update data. There are some restrictions on this last item. The important thing to remember is views are just that, views into data stored elsewhere. Views contain no data

themselves, so the data they return is retrieved from other tables. When data is added or changed in those tables, the views will return that changed data.

There are some performance issues with views because views contain no data, any retrieval needed to execute a query and that must be processed every time the view is used. If you create complex views with multiple joins and filters, or if you nest views, you may find that performance is dramatically degraded. Be sure you test execution before deploying applications that use views extensively.

Synonym

1. An alternative, private name for a table or view.

2. A synonym can be used only by the individual who creates it. When a table or view is dropped, all synonyms defined on it are also dropped Alias

1. A locally defined name for a table or view in the same local DB2 subsystem or in a remote DB2 subsystem. Aliases give DB2 location independence because an alias can be created for a table at a remote site, thereby freeing the user from specifying the site that contains the data. Aliases can be used also as a type of global synonym because they can be accessed by anyone, not only by their creator.

2. When a table/view is dropped, all aliases defined on it are NOT dropped

3. Use Synonyms for Program Development, use Aliases for Distributed Applications and use Views for security and joining.

Following table gives the difference between “Synonym” and “Alias”

Synonym Alias

An alternative name for a table or view which should reside in the local DB2 subsystem

An alternative name for a table or view which can reside in the local or remote DB2 subsystem Can be used only by its creator Can be used by anyone including its creator Is dropped when it’s corresponding Table/View is

dropped

Not dropped even when it’s corresponding Table/View is dropped

(20)

Physical Storage of Data

The following figure represents how the data is physically stored in DB2 system.

Hierarchy of DB2 Objects

(21)

Buffer Pool

Buffer pools are areas of virtual storage in which DB2 temporarily stores pages of table spaces or indexes.

Buffer pools improve database performance. If a needed page of data is already in the buffer pool, that page is accessed faster than if that page had to be read directly from disk. The database manager has agents whose tasks are to retrieve data pages from disk and place them in the buffer pool (pre-fetchers), and to write modified data pages from the buffer pool back to disk (page cleaners).

The reading and writing of data pages to and from disk is called disk input/output (I/O). Avoiding the wait associated with disk I/O is the primary way to improve the performance of the database. How you create the buffer pool, and configure the database manager and the agents associated with the buffer pool, controls the performance of the database.

In DB2, you have the option of allocating 80 Buffer Pools:

‰ 50 4K buffer pools and

‰ 10 8K, 16K and 32K buffer pools.

The default buffer pool is BP0. Summary

‰ Storage Group is a set of volumes on DASD which contains VSAM datasets.

‰ Database is a collection of related data elements.

‰ Table space represents one or more tables which correlate to one or more VSAM data sets in the DASD volumes of storage Group.

‰ Index Space represents an index correlates to one or more VSAM data sets in the DASD volumes of storage Group.

‰ Table consists of data logically arranged in columns and rows.

‰ Index is an ordered set of pointers to rows in a base table.

‰ View is a virtual table that accesses data from one or more tables or views.

‰ Synonym is an alternative name for a table or view which should reside in the local DB2 subsystem.

‰ Alias is an alternative name for a table or view, which can reside in the local or remote DB2 subsystem.

‰ Buffer Pools are areas of virtual storage in which DB2 temporarily stores pages of table spaces or indexes.

(22)

Test Your Understanding 1. What is a storage group? 2. What is a database? 3. What is a table space? 4. What is an index space? 5. What is a table?

6. What is an index? 7. What is a View?

8. What is the difference between Synonym and Alias?

9. State the logical and physical objects among the DB2 objects and their hierarchy. 10. How the data is being stored physically?

(23)

Session 03: Database Design

Learning Objectives

After completing the session, you will be able to:

‰ Design a DB2 database

Why Should You Be Concerned with Database Design?

You will look at the concept of database design with the real life example.

Say your database is like a custom home and that you are going to have one built for us. What is the first thing you are going to do? Certainly you are not going to hire a contractor immediately and let him build our home however he wishes. Surely you will first engage an architect to design your new home and then hire a contractor to build it. The architect will express your needs as a set of blueprints, recording decisions about size and shape, and requirements for various systems (structural, mechanical, electrical). Next the contractor will procure the labor and materials, including the listed systems, and then assemble them according to the drawings and specifications.

Now let’s return to your database perspective and think of the logical database design as the architectural blueprints and the physical database implementation as the completed home. The logical database design describes the size, shape, and necessary systems for a database; it addresses the informational needs and operational needs of your business. You then build the physical implementation of the logical database design, using your RDBMS software program. Once you have created your tables, set up table relationships, and established the appropriate levels of data integrity, our database is complete. Now you are ready to create an application that allows interacting easily with the data stored in the database and you can be confident that these applications will provide with timely and above all, accurate information.

It is possible to implement a poor design in an RDBMS, but a well designed database will yield accurate information, store data more efficiently and effectively, and will be easier to manage and maintain.

If a database is designed improperly, users will have difficulty in retrieving certain types of information, and there is the added risk that searches will produce inaccurate information.

Inaccurate information is probably the most detrimental result of improper database design. It can adversely affect the bottom line of a business. In fact, if the data kept and used in a database is going to affect the way a business performs its day-to-day operations or if it's going to influence the future direction of the business, database design must be a concern.

Activities involved in Database Design Designing the database involves:

‰ Logical Database Design

(24)

‰ Implementing and Altering Database Design

Logical Database Design

Logical database design involves three phases:

‰ Requirements analysis phase

‰ Data modeling phase

‰ Normalization phase

Requirements Analysis Phase

The requirements analysis phase involves examining the business being modeled, interviewing users and management to assess the current system and to analyze future needs, and

determining information requirements for the business as a whole. This process is relatively straightforward.

Data Modeling Phase

The data modeling phase involves modeling the database structure itself. This involves using a data modeling method which provides a means of visually representing various aspects of the database structure, such as the tables, table relationships, and relationship characteristics. Some of the common data modeling methods are:

‰ Entity Relationship modeling (ER modeling)

‰ Semantic object modeling

‰ Object role modeling

The method which you use here is a basic version of ER modeling.

ER Modeling

A simple ER Diagram looks as follows.

This figure represents several aspects of the database. First, it conveys the fact that there are two tables in this database, one called Agents and the other called Clients; each of the tables is represented by a rectangle. The diamond represents the fact that there is a relationship between these two tables, and the "1:N" within the diamond indicates that the relationship is a one-to-many relationship. Finally, the diagram conveys the fact that a client must be associated with an agent (indicated by the vertical line next to the AGENTS table), but an agent does not necessarily have to be associated with a client (as indicated by the circle next to the CLIENTS table).

(25)

Symbols used in ER Diagram: ‰ Box represents Entity

‰ Diamond represents Relationship

‰ Oval represents Attribute

Key Activities in ER Modeling:

Following are the key activities involved in designing Logical database using Entity-Relationship Mode:

‰ Define Entities

‰ Define Primary Key

‰ Define Relationships among Entities

‰ Define Additional Attributes for the Entities

Define Entities:

‰ You begin the ER Model, by defining the entities, the significant objects of interest

‰ Entities are the things about which you want to store information.

Define Primary Key:

‰ A primary key is a unique identifier for an Entity.

‰ If a primary key is made up of more than one attribute, then it is called as “Composite Key”.

Define Relationships among Entities

A connection established between a pair of tables is known as a relationship. A relationship exists when a pair of tables is connected by a Primary key and a foreign key or is linked together by a third table, known as a linking table.

Relationships are very important because they help to reduce redundant data and duplicate data. They also provide the means to define views.

Every relationship can be characterized by the type of relationship that exists between the tables, the type of participation each table has within the relationship, and the degree of participation each table has within the relationship.

Types of Relationships (Cardinality):

When two tables are related, there is always a specific type of relationship (traditionally known as cardinality) that exists between them. There are three possible types of relationships:

‰ one-to-one

‰ one-to-many and many-to-one relationships

(26)

One-to-One Relationships:

A one-to-one relationship exists between a pair of tables if a single record in the first table is related to only one record in the second table and a single record in the second table is related to only one record in the first table.

Following figure shows an example of a one-to-one relationship involving an EMPLOYEES table and a COMPENSATION table. In this example, a single record in the EMPLOYEES table is related to only one record in the COMPENSATION table; likewise, a single record in the COMPENSATION table is related to only one record in the EMPLOYEES table.

One-to-Many Relationships:

A one-to-many relationship exists between a pair of tables if a single record in the first table can be related to one or more records in the second table, but a single record in the second table can be related to only one record in the first table.

A one-to-many relationship involving a STUDENTS table and an INSTRUMENTS table is shown in the following figure. In this case, a single record in the STUDENTS table can be related to one or more records in the INSTRUMENTS table, but a single record in the INSTRUMENTS table is related to only one record in the STUDENTS table.

(27)

Many-to-One Relationships

A many-to-one relationship exists between a pair of tables if a single record in the first table can be related to only one record in the second table, but a single record in the second table can be related to one or many records in the first table.

A many-to-one relationship involving an EMPLOYEE table and a DEPARTMENT table is shown in the following figure. In this example, a single record in the EMPLOYEE table can be related to only one record in the DEPARTMENT table and a single record in the DEPARTMENT table can be related to one or more records in the EMPLOYEE table.

Following figure represents many-to-one relationship. Here a relationship is established between two tables with the help of a linking table.

(28)

Many-to-Many Relationships:

A many-to-many relationship exists between a pair of tables if a single record in the first table can be related to one or more records in the second table, and a single record in the second table can be related to one or more records in the first table.

Following figure shows a classic many-to-many relationship. In this example, a single record in the STUDENTS table can be related to one or more records in the CLASSES table; likewise, a single record in the CLASSES table can be related to one or more records in the STUDENTS table.

Following figure represents many-to-many relationship. Here a relationship is established between two tables with the help of a linking table.

Types of Participation (Optionality):

There are two types of participation that a table can have within a relationship:

‰ Mandatory

‰ Optional

Say there is a relationship between two tables called TABLE A and TABLE B.

If records in TABLE A must exist before any new records can be entered into TABLE B, TABLE A's participation within the relationship is mandatory.

However, if it is not necessary for records in TABLE A to exist in order to enter any new records into TABLE B, TABLE A's participation within the relationship is optional.

(29)

Consider the relationship between the AGENTS and CLIENTS tables in the following figure. The AGENTS table has a mandatory participation within the relationship if agents must exist before a new client can be entered into the CLIENTS table. But the AGENTS table's participation is optional if it isn't necessary to have agents in the AGENTS table before a new client can be entered into the CLIENTS table. The type of participation established for the AGENTS table is determined by the way its data is being used in relation to the data in the CLIENTS table. For example, if it is necessary to ensure that each client is assigned an available agent, then the participation of the AGENTS table within the relationship should be mandatory.

Representation of Cardinality and Optionality:

In general the following conventions are being used for representing cardinality and optionality,

Cardinality:

‰ The notion of cardinality is expressed as either "one" or "many"

‰ A cardinality of “one” is expressed as a “straight line” and a cardinality of “many” is expressed using “crow's feet”.

Optionality:

‰ The notion of optionality is expressed as either "mandatory" or "optional"

‰ An optionality of “Optional” is expressed as a “circle” and an optionality of “Mandatory” is expressed as a “vertical bar”.

Sample ER Diagram:

Consider the example of a database that contains information on the residents of the cities. The ER Diagram contains two Entities – Person and City.

Optionality: A person should live in a city. (That is why a bar appears adjacent to City from

(30)

Define Additional Attributes for the Entities:

Defining the attributes for an entity includes the following activities.

‰ Defining attribute name.

‰ Defining Data Type for the attributes.

‰ Defining appropriate values for the attributes - what values are acceptable for the various attributes of a table.

Normalization: Normalization is a design approach that minimizes data redundancy and optimizes

data structures by systematically and properly placing data elements into the appropriate groupings. Normalization is the process of efficiently organizing data in a database and is the process of decomposing large tables into smaller tables.

Goals of Normalization: There are two goals of the normalization process:

‰ Eliminating redundant data (for example, storing the same data in more than one table)

‰ Ensuring data dependencies make sense (only storing related data in a table).

Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored and avoid problems with inserting, updating, or deleting data.

Normal Form: A series of guidelines were developed by the database community for ensuring that

the databases are normalized. Those guidelines are represented by different normal forms.

Types of Normal forms are as follows: ‰ First Normal Form or 1NF

‰ Second normal Form or 2NF

‰ Third Normal Form or 3NF

(31)

In practical applications, in general, you use only 1NF, 2NF, and 3NF.

First Normal Form:

‰ All entities must have a unique identifier or key, that can be composed of one or more attributes.

‰ Eliminate repeating groups and non-atomic data from an entity.

The term atomic derives from atom, the smallest indivisible particle that can exist on its own. First normal form eliminates repeating groups and non-atomic data from an entity.

To normalize a data model into 1NF, eliminate repeating groups into individual entities. In other words, do not use multiple attributes in a single entity to store similar data. Consider the sample data shown in table for a STUDENT information system for a college or university.

This data contains several violations of 1NF. First, you are tracking courses that really represent a repeating group for STUDENTs. So, the course information should be moved into separate entities. Furthermore, you need to specify identifiers for both entities. The identifier is the primary key for the entity.

A second violation of 1NF is the non-atomic data shown in the StudentName attribute. A student name can be broken down into pieces: first name, middle initial and last name. It is not indivisible, and therefore violates first normal form.

Unnormalized STUDENT Data

StudentID StudentName MajorID StudentMajor CourseNum CourseName CourseCompDate

2907 Smith, Jacob

R

MAT Mathematics MAT0011 Discrete Math 8/1/2002

MAT0027 Calculus I 4/30/2002 EGL0010 English Classics I 12/30/2001 4019 Patterson, Jane K

PHI Philosophy PHI0010 CS00100 Intro to Philosophy Programming Languages 04-30 2002-04-30 5145 Neeld, Norris B EGL English Literature SOC0102 Ascent of Man 8/1/2002 6132 Morrison, Xavier Q

MUS Music MUS0002 SOC0102 Origin of Jazz Ascent of Man 04-30 2002-08-01 7810 Brown, Richard E CS Computer Science 8966 Juarez, Samantha EGL English Literature EGL0010 EGL0101 English Classics I Shakespeare II 2001-12-30 2002-08-01

(32)

STUDENT Entity in 1NF

StudentID LastName FirstName MiddleInit MajorID StudentMajor

2907 Smith Jacob R MAT Mathematics

4019 Patterson Jane K PHI Philosophy

5145 Neeld Norris B EGL English

Literature

6132 Morrison Xavier Q MUS Music

7810 Brown Richard E CS Computer

Science

8966 Juarez Samantha EGL English

Literature

COURSE Entity in 1NF

StudentID CourseNum CourseName CourseCompDate

2907 MAT0011 Discrete Math 8/1/2002

2907 MAT0027 Calculus I 4/30/2002

2907 EGL0010 English Classics I 12/30/2001

4019 PHI0010 Intro to Philosophy 4/30/2002

4019 CS00100 Programming Languages 4/30/2002

5145 SOC0102 Ascent of Man 8/1/2002

6132 MUS0002 Origin of Jazz 4/30/2002

6132 SOC0102 Ascent of Man 8/1/2002

8966 EGL0010 English Classics I 12/30/2001

8966 EGL0101 Shakespeare II 8/1/2002

Second Normal Form:

‰ Should be in First Normal Form

‰ Every non-key attribute is fully dependent on the key

Second normal form (2NF) ensures that all the attributes of each entity are dependent on the primary key.

Notice that certain courses repeat in the COURSE entity, namely "English Classics I" and "Ascent of Man." This situation indicates a violation of 2NF. To correct the problem, we need to identify the attributes that do not depend on the entire key and remove them. The removed attributes, along with the portion of the primary key on which they depend, are placed in a new entity,

ENROLLMENT. The entire primary key of the original entity remains with the original entity. Another benefit of the normalization process is that you will frequently encounter new attributes

(33)

ENROLLMENT Entity in 2NF

StudentID CourseNum CourseCompDate

2907 MAT0011 2002-08-01 2907 MAT0027 2002-04-30 2907 EGL0010 2001-12-30 4019 PHI0010 2002-04-30 4019 CS00100 2002-04-30 5145 SOC0102 2002-08-01 6132 MUS0002 2002-04-30 6132 SOC0102 2002-08-01 8966 EGL0010 2001-12-30 8966 EGL0101 2002-08-01 COURSE Entity in 2NF

CourseNum CourseName Credits

MAT0011 Discrete Math 3

MAT0027 Calculus I 4

EGL0010 English Classics I 3

PHI0010 Intro to Philosophy 3

CS00100 Programming Languages 3

SOC0102 Ascent of Man 3

(34)

Third Normal Form:

‰ Should be in Second Normal Form

‰ Every non-key attribute is non-transitively dependent on the primary key that is, every attribute in the entity should depend only on the key not on any other non-key

attributes.

A rule of thumb for identifying 3NF violations is to look for groups of attributes whose values can apply to more than a single entity occurrence. When you discover such attributes, move them to a separate entity.

It is time to review our STUDENT information again, this time looking for 3NF violations. Examine the STUDENT data in closely. Notice that students can have the same major and, as such, certain major information can be repeated, specifically two students in our small sample are English Literature majors. To correct the problem, we need to remove major attributes that transitively depend on the key and create a new entity for them.

STUDENT Entity in 3NF

StudentID LastName FirstName MiddleInit MajorID

2907 Smith Jacob R MAT

4019 Patterson Jane K PHI

5145 Neeld Norris B EGL

6132 Morrison Xavier Q MUS

7810 Brown Richard E CS

8966 Juarez Samantha EGL

MAJOR Entity in 3NF

MajorID StudentMajor

MAT Mathematics PHI Philosophy

EGL English Literature

MUS Music

CS Computer Science

A Normalized Data Model: To be complete, a diagram should be developed for the 3NF data

model we just created for the STUDENT data. Figure shows such a data model. Notice that we have not filled in the optionality of the relationships. We could do this based on the sample data we used, but we really need to ask more questions before we can answer questions such as Does a every student have to have a major? The current data shows this to be the case, but in reality; you know that most freshmen, and even upperclassmen, may attend college without having a formally declared major.

(35)

The STUDENT data model

Further Normal Forms:

Normalization does not stop with 3NF. Additional normal forms have been identified and documented. However, normalization past 3NF does not occur often in normal practice. The following are additional normal forms. Just for your information we’ve kept this.

Boyce Codd normal form (BCNF) is a further refinement of 3NF. Indeed, in his later writings Codd refers to BCNF as 3NF. A row is in Boyce Codd normal form if and only if every determinant is a candidate key. Most entities in 3NF are already in BCNF.

Fourth normal form (4NF) states that no entity can have more than a single one-to-many

relationship if the one-to-many attributes are independent of each other. An entity is in 4NF if and only if it is in 3NF and has no multiple sets of multivalued dependencies.

Fifth normal form (5NF) specifies that every join dependency for the entity must be a consequence of its candidate keys.

(36)

Physical database design

After completing the logical design of our database, we now move to the physical design. The purpose of building a physical design of our database is to optimize performance while ensuring data integrity by avoiding unnecessary data redundancies.

During physical design, you transform the entities into tables, the instances into rows, and the attributes into columns. You must decide on many factors that affect the physical design, some of which are listed as follows:

‰ How to translate entities into physical tables

‰ What attributes to use for columns of the physical tables

‰ Which columns of the tables to define as keys

‰ What indexes to define on the tables

‰ What views to define on the tables

‰ How to denormalize the tables

‰ How to resolve many-to-many relationships

Physical design is the time when you abbreviate the names that you chose during logical design. For example, you can abbreviate the column name that identifies employees, EMPLOYEE_NUMBER, to EMPNO.

The task of building the physical design is a job that truly never ends. You need to continually monitor the performance and data integrity characteristics of the database as time passes. Many factors necessitate periodic refinements to the physical design.

Denormalization: Denormalization is a key step in the task of building a physical relational

database design. It is the intentional duplication of columns in multiple tables, and the

consequence is increased data redundancy. This is recommended to avoid performance problems occur as a result of normalization. This should be done based on the processing needs of

(37)

Implementing and Altering Database Design Implementing the database design involves:

‰ Implementing DB2 objects

‰ loading data

‰ managing data

‰ altering the design as necessary

Altering Database Design:

‰ After using a relational database for a while, we might want to change some aspects of its design.

‰ To alter the database design we need to change the definitions of DB2 objects.

Advantage of DB2 over VSAM

The following list will give some of the advantages of DB2 over VSAM.

Feature DB2 VSAM

Hardware Independence PC to mainframe Only Mainframe

OS Independence NT, Unix and OS/390 Only OS/390

Ease of development Standard SQL Not so simple

Stored procedure & triggers No such option

Ease of maintenance Standard SQL Difficult

Security High degrees of security Only at Dataset level

Referential Integrity DB2 enforces it Developers responsibility

Query Interface Easy to view/modify Not available

Performance Better for even large data Better when data is less

Optimizer handles Developer responsible

Performance Tuning Can be tuned anytime Depends on initial design Can be at SQL level Only application level Tools available for aiding No tuning aids Abundant tuning skills Tuning skills are rare Reorganization Direct reorganization Delete & recreate

Online reorg possible Downtime needed

Recovery Managed by DB2 Managed by CICS/IMS

Always recoverable No recovery in batch

From log / backup From backup only

Auto Recovery Manual Restore

Backup Online backup possible Downtime needed

Incremental backup No incremental backup

Disaster Recovery Supported by DB2 Part of DASD recovery

(38)

Feature DB2 VSAM

Selective retrieval No Selective retrieval Up to row level archival Dataset level archival Data types Images, Video, Audio, and so on. Text only

Contents can be in file No such option

Summary

‰ Database Design involves:

o Logical Database Design

o Physical Database Design

o Implementing and Altering Database Design

‰ Key activities for designing Logical data modeling with Entity-Relationship Model are:

o Define Entities

o Define Primary Key: Unique identifier

Composite Key: Key made up of more than one attributes

o Define Relationships among Entities: One-to-one relationships

One-to-many and many-to-one relationships Many-to-many relationships

Cardinality: Number of occurrences between the entities. Optionality: Relationships are mandatory or optional Define Additional Attributes for the Entities

‰ Normalization: Eliminating redundant data thereby reducing the amount of space

o Three Normal Forms: 1NF, 2NF, and 3NF

‰ Physical database design is transformation of:

o Entities into tables

o Instances into rows

o Attributes into columns

o Denormalization is recommended to avoid performance problems occur as a result of normalization

‰ Implementing the database design involves: Implementing DB2 objects, loading and managing data and altering the design as necessary.

o Altering Database Design: To alter the database design, you need to change the definitions of DB2 objects

Test Your Understanding

1. What are the tasks involved in database design? 2. What is logical data modeling?

(39)

7. What are all the different types of relationship? 8. What is Cardinality and Optionality?

9. What is Normalization?

10. What are the goals and benefits of Normalization? 11. State the different types of Normal forms.

12. Describe the tasks involved in the physical database design. 13. When will you do Denormalization?

14. State the activities involved in implementing and altering database design. 15. List some of the advantages of DB2 over VSAM.

Exercises

Following figure represents the “Medical Bill” to be produced to a customer. Design a Database for the medical shop system.

(40)

Session 07: Data Integrity

Learning Objectives

After completing the session, you will be able to:

‰ Identify different means of achieving data integrity across the database by DB2

Introduction to Data Integrity

Data integrity refers to the validity, consistency, and accuracy of the data in a database. It cannot be overstated that the level of accuracy of the information retrieved from the database is in direct proportion to the level of data integrity imposed within the database. Data integrity is one of the most important aspects of the database design process, and it should not be underestimated, overlooked, or even partially neglected. To make any of these mistakes would result in a high risk of undetectable errors.

There are three types of data integrity and are as follows:

‰ Entity Integrity

‰ Referential Integrity

‰ Domain Integrity

Entity Integrity

This is the “Table-level integrity” which ensures that the field that identifies each record within the table is unique and is never missing its value.

Entity integrity requires the specification of a primary key (PK) for each table. Key Notes about Primary Key:

‰ Each table can have zero or one primary key.

‰ Primary key should not be Null and if the primary key is a composite key, make sure that each component should not be Null.

‰ Every primary key explicitly defined for a table must be associated with a corresponding unique index.

‰ If you do not create a unique index for a primary key, then an incomplete key is defined for the table, making the table inaccessible.

Unique Constraint:

A unique constraint is similar to a primary key constraint which also enforces unique values on an individual or a group of columns. Each table can have zero, one, or many unique constraints consisting of one or more columns each. The values stored in the unique column, or combination of columns, must be unique within the table. Unique constraint column should not be Null.

(41)

Unique Index:

In addition of creating unique index in primary key column or unique constraint column (which is mandatory), you can create as many unique indexes as we need on any other columns of the table to ensure uniqueness.

The following table will show the difference between unique index on Primary Key/Unique constraint column and other column other than primary/unique constraint column.

Primary Key or unique constraint column Column other than primary/unique constraint but defined with unique index

A table can contain only one primary key constraint and multiple unique constraints

A table can contain multiple unique indexes

Cannot allow NULL values Can allow NULL values

Supports Referential Integrity Cannot support Referential Integrity

Referential Integrity

This is a “Relationship-level integrity” which ensures that the relationship between a pair of tables is sound and that there is synchronization between the two tables whenever data is entered, updated, or deleted.

Referential integrity is achieved through the foreign key.

Foreign Key:

A foreign key (FK) is a column or combination of columns used to establish and enforce a link between the data in two tables.

A link is created between two tables by adding the column or columns that hold one table's primary key values to the other table. This column becomes a foreign key in the second table.

The table with the primary key is called parent table and the table with the foreign key is called

dependent table (or child table).

Referential integrity (RI) means each row in a dependent table must have a foreign key that is equal to a primary key in the parent table.

Rules ensuring RI:

Insert:

When inserting a row with a foreign key in the dependent table, DB2 checks the values of the foreign key column against the values of the primary key column in the parent table. If there is a matching primary key column, the insert is allowed. If there is no matching primary key column, the insert will not happen.

A new row can be inserted in the parent table as long as the primary key value of the new row is unique.

(42)

Update:

When updating foreign key values in the dependent table, DB2 checks whether there is a matching primary key in the parent table or not. If there is a matching primary key, update is allowed. If there is no matching primary key, update is not allowed.

The primary key in the parent table cannot be updated if any rows in the dependent tables refer to it.

Delete:

Deleting a row with a foreign key in the dependent table is permitted.

When deleting a row with a primary key in the parent table, DB2 takes one of the following actions as indicated while defining the table.

RESTRICT: Disallows the deletion of the primary key row if any foreign keys relate to that row. CASCADE: Allows the deletion of the primary key row and also deletes the foreign key rows that relate to it.

SET TO NULL: Allows the deletion of the primary key row and, instead of deleting all related foreign key rows, sets the foreign key columns to NULL.

Operation Child Table Parent Table

Insert Allowed if the foreign key value matches the value of Primary key of the parent table.

Allowed as long as the primary key value is unique.

Update Allowed if the foreign key value matches the value of Primary key of the parent table.

Allowed if there are no foreign key references in the child tables.

Delete Allowed. Depending upon the action specified during

table definition.

Restrict: not allowed if there are any foreign key references

Cascade: allowed and deletes the foreign key references if any in the child tables. SET TO NULL: allowed and a Null value will be set in the foreign key references of the child tables

Domain Integrity

This is the “Field-level integrity” which ensures that the structure of every field is sound, that the values in each field are valid, consistent, and accurate, and that fields of the same type (such as City fields) are consistently defined throughout the database.

(43)

Domain integrity is enforced using:

‰ Data Type and Length

‰ Default Values

‰ NULL Values

‰ Check constraint Data Type and Length

By specifying the data type and maximum length for each column when a table is created, the DBMS will automatically ensure that only the correct type of data with the maximum length allowed is stored in that column.

Default Values

When columns are created within tables, they can be assigned a default value that will be used when inserting or loading the data which do not provide an explicit value for that column. Each column can have only one default value. User can provide a default value for a column. If there is no user specific default value, DB2 will assign the default value based on the data type of that column.

DB2 Data Types with Default Values: Data Type COBOL

PIC

COBOL USAGE

Default Description

CHAR(n) PIC X(n) DISPLAY Blanks Fixed length character

(EBCDIC) data

VARCHAR(n) PIC X(n) DISPLAY Empty String Variable length character data

SMALLINT PIC S9(4) COMP OR

COMP-4

0 Half word integer data

INTEGER PIC S9(9) COMP OR

COMP-4

0 Full word integer data

DECIMAL(p,s) PIC S9(p)V9(s)

COMP-3 0 Packed decimal data.

p è number of decimal digits s è number of digits to the right side of the decimal point

DATE PIC X(10) DISPLAY Current Date Date data (yyyy-mm-dd)

TIME PIC X(8) DISPLAY Current Time Time data (hh.mm.ss)

TIMESTAMP PIC X(26) DISPLAY Current

timestamp

Date and Time data with microseconds

(yyyy-mm-dd-hh.mm.ss.mmmmmm)

GRAPHIC PIC G(n) DISPLAY-1 Blanks Double byte character set

(DBCS) data

VARGRAPHIC PIC G(n) DISPLAY-1 Empty String Variable length DBCS data

FLOAT(n) None COMP-1 or

COMP-2

0 Floating point data in single or

(44)

NULL Values:

Whenever a value is missing or unknown, it is said to be Null. A null value represents neither zero (in the case of numeric data) nor blank (represented by one or more spaces in the case of textual data). Zero and blank are actual values and can be meaningful in some way under certain circumstances. For example, a zero can represent the current state of an Account Balance; a blank in a Middle Initial field can represent the fact that an employee has no middle initial in his or her name. In the following figure, a blank represents the fact that Washington, D.C., is not located in any county whatsoever.

A null value is typically used to represent an unknown value in a field. In the above figure, for example, there are null values in the County field. Shannon McLain did not know what county she lived in at the time her data was entered into the database, so no entry was made into the County field. As a result, the County field contains a null value. This value can be changed, however, once Shannon finds out what county she lives in.

A null value is also used to represent a missing value in a field. If the person who entered the data for Shannon McLain failed to ask her for the name of the county she lives in, the data is

considered missing since no entry was made into the County field due to operator error. Once the error is recognized, it can be easily corrected by obtaining the appropriate value from Ms. McLain.

A drawback to null values is that they cannot be evaluated by mathematical expressions or aggregate functions. If a null value is used in a mathematical expression, that expression will

(45)

logically reasonable—if the number is unknown, the value will necessarily be unknown. Also there is a serious undetected error that occurs if all the values in the Total Value field are then added together: an inaccurate total. The only way to obtain an accurate total is to provide a value for the entries in the Qty On Hand field that are currently Null.

The result of an aggregate function, such as "Count()," will be Null if it is based on a field that contains null values. For example, the following figure shows the results of a summary query that counts the total number of occurrences of each category in the PRODUCTS table shown above. The value of Total Occurrences in the summary query is the result of the function expression "Count([Total Occurrences])." Notice that the summary query shows 0 occurrences of an

unspecified Category, implying that each product has been assigned a category. This information is clearly inaccurate because there are two products in the PRODUCTS table that have not been assigned a category.

Check Constraint

A check constraint is a rule that specifies the values that are allowed in one or more columns of every row of a table. Check constraint enforces business rules directly into the database without requiring additional application logic. This can be defined during column definition.

Summary

‰ Data Integrity ensures the accuracy of the data in a database.

‰ Types of Data Integrity

o Entity Integrity

o Referential Integrity

o Domain Integrity

‰ Entity integrity enforces each occurrence of an entity must be uniquely identifiable and is achieved through primary key.

‰ A table definition can be complete only when a unique index is created on its primary key.

‰ Unique constraint also behaves similar to primary key constraint except the fact of the number of unique constraints can be more than one in a table.

‰ Unique index column allows null values and does not support RI as against primary or unique constraint column.

‰ Foreign key is a column in a child table which holds the value of primary key column of a parent table.

‰ Referential integrity (RI) ensures each row in a dependent table must have a foreign key that is equal to a primary key in the parent table.

(46)

Operation Child Table Parent Table

Insert Allowed if the foreign key value matches the value of Primary key of the parent table.

Allowed as long as the primary key value is unique.

Update Allowed if the foreign key value matches the value of Primary key of the parent table.

Allowed if there are no foreign key references in the child tables.

Delete Allowed. Depending upon the action specified

during table definition.

Restrict: Not allowed if there are any foreign key references

Cascade: Allowed and deletes the foreign key references if any in the child tables. SET TO NULL: Allowed and a Null value will be set in the foreign key references of the child tables

‰ Domain integrity ensures the possible values of a column and is achieved through:

o Data Type and Length:

o Commonly used DB2 data types: Char, Varchar, Smallint, Integer, Decimal, Date,

Time, Timestamp

o Default Values:

Can be user defined or depends on the data type

Used for columns with the missing values during insertion

o NULL Values: Unknown or missing value

o Check constraint: Enforces business rules and allows only the predefined values

Test Your Understanding 1. What is data integrity?

2. What is entity integrity and how do you achieve it? 3. Can you create a table without a primary key? 4. Can a primary key column hold a value of null?

5. State the importance of unique index in the primary key. 6. What is a unique constraint?

7. Differentiate between the primary key constraint and unique constraint. 8. State the differences between primary key index and unique index. 9. What is a foreign key?

10. What is a referential integrity?

11. List the rules applied to ensure RI while inserting, updating, and deleting the data both in the parent table as well as in the child table.

(47)

Session 08: Interaction with DB2

Learning Objectives

After completing the session, you will be able to:

‰ Access DB2 data using DB2I, SPUFI and QMF

Interaction with DB2: Overview

An environment is connected to a DB2 subsystem by an attachment facility. Whenever there is a need of interaction with DB2, a thread will be established.

A thread is a control structure used to:

‰ Send requests to DB2

‰ Send data from DB2 to the requestor

‰ Communicate the status of each SQL statement after it is executed

TSO as a Door to DB2:

You use TSO (Time-Sharing Option) environment as a door that provides access to DB2 data. The TSO Attachment Facility provides access to DB2 resources in two ways:

‰ Online mode, in the TSO foreground, using ISPF (Interactive System Productivity Facility) panels

‰ Batch mode using the TSO Terminal Monitor Program - IKJEFT01 (or IKJEFT1B)

Common types of TSO foreground users include:

‰ DB2I

(48)

DB2I

DB2I (DB2 Interactive) is a TSO-based DB2 application. It consists of a series of ISPF panels, programs, and CLISTs enabling rapid access to DB2 services and data. DB2I increases the TSO DB2 developer's productivity.

How to access DB2I:

‰ Log on to TSO as you normally would. The ISPF main menu appears.

‰ Choose Option 8 (DB2 - Perform DATABASE 2 Interactive Functions) in Cognizant Mainframe.

‰ DB2I Main menu appears as shown in the following screenshot.

DB2I Primary Option Menu

‰ Following Options are available with DB2I:

o SPUFI: SQL Processor Using File Input.

o DCLGEN: Declaration Generator.

o PROGRAM PREPARATION: Prepares a program containing embedded SQL for execution.

o PRECOMPILE: Program containing embedded SQL is parsed to retrieve all SQL and replace it with calls to a runtime interface to DB2

o BIND/REBIND/FREE: Provides the capability to bind a DB2 plan & package, rebound a plan & package, remove plan & package from the system

(49)

o UTILITIES: Provides panels that ease the administrative burdens of DB2 utility processing

o DB2I DEFAULTS:

Lets you modify parameters that control the operation of DB2I

Be sure that the proper DB2 subsystem is specified in the “DB2 Name” parameter

Be sure that the proper language to be used for preparing DB2 programs in the “Application Language” parameter

A valid job card needs to be supplied in the “DB2I Job Statement” parameter.

o EXIT: Leaves DB2I

SPUFI

The first option in the DB2I main menu is SPUFI (SQL Processor Using File Input).

SPUFI is intended primarily for application programmers who wish to test the SQL portions of their programs or administrators who wish to perform SQL operations.

SPUFI reads SQL statements contained as text in a sequential file or in a member of a PDS, processes those statements and places the results in an ISPF browse session. By specifying the input and output data sets and selecting the appropriate options, we can execute SQL statements in an online mode.

The SPUFI Panel is as follows.

Input Dataset

Output Dataset

References

Related documents

This compound exhibits the shortening and elongation of certain C—C single bonds, as reported for related PCU derivatives (Table 1)... mation (flap

A card form is used to display a single record from the table to which it is bound. Card forms typically show data from master tables or setup tables. Figure 20) is an example of a

The central agents of PhEDEx now record this information in two new tables: the BlockAllocator agent records the timestamps of events related to block completion in t dps block

A summary field is a field in a one table record whose value is based on data in related-many table records. Summary fields eliminate repetitive and time-consuming

Table 1 Summary of previous research using medical record recoding methods to exa mine the accuracy of external cause coding in injury-related hospital records Study Aim Setting

• Conventional ETL deployments - Data processing throughput of 1 million records per second (single commodity server, single database table). • Big Data architectures with In

What is the recommended way to display field data from the related Review__c records on a Visualforce page for a single Position__c record.. Utilized the Standard Controller

A database design that eliminates redundant data in any single table, places tightly related datasets into individual tables, keeps unrelated datasets separated in their own