Handout: DB2
Version: DB2/Handout/0308/1.0Date: 20-03-08
Cognizant 500 Glen Pointe Center West Teaneck, NJ 07666 Ph: 201-801-0233 www.cognizant.com
TABLE OF CONTENTS
Introduction ... 11
About this Module ... 11
Target Audience ... 11
Module Objectives ... 11
Pre-requisite ... 11
Session 02: Introduction to DB2 Objects ... 12
Learning Objectives ... 12 Database ... 12 DBMS ... 12 Database Model ... 12 RDBMS ... 13 DB2 ... 13 Storage Group ... 13 Database in detail ... 14 Table Space ... 14 Index Space ... 15 Table ... 15 Column ... 15 Row ... 15 Table Terminology ... 16 Types of Table ... 16 Index ... 16 View ... 17 Synonym ... 19 Alias ... 19
Physical Storage of Data ... 20
Hierarchy of DB2 Objects ... 20
Buffer Pool ... 21
Summary ... 21
Activities involved in Database Design ... 23
Logical Database Design ... 24
Physical database design ... 36
Implementing and Altering Database Design ... 37
Advantage of DB2 over VSAM ... 37
Summary ... 38
Test Your Understanding ... 38
Exercises ... 39
Session 07: Data Integrity... 40
Learning Objectives ... 40
Introduction to Data Integrity ... 40
Entity Integrity ... 40
Referential Integrity ... 41
Rules ensuring RI: ... 41
Domain Integrity ... 42
Summary ... 45
Test Your Understanding ... 46
Session 08: Interaction with DB2 ... 47
Learning Objectives ... 47
Interaction with DB2: Overview ... 47
DB2I ... 48
SPUFI ... 49
QMF ... 53
Summary ... 58
Test Your Understanding ... 59
Session 09: SQL ... 60
Learning Objectives ... 60
SQL: Overview ... 60
DDL ... 61
DDL – Create Storage Group ... 61
DDL – Alter Storage Group ... 61
Try It Out ... 62
DDL – Create Database ... 62
Try It Out ... 63
DDL – Alter Database ... 63
Try It Out ... 66
DDL – Alter Tablespace ... 67
Summary ... 68
Test Your Understanding ... 68
Session 10: SQL ... 69 Learning Objectives ... 69 DDL – Create Table ... 69 Try It Out ... 71 Try It Out ... 72 Try It Out ... 73 DDL – Alter Table ... 73 Try It Out ... 74 DDL – Create Index ... 74 DDL – Alter Index ... 78 Try It Out ... 79 DDL – Create View ... 80 DDL – Alter View ... 80 DDL – Create Alias ... 80 DDL – Create Synonym ... 80 DDL – DROP ... 81 Summary ... 81
Test Your Understanding ... 82
Exercises ... 82 Session 11: SQL ... 83 Learning Objectives ... 83 DML – Insert ... 83 DML – Update ... 84 DML – Delete ... 84 DML – Simple Retrieval ... 85
DML – Retrieving Multiple Columns ... 85
DML – Retrieving All Columns ... 85
DML – Sorting Retrieved data ... 85
DML – Sorting by Multiple Columns ... 86
DML – Sorting by Column Position... 86
DML – Advanced Filtering ... 88
DML – Wildcard Filtering ... 89
DML – Removing Duplicates ... 90
DML – Concatenating Fields ... 90
DML – Using Aliases ... 90
DML – Performing Mathematical Calculations ... 91
DML – Functions... 91
DML – Aggregate Functions ... 91
DML – Text Functions ... 92
DML – Date and Time Manipulation Functions ... 92
DML – Numeric Functions ... 93
Summary ... 93
Test Your Understanding ... 94
Exercises ... 94
Session 12: SQL ... 96
Learning Objectives ... 96
DML – Grouping Data ... 96
DML – Filtering Groups ... 97
DML – Grouping and Sorting ... 97
DML – Subquery ... 97 DML – Join ...100 DML – Union ...102 DCL ...103 DCL - GRANT ...103 DCL - REVOKE ...103 Summary ...104
Test Your Understanding ...104
Exercises ...104
Session 21: Introduction to Simple COBOL DB2 Application Program ...106
Learning Objectives ...106 Embedded SQL ...106 DCLGEN ...106 Host Variables ...111 SQLCA ...112 SQLCODE ...113 Try It Out ...113 Summary ...115
Test Your Understanding ...115
Exercises ...115
Session 26: Program Preparation and Execution ...116
Learning Objectives ...116
Steps involved in Program Preparation ...116
Need for Precompilation ...116
Tasks of Precompiler ...116
Bind ...118
Plan ...119
Need of Package ...119
Package ...120
Understanding Plan and Package more clearly ...120
Version ...120
Advantage of Version ...120
Advantages of Package ...120
Collection ...121
Advantages of Collection ...121
Packages and Plans Storage ...125
Compile ...125
Link Edit ...126
Execution ...127
Try It Out ...128
Summary ...131
Test Your Understanding ...132
Exercises ...132
Session 31: Error Handling ...133
Learning Objectives ...133
Error Handling - Introduction ...133
SQLCA ...133
DSNTIAR ...134
WHENEVER ...135
Try It Out ...136
Summary ...139
Test Your Understanding ...139
Learning Objectives ...140
COMMIT ...140
Rollback ...141
Try It Out ...141
Summary ...144
Test Your Understanding ...144
Session 35: Cursor ...145 Learning Objectives ...145 Cursor - Introduction ...145 Cursor Processing ...145 Cursor Declaration ...146 Cursor Open ...146 Cursor Fetch ...146 Cursor Close ...147
Using Cursor for Data Modification ...147
Try It Out ...148
Summary ...151
Test Your Understanding ...151
Exercises ...151
Session 42: Handling Null ...152
Learning Objectives ...152
Null - Introduction...152
Null Indicator ...152
Try It Out ...153
Summary ...161
Test Your Understanding ...161
Exercises ...161
Session 46: Handling VARCHAR ...162
Learning Objectives ...162
VARCHAR - Introduction ...162
VARCHAR - Behavior ...162
Try It Out ...163
Summary ...171
Test Your Understanding ...171
Session 51: Locks ...172
Learning Objectives ...172
Lock - Introduction ...172
Table Space - Recap ...172
Simple Table Space ...172
Segmented Table Space ...173
Partitioned Table Space ...173
Lock Size ...173
Lock Escalation ...174
Lock Duration ...174
Summary ...175
Test Your Understanding ...176
Session 52: Locks ...177 Learning Objectives ...177 Lock Mode ...177 Suspension ...179 Timeout ...179 Deadlock ...179
Constructs that affect locking ...180
Summary ...181
Test Your Understanding ...181
Session 54: Miscellaneous ...182
Learning Objectives ...182
Bind Parameters ...182
Common SQL Errors ...183
Application Programming Guidelines ...184
Using EXPLAIN ...184
Summary ...185
Test Your Understanding ...185
Session 55: Miscellaneous ...186
Learning Objectives ...186
Table-Based Infrastructure of DB2 ...186
DB2 Catalog ...186
Session 56: Miscellaneous ...191
Learning Objectives ...191
Utilities - Introduction ...191
Data Consistency Utilities ...191
CHECK Utility ...191
REPAIR Utility ...192
REPORT Utility ...195
DIAGNOSE Utility ...195
Backup and Recovery Utilities ...196
COPY Utility ...196
MERGECOPY Utility ...197
QUIESCE Utility ...198
RECOVER Utility ...199
REBUILD Utility ...200
REPORT RECOVERY Utility ...201
Summary ...201
Test Your Understanding ...201
Session 57: Miscellaneous ...202
Learning Objectives ...202
Data Organization Utilities ...202
LOAD Utility ...202
REORG Utility ...203
Catalog Manipulation Utilities ...204
CATMAINT Utility ...204 MODIFY Utility ...204 RUNSTATS Utility ...204 STOSPACE Utility ...205 DB2 Commands ...206 Summary ...209
Test Your Understanding ...209
Session 58: Miscellaneous ...210
Learning Objectives ...210
Dynamic SQL - Introduction ...210
Dynamic SQL - Types ...210
When to use Dynamic SQL ...210
Execute Immediate SQL ...210
Parameter marker ...213
Fixed-list select ...213
Varying-list select SQL ...214
Summary ...214
Test Your Understanding ...214
Session 59: Miscellaneous ...215
Learning Objectives ...215
Stored Procedure - Introduction ...215
Stored Procedure - Development ...215
Creating Stored Procedures ...216
Managing Stored Procedures ...217
Executing a Stored Procedures ...217
Summary ...218
Test Your Understanding ...218
Glossary ...219
References ...227
Websites ...227
Books ...227
Introduction
About this Module
This module deals with the fundamentals of DB2. Target Audience
This module is specifically aimed at entry level trainees. Module Objectives
After completing this module, you will be able to:
Explain the introduction to DB2 Objects
Explain Database Design
Describe Data Integrity
Identify the Interaction with DB2
Describe SQL
Explain the introduction of simple COBOL DB2 application program
Identify Error Handling
Explain Commit and Rollback
Explain Cursor
Handle NULL
Handle VARCHAR
Define Locks
State Application Programming Guidelines
Pre-requisite
To follow the course successfully, a trainee needs to have a prior theoretical as well as hands-on knowledge of the following:
TSO/ISPF
COBOL
JCL
Session 02: Introduction to DB2 Objects
Learning Objectives
After completing the session, you will be able to:
Identify the different DB2 objects
Explain the relation between DB2 objects
Describe how the data is being stored in DB2 physically
Database
A database is a collection of data stored in some organized fashion. The simplest way to think of it is to imagine a database as a filing cabinet. The filing cabinet is simply a physical location to store data, regardless of what that data is or how it is organized.
You will see about Database in detail in the later part of this session.
Database: A container (usually a file or set of files) to store organized data.
DBMS
A Database Management System (DBMS) is a set of programs designed for the purpose of managing databases.
Typical examples of DBMS include:
DB2
Oracle
Microsoft Access
Microsoft SQL Server
Database Model
A database model refers to the way a DBMS organizes information internally. It is a theory or specification, describing how a database is structured and used.
Common models include the following:
Hierarchical model
Network model
Relational model
RDBMS
Relational Database Management System (RDBMS) is a type of database management system (DBMS) that stores data in the form of related tables.
Most popular RDBMS includes:
DB2
Oracle
Microsoft SQL Server
DB2
DB2 or Database 2 is a Relational Database Management System offered by IBM that runs on IBM Mainframe, AS/400 and on PC. A DB2 database can grow from a small single-user application to a large multi-user system.
Storage Group
A DB2 storage group (STOGROUP) is a set of volumes on direct access storage devices (DASD). The volumes hold the VSAM data sets in which tables and indexes are actually stored.
Max no of volumes per Storage group is 133 (Ideally 3 or 4). All volumes of a given storage group must have the same device type (3380, 3390, and so no.). But, parts of a single database can be stored in different storage groups. If the volumes in a storage group are different types or if any volume is not mounted or is otherwise invalid, then an error will occur when you try to create a table space or index. Try to assign frequently accessed objects (indexes, for instance) to fast devices and seldom-used tables to slower devices; that choice of storage groups improves performance.
After you define a storage group, DB2 stores information about it in the DB2 catalog. (This catalog is not the same as the integrated catalog facility catalog that describes DB2 VSAM data sets). The catalog table SYSIBM.SYSSTOGROUP has a row for each storage group and
SYSIBM.SYSVOLUMES has a row for each volume.
At installation, the system default storage group is defined. This storage group is named
SYSDEFLT. If you do not explicitly manage your storage, then DB2 uses the default storage group to allocate space.
Following figure represents the physical representation of storage group.
Database in detail
A database is a collection of related data tables or entities. For example, a typical database for an organization would consist of a customer, an order, and order details tables. All these tables are related to one another in some way. In this example, customers have orders and orders have order details. Even though each table exists on its own, collectively the tables comprise a database.
Database is a group of logically related Tablespaces and Indexspaces, which in turn contain tables and indexes respectively.
The default database, DSNDB04, is predefined in the DB2 installation process. Table Space
Data is actually stored in a structure known as a table space.
Each table space correlates to one or more individual physical VSAM data sets in the DASD volumes of storage Group.
Each table space contains one or more tables. There are three different types of table space:
Simple table space
Index Space
An index space is the underlying storage structure for index data.
Each index space correlates to one or more individual physical VSAM data sets in the DASD volumes of storage Group.
It is automatically created by DB2 whenever an index is created. There can only be one index in an index space.
Table
Tables are logical structures maintained by the database manager.
When you store information in your filing cabinet you do not just toss it in a drawer. Rather, you create files within the filing cabinet, and then you file related data in specific files.
In the database world, that file is called a table. A table is a structured file that can store data of a specific type. A table might contain a list of customers, a product catalog, or any other list of information.
Each table has a name, and within a table, each column has a name. No particular ordering is maintained among the rows of a table, but rows can be retrieved in an order determined by values in their columns. The data in a table is logically related. All table data is assigned to table spaces. A table consists of data logically arranged in columns and rows.
Column
Tables are made up of columns. A column contains a particular piece of information within a table. Column is a single field in a table. All tables are made up of one or more columns.
The best way to comprehend this is to envision database tables as grids, somewhat like spreadsheets. Each column in the grid contains a particular piece of information. In a customer table, for example, one column contains the customer number, another contains the customer name, and the address, city, state, and zip are all stored in their own columns.
Row
Data in a table is stored in rows.
Row is a record in a table.
Again, envisioning a table as a spreadsheet style grid, the vertical columns in the grid are the table columns, and the horizontal rows are the table rows.
For example, a customer table might store one customer per row. The number of rows in the table is the number of records in it.
Table Terminology
Following table will give the terminology used in table context:
In this Document Formal Terms Many Database Manuals
Table Relation Table
Column Attribute Field
Row Tuple Record
Types of Table
Base Table: A base table is created with the CREATE TABLE statement and is used to hold
persistent user data.
Result Table: A result table is a set of rows that the database manager selects or generates from
one or more base tables to satisfy a query.
Summary Table: A summary table is a table defined by a query that is also used to determine the
data in the table. Summary tables can be used to improve the performance of queries. If the database manager determines that a portion of a query can be resolved using a summary table, the database manager can rewrite the query to use the summary table.
Temporary Table: A declared temporary table is created with a DECLARE GLOBAL TEMPORARY
TABLE statement and is used to hold temporary data on behalf of a single application. This table is dropped implicitly when the application disconnects from the database
Index
An index is a data access aid that can be created on a table. It is an ordered set of pointers to rows in a table.
An index can serve the following purposes:
Improve performance. In most cases, access to data is faster with an index. Provides a fast way to find rows in a table based on their values in the key columns.
Enforces uniqueness rules by defining a column or group of columns as a unique index or primary key.
Provides a logical ordering of the rows of a table based on the key column values.
Clusters the rows of a table in physical storage according to the order of the defined index.
View
A view provides a different way of looking at the data in one or more tables.
View is a virtual table consisting of a SQL SELECT statement that accesses data from one or more tables or views.
A view never stores data. When you access a view, the SQL statement that defines it is executed to derive the requested data.
A view has columns and rows just like a base table. All views can be used just like base tables for data retrieval.
You can use views to control access to sensitive data, because views allow multiple users to see different presentations of the same data. For example, several users may be accessing a table of data about employees. A manager sees data about her employees but not employees in another department. A recruitment officer sees the hire dates of all employees, but not their salaries; a financial officer sees the salaries, but not the hire dates. Each of these users work with a view derived from the base table. Each view appears to be a table and has its own name.
When the column of a view is directly derived from the column of a base table, that view column inherits any constraints that apply to the base table column. For example, if a view includes a foreign key of its base table, insert and update operations using that view are subject to the same referential constraints as is the base table. Also, if the base table of a view is a parent table, delete and update operations using that view are subject to the same rules as are delete and update operations on the base table.
A view can become inoperative (for example, if the base table is dropped); if this occurs, the view is no longer available for SQL operations.
The best way to recognize views is to look at an example. (You will study about the SQLs in detail further. This example is just to make you understand the concept of views).
You have following three tables:
Customers
Orders
You need to retrieve the customers who had ordered a specific product. The column details are as follows.
Table Column Customers cust_id cust_name cust_contact Orders order_num cust_id OrderDetails order_num prod_id
The query is as follows:
SELECT cust_name, cust_contact
FROM Customers, Orders, OrderDetails WHERE Customers.cust_id = Orders.cust_id
AND OrderDetails.order_num = Orders.order_num AND prod_id = 'RGAN01';
Anyone needing this data would have to understand the table structure, as well as how to create the query and join the tables. To retrieve the same data for another product (or for multiple products), the last WHERE clause would have to be modified.
Now imagine that you could wrap that entire query in a virtual table called ProductCustomers. You could then simply do the following to retrieve the same data:
SELECT cust_name, cust_contact FROM ProductCustomers
WHERE prod_id = 'RGAN01';
This is where views come into play. ProductCustomers is a view, and as a view, it does not contain any columns or data. Instead it contains a query—the same query used above to join the tables properly.
Why Use Views:
Here are some common uses of views:
To draw data from multiple tables.
To reuse SQL statements.
To simplify complex SQL operations. After the query is written, it can be reused easily, without having to know the details of the underlying query itself.
To expose parts of a table instead of complete tables.
To change data formatting and representation. Views can return data formatted and presented differently from their underlying tables.
For the most part, after views are created, they can be used in the same way as tables. You can perform SELECT operations, filter and sort data, join views to other views or tables, and possibly even add and update data. There are some restrictions on this last item. The important thing to remember is views are just that, views into data stored elsewhere. Views contain no data
themselves, so the data they return is retrieved from other tables. When data is added or changed in those tables, the views will return that changed data.
There are some performance issues with views because views contain no data, any retrieval needed to execute a query and that must be processed every time the view is used. If you create complex views with multiple joins and filters, or if you nest views, you may find that performance is dramatically degraded. Be sure you test execution before deploying applications that use views extensively.
Synonym
1. An alternative, private name for a table or view.
2. A synonym can be used only by the individual who creates it. When a table or view is dropped, all synonyms defined on it are also dropped Alias
1. A locally defined name for a table or view in the same local DB2 subsystem or in a remote DB2 subsystem. Aliases give DB2 location independence because an alias can be created for a table at a remote site, thereby freeing the user from specifying the site that contains the data. Aliases can be used also as a type of global synonym because they can be accessed by anyone, not only by their creator.
2. When a table/view is dropped, all aliases defined on it are NOT dropped
3. Use Synonyms for Program Development, use Aliases for Distributed Applications and use Views for security and joining.
Following table gives the difference between “Synonym” and “Alias”
Synonym Alias
An alternative name for a table or view which should reside in the local DB2 subsystem
An alternative name for a table or view which can reside in the local or remote DB2 subsystem Can be used only by its creator Can be used by anyone including its creator Is dropped when it’s corresponding Table/View is
dropped
Not dropped even when it’s corresponding Table/View is dropped
Physical Storage of Data
The following figure represents how the data is physically stored in DB2 system.
Hierarchy of DB2 Objects
Buffer Pool
Buffer pools are areas of virtual storage in which DB2 temporarily stores pages of table spaces or indexes.
Buffer pools improve database performance. If a needed page of data is already in the buffer pool, that page is accessed faster than if that page had to be read directly from disk. The database manager has agents whose tasks are to retrieve data pages from disk and place them in the buffer pool (pre-fetchers), and to write modified data pages from the buffer pool back to disk (page cleaners).
The reading and writing of data pages to and from disk is called disk input/output (I/O). Avoiding the wait associated with disk I/O is the primary way to improve the performance of the database. How you create the buffer pool, and configure the database manager and the agents associated with the buffer pool, controls the performance of the database.
In DB2, you have the option of allocating 80 Buffer Pools:
50 4K buffer pools and
10 8K, 16K and 32K buffer pools.
The default buffer pool is BP0. Summary
Storage Group is a set of volumes on DASD which contains VSAM datasets.
Database is a collection of related data elements.
Table space represents one or more tables which correlate to one or more VSAM data sets in the DASD volumes of storage Group.
Index Space represents an index correlates to one or more VSAM data sets in the DASD volumes of storage Group.
Table consists of data logically arranged in columns and rows.
Index is an ordered set of pointers to rows in a base table.
View is a virtual table that accesses data from one or more tables or views.
Synonym is an alternative name for a table or view which should reside in the local DB2 subsystem.
Alias is an alternative name for a table or view, which can reside in the local or remote DB2 subsystem.
Buffer Pools are areas of virtual storage in which DB2 temporarily stores pages of table spaces or indexes.
Test Your Understanding 1. What is a storage group? 2. What is a database? 3. What is a table space? 4. What is an index space? 5. What is a table?
6. What is an index? 7. What is a View?
8. What is the difference between Synonym and Alias?
9. State the logical and physical objects among the DB2 objects and their hierarchy. 10. How the data is being stored physically?
Session 03: Database Design
Learning Objectives
After completing the session, you will be able to:
Design a DB2 database
Why Should You Be Concerned with Database Design?
You will look at the concept of database design with the real life example.
Say your database is like a custom home and that you are going to have one built for us. What is the first thing you are going to do? Certainly you are not going to hire a contractor immediately and let him build our home however he wishes. Surely you will first engage an architect to design your new home and then hire a contractor to build it. The architect will express your needs as a set of blueprints, recording decisions about size and shape, and requirements for various systems (structural, mechanical, electrical). Next the contractor will procure the labor and materials, including the listed systems, and then assemble them according to the drawings and specifications.
Now let’s return to your database perspective and think of the logical database design as the architectural blueprints and the physical database implementation as the completed home. The logical database design describes the size, shape, and necessary systems for a database; it addresses the informational needs and operational needs of your business. You then build the physical implementation of the logical database design, using your RDBMS software program. Once you have created your tables, set up table relationships, and established the appropriate levels of data integrity, our database is complete. Now you are ready to create an application that allows interacting easily with the data stored in the database and you can be confident that these applications will provide with timely and above all, accurate information.
It is possible to implement a poor design in an RDBMS, but a well designed database will yield accurate information, store data more efficiently and effectively, and will be easier to manage and maintain.
If a database is designed improperly, users will have difficulty in retrieving certain types of information, and there is the added risk that searches will produce inaccurate information.
Inaccurate information is probably the most detrimental result of improper database design. It can adversely affect the bottom line of a business. In fact, if the data kept and used in a database is going to affect the way a business performs its day-to-day operations or if it's going to influence the future direction of the business, database design must be a concern.
Activities involved in Database Design Designing the database involves:
Logical Database Design
Implementing and Altering Database Design
Logical Database Design
Logical database design involves three phases:
Requirements analysis phase
Data modeling phase
Normalization phase
Requirements Analysis Phase
The requirements analysis phase involves examining the business being modeled, interviewing users and management to assess the current system and to analyze future needs, and
determining information requirements for the business as a whole. This process is relatively straightforward.
Data Modeling Phase
The data modeling phase involves modeling the database structure itself. This involves using a data modeling method which provides a means of visually representing various aspects of the database structure, such as the tables, table relationships, and relationship characteristics. Some of the common data modeling methods are:
Entity Relationship modeling (ER modeling)
Semantic object modeling
Object role modeling
The method which you use here is a basic version of ER modeling.
ER Modeling
A simple ER Diagram looks as follows.
This figure represents several aspects of the database. First, it conveys the fact that there are two tables in this database, one called Agents and the other called Clients; each of the tables is represented by a rectangle. The diamond represents the fact that there is a relationship between these two tables, and the "1:N" within the diamond indicates that the relationship is a one-to-many relationship. Finally, the diagram conveys the fact that a client must be associated with an agent (indicated by the vertical line next to the AGENTS table), but an agent does not necessarily have to be associated with a client (as indicated by the circle next to the CLIENTS table).
Symbols used in ER Diagram: Box represents Entity
Diamond represents Relationship
Oval represents Attribute
Key Activities in ER Modeling:
Following are the key activities involved in designing Logical database using Entity-Relationship Mode:
Define Entities
Define Primary Key
Define Relationships among Entities
Define Additional Attributes for the Entities
Define Entities:
You begin the ER Model, by defining the entities, the significant objects of interest
Entities are the things about which you want to store information.
Define Primary Key:
A primary key is a unique identifier for an Entity.
If a primary key is made up of more than one attribute, then it is called as “Composite Key”.
Define Relationships among Entities
A connection established between a pair of tables is known as a relationship. A relationship exists when a pair of tables is connected by a Primary key and a foreign key or is linked together by a third table, known as a linking table.
Relationships are very important because they help to reduce redundant data and duplicate data. They also provide the means to define views.
Every relationship can be characterized by the type of relationship that exists between the tables, the type of participation each table has within the relationship, and the degree of participation each table has within the relationship.
Types of Relationships (Cardinality):
When two tables are related, there is always a specific type of relationship (traditionally known as cardinality) that exists between them. There are three possible types of relationships:
one-to-one
one-to-many and many-to-one relationships
One-to-One Relationships:
A one-to-one relationship exists between a pair of tables if a single record in the first table is related to only one record in the second table and a single record in the second table is related to only one record in the first table.
Following figure shows an example of a one-to-one relationship involving an EMPLOYEES table and a COMPENSATION table. In this example, a single record in the EMPLOYEES table is related to only one record in the COMPENSATION table; likewise, a single record in the COMPENSATION table is related to only one record in the EMPLOYEES table.
One-to-Many Relationships:
A one-to-many relationship exists between a pair of tables if a single record in the first table can be related to one or more records in the second table, but a single record in the second table can be related to only one record in the first table.
A one-to-many relationship involving a STUDENTS table and an INSTRUMENTS table is shown in the following figure. In this case, a single record in the STUDENTS table can be related to one or more records in the INSTRUMENTS table, but a single record in the INSTRUMENTS table is related to only one record in the STUDENTS table.
Many-to-One Relationships
A many-to-one relationship exists between a pair of tables if a single record in the first table can be related to only one record in the second table, but a single record in the second table can be related to one or many records in the first table.
A many-to-one relationship involving an EMPLOYEE table and a DEPARTMENT table is shown in the following figure. In this example, a single record in the EMPLOYEE table can be related to only one record in the DEPARTMENT table and a single record in the DEPARTMENT table can be related to one or more records in the EMPLOYEE table.
Following figure represents many-to-one relationship. Here a relationship is established between two tables with the help of a linking table.
Many-to-Many Relationships:
A many-to-many relationship exists between a pair of tables if a single record in the first table can be related to one or more records in the second table, and a single record in the second table can be related to one or more records in the first table.
Following figure shows a classic many-to-many relationship. In this example, a single record in the STUDENTS table can be related to one or more records in the CLASSES table; likewise, a single record in the CLASSES table can be related to one or more records in the STUDENTS table.
Following figure represents many-to-many relationship. Here a relationship is established between two tables with the help of a linking table.
Types of Participation (Optionality):
There are two types of participation that a table can have within a relationship:
Mandatory
Optional
Say there is a relationship between two tables called TABLE A and TABLE B.
If records in TABLE A must exist before any new records can be entered into TABLE B, TABLE A's participation within the relationship is mandatory.
However, if it is not necessary for records in TABLE A to exist in order to enter any new records into TABLE B, TABLE A's participation within the relationship is optional.
Consider the relationship between the AGENTS and CLIENTS tables in the following figure. The AGENTS table has a mandatory participation within the relationship if agents must exist before a new client can be entered into the CLIENTS table. But the AGENTS table's participation is optional if it isn't necessary to have agents in the AGENTS table before a new client can be entered into the CLIENTS table. The type of participation established for the AGENTS table is determined by the way its data is being used in relation to the data in the CLIENTS table. For example, if it is necessary to ensure that each client is assigned an available agent, then the participation of the AGENTS table within the relationship should be mandatory.
Representation of Cardinality and Optionality:
In general the following conventions are being used for representing cardinality and optionality,
Cardinality:
The notion of cardinality is expressed as either "one" or "many"
A cardinality of “one” is expressed as a “straight line” and a cardinality of “many” is expressed using “crow's feet”.
Optionality:
The notion of optionality is expressed as either "mandatory" or "optional"
An optionality of “Optional” is expressed as a “circle” and an optionality of “Mandatory” is expressed as a “vertical bar”.
Sample ER Diagram:
Consider the example of a database that contains information on the residents of the cities. The ER Diagram contains two Entities – Person and City.
Optionality: A person should live in a city. (That is why a bar appears adjacent to City from
Define Additional Attributes for the Entities:
Defining the attributes for an entity includes the following activities.
Defining attribute name.
Defining Data Type for the attributes.
Defining appropriate values for the attributes - what values are acceptable for the various attributes of a table.
Normalization: Normalization is a design approach that minimizes data redundancy and optimizes
data structures by systematically and properly placing data elements into the appropriate groupings. Normalization is the process of efficiently organizing data in a database and is the process of decomposing large tables into smaller tables.
Goals of Normalization: There are two goals of the normalization process:
Eliminating redundant data (for example, storing the same data in more than one table)
Ensuring data dependencies make sense (only storing related data in a table).
Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored and avoid problems with inserting, updating, or deleting data.
Normal Form: A series of guidelines were developed by the database community for ensuring that
the databases are normalized. Those guidelines are represented by different normal forms.
Types of Normal forms are as follows: First Normal Form or 1NF
Second normal Form or 2NF
Third Normal Form or 3NF
In practical applications, in general, you use only 1NF, 2NF, and 3NF.
First Normal Form:
All entities must have a unique identifier or key, that can be composed of one or more attributes.
Eliminate repeating groups and non-atomic data from an entity.
The term atomic derives from atom, the smallest indivisible particle that can exist on its own. First normal form eliminates repeating groups and non-atomic data from an entity.
To normalize a data model into 1NF, eliminate repeating groups into individual entities. In other words, do not use multiple attributes in a single entity to store similar data. Consider the sample data shown in table for a STUDENT information system for a college or university.
This data contains several violations of 1NF. First, you are tracking courses that really represent a repeating group for STUDENTs. So, the course information should be moved into separate entities. Furthermore, you need to specify identifiers for both entities. The identifier is the primary key for the entity.
A second violation of 1NF is the non-atomic data shown in the StudentName attribute. A student name can be broken down into pieces: first name, middle initial and last name. It is not indivisible, and therefore violates first normal form.
Unnormalized STUDENT Data
StudentID StudentName MajorID StudentMajor CourseNum CourseName CourseCompDate
2907 Smith, Jacob
R
MAT Mathematics MAT0011 Discrete Math 8/1/2002
MAT0027 Calculus I 4/30/2002 EGL0010 English Classics I 12/30/2001 4019 Patterson, Jane K
PHI Philosophy PHI0010 CS00100 Intro to Philosophy Programming Languages 04-30 2002-04-30 5145 Neeld, Norris B EGL English Literature SOC0102 Ascent of Man 8/1/2002 6132 Morrison, Xavier Q
MUS Music MUS0002 SOC0102 Origin of Jazz Ascent of Man 04-30 2002-08-01 7810 Brown, Richard E CS Computer Science 8966 Juarez, Samantha EGL English Literature EGL0010 EGL0101 English Classics I Shakespeare II 2001-12-30 2002-08-01
STUDENT Entity in 1NF
StudentID LastName FirstName MiddleInit MajorID StudentMajor
2907 Smith Jacob R MAT Mathematics
4019 Patterson Jane K PHI Philosophy
5145 Neeld Norris B EGL English
Literature
6132 Morrison Xavier Q MUS Music
7810 Brown Richard E CS Computer
Science
8966 Juarez Samantha EGL English
Literature
COURSE Entity in 1NF
StudentID CourseNum CourseName CourseCompDate
2907 MAT0011 Discrete Math 8/1/2002
2907 MAT0027 Calculus I 4/30/2002
2907 EGL0010 English Classics I 12/30/2001
4019 PHI0010 Intro to Philosophy 4/30/2002
4019 CS00100 Programming Languages 4/30/2002
5145 SOC0102 Ascent of Man 8/1/2002
6132 MUS0002 Origin of Jazz 4/30/2002
6132 SOC0102 Ascent of Man 8/1/2002
8966 EGL0010 English Classics I 12/30/2001
8966 EGL0101 Shakespeare II 8/1/2002
Second Normal Form:
Should be in First Normal Form
Every non-key attribute is fully dependent on the key
Second normal form (2NF) ensures that all the attributes of each entity are dependent on the primary key.
Notice that certain courses repeat in the COURSE entity, namely "English Classics I" and "Ascent of Man." This situation indicates a violation of 2NF. To correct the problem, we need to identify the attributes that do not depend on the entire key and remove them. The removed attributes, along with the portion of the primary key on which they depend, are placed in a new entity,
ENROLLMENT. The entire primary key of the original entity remains with the original entity. Another benefit of the normalization process is that you will frequently encounter new attributes
ENROLLMENT Entity in 2NF
StudentID CourseNum CourseCompDate
2907 MAT0011 2002-08-01 2907 MAT0027 2002-04-30 2907 EGL0010 2001-12-30 4019 PHI0010 2002-04-30 4019 CS00100 2002-04-30 5145 SOC0102 2002-08-01 6132 MUS0002 2002-04-30 6132 SOC0102 2002-08-01 8966 EGL0010 2001-12-30 8966 EGL0101 2002-08-01 COURSE Entity in 2NF
CourseNum CourseName Credits
MAT0011 Discrete Math 3
MAT0027 Calculus I 4
EGL0010 English Classics I 3
PHI0010 Intro to Philosophy 3
CS00100 Programming Languages 3
SOC0102 Ascent of Man 3
Third Normal Form:
Should be in Second Normal Form
Every non-key attribute is non-transitively dependent on the primary key that is, every attribute in the entity should depend only on the key not on any other non-key
attributes.
A rule of thumb for identifying 3NF violations is to look for groups of attributes whose values can apply to more than a single entity occurrence. When you discover such attributes, move them to a separate entity.
It is time to review our STUDENT information again, this time looking for 3NF violations. Examine the STUDENT data in closely. Notice that students can have the same major and, as such, certain major information can be repeated, specifically two students in our small sample are English Literature majors. To correct the problem, we need to remove major attributes that transitively depend on the key and create a new entity for them.
STUDENT Entity in 3NF
StudentID LastName FirstName MiddleInit MajorID
2907 Smith Jacob R MAT
4019 Patterson Jane K PHI
5145 Neeld Norris B EGL
6132 Morrison Xavier Q MUS
7810 Brown Richard E CS
8966 Juarez Samantha EGL
MAJOR Entity in 3NF
MajorID StudentMajor
MAT Mathematics PHI Philosophy
EGL English Literature
MUS Music
CS Computer Science
A Normalized Data Model: To be complete, a diagram should be developed for the 3NF data
model we just created for the STUDENT data. Figure shows such a data model. Notice that we have not filled in the optionality of the relationships. We could do this based on the sample data we used, but we really need to ask more questions before we can answer questions such as Does a every student have to have a major? The current data shows this to be the case, but in reality; you know that most freshmen, and even upperclassmen, may attend college without having a formally declared major.
The STUDENT data model
Further Normal Forms:
Normalization does not stop with 3NF. Additional normal forms have been identified and documented. However, normalization past 3NF does not occur often in normal practice. The following are additional normal forms. Just for your information we’ve kept this.
Boyce Codd normal form (BCNF) is a further refinement of 3NF. Indeed, in his later writings Codd refers to BCNF as 3NF. A row is in Boyce Codd normal form if and only if every determinant is a candidate key. Most entities in 3NF are already in BCNF.
Fourth normal form (4NF) states that no entity can have more than a single one-to-many
relationship if the one-to-many attributes are independent of each other. An entity is in 4NF if and only if it is in 3NF and has no multiple sets of multivalued dependencies.
Fifth normal form (5NF) specifies that every join dependency for the entity must be a consequence of its candidate keys.
Physical database design
After completing the logical design of our database, we now move to the physical design. The purpose of building a physical design of our database is to optimize performance while ensuring data integrity by avoiding unnecessary data redundancies.
During physical design, you transform the entities into tables, the instances into rows, and the attributes into columns. You must decide on many factors that affect the physical design, some of which are listed as follows:
How to translate entities into physical tables
What attributes to use for columns of the physical tables
Which columns of the tables to define as keys
What indexes to define on the tables
What views to define on the tables
How to denormalize the tables
How to resolve many-to-many relationships
Physical design is the time when you abbreviate the names that you chose during logical design. For example, you can abbreviate the column name that identifies employees, EMPLOYEE_NUMBER, to EMPNO.
The task of building the physical design is a job that truly never ends. You need to continually monitor the performance and data integrity characteristics of the database as time passes. Many factors necessitate periodic refinements to the physical design.
Denormalization: Denormalization is a key step in the task of building a physical relational
database design. It is the intentional duplication of columns in multiple tables, and the
consequence is increased data redundancy. This is recommended to avoid performance problems occur as a result of normalization. This should be done based on the processing needs of
Implementing and Altering Database Design Implementing the database design involves:
Implementing DB2 objects
loading data
managing data
altering the design as necessary
Altering Database Design:
After using a relational database for a while, we might want to change some aspects of its design.
To alter the database design we need to change the definitions of DB2 objects.
Advantage of DB2 over VSAM
The following list will give some of the advantages of DB2 over VSAM.
Feature DB2 VSAM
Hardware Independence PC to mainframe Only Mainframe
OS Independence NT, Unix and OS/390 Only OS/390
Ease of development Standard SQL Not so simple
Stored procedure & triggers No such option
Ease of maintenance Standard SQL Difficult
Security High degrees of security Only at Dataset level
Referential Integrity DB2 enforces it Developers responsibility
Query Interface Easy to view/modify Not available
Performance Better for even large data Better when data is less
Optimizer handles Developer responsible
Performance Tuning Can be tuned anytime Depends on initial design Can be at SQL level Only application level Tools available for aiding No tuning aids Abundant tuning skills Tuning skills are rare Reorganization Direct reorganization Delete & recreate
Online reorg possible Downtime needed
Recovery Managed by DB2 Managed by CICS/IMS
Always recoverable No recovery in batch
From log / backup From backup only
Auto Recovery Manual Restore
Backup Online backup possible Downtime needed
Incremental backup No incremental backup
Disaster Recovery Supported by DB2 Part of DASD recovery
Feature DB2 VSAM
Selective retrieval No Selective retrieval Up to row level archival Dataset level archival Data types Images, Video, Audio, and so on. Text only
Contents can be in file No such option
Summary
Database Design involves:
o Logical Database Design
o Physical Database Design
o Implementing and Altering Database Design
Key activities for designing Logical data modeling with Entity-Relationship Model are:
o Define Entities
o Define Primary Key: Unique identifier
Composite Key: Key made up of more than one attributes
o Define Relationships among Entities: One-to-one relationships
One-to-many and many-to-one relationships Many-to-many relationships
Cardinality: Number of occurrences between the entities. Optionality: Relationships are mandatory or optional Define Additional Attributes for the Entities
Normalization: Eliminating redundant data thereby reducing the amount of space
o Three Normal Forms: 1NF, 2NF, and 3NF
Physical database design is transformation of:
o Entities into tables
o Instances into rows
o Attributes into columns
o Denormalization is recommended to avoid performance problems occur as a result of normalization
Implementing the database design involves: Implementing DB2 objects, loading and managing data and altering the design as necessary.
o Altering Database Design: To alter the database design, you need to change the definitions of DB2 objects
Test Your Understanding
1. What are the tasks involved in database design? 2. What is logical data modeling?
7. What are all the different types of relationship? 8. What is Cardinality and Optionality?
9. What is Normalization?
10. What are the goals and benefits of Normalization? 11. State the different types of Normal forms.
12. Describe the tasks involved in the physical database design. 13. When will you do Denormalization?
14. State the activities involved in implementing and altering database design. 15. List some of the advantages of DB2 over VSAM.
Exercises
Following figure represents the “Medical Bill” to be produced to a customer. Design a Database for the medical shop system.
Session 07: Data Integrity
Learning Objectives
After completing the session, you will be able to:
Identify different means of achieving data integrity across the database by DB2
Introduction to Data Integrity
Data integrity refers to the validity, consistency, and accuracy of the data in a database. It cannot be overstated that the level of accuracy of the information retrieved from the database is in direct proportion to the level of data integrity imposed within the database. Data integrity is one of the most important aspects of the database design process, and it should not be underestimated, overlooked, or even partially neglected. To make any of these mistakes would result in a high risk of undetectable errors.
There are three types of data integrity and are as follows:
Entity Integrity
Referential Integrity
Domain Integrity
Entity Integrity
This is the “Table-level integrity” which ensures that the field that identifies each record within the table is unique and is never missing its value.
Entity integrity requires the specification of a primary key (PK) for each table. Key Notes about Primary Key:
Each table can have zero or one primary key.
Primary key should not be Null and if the primary key is a composite key, make sure that each component should not be Null.
Every primary key explicitly defined for a table must be associated with a corresponding unique index.
If you do not create a unique index for a primary key, then an incomplete key is defined for the table, making the table inaccessible.
Unique Constraint:
A unique constraint is similar to a primary key constraint which also enforces unique values on an individual or a group of columns. Each table can have zero, one, or many unique constraints consisting of one or more columns each. The values stored in the unique column, or combination of columns, must be unique within the table. Unique constraint column should not be Null.
Unique Index:
In addition of creating unique index in primary key column or unique constraint column (which is mandatory), you can create as many unique indexes as we need on any other columns of the table to ensure uniqueness.
The following table will show the difference between unique index on Primary Key/Unique constraint column and other column other than primary/unique constraint column.
Primary Key or unique constraint column Column other than primary/unique constraint but defined with unique index
A table can contain only one primary key constraint and multiple unique constraints
A table can contain multiple unique indexes
Cannot allow NULL values Can allow NULL values
Supports Referential Integrity Cannot support Referential Integrity
Referential Integrity
This is a “Relationship-level integrity” which ensures that the relationship between a pair of tables is sound and that there is synchronization between the two tables whenever data is entered, updated, or deleted.
Referential integrity is achieved through the foreign key.
Foreign Key:
A foreign key (FK) is a column or combination of columns used to establish and enforce a link between the data in two tables.
A link is created between two tables by adding the column or columns that hold one table's primary key values to the other table. This column becomes a foreign key in the second table.
The table with the primary key is called parent table and the table with the foreign key is called
dependent table (or child table).
Referential integrity (RI) means each row in a dependent table must have a foreign key that is equal to a primary key in the parent table.
Rules ensuring RI:
Insert:
When inserting a row with a foreign key in the dependent table, DB2 checks the values of the foreign key column against the values of the primary key column in the parent table. If there is a matching primary key column, the insert is allowed. If there is no matching primary key column, the insert will not happen.
A new row can be inserted in the parent table as long as the primary key value of the new row is unique.
Update:
When updating foreign key values in the dependent table, DB2 checks whether there is a matching primary key in the parent table or not. If there is a matching primary key, update is allowed. If there is no matching primary key, update is not allowed.
The primary key in the parent table cannot be updated if any rows in the dependent tables refer to it.
Delete:
Deleting a row with a foreign key in the dependent table is permitted.
When deleting a row with a primary key in the parent table, DB2 takes one of the following actions as indicated while defining the table.
RESTRICT: Disallows the deletion of the primary key row if any foreign keys relate to that row. CASCADE: Allows the deletion of the primary key row and also deletes the foreign key rows that relate to it.
SET TO NULL: Allows the deletion of the primary key row and, instead of deleting all related foreign key rows, sets the foreign key columns to NULL.
Operation Child Table Parent Table
Insert Allowed if the foreign key value matches the value of Primary key of the parent table.
Allowed as long as the primary key value is unique.
Update Allowed if the foreign key value matches the value of Primary key of the parent table.
Allowed if there are no foreign key references in the child tables.
Delete Allowed. Depending upon the action specified during
table definition.
Restrict: not allowed if there are any foreign key references
Cascade: allowed and deletes the foreign key references if any in the child tables. SET TO NULL: allowed and a Null value will be set in the foreign key references of the child tables
Domain Integrity
This is the “Field-level integrity” which ensures that the structure of every field is sound, that the values in each field are valid, consistent, and accurate, and that fields of the same type (such as City fields) are consistently defined throughout the database.
Domain integrity is enforced using:
Data Type and Length
Default Values
NULL Values
Check constraint Data Type and Length
By specifying the data type and maximum length for each column when a table is created, the DBMS will automatically ensure that only the correct type of data with the maximum length allowed is stored in that column.
Default Values
When columns are created within tables, they can be assigned a default value that will be used when inserting or loading the data which do not provide an explicit value for that column. Each column can have only one default value. User can provide a default value for a column. If there is no user specific default value, DB2 will assign the default value based on the data type of that column.
DB2 Data Types with Default Values: Data Type COBOL
PIC
COBOL USAGE
Default Description
CHAR(n) PIC X(n) DISPLAY Blanks Fixed length character
(EBCDIC) data
VARCHAR(n) PIC X(n) DISPLAY Empty String Variable length character data
SMALLINT PIC S9(4) COMP OR
COMP-4
0 Half word integer data
INTEGER PIC S9(9) COMP OR
COMP-4
0 Full word integer data
DECIMAL(p,s) PIC S9(p)V9(s)
COMP-3 0 Packed decimal data.
p è number of decimal digits s è number of digits to the right side of the decimal point
DATE PIC X(10) DISPLAY Current Date Date data (yyyy-mm-dd)
TIME PIC X(8) DISPLAY Current Time Time data (hh.mm.ss)
TIMESTAMP PIC X(26) DISPLAY Current
timestamp
Date and Time data with microseconds
(yyyy-mm-dd-hh.mm.ss.mmmmmm)
GRAPHIC PIC G(n) DISPLAY-1 Blanks Double byte character set
(DBCS) data
VARGRAPHIC PIC G(n) DISPLAY-1 Empty String Variable length DBCS data
FLOAT(n) None COMP-1 or
COMP-2
0 Floating point data in single or
NULL Values:
Whenever a value is missing or unknown, it is said to be Null. A null value represents neither zero (in the case of numeric data) nor blank (represented by one or more spaces in the case of textual data). Zero and blank are actual values and can be meaningful in some way under certain circumstances. For example, a zero can represent the current state of an Account Balance; a blank in a Middle Initial field can represent the fact that an employee has no middle initial in his or her name. In the following figure, a blank represents the fact that Washington, D.C., is not located in any county whatsoever.
A null value is typically used to represent an unknown value in a field. In the above figure, for example, there are null values in the County field. Shannon McLain did not know what county she lived in at the time her data was entered into the database, so no entry was made into the County field. As a result, the County field contains a null value. This value can be changed, however, once Shannon finds out what county she lives in.
A null value is also used to represent a missing value in a field. If the person who entered the data for Shannon McLain failed to ask her for the name of the county she lives in, the data is
considered missing since no entry was made into the County field due to operator error. Once the error is recognized, it can be easily corrected by obtaining the appropriate value from Ms. McLain.
A drawback to null values is that they cannot be evaluated by mathematical expressions or aggregate functions. If a null value is used in a mathematical expression, that expression will
logically reasonable—if the number is unknown, the value will necessarily be unknown. Also there is a serious undetected error that occurs if all the values in the Total Value field are then added together: an inaccurate total. The only way to obtain an accurate total is to provide a value for the entries in the Qty On Hand field that are currently Null.
The result of an aggregate function, such as "Count()," will be Null if it is based on a field that contains null values. For example, the following figure shows the results of a summary query that counts the total number of occurrences of each category in the PRODUCTS table shown above. The value of Total Occurrences in the summary query is the result of the function expression "Count([Total Occurrences])." Notice that the summary query shows 0 occurrences of an
unspecified Category, implying that each product has been assigned a category. This information is clearly inaccurate because there are two products in the PRODUCTS table that have not been assigned a category.
Check Constraint
A check constraint is a rule that specifies the values that are allowed in one or more columns of every row of a table. Check constraint enforces business rules directly into the database without requiring additional application logic. This can be defined during column definition.
Summary
Data Integrity ensures the accuracy of the data in a database.
Types of Data Integrity
o Entity Integrity
o Referential Integrity
o Domain Integrity
Entity integrity enforces each occurrence of an entity must be uniquely identifiable and is achieved through primary key.
A table definition can be complete only when a unique index is created on its primary key.
Unique constraint also behaves similar to primary key constraint except the fact of the number of unique constraints can be more than one in a table.
Unique index column allows null values and does not support RI as against primary or unique constraint column.
Foreign key is a column in a child table which holds the value of primary key column of a parent table.
Referential integrity (RI) ensures each row in a dependent table must have a foreign key that is equal to a primary key in the parent table.
Operation Child Table Parent Table
Insert Allowed if the foreign key value matches the value of Primary key of the parent table.
Allowed as long as the primary key value is unique.
Update Allowed if the foreign key value matches the value of Primary key of the parent table.
Allowed if there are no foreign key references in the child tables.
Delete Allowed. Depending upon the action specified
during table definition.
Restrict: Not allowed if there are any foreign key references
Cascade: Allowed and deletes the foreign key references if any in the child tables. SET TO NULL: Allowed and a Null value will be set in the foreign key references of the child tables
Domain integrity ensures the possible values of a column and is achieved through:
o Data Type and Length:
o Commonly used DB2 data types: Char, Varchar, Smallint, Integer, Decimal, Date,
Time, Timestamp
o Default Values:
Can be user defined or depends on the data type
Used for columns with the missing values during insertion
o NULL Values: Unknown or missing value
o Check constraint: Enforces business rules and allows only the predefined values
Test Your Understanding 1. What is data integrity?
2. What is entity integrity and how do you achieve it? 3. Can you create a table without a primary key? 4. Can a primary key column hold a value of null?
5. State the importance of unique index in the primary key. 6. What is a unique constraint?
7. Differentiate between the primary key constraint and unique constraint. 8. State the differences between primary key index and unique index. 9. What is a foreign key?
10. What is a referential integrity?
11. List the rules applied to ensure RI while inserting, updating, and deleting the data both in the parent table as well as in the child table.
Session 08: Interaction with DB2
Learning Objectives
After completing the session, you will be able to:
Access DB2 data using DB2I, SPUFI and QMF
Interaction with DB2: Overview
An environment is connected to a DB2 subsystem by an attachment facility. Whenever there is a need of interaction with DB2, a thread will be established.
A thread is a control structure used to:
Send requests to DB2
Send data from DB2 to the requestor
Communicate the status of each SQL statement after it is executed
TSO as a Door to DB2:
You use TSO (Time-Sharing Option) environment as a door that provides access to DB2 data. The TSO Attachment Facility provides access to DB2 resources in two ways:
Online mode, in the TSO foreground, using ISPF (Interactive System Productivity Facility) panels
Batch mode using the TSO Terminal Monitor Program - IKJEFT01 (or IKJEFT1B)
Common types of TSO foreground users include:
DB2I
DB2I
DB2I (DB2 Interactive) is a TSO-based DB2 application. It consists of a series of ISPF panels, programs, and CLISTs enabling rapid access to DB2 services and data. DB2I increases the TSO DB2 developer's productivity.
How to access DB2I:
Log on to TSO as you normally would. The ISPF main menu appears.
Choose Option 8 (DB2 - Perform DATABASE 2 Interactive Functions) in Cognizant Mainframe.
DB2I Main menu appears as shown in the following screenshot.
DB2I Primary Option Menu
Following Options are available with DB2I:
o SPUFI: SQL Processor Using File Input.
o DCLGEN: Declaration Generator.
o PROGRAM PREPARATION: Prepares a program containing embedded SQL for execution.
o PRECOMPILE: Program containing embedded SQL is parsed to retrieve all SQL and replace it with calls to a runtime interface to DB2
o BIND/REBIND/FREE: Provides the capability to bind a DB2 plan & package, rebound a plan & package, remove plan & package from the system
o UTILITIES: Provides panels that ease the administrative burdens of DB2 utility processing
o DB2I DEFAULTS:
Lets you modify parameters that control the operation of DB2I
Be sure that the proper DB2 subsystem is specified in the “DB2 Name” parameter
Be sure that the proper language to be used for preparing DB2 programs in the “Application Language” parameter
A valid job card needs to be supplied in the “DB2I Job Statement” parameter.
o EXIT: Leaves DB2I
SPUFI
The first option in the DB2I main menu is SPUFI (SQL Processor Using File Input).
SPUFI is intended primarily for application programmers who wish to test the SQL portions of their programs or administrators who wish to perform SQL operations.
SPUFI reads SQL statements contained as text in a sequential file or in a member of a PDS, processes those statements and places the results in an ISPF browse session. By specifying the input and output data sets and selecting the appropriate options, we can execute SQL statements in an online mode.
The SPUFI Panel is as follows.
Input Dataset
Output Dataset