Junping Sun Database Systems 4-1
Part 4: Database Language - SQL
Database Languages and Implementation Data Model
Data Model = Data Schema + Database Operations + Constraints
• Database languages such as SQL and QUEL can be viewed as tools to implement the database schema and data operations at the logical or implementation level.
• Database Language = Data Definition Language (DDL) + Data Manipulation Language (DML)
• DDL implements database schema
• DML implements database operations
• Separation of DDL and DML is the major distinction between application systems developed with database languages and those developed with programming languages.
SQL - Structured Query Language
SQL:
• It is the most widely accepted and implemented interface language for relational database systems ("intergalactic dataspeak").
History of Relational Database Languages:
• SEQUEL (1974 -- 1975)
• It was the Application Programming Interface (API) to System R.
• It was revised to SEQUEL/2 after several years, and SEQUEL/2 was later renamed SQL.
• SQL/DS (1981)
• DB2 (1983)
• SQL (ANSI-86), the first standardized version of SQL, called SQL1
• SQL (ANSI-89)
• SQL (ANSI-92), called SQL2
• SQL3, which supports recursive operations and the object-oriented paradigm
• SQL-99 Standard
Data Definition
Schema Definition at Three Levels of Databases:
View data schema (table) definition:
• A view table can be defined on top of one or more base tables.
Base data table schema definition:
• A base table corresponds to one physical data file in the storage system.
Physical:
• Each base table can be stored in a different type of storage schema or data organization structure, such as
sequential file, hash index, ISAM, VSAM,
B-Tree, B+-Tree, B*-Tree, K-D Tree, KDB Tree, R-Tree, R+-Tree, R*-Tree
• Integrity constraints on schema
• Authorization and security mechanisms on user-defined database operations such as query, update, and insert/delete operations.
Data Definition
Create Statements:
• create table statement (to define a base table)
• create index statement (to define an index at internal level)
• create view statement (to define a view at user level)
• create schema statement (to treat a database as a whole unit in SQL89 & SQL2)
Drop Statements:
• drop table statement (to delete the definition and all instances of the table)
• drop index statement (to remove an existing index)
• drop view statement (to delete the view)
• drop schema statement (to delete the schema)
Schema and Catalog in ANSI-SQL Standard
SQL Schema:
• It is identified by a schema name and includes an authorization identifier to indicate the user or account that owns the schema.
Example:
CREATE SCHEMA COMPANY AUTHORIZATION JSMITH;
• It creates a schema called COMPANY, owned by the user with authorization identifier JSMITH.
Syntax:
schema ::= CREATE SCHEMA schema-name AUTHORIZATION user
CREATE TABLE EMPLOYEE Statement
CREATE TABLE EMPLOYEE
(NAME VARCHAR2(19) NOT NULL,
SSN CHAR(9),
BDATE DATE,
ADDRESS VARCHAR(30),
SEX CHAR,
SALARY NUMBER(10,2),
SUPERSSN CHAR(9),
DNO VARCHAR(8) NOT NULL,
CONSTRAINT EMPPK PRIMARY KEY(SSN),
CONSTRAINT EMPSUPERFRK
FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE (SSN) DISABLE,
CONSTRAINT EMPDUMFRK
FOREIGN KEY (DNO) REFERENCES DEPARTMENT (DNUMBER) DISABLE);
• The constraint can be enabled by using the ALTER TABLE statement after the data is loaded into the table.
ALTER TABLE EMPLOYEE ENABLE CONSTRAINT EMPSUPERFRK;
Specifying Referential Triggered Actions
CREATE TABLE EMPLOYEE
(NAME VARCHAR2(19) NOT NULL,
SSN CHAR(9),
BDATE DATE,
ADDRESS VARCHAR(30),
SEX CHAR,
SALARY NUMBER(10,2)
CHECK (SALARY BETWEEN 10000 AND 99000),
SUPERSSN CHAR(9),
DNO VARCHAR(9) DEFAULT '1' NOT NULL,
CONSTRAINT EMPPK PRIMARY KEY (SSN),
CONSTRAINT EMPSUPERFK
FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE (SSN) ON DELETE CASCADE DISABLE);
Specifying Referential Triggered Actions
CREATE TABLE DEPARTMENT
(DNAME VARCHAR2(15) NOT NULL,
DNUMBER VARCHAR(8),
MGRSSN CHAR(9) DEFAULT '888665555' NOT NULL,
CONSTRAINT DEPTPK
PRIMARY KEY (DNUMBER),
CONSTRAINT DEPTSK UNIQUE (DNAME),
CONSTRAINT DEPTMGRFRK
FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE(SSN)
ON DELETE CASCADE DISABLE);
ALTER TABLE EMPLOYEE ADD (CONSTRAINT EMPDNOFRK
FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER) );
Data Types
SQL Data Types: (ANSI-SQL)           SQL Data Types: (ORACLE)
CHARACTER(n)                         CHAR(n)
CHARACTER VARYING(n)                 VARCHAR(n), VARCHAR2(n)
NUMERIC(p,s), DECIMAL(p,s)           NUMBER(p,s)
INTEGER, INT, SMALLINT               NUMBER(38)
FLOAT(b), DOUBLE PRECISION, REAL     NUMBER
DATE                                 DATE
                                     RAW
                                     LONG
                                     LONG RAW
                                     ROWID
Data Manipulation in SQL
Data Manipulation at Base Table Level:
• Query the database via select statement.
• Modify data (tuples) in a table of the database via update statement.
• Remove data (tuples) from a table of the database via delete statement.
• Append data (tuples) into a table in the database via insert statement.
Data Manipulation at View (virtual table) Level:
• Query the partial database via select statement from view.
• Update or modify the partial data defined at the view level by
mapping the view update to the underlying base table:
• single table update
• multiple table update still has unsolved problems.
Query Database In SQL
• Querying a database in SQL is done via the select statement.
General format of select statement:
select <attribute list>
from <table list>
where <condition>
• <attribute list> is a list of attribute names whose values are to be retrieved by the query.
• <table list> is a list of the relation names required to process the query.
Multiple tables listed in the <table list> imply that a join operation is involved.
• <condition> is a conditional (Boolean) expression that identifies the tuples to be retrieved by the query.
<condition> specifies the selection and join operations.
<condition> can include another select statement as a subquery or nested query.
SELECT-PROJECT QUERY
Q0: Retrieve the birth date and address of the employee whose name is ‘John B. Smith’.
SQL Script for Q0:
Q0: select bdate, address
from employee
where fname = 'John' and minit = 'B' and lname = 'Smith';
Relational Algebra Expression for Q0:
π<bdate, address> (σ<fname = 'John' and minit = 'B' and lname = 'Smith'> (employee))
Target Attributes: bdate, address
Constraint: fname = 'John' and minit = 'B' and lname = 'Smith'
Target Relation: employee
SELECT-PROJECT-JOIN QUERY
Q1. Retrieve the first and last names and addresses of all employees who work for the 'Research' department.
select fname, lname, address
from employee, department
where dname = 'Research' and dnumber = dno;
Target Attributes: fname, lname, address
Constraint:
Select Condition: dname = 'Research'
Join Condition: dnumber = dno
Target Relations: employee, department
• This query involves one selection on department relation and a join on relations employee and department.
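Q1 can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for the DBMS and uses a handful of invented rows; only the columns Q1 needs are declared, and an order by clause (not part of Q1) is added so the output order is predictable.

```python
import sqlite3

# Toy rows invented for illustration; column names follow the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dname text, dnumber integer primary key);
    create table employee (fname text, lname text, address text, dno integer);
    insert into department values ('Research', 5), ('Administration', 4);
    insert into employee values
        ('John', 'Smith', '731 Fondren, Houston, TX', 5),
        ('Alicia', 'Zelaya', '3321 Castle, Spring, TX', 4),
        ('Joyce', 'English', '5631 Rice, Houston, TX', 5);
""")

# Q1: one selection (dname = 'Research') and one join (dnumber = dno).
q1 = """
    select fname, lname, address
    from employee, department
    where dname = 'Research' and dnumber = dno
    order by lname
"""
research_staff = conn.execute(q1).fetchall()
print(research_staff)
```

Only the two employees whose dno matches the dnumber of 'Research' come back; the Administration employee is filtered out by the combination of join and selection conditions.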
Q2. For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birth date.
select pnumber, dnum, lname, address, bdate
from project, department, employee
where plocation = 'Stafford' and dnum = dnumber and mgrssn = ssn;
Target Attributes: pnumber, dnum, lname, address, bdate
Constraints:
Select Condition: plocation = 'Stafford'
Join Condition: dnum = dnumber, mgrssn = ssn
Target Relations: project, department, employee
• A selection operation on the project relation selects the project tuples located in 'Stafford'.
• A join of the project and department relations finds the controlling department.
• A join of the department and employee relations finds the manager's information in the employee relation.
• The two join operations implement two relationships in the ER schema of the database, MANAGES and CONTROLS.
Dealing with Ambiguous Attribute Names and Aliasing
Q1A: select fname, lname, address
from employee, department
where department.dname = 'Research' and department.dnumber = employee.dnumber;
• If the attribute for the department number has the same name in both the employee and department tables, then qualifiers are necessary in the query to avoid ambiguity.
Q8. For each employee, retrieve the employee's first and last name and the first and last name of his or her immediate supervisor.
select e.fname, e.lname, s.fname, s.lname
from employee e, employee s
where e.superssn = s.ssn;
Discussion on Aliasing
• Ambiguity also arises in queries that refer to the same relation twice.
• The query above declares two alternative relation names (aliases), e and s, for the employee relation.
• e and s can be imagined as two different copies of the employee relation.
e represents employees in the role of supervisees.
s represents employees in the role of supervisors.
• Join and selection operations are involved.
• The join attributes are superssn and ssn.
The join condition e.superssn = s.ssn links each employee to the supervisor's corresponding information, such as fname and lname.
• The join condition implements the recursive relationship SUPERVISION in the original ER schema.
• This is an example of one-level recursion.
• A general recursive query, with an unknown number of levels, cannot be specified this way.
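The two-copies view of e and s can be checked with a short sketch. It assumes Python's sqlite3 module and three invented employee rows forming a chain of supervision; the order by clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (fname text, lname text, ssn text primary key, superssn text);
    insert into employee values
        ('James', 'Borg', '888665555', null),
        ('Franklin', 'Wong', '333445555', '888665555'),
        ('John', 'Smith', '123456789', '333445555');
""")

# Q8: e plays the supervisee role, s plays the supervisor role.
q8 = """
    select e.fname, e.lname, s.fname, s.lname
    from employee e, employee s
    where e.superssn = s.ssn
    order by e.lname
"""
pairs = conn.execute(q8).fetchall()
for pair in pairs:
    print(pair)
```

The row for Borg does not appear on the supervisee side because his superssn is NULL and therefore matches no ssn. Each result row covers exactly one level of supervision, which is the one-level recursion described above.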
Query Examples
Query with PROJECT:
Q9: List all employees' social security numbers.
select ssn
from employee;
Query with SELECT:
Q1C: Retrieve all employee tuples from department 5.
select *
from employee
where dno = 5;
Query Examples
Query with CARTESIAN PRODUCT:
Q10: List all combinations of EMPLOYEE SSN and DEPARTMENT DNAME.
select ssn, dname
from employee, department;
Query with Retrieving Distinct Attribute Values:
Q11: Retrieve the salary of every employee.
select ALL salary
from employee;
Q11A: Retrieve all distinct salary values.
select DISTINCT salary
from employee;
Query Involving Union
Q4. Make a list of all project numbers for projects that involve an employee whose last name is 'Smith' as a worker or as a manager of the department that controls the project.
(select distinct pnumber
from project, employee, department
where lname = 'Smith' and dnum = dnumber and mgrssn = ssn)
union
(select distinct pnumber
from project, employee, works_on
where lname = 'Smith' and pnumber = pno and essn = ssn);
• The first select query retrieves the projects that involve a 'Smith' as the manager of the department that controls the project.
• The second select query retrieves the projects that involve a 'Smith' as a worker on the project.
• If several employees have the last name 'Smith', the project numbers involving any of them are retrieved.
Discussion
The first part of union:
Target Attributes: pnumber
Constraints:
Select Condition: lname = 'Smith'
Join Condition: dnum = dnumber (implements relationship CONTROLS)
mgrssn = ssn (implements relationship MANAGES)
Target Relations: project, employee, department
The second part of union:
Target Attributes: pnumber
Constraints:
Select Condition: lname = 'Smith'
Join Condition: pnumber = pno and essn = ssn (implements M:N relationship WORKS_ON)
Target Relations: project, employee, works_on
Predicate IN
• The IN predicate selects those rows for which a specified value appears in a list of constant values enclosed in parentheses or in the results of a subquery.
Q13: Retrieve the social security numbers of all employees who work on any of the projects with project number 1, 2, or 3.
select distinct essn
from works_on
where pno in (1, 2, 3);
Result from the query:
essn
123456789
666884444
453453453
333445555
Workson Table
Predicate NOT IN
• The NOT IN predicate is true if the expression preceding the keywords NOT IN does not match any value in the list.
Q13b: Retrieve the social security numbers of all employees who work on projects other than projects 1, 2, and 3.
select essn
from works_on
where pno not in (1, 2, 3);
Result from the query:
essn
333445555
888665555
987654321
987987987
999887777
Quantifier ANY/SOME
Predicate ANY /SOME:
• The ANY/SOME predicates select those rows for which a specified value appears in the results of a subquery.
Query: Retrieve the social security numbers of employees who work on some project controlled by department 5.
select distinct essn
from works_on
where pno = any (select pnumber from project where dnum = 5);
• The = ANY predicate is the same as the IN predicate.
• ANSI-SQL supports both the ANY and SOME predicates, even though they are equivalent.
• ORACLE supports only the ANY predicate, not SOME.
• The difference between the IN and = ANY (= SOME) predicates is that IN can be followed by either a list of values or a subquery, whereas ANY (SOME) can be followed only by a subquery.
Quantifier SOME and ANY
• Both SOME and ANY are designed to link a simple relational operator with a subquery that returns a multi-row result.
• The sequence preceding the subquery, {expression relational-operator quantifier}, is called the quantifier predicate:
Expression    Comparison-operator    Quantifier    Subquery
quantity      >                      ANY           (select ... )
• The quantifier predicate is applied to each row of the subquery result in turn.
The logical expression is true if and only if one or more rows in the subquery result satisfy the comparison.
It is false if and only if none of the subquery result rows satisfy the comparison.
Quantifier ALL
Quantifier ALL:
• The ALL predicate evaluates to true if and only if the comparison between a single value and the set of values retrieved by the subquery is true for all values retrieved by the subquery.
Query: List the names of employees whose salary is greater than the salary of all the employees in department 5.
select lname, fname
from employee
where salary > all (select salary from employee where dno = 5);
• The quantifiers ANY, SOME, and ALL can be prefixed with any comparison operator in {=, >, ≥, <, ≤, ≠}.
• ≠ can be expressed by <> or != in an SQL condition expression.
Discussions on Predicates IN and NOT IN
• The predicate
a IN (x, y, z) is equivalent to a = x OR a = y OR a = z
select essn
from works_on
where pno = 1 or pno = 2 or pno = 3;
• The predicate
a NOT IN (x, y, z) is equivalent to a <> x AND a <> y AND a <> z
select essn
from works_on
where pno <> 1 and pno <> 2 and pno <> 3;
• The predicate
a NOT IN (x, y, z) is equivalent to a <> ALL (x, y, z)
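The equivalences between IN and a chain of OR conditions, and between NOT IN and a chain of AND conditions, can be verified on a toy works_on table. The sketch assumes Python's sqlite3 module; the rows and the helper function essns are invented for illustration, and distinct plus order by are added so the two forms can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table works_on (essn text, pno integer);
    insert into works_on values
        ('123456789', 1), ('123456789', 2), ('333445555', 3),
        ('333445555', 10), ('987654321', 20), ('888665555', 30);
""")

def essns(where_clause):
    """Hypothetical helper: run the same query with a different where clause."""
    sql = "select distinct essn from works_on where " + where_clause + " order by essn"
    return [row[0] for row in conn.execute(sql)]

in_rows = essns("pno in (1, 2, 3)")
or_rows = essns("pno = 1 or pno = 2 or pno = 3")
not_in_rows = essns("pno not in (1, 2, 3)")
and_rows = essns("pno <> 1 and pno <> 2 and pno <> 3")

print(in_rows == or_rows, not_in_rows == and_rows)
```

Employee 333445555 shows up on both sides because that employee has one row inside the list (project 3) and one outside it (project 10); the predicates are evaluated row by row, not per employee.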
Nested Query (Type-N)
Q4A. Make a list of all project names for projects that involve an employee whose last name is 'Smith' as a worker, or as a manager of the department that controls the project.
select distinct pname
from project
where pnumber in (select pnumber
from project, department, employee
where lname = 'Smith' and
dnum = dnumber and mgrssn = ssn)
or
pnumber in (select pno
from works_on, employee
where lname = 'Smith' and essn = ssn);
• The comparison operator IN compares a value v (here v is pnumber) with a set (or multiset) of values V and evaluates to TRUE if v is one of the elements in V.
Decomposition of Nested Query
Subquery 1:
temp1: select pnumber
from project, department, employee
where dnum = dnumber and mgrssn = ssn and lname = 'Smith'
Subquery 2:
temp2: select pno
from works_on, employee
where essn = ssn and lname = 'Smith'
Subquery 3:
select distinct pnumber from project
Comparison of Nested and Flattened Queries
Query: Retrieve the social security numbers of employees who work on some projects controlled by department 5.
select distinct essn
from works_on
where pno in (select pnumber from project where dnum = 5);
Equivalent Query:
select distinct essn
from works_on, project
where dnum = 5 and pno = pnumber;
• The first implementation uses a subquery and avoids an explicit join operation. Because the subquery can return several project numbers, it is compared with IN (or = ANY) rather than with =.
• The second implementation has to use a join operation, where pno = pnumber is the join condition or join path.
Correlated Nested Query (Type-J)
Q12. Retrieve the name of each employee who has a dependent with the same first name and same sex as the employee.
select e.fname, e.lname
from employee e
where e.ssn in (select essn
from dependent
where essn = e.ssn and
sex = e.sex and
e.fname = dependent_name);
• The where clause of the inner query block contains join predicates that reference the table of an outer query block (and that table is not included in the from clause of the inner query block).
• essn = e.ssn correlates the current dependent tuple with the employee the dependent belongs to.
• sex = e.sex and e.fname = dependent_name check that the sex and first-name values of the employee and dependent tuples are equal.
Rule for Subqueries and Nested Queries
1. The subquery should be enclosed within parentheses.
2. Subqueries may contain nested subqueries. When subqueries are nested, SQL evaluates them from the inside out.
a. The innermost query is processed first
b. Then the result of query is passed to the next outer query.
3. In general, there may be several levels of nested queries. Ambiguity among attribute names is possible if attributes of the same name exist, one in a relation in the from-clause of the outer query and the other in a relation in the from-clause of the nested (inner) query.
The rule is that a reference to an unqualified attribute refers to the relation declared in the innermost nested query.
4. Column names in a subquery are implicitly qualified by the table name in the FROM clause of the subquery (that is, the FROM clause at the same level).
5. A subquery may refer only to column names from tables that are named in outer queries or in the subquery's own FROM clause.
A subquery may not access tables that are used only by a child query.
6. When a subquery is one of the two operands involved in a comparison, the subquery must be written as the second operand.
Query with Exists Function
Q12B: Retrieve the name of employee who has a dependent with the same first name and same sex as the employee.
select e.fname, e.lname
from employee e
where exists (select *
from dependent
where essn = e.ssn and
sex = e.sex and
e.fname = dependent_name);
The Exists Function in SQL
• exists and not exists in SQL are used to check whether the result of a correlated query is empty.
• exists and not exists in SQL are usually used in conjunction with a correlated nested query.
• In example Q12B, the nested query within the exists function references the ssn, fname, and sex attributes of the employee relation from the outer query.
• For each employee tuple, evaluate the nested query, which retrieves all dependent tuples with the same social security number, sex, and name as the employee tuple.
If at least one tuple exists in the result of the nested query, then select that employee tuple.
In general,
exists(Q) returns TRUE if there is at least one tuple in the result of query Q and returns FALSE otherwise.
not exists(Q) returns TRUE if there are no tuples in the result of query Q and returns FALSE otherwise.
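The behavior of exists with a correlated nested query can be sketched as follows. The example assumes Python's sqlite3 module and invented rows; the nested query uses the same three correlation conditions as Q12 (matching ssn, sex, and first name), and the alias d is added here only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (fname text, lname text, ssn text primary key, sex text);
    create table dependent (essn text, dependent_name text, sex text);
    insert into employee values
        ('John', 'Smith', '123456789', 'M'),
        ('Franklin', 'Wong', '333445555', 'M');
    insert into dependent values
        ('123456789', 'John', 'M'),   -- same first name and sex as John Smith
        ('123456789', 'Alice', 'F'),
        ('333445555', 'Joy', 'F');
""")

# The nested query is re-evaluated for each employee tuple e.
q12b = """
    select e.fname, e.lname
    from employee e
    where exists (select *
                  from dependent d
                  where d.essn = e.ssn
                    and d.sex = e.sex
                    and d.dependent_name = e.fname)
"""
same_name = conn.execute(q12b).fetchall()
print(same_name)
```

For Wong the nested query returns no tuples, so exists evaluates to FALSE and the employee is dropped; for Smith one dependent tuple qualifies, which is enough for exists to return TRUE.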
Query with Not Exists Function
Q6: Retrieve the names of employees who have no dependents.
select fname, lname
from employee
where not exists (select *
from dependent
where ssn = essn);
• The correlated nested query retrieves all dependent tuples related to an
employee tuple, if none exist, the employee tuple is selected.
• For each employee tuple, the nested query selects all dependent tuples
whose essn value matches the employee ssn.
• If the result of the nested query is empty then no dependents are related to
the employee, so that employee tuple is selected and its fname and lname are retrieved.
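Q6 can be run on a toy instance to see which employee tuples survive. This is a sketch assuming Python's sqlite3 module and invented rows; an order by clause is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (fname text, lname text, ssn text primary key);
    create table dependent (essn text, dependent_name text);
    insert into employee values
        ('John', 'Smith', '123456789'),
        ('Joyce', 'English', '453453453'),
        ('Ahmad', 'Jabbar', '987987987');
    insert into dependent values ('123456789', 'Alice'), ('123456789', 'Michael');
""")

# ssn is not a column of dependent, so it refers to the outer employee tuple.
q6 = """
    select fname, lname
    from employee
    where not exists (select * from dependent where ssn = essn)
    order by lname
"""
no_dependents = conn.execute(q6).fetchall()
print(no_dependents)
```

Smith has two dependent tuples, so the nested query is non-empty for him and not exists rejects the tuple; the other two employees have empty nested results and are selected.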
Nested Query with Two Exists Function
Q7. List the names of managers who have at least one dependent.
select fname, lname
from employee
where exists (select *
from dependent
where ssn = essn)
and exists (select *
from department
where ssn = mgrssn);
• The first nested query selects all dependent tuples related to an employee.
• The second nested query selects all department tuples managed by the employee.
• If at least one tuple of the first kind and at least one of the second exist for the same ssn, the employee tuple is selected and its fname and lname are retrieved.
• This is an implementation of the intersection operation.
Query with Division (use contains)
Q3. Retrieve the name of each employee who works on all the projects controlled by department 5.
select fname, lname
from employee
where ((select pno
from works_on
where ssn = essn)
contains
(select pnumber
from project
where dnum = 5));
• The second nested query, which is not correlated with the outer query, retrieves the project numbers of all projects controlled by department 5.
• For each employee tuple, the first nested query, which is correlated, retrieves the project numbers on which the employee works; if these contain all projects controlled by department 5, the employee tuple is selected and the name of that employee is retrieved.
Query with Division
Q3: Retrieve the name of each employee who works on all the projects controlled by department 5.
select fname, lname
from employee e
where not exists
((select pnumber from project where dnum = 5)
minus
(select pno from works_on w
where e.ssn = w.essn));
Query with Division
Q3: Retrieve the name of each employee who works on all the projects controlled by department 5.
select fname, lname
from employee
where not exists (select *
from works_on b
where (b.pno in (select pnumber from project where dnum = 5)) and
not exists (select *
from works_on c
where c.essn = ssn and
c.pno = b.pno));
Discussion
• The outer nested query selects any works_on (b) tuple whose pno is that of a project controlled by department 5 and for which there is no works_on (c) tuple with the same pno and the same ssn as the employee tuple under consideration in the outer query.
If no such tuple exists, we select the employee tuple and retrieve the fname and lname of that employee tuple.
The equivalent interpretation of the query is as follows:
There does not exist a project controlled by department 5 that the employee does not work on.
Equivalently,
select each employee who works on all the projects controlled by department 5.
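Both division formulations can be compared on a small instance. The sketch below assumes Python's sqlite3 module and invented rows. The inner condition c.pno = b.pno follows the description above ("the same pno and the same ssn"), and SQLite's except operator stands in for Oracle's minus; SQLite also does not accept parentheses around the individual operands of except, so they are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (fname text, lname text, ssn text primary key);
    create table project (pnumber integer primary key, dnum integer);
    create table works_on (essn text, pno integer);
    insert into employee values
        ('John', 'Smith', '123456789'),
        ('Joyce', 'English', '453453453');
    insert into project values (1, 5), (2, 5), (10, 4);
    insert into works_on values
        ('123456789', 1), ('123456789', 2),   -- Smith: every department 5 project
        ('453453453', 1), ('453453453', 10);  -- English: misses project 2
""")

# Division with two nested not exists, as in the slides.
double_not_exists = """
    select fname, lname
    from employee e
    where not exists (select *
                      from works_on b
                      where b.pno in (select pnumber from project where dnum = 5)
                        and not exists (select *
                                        from works_on c
                                        where c.essn = e.ssn and c.pno = b.pno))
"""

# Division with set difference (except is the SQLite counterpart of minus).
set_difference = """
    select fname, lname
    from employee e
    where not exists (select pnumber from project where dnum = 5
                      except
                      select pno from works_on w where w.essn = e.ssn)
"""
by_not_exists = conn.execute(double_not_exists).fetchall()
by_difference = conn.execute(set_difference).fetchall()
print(by_not_exists, by_difference)
```

Note that the double not exists form looks for department 5 projects through works_on (b), so a department 5 project on which nobody works would be ignored by it but not by the set-difference form; on this toy data every department 5 project has at least one worker, and the two queries agree.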
Renaming Attributes and Join Tables
Q8a: Retrieve the last name of each employee and his or her supervisor, while renaming the resulting attributes as employee_name and supervisor_name.
select e.lname as employee_name, s.lname as supervisor_name
from employee as e, employee as s
where e.superssn = s.ssn;
Q1a: Retrieve the names and addresses of the employees who work for the 'Research' department.
select fname, lname, address
from (employee join department on dno = dnumber)
where dname = 'Research';
Natural Join, Outer Join, and Nested Join
Q1b: select fname, lname, address
from (employee natural join
(department as dept(dname, dno, mssn, msdate)))
where dname = 'Research';
Q8b: Retrieve the last name of every employee, together with the last name of his or her supervisor if the employee has one.
select e.lname as employee_name, s.lname as supervisor_name
from (employee e left outer join employee s
on e.superssn = s.ssn);
Q2A: select pnumber, dnum, lname, address, bdate
from ((project join department on dnum = dnumber)
join employee on mgrssn = ssn)
where plocation = 'Stafford';
Outer Join in ORACLE
Q8b: Retrieve the last name of every employee, together with the last name of his or her supervisor if the employee has one.
select e.lname as employee_name, s.lname as supervisor_name
from employee e, employee s
where e.superssn = s.ssn (+);
• This is equivalent to the employee table in the role of employee (e) left outer joined with the employee table in the role of supervisor (s).
Q8c: Retrieve the last name of every employee, together with the last names of his or her supervisees if the employee has any.
select s.lname as employee_name, e.lname as supervisee_name
from employee s, employee e
where s.ssn = e.superssn (+);
• This is equivalent to the employee table in the role of supervisor (s) left outer joined with the employee table in the role of supervisee (e).
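The effect of the left outer join in Q8b can be seen on three invented rows. The sketch assumes Python's sqlite3 module; SQLite does not accept Oracle's (+) notation, so the ANSI left outer join form of Q8b is used, with an order by clause added for a predictable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (lname text, ssn text primary key, superssn text);
    insert into employee values
        ('Borg', '888665555', null),
        ('Wong', '333445555', '888665555'),
        ('Smith', '123456789', '333445555');
""")

# Every row of e is kept; s columns are NULL when no supervisor matches.
q8b = """
    select e.lname as employee_name, s.lname as supervisor_name
    from employee e left outer join employee s on e.superssn = s.ssn
    order by e.lname
"""
rows = conn.execute(q8b).fetchall()
print(rows)
```

Borg has no supervisor, so an inner join would drop him; the outer join keeps his row and pads supervisor_name with NULL, which Python reports as None.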
Aggregation Functions
Aggregate Functions:
• An aggregate function takes an entire column as its argument and computes a single value based on the contents of the column.
• The function result is an "aggregate" of the individual data values in the rows of the column.
Q15': Find the total number of employees in the company, the sum of the salaries of all employees, the maximum, the minimum, and the average salary.
select count(*), sum(salary), max(salary), min(salary), avg(salary)
from employee;
• count(*) counts the total number of tuples in the employee table.
• The sum(), max(), min(), and avg() functions are applied to the salary column values of the tuples in the employee table.
Q16': Find the total number of employees of the 'Research' department, as well as the sum of the salaries, the maximum salary, the minimum salary, and the average salary in this department.
select count(*), sum(salary), max(salary), min(salary), avg(salary)
from employee, department
where dno = dnumber and dname = 'Research';
• All the aggregate functions, count(), sum(), max(), min(), and avg(), are applied to the employee tuples from the 'Research' department.
• The conditions dno = dnumber and dname = 'Research' in the where clause are evaluated before the aggregate functions are evaluated.
Q19: Count the number of distinct salary values in the database.
select count(distinct salary)
from employee;
Q5: Retrieve the names of all employees who have two or more dependents.
Incorrect one:
select lname, fname
from employee
where (select count(*)
from dependent
where ssn = essn) >= 2;
• When a subquery is one of the two operands involved in a comparison, the subquery must be written as the second operand.
Correct one:
select lname, fname
from employee
where 2 <= (select count(*)
from dependent
where ssn = essn);
Group By Clause
• In many cases, we want to apply aggregate functions to subgroups of tuples in a relation based on some attribute values.
Example:
Find the average salary of employees in each department.
Find the number of employees who work on each project.
• In these cases, we want to group the tuples that have the same value of some attribute(s), called the grouping attribute(s), and apply the function to each such group independently.
• SQL has a group by clause for this purpose.
• The group by clause specifies the grouping attributes, which must also appear in the select clause, so that the value resulting from applying each function to a group of tuples appears along with the value of the grouping attribute(s).
Group by Clause
Q20: For each department, retrieve the department number, the number of employees in the department, and their average salary.
select dno, count(*), avg(salary)
from employee
group by dno;
Q21: For each project, retrieve the project number, the project name, and number of employees who work on that project.
select pnumber, pname, count(*)
from project, works_on
where pnumber = pno
group by pnumber, pname;
• The grouping and aggregate functions are applied after the joining of the two relations.
Having Clause
Q22. For each project on which more than two employees work, retrieve the project number, project name, and number of employees who work on that project.
select pnumber, pname, count(*)
from project, works_on
where pnumber = pno
group by pnumber, pname
having count(*) > 2;
• SQL provides a having clause, which can appear only in conjunction with a group by clause.
• having provides a condition on the group of tuples associated with each value of the grouping attributes, and only the groups that satisfy the condition are retrieved in the result of the query.
• The selection condition in the where clause limits the tuples to which the group functions are applied.
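The difference between grouping alone (Q21) and grouping with a having condition (Q22) can be sketched on a toy instance. The example assumes Python's sqlite3 module; the project names and works_on rows are invented, and an order by clause is appended for a predictable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table project (pnumber integer primary key, pname text);
    create table works_on (essn text, pno integer);
    insert into project values (1, 'ProductX'), (2, 'ProductY'), (3, 'ProductZ');
    insert into works_on values
        ('123456789', 1), ('453453453', 1), ('333445555', 1),
        ('123456789', 2), ('453453453', 2),
        ('666884444', 3);
""")

# Shared part of Q21 and Q22: join first, then form one group per project.
base = """
    select pnumber, pname, count(*)
    from project, works_on
    where pnumber = pno
    group by pnumber, pname
"""
q21 = conn.execute(base + " order by pnumber").fetchall()
q22 = conn.execute(base + " having count(*) > 2 order by pnumber").fetchall()
print(q21)
print(q22)
```

Q21 returns one row per project with its head count; Q22 forms the same groups and then keeps only those whose count exceeds two.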
Q23. For each project, retrieve the project number, project name, and number of employees from department 5 who work on that project.
select pnumber, pname, count(*)
from project, works_on, employee
where pnumber = pno and ssn = essn and dno = 5
group by pnumber, pname;
Q5. Retrieve the names of all employees who have two or more dependents.
select lname, fname
from employee
where ssn in (select essn
from dependent
where ssn = essn
group by essn
having count(essn) >= 2);
Where Condition before Having
Q24. Count the total number of employees with salaries greater than $40,000 who work in each department, but only for those departments with more than five employees.
select dname, count(*)
from department, employee
where dnumber = dno and salary > 40000
group by dname
having count(*) > 5;
• This is not the correct query statement.
• The selection condition (salary > 40000) eliminates the employee tuples whose salary <= 40000 before the group by and having clauses are applied.
• It therefore selects only departments that have more than five employees who each earn more than $40,000.
• The rule is that the where clause is executed first to select individual tuples; the having clause is applied later to select individual groups of tuples.
• The tuples are already restricted to employees earning more than $40,000.
The correct one:
select dname, count(*)
from department, employee
where dnumber = dno and salary > 40000 and
dno in (select dno
from employee
group by dno
having count(*) > 5)
group by dname;
• The conditions dnumber = dno and salary > 40000 in the where clause join the department tuples with the employee tuples whose salary is greater than 40000.
• The subquery, which includes the group by and having clauses, selects the departments in which more than five employees work.
Having Clause
• The HAVING clause is designed for use in conjunction with GROUP BY when it is desired to restrict the groups that appear in the final result.
• HAVING conditions often involve aggregate functions, permitting the filtering of groups based on summary calculations.
• Aggregate functions may not be used within a WHERE clause.
• The WHERE clause filters individual rows going into the final result or an intermediate result.
• HAVING filters groups going into the final result.
• WHERE and HAVING may be used together cooperatively:
WHERE is applied first to filter single rows, then groups are formed from the rows that remain, and finally the HAVING clause is applied to filter the groups.
Summary of GROUP BY/HAVING Clauses
1. Attribute names or column names not listed in the GROUP BY clause may not appear in the HAVING condition in ANSI-1989 and ANSI-1992 SQL.
2. Aggregation functions may always be used in the HAVING clause, even if they do not appear in the SELECT attribute list.
3. The HAVING condition can involve compound conditions formed by
combining simple logical expressions with the logical operators AND, OR, and NOT.
4. HAVING and WHERE can work together.
• The HAVING condition is always applied to the groups formed by the GROUP BY clause.
• The WHERE condition is always applied to attributes involved in selection or join.
5. Non-aggregate expressions may be used in the HAVING clause, provided the expressions involve only columns that are named in the GROUP BY clause.
Syntax Structure of SELECT Statements
SELECT <attribute list>
FROM <table list>
[WHERE <condition>]
[GROUP BY <grouping attribute(s)>]
[HAVING <grouping condition>]
[ORDER BY <attribute list>]
• The SELECT clause lists the attributes or functions to be retrieved.
• The FROM clause specifies all relations needed in the query, but not those needed only in nested queries.
• The WHERE clause specifies the conditions for selection of tuples from these relations.
• GROUP BY specifies the grouping attribute(s), whereas the HAVING clause specifies a condition on the groups being selected rather than on the individual tuples.
• The built-in aggregate functions COUNT, SUM, MIN, MAX, and AVG are used in conjunction with grouping.
• ORDER BY specifies an order for displaying the result of the query.
Sequence
1. FROM: The FROM clause is processed first. It specifies the table(s) or views which serve as the source of all data for the final result. If multiple tables are involved, the join operation is necessary.
2. WHERE: The WHERE clause is processed second. It eliminates those rows produced by the FROM clause which do not satisfy the search condition.
3. GROUP BY: The GROUP BY clause groups the remaining rows on the basis of shared values in the GROUP BY column(s). The partial result now has the form of a set of groups.
4. HAVING: The HAVING clause is now applied to eliminate those groups which do not satisfy the HAVING condition.
5. SELECT: The SELECT list is used to remove unwanted columns or attributes from the partial result. Only elements which appear in the SELECT list remain.
6. ORDER BY: The final result is sorted according to the ORDER BY list.
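The six steps can be followed by hand. The sketch below evaluates one query step by step in plain Python, in the conceptual order listed above, and compares the outcome with what SQLite returns for the same query. The sample rows and the query are made up for illustration, and a real DBMS is free to use a different physical execution order as long as the result is the same.

```python
import sqlite3
from itertools import groupby

# Hypothetical sample rows: (lname, dno, salary).
EMPLOYEE = [
    ("Smith", 5, 30000), ("Wong", 5, 40000), ("Narayan", 5, 38000),
    ("Wallace", 4, 43000), ("Jabbar", 4, 25000), ("Borg", 1, 55000),
]

QUERY = """
    select dno, count(*), sum(salary)
    from employee
    where salary > 28000
    group by dno
    having sum(salary) > 50000
    order by dno desc
"""

def evaluate_by_hand(rows):
    """Evaluate QUERY in the conceptual clause order, one step per clause."""
    source = list(rows)                                    # 1. FROM
    kept = [r for r in source if r[2] > 28000]             # 2. WHERE
    kept.sort(key=lambda r: r[1])                          # groupby needs sorted input
    groups = [list(g) for _, g in groupby(kept, key=lambda r: r[1])]     # 3. GROUP BY
    groups = [g for g in groups if sum(r[2] for r in g) > 50000]         # 4. HAVING
    result = [(g[0][1], len(g), sum(r[2] for r in g)) for g in groups]   # 5. SELECT
    result.sort(key=lambda t: t[0], reverse=True)          # 6. ORDER BY
    return result

def evaluate_with_sqlite(rows):
    """Run the same query in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table employee (lname varchar(15), dno integer, salary integer)")
    conn.executemany("insert into employee values (?, ?, ?)", rows)
    return conn.execute(QUERY).fetchall()

by_hand = evaluate_by_hand(EMPLOYEE)
by_sqlite = evaluate_with_sqlite(EMPLOYEE)
print(by_hand == by_sqlite, by_hand)
```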
Insert Statement in SQL
Insert Statement:
Insert a new tuple into employee table:
insert into employee
values ('Richard', 'K', 'Marini', '653298653', '30-DEC-52', '98 Oak Forest, Katy, TX', 'M', 37000, '987654321', 4);
Insert a tuple with values for only some attributes:
insert into employee (fname, lname, ssn)
values ('Richard', 'Marini', '653298653');
• Attributes that are not specified in the insert statement are set to their DEFAULT value if a DEFAULT is defined, and to NULL otherwise.
• The insert operation will be rejected if NOT NULL has been specified for an omitted attribute that has no DEFAULT value.
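A minimal sketch of how omitted attributes behave, assuming SQLite as the engine. The table definition is a cut-down, hypothetical version of employee: the DEFAULT on dno and the NOT NULL choices are assumptions made for the example, and the second insert uses made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced employee schema for illustration only.
conn.execute("""
    create table employee (
        fname varchar(15) not null,
        lname varchar(15) not null,
        ssn   char(9)     not null,
        sex   char(1),
        dno   integer default 1
    )
""")

# sex and dno are omitted: sex has no DEFAULT and becomes NULL,
# dno takes its DEFAULT value 1.
conn.execute(
    "insert into employee (fname, lname, ssn) "
    "values ('Richard', 'Marini', '653298653')"
)
row = conn.execute("select sex, dno from employee").fetchone()

# ssn is omitted, but it is NOT NULL and has no DEFAULT: the insert is rejected.
try:
    conn.execute("insert into employee (fname, lname) values ('Robert', 'Hatcher')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(row, rejected)
```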
Insert a set of tuples into a table:
• Create a relation and load it with the result of a query.
create table depts_info (deptname varchar(15), noofemps integer, totalsal integer);
insert into depts_info (deptname, noofemps, totalsal)
select dname, count(*), sum(salary)
from department, employee
where dnumber = dno
group by dname;
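The statement pair above can be run as a small sketch, assuming SQLite and a few made-up department and employee rows. It also illustrates one consequence of loading a table this way: depts_info is an ordinary base table, so it keeps the values computed at load time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the column names follow the notes.
conn.executescript("""
    create table department (dname varchar(15), dnumber integer);
    create table employee (lname varchar(15), salary integer, dno integer);
    insert into department values ('Research', 5);
    insert into department values ('Administration', 4);
    insert into employee values ('Smith',   30000, 5);
    insert into employee values ('Wong',    40000, 5);
    insert into employee values ('Wallace', 43000, 4);

    create table depts_info (deptname varchar(15), noofemps integer, totalsal integer);
    insert into depts_info (deptname, noofemps, totalsal)
        select dname, count(*), sum(salary)
        from department, employee
        where dnumber = dno
        group by dname;
""")

loaded = conn.execute("select * from depts_info order by deptname").fetchall()

# A later change to employee does not reach the already loaded table.
conn.execute("insert into employee values ('Narayan', 38000, 5)")
after_change = conn.execute("select * from depts_info order by deptname").fetchall()

print(loaded)
print(after_change == loaded)
```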
Delete Statement in SQL
Delete a tuple:
to delete the employee tuple with lname 'Brown'
delete from employee
where lname = 'Brown';
Delete a set of tuples:
to delete the employee tuples from 'Research' department
delete from employee
where dno in (select dnumber from department
where dname = 'Research');
To delete all the tuples in employee table:
delete from employee;
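The three forms of delete can be compared in a short sketch, assuming SQLite and made-up rows. It also shows that deleting all tuples leaves the (empty) table itself in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data.
conn.executescript("""
    create table department (dname varchar(15), dnumber integer);
    create table employee (lname varchar(15), dno integer);
    insert into department values ('Research', 5);
    insert into department values ('Administration', 4);
    insert into employee values ('Smith',   5);
    insert into employee values ('Brown',   4);
    insert into employee values ('Wong',    5);
    insert into employee values ('Wallace', 4);
""")

def remaining():
    """Return the last names still stored in employee, sorted."""
    return [r[0] for r in conn.execute("select lname from employee order by lname")]

# Delete a tuple selected by a simple condition.
conn.execute("delete from employee where lname = 'Brown'")
after_single = remaining()

# Delete a set of tuples selected through a nested query.
conn.execute("""
    delete from employee
    where dno in (select dnumber from department where dname = 'Research')
""")
after_set = remaining()

# Delete all tuples: the table becomes empty but still exists.
conn.execute("delete from employee")
after_all = remaining()
table_still_exists = conn.execute(
    "select count(*) from sqlite_master where type = 'table' and name = 'employee'"
).fetchone()[0] == 1

print(after_single, after_set, after_all, table_still_exists)
```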
Update Statement in SQL
Update a single tuple:
to change the location and controlling department number of project number 10 to 'Bellaire' and 5.
update project
set plocation = 'Bellaire', dnum = 5
where pnumber = 10;
Update a set of tuples in a table:
to raise the salary of employees from 'Research' department by 10%.
update employee
set salary = salary * 1.1
where dno in (select dnumber from department
where dname = 'Research');
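Both updates can be checked in a small sketch, assuming SQLite and made-up rows (the project name and the original location are hypothetical). Because salary * 1.1 is floating-point arithmetic, the sketch declares salary as a real column and compares the result with a tolerance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data.
conn.executescript("""
    create table department (dname varchar(15), dnumber integer);
    create table employee (lname varchar(15), salary real, dno integer);
    create table project (pname varchar(15), pnumber integer,
                          plocation varchar(15), dnum integer);
    insert into department values ('Research', 5);
    insert into department values ('Administration', 4);
    insert into employee values ('Smith',   30000, 5);
    insert into employee values ('Wong',    40000, 5);
    insert into employee values ('Wallace', 43000, 4);
    insert into project values ('Computerization', 10, 'Stafford', 4);
""")

# Update a single tuple: two attributes are changed in one statement.
conn.execute("update project set plocation = 'Bellaire', dnum = 5 where pnumber = 10")
project_row = conn.execute(
    "select plocation, dnum from project where pnumber = 10"
).fetchone()

# Update a set of tuples: the nested query selects the department number.
conn.execute("""
    update employee
    set salary = salary * 1.1
    where dno in (select dnumber from department where dname = 'Research')
""")
salaries = dict(conn.execute("select lname, salary from employee"))

print(project_row)
print(salaries)
```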
Views in SQL
View:
• It is a single table that is derived from other tables; these other tables can be base tables or previously defined views.
• A view does not necessarily exist in physical form; it is considered a virtual table, in contrast to base tables, whose tuples are actually stored in the database.
Advantages and Disadvantages of View:
• The advantage is that a frequently issued query involving join operations can be defined once as a view. Users then query the view instead of specifying the join operations every time.
• The disadvantage is that the possible update operations applied to views are limited.
Specification of Views in SQL
Create a view on fname, lname, pname, hours
V1: create view works_on1
as select fname, lname, pname, hours
from employee, project, works_on
where ssn = essn and pno = pnumber;
works_on1: fname lname pname hours
V2: create view dept_info (dept_name, no_of_emps, total_sal)
as select dname, count(*), sum(salary)
from department, employee
where dnumber = dno
group by dname;
dept_info: dept_name no_of_emps total_sal
Querying on View
QV1: To retrieve the last name and first name of all employees who work on 'ProductX':
select pname, fname, lname
from works_on1
where pname = 'ProductX';
• A view is always up to date: if we modify the tuples in the base tables on which the view is defined, the view automatically reflects these changes.
• The view is not realized at the time of view definition but rather at the time we specify a query on the view.
• It is the responsibility of the DBMS and not the user to make sure that the view is up to date.
• If the view is no longer useful, it can be disposed of with the drop view command.
V1d: drop view works_on1;
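The points above can be observed in a short sketch, assuming SQLite and made-up base-table rows. The view works_on1 is defined once, queried with QV1, queried again after a base table changes, and finally dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for the three base tables.
conn.executescript("""
    create table employee (fname varchar(15), lname varchar(15), ssn char(9));
    create table project (pname varchar(15), pnumber integer);
    create table works_on (essn char(9), pno integer, hours real);
    insert into employee values ('John',  'Smith',   '123456789');
    insert into employee values ('Joyce', 'English', '453453453');
    insert into project values ('ProductX', 1);
    insert into project values ('ProductY', 2);
    insert into works_on values ('123456789', 1, 32.5);

    create view works_on1
    as select fname, lname, pname, hours
       from employee, project, works_on
       where ssn = essn and pno = pnumber;
""")

QV1 = """
    select pname, fname, lname
    from works_on1
    where pname = 'ProductX'
    order by lname
"""
before = conn.execute(QV1).fetchall()

# Only a base table is modified; the view is evaluated again when queried.
conn.execute("insert into works_on values ('453453453', 1, 20.0)")
after = conn.execute(QV1).fetchall()

# After drop view the name works_on1 is no longer known to the DBMS.
conn.execute("drop view works_on1")
try:
    conn.execute(QV1)
    view_gone = False
except sqlite3.OperationalError:
    view_gone = True

print(before, after, view_gone)
```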
Updating in Views
Single Table View Update:
An update on a view defined on a single table can be mapped to an update on the underlying base table.
Multi Table View Update:
For a view involving joins, an update operation may be mapped to update operations on the underlying base relations in multiple ways.
Suppose a view update changes the PNAME attribute of 'John Smith' from 'ProductX' to 'ProductY'.
UV1: update works_on1
set pname = 'ProductY'
where lname = 'Smith' and fname = 'John' and pname = 'ProductX';
This update can be mapped into several updates on the base relations to give the desired update on the view.
• There are two possible updates, (a) and (b), on the base relations corresponding to UV1.
(a). update works_on
set pno = (select pnumber from project
where pname = 'ProductY')
where essn = (select ssn
from employee
where lname = 'Smith' and fname = 'John')
and pno = (select pnumber
from project
where pname = 'ProductX');
(b). update project
set pname = 'ProductY'
where pname = 'ProductX';
Discussion
• Update (a) relates 'John Smith' to the 'ProductY' project tuple in place of the 'ProductX' tuple, and is the most likely desired update.
• The original update changes the project name pname in the works_on1 view. It is unlikely that the user wants to change PNAME itself; the intended semantics is to change the project that 'John Smith' works on.
• So update (a) replaces the project number in the works_on tuple of 'John Smith' with the number of the project whose PNAME = 'ProductY'.
• Update (b) would also give the desired update effect on the view, but it accomplishes this by changing the name of the 'ProductX' tuple in the project relation to 'ProductY'.
It is quite unlikely that the user who specified the view update UV1 wants the update to be interpreted as in update (b).
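The difference between the two mappings can be made visible with a sketch, assuming SQLite and made-up rows in which a second employee also works on 'ProductX'. Update (b) is written here as the discussion describes it (renaming the 'ProductX' project tuple). Each update is applied to a fresh copy of the data, and the view and the project table are inspected afterwards.

```python
import sqlite3

# Hypothetical sample data plus the works_on1 view from the notes.
SCHEMA_AND_DATA = """
    create table employee (fname varchar(15), lname varchar(15), ssn char(9));
    create table project (pname varchar(15), pnumber integer);
    create table works_on (essn char(9), pno integer, hours real);
    insert into employee values ('John',  'Smith',   '123456789');
    insert into employee values ('Joyce', 'English', '453453453');
    insert into project values ('ProductX', 1);
    insert into project values ('ProductY', 2);
    insert into works_on values ('123456789', 1, 32.5);
    insert into works_on values ('453453453', 1, 20.0);
    create view works_on1
    as select fname, lname, pname, hours
       from employee, project, works_on
       where ssn = essn and pno = pnumber;
"""

UPDATE_A = """
    update works_on
    set pno = (select pnumber from project where pname = 'ProductY')
    where essn = (select ssn from employee
                  where lname = 'Smith' and fname = 'John')
      and pno = (select pnumber from project where pname = 'ProductX')
"""
UPDATE_B = "update project set pname = 'ProductY' where pname = 'ProductX'"

def apply_update(update_sql):
    """Apply one candidate base-table update to a fresh database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    conn.execute(update_sql)
    view = conn.execute(
        "select fname, lname, pname from works_on1 order by lname"
    ).fetchall()
    projects = conn.execute(
        "select pname, pnumber from project order by pnumber"
    ).fetchall()
    return view, projects

view_a, projects_a = apply_update(UPDATE_A)
view_b, projects_b = apply_update(UPDATE_B)
print(view_a, projects_a)
print(view_b, projects_b)
```

Both mappings make John Smith appear under 'ProductY' in the view, but with (b) Joyce English is moved as well and the project relation ends up with two tuples named 'ProductY'.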
Observation
• A view with a single defining table is updatable if the view attributes contain the primary key or some other candidate key of the base relation, because this maps each (virtual) view tuple to a single base tuple.
• Views defined on multiple tables using joins are generally not updatable.
• Views defined using grouping and aggregate functions are not updatable.
Example:
UV2: update dept_info
set total_sal = 100000
where dept_name = 'Research';
• A view update is feasible when only one possible update on the base relations can accomplish the desired update effect on the view.
• Whenever an update on the view can be mapped to more than one update on the underlying base relations, we must have a certain procedure to choose the desired update.
• Some researchers have developed methods for choosing the most likely update.
• Other researchers prefer to have the user choose the desired update mapping during view definition.
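One concrete way to let the designer fix the mapping is sketched below. It relies on a SQLite-specific mechanism, the INSTEAD OF trigger, which is not part of these notes: SQLite rejects updates on views by default, and the trigger tells it to carry out mapping (a) whenever pname is updated through works_on1. The trigger name and the sample rows are hypothetical, and matching employees by fname and lname is a simplification since names are not a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data plus the works_on1 view from the notes.
conn.executescript("""
    create table employee (fname varchar(15), lname varchar(15), ssn char(9));
    create table project (pname varchar(15), pnumber integer);
    create table works_on (essn char(9), pno integer, hours real);
    insert into employee values ('John', 'Smith', '123456789');
    insert into project values ('ProductX', 1);
    insert into project values ('ProductY', 2);
    insert into works_on values ('123456789', 1, 32.5);
    create view works_on1
    as select fname, lname, pname, hours
       from employee, project, works_on
       where ssn = essn and pno = pnumber;
""")

UV1 = """
    update works_on1
    set pname = 'ProductY'
    where lname = 'Smith' and fname = 'John' and pname = 'ProductX'
"""

# Without a chosen mapping, SQLite refuses to update the view.
try:
    conn.execute(UV1)
    rejected_without_mapping = False
except sqlite3.OperationalError:
    rejected_without_mapping = True

# The designer chooses mapping (a): change pno in works_on and leave
# the project relation untouched.
conn.executescript("""
    create trigger works_on1_set_pname
    instead of update of pname on works_on1
    begin
        update works_on
        set pno = (select pnumber from project where pname = new.pname)
        where essn = (select ssn from employee
                      where fname = old.fname and lname = old.lname)
          and pno = (select pnumber from project where pname = old.pname);
    end;
""")
conn.execute(UV1)

view_rows = conn.execute("select fname, lname, pname from works_on1").fetchall()
projects = conn.execute("select pname, pnumber from project order by pnumber").fetchall()
print(rejected_without_mapping, view_rows, projects)
```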
Specifying Additional Constraints as Assertions
• To specify the constraint "The salary of an employee must not be greater than the salary of the manager of the department that the employee works for":
create assertion salary_constraint
check ( not exists ( select *
from employee e, employee m, department d
where e.salary > m.salary
and e.dno = d.dnumber
and d.mgrssn = m.ssn ) );
• If tuples in the database cause the condition of an assertion to evaluate to FALSE, the constraint is violated.
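The condition inside the assertion is an ordinary query, so it can be evaluated directly. The sketch below assumes SQLite, which does not accept create assertion, and therefore only checks the condition on demand with a hypothetical helper function; it does not enforce the constraint automatically. The sample rows are made up.

```python
import sqlite3

# The check condition of salary_constraint, wrapped in a select so that
# it returns 1 (TRUE) or 0 (FALSE).
CHECK = """
    select not exists (
        select *
        from employee e, employee m, department d
        where e.salary > m.salary
          and e.dno = d.dnumber
          and d.mgrssn = m.ssn
    )
"""

def salary_constraint_holds(conn):
    """Hypothetical helper: evaluate the assertion condition once."""
    return bool(conn.execute(CHECK).fetchone()[0])

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: Wong manages department 5.
conn.executescript("""
    create table employee (lname varchar(15), ssn char(9), salary integer, dno integer);
    create table department (dname varchar(15), dnumber integer, mgrssn char(9));
    insert into department values ('Research', 5, '333445555');
    insert into employee values ('Wong',  '333445555', 40000, 5);
    insert into employee values ('Smith', '123456789', 30000, 5);
""")

holds_before = salary_constraint_holds(conn)

# Smith now earns more than his manager, so the inner query finds a tuple
# and the condition evaluates to FALSE.
conn.execute("update employee set salary = 45000 where lname = 'Smith'")
holds_after = salary_constraint_holds(conn)

print(holds_before, holds_after)
```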
Specifying Index in SQL
Specifying index on single attribute:
I1: create index lname_index
on employee (lname);
Specifying index on multiple attributes:
I2: create index names_index
on employee (lname asc, fname desc, minit);
Specifying index on the attribute with unique value:
I3: create unique index ssn_index
on employee (ssn);
Specifying cluster index:
I4: create index dno_index
on employee (dno) cluster;
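Statements I1 to I3 can be tried in a short sketch, assuming SQLite and a reduced, hypothetical employee table. I4 is left out because the cluster option is not accepted by SQLite. The sketch checks that the unique index rejects a duplicate ssn and asks the query planner whether it uses ssn_index for a lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced employee schema and sample row.
conn.executescript("""
    create table employee (fname varchar(15), minit char(1), lname varchar(15),
                           ssn char(9), dno integer);
    create index lname_index on employee (lname);
    create index names_index on employee (lname asc, fname desc, minit);
    create unique index ssn_index on employee (ssn);
    insert into employee values ('John', 'B', 'Smith', '123456789', 5);
""")

# The unique index enforces the key constraint on ssn.
try:
    conn.execute("insert into employee values ('Jane', 'Q', 'Doe', '123456789', 4)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask SQLite how it would evaluate a lookup by ssn; the last column of
# each plan row is a textual description of the step.
plan = conn.execute(
    "explain query plan select * from employee where ssn = '123456789'"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

print(duplicate_rejected)
print(plan_text)
```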
Cluster in ORACLE
create cluster deptandemp (deptemp varchar(9));
create table department
( dname varchar(19),
dnumber varchar(9),
...
)
cluster deptandemp (dnumber);
create table employee
( name varchar(19),
...
dno varchar(9)
)
cluster deptandemp (dno);
Discussion on Index
• The reason and motivation for an index is to support efficient search and maintenance.
Advantages:
Indices support binary search.
Indices support dynamic maintenance.
Disadvantages:
Indices cost extra storage space.
Algorithms to support indices are more complex.
• The keyword UNIQUE can be used to enforce the key constraint.
The reason for linking the definition of a key constraint with the specification of an index is that it is much more efficient to enforce uniqueness of key values on a file if an index is defined on the key attribute, since searching the index is much more efficient than scanning the file.
• A clustering and unique index is similar to a primary index.
• A clustering and non-unique index is similar to a cluster index.
• A nonclustering index is similar to a secondary index.
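The advantages and the cost listed above can be seen in a toy model. The sketch below is not how a DBMS stores an index (real systems use structures such as B+-trees); it assumes a sorted Python list of (key, row position) pairs as a nonclustering index on lname, with made-up rows and hypothetical function names.

```python
import bisect

# The "data file": rows kept in arrival order (hypothetical sample data).
employees = [
    ("Smith", "123456789"),
    ("Wong", "333445555"),
    ("English", "453453453"),
    ("Borg", "888665555"),
]

# The index: a separate sorted list of (lname, row position) pairs.
# It costs extra space, one entry per row.
lname_index = sorted((row[0], pos) for pos, row in enumerate(employees))

def lookup(index, key):
    """Binary-search the index; return positions of all rows with this key."""
    i = bisect.bisect_left(index, (key, -1))  # first entry whose key >= key
    positions = []
    while i < len(index) and index[i][0] == key:
        positions.append(index[i][1])
        i += 1
    return positions

def insert_row(rows, index, row):
    """Dynamic maintenance: append the row and keep the index sorted."""
    rows.append(row)
    bisect.insort(index, (row[0], len(rows) - 1))

found_before = lookup(lname_index, "Wong")
insert_row(employees, lname_index, ("Wong", "999887777"))
found_after = lookup(lname_index, "Wong")
not_found = lookup(lname_index, "Nobody")

print(found_before, found_after, not_found)
```

The data rows are never reordered, which is why this corresponds to a secondary (nonclustering) index rather than a primary or cluster index.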