• No results found

SQL - Join CS 4750 Database Systems

N/A
N/A
Protected

Academic year: 2021

Share "SQL - Join CS 4750 Database Systems"

Copied!
25
0
0

Loading.... (view fulltext now)

Full text

(1)

SQL - Join

CS 4750

Database Systems

(2)

Recap 1: Aggregation

ID name dept_name salary

10101 Srinivasan Comp. Sci. 65000.00

12121 Wu Finance 90000.00

15151 Mozart Music 40000.00

22222 Einstein Physics 95000.00

32343 El Said History 60000.00

33456 Gold Physics 87000.00

45565 Kate Comp. Sci. 75000.00

58583 Katz History 62000.00

76543 Singh Finance 80000.00

76766 Brandt Biology 72000.00

83821 Kate Comp. Sci. 92000.00

98345 Kim Elec. Eng. 80000.00

Instructor (refer to alldbs.sql)

Find average salary for each department

SELECT dept_name, AVG(salary) FROM instructor

GROUP BY dept_name;

(3)

Recap 2: Aggregation

ID name dept_name salary

10101 Srinivasan Comp. Sci. 65000.00

12121 Wu Finance 90000.00

15151 Mozart Music 40000.00

22222 Einstein Physics 95000.00

32343 El Said History 60000.00

33456 Gold Physics 87000.00

45565 Kate Comp. Sci. 75000.00

58583 Katz History 62000.00

76543 Singh Finance 80000.00

76766 Brandt Biology 72000.00

83821 Kate Comp. Sci. 92000.00

98345 Kim Elec. Eng. 80000.00

Find average salary for each department that is greater than 70000

SELECT dept_name, AVG(salary) FROM instructor

GROUP BY dept_name

Instructor (refer to alldbs.sql)

(4)

Recap: Primary Keys, Foreign Keys

Hiring (TA_id, Name, Year, Two_week_hours)

TA_id Name Year Two_week_hours

1234 Minnie 4 20

2345 Humpty 3 24

3456 Dumpty 4 30

3333 Minnie 3 12

5678 Mickey 2 16

Key = one or more attributes that uniquely

identify a row TA_Assignment (TA_id, Course)

TA_id Course

1234 Database Systems 2345 Database Systems 3456 Cloud Computing 1234 Software Analysis

5678 Web Programming Lang.

Foreign key = one or more attributes that uniquely identify a row in another

table references

Foreign key describes a relationship between tables

(5)

JOIN

Combine related tables or sets of data on key values

All tables (or relations) must have a unique identifier (primary key)

Any related tables must contain a copy of the unique identifier (foreign key) from the related table

left table

right table FULL JOIN left

table

right table LEFT JOIN

left table

right table RIGHT JOIN

left table

right table (INNER) JOIN left

table

right table NATURAL JOIN (most common)

(6)

NATURAL JOIN

left table

right table NATURAL JOIN

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

Not include the repeated column

one(p1, colA, colB) two(p2, p1, colX, colY)

SELECT *

FROM one NATURAL JOIN two;

for each row1 in one:

for each row2 in two:

if (row1.p1 = row2.p1):

output (row1.p1, row1.colA, row1.colB, row2.p2, row2.colX, row2.colY)

Join on common named

attribute(s).

If no match found, move on.

(7)

p1 colA colB p2 colX colY

1 A aa 9001 55 ABC

2 B bb 9006 42 STU

3 C cc 9002 77 PQR

3 C cc 9003 95 DEF

5 E ee 9005 50 KLM

7 G gg 9004 62 GHI

7 G gg 9007 83 XYZ

NATURAL JOIN

left table

right table NATURAL JOIN

Not include the repeated column

SELECT * FROM one

NATURAL JOIN two

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

(8)

(INNER) JOIN

left table

right table (INNER) JOIN

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

Include the repeated column

one(p1, colA, colB) two(p2, p1, colX, colY)

SELECT *

FROM one JOIN two ON one.p1 = two.p1;

Join on specified attribute(s).

If no match found, move on.

for each row1 in one:

for each row2 in two:

if (row1.p1 = row2.p1):

output (row1.p1, row1.colA, row1.colB,

row2.p2, row2.p1, row2.colX, row2.colY)

(9)

p1 colA colB p2 p1 colX colY

1 A aa 9001 1 55 ABC

2 B bb 9006 2 42 STU

3 C cc 9002 3 77 PQR

3 C cc 9003 3 95 DEF

5 E ee 9005 5 50 KLM

7 G gg 9004 7 62 GHI

7 G gg 9007 7 83 XYZ

(INNER) JOIN

left table

right table (INNER) JOIN

SELECT *

FROM one JOIN two ON one.p1 = two.p1

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

SELECT *

FROM one INNER JOIN two ON one.p1 = two.p1

Include the repeated column

(10)

Cartesian (Cross) Product

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

SELECT *

FROM one JOIN two;

for each row1 in one:

for each row2 in two:

output (row1.p1, row1.colA, row1.colB,

row2.p2, row2.p1, row2.colX, row2.colY) SELECT *

FROM one, two;

(11)

p1 colA colB p2 p1 colX colY

1 A aa 9001 1 55 ABC

1 A aa 9005 5 50 KLM

1 A aa 9004 7 62 GHI

1 A aa 9003 3 95 DEF

1 A aa 9007 7 83 XYZ

1 A aa 9002 3 77 PQR

1 A aa 9006 2 42 STU

2 B bb 9002 3 77 PQR

2 B bb 9006 2 42 STU

2 B bb 9001 1 55 ABC

2 B bb 9005 5 50 KLM

2 B bb 9004 7 62 GHI

2 B bb 9003 3 95 DEF

2 B bb 9007 7 83 XYZ

3 C cc 9003 3 95 DEF

3 C cc 9007 7 83 XYZ

... ...

Cartesian (Cross) Product

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB)

two(p2, p1, colX, colY)

one × two

(12)

NATURAL LEFT OUTER JOIN

Augment a join by the dangling tuples

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

SELECT *

FROM one NATURAL LEFT OUTER JOIN two;

Join on common named

attribute(s).

If no match found, pad with

NULL values

left table

right table LEFT JOIN

Preserve all rows in left table

(13)

p1 colA colB p2 colX colY

1 A aa 9001 55 ABC

2 B bb 9006 42 STU

3 C cc 9002 77 PQR

3 C cc 9003 95 DEF

4 D dd NULL NULL NULL

5 E ee 9005 50 KLM

7 G gg 9004 62 GHI

7 G gg 9007 83 XYZ

8 H hh NULL NULL NULL

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

NATURAL LEFT OUTER JOIN

left table

right table LEFT JOIN

Augment a join by the dangling tuples

(14)

LEFT OUTER JOIN

Augment a join by the dangling tuples

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

SELECT *

FROM one LEFT OUTER JOIN two ON one.p1 = two.p1;

Join on specified attribute(s).

If no match found, pad with

NULL values

left table

right table LEFT JOIN

Preserve all rows in left table

(15)

p1 colA colB p2 p1 colX colY

1 A aa 9001 1 55 ABC

2 B bb 9006 2 42 STU

3 C cc 9002 3 77 PQR

3 C cc 9003 3 95 DEF

4 D dd NULL NULL NULL NULL

5 E ee 9005 5 50 KLM

7 G gg 9004 7 62 GHI

7 G gg 9007 7 83 XYZ

8 H hh NULL NULL NULL NULL

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

LEFT OUTER JOIN

left table

right table LEFT JOIN

Augment a join by the dangling tuples

(16)

NATURAL RIGHT OUTER JOIN

Augment a join by the dangling tuples

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

SELECT *

FROM one NATURAL RIGHT OUTER JOIN two;

Join on common named

attribute(s).

If no match found, pad with

NULL values

left table

right table RIGHT JOIN

Preserve all rows in right table

(17)

p1 colA colB p2 colX colY

1 A aa 9001 55 ABC

2 B bb 9006 42 STU

3 C cc 9002 77 PQR

3 C cc 9003 95 DEF

5 E ee 9005 50 KLM

7 G gg 9004 62 GHI

7 G gg 9007 83 XYZ

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

NATURAL RIGHT OUTER JOIN

Augment a join by the dangling tuples

Since two.p1 is a foreign key referencing one.p1,

matches are always founded

left table

right table RIGHT JOIN

(18)

RIGHT OUTER JOIN

Augment a join by the dangling tuples

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

Join on specified attribute(s).

If no match found, pad with

NULL values

left table

right table RIGHT JOIN

SELECT *

FROM one RIGHT OUTER JOIN two ON one.p1 = two.p1;

Preserve all rows in right table

(19)

p1 colA colB

1 A aa

2 B bb

3 C cc

4 D dd

5 E ee

7 G gg

8 H hh

p2 p1 colX colY

9001 1 55 ABC

9002 3 77 PQR

9003 3 95 DEF

9004 7 62 GHI

9005 5 50 KLM

9006 2 42 STU

9007 7 83 XYZ

one(p1, colA, colB) two(p2, p1, colX, colY)

RIGHT OUTER JOIN

Augment a join by the dangling tuples

Since two.p1 is a foreign key referencing one.p1,

matches are always founded

left table

right table RIGHT JOIN

p1 colA colB p2 p1 colX colY

1 A aa 9001 1 55 ABC

2 B bb 9006 2 42 STU

3 C cc 9002 3 77 PQR

3 C cc 9003 3 95 DEF

5 E ee 9005 5 50 KLM

7 G gg 9004 7 62 GHI

7 G gg 9007 7 83 XYZ

(20)

Self Joins

Join with itself

Useful when the table has a FOREIGN KEY which references its own PRIMARY KEY

Can be viewed as joining two copies of the same table

To do self join, use aliases

(21)

(Challenging) Example: Self Joins

Find names of all TAs who are assigned to Database Systems and Software Analysis

TA_id Name Year Two_week_hours

1234 Minnie 4 20

2345 Humpty 3 24

3456 Dumpty 4 30

3333 Minnie 3 12

5678 Mickey 2 16

TA_id Course

1234 Database Systems 2345 Database Systems 3456 Cloud Computing 1234 Software Analysis

5678 Web Programming Lang.

2345 Software Analysis

Hiring(TA_id, Name, Year, Two_week_hours) TA_Assignment(TA_id, Course)

How should we write SQL?

(22)

(Challenging) Example: Self Joins

Find names of all TAs who are assigned to Database Systems and Software Analysis

TA_id Name Year Two_week_hours

1234 Minnie 4 20

2345 Humpty 3 24

3456 Dumpty 4 30

3333 Minnie 3 12

5678 Mickey 2 16

TA_id Course

1234 Database Systems 2345 Database Systems 3456 Cloud Computing 1234 Software Analysis

5678 Web Programming Lang.

2345 Software Analysis

Hiring(TA_id, Name, Year, Two_week_hours) TA_Assignment(TA_id, Course)

SELECT H.Name

From Hiring AS H, TA_Assignment AS T WHERE H.TA_id = T.TA_id AND

T.Course = "Database Systems" AND T.Course = "Software Analysis";

Will this work?

No.

Returns empty set

(23)

(Challenging) Example: Self Joins

Find names of all TAs who are assigned to Database Systems and Software Analysis

TA_id Name Year Two_week_hours

1234 Minnie 4 20

2345 Humpty 3 24

3456 Dumpty 4 30

3333 Minnie 3 12

5678 Mickey 2 16

TA_id Course

1234 Database Systems 2345 Database Systems 3456 Cloud Computing 1234 Software Analysis

5678 Web Programming Lang.

2345 Software Analysis

Hiring(TA_id, Name, Year, Two_week_hours) TA_Assignment(TA_id, Course)

SELECT H.Name

From Hiring AS H, TA_Assignment AS T1, TA_Assignment AS T2

WHERE H.TA_id = T1.TA_id AND H.TA_id = T2.TA_id AND

T1.Course = "Database Systems" AND T2.Course = "Software Analysis";

Will this work?

Name Minnie Humpty

(24)

(Challenging) Example: Self Joins

Find names of all TAs who are assigned to Database Systems and Software Analysis

TA_id Name Year Two_week_hours

1234 Minnie 4 20

2345 Humpty 3 24

3456 Dumpty 4 30

3333 Minnie 3 12

5678 Mickey 2 16

TA_id Course

1234 Database Systems 2345 Database Systems 3456 Cloud Computing 1234 Software Analysis

5678 Web Programming Lang.

2345 Software Analysis

Hiring(TA_id, Name, Year, Two_week_hours) TA_Assignment(TA_id, Course)

SELECT H.Name

From Hiring AS H, TA_Assignment AS T1, TA_Assignment AS T2

WHERE H.TA_id = T1.TA_id AND H.TA_id = T2.TA_id AND

T1.Course = "Database Systems" AND T2.Course = "Software Analysis";

Let’s consider each part of the query

All pairs of courses a TA is assigned For each pair, check

for courses

(25)

Wrap-Up

Join combines data across tables

Nested-loop

Natural join (most common)

Inner join (filter Cartesian product)

Outer joins (preserve non-matching tuples)

Self join pattern

Different joining techniques can be used to achieve particular goals

What’s next?

SQL – subqueries

left table

right table LEFT JOIN

left table

right table RIGHT JOIN left

table

right table NATURAL JOIN

left table

right table (INNER) JOIN

References

Related documents

left empty, this function will join together all the tables of your dm object that can be reached from the start table, in the direction of the foreign key relations (pointing from

A database design that eliminates redundant data in any single table, places tightly related datasets into individual tables, keeps unrelated datasets separated in their own

These rules show the relationships among the tables in the database. For example, employee table and project table are related in the Company database.

How to remove a database object from the tables from sys objects follow the server drop tables database in all sql server the table statement has happened while using the table

left empty, this function will join together all the tables of your dm object that can be reached from the start table, in the direction of the foreign key relations (pointing from

An automatic join (or “autojoin”) data set contains the table names and column links that are required to join tables automatically in an SQL Query Window session. When tables that

ALTER TABLE Persons DROP CONSTRAINT chk_PersonId. Limit the value range for

 Table: primary keys of the two participating entity sets and any attributes on the relationship itself (may have dups here).  Primary key: the primary keys added from