Join in SQL
SQL Join is used to fetch data from two or more tables, which are joined to appear as a single set of data. It combines columns from two or more tables by using values common to both tables. The JOIN keyword is used in SQL queries for joining two or more tables. The minimum number of join conditions required to join n tables is (n-1). A table can also be joined to itself, which is known as a Self Join.
Types of Join
The following are the types of JOIN that we can use in SQL.
• Inner
• Outer
• Left
• Right
Cross JOIN or Cartesian Product
This type of JOIN returns the Cartesian product of the rows from the tables in the join. It returns a table in which each row from the first table is combined with each row of the second table.
Cross JOIN syntax is,
SELECT column-name-list
from table-name1
CROSS JOIN table-name2;
Example of Cross JOIN
The class table,
ID  NAME
1   abhi
2   adam
4   alex
The class_info table,
ID  Address
1   DELHI
2   MUMBAI
3   CHENNAI
Cross JOIN query will be,
SELECT * from class
cross JOIN class_info;
The result table will look like,
ID  NAME  ID  Address
1   abhi  1   DELHI
2   adam  1   DELHI
4   alex  1   DELHI
1   abhi  2   MUMBAI
2   adam  2   MUMBAI
4   alex  2   MUMBAI
1   abhi  3   CHENNAI
2   adam  3   CHENNAI
4   alex  3   CHENNAI
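The cross join example above can be reproduced with a minimal sketch in Python, using the standard sqlite3 module and an in-memory SQLite database. The choice of SQLite and the column types are assumptions made for illustration; the table names and rows follow the example.

```python
import sqlite3

# In-memory SQLite database; the column types are an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class (id INTEGER, name TEXT);
    CREATE TABLE class_info (id INTEGER, address TEXT);
    INSERT INTO class VALUES (1, 'abhi'), (2, 'adam'), (4, 'alex');
    INSERT INTO class_info VALUES (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI');
""")

# Every row of class is paired with every row of class_info.
cross_rows = conn.execute(
    "SELECT * FROM class CROSS JOIN class_info "
    "ORDER BY class_info.id, class.id"
).fetchall()

for row in cross_rows:
    print(row)
print(len(cross_rows), "rows")  # 3 rows x 3 rows
```

The row count is always the product of the two table sizes, which is why a cross join on large tables grows quickly.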
INNER Join or EQUI Join
This is a simple JOIN in which the result is based on matched data as per the equality condition specified in the query.
Inner Join syntax is,
SELECT column-name-list
from table-name1
INNER JOIN table-name2
ON table-name1.column-name = table-name2.column-name;
Example of Inner JOIN
The class table,
ID  NAME
1   abhi
2   adam
3   alex
4   anu
The class_info table,
ID  Address
1   DELHI
2   MUMBAI
3   CHENNAI
Inner JOIN query will be,
SELECT * from class, class_info where class.id = class_info.id;
The result table will look like,
ID  NAME  ID  Address
1   abhi  1   DELHI
2   adam  2   MUMBAI
3   alex  3   CHENNAI
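As a hedged illustration of the inner join example, the sketch below runs it in an in-memory SQLite database through Python's sqlite3 module (both are assumptions, not part of the original tutorial). It also checks that the explicit INNER JOIN ... ON form and the comma-separated form with WHERE return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class (id INTEGER, name TEXT);
    CREATE TABLE class_info (id INTEGER, address TEXT);
    INSERT INTO class VALUES (1, 'abhi'), (2, 'adam'), (3, 'alex'), (4, 'anu');
    INSERT INTO class_info VALUES (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI');
""")

# Explicit INNER JOIN with the equality condition in the ON clause.
explicit_rows = conn.execute(
    "SELECT * FROM class INNER JOIN class_info "
    "ON class.id = class_info.id ORDER BY class.id"
).fetchall()

# Comma-separated tables with the same condition in WHERE.
implicit_rows = conn.execute(
    "SELECT * FROM class, class_info "
    "WHERE class.id = class_info.id ORDER BY class.id"
).fetchall()

print(explicit_rows)
print(explicit_rows == implicit_rows)
```

The row for anu (ID 4) is absent from both results because class_info has no matching ID.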
Natural JOIN
Natural Join is a type of Inner join which is based on columns having the same name and the same datatype present in both the tables to be joined.
Natural Join syntax is,
SELECT *
from table-name1 NATURAL JOIN table-name2;
Example of Natural JOIN
The class table,
ID  NAME
1   abhi
2   adam
3   alex
4   anu
The class_info table,
ID  Address
1   DELHI
2   MUMBAI
3   CHENNAI
Natural join query will be,
SELECT * from class NATURAL JOIN class_info;
The result table will look like,
ID  NAME  Address
1   abhi  DELHI
2   adam  MUMBAI
3   alex  CHENNAI
In the above example, both the tables being joined have an ID column (same name and same datatype), hence the records for which the value of ID matches in both the tables will be the result of the Natural Join of these two tables.
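A small sketch of the natural join, again assuming an in-memory SQLite database driven from Python's sqlite3 module. It prints the column names as well as the rows, to show that the shared ID column appears only once in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class (id INTEGER, name TEXT);
    CREATE TABLE class_info (id INTEGER, address TEXT);
    INSERT INTO class VALUES (1, 'abhi'), (2, 'adam'), (3, 'alex'), (4, 'anu');
    INSERT INTO class_info VALUES (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI');
""")

# NATURAL JOIN matches on the columns that share a name (here: id).
cursor = conn.execute(
    "SELECT * FROM class NATURAL JOIN class_info ORDER BY class.id"
)
columns = [description[0] for description in cursor.description]
natural_rows = cursor.fetchall()

print(columns)
print(natural_rows)
```

Because the join condition is implied by column names, renaming a column in either table silently changes the result, which is a reason many teams prefer an explicit ON clause.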
Outer JOIN
Outer Join is based on both matched and unmatched data. Outer Joins subdivide further into,
• Left Outer Join
• Right Outer Join
• Full Outer Join
Left Outer Join
The left outer join returns a result table with the matched data of the two tables, then the remaining rows of the left table, with null for the right table's columns.
Left Outer Join syntax is,
SELECT column-name-list
from table-name1
LEFT OUTER JOIN table-name2
on table-name1.column-name = table-name2.column-name;
Left outer Join syntax for Oracle is,
select column-name-list
from table-name1, table-name2
where table-name1.column-name = table-name2.column-name(+);
Example of Left Outer Join
The class table,
ID  NAME
1   abhi
2   adam
3   alex
4   anu
5   ashish
The class_info table,
ID  Address
1   DELHI
2   MUMBAI
3   CHENNAI
7   NOIDA
8   PANIPAT
Left Outer Join query will be,
SELECT * FROM class LEFT OUTER JOIN class_info ON (class.id=class_info.id);
The result table will look like,
ID  NAME    ID    Address
1   abhi    1     DELHI
2   adam    2     MUMBAI
3   alex    3     CHENNAI
4   anu     null  null
5   ashish  null  null
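The left outer join example can be checked with the following sketch, which assumes an in-memory SQLite database and Python's sqlite3 module. SQL null values come back as Python None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class (id INTEGER, name TEXT);
    CREATE TABLE class_info (id INTEGER, address TEXT);
    INSERT INTO class VALUES
        (1, 'abhi'), (2, 'adam'), (3, 'alex'), (4, 'anu'), (5, 'ashish');
    INSERT INTO class_info VALUES
        (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI'), (7, 'NOIDA'), (8, 'PANIPAT');
""")

# All rows of class are kept; unmatched rows get NULL for class_info columns.
left_rows = conn.execute(
    "SELECT * FROM class LEFT OUTER JOIN class_info "
    "ON class.id = class_info.id ORDER BY class.id"
).fetchall()

for row in left_rows:
    print(row)
```

Swapping the two table names in the query gives the rows a right outer join of the original order would return, which is a common workaround on engines without RIGHT OUTER JOIN.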
Right Outer Join
The right outer join returns a result table with the matched data of the two tables, then the remaining rows of the right table, with null for the left table's columns.
Right Outer Join syntax is,
select column-name-list
from table-name1
RIGHT OUTER JOIN table-name2
on table-name1.column-name = table-name2.column-name;
Right outer Join syntax for Oracle is,
select column-name-list
from table-name1, table-name2
where table-name1.column-name(+) = table-name2.column-name;
Example of Right Outer Join
The class table,
ID  NAME
1   abhi
2   adam
3   alex
4   anu
5   ashish
The class_info table,
ID  Address
1   DELHI
2   MUMBAI
3   CHENNAI
7   NOIDA
8   PANIPAT
Right Outer Join query will be,
SELECT * FROM class RIGHT OUTER JOIN class_info on (class.id=class_info.id);
The result table will look like,
ID    NAME  ID  Address
1     abhi  1   DELHI
2     adam  2   MUMBAI
3     alex  3   CHENNAI
null  null  7   NOIDA
null  null  8   PANIPAT
Full Outer Join
The full outer join returns a result table with the matched data of the two tables, then the remaining rows of both the left table and the right table.
Full Outer Join syntax is,
select column-name-list
from table-name1
FULL OUTER JOIN table-name2
on table-name1.column-name = table-name2.column-name;
Example of Full outer join is,
The class table,
ID  NAME
1   abhi
2   adam
3   alex
4   anu
5   ashish
The class_info table,
ID  Address
1   DELHI
2   MUMBAI
3   CHENNAI
7   NOIDA
8   PANIPAT
Full Outer Join query will be,
SELECT * FROM class FULL OUTER JOIN class_info on (class.id=class_info.id);
The result table will look like,
ID    NAME    ID    Address
1     abhi    1     DELHI
2     adam    2     MUMBAI
3     alex    3     CHENNAI
4     anu     null  null
5     ashish  null  null
null  null    7     NOIDA
null  null    8     PANIPAT
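Not every engine accepts FULL OUTER JOIN (older SQLite versions, for example, do not). The sketch below is therefore an emulation rather than the query shown above: it combines a left outer join with the unmatched rows of the right table using UNION ALL. SQLite, Python's sqlite3 module, and the aliases c and i are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class (id INTEGER, name TEXT);
    CREATE TABLE class_info (id INTEGER, address TEXT);
    INSERT INTO class VALUES
        (1, 'abhi'), (2, 'adam'), (3, 'alex'), (4, 'anu'), (5, 'ashish');
    INSERT INTO class_info VALUES
        (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI'), (7, 'NOIDA'), (8, 'PANIPAT');
""")

# Part 1: every class row, matched or not.
# Part 2: only the class_info rows that found no match in class.
full_rows = conn.execute("""
    SELECT c.id, c.name, i.id, i.address
    FROM class AS c LEFT OUTER JOIN class_info AS i ON c.id = i.id
    UNION ALL
    SELECT c.id, c.name, i.id, i.address
    FROM class_info AS i LEFT OUTER JOIN class AS c ON c.id = i.id
    WHERE c.id IS NULL
""").fetchall()

for row in full_rows:
    print(row)
```

The WHERE c.id IS NULL filter in the second part prevents matched rows from appearing twice.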
Nirav Prabtani, 20 Jan 2014 CPOL
Types of join in SQL Server for fetching records from multiple tables.
Introduction
In this tip, I am going to explain the types of join.
What is join?
An SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.
There are many types of join.
• Inner Join
1. Equi-join
2. Natural Join
• Outer Join
1. Left outer join
2. Right outer join
3. Full outer join
• Cross Join
• Self Join
Using the Code
Join is very useful for fetching records from multiple tables with reference to a common column between them.
To understand join with an example, we have to create two tables in a SQL Server database.
1. Employee
create table Employee(
id int identity(1,1) primary key,
Username varchar(50),
FirstName varchar(50),
LastName varchar(50),
DepartID int
)
2. Departments
create table Departments(
id int identity(1,1) primary key,
DepartmentName varchar(50)
)
Now fill the Employee table with demo records like that.
Fill the Departments table also like this.
1) Inner Join
The join that displays only the rows that have a match in both the joined tables is known as inner join.
select e1.Username,e1.FirstName,e1.LastName,e2.DepartmentName
from Employee e1 inner join Departments e2 on e1.DepartID=e2.id
It gives matched rows from both tables with reference to DepartID of the first table and id of the second table like this.
Equi-Join
Equi join is a special type of join in which we use only the equality operator. Hence, when you make a join query using the equality operator, that join query comes under Equi join.
Equi join has only the (=) operator in the join condition.
Equi join can be inner join, left outer join, right outer join. Check the query for equi-join:
SELECT * FROM Employee e1 JOIN Departments e2 ON e1.DepartID = e2.id
2) Outer Join
Outer join returns the matched rows together with the unmatched rows of one or both tables.
We have three types of outer join:
1. Left outer join
2. Right outer join
3. Full outer join
a) Left Outer join
Left join displays all the rows from the first table and the matched rows from the second table, like this:
SELECT * FROM Employee e1 LEFT OUTER JOIN Departments e2 ON e1.DepartID = e2.id
b) Right outer join
Right outer join displays all the rows of the second table and the matched rows from the first table, like this:
SELECT * FROM Employee e1 RIGHT OUTER JOIN Departments e2 ON e1.DepartID = e2.id
c) Full outer join
Full outer join returns all the rows from both tables, whether they have been matched or not.
SELECT * FROM Employee e1 FULL OUTER JOIN Departments e2 ON e1.DepartID = e2.id
3) Cross Join
A cross join produces the Cartesian product of the tables that are involved in the join. The size of a Cartesian product is the number of rows in the first table multiplied by the number of rows in the second table.
SELECT * FROM Employee cross join Departments e2
You can write a query like this also:
SELECT * FROM Employee , Departments e2
4) Self Join
Joining a table to itself is called a self join. Self join is used to retrieve the records having some relation or similarity with other records in the same table. Here, we need to use aliases for the same table to set a self join on a single table and retrieve the records satisfying the join condition.
SELECT e1.Username,e1.FirstName,e1.LastName from Employee e1
inner join Employee e2 on e1.id=e2.DepartID
Here, I have retrieved data in which id and DepartID of the Employee table have been matched.
Points of Interest
Here, I have taken one example of self join in a scenario where the manager name can be retrieved by managerid with reference to the employee id from one table.
Here, I have created one table employees like that:
If I have to retrieve the manager name from the manager id, then it is possible by self join:
select e1.empName as ManagerName,e2.empName as EmpName
from employees e1 inner join employees e2 on e1.id=e2.managerid
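The manager lookup above can be tried with the sketch below. The employees table and its columns id, empName, and managerid come from the query in the text; the sample rows, the use of SQLite, and Python's sqlite3 module are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        empName TEXT,
        managerid INTEGER  -- id of another row in this same table
    );
    -- Hypothetical demo rows: Asha has no manager.
    INSERT INTO employees VALUES
        (1, 'Asha', NULL), (2, 'Ravi', 1), (3, 'Meena', 1), (4, 'Kiran', 2);
""")

# Two aliases of the same table: e1 plays the manager, e2 the employee.
manager_pairs = conn.execute(
    "SELECT e1.empName AS ManagerName, e2.empName AS EmpName "
    "FROM employees e1 INNER JOIN employees e2 ON e1.id = e2.managerid "
    "ORDER BY e2.id"
).fetchall()

print(manager_pairs)
```

Asha does not appear as an employee in the output because her managerid is null and an inner join drops unmatched rows; a left outer join from e2 would keep her.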
11 important database designing rules which I follow
Shivprasad koirala, 25 Feb 2014 CPOL
This article discusses 11 important database designing rules.
Table of Contents
• Introduction
• Rule 1: What is the nature of the application (OLTP or OLAP)?
• Rule 2: Break your data into logical pieces, make life simpler
• Rule 3: Do not get overdosed with rule 2
• Rule 4: Treat duplicate non-uniform data as your biggest enemy
• Rule 5: Watch for data separated by separators
• Rule 6: Watch for partial dependencies
• Rule 7: Choose derived columns preciously
• Rule 8: Do not be hard on avoiding redundancy, if performance is the key
• Rule 9: Multidimensional data is a different beast altogether
• Rule 10: Centralize name value table design
• Rule 11: For unlimited hierarchical data self-reference PK and FK
Courtesy: Image from Motion pictures
Introduction
Before you start reading this article let me confirm to you I am not a guru in database designing. The below 11 points are what I have learnt via projects, my own experiences, and my own reading. I personally think it has helped me a lot when it comes to DB designing. Any criticism is welcome.
The reason I am writing a full blown article is, when developers design a database they tend to follow the three normal forms like a silver bullet. They tend to think normalization is the only way of designing. Due to this mindset they sometimes hit road blocks as the project moves ahead.
If you are new to normalization, then click and see 3 normal forms in action which explains all the three normal forms step by step.
Said and done, normalization rules are important guidelines, but taking them as a mark on stone is calling for trouble. Below are my own 11 rules which I remember on the top of my head while doing DB design.
Rule 1: What is the nature of the application (OLTP or OLAP)?
When you start your database design the first thing to analyze is the nature of the application you are designing for: is it Transactional or Analytical? You will find many developers by default applying normalization rules without thinking about the nature of the application and then later getting into performance and customization issues. As said, there are two kinds of applications: transaction based and analytical based; let's understand what these types are.
Transactional: In this kind of application, your end user is more interested in CRUD, i.e., creating, reading, updating, and deleting records. The official name for such a kind of database is OLTP.
Analytical: In these kinds of applications your end user is more interested in analysis, reporting, forecasting, etc. These kinds of databases have fewer inserts and updates. The main intention here is to fetch and analyze data as fast as possible. The official name for such a kind of database is OLAP.
In other words, if you think inserts, updates, and deletes are more prominent then go for a normalized table design, else create a flat denormalized database structure.
Below is a simple diagram which shows how the names and address in the left hand side are a simple normalized table and how, by applying a denormalized structure, we have created a flat table structure.
Rule 2: Break your data into logical pieces, make life simpler
This rule is actually the first rule from 1st normal form. One of the signs of violation of this rule is if your queries are using too many string parsing functions like substring, charindex, etc.; then probably this rule needs to be applied.
For instance you can see the below table which has student names; if you ever want to query student names having "Koirala" and not "Harisingh", you can imagine what kind of a query you will end up with.
So the better approach would be to break this field into further logical pieces so that we can write clean and optimal queries.
Rule 3: Do not get overdosed with rule 2
Developers are cute creatures. If you tell them this is the way, they keep doing it; well, they overdo it, leading to unwanted consequences. This also applies to rule 2 which we just talked about above. When you think about decomposing, give a pause and ask yourself, is it needed? As said, the decomposition should be logical.
For instance, you can see the phone number field; it's rare that you will operate on ISD codes of phone numbers separately (until your application demands it). So it would be a wise decision to just leave it as it can lead to more
Rule 4: Treat duplicate non-uniform data as your biggest enemy
Focus and refactor duplicate data. My personal worry about duplicate data is not that it takes hard disk space, but the confusion it creates.
For instance, in the below diagram, you can see "5th Standard" and "Fifth standard" mean the same. Now you can say the data has come into your system due to bad data entry or poor validation. If you ever want to derive a report, it would show them as different entities, which is very confusing from the end user point of view.
One of the solutions would be to move the data into a different master table altogether and refer to it via foreign keys. You can see in the below figure how we have created a new master table called "Standards" and linked the same using a simple foreign key.
Rule 5: Watch for data separated by separators
The second rule of 1st normal form says avoid repeating groups. One of the examples of repeating groups is explained in the below diagram. If you see the syllabus field closely, in one field we have too much data stuffed. These kinds of fields are termed as "Repeating groups". If we have to manipulate this data, the query would be complex and I also doubt the performance of the queries.
These kinds of columns which have data stuffed with separators need special attention, and a better approach would be to move those fields to a different table and link them with keys for better management.
So now let's apply the second rule of 1st normal form: "Avoid repeating groups". You can see in the above figure I have created a separate syllabus table and then made a many-to-many relationship with the subject table.
With this approach the syllabus field in the main table no longer repeats and no longer has data separators.
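To make the idea concrete, here is a minimal sketch of moving a separator-packed syllabus field into its own table with a many-to-many link to subjects. The table names, column names, sample values, and the use of SQLite through Python's sqlite3 module are all assumptions for illustration; the original figure is not available.

```python
import sqlite3

# Hypothetical denormalized rows: the syllabus is stuffed into one field.
packed = [("Maths", "Algebra,Statistics"), ("Science", "Physics,Statistics")]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE syllabus (id INTEGER PRIMARY KEY, topic TEXT UNIQUE);
    CREATE TABLE subject_syllabus (
        subject_id INTEGER REFERENCES subject(id),
        syllabus_id INTEGER REFERENCES syllabus(id),
        PRIMARY KEY (subject_id, syllabus_id)
    );
""")

for subject_name, topics in packed:
    conn.execute("INSERT INTO subject (name) VALUES (?)", (subject_name,))
    for topic in topics.split(","):
        # INSERT OR IGNORE keeps a topic shared by two subjects as one row.
        conn.execute("INSERT OR IGNORE INTO syllabus (topic) VALUES (?)", (topic,))
        conn.execute(
            "INSERT INTO subject_syllabus "
            "SELECT s.id, y.id FROM subject s, syllabus y "
            "WHERE s.name = ? AND y.topic = ?",
            (subject_name, topic),
        )

# A plain equality filter replaces string parsing on the packed field.
subjects_with_statistics = [
    row[0]
    for row in conn.execute(
        "SELECT s.name FROM subject s "
        "JOIN subject_syllabus ss ON ss.subject_id = s.id "
        "JOIN syllabus y ON y.id = ss.syllabus_id "
        "WHERE y.topic = ? ORDER BY s.name",
        ("Statistics",),
    )
]
topic_count = conn.execute("SELECT COUNT(*) FROM syllabus").fetchone()[0]
print(subjects_with_statistics, topic_count)
```

The shared topic is stored once and linked twice, so renaming it is a single-row update.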
Rule 6: Watch for partial dependencies
Watch for fields which depend partially on primary keys. For instance in the above table we can see the primary key is created on roll number and standard. Now watch the syllabus field closely. The syllabus field is associated with a standard and not with a student directly (roll number).
The syllabus is associated with the standard in which the student is studying and not directly with the student. So if tomorrow we want to update the syllabus we have to update it for each student, which is painstaking and not logical. It makes more sense to move these fields out and associate them with the Standard table.
You can see how we have moved the syllabus field and attached it to the Standards table.
This rule is nothing but the 2nd normal form: "All keys should depend on the full primary key and not partially".
Rule 7: Choose derived columns preciously
If you are working on OLTP applications, getting rid of derived columns would be a good thought, unless there is some pressing reason for performance. In case of OLAP where we do a lot of summations and calculations, these kinds of fields are necessary to gain performance.
In the above figure you can see how the average field is dependent on the marks and subject. This is also one form of redundancy. So for such kinds of fields which are derived from other fields, give a thought: are they really necessary?
This rule is also termed as the 3rd normal form: "No column should depend on other non-primary key columns". My personal thought is do not apply this rule blindly, see the situation; it's not that redundant data is always bad. If the redundant data is calculative data, see the situation and then decide if you want to implement the 3rd normal form.
Rule 8: Do not be hard on avoiding redundancy, if performance is the key
Do not make it a strict rule that you will always avoid redundancy. If there is a pressing need for performance think about de-normalization. In normalization, you need to make joins with many tables and in denormalization, the joins
Rule 9: Multidimensional data is a different beast altogether
OLAP projects mostly deal with multidimensional data. For instance you can see the below figure; you would like to get sales per country, customer, and date. In simple words you are looking at sales figures which have three intersections of dimension data.
For such kinds of situations a dimension and fact design is a better approach. In simple words you can create a simple central sales fact table which has the sales amount field and it makes a connection with all dimension tables using a foreign key relationship.
Rule 10: Centralize name value table design
Many times I have come across name value tables. A name and value table means it has a key and some data associated with the key. For instance in the below figure you can see we have a currency table and a country table. If you watch the data closely they actually only have a key and value.
For such kinds of tables, creating a central table and differentiating the data by using a type field makes more sense.
Rule 11: For unlimited hierarchical data self-reference PK and FK
Many times we come across data with unlimited parent child hierarchy. For instance consider a multi-level marketing scenario where a sales person can have multiple sales people below them. For such scenarios, using a self-referencing primary key and foreign key will help to achieve the same.
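A self-referencing key can be sketched as follows. The sales_person table, its columns, and the sample rows are hypothetical; the recursive common table expression is one way to walk such a hierarchy in SQLite (used here through Python's sqlite3 module) and is not something the article itself prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- parent_id is a foreign key to the same table's primary key.
    CREATE TABLE sales_person (
        id INTEGER PRIMARY KEY,
        name TEXT,
        parent_id INTEGER REFERENCES sales_person(id)
    );
    INSERT INTO sales_person VALUES
        (1, 'Top', NULL), (2, 'Mid A', 1), (3, 'Mid B', 1), (4, 'Leaf', 2);
""")

# Walk down the hierarchy from person 1 to any depth.
downline = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM sales_person WHERE id = 1
        UNION ALL
        SELECT sp.id, sp.name, tree.depth + 1
        FROM sales_person sp JOIN tree ON sp.parent_id = tree.id
    )
    SELECT id, name, depth FROM tree ORDER BY depth, id
""").fetchall()

print(downline)
```

Because the depth is not fixed in the schema, adding another level only means inserting rows, not altering tables.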
This article is not meant to say do not follow normal forms; instead, do not follow them blindly. Look at your project's nature and the type of data you are dealing with first.
Below is a video which explains the three normal forms step by step using a simple school table.
You can also visit my website for step by step videos on Design Patterns, UML, SharePoint 2010, .NET
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
About the Author
Shivprasad koirala
Architect http://www.questpond.com
India
Introduction to database design
This article/tutorial will teach the basics of relational database design and explain how to make a good database design. It is a rather long text, but we advise you to read all of it. Designing a database is in fact fairly easy, but there are a few rules to stick to. It is important to know what these rules are, but more important is to know why these rules exist, otherwise you will tend to make mistakes!
Standardization makes your data model flexible and that makes working with your data much easier. Please, take the time to learn these rules and apply them! The database used in this article is designed with our database design and modeling tool DeZign for Databases.
A good database design starts with a list of the data that you want to include in your database and what you want to be able to do with the database later on. This can all be written in your own language, without any SQL. In this stage you must try not to think in tables or columns, but just think: "What do I need to know?" Don't take this too lightly, because if you find out later that you forgot something, usually you need to start all over. Adding things to your database is mostly a lot of work.
Identifying Entities
The types of information that are saved in the database are called 'entities'. These entities exist in four kinds: people, things, events, and locations. Everything you could want to put in a database fits into one of these categories. If the information you want to include doesn't fit into these categories, then it is probably not an entity but a property of an entity, an attribute.
To clarify the information given in this article we'll use an example. Imagine that you are creating a website for a shop; what kind of information do you have to deal with? In a shop you sell your products to customers. The "Shop" is a location; "Sale" is an event; "Products" are things; and "Customers" are people. These are all entities that need to be included in your database.
But what other things are happening when selling a product? A customer comes into the shop, approaches the vendor, asks a question and gets an answer. "Vendors" also participate, and because vendors are people, we need a vendors entity.
Figure 1: Entities: types of information.
Identifying Relationships
The next step is to determine the relationships between the entities and to determine the cardinality of each relationship. The relationship is the connection between the entities, just like in the real world: what does one entity do with the other, how do they relate to each other? For example, customers buy products, products are sold to customers, a sale comprises products, a sale happens in a shop.
The cardinality shows how much of one side of the relationship belongs to how much of the other side of the relationship. First, you need to state for each relationship how much of one side belongs to exactly 1 of the other side. For example: How many customers belong to 1 sale? How many sales belong to 1 customer? How many sales take place in 1 shop? You'll get a list like this (please note that 'product' represents a type of product, not an occurrence of a product):
• Customers --> Sales: 1 customer can buy something several times
• Sales --> Customers: 1 sale is always made by 1 customer at a time
• Customers --> Products: 1 customer can buy multiple products
• Products --> Customers: 1 product can be purchased by multiple customers
• Customers --> Shops: 1 customer can purchase in multiple shops
• Shops --> Customers: 1 shop can receive multiple customers
• Shops --> Products: in 1 shop there are multiple products
• Products --> Shops: 1 product (type) can be sold in multiple shops
• Shops --> Sales: in 1 shop multiple sales can be made
• Sales --> Shops: 1 sale can only be made in 1 shop at a time
• Products --> Sales: 1 product (type) can be purchased in multiple sales
• Sales --> Products: 1 sale can consist of multiple products
Did we mention all relationships? There are four entities and each entity has a relationship with every other entity, so each entity must have three relationships, and also appear on the left end of the relationship three times. Above, 12 relationships were mentioned, which is 4x3, so we can conclude that all relationships were mentioned.
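The 4x3 count can be verified mechanically. The short Python sketch below enumerates every ordered pair of the four entities named in the list; the variable names are illustrative only.

```python
from collections import Counter
from itertools import permutations

entities = ["Customers", "Sales", "Products", "Shops"]

# Each ordered pair (left, right) is one directed relationship to describe.
directed = list(permutations(entities, 2))

# How often each entity appears on the left end of a relationship.
left_counts = Counter(left for left, _ in directed)

print(len(directed))      # 4 entities x 3 partners each
print(dict(left_counts))
```

With n entities the same check gives n x (n-1) directed relationships, which is a quick way to confirm that none were forgotten in a larger model.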
Now we'll put the data together to find the cardinality of the whole relationship. In order to do this, we'll draft the cardinalities per relationship. To make this easy to do, we'll adjust the notation a bit, by noting the 'backward' relationship the other way around:
• Customers --> Sales: 1 customer can buy something several times
• Sales --> Customers: 1 sale is always made by 1 customer at a time
The second relationship we will turn around so it has the same entity order as the first. Please notice the arrow that now faces the other way!
• Customers <-- Sales: 1 sale is always made by 1 customer at a time
Cardinality exists in four types: one-to-one, one-to-many, many-to-one, and many-to-many. In a database design this is indicated as: 1:1, 1:N, M:1, and M:N. To find the right indication just leave the '1'. If there is a 'many' on the left side, this will be indicated with 'M'; if there is a 'many' on the right side it is indicated with 'N'.
• Customers --> Sales: 1 customer can buy something several times; 1:N.
• Customers <-- Sales: 1 sale is always made by 1 customer at a time; 1:1.
The true cardinality can be calculated by assigning the biggest values for left and right, for which 'N' or 'M' are greater than '1'. In this example, in both cases there is a '1' on the left side. On the right side, there is an 'N' and a '1'; the 'N' is the biggest value. The total cardinality is therefore '1:N'. A customer can make multiple 'sales', but each 'sale' has just one customer. If we do this for the other relationships too, we'll get:
• Customers --> Sales --> 1:N
• Customers --> Products --> M:N
• Customers --> Shops --> M:N
• Sales --> Products --> M:N
• Shops --> Sales --> 1:N
• Shops --> Products --> M:N
So, we have two '1-to-many' relationships, and four 'many-to-many' relationships.
Between the entities there may be a mutual dependency. This means that the one item cannot exist if the other item does not exist. For example, there cannot be a sale if there are no customers, and there cannot be a sale if there are no products.
The relationships Sales --> Customers and Sales --> Products are mandatory, but the other way around this is not the case. A customer can exist without a sale, and a product can also exist without a sale. This is of importance for the next step.
Recursive Relationships
Sometimes an entity refers back to itself. For example, think of a work hierarchy: an employee has a boss and the boss is an employee too. The attribute 'boss' of the entity 'employees' refers back to the entity 'employees'.
In an ERD (see next chapter) this type of relationship is a line that goes out of the entity and returns with a nice loop to the same entity.
Redundant Relationships
Sometimes in your model you will get a 'redundant relationship'. These are relationships that are already indicated by other relationships, although not directly.
In the case of our example there is a direct relationship between customers and products. But there are also relationships from customers to sales and from sales to products, so indirectly there already is a relationship between customers and products through sales. The relationship 'Customers <----> Products' is made twice, and one of them is therefore redundant. In this case, products are only purchased through a sale, so the relationship 'Customers <----> Products' can be deleted. The model will then look like this:
Figure 3: Relationships between the entities.
Solving Many-to-Many Relationships
Many-to-many relationships (M:N) are not directly possible in a database. What an M:N relationship says is that a number of records from one table belongs to a number of records from another table. Somewhere you need to save which records these are, and the solution is to split the relationship up into two one-to-many relationships.
This can be done by creating a new entity that is in between the related entities. In our example, there is a many-to-many relationship between sales and products. This can be solved by creating a new entity: sales-products. This entity has a many-to-one relationship with Sales, and a many-to-one relationship with Products. In logical models this is called an associative entity and in physical database terms this is called a link table or junction table.
Figure 4: Many to many relationship implementation via associative entity.
In the example there are two many-to-many relationships that need to be solved: 'Products <----> Sales' and 'Products <----> Shops'. For both situations a new entity needs to be created, but what is that entity? For the Products <----> Sales relationship, every sale includes more products. The relationship shows the content of the sale. In other words, it gives details about the sale. So the entity is called 'Sales details'. You could also name it 'sold products'.
The Products <----> Shops relationship shows which products are available in which shops, also known as 'stock'. Our model would now look like this:
Figure 5: Model with link tables Stock and Sales_details.
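A link table of this kind could look like the sketch below, which resolves the Products/Sales many-to-many relationship through a sales_details table. The column names, sample rows, and the use of SQLite via Python's sqlite3 module are assumptions for illustration and are not taken from the article's model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, sold_on TEXT);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
    -- Link table: each row ties one sale to one product type.
    CREATE TABLE sales_details (
        sale_id INTEGER REFERENCES sales(sale_id),
        product_id INTEGER REFERENCES products(product_id),
        PRIMARY KEY (sale_id, product_id)
    );
    INSERT INTO sales VALUES (1, '2014-01-20'), (2, '2014-01-21');
    INSERT INTO products VALUES (10, 'pen'), (11, 'notebook');
    INSERT INTO sales_details VALUES (1, 10), (1, 11), (2, 10);
""")

# One sale, many products.
products_in_sale_1 = [row[0] for row in conn.execute(
    "SELECT p.name FROM sales_details d "
    "JOIN products p ON p.product_id = d.product_id "
    "WHERE d.sale_id = 1 ORDER BY p.name"
)]

# One product, many sales.
sales_with_pen = [row[0] for row in conn.execute(
    "SELECT d.sale_id FROM sales_details d "
    "JOIN products p ON p.product_id = d.product_id "
    "WHERE p.name = 'pen' ORDER BY d.sale_id"
)]

print(products_in_sale_1, sales_with_pen)
```

Both directions of the many-to-many relationship are answered from the same link table, each through an ordinary one-to-many join.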
Identifying Attributes
The data elements that you want to save for each entity are called 'attributes'.
About the products that you sell, you want to know, for example, what the price is, what the name of the manufacturer is, and what the type number is. About the customers you know their customer number, their name, and address. About the shops you know the location code, the name, the address. Of the sales you know when they happened, in which shop, what products were sold, and the sum total of the sale. Of the vendor you know his staff number, name, and address. What will be included precisely is not of importance yet; it is still only about what you want to save.
Figure 6: Entities with attributes.
Derived Data
Derived data is data that is derived from the other data that you have already saved. In this case the 'sum total' is a classical case of derived data. You know exactly what has been sold and what each product costs, so you can always calculate how much the sum total of the sales is. So really it is not necessary to save the sum total.
So why is it saved here? Well, because it is a sale, and the price of the product can vary over time. A product can be priced at one amount today and at a different amount next month, and for your administration you need to know what it cost at the time of the sale, and the easiest way to do this is to save it here. There are a lot of more elegant ways, but they are too profound for this article.
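The effect of a later price change can be shown with a minimal sketch. The tables, columns, and amounts below are hypothetical, and SQLite through Python's sqlite3 module is an assumption; the point is only that a stored sum total keeps its value while a recomputed one follows the current price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, price INTEGER);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, sum_total INTEGER);
    CREATE TABLE sales_details (sale_id INTEGER, product_id INTEGER, quantity INTEGER);
    INSERT INTO products VALUES (1, 10);
    INSERT INTO sales_details VALUES (100, 1, 3);
""")

# Derives the total from the quantities and the *current* product prices.
RECOMPUTE = (
    "SELECT SUM(d.quantity * p.price) FROM sales_details d "
    "JOIN products p ON p.product_id = d.product_id WHERE d.sale_id = ?"
)

# At the time of the sale the derived value is stored with the sale.
total_at_sale = conn.execute(RECOMPUTE, (100,)).fetchone()[0]
conn.execute("INSERT INTO sales VALUES (?, ?)", (100, total_at_sale))

# Later the product price changes.
conn.execute("UPDATE products SET price = 8 WHERE product_id = 1")

stored_total = conn.execute(
    "SELECT sum_total FROM sales WHERE sale_id = 100"
).fetchone()[0]
recomputed_total = conn.execute(RECOMPUTE, (100,)).fetchone()[0]

print(stored_total, recomputed_total)
```

The two numbers diverge after the update, which is the reason the article gives for saving the sum total despite it being derived data.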
Presenting Entities and Relationships: Entity Relationship Diagram (ERD)
The Entity Relationship Diagram (ERD) gives a graphical overview of the database. There are several styles and types of ER Diagrams. A much-used notation is the "crowfeet" notation, where entities are represented as rectangles and the relationships between the entities are represented as lines between the entities. The signs at the end of the lines indicate the type of relationship. The side of the relationship that is mandatory for the other to exist is indicated by a dash on the line. Entities that are not mandatory are indicated by a circle. "Many" is indicated by a "crowfeet": the relationship line splits up into three lines.
In this article we make use of DeZign for Databases to design and present our database.
A 1:1 mandatory relationship is represented as follows:
Figure 7: Mandatory one to one relationship.
A 1:N mandatory relationship:
Figure 8: Mandatory one to many relationship.
An M:N relationship is:
Figure 9: Mandatory many to many relationship.
Figure 10: Model with relationships.
Assigning Keys
Primary Keys
A primary key (PK) is one or more data attributes that uniquely identify an entity. A key that consists of two or more attributes is called a composite key. All attributes that are part of a primary key must have a value in every record (they cannot be left empty), and the combination of the values within these attributes must be unique in the table.
In the example there are a few obvious candidates for the primary key.
Customers all have a customer number, products all have a unique product number and the sales have a sales number. Each of these data is unique and each record will contain a value, so these attributes can be a primary key. Often an integer column is used for the primary key so a record can be easily found through its number.
Link-entities usually refer to the primary key attributes of the entities that they link. The primary key of a link-entity is usually a collection of these reference-attributes. For example, in the Sales_details entity we could use the combination of the PKs of the sales and products entities as the PK of Sales_details. In this way we enforce that the same product (type) can only be used once in the same sale. Multiple items of the same product type in a sale must be indicated by the quantity.
In the ERD the primary key attributes are indicated by the text "PK" behind the name of the attribute. In the example only the entity "shop" does not have an obvious candidate for the PK, so we will introduce a new attribute for that entity: shopnr.
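The effect of a composite primary key on a link-entity can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module. The column names salesnr, productnr and quantity are assumed spellings chosen for illustration, and add_detail is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_details (
        salesnr   INTEGER NOT NULL,
        productnr INTEGER NOT NULL,
        quantity  INTEGER NOT NULL,
        PRIMARY KEY (salesnr, productnr)  -- composite key: the pair must be unique
    )
""")
conn.execute("INSERT INTO sales_details VALUES (100, 1, 2)")

def add_detail(conn, salesnr, productnr, quantity):
    """Return True if the row was stored, False if the composite key rejected it."""
    try:
        conn.execute(
            "INSERT INTO sales_details VALUES (?, ?, ?)",
            (salesnr, productnr, quantity),
        )
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = add_detail(conn, 100, 1, 5)  # same sale, same product type
other_ok = add_detail(conn, 100, 2, 1)      # same sale, different product type
```

The second row for the same sale and product type is refused, which is why several items of one product type have to be expressed through the quantity.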
Foreign Keys
The Foreign Key (FK) in an entity is the reference to the primary key of another entity. In the ERD that attribute will be indicated with "FK" behind its name. The foreign key of an entity can also be part of the primary key; in that case the attribute will be indicated with "PF" behind its name. This is usually the case with the link-entities, because you usually link two instances only once together (with 1 sale only 1 product type is sold 1 time).
If we put all link-entities, PKs and FKs into the ERD, we get the model as shown below. Please note that the attribute "products" is no longer necessary in "Sales", because "sold products" is now included in the link-table. In the link-table another field was added, "quantity", that indicates how many products were sold. The quantity field was also added in the stock-table, to indicate how many products are still in store.
Figure 11: Primary keys and foreign keys.
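How a foreign key guards the link-table can be sketched as follows. This is a minimal sketch with Python's built-in sqlite3 module, not the exact model from the figure. The table layout, the column name sale_date and the helper add_detail are assumptions made for illustration. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE products (productnr INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (salesnr INTEGER PRIMARY KEY, sale_date TEXT);
    CREATE TABLE sales_details (
        salesnr   INTEGER NOT NULL REFERENCES sales (salesnr),
        productnr INTEGER NOT NULL REFERENCES products (productnr),
        quantity  INTEGER NOT NULL,
        PRIMARY KEY (salesnr, productnr)  -- both columns are PK and FK at once
    );
    INSERT INTO products VALUES (1, 'lamp');
    INSERT INTO sales VALUES (100, '2024-01-15');
""")

def add_detail(conn, salesnr, productnr, quantity):
    """Return True if the row was stored, False if a key constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO sales_details VALUES (?, ?, ?)",
            (salesnr, productnr, quantity),
        )
        return True
    except sqlite3.IntegrityError:
        return False

known_product_ok = add_detail(conn, 100, 1, 2)     # product 1 exists
unknown_product_ok = add_detail(conn, 100, 99, 1)  # product 99 does not exist
```

The row that refers to a product number missing from the products table is refused, so the link-table can only contain references that actually resolve.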
Defining the Attribute's Data Type
Now it is time to figure out which data types need to be used for the attributes. There are a lot of different data types. A few are standardized, but many databases have their own data types that all have their own advantages. Some databases offer the possibility to define your own data types, in case the standard types cannot do the things you need.
The standard data types that every database knows, and that are most used, are: CHAR, VARCHAR, TEXT, FLOAT, DOUBLE, and INT.
Text:
• CHAR(length) - includes text (characters, numbers, punctuation...). CHAR has as characteristic that it always saves a fixed amount of positions. If you define a CHAR(10) you can save up to ten positions maximum, but if you only use two positions the database will still save 10 positions. The remaining eight positions will be filled by spaces.
• VARCHAR(length) - includes text (characters, numbers, punctuation...). VARCHAR is the same as CHAR; the difference is that VARCHAR only takes as much space as necessary.
• TEXT - can contain large amounts of text. Depending on the type of database this can add up to gigabytes.
Numbers:
• INT - contains a positive or negative whole number. A lot of databases have variations of the INT, such as TINYINT, SMALLINT, MEDIUMINT, BIGINT, INT2, INT4, INT8. These variations differ from the INT only in the size of the figure that fits into it. A regular INT is 4 bytes (INT4) and fits figures from -2147483648 to +2147483647, or if you define it as UNSIGNED from 0 to 4294967295. The INT8, or BIGINT, can get even bigger in size, from 0 to 18446744073709551615 when unsigned, but takes up 8 bytes of diskspace, even if there is just a small number in it.
• FLOAT, DOUBLE - the same idea as INT, but can also store floating point numbers. Do note that this does not always work perfectly. For instance, in MySQL calculating with these floating point numbers is not perfect: (1/3)*3 will result with MySQL's floats in 0.9999999, not 1.
Other types:
• BLOB - for binary data such as files.
• INET - for IP addresses. Also useable for netmasks.
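The sizes and caveats in the lists above can be checked with a little arithmetic. The snippet below is a minimal sketch in plain Python rather than in any particular database. The helper int_range is hypothetical and assumes the usual two's complement layout for signed integers, and the padding line only imitates what a CHAR(10) column does.

```python
def int_range(num_bytes, signed=True):
    """Smallest and largest whole number that fit in num_bytes bytes."""
    bits = 8 * num_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

int4_signed = int_range(4)                  # a regular INT
int4_unsigned = int_range(4, signed=False)  # INT UNSIGNED
int8_unsigned = int_range(8, signed=False)  # BIGINT UNSIGNED

# A fixed-width CHAR(10) pads short values with spaces up to ten positions
char10_value = "ab".ljust(10)

# Binary floating point cannot represent most decimal fractions exactly
float_sum_is_exact = (0.1 + 0.2 == 0.3)
```

For amounts of money, where rounding surprises are unwelcome, many databases offer an exact DECIMAL or NUMERIC type as an alternative to FLOAT and DOUBLE.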
For our example the data types are as follows:
Figure 12: Data model displaying data types.
Normalization
Normalization makes your data model flexible and reliable. It does generate some overhead because you usually get more tables, but it enables you to do many things with your data model without having to adjust it.
Normalization, the First Form
The first form of normalization states that there may be no repeating groups of columns in an entity. We could have created an entity "sales" with attributes for each of the products that were bought. This would look like this:
Figure 13: Not in 1st normal form.
What is wrong about this is that now only 3 products can be sold. If you had to sell 4 products, then you would have to start a second sale or adjust your data model by adding "product4" attributes. Both solutions are unwanted. In these cases you should always create a new entity that you link to the old one via a one-to-many relationship.
Figure 14: In accordance with 1st normal form.
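The step from repeating columns to a linked entity can be sketched in a few lines of Python. This is an illustration only. The record layout (salesnr, product1, product2, product3) and the helper to_detail_rows are assumptions and are not taken from the figures.

```python
def to_detail_rows(sale):
    """Split a sale with product1..productN columns into one row per product."""
    rows = []
    for column, value in sale.items():
        # every filled productN column becomes its own detail row
        if column.startswith("product") and value is not None:
            rows.append({"salesnr": sale["salesnr"], "productnr": value})
    return rows

# A sale in the repeating-group layout: room for exactly three products
wide_sale = {"salesnr": 100, "product1": 11, "product2": 12, "product3": None}

detail_rows = to_detail_rows(wide_sale)
```

In the row-per-product layout a fourth or fifth product is simply one more row, so the data model does not need to change.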
Normalization, the Second Form
The second form of normalization states that all attributes of an entity should be fully dependent on the whole primary key. This means that each attribute of an entity can only be identified through the whole primary key. Suppose we had the date in the Sales_details entity:
Figure 15: Not in 2nd normal form.
This entity is not according to the second normalization form, because in order to be able to look up the date of a sale, I do not have to know what is sold (productnr); the only thing I need to know is the sales number. This was solved by splitting up the tables into the sales and the Sales_details table:
Figure 16: In accordance with 2nd normal form.
Now each attribute of the entities is dependent on the whole PK of the entity. The date is dependent on the sales number, and the quantity is dependent on the sales number and the sold product.
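The split can be sketched with Python's built-in sqlite3 module. This is a minimal sketch. The column name sale_date and the sample rows are assumptions made for illustration. The date is stored once per sale and is joined back to the detail rows when it is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- the date depends on the sales number only, so it lives in sales
    CREATE TABLE sales (salesnr INTEGER PRIMARY KEY, sale_date TEXT NOT NULL);
    -- the quantity depends on the whole key (salesnr, productnr)
    CREATE TABLE sales_details (
        salesnr   INTEGER NOT NULL,
        productnr INTEGER NOT NULL,
        quantity  INTEGER NOT NULL,
        PRIMARY KEY (salesnr, productnr)
    );
    INSERT INTO sales VALUES (100, '2024-01-15');
    INSERT INTO sales_details VALUES (100, 11, 2), (100, 12, 1);
""")

rows = conn.execute("""
    SELECT d.productnr, d.quantity, s.sale_date
    FROM sales_details AS d
    JOIN sales AS s ON s.salesnr = d.salesnr
    ORDER BY d.productnr
""").fetchall()
```

Because the date exists in one place only, two detail rows of the same sale can never disagree about when the sale happened.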
Normalization, the Third Form
The third form of normalization states that all attributes need to be directly dependent on the primary key, and not on other attributes. This seems to be what the second form of normalization states, but the second form actually states the opposite. In the second form of normalization you point out attributes through the PK; in the third form of normalization every attribute needs to be dependent on the PK, and on nothing else.
Figure 17: Not in 3rd normal form.
In this case the price of a loose product is dependent on the ordering number, and the ordering number is dependent on the product number and the sales number. This is not according to the third form of normalization. Again, splitting up the tables solves this.
Figure 18: In accordance with 3rd normal form.
Normalization, More Forms
There are more normalization forms than the three forms mentioned above, but those are not of great interest for the average user. These other forms are highly specialized for certain applications. If you stick to the design rules and the normalization mentioned in this article, you will create a design that works great for most applications.
Normalized Data Model
If you apply the normalization rules, you will find that the "manufacturer" in the product table should also be a separate table:
Figure 19: Data model in accordance with 1st, 2nd and 3rd normal form.
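What a separate manufacturer table buys you can be sketched with Python's built-in sqlite3 module. This is a minimal sketch. The table and column names (manufacturers, manufacturernr) and the sample data are assumptions made for illustration and may differ from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE manufacturers (
        manufacturernr INTEGER PRIMARY KEY,
        name           TEXT NOT NULL
    );
    CREATE TABLE products (
        productnr      INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        manufacturernr INTEGER NOT NULL REFERENCES manufacturers (manufacturernr)
    );
    INSERT INTO manufacturers VALUES (1, 'Acme');
    INSERT INTO products VALUES (11, 'lamp', 1), (12, 'desk', 1);
    -- the manufacturer is renamed once, in one row
    UPDATE manufacturers SET name = 'Acme Ltd' WHERE manufacturernr = 1;
""")

names = conn.execute("""
    SELECT p.name, m.name
    FROM products AS p
    JOIN manufacturers AS m ON m.manufacturernr = p.manufacturernr
    ORDER BY p.productnr
""").fetchall()
```

A single update is visible for every product of that manufacturer, whereas a manufacturer name repeated in each product row would have to be changed row by row.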
Glossary
Attributes - detailed data about an entity, such as price, length, name.
Cardinality - the relationship between two entities, in figures. For example, a person can place multiple orders.
Entities - abstract data that you save in a database. For example: customers, products.
Foreign key (FK) - a referral to the Primary Key of another table. Foreign Key columns can only contain values that exist in the Primary Key column that they refer to.
Key - a key is used to point out records. The most well-known key is the Primary Key (see Primary Key).