ICT 252: Theory of Databases Lecture Notes
[Diagram: relation schemas]
student (student_id, student_firstname, student_fathersname, programme_code)
programme (programme_code, prog_name, prog_description)
course (course_code, course_name, course_description, credit_hours)
studentCourse (student_id, course_code)
FBE Computer Science Department Lecture Notes Theory of Databases
103  Solomon  Kebede  CSDIP

programme
programme_code  prog_name                 prog_description
CSDEG           Computer Science Degree   3 Year Degree in Computer Science
CSDIP           Computer Science Diploma  Diploma in Computer Science

course
course_code  course_name                      credit_hours  course_description
ICT252       Theory of Databases              4             Introduction to databases, DBMS, database models; focus on relational model; using E-R modelling to design databases.
ICT231       Internet & Web Page Development  3             Basic skills required for web development, including HTML and scripting.

studentCourse
student_id  course_code
100         ICT252
101         ICT252
100         ICT231
102         ICT231
103         ICT252
103         ICT231
There are connections between the data in the different relations. We call these relationships because the rows in a table can be related to rows in another table. They are related by values that match in the different tables.
Question: Can you identify relationships between tables on this diagram?
A student is registered for one programme (must be 1).
A programme has many students registered for it (0 or more).
So: 1-many relationship between the Programme and Student relations.
A student can enrol for many courses (1 or more).
A course can have many students registered for it (0 or more).
So: 1-many between Student and StudentCourse, 1-many between Course and StudentCourse. And many-many between Student and Course, but we cannot show that directly between the tables: we use the StudentCourse relation to model the association between the 2 relations.
Now add the relationships to your diagram: put a line between the related attributes, with a 1 for the one side and an infinity symbol (∞) for the many side.
Some designers use a different notation: a line with an arrow head pointing to the many side of the relationship.
We can also have a 1-1 relationship, e.g. if we introduce a Teacher relation and assume that one course is taught by one teacher only (and each teacher teaches one course only).
Relationships between relations are important because often we need to extract data from 2 or more tables for the data to be meaningful. For example, if the registrar wants a list showing each student and what courses he/she has registered for, the data must come from the student and studentCourse relations. We can get this data by matching the student_id field in both tables.
This way of combining tables to get data from them is called a join. We can use SQL to join tables.
It is important to understand the relationships between your db tables for extracting meaningful & useful data from them.
We will learn more about relationships when we look at E-R modelling.
4.6 Keys
In a relation, each tuple (row) represents an instance of the real-world entity that the relation models.
We need some way to distinguish the instances from each other: the values of the attributes of an instance must uniquely identify that instance.
Put another way: no two instances (tuples/rows) can have exactly the same values for all the attributes.
A key is an attribute or set of attributes in a relation that uniquely identifies each tuple in the relation.
4.6.1 Superkeys
Look at the Student-Schema relation schema. If we take the combination of all attributes, it would uniquely identify each row. Or we could take the combination of student_id, student_firstname, student_fathersname. But we could not use student_firstname, student_fathersname on their own, as different students could have the same name.
Each of the first two combinations is called a superkey.
A superkey is an attribute or combination of attributes containing unique values for each tuple in the relation.
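The superkey test can be sketched in a few lines of Python. This is only an illustration: the function name is_superkey is made up for these notes, and student 104 is a hypothetical row added so that two students share a name. Note also that a real superkey is a property of the schema (it must hold for every possible instance), whereas this sketch can only check one set of sample rows.

```python
# Sample student rows; 104 is a hypothetical extra student with a repeated name.
students = [
    {"student_id": 100, "student_firstname": "Sara", "student_fathersname": "Negash"},
    {"student_id": 101, "student_firstname": "Tekle", "student_fathersname": "Haimanot"},
    {"student_id": 104, "student_firstname": "Sara", "student_fathersname": "Negash"},
]

def is_superkey(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

all_attrs = ["student_id", "student_firstname", "student_fathersname"]
name_only = ["student_firstname", "student_fathersname"]

print(is_superkey(students, all_attrs))       # unique: student_id differs in every row
print(is_superkey(students, ["student_id"]))  # unique on its own
print(is_superkey(students, name_only))       # not unique: two students named Sara Negash
```

The check works by collecting the attribute values of each row into a set; if the set is smaller than the number of rows, at least two rows collided.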
4.6.2 Candidate Keys
If we take away student_fathersname from the student_id, student_firstname, student_fathersname combination, we still have a superkey.
If we then take away student_firstname, we have just student_id left, and this still uniquely identifies each tuple.
There is no proper sub-set of this set of attributes that is itself a superkey: we have reduced the superkey as much as we can.
What we are left with is called a candidate key.
A candidate key is a superkey for which no proper subset is itself a superkey.
We can also say that a candidate key is a minimal superkey: it is minimal if removing any attribute makes it no longer unique.
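As a rough sketch of "reducing a superkey as much as we can", the Python below tries attribute combinations from smallest to largest and keeps only the minimal ones. The names is_superkey and candidate_keys are invented for this illustration, student 104 is a hypothetical row, and the result only reflects the sample rows given, not every possible instance of the schema.

```python
from itertools import combinations

# Sample student rows; 104 is a hypothetical extra student with a repeated name.
students = [
    {"student_id": 100, "student_firstname": "Sara", "student_fathersname": "Negash"},
    {"student_id": 101, "student_firstname": "Tekle", "student_fathersname": "Haimanot"},
    {"student_id": 104, "student_firstname": "Sara", "student_fathersname": "Negash"},
]

def is_superkey(rows, attrs):
    """True if the values of attrs are different in every row."""
    return len({tuple(r[a] for a in attrs) for r in rows}) == len(rows)

def candidate_keys(rows):
    """Return the minimal superkeys for this sample of rows."""
    attrs = list(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):          # smallest combinations first
        for combo in combinations(attrs, size):
            # Skip anything that contains a key already found: it is not minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys

print(candidate_keys(students))
```

Because smaller combinations are tested first, any superkey that still contains a smaller superkey is skipped, which is exactly the minimality condition in the definition.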
Look at the Programme-Schema. Starting with all the attributes as a superkey, can you identify one or more candidate keys? Assume that no two Programmes have the same Programme name.
A: 2 candidate keys, Programme_Code and Prog_Name, because each can uniquely identify a programme. The combination of both is a superkey but not a candidate key because it has sub-sets (Programme_Code & Prog_Name) that are superkeys.
4.6.3 Primary key
Now we have identified 2 candidate keys for the Programme-Schema.
When designing a relational database, you (as the designer) must choose one of these to be the Primary Key for the relation.
Question: Which would you choose for this relation?
See what the students choose. Find someone who chooses Programme_Code and ask them why they chose it.
The reason is that the code is less likely to change over time, whereas the university might change programme names. A good candidate for primary key is an attribute whose values are least likely to have to be changed over time.
A primary key is a candidate key chosen to be the main way to uniquely identify tuples in the relation. It represents a constraint in the real world that the database models. For example, in choosing Programme_Code as the primary key, we are reflecting the fact that the Code must be unique for every Programme. Likewise for Student_ID.
The decision is up to the designer, but you should put some thought into it. Sometimes it is obvious what the primary key should be, e.g. in the Student-Schema. But sometimes it isn't, e.g. in Student-Course-Schema.
After you have some experience of designing databases, you will not often think about superkeys and candidate keys, as your experience will guide you and you will be able to tell quite quickly what should be the primary key!
A primary key that consists of more than one attribute is a composite primary key.
Now, on your diagram of the relation schema, underline the primary keys in each relation.
4.7 Constraints
I mentioned earlier that a primary key reflects a constraint in the real world of the database.
A constraint is a rule that restricts the possible values that can go into a relation (table) in a relational database.
Besides keys, there are some other types of constraint in the relational model.
Some constraints are value-based, some are value-neutral.
Value-based: comparison of an attribute value to some constant value, e.g. CreditHours >= 0 to reflect that a course's credit hours must be greater than or equal to 0.
Value-neutral: comparison of attribute values to other attribute values. For example, if we had start date and finish date attributes for a Student, the finish date should be later than the start date.
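One way to see both kinds of constraint in action is with SQL CHECK constraints. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table enrolment_period, its columns and the helper try_insert are hypothetical names invented for the illustration, and dates are stored as ISO text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE course (
        course_code  TEXT PRIMARY KEY,
        credit_hours INTEGER CHECK (credit_hours >= 0)   -- value-based: compare to a constant
    )""")
conn.execute("""
    CREATE TABLE enrolment_period (                       -- hypothetical table
        student_id  INTEGER PRIMARY KEY,
        start_date  TEXT,
        finish_date TEXT,
        CHECK (finish_date > start_date)                  -- value-neutral: compare two attributes
    )""")

def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

ok_course = try_insert("INSERT INTO course VALUES (?, ?)", ("ICT252", 4))
bad_course = try_insert("INSERT INTO course VALUES (?, ?)", ("BAD01", -1))
ok_period = try_insert("INSERT INTO enrolment_period VALUES (?, ?, ?)",
                       (100, "2020-09-01", "2023-06-30"))
bad_period = try_insert("INSERT INTO enrolment_period VALUES (?, ?, ?)",
                        (101, "2023-06-30", "2020-09-01"))
print(ok_course, bad_course, ok_period, bad_period)
```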
Q: Keys are a form of constraint. Do you think they are value-based or value-neutral?
A: value-neutral, because they compare attribute values in a given column to other values in the same column.
We are going to look at these types of constraint:
Functional dependency
Entity integrity
Referential integrity
Triggers
4.7.1 Functional Dependencies
[cover as part of normalisation]
4.7.2 Entity Integrity
Entity integrity means that each relation must have an attribute or combination of attributes whose values uniquely identify each tuple in the relation.
In other words, no two tuples in the relation can have the same value for that attribute or combination of attributes.
This is to ensure that entities from the real world are uniquely identified in the database, e.g. students, courses, programmes.
Entity integrity is enforced with primary keys: no two tuples in a relation can contain the same values for the primary key attribute(s).
Also, the primary key of a relation cannot have a null value in any tuple, so primary key attributes must have an additional constraint that does not allow nulls.
If nulls were allowed, then two or more rows could have the null value, which violates the entity integrity constraint.
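A small sketch of entity integrity being enforced, using Python's built-in sqlite3 module. The helper name rejected is made up for this example. NOT NULL is declared explicitly alongside PRIMARY KEY, which matches the "additional constraint" described above (SQLite in particular does not reject nulls in a non-integer primary key unless NOT NULL is stated).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE programme (
        programme_code TEXT NOT NULL PRIMARY KEY,   -- unique and never null
        prog_name      TEXT
    )""")
conn.execute("INSERT INTO programme VALUES ('CSDEG', 'Computer Science Degree')")

def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A second tuple with the same primary key value is refused.
duplicate_rejected = rejected("INSERT INTO programme VALUES ('CSDEG', 'Another name')")
# A tuple with a null primary key is refused.
null_rejected = rejected("INSERT INTO programme VALUES (NULL, 'No code')")
print(duplicate_rejected, null_rejected)
```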
4.7.3 Referential Integrity
Look back at the related tables student and studentCourse.
student_id is the primary key in the student relation.
We know that if we look at studentCourse, the values for student_id match values in student.student_id (point out this dot notation, table.column / relation.attribute, to the students).
In fact, we require that the values in studentCourse.student_id match values in student.student_id. This is referential integrity, where the values in columns of one table must match values in columns of other tables.
Referential integrity is enforced using another type of key: a foreign key.
student_id is a foreign key in the studentCourse relation.
A formal definition for a foreign key:
A relation r1 can have an attribute that is the primary key of another relation, r2. This is a foreign key from r1, referencing r2.
r1 is the referencing relation. r2 is the referenced relation.
In a database instance, given any tuple, say t_a, from the r1 relation, there must be some tuple, t_b, in the r2 relation where the value of the foreign key attribute of t_a is the same as the value of the primary key attribute of t_b.
The value of the foreign key attribute of t_a can be null also.
However, the db designer can decide whether or not to allow nulls in the foreign key attribute. It depends on the usage in the real world.
Q: For example, in the studentCourse relation, does it make sense to allow student_id or course_code to be null?
A: no, because both are part of the primary key, and the primary key cannot be null.
Q: What about in the student relation: could the programme_code be null?
A: depends on usage. Can a student be registered but not have selected a programme? I think no, as the student would have to choose a programme. They can always change it later. If you allow nulls here, you could end up with data that shows students that are not registered for a programme but that are registered for courses.
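The foreign key rule can be tried out with Python's built-in sqlite3 module. This is a sketch under a few assumptions: the tables are cut down to the columns needed, student 999 is a made-up id that does not exist, and foreign key checking has to be switched on with a PRAGMA because SQLite leaves it off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when this is on
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_firstname TEXT)")
conn.execute("""
    CREATE TABLE studentCourse (
        student_id  INTEGER NOT NULL REFERENCES student(student_id),  -- foreign key
        course_code TEXT NOT NULL,
        PRIMARY KEY (student_id, course_code)                         -- composite primary key
    )""")
conn.execute("INSERT INTO student VALUES (100, 'Sara')")
conn.execute("INSERT INTO studentCourse VALUES (100, 'ICT252')")      # matches student 100

try:
    # There is no student 999, so this tuple would break referential integrity.
    conn.execute("INSERT INTO studentCourse VALUES (999, 'ICT252')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

Here studentCourse is the referencing relation (r1) and student is the referenced relation (r2).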
4.7.4 Triggers
Many DBMSs include a capability to define rules that are processed (or triggered) when certain events occur.
For example, if the balance on a current account becomes negative (less than 0), the account should be marked as being overdrawn. This could be modelled with an attribute named Overdrawn in the Account relation, which can have the values true or false (or yes/no).
We want the database to automatically update the Overdrawn column when the balance changes from being non-negative to negative or vice-versa.
This is called a trigger in the database.
This is not strictly a constraint in the relational model, but it is a feature that has been implemented in many RDBMS packages.
The trigger is a rule defined in the database itself. The rule for this example would be something like this:
After a row in the Account relation has been updated
FOR EACH updated row
    IF new_balance < 0 AND old_balance >= 0 THEN
        set Overdrawn = true
    ELSE IF old_balance < 0 AND new_balance >= 0 THEN
        set Overdrawn = false
    END IF
END FOR
A trigger has 3 parts to it:
Event (Account is updated; can be any event, e.g. delete or insert)
Condition (balance changes from being non-negative to negative or vice-versa; can look at the values in the row before and after the event occurred)
Actions (set the Overdrawn flag; can carry out any SQL action, e.g. insert to another table)
A trigger can be created using SQL and it is then part of the database schema, like tables, columns and other objects in the database.
As a database designer or programmer, you should be careful in your use of triggers, as they can slow down the operation of the database: the trigger will run every time a row or group of rows in the table is updated.
Sometimes, you can find another way of carrying out the action, so think about it first and use a trigger only if you cannot find another way of doing it.
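As an illustration of the event / condition / action parts, here is one possible version of the overdrawn rule written as SQLite triggers and run from Python's built-in sqlite3 module. The table layout, the column account_no and the trigger names are assumptions made for this sketch; other DBMSs use somewhat different trigger syntax. The flag is stored as 0/1 because SQLite has no separate boolean type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        balance    REAL NOT NULL,
        overdrawn  INTEGER NOT NULL DEFAULT 0   -- 0 = false, 1 = true
    );

    -- Event: balance is updated. Condition: the WHEN clause. Action: the UPDATE.
    CREATE TRIGGER mark_overdrawn
    AFTER UPDATE OF balance ON account
    FOR EACH ROW
    WHEN NEW.balance < 0 AND OLD.balance >= 0
    BEGIN
        UPDATE account SET overdrawn = 1 WHERE account_no = NEW.account_no;
    END;

    CREATE TRIGGER clear_overdrawn
    AFTER UPDATE OF balance ON account
    FOR EACH ROW
    WHEN NEW.balance >= 0 AND OLD.balance < 0
    BEGIN
        UPDATE account SET overdrawn = 0 WHERE account_no = NEW.account_no;
    END;

    INSERT INTO account (account_no, balance) VALUES (1, 50.0);
""")

def overdrawn(account_no):
    """Read the current overdrawn flag for one account."""
    row = conn.execute("SELECT overdrawn FROM account WHERE account_no = ?",
                       (account_no,)).fetchone()
    return row[0]

conn.execute("UPDATE account SET balance = -20.0 WHERE account_no = 1")
after_withdrawal = overdrawn(1)
conn.execute("UPDATE account SET balance = 10.0 WHERE account_no = 1")
after_deposit = overdrawn(1)
print(after_withdrawal, after_deposit)
```

The triggers fire only on updates of the balance column, so the UPDATE of the overdrawn flag inside the trigger body does not set them off again.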
5 Relational Algebra
Objective: to learn the basic and additional operations of relational algebra, as these form the basis for SQL. This is done by looking at each operation, doing some examples and giving the class exercises to do the operations themselves.
Preparation: duplicate Handout 3, which is a short worksheet on relational algebra. Give to students after the basic operations have been covered. They should do the exercises outside of class, and the instructor can briefly run through the solutions at the beginning of the next class.
We have looked at the relational model:
Relations/attributes/tuples (tables/columns/records)
Domains for attributes
Database schemas
Keys
Constraints
As a foundation for learning SQL, we will take some time to look at some of the operators of relational algebra, as SQL is based on it.
Relational algebra has operators that operate on relations.
It is procedural in nature: the operators generally take 1 or 2 relations as input and produce a new relation as the output.
SQL is based on relational algebra, but is itself mostly a declarative language.
The basic operations are:
Select
Project
Union
Set difference
Cartesian product
Rename
Some others, which are themselves defined in terms of the basic operations:
Set intersection
Natural join
Division
Assignment
5.1 Basic Operations
Unary operators: operate on one relation (select, project, rename)
Binary operators: operate on pairs of relations (union, set difference, Cartesian product)
The result of an operation is a new relation.
5.1.1 Select
The select operator selects tuples from a relation that satisfy a given predicate (condition).
We use the Greek letter sigma (σ) for the operator.
[Students should have to hand the handout showing the student, course, programme relations and sample data].
To select tuples from the student relation, where the students are on the CSDEG programme:
σ_{programme_code="CSDEG"}(student)
student is the argument relation
programme_code="CSDEG" is the predicate
The result of this operation is a new relation containing 3 tuples.
The predicate has a comparison operator (>, <, =, ≠ (not equal), ≤, ≥). You can compare values in different attributes or compare an attribute to a constant value.
It can also include logical AND, OR and NOT operators: ∧, ∨, ¬.
Exercises:
Write operations for the following:
Select all courses that have more than 3 credit hours.
A: σ_{credit_hours>3}(course)
Select students that are in the CSDEG programme and whose name is Tekle.
A: σ_{programme_code="CSDEG" ∧ student_firstname="Tekle"}(student)
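To make the select operator concrete, here is a minimal Python sketch that treats a relation as a list of dictionaries (one dictionary per tuple) and the predicate as a function. The function name select and this representation are choices made for the illustration only; the data is the student sample data from the handout.

```python
student = [
    {"student_id": 100, "student_firstname": "Sara",    "student_fathersname": "Negash",   "programme_code": "CSDEG"},
    {"student_id": 101, "student_firstname": "Tekle",   "student_fathersname": "Haimanot", "programme_code": "CSDEG"},
    {"student_id": 102, "student_firstname": "Terhas",  "student_fathersname": "Girma",    "programme_code": "CSDEG"},
    {"student_id": 103, "student_firstname": "Solomon", "student_fathersname": "Kebede",   "programme_code": "CSDIP"},
]

def select(relation, predicate):
    """sigma: keep only the tuples for which predicate(tuple) is true."""
    return [t for t in relation if predicate(t)]

# sigma_{programme_code="CSDEG"}(student)
csdeg = select(student, lambda t: t["programme_code"] == "CSDEG")

# sigma_{programme_code="CSDEG" AND student_firstname="Tekle"}(student)
csdeg_tekle = select(
    student,
    lambda t: t["programme_code"] == "CSDEG" and t["student_firstname"] == "Tekle",
)

print(len(csdeg))
print([t["student_id"] for t in csdeg_tekle])
```

The result of select is again a list of tuples with the same attributes, so it can be fed into another operation.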
5.1.2 Project
The project operator can be used to get a sub-set of the attributes from a relation.
The result is a new relation, containing all the tuples in the operand relation but only the specified attributes (any duplicate tuples that result are eliminated).
We use the Greek letter pi (π) for the operator.
To get only the first name and father's name from the student relation:
π_{student_firstname, student_fathersname}(student)
Because the result of an operation is a relation, we can use the result as a relation input to another operation.
So, if you want to get the first name and father's name for all students on the CSDEG programme, you can combine this operation with the select operation:
π_{student_firstname, student_fathersname}(σ_{programme_code="CSDEG"}(student))
The result of this is a new relation with attributes student_firstname and student_fathersname and containing only those tuples that have CSDEG in the programme_code attribute of the student relation (put another way: names of all students on the CSDEG programme).
This is a relational algebra expression, as it combines different operations.
Exercise:
Write an expression to get the course code and names of courses that have 4 or more credit hours.
A: π_{course_code, course_name}(σ_{credit_hours≥4}(course))
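The exercise answer can be checked with a short Python sketch. As before, relations are modelled as lists of dictionaries, and the function names select and project are just illustrative. The course descriptions are left out to keep the example short.

```python
course = [
    {"course_code": "ICT252", "course_name": "Theory of Databases", "credit_hours": 4},
    {"course_code": "ICT231", "course_name": "Internet & Web Page Development", "credit_hours": 3},
]
programme_codes = [
    {"student_id": 100, "programme_code": "CSDEG"},
    {"student_id": 101, "programme_code": "CSDEG"},
    {"student_id": 103, "programme_code": "CSDIP"},
]

def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """pi: keep only attrs and drop duplicate tuples (a relation is a set)."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

# pi_{course_code, course_name}(sigma_{credit_hours >= 4}(course))
answer = project(select(course, lambda t: t["credit_hours"] >= 4),
                 ["course_code", "course_name"])
# Projection can shrink the number of tuples: two CSDEG rows collapse into one.
distinct_programmes = project(programme_codes, ["programme_code"])

print(answer)
print(distinct_programmes)
```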
Aside: at this point, you should start to see how these operations are used in SQL.
Think of the project and select operators we have just looked at.
Now think of doing a query in a database with these relations in it:
In Access Query Designer you would add the student table, then choose the columns stud_id, stud_firstname, stud_fathersname from the student table, then add the criteria that prog_code must be equal to CSDEG.
The query would look like this if you wrote it in SQL (in the Query Designer, right-click in the tables area, choose SQL View and you can see the SQL for the query. Right-click in the title bar to go back to Query Design.):
    SELECT stud_id, stud_firstname, stud_fathersname
    FROM student
    WHERE prog_code = 'CSDEG'
What part is equivalent to a relational algebra select operation? Can you see an argument relation and a predicate?
A: the FROM clause and the WHERE clause are the argument relation and the predicate.
What part is equivalent to the project operation?
A: the line 'SELECT stud_id, stud_firstname, stud_fathersname'
So, it's a bit confusing that SQL uses the word 'select' for what is actually the project operation. For this reason, some textbooks use a different name for the select operation, restrict, because it is restricting the number of tuples.
Select: a subset of tuples
Project: a subset of attributes
The basic SQL query does selection and projection. You can picture it like this:
student_ID  student_firstname  student_fathersname  programme_code
100         Sara               Negash               CSDEG
101         Tekle              Haimanot             CSDEG
102         Terhas             Girma                CSDEG
103         Solomon            Kebede               CSDIP
Resulting relation looks like this:
student_ID  student_firstname  student_fathersname
100         Sara               Negash
101         Tekle              Haimanot
102         Terhas             Girma
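If you do not have Access to hand, the same query can be tried with Python's built-in sqlite3 module. This is a sketch only: the in-memory database and the ORDER BY clause (added so the rows come back in a predictable order) are not part of the query discussed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE student (
    stud_id INTEGER PRIMARY KEY, stud_firstname TEXT,
    stud_fathersname TEXT, prog_code TEXT)""")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    (100, "Sara", "Negash", "CSDEG"),
    (101, "Tekle", "Haimanot", "CSDEG"),
    (102, "Terhas", "Girma", "CSDEG"),
    (103, "Solomon", "Kebede", "CSDIP"),
])

rows = conn.execute("""
    SELECT stud_id, stud_firstname, stud_fathersname   -- project: which attributes
    FROM student                                        -- argument relation
    WHERE prog_code = 'CSDEG'                           -- select (restrict): which tuples
    ORDER BY stud_id
""").fetchall()

for row in rows:
    print(row)
```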
5.1.3 Union
We use the union operator to get tuples from 2 different relations.
Let us add a teacher relation based on this schema:
Teacher-Schema = (teacher_id, teacher_firstname, teacher_fathersname, teacher_email)
And let us also add an email address attribute for students:
Student-Schema = (student_id, student_firstname, student_fathersname, programme_code, student_email)
Now let us say we want to get a list of all student and teacher names, along with their email addresses, and put all the data into a single relation. If we select from each one, we have 2 relations:
5.1.4 Set Difference
1. Get a relation showing students taking ICT252:
π_{student_id}(σ_{course_code="ICT252"}(studentCourse))
2. Get a relation showing students taking ICT231 (it should be compatible with the first relation):
π_{student_id}(σ_{course_code="ICT231"}(studentCourse))
Now we can get the difference between these two:
π_{student_id}(σ_{course_code="ICT252"}(studentCourse)) − π_{student_id}(σ_{course_code="ICT231"}(studentCourse))
The result is, again, a new relation, with one attribute and a number of tuples.
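With the studentCourse sample data, the difference can be worked out using Python sets, where each tuple of a relation is a Python tuple. The helper name ids_taking is invented for this sketch; it stands for the project-after-select expression used in both steps above.

```python
# studentCourse sample data: (student_id, course_code)
student_course = {
    (100, "ICT252"), (101, "ICT252"), (100, "ICT231"),
    (102, "ICT231"), (103, "ICT252"), (103, "ICT231"),
}

def ids_taking(course_code):
    """pi_{student_id}(sigma_{course_code=...}(studentCourse)) as a set of 1-tuples."""
    return {(sid,) for (sid, code) in student_course if code == course_code}

taking_252 = ids_taking("ICT252")
taking_231 = ids_taking("ICT231")

# Students taking ICT252 but not ICT231.
only_252 = taking_252 - taking_231
print(only_252)
```

Both operands have the same single attribute (student_id), which is what makes them compatible for the set difference.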
5.1.5 Cartesian Product
We already talked about the Cartesian product of domains: remember that a relation is a subset of the Cartesian product of a set of domains.
In the same way, we can combine 2 relations with a Cartesian product operator: the result is a relation that has all the attributes from both relations and whose tuples are all the possible combinations of the tuples from each of the 2 relations.
The operator is ×, e.g.
r1 × r2
We could have the same attribute name in the 2 relations, so we have to make sure we can distinguish between the attributes in the resulting relation.
r = student × programme
The attributes in r are:
(student.stud_id, student.stud_firstname, student.stud_fathersname, student.prog_code, programme.prog_code, programme.prog_name, programme.prog_desc)
We use the relation name as a prefix to indicate which schema each attribute comes from. But if the name occurs in one of the relations only, we can omit the relation name. Here, only prog_code occurs in both, so we have to put the relation names in front of those 2 attributes.
This also means that the argument relations must have different names.
r contains tuples for every possible pair of student & programme tuples, so if we have n tuples in student and m tuples in programme, r contains (n × m) tuples. For example, based on the data in Handout 2, 4 student tuples and 2 programme tuples give 8 tuples:
student × programme
stud_id  stud_firstname  stud_fathersname  student.prog_code  programme.prog_code  prog_name                 prog_desc
100      Sara            Negash            CSDEG              CSDEG                Computer Science Degree   3 Year Degree in Computer Science
100      Sara            Negash            CSDEG              CSDIP                Computer Science Diploma  Diploma in Computer Science
101      Tekle           Haimanot          CSDEG              CSDEG                Computer Science Degree   3 Year Degree in Computer Science
101      Tekle           Haimanot          CSDEG              CSDIP                Computer Science Diploma  Diploma in Computer Science
102      Terhas          Girma             CSDEG              CSDEG                Computer Science Degree   3 Year Degree in Computer Science
102      Terhas          Girma             CSDEG              CSDIP                Computer Science Diploma  Diploma in Computer Science
103      Solomon         Kebede            CSDIP              CSDEG                Computer Science Degree   3 Year Degree in Computer Science
103      Solomon         Kebede            CSDIP              CSDIP                Computer Science Diploma  Diploma in Computer Science
OR give a simple example:
r1 (R1-Schema)
R1-Schema = (A, B, C)
A  B  C
1
2
r2 (R2-Schema)
R2-Schema = (D, E)
We can end up with a lot of tuples in the result, depending on how many tuples are in the argument relations.
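The student × programme product can be reproduced with a short Python sketch. The function name cartesian is made up for the example, and for simplicity it prefixes every attribute with its relation name, even though only prog_code strictly needs the prefix.

```python
from itertools import product

student = [
    {"stud_id": 100, "stud_firstname": "Sara",    "stud_fathersname": "Negash",   "prog_code": "CSDEG"},
    {"stud_id": 101, "stud_firstname": "Tekle",   "stud_fathersname": "Haimanot", "prog_code": "CSDEG"},
    {"stud_id": 102, "stud_firstname": "Terhas",  "stud_fathersname": "Girma",    "prog_code": "CSDEG"},
    {"stud_id": 103, "stud_firstname": "Solomon", "stud_fathersname": "Kebede",   "prog_code": "CSDIP"},
]
programme = [
    {"prog_code": "CSDEG", "prog_name": "Computer Science Degree",
     "prog_desc": "3 Year Degree in Computer Science"},
    {"prog_code": "CSDIP", "prog_name": "Computer Science Diploma",
     "prog_desc": "Diploma in Computer Science"},
]

def cartesian(r, r_name, s, s_name):
    """r x s: pair every tuple of r with every tuple of s, prefixing attribute names."""
    return [
        {**{f"{r_name}.{k}": v for k, v in tr.items()},
         **{f"{s_name}.{k}": v for k, v in ts.items()}}
        for tr, ts in product(r, s)
    ]

r = cartesian(student, "student", programme, "programme")
# Only some of the pairings are meaningful: those where the two prog_code values agree.
matching = [t for t in r if t["student.prog_code"] == t["programme.prog_code"]]

print(len(r), len(matching))
```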
In some tuples, student.prog_code = programme.prog_code, but in others they are not equal:
t[student.prog_code] ≠ t[programme.prog_code]
If we do r1 × r2, the schema for the resulting relation r is a concatenation of the schemas R1 and R2.
What use is this operation?
We can use it to answer questions like 'get a list of all students on the Computer Science Degree programme'. If we do not know the prog_code, we can use the select operator to get only those tuples where the prog_name is 'Computer Science Degree', from the Cartesian product relation.
σ_{student.prog_code=programme.prog_code}( ...
σ_{student.prog_code=programme.prog_code}(σ_{student.stud_id=studentCourse.stud_id}(σ_{course_code="ICT252"}(student × studentCourse)))
5.1.6 Rename
As we write more complex expressions, it would be nice not to have to write out long relation names like 'studentCourse'.
We can rename a result to a shorter name, using the rename operator ρ (rho).
This is also useful if you want to do a Cartesian product of a relation with itself, as the two operands must have different names.
To rename the relation given by an expression E to the name x:
ρ_x(E)
For example:
ρ_{csdeg_students}(σ_{programme_code="CSDEG"}(student))
The operation returns the relation given by the expression, and the relation is named csdeg_students. We can now use that name in further operations.
To find any students named Solomon who are on the CSDEG programme:
σ_{stud_firstname="Solomon"}(ρ_{csdeg_students}(σ_{programme_code="CSDEG"}(student)))
Or, we can simply rename a relation, as a relation is itself a trivial relational algebra expression:
ρ_{student2}(student)
We can also rename the attributes in the relation, using syntax like this:
ρ_{x(A1, A2, ..., An)}(E)
If the expression E has arity n (n attributes), A1 is a name for the first attribute and An is a name for the n-th attribute.
Example: find the highest credit hours (this is a trivial example as we can easily see from the data, but an equivalent question might be 'find the account with the highest balance', which would be more difficult to see).
1. Make a relation containing all courses that do not have the highest credit hours, i.e. where the credit hours value is less than the credit hours value in some other tuple in the relation.
2. Do a set difference between all courses and those that are in the relation made in step 1.
This means we are doing an operation that requires 2 operands but where the 2 operands are based on the same relation.
Step 1:
[use different colours to show each new bit you add to this expression]
σ_{course.credit_hours < ...}
We have to have a different relation containing the same attributes, so we make a new relation that is just the Course relation renamed:
ρ_c(course)
Now we can do a Cartesian product of course with the renamed relation:
σ_{course.credit_hours < c.credit_hours}(course × ρ_c(course))
Now let's project to get just the credit_hours attribute from the course relation, because we have selected those tuples where course.credit_hours is less than c.credit_hours:
π_{course.credit_hours}(σ_{course.credit_hours < c.credit_hours}(course × ρ_c(course)))
Step 2:
Now we have a relation that has one attribute: credit_hours. Each tuple represents some course that has credit_hours less than some other course, i.e. all courses that do not have the highest credit_hours.
So if we now do a set difference of this relation from all credit_hours, whatever is remaining must be the highest credit_hours. Remember that for set difference, the argument relations must be union-compatible.
π_{course.credit_hours}(course) − π_{course.credit_hours}(σ_{course.credit_hours < c.credit_hours}(course × ρ_c(course)))
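The two steps can be followed in a short Python sketch using sets of tuples. The nested loop over c1 and c2 plays the role of the Cartesian product of course with its renamed copy c; the variable names are chosen for this illustration only.

```python
# course sample data, reduced to (course_code, credit_hours)
course = {("ICT252", 4), ("ICT231", 3)}

# Step 1: pair every course tuple c1 with every tuple c2 of the renamed copy,
# keep the pairs where c1 has fewer credit hours, and project c1's credit_hours.
not_highest = {(c1[1],) for c1 in course for c2 in course if c1[1] < c2[1]}

# Step 2: all credit_hours values minus those that are not the highest.
all_hours = {(c[1],) for c in course}
highest = all_hours - not_highest

print(not_highest, highest)
```

Both operands of the final difference are relations with the single attribute credit_hours, so they are union-compatible.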
5.2 Additional Operations
The basic relational algebra operators that we have just looked at are sufficient to express any relational algebra query.
But even with these, some types of common query that we make on relations become complex and lengthy to express.
To make some of these easier, there are some additional operations that simplify some common types of query.
These are:
Set intersection
Natural join
Division
Assignment
Each one of these can also be expressed in terms of the basic operations.
5.2.1 Set Intersection
Suppose we want to find out what students are taking the course ICT252 and the course ICT231.
This can be viewed as the intersection of 2 sets or relations:
π_{stud_id}(σ_{course_code="ICT252"}(studentCourse)) ∩ π_{stud_id}(σ_{course_code="ICT231"}(studentCourse))
∩ is the set intersection operator.
r ∩ s can be written using the basic operations like this:
r − (r − s)
(r − s) gives tuples that are in r but not in s.
If we take those tuples away from r, we are left with only tuples that are in r and in s.
It is easier just to use ∩.
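A quick check of this identity with Python sets, using the student ids from the studentCourse sample data (r holds the ids taking ICT252, s the ids taking ICT231). Python's own & operator is used only to compare against the result built from set difference.

```python
r = {(100,), (101,), (103,)}   # pi_{stud_id}(sigma_{course_code="ICT252"}(studentCourse))
s = {(100,), (102,), (103,)}   # pi_{stud_id}(sigma_{course_code="ICT231"}(studentCourse))

in_r_not_s = r - s                 # tuples in r but not in s
via_difference = r - in_r_not_s    # r - (r - s)
direct = r & s                     # built-in set intersection, for comparison

print(in_r_not_s, via_difference, direct)
```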
5.2.2 Natural Join
Think back to the Cartesian product operation: usually, to get something meaningful from the results, we do a select operation with some predicate on the Cartesian product result, because we are looking for matching values in the tuples.
For example, we did this operation to get a list of students who are taking the Computer Science Degree programme (assuming we do not know the prog_code for it):
π_{stud_firstname, stud_fathersname}(σ_{student.prog_code=programme.prog_code}( ...
π_{stud_firstname, stud_fathersname}(σ_{prog_code="CSDEG"}(student ⋈ studentCourse ⋈ course))
We could write (student ⋈ studentCourse) ⋈ course or student ⋈ (studentCourse ⋈ course) and get the same result, because the order in which the operations are executed does not matter: we say that the natural join is associative as an operation.
2. Using a join instead of set intersection
Remember our example for set intersection: find the IDs of all students who are taking both ICT252 and ICT231 courses:
π_{stud_id}(σ_{course_code="ICT252"}(studentCourse)) ∩ π_{stud_id}(σ_{course_code="ICT231"}(studentCourse))
This can also be expressed as a natural join. Both operands have just the attribute stud_id, so the join keeps the stud_id values that appear in both:
π_{stud_id}(σ_{course_code="ICT252"}(studentCourse)) ⋈ π_{stud_id}(σ_{course_code="ICT231"}(studentCourse))
3. Joining relations that do not have a common attribute
If we join 2 relations that do not have a common attribute, the result of the natural join is the same as that of the Cartesian product:
r ⋈ s = r × s
R ∩ S = ∅ (empty set)
4. Combine a selection and a Cartesian product into a single operation. Earlier, we did this:
π_{stud_firstname, stud_fathersname}( ...
r ⋈_θ s = σ_θ(r × s)
where θ (theta) is a predicate on attributes in the schema R ∪ S.
So we can do:
π_{stud_firstname, stud_fathersname}(student ⋈_{prog_name="Computer Science Degree"} programme)
The ⋈_θ operator is called a theta join.
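A natural join can be sketched in Python by pairing tuples that agree on every attribute name the two relations share. The function name natural_join, the list-of-dictionaries representation and the relation other (with its attribute flag) are assumptions for this illustration; the student and programme rows are cut down to the attributes needed.

```python
student = [
    {"stud_id": 100, "stud_firstname": "Sara",    "prog_code": "CSDEG"},
    {"stud_id": 101, "stud_firstname": "Tekle",   "prog_code": "CSDEG"},
    {"stud_id": 102, "stud_firstname": "Terhas",  "prog_code": "CSDEG"},
    {"stud_id": 103, "stud_firstname": "Solomon", "prog_code": "CSDIP"},
]
programme = [
    {"prog_code": "CSDEG", "prog_name": "Computer Science Degree"},
    {"prog_code": "CSDIP", "prog_name": "Computer Science Diploma"},
]

def natural_join(r, s):
    """Pair tuples of r and s that agree on all the attribute names they share."""
    result = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()
            if all(tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})   # shared attributes appear once
    return result

joined = natural_join(student, programme)
degree_students = [t["stud_firstname"] for t in joined
                   if t["prog_name"] == "Computer Science Degree"]

# With no common attribute the join degenerates into the Cartesian product.
other = [{"flag": 1}, {"flag": 2}]            # hypothetical relation sharing no attribute
no_common = natural_join(student, other)

print(len(joined), degree_students, len(no_common))
```

Note that we found the degree students by programme name without ever writing the prog_code value, which is the situation described at the start of this section.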
5.2.3 Division
The division operation is used to answer queries like 'find courses that are being taken by all students on the CSDEG programme' or 'find students who are taking all courses'.
In contrast, a natural join can find courses that are being taken by any student.
Example: to find courses that are being taken by all students on the CSDEG programme
studentCourse ÷ (π_{stud_id}(σ_{prog_code="CSDEG"}(student)))
Refer to your Handout 2, the studentCourse sample data. But first, add a new tuple to it: (102, ICT252).
studentCourse
stud_id  course_code
100      ICT252
101      ICT252
100      ICT231
102      ICT231
103      ICT252
103      ICT231
102      ICT252
What tuples are in π_{stud_id}(σ_{prog_code="CSDEG"}(student))?
Answer:
stud_id
100
101
102
Now if we 'divide' studentCourse by this, we will get a result that has only the attribute course_code. Is there any course_code that has a tuple for every one of {100, 101, 102}?
A: ICT252.
When dividing, we are looking for values in the studentCourse.course_code column that are associated with every value in the stud_id column of the relation resulting from selecting with the predicate prog_code="CSDEG" on student.
To make it more clear, we should add a projection to the relation on the left, to show what attributes it has:
π_{stud_id, course_code}(studentCourse) ÷ (π_{stud_id}(σ_{prog_code="CSDEG"}(student)))
Formal definition:
r(R) and s(S) are relations with schemas R and S.
S ⊆ R, that is, every attribute in schema S is also in schema R.
The relation r ÷ s is a relation on the schema R − S (all attributes in schema R that are not in schema S).
For this example: the result has only the attribute course_code.
A tuple t is in the resulting relation if these 2 conditions hold:
1. t is in π_{R−S}(r)
2. for every tuple t_s in s, there is a tuple t_r in r that satisfies both of the following:
a. t_r[S] = t_s[S]
b. t_r[R − S] = t
For this example:
Condition 1: π_{R−S}(r) gives all the course_code values in studentCourse: {ICT252, ICT252, ICT231, ICT231, ICT252, ICT231, ICT252}, but duplicate tuples are eliminated by a projection, so we have {ICT252, ICT231}.
Condition 2:
We have {100, 101, 102} in our s relation.
Part a: For every tuple t_s in s, there is a tuple t_r in r that satisfies:
t_r[S] = t_s[S]
In other words, we want to see whether there is a matching row in studentCourse for every stud_id in s (for every course_code value in the set {ICT252, ICT231}).
For every tuple t_s in s: this is the set of tuples {100, 101, 102}.
100: the tuples t_r with t_r[S] = 100 are {(100, ICT252), (100, ICT231)}
101: the tuples t_r with t_r[S] = 101 are {(101, ICT252)}
102: the tuples t_r with t_r[S] = 102 are {(102, ICT231), (102, ICT252)}
To fulfil part (b), we have to take the above tuples and check that t_r[R − S] = t, where t is in {ICT252, ICT231}.
For 100:
(100, ICT252): (ICT252) = (ICT252)
there is a matching tuple t_r[R − S] for every t
For 101:
there is NOT a matching tuple t_r[R − S] for every t: ICT231 does not have a match.
So we must eliminate ICT231 from the set of tuples t in π_{R−S}(r).
For 102:
there is a matching tuple t_r[R − S] for every t
Express r ÷ s using the basic operations
r Y s O
1+S
CrD +
1+S
CC
1+S
CrD ; sD +
1+S=S
CrDD
LetJs brea/ this $onG
1+S
CrD +
1+S
C C
1+S
CrD ; sD +
1+S=S
CrD D
The steps areG
1. <et a** the courseRco$e va*ues
". E*iminate courseRco$e va*ues that $o not have a** possib*e stu$ent+course
combinations in stu$entCourse. %e $o it *i/e thisN.
1+S
CrD ; s
-ives us every possib*e pairin- of courseRco$e an$ stu$Ri$
course;code stud;id
#CT"(" 122
#CT"(" 121
#CT"(" 12"
#CT")1 122
#CT")1 121
#CT")1 12"
1+S=S
CrD
-ives us the tup*es in r= but ith the courseRco$e attribute first= then stu$Ri$ Cto $o set
$ifference= must be union compatib*e same $omain for the i
th
attribute in each
re*ationD.
course;code stud;id
#CT"(" 122
#CT"(" 121
#CT")1 122
#CT")1 12"
#CT"(" 12)
#CT")1 12)
#CT"(" 12"
Set $ifference of the " above *eaves the tup*e C#CT")1= 121D the on*y tup*e in the top
re*ation that is not in the bottom re*ation.
,a-e !1 of 87
FBE Computer Science Department Lecture Notes Theory of Databases
This is -ivin- us the possib*e combinations of tup*es Cfrom the C, of courseRco$e an$
stu$Ri$D that $o not actua**y appear in stu$entCourse.
Do pro3ection
1+S
on this *eaves the courseRco$e on*y C#CT")1D.
No $o set $ifference beteen
1+S
CrD an$ C#CT")1D to find the course(code
values that have all possible combinations existing in stu$entCourse.
The tup*es for
1+S
CrD are C#CT"("= #CT")1D.
The set $ifference is #CT"(".
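The steps above can be checked with a short sketch. This is only an illustration using Python sets, not part of the course material: the function name divide and the intermediate variable names are made up here, and the data is the studentCourse relation used in this example.

```python
# studentCourse (r) as (stud_id, course_code) tuples, and s as the stud_id values.
r = {
    (100, "ICT252"), (101, "ICT252"), (100, "ICT231"), (102, "ICT231"),
    (103, "ICT252"), (103, "ICT231"), (102, "ICT252"),
}
s = {100, 101, 102}


def divide(r, s):
    """Return the course_code values that are paired in r with every stud_id in s."""
    courses = {course for (_, course) in r}                           # projection of r on course_code
    all_pairs = {(course, stud) for course in courses for stud in s}  # Cartesian product with s
    actual_pairs = {(course, stud) for (stud, course) in r}           # r with course_code first
    missing = all_pairs - actual_pairs                                # pairings that never occur in r
    incomplete = {course for (course, _) in missing}                  # courses lacking some student
    return courses - incomplete


result = divide(r, s)
print(result)
```

The set named missing holds only (ICT231, 101), so ICT231 is removed and ICT252 is what remains.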
5.2.4 Assignment
It would be nice if we could assign the result of an operation to a variable and then use that variable in subsequent expressions, like in programming languages.
Well…we can assign an expression to a temporary relation variable, like this:
tempRel1 ← E
This assigns the result of the expression E to the relation variable tempRel1. The variable can then be used in subsequent expressions.
Let's take one of the longer expressions we have used as an example:
Π_{stud_firstname, stud_fathersname}(σ_{student.prog_code = programme.prog_code}( …
Π_{title, type, advance, notes, au_id, au_fname, au_lname}(σ_{title.type='business'}(Title) nat.join TitleAuthor nat.join (σ_{author.fname LIKE 'm%'}(Author)))
Note that the select operations are done before the natural joins - this eliminates some rows before doing the join, making the joins more efficient.
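To see why selecting first helps, here is a rough sketch that counts the row comparisons in each order. Everything in it is an assumption for illustration: the data is invented (100 titles, every tenth one of type 'business'), and it uses a simple nested-loop join rather than the merge join a real DBMS might choose.

```python
# Hypothetical data: 100 titles (every tenth one is 'business') and 100 title-author rows.
titles = [(f"T{i}", "business" if i % 10 == 0 else "other") for i in range(100)]
title_author = [(f"T{i}", f"A{i % 20}") for i in range(100)]


def nested_loop_join(left, right):
    """Join on the first column; also count how many row pairs were compared."""
    comparisons = 0
    joined = []
    for l_row in left:
        for r_row in right:
            comparisons += 1
            if l_row[0] == r_row[0]:
                joined.append(l_row + r_row[1:])
    return joined, comparisons


# Plan 1: join everything, then select type = 'business'.
joined_all, cost_join_first = nested_loop_join(titles, title_author)
plan1 = [row for row in joined_all if row[1] == "business"]

# Plan 2: select type = 'business' first, then join the smaller relation.
business_titles = [row for row in titles if row[1] == "business"]
plan2, cost_select_first = nested_loop_join(business_titles, title_author)

print(cost_join_first, cost_select_first, plan1 == plan2)
```

Both plans return the same rows, but the second one compares far fewer row pairs because the select has already thrown away 90 of the 100 titles.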
Step 3: Access plan evaluation - file operations. The access plan looks like a tree structure - the leaf nodes are the starting points - the tables in the query. It shows what indices will be used. This is one possible access plan.
[Diagram: access plan tree, root at the top, tables at the leaves]
Merge join (get matching rows)
    Sort (au_id)
        Merge join (get matching rows)
            Titles: use B-Tree index on Title_ID to get rows in sequence, filtering to get only those with type='business'
            TitleAuthor: use B-Tree index on Title_ID to get rows in sequence
    Authors: use B-Tree index on Au_id to get rows in sequence, filtering to get only those where au_fname begins with m
Another possible plan for this query might be to first join Authors & TitleAuthor.
The DBMS will evaluate the different plans - each operation type (e.g. read a B-Tree index, sort, merge join) has an associated cost. The DBMS can estimate costs by estimating the number of rows that will result from each operation.
You can see a representation of the plan created by SQL Server in the Query Analyzer.
Before running the query, use Query menu, Show Execution Plan. When you run the query, an extra tab appears in the results pane - Execution Plan. This shows a diagram, similar to the one above. If you point the mouse at a node in the diagram, you see more information about what is happening e.g. exactly what index is being used, what operation is being carried out, the CPU cost etc.
In this case, you will see that the indices on Titles and Authors are clustered - so accessing those in order is fast. The index on TitleAuthor.Title_ID is non-clustered, so it may not be so fast - but as the number of Titles has already been reduced (type='business'), the number of matching rows is small. Only those blocks pointed to by index entries for the remaining Title_ID values need to be accessed.
Step 4: execution of access plan - interpret the plan and execute it.
8.15.3 Use of Statistics
The query optimisation part of the DBMS needs statistics about the data in the database to make good decisions.
I mentioned that in evaluating the access plan, the DBMS estimates the cost of each operation, by estimating the number of rows that will be returned.
It does this by referencing table profiles and statistics such as the number of rows in each table and the distribution of values in a column (e.g. of CustomerName values).
A DBMS like SQL Server keeps these statistics in its system databases. It also updates the statistics regularly, automatically.
The DBMS will also analyse the available indices on a table and try to use the one that gives the lowest cost for a particular operation.
Sometimes, you may find that it does not use the index you expect it to, or want it to.
SQL Server allows you to force it to use a particular index in a query using something called a table hint (use Books Online if you want to find out more).
8.16 Creating Indexes with SQL
You can create indices using SQL queries or with the Enterprise Manager.
In SQL:
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX index_name
ON table_name (column_name [ASC | DESC], …)
Unique: if unique, no 2 rows can have the same value for the index column(s)
Clustered/nonclustered: clustering or not
ASC/DESC: ascending order or descending order of search keys in the index. For a clustered index, this will affect the order of the rows in the data file.
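If you want to try the statement without a SQL Server installation, the sketch below runs it against SQLite through Python's sqlite3 module. Note the assumptions: SQLite accepts UNIQUE and ASC/DESC but has no CLUSTERED / NONCLUSTERED keywords, and the table, column and index names here are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title_id TEXT, title TEXT, type TEXT)")

# A unique index in ascending order, and an ordinary index in descending order.
conn.execute("CREATE UNIQUE INDEX idx_titles_id ON titles (title_id ASC)")
conn.execute("CREATE INDEX idx_titles_type ON titles (type DESC)")

conn.execute("INSERT INTO titles VALUES ('BU1032', 'Database Guide', 'business')")
try:
    # Same title_id again: the unique index should refuse it.
    conn.execute("INSERT INTO titles VALUES ('BU1032', 'Duplicate id', 'business')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list(titles)"))
print(duplicate_rejected, index_names)
```

The second insert fails because the unique index does not allow 2 rows with the same title_id.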
You can also work with indices using the Enterprise Manager - see Handout 8, section 7, under the course Database with SQL Server for details.
In MS Access, indexing is one of the properties of a column - you can specify if the column is to be indexed and if yes, if duplicates are allowed or not and also if it is ascending or descending. You can also use the Indexes button on the toolbar - useful if you want to create an index on two or more columns together.
[During this unit, arrange with the technical assistant to do a lab session on indexing. Give him/her and the students a copy of Handout 10 (Tables & Indices Worksheet) to use in the lab. Ideally near the end of the unit - in the same week as the last few sections above are covered in class.]
8.17 Quiz
Handout 11 is a quiz for students to help check their own learning of this unit.
Either give them each a copy or call out the questions in class.
Give them about 20 minutes to answer the questions, working on their own.
Then get them to swap with the person beside them.
Then you go through the questions and their answers - students can mark each other's papers. Use this as an opportunity to discuss the answers - if students have different answers to the suggested answers, check them and explain why they are not quite right or accept if they are good answers.
The solutions to the quiz are in the doc Handout 11 Indexing Quiz Solutions.doc.
9 SQL
(mostly to be covered in the labs by the course technical assistant)
10 Introduction to Transactions & Concurrency
Objectives: students to know what database transactions are, the ACID properties of a transaction, what concurrency means and to understand the issues involved in concurrency control.
Reading: Mannino, chapter 13; Silberschatz et al chapters 15 & 16.
10.1 Introduction
A transaction usually means an interaction among 2 or more parties in the conduct of business e.g. a customer lodging money into a bank account, a customer withdrawing money from the bank account, the purchase of a car or booking a flight.
In a DBMS, this type of transaction may require several database operations to take place. For example to lodge money into a bank account, the operations that must take place could be:
1. Insert a new record to the AccountHistory table
2. Update the account balance in the Accounts table
To withdraw money, the operations might be:
1. Check the account balance to make sure the withdrawal amount is available
2. Insert a new record to the AccountHistory table
3. Update the account balance in the Accounts table
[If you need another example - transfer money between 2 accounts - need to debit one account, credit the other and update both balances. Will refer back to these examples, so keep on the board if possible.]
So, in a database system, one transaction can involve any number of reads from and writes to the database. Any one of the operations e.g. check the account balance does not make sense to the end user if taken on its own. The end user may ask the system to perform a lodgement or a withdrawal, and does not need to know about the operations that take place.
In other words, a collection of operations together is a single unit from the point of view of the database user.
In database terminology, a transaction is: a sequence of operations performed as a single, logical unit of work.
The operations performed as part of a transaction make changes to the data i.e. they are data modification operations.
At any point in time, there may be many transactions happening. For example, in a bank's system, there are customers coming into different branches and making lodgements and withdrawals at all times during the day. So the system has to be able to handle concurrent transactions (transactions happening at the same time, and using the same database tables).
The key feature of a transaction is that all the operations must succeed. If any one of the operations fails, then the transaction must fail - it must undo any of the operations that did succeed.
For example, if the system removes money from the savings account but fails to add the money to the current account - there will be a problem with the customer's account.
So, database transactions must have ways of checking that all operations have completed.
10.2 ACID Properties
For a logical unit of work to be considered a transaction, it should have certain properties. These properties help to ensure the integrity of data in a database.
They are:
Atomicity
Consistency
Isolation
Durability
These are known as the ACID properties of a transaction (from the first letters).
[write on board but leave room to put extra info for each one - add notes as you go through the following section that describes each property]
Atomicity
All data changes made by the operations are reflected in the database or none of them are (all data modifications performed or none).
For example to withdraw money from an account requires 3 steps:
1. Check the account balance to make sure the withdrawal amount is available
2. Insert a new record to the AccountHistory table
3. Update the account balance in the Accounts table
If all 3 steps succeed, the transaction is complete - we say it is committed.
Let us suppose the first step succeeds, then the second step fails for some reason - a record cannot be inserted to the AccountHistory table.
In this case, the entire transaction must fail because the account cannot be debited.
If the second step does succeed and then the third step fails - the second step must be undone, because the account cannot be debited without updating the account balance.
If a transaction fails, the process of undoing the previous steps and going back to the initial state is called rolling back the transaction.
Thus, Atomicity means that a DBMS must be capable of recovering transactions if something goes wrong while the operations are being executed. If a transaction is partially completed, it must undo the completed operations.
There are different situations that can cause this. Possible causes are:
- 2 sets of operations (transactions) get into a deadlock because both are waiting to use the same data - eventually one of them will be terminated - this is called transaction recovery.
- a crash of the system - if the DBMS has a system failure while transactions are taking place, after the db is recovered, there may be partially completed transactions (where some of the operations took place but others didn't) - this is called crash recovery.
Consistency
When completed, a transaction must preserve the consistency of a database. This means that after completion, all data must be in a correct state and comply with all data constraints and validation rules.
e.g. for a bank transaction that transfers money between 2 customer accounts - this would mean that the total amount of money recorded in the customers' accounts should still be the same.
Consistency also means that data integrity in the database must be maintained. This means avoiding the situation where a transaction can read 'dirty data'. Dirty data is data that has been changed but not yet committed to the database.
To help ensure consistency, the DBMS must be able to deal with concurrent transactions. We will talk more about this later.
Isolation
In summary, this means that any transaction must be unaware of other transactions executing in the system concurrently (meaning at the same time).
No other transactions or elements of the database can see the changes resulting from a transaction until the transaction completes. Other transactions should see the data in the state it was in before the transaction or after it completes - not in between.
Or, in other words, a transaction must see a consistent database - a transaction cannot read or write data that is being modified by another transaction.
Possible consequences of transactions that are not isolated:
- Lost updates - an update by one user overwrites an update by another user
- Uncommitted dependency (also known as a dirty read) - when one transaction reads data written by another transaction before that other transaction commits
- Phantom Rows - when a transaction A reads data rows, then another transaction B does something to update the rows, then transaction A carries out the same read again but gets different records from the first time.
We will talk more about each of these when we cover Concurrency Control.
Durability
After a transaction completes successfully, the changes it has made in the database will persist (remain) even if there is a system failure (e.g. a disk crash). In other words the changes must be permanent and cannot be erased from the database (after they are committed).
e.g. for banking - if the system crashes, the customer will still see that she moved B100 to the current account. Or if customer X transferred B1000 to the account of customer Y - the money must still show as having been transferred.
10.3 DBMS Services for Transactions
A DBMS should provide services to help meet the ACID properties of transactions.
SQL also has some features that give the db designer control over transactions.
10.3.1 Atomicity
SQL has statements to begin/commit/rollback transactions.
The DBMS automatically carries out transaction & crash recovery.
10.3.1.1 SQL statements
If there is a sequence of operations that form a transaction, the programmer should begin a transaction before the first one and commit the transaction after the last one.
You also have to put logic in the code to make all the operations fail if any one of them fails - SQL has a statement to roll back a transaction also.
Begin transaction
Operation 1
If failure
    Rollback transaction
    Exit
End if
Operation 2
If failure
    Rollback transaction
    Exit
End if
Commit transaction
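The same pattern can be tried out with a runnable sketch. This one uses Python's sqlite3 module rather than SQL Server, so treat it as an illustration only: the column names, the withdraw function and its simulate_failure switch are all invented here, and the 3 steps are the withdrawal steps listed earlier.

```python
import sqlite3


def withdraw(conn, account_id, amount, simulate_failure=False):
    """Run the three withdrawal steps as one transaction; return True if committed."""
    conn.execute("BEGIN")
    try:
        # Step 1: check the balance.
        rows = conn.execute(
            "SELECT balance FROM Accounts WHERE account_id = ?", (account_id,)
        ).fetchall()
        if not rows or amount > rows[0][0]:
            raise ValueError("withdrawal amount not available")
        # Step 2: record the withdrawal.
        conn.execute(
            "INSERT INTO AccountHistory (account_id, amount) VALUES (?, ?)",
            (account_id, -amount),
        )
        if simulate_failure:  # hypothetical switch to force a failure before step 3
            raise RuntimeError("simulated failure")
        # Step 3: update the balance.
        conn.execute(
            "UPDATE Accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, account_id),
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")  # undo any steps that did succeed
        return False


# isolation_level=None stops the sqlite3 module from issuing BEGIN/COMMIT itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Accounts (account_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE AccountHistory (account_id INTEGER, amount INTEGER)")
conn.execute("INSERT INTO Accounts VALUES (1, 100)")

first_committed = withdraw(conn, 1, 30)
second_committed = withdraw(conn, 1, 30, simulate_failure=True)
balance = conn.execute("SELECT balance FROM Accounts").fetchone()[0]
history_rows = conn.execute("SELECT COUNT(*) FROM AccountHistory").fetchone()[0]
print(first_committed, second_committed, balance, history_rows)
```

In the second call the history record is inserted and then rolled back, so only the first withdrawal shows in either table.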
10.3.2 Consistency
Data integrity constraints are checked when updates are made.
The relational model has built-in integrity rules like entity integrity (primary key constraint, uniqueness constraint), referential integrity (foreign key constraint), check constraints (validation rules), null checks.
When a database operation is carried out, the DBMS checks that all the changed or added data still meets all the constraints on it. If it does not, an error occurs.
In SQL Server, certain SQL statements e.g. CREATE TABLE, INSERT are automatically rolled back if an error occurs.
For other statements or combinations of statements, the programmer can check for such errors and roll back the transaction if an error occurs.
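As a small illustration of checking for a constraint error and rolling back, here is a sketch in Python with SQLite (not SQL Server). The Accounts table, its CHECK constraint and the transfer function are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Accounts ("
    "account_id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # validation rule: no negative balances
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100), (2, 50)])


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; roll back both if a constraint is violated."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Accounts SET balance = balance + ? WHERE account_id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE Accounts SET balance = balance - ? WHERE account_id = ?", (amount, src)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")  # the credit to dst is undone as well
        return False


transferred = transfer(conn, 1, 2, 500)
balances = dict(conn.execute("SELECT account_id, balance FROM Accounts"))
print(transferred, balances)
```

The debit would make the balance negative, so the DBMS raises an error, the program rolls back, and the total amount of money in the two accounts is unchanged.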
10.3.3 Isolation
Handling of concurrent transactions - concurrency control. A mechanism called locking is used to help control concurrency.
To prevent other transactions accessing a table it is using, a transaction can lock the table. A lock tells other transactions that the table cannot be accessed.
But there are different levels of locking possible - for example, if one step in a transaction is to simply read data from a particular table, there is no need to stop other transactions from reading the table.
We will talk more about this later.
10.3.4 Durability
A database can be backed up regularly. Then if there is a system failure, the database backup can be restored to recover the database.
But…if the backup was taken 3 hours before the failure, the last 3 hours of data changes will not be in it…so the transactions made in that time are not durable.
To address this, the DBMS can keep a log of transactions. A log file can be added to all the time and does not take as much space as a complete backup.
The log is like a list of all the changes that have taken place.
If the backup is restored, then the log file from the time of backup can be applied to the database - this will bring it right up to date, up to the point in time where it crashed.
This is known as a transaction log.
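The idea of restoring a backup and then applying the log can be sketched in a few lines. This is a toy model only: the "database" is a Python dictionary, the account names and amounts are invented, and a real DBMS log records much more than this.

```python
import copy

database = {"savings": 500, "current": 100}
backup = copy.deepcopy(database)  # full backup taken at this point
log = []                          # changes committed after the backup


def commit_change(db, log, account, new_balance):
    """Record the change in the log, then apply it to the database."""
    log.append((account, new_balance))
    db[account] = new_balance


commit_change(database, log, "savings", 400)
commit_change(database, log, "current", 200)

database = None  # simulate losing the data drive; the backup and the log survive

# Recovery: restore the backup, then apply the log from the time of the backup.
restored = copy.deepcopy(backup)
for account, new_balance in log:
    restored[account] = new_balance

print(restored)
```

Without the log, the restored database would still show the balances from the time of the backup.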
[Ask - have you noticed that when you create a database on SQL Server, there are two file locations specified - one is for the data itself and the other is for the log file. They can be put in different drives, so if the data drive crashes, you may still have the log file.]
10.4 Concurrency Control
Concurrent transactions are transactions that are running at the same time. A DBMS should be able to handle this and still maintain the isolation property of transactions.
What isolation actually means:
modifications made by concurrent transactions must be isolated from modifications made by any other concurrent transactions.
Let us suppose there are two transactions, A and B, occurring concurrently and using the same tables, then transaction A should see the data in the state it was in before transaction B is carried out or after transaction B was completed. Transaction A should not see the data in any intermediate state (e.g. after one update but before another one).
This is because a transaction can make changes to data and it can consist of several different operations to make those changes. Suppose Trans A is making changes. If Trans B sees the data while A is executing, it may read data before A changes it. Then A completes, but B has read some data that is now incorrect.
e.g. for banking - cannot have another transaction trying to take money out of the savings account while money is being moved to the current account - what if there will be no money left in the account after the transaction completes?
Let's look at a simple example to show what can happen when transactions run concurrently. Draw a time line (T1-T6) and a series to show 2 transactions running at the same time, operating on the same data.
The text in parentheses is information you (the teacher) can talk about/add after drawing the diagram.
[Possible class activity to do for this - to involve the students and to help your visual and kinaesthetic learners (see footnote 1). You can have students act out the process of the different transactions and what they do.
Make some cards - one saying Transaction A, one saying Transaction B. Get a volunteer to hold up each one, or stick on the wall or board. Then have a student for each transaction. Take the steps written in the table for each transaction at each time (T1, T2 etc) - write them on pieces of card/paper as instructions. Have something to be the database e.g. written on the board or another student holding the value of X in the db. Then hand each 'transaction' student a piece of paper with an instruction on it - give the T1 instructions first, let them do the action and then continue with T2 and so on. Remember that each transaction reads data into its own buffer - so the value it has in its buffer doesn't change if the value in the db changes i.e. if Trans A writes x, Trans B does not get the new value until it does a new read of X.
Hopefully this will help students to understand how concurrent transactions operate and to see how a lost update can occur.]
Footnote 1: One learning theory says that learners are visual, auditory or kinesthetic. In short, this means some people learn best by seeing diagrams, pictures etc; some by listening and some by doing things. Most people are visual or auditory but some students will be kinesthetic. They'll be bored by plain old chalk & talk teaching!
Time | Transaction A | Transaction B
T1 | read X (result: 1) |
T2 | x := X + 50 (result: x=51) | read X (result: 1 - because Trans A has not yet written the new value of x)
T3 | write X (X in db now has value 51) | X := X + 20 (result: 21)
T4 | | write x (X in db now has value 21)
T5 | Commit transaction |
T6 | | Commit transaction
Q: What is the value of X after T6?
A: 21.
Q: What should it be?
A: 1 + 20 + 50 = 71.
If this was your bank account, would you be happy?!
This is called a lost update - the update made by Transaction A was lost.
Uncommitted dependency (or dirty read) example:
Time | Transaction A | Transaction B
T1 | read X (result: 1) |
T2 | x := X - 1 (result: x=0) |
T3 | write X (X in db now has value 0) |
T4 | | read X (result: 0 - because Trans A has now written the new value of x)
T5 | rollback transaction |
After T5, Transaction B has an incorrect value for X - if it performs an operation e.g. to add to X, the result will be incorrect because Transaction A rolled back its operations - putting X back to a value of 1.
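Here is the same schedule as a small Python simulation. It is a sketch under simple assumptions: committed values and Transaction A's uncommitted writes are kept in two separate dictionaries, and the function dirty_read (a name invented here) models a read that is allowed to see uncommitted data.

```python
committed = {"X": 1}
uncommitted = {}  # Transaction A's writes that are not yet committed


def dirty_read(key):
    """Read the latest value, even if it has not been committed."""
    return uncommitted.get(key, committed[key])


a_x = committed["X"]     # T1: A reads X (1)
a_x = a_x - 1            # T2: A computes 0
uncommitted["X"] = a_x   # T3: A writes 0, not yet committed
b_x = dirty_read("X")    # T4: B reads the uncommitted value
uncommitted.clear()      # T5: A rolls back, X is 1 again

value_in_db = committed["X"]
print(b_x, value_in_db)
```

B is left holding 0 while the database says 1, so anything B calculates from its copy will be wrong.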
Incorrect summary (phantom rows) example:
Occurs when Transaction A is updating data while Transaction B is reading the data to calculate a summary.
Time | Transaction A | Transaction B
T1 | read X (result: 1) |
T2 | x := X - 1 (result: x=0) |
T3 | write X (X in db now has value 0) |
T4 | | read X (result: 0 - because Trans A has now written the new value of x)
T5 | | sum = sum + X
T6 | | read Y (result: 3)
T7 | | Sum = sum + Y (result: 0+3 = 3)
T8 | Read Y (result: 3) |
T9 | Y = y - 1 (result: 2) |
T10 | Write y (result y=2 in the db) |
Here, Transaction B has read X after Transaction A updated it, but read Y before Transaction A updated it.
So Transaction B has inconsistent data - remember that isolation means that Transaction B should see the data in the state it was in before Transaction A modified it, or after it modified it - not in between the two states.
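Replaying this schedule in Python shows that the sum B calculates matches neither consistent state. As before this is only a simulation with a dictionary as the database; the starting values X=1 and Y=3 are the ones in the table.

```python
db = {"X": 1, "Y": 3}
total_before = db["X"] + db["Y"]   # what B should see if it ran before A

a_x = db["X"]       # T1: A reads X
a_x = a_x - 1       # T2
db["X"] = a_x       # T3: A writes X
b_sum = 0
b_sum += db["X"]    # T4-T5: B reads X after A's update
b_sum += db["Y"]    # T6-T7: B reads Y before A's update
a_y = db["Y"]       # T8: A reads Y
a_y = a_y - 1       # T9
db["Y"] = a_y       # T10: A writes Y

total_after = db["X"] + db["Y"]    # what B should see if it ran after A
print(total_before, b_sum, total_after)
```

B's sum is a mix of one new value and one old value, so it is correct for neither the state before A nor the state after A.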
10.5 Locks
All of the above problems can be avoided by using a mechanism called locks.
When a transaction is making changes to data in a table, the transaction can get a lock on the table.
While the table is locked by the transaction, other transactions wishing to access the data cannot do so - they must wait until the lock is released.
[You can show how this would prevent any of the previous 3 examples - in each case, Transaction B would have to wait for Transaction A to release its lock before accessing the data.]
This seems like a good solution…now, transactions join a queue (a line in US English) when they want to access data in a particular table.
When the lock is available, it is granted to the next transaction in the queue.
But consider what this means - if there are many transactions concurrently accessing the same data - they have to wait in the queue. This can cause delays in the applications completing their transactions.
So this can reduce the db performance - slowing things down. And with this, transactions are not really concurrent any more - instead they wait and a transaction is the only one updating the data at a given point in time.
This is called serialization - where transactions access data in sequence, forming a queue.
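To see a lock prevent the lost update, here is a sketch using Python threads. It is an analogy, not how a DBMS implements locking: each thread plays the part of a transaction, threading.Lock plays the part of the table lock, and the names table_lock and transaction are invented for the example.

```python
import threading

db = {"X": 1}
table_lock = threading.Lock()


def transaction(amount):
    """Read, modify and write X while holding the lock, so no other thread can interleave."""
    with table_lock:      # wait here until the lock is released by the other transaction
        x = db["X"]
        x = x + amount
        db["X"] = x


a = threading.Thread(target=transaction, args=(50,))
b = threading.Thread(target=transaction, args=(20,))
a.start()
b.start()
a.join()
b.join()

final_value = db["X"]
print(final_value)
```

Whichever transaction gets the lock first, the other has to wait, so both updates are applied and X ends up as 71.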
This is one extreme of isolation level. The other extreme is to allow transactions to read data before it is committed by other transactions i.e. to accept the possibility of uncommitted dependency happening.
In SQL Server, this isolation level is called read uncommitted.
It means the DBA (db administrator) is allowing more transactions to access the same data at the same time, knowing that sometimes this will result in an uncommitted dependency and thus some inconsistent data.
But this may be acceptable if it does not happen often.
It also depends on the data e.g. sensitive data like bank account transactions vs customer address data. A change in the bank account balance should not be read until it is committed; a change to the customer's address does not happen often and could be read before committing. If the change is committed, the effect of having the data before it changed is minimal.
A DBMS will have different levels of isolation. SQL Server has 4 levels.
The choice of isolation level is a trade-off between concurrency and data consistency.
The highest level of isolation, Serializable, gives low concurrency but high consistency.
The lowest level, read uncommitted, gives high concurrency but low consistency.
Draw a diagram like this (don't need the grid lines):
Isolation level                    | Concurrency | Consistency
Serializable (high isolation)      | low         | high
Repeatable Read                    |             |
Read Committed                     |             |
Read Uncommitted (low isolation)   | high        | low
10.6 Lock Types
10.7 Implicit Transactions
We have not been using begin/commit transaction when we write SQL…but SQL Server is smart - it automatically treats each statement as a transaction of its own. It does this for certain statements:
Alter, create, drop, delete, insert, update, select.
Such a transaction occurs automatically. So if you do an insert, if it creates an error (e.g. you try to put a character value into an int column), SQL Server automatically rolls back the transaction.
This default transaction mode is called Autocommit Transactions because it automatically commits if the statement execution is successful and it automatically rolls back if the statement fails.
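The same behaviour can be observed in a small sketch with SQLite in autocommit mode (this is SQLite through Python, not SQL Server, so details differ). The titles table, the CHECK constraint that stands in for an int column, and the extra title_id values are assumptions for the example.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE titles ("
    "title_id TEXT PRIMARY KEY, "
    "price INTEGER CHECK (typeof(price) = 'integer'))"  # reject character values
)

conn.execute("INSERT INTO titles VALUES ('BU1032', 20)")  # succeeds: committed at once

try:
    # One statement, two rows: the second row has a character value for price.
    conn.execute("INSERT INTO titles VALUES ('BU2075', 15), ('BU7832', 'abc')")
    statement_failed = False
except sqlite3.IntegrityError:
    statement_failed = True

rows = [r[0] for r in conn.execute("SELECT title_id FROM titles ORDER BY title_id")]
print(statement_failed, rows)
```

The failing statement is undone as a whole, so not even its first (valid) row is left in the table.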
When you connect to the SQL Server through the Query Analyzer, you can choose a different Transaction Mode. Others are Implicit Transactions and Explicit Transactions.
[get explanations from Handout 7 pages 5/6]
Implicit Transactions:
Explicit Transactions:
10.8 Demo in Lab
To demonstrate:
On one client, change the Implicit_Transactions setting to ON
Start an implicit transaction to update the name of a title:
update titles set title = 'The Busy Executive''s Database Guide updated by xxx'
where title_id = 'BU1032'
(where xxx is the student's user name).
If other clients now try to select from the titles table, they should find that the query is taking some time because the transaction has a lock on the table.
Commit the transaction on the first client:
Commit tran
Now the select on the other clients should complete.
11 Introduction to Security
Types of Database - didn't cover earlier in great detail
Could do:
Analytic (OLAP)
Operational (OLTP)