Normalization

Uploaded by

Prit Patel
Copyright
© © All Rights Reserved
Normalization

Unit-IV
Topics to be covered
• Functional Dependency
• Definition and types of FD
• Closure of FD set
• Closure of attribute set
• Irreducible set of FD
• Normalization and normal forms
• 1NF
• 2NF
• 3NF
• BCNF
• 4NF
• 5NF

2
What is Functional Dependency?
• Let R be a relation schema having n attributes A1, A2, A3, …, An.

Student
RollNo  Name  SPI  BL
101     Raj   8    0
102     Meet  7    1

• Let X and Y be two subsets of the attributes of relation R.
• If the values of the X component of a tuple uniquely (or functionally) determine the values of the Y component, then there is a functional dependency from X to Y.
• This is denoted by X → Y (e.g. RollNo → Name, SPI, BL).
• It is read as: Y is functionally dependent on X, or X determines Y.

3
Diagrammatic representation
[Figure: dependency diagrams for X → Y, {X1, X2} → Y and X → {Y1, Y2}]

• Example
• Consider the relation Account(account_no, balance, branch).
• account_no can determine balance and branch.
• So, there is a functional dependency from account_no to balance and branch.
• This can be denoted by account_no → {balance, branch}.

[Figure: dependency diagram for account_no → {balance, branch}]

4
• An FD is a constraint between two sets of attributes in a relation of a database.
• An FD is a generalization of the concept of a key.
• X → Y is read as "X determines Y" or "X decides Y".
• For a single value of the LHS we get a single value of the RHS.

Rollno  Name
101     John
102     Mike
103     John

Rollno → Name : Yes
Name → Rollno : No (John appears with both 101 and 103)

A   B   C   D
A1  B1  C1  D1
A1  B2  C2  D1
A2  B2  C1  D2
A3  B3  C2  D2
A4  B4  C4  D4
A5  B3  C3  D3

A → B : N    B → A : N    C → A : N
A → C : N    B → C : N    C → B : N
A → D : Y    B → D : N    C → D : N

Composite left-hand sides (AB → C, AB → D, BC → A, …) are checked the same way.
So, with the help of this FD, can we determine every other attribute uniquely?
5
Application of FD
 We can determine additional FD
 We can identify key(pk,sk,ck…)
 Equivalence of FD
 Minimal FD set(we can represent the same information with less no of FD)

Example 1: Given the following relation instance.


X Y Z
-------
1 4 2
1 5 3
1 6 3
3 2 2

Which of the following functional dependencies are satisfied by the instance?


(a) XY -> Z and Z -> Y
(b) YZ -> X and Y -> Z
(c) YZ -> X and X -> Z
(d) XZ -> Y and Y -> X
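
The check asked for in this example can be sketched in a few lines of Python. This is only an illustration: the helper name `satisfies` and the dictionary-per-row encoding are assumptions, not part of the slides. An instance satisfies X → Y when no two rows agree on X but differ on Y.

```python
def satisfies(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True

# The instance from Example 1 (attribute names are single letters)
rows = [
    {"X": 1, "Y": 4, "Z": 2},
    {"X": 1, "Y": 5, "Z": 3},
    {"X": 1, "Y": 6, "Z": 3},
    {"X": 3, "Y": 2, "Z": 2},
]

options = {
    "a": [("XY", "Z"), ("Z", "Y")],
    "b": [("YZ", "X"), ("Y", "Z")],
    "c": [("YZ", "X"), ("X", "Z")],
    "d": [("XZ", "Y"), ("Y", "X")],
}
for label, fds in options.items():
    print(label, all(satisfies(rows, lhs, rhs) for lhs, rhs in fds))
```

On this instance only option (b) passes both checks. Note that an instance can only refute an FD; it cannot prove that the FD holds for the schema in general.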
6
Types of Functional Dependencies
1. Trivial Functional Dependency
• X → Y is trivial FD if Y is a subset of X
• Eg. {Roll_No, Department_Name} → Roll_No
2. Nontrivial Functional Dependency
• X → Y is nontrivial FD if Y is not a subset of X
• Eg. {Roll_No, Department_Name} → Student_Name
• {Thus, if there exists at least one attribute in the RHS of a functional dependency that is not a part of LHS, then it is called as a
non-trivial functional dependency}

7
Armstrong's axioms (inference rules)
 Armstrong's axioms are a set of rules used to infer (derive) all the functional
dependencies on a relational database.

8
Armstrong's axioms (inference rules)
1. Reflexivity
– If B is a subset of A then A → B always holds
2. Augmentation
– If A → B then AC → BC always holds
3. Transitivity
– If A → B and B → C then A → C always holds
4. Pseudo Transitivity
– If A → B and BD → C then AD → C
5. Self-determination
– A → A
6. Decomposition
– If A → BC then A → B and A → C
7. Union
– If A → B and A → C then A → BC
8. Composition
– If A → B and C → D then AC → BD

9
Rules for Functional Dependency-
Rule-01: A functional dependency X → Y will always hold if all the values of X are unique (different)
irrespective of the values of Y.

The following functional dependencies will always hold since all the values of attribute ‘A’ are unique-
• A→B
• A → BC
• A → CD
• A → BCD
• A → DE
• A → BCDE

A → Any combination of attributes A, B, C, D, E

10
Rules for Functional Dependency-
Rule-02: A functional dependency X → Y will always hold if all the values of Y are same irrespective of the
values of X.
The following functional dependencies will always hold since all
the values of attribute ‘C’ are same-
• A→C
• AB → C
• ABDE → C
• DE → C
• AE → C

In general, we can say the following functional dependency will always hold true-

• Any combination of attributes A, B, C, D, E → C


 In general, a functional dependency α → β always holds-
 If either all values of α are unique or if all values of β are same or both.

11
What is closure of a set of FDs?
• The closure of a set of functional dependencies F is the complete set of all functional dependencies that can be derived from F using the inference rules known as Armstrong's axioms.
• If F is a set of functional dependencies, then its closure is denoted by {F}+ or F+.

12
Example of closure of a set of FDs
• Suppose we are given a relation schema R(A, B, C, G, H, I) and the set of functional dependencies:
  F = {A → B, A → C, CG → H, CG → I, B → H}
• The functional dependency A → H is logically implied.
  We have A → B and B → H, so by the transitivity rule A → H.
• The functional dependency CG → HI is logically implied.
  We have CG → H and CG → I, so by the union rule CG → HI.
• The functional dependency AG → I is logically implied.
  We have A → C and CG → I, so by the pseudo-transitivity rule AG → I.
• The same dependency AG → I can also be derived in two steps:
  From A → C, by the augmentation rule, AG → CG.
  From AG → CG and CG → I, by the transitivity rule, AG → I.
16
What is a closure of attribute sets?
• Given a set of attributes X, the closure of X under F is the set of attributes that are functionally determined by X under F.
• It is denoted by X+.

17
Algorithm to find closure of attribute sets
Input : A set of attributes X and a set F of FDs for relation R.
Steps:
1. X+ = X //initialize X+ to X
2. Repeat
For each FD : Y -> Z in F Do
If Y ⊆ X+ Then //If Y is contained in X+
X+ = X+ ∪ Z //add Z to X+
End If
End For
Until X+ does not change //a single pass can miss FDs that become applicable later
3. Return X+ //Return closure of X

Output : Closure X+ of X under F
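
A minimal Python sketch of the closure algorithm is given below. The function name `closure` and the encoding of each FD as a `(lhs, rhs)` pair of strings with single-letter attribute names are assumptions made for illustration. The outer loop repeats the pass over F until nothing new is added.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until a full pass over the FDs adds nothing new
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F = {A → B, A → C, CG → H, CG → I, B → H} on R(A, B, C, G, H, I)
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(sorted(closure("AG", F)))  # ['A', 'B', 'C', 'G', 'H', 'I']
print(sorted(closure("A", F)))   # ['A', 'B', 'C', 'H']
```

An FD X → Y is logically implied by F exactly when Y is contained in X+, so the first result also confirms AG → I.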

18
Examples
Example-1 : Consider the table student_details having (Roll_No, Name, Marks, Location) as the attributes and two functional dependencies.
FD1 : Roll_No → Name, Marks
FD2 : Name → Marks, Location
Now we calculate the closure of each attribute present in the relation using the steps of the algorithm.
{Roll_No}+ = {Roll_No} (initialization)
{Roll_No}+ = {Roll_No, Name, Marks} (using FD1)
{Roll_No}+ = {Roll_No, Name, Marks, Location} (using FD2)
Therefore, the complete closure of Roll_No is :
{Roll_No}+ = {Roll_No, Marks, Name, Location}
Similarly, we can calculate the closure of the other attributes, e.g. "Name".
{Name}+ = {Name}
{Name}+ = {Name, Marks, Location} (using FD2)
{Marks}+ = {Marks}
and
{Location}+ = {Location}

19
Example-2 : Consider a relation R(A, B, C, D, E) having the functional dependencies mentioned below.
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D

{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {B, C}
{D}+ = {D, E}
{E}+ = {D, E}

20
Closure Of Functional Dependency : Calculating Candidate Key
• A candidate key of a relation is a minimal attribute or set of attributes that determines the whole relation: its closure contains all the attributes of the relation, and no proper subset of it has this property.
Example-1 : Consider the relation R(A, B, C) with the given functional dependencies :
FD1 : A → B
FD2 : B → C

{A}+ = {A, B, C}
{B}+ = {B, C}
{C}+ = {C}

Clearly, “A” is the candidate key as, its closure contains all the attributes present in the
relation “R”.

Example-2 : Consider another relation R(A, B, C, D, E) having the functional dependencies :
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D
21
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {C, B}
{D}+ = {E, D}
{E}+ = {E, D}

In this case, no single attribute can determine all the attributes on its own, unlike in the previous example. Here, we need to combine two or more attributes to form a candidate key.

{A, D}+ = {A, B, C, D, E}
{A, E}+ = {A, B, C, D, E}
Hence, "AD" and "AE" are the two candidate keys of the given relation “R”. Any larger combination that contains one of them also determines every attribute, but it carries extraneous attributes and is therefore only a super key.

NOTE : Any relation “R” can have either single or multiple candidate keys.
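
The search for candidate keys can be sketched as a brute-force loop over attribute subsets, smallest first. This is an illustrative sketch only: the names `closure` and `candidate_keys`, and the single-letter attribute encoding, are assumptions. It is exponential in the number of attributes, so it suits classroom-sized schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Try subsets from smallest to largest and keep only the minimal keys."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is only a super key
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

# Example-2: R(A, B, C, D, E) with A → BC, C → B, D → E, E → D
F = [("A", "BC"), ("C", "B"), ("D", "E"), ("E", "D")]
print(["".join(sorted(k)) for k in candidate_keys("ABCDE", F)])  # ['AD', 'AE']
```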

22
Exercise
1. Consider the relation scheme R = {E, F, G, H, I, J, K, L, M, N} and the set of functional dependencies {{E, F} → {G}, {F} → {I, J}, {E, H} → {K, L}, {K} → {M}, {L} → {N}} on R. What is the key for R?

A. {E, F}
B. {E, F, H}
C. {E, F, H, K, L}
D. {E}

2. In a schema with attributes A, B, C, D and E, the following set of functional dependencies is given:
{A → B, A → C, CD → E, B → D, E → A}
Which of the following functional dependencies is NOT implied by the above set?

A. CD → AC
B. BD → CD
C. BC → CD
D. AC → BC

23
What is an extraneous attribute?
• Let us consider a relation R with schema R = (A, B, C) and set of functional dependencies F = { AB → C, A → C }.
• In AB → C, B is an extraneous attribute. The reason is that there is another FD, A → C: since A alone can determine C, the use of B is unnecessary (extra).
• An attribute of a functional dependency is said to be extraneous if we can remove it without changing the closure of the set of functional dependencies.
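
A left-side attribute can be tested for extraneousness with one closure computation: drop the attribute and check whether the remaining left side still determines the right side under F. The sketch below is illustrative; `closure` and `is_extraneous_lhs` are assumed helper names, and attributes are single letters.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_extraneous_lhs(attr, lhs, rhs, fds):
    """True if `attr` can be dropped from the left side of lhs -> rhs."""
    reduced = set(lhs) - {attr}
    return set(rhs) <= closure(reduced, fds)

# F = { AB → C, A → C }
F = [("AB", "C"), ("A", "C")]
print(is_extraneous_lhs("B", "AB", "C", F))  # True: A alone determines C
print(is_extraneous_lhs("A", "AB", "C", F))  # False: B alone does not determine C
```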

24
What is canonical cover?
 A canonical cover of F is a minimal set of functional dependencies equivalent to F, having
no redundant dependencies or redundant parts of dependencies.
 It is denoted by Fc
 A canonical cover for F is a set of dependencies Fc such that
1. F logically implies all dependencies in Fc and
2. Fc logically implies all dependencies in F and
3. Fc is free from all the extraneous functional dependencies
4. Fc is not unique and may be more than one for a given set of functional dependencies.

Need-
 Working with the set containing extraneous functional dependencies increases the computation time.
 Therefore, the given set is reduced by eliminating the useless functional dependencies.
 This reduces the computation time and working with the irreducible set becomes easier

25
Steps to Find Canonical cover
Step-01:
Write the given set of functional dependencies in such a way that each functional dependency contains
exactly one attribute on its right side.

Example-

The functional dependency X → YZ will be written as-


X→Y
X→Z

Step-02:

• Consider each functional dependency one by one from the set obtained in Step-01.
• Determine whether it is essential or non-essential.

To determine whether a functional dependency is essential or not, compute the closure of its left side-
• Once by considering that the particular functional dependency is present in the set
• Once by considering that the particular functional dependency is not present in the set

26
Then following two cases are possible-

Case-01: Results Come Out to be Same-


If results come out to be same,
• It means that the presence or absence of that functional dependency does not create any difference.
• Thus, it is non-essential.
• Eliminate that functional dependency from the set.

NOTE-

• Eliminate the non-essential functional dependency from the set as soon as it is discovered.
• Do not consider it while checking the essentiality of other functional dependencies.

Case-02: Results Come Out to be Different-


If results come out to be different,
• It means that the presence or absence of that functional dependency creates a difference.
• Thus, it is essential.
• Do not eliminate that functional dependency from the set.
• Mark that functional dependency as essential.

27
Step-03:
• Consider the newly obtained set of functional dependencies after performing Step-02.
• Check if there is any functional dependency that contains more than one attribute on its left side.

Then following two cases are possible-

Case-01: No-

• There exists no functional dependency containing more than one attribute on its left side.
• In this case, the set obtained in Step-02 is the canonical cover.

Case-02: Yes-

• There exists at least one functional dependency containing more than one attribute on its left side.
• In this case, consider all such functional dependencies one by one.
• Check if their left side can be reduced.

Use the following steps to perform a check-


• Consider a functional dependency.
• Compute the closure of all the possible subsets of the left side of that functional dependency.
• If any of the subsets produce the same closure result as produced by the entire left side, then replace the left side with that
subset.
After this step is complete, the set obtained is the canonical cover.
28
Exercise
The following functional dependencies hold true for the relational scheme R ( W , X , Y , Z ) –
X→W
WZ → XY
Y → WXZ
Write the irreducible equivalent for this set of functional dependencies.

Step-01:

Write all the functional dependencies such that each contains exactly one attribute on its right side-
X→W
WZ → X
WZ → Y
Y→W
Y→X
Y→Z

29
Step-02:
Check the essentiality of each functional dependency one by one.

For X → W:
• Considering X → W, (X)+ = { X , W }
• Ignoring X → W, (X)+ = { X }
Now,
• Clearly, the two results are different.
• Thus, we conclude that X → W is essential and can not be eliminated.

For WZ → X:
• Considering WZ → X, (WZ)+ = { W , X , Y , Z }
• Ignoring WZ → X, (WZ)+ = { W , X , Y , Z }
Now,
• Clearly, the two results are same.
• Thus, we conclude that WZ → X is non-essential and can be eliminated.

Eliminating WZ → X, our set of functional dependencies reduces to-
X → W
WZ → Y
Y → W
Y → X
Y → Z
Now, we will consider this reduced set in further checks.

For WZ → Y:
• Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
• Ignoring WZ → Y, (WZ)+ = { W , Z }
Now,
• Clearly, the two results are different.
• Thus, we conclude that WZ → Y is essential and can not be eliminated.

For Y → W:
• Considering Y → W, (Y)+ = { W , X , Y , Z }
• Ignoring Y → W, (Y)+ = { W , X , Y , Z }
Now,
• Clearly, the two results are same.
• Thus, we conclude that Y → W is non-essential and can be eliminated.

Eliminating Y → W, our set of functional dependencies reduces to-
X → W
WZ → Y
Y → X
Y → Z

For Y → X:
• Considering Y → X, (Y)+ = { W , X , Y , Z }
• Ignoring Y → X, (Y)+ = { Y , Z }
Now,
• Clearly, the two results are different.
• Thus, we conclude that Y → X is essential and can not be eliminated.

For Y → Z:
• Considering Y → Z, (Y)+ = { W , X , Y , Z }
• Ignoring Y → Z, (Y)+ = { W , X , Y }
Now,
• Clearly, the two results are different.
• Thus, we conclude that Y → Z is essential and can not be eliminated.

From here, our essential functional dependencies are-
X → W
WZ → Y
Y → X
Y → Z
31
Step-03:

• Consider the functional dependencies having more than one attribute on their left side.
• Check if their left side can be reduced.

In our set,
• Only WZ → Y contains more than one attribute on its left side.
• Considering WZ → Y, (WZ)+ = { W , X , Y , Z }

Now,
• Consider all the possible subsets of WZ.
• Check if the closure result of any subset matches to the closure result of WZ.
(W)+ = { W }
(Z)+ = { Z }
Clearly,
• None of the subsets have the same closure result same as that of the entire left side.
• Thus, we conclude that we can not write WZ → Y as W → Y or Z → Y.
• Thus, set of functional dependencies obtained in step-02 is the canonical cover.
Finally, the canonical cover is-
X→W
WZ → Y
Y→X
Y→Z
32
Exercise
1. Consider the following set F of functional dependencies. Find the minimal cover.
F = {
A → BC
B → C
A → B
AB → C
}
Canonical Cover = {
A → B
B → C
}

2. Consider another set F of functional dependencies:
F = {
A → BC
CD → E
B → D
E → A
}
F is already a minimal cover.

33
3. Find out canonical cover and minimal cover of following FDs:
{
A -> C
AC -> D
E -> AD
E -> H
}
At last, the minimal FDs are:
A -> C
A -> D
E -> A
E -> H
Hence the canonical form is (canonical form means the same LHS should not be repeated):
A -> CD;
E -> AH;

3.5 Find out canonical cover and minimal cover of following FDs:
R(VWXYZ)
V -> W
VW -> X
Y -> VXZ
Minimal cover:
V -> W
V -> X
Y -> V
Y -> Z

34
4. Consider a relation scheme R = (A, B, C, D, E, H) on which the following functional dependencies
hold: {A–>B, BC–>D, E–>C, D–>A}. What are the candidate keys of R?
(a) AE, BE
(b) AE, BE, DE
(c) AEH, BEH, BCH
(d) AEH, BEH, DEH

5. In a schema with attributes A, B, C, D and E, following set of functional dependencies are given:
A->B
A->C
CD->E
B->D
E->A
Which of the following functional dependencies is NOT implied by the above set?
(a) CD->AC (b) BD->CD (c) BC->CD (d) AC->BC

35
6. The following functional dependencies are given:
AB->CD, AF->D, DE->F, C->G , F->E, G->A
Which one of the following options is false?
(a)CF+ = {ACDEFG} (b)BG+ = {ABCDG}
(c)AF+ = {ACDEFG} (d)AB+ = {ABCDFG}

7. Relation R has eight attributes ABCDEFGH. Fields of R contain only atomic values.
F={CH->G,
A->BC,
B->CFH,
E->A,
F->EG}
is a set of functional dependencies (FDs) so that F + is exactly the set of FDs that hold for R.

How many candidate keys does the relation R have?


(a) 3 (b) 4 (c) 5 (d) 6

36
What is decomposition?
 Decomposition is the process of breaking down given relation into two or more
relations.
 Relation R is replaced by two or more relations in such a way that:
1. Each new relation contains a subset of the attributes of R
2. Together, they all include all tuples and attributes of R
 Types of decomposition
1. Lossy decomposition
2. Lossless decomposition (non-loss decomposition)

37
What is an anomaly in database design?
• Anomalies are problems that can occur in a poorly planned, un-normalized database where all the data is stored in one table.
• Three types of anomalies can arise in the database because of redundancy:
1. Insert anomaly
2. Delete anomaly
3. Update / Modification anomaly

38
Insert anomaly
• Consider a relation
  • emp_dept (E#, Ename, Address, D#, Dname, Dmgr#) with E# as the primary key

E#  Ename  Address  D#  Dname  Dmgr#
1   Raj    Rajkot   1   CE     1
2   Meet   Surat    1   CE     1

We want to insert the details of a new department (IT).

• Suppose a new department (IT) has been started by the organization, but initially no employee is appointed to that department.
• We want to insert that department's details into the emp_dept table.
• But the tuple for this department cannot be inserted into this table, as E# would have a NULL value, which is not allowed because E# is the primary key.
• This kind of problem in a relation, where some tuple cannot be inserted, is known as an insert anomaly.

39
What is Insert anomaly?
 An insert anomaly occurs when certain attributes cannot be
inserted into the database without the presence of another
attribute.

40
Delete anomaly
• Consider a relation
  • emp_dept (E#, Ename, Address, D#, Dname, Dmgr#) with E# as the primary key

E#  Ename  Address  D#  Dname  Dmgr#
1   Raj    Rajkot   1   CE     1
2   Meet   Surat    1   IT     2

We want to delete the details of employee Meet.

• Now consider that there is only one employee in some department (IT) and that employee leaves the organization.
• So we need to delete the tuple of that employee (Meet).
• But in addition to that, the information about the department is also deleted.
• This kind of problem in a relation, where the deletion of some tuples can lead to the loss of other data not intended to be removed, is known as a delete anomaly.

41
What is Delete anomaly?
 A delete anomaly exists when certain attributes are lost because
of the deletion of another attribute.

42
Update anomaly
• Consider a relation
  • emp_dept (E#, Ename, Address, D#, Dname, Dmgr#) with E# as the primary key

E#  Ename  Address  D#  Dname  Dmgr#
1   Raj    Rajkot   1   CE     M1
2   Meet   Surat    2   IT     M2
3   Jay    Rajkot   2   CE     M2

We want to update the manager of the CE department.

• Suppose the manager of the (CE) department has changed. This requires that the Dmgr# in all the tuples corresponding to that department be changed to reflect the new status.
• If we fail to update all the tuples of the given department, then two different records of employees working in the same department might show different Dmgr# values, which leads to inconsistency in the database.

43
What is Update anomaly?
 An update anomaly exists when one or more records (instance) of
duplicated data is updated, but not all.

44
Anomaly (Summary)
EmpID  EmpName  Address  DeptID  DeptName    DeptMngr
E1     Raj      Rajkot   D1      C.E.        Patel
E2     Samir    Rajkot   D2      Civil       Shah
E3     Meet     Baroda   D1      Computer    Patel
E4     Deepak   Surat    D1      C.E         Patel
E5     Suresh   Surat    D3      Electrical  Joshi
null   null     null     D4      Chemical    null

Delete Anomaly
If we delete the employee having ID “E2”, then the Civil department is also deleted, because there is only one record of the Civil dept.

Insert Anomaly
A new department “Chemical” cannot be inserted until an employee is assigned to it.

Update Anomaly
An update anomaly exists when one or more records of duplicated data are updated, but not all.

45
How to deal with insert anomaly
EmpID  EmpName  Address  DeptID  DeptName    DeptMngr
E1     Raj      Rajkot   D1      Computer    Patel
E2     Samir    Rajkot   D2      Civil       Shah
E3     Meet     Baroda   D1      Computer    Patel
E4     Deepak   Surat    D1      Computer    Patel
E5     Suresh   Surat    D3      Electrical  Joshi
null   null     null     D4      Chemical    null
A new department “Chemical” cannot be inserted until an employee is assigned to it.

After splitting the table, the department can be stored on its own:

EmpID  EmpName  Address  DeptID
E1     Raj      Rajkot   D1
E2     Samir    Rajkot   D2
E3     Meet     Baroda   D1
E4     Deepak   Surat    D1
E5     Suresh   Surat    D3

DeptID  DeptName    DeptMngr
D1      Computer    Patel
D2      Civil       Shah
D3      Electrical  Joshi
D4      Chemical    null

46
How to deal with delete anomaly
EmpID  EmpName  Address  DeptID  DeptName    DeptMngr
E1     Raj      Rajkot   D1      Computer    Patel
E2     Samir    Rajkot   D2      Civil       Shah
E3     Meet     Baroda   D1      Computer    Patel
E4     Deepak   Surat    D1      Computer    Patel
E5     Suresh   Surat    D3      Electrical  Joshi

If we delete the employee having ID “E2”, then the Civil department is also deleted, because there is only one record of the Civil dept.

After splitting the table, deleting an employee leaves the department record intact:

EmpID  EmpName  Address  DeptID
E1     Raj      Rajkot   D1
E2     Samir    Rajkot   D2
E3     Meet     Baroda   D1
E4     Deepak   Surat    D1
E5     Suresh   Surat    D3

DeptID  DeptName    DeptMngr
D1      Computer    Patel
D2      Civil       Shah
D3      Electrical  Joshi

47
How to deal with update anomaly
EmpID  EmpName  Address  DeptID  DeptName    DeptMngr
E1     Raj      Rajkot   D1      Computer    Patel
E2     Samir    Rajkot   D2      Civil       Shah
E3     Meet     Baroda   D1      Computer    Patel
E4     Deepak   Surat    D1      Computer    Patel
E5     Suresh   Surat    D3      Electrical  Joshi
Changing the name of department D1 from “Computer” to “IT” may update one or more records, but not all.

After splitting the table, the department name is stored in exactly one place:

EmpID  EmpName  Address  DeptID
E1     Raj      Rajkot   D1
E2     Samir    Rajkot   D2
E3     Meet     Baroda   D1
E4     Deepak   Surat    D1
E5     Suresh   Surat    D3

DeptID  DeptName    DeptMngr
D1      Computer    Patel
D2      Civil       Shah
D3      Electrical  Joshi

48
Summary
S-ID  Name  Age  Br_code  Br_name  HOD
1     A     18   101      CSE      AAA
2     B     19   101      CSE      AAA
3     C     18   101      CSE      AAA
4     D     20   102      EC       BBB
5     E     18   102      EC       BBB
6     F     19   103      ME       CCC

Idea: In this table we have stored the entire college data.
Result: The entire branch data is repeated for every student of the same branch.
Redundancy: When the same data is stored multiple times.

Disadvantages: (i) Insertion, deletion and modification anomalies.
(ii) Inconsistency in data.
(iii) Increased database size and increased access time.

Insertion anomalies : When certain data (attributes) cannot be inserted into the database without the presence of other data (e.g. civil branch information depends on student enrolment).

Deletion anomalies : If we delete some (unwanted) data, it causes the deletion of some other (wanted) data (e.g. when the mechanical students pass out from the college, the information related to the mechanical branch is lost).

Updation anomalies : When we want to update a single piece of data, it must be updated at all the places where it is stored (e.g. the HOD changed, but the change was missed somewhere).

49
How anomalies in database design can be solved?

 Such type of anomalies in database design can be solved by using normalization.

50
What is normalization?
 Normalization is the process of removing redundant data from
tables to improve data integrity, scalability and storage
efficiency.
1. data integrity (completeness, accuracy and consistency of data)
2. scalability (ability of a system to continue to function well in a
growing amount of work)
3. storage efficiency (ability to store and manage data that consumes
the least amount of space)

51
What we do in normalization?
 Normalization generally involves splitting an existing table into
multiple (more than one) tables, which can be re-joined or linked
each time a query is issued (executed).
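
Splitting and re-joining can be illustrated with plain Python tuples. The sketch below uses the employee/department rows from the anomaly slides; the variable names are assumptions made for illustration. It projects the single table onto two smaller ones and then joins them back on DeptID to confirm that no information was lost.

```python
# (EmpID, EmpName, Address, DeptID, DeptName, DeptMngr)
rows = [
    ("E1", "Raj", "Rajkot", "D1", "Computer", "Patel"),
    ("E2", "Samir", "Rajkot", "D2", "Civil", "Shah"),
    ("E3", "Meet", "Baroda", "D1", "Computer", "Patel"),
    ("E4", "Deepak", "Surat", "D1", "Computer", "Patel"),
    ("E5", "Suresh", "Surat", "D3", "Electrical", "Joshi"),
]

# Projection: sets remove the repeated department data
employee = sorted({(e, n, a, d) for e, n, a, d, _, _ in rows})
department = sorted({(d, dn, m) for _, _, _, d, dn, m in rows})

# Re-join on DeptID, which acts as a foreign key in the employee table
dept_by_id = {d: (dn, m) for d, dn, m in department}
rejoined = sorted(emp + dept_by_id[emp[3]] for emp in employee)

print(len(department))           # 3
print(rejoined == sorted(rows))  # True
```

Each department now appears once, so renaming it or changing its manager touches a single row.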

52
How many normal forms are there?
 Normal forms:
1. 1NF (First normal form)
2. 2NF (Second normal form)
3. 3NF (Third normal form)
4. BCNF (Boyce–Codd normal form)
5. 4NF (Fourth normal form)
6. 5NF (Fifth normal form)

As we move from 1NF to 5NF, the number of tables and the complexity increase, but redundancy decreases.

53
1NF (First Normal Form)
 Conditions for 1NF

Each cell of a table should contain a single value.

 A relation R is in first normal form (1NF) if and only if each cell of the table contains only an atomic value.
OR
 A relation R is in first normal form (1NF) if the attribute of every tuple is either single valued or a null value.

#By default, every table is in 1NF.


54
1NF (First Normal Form) [Composite attribute]
Customer
CustomerID  Name  Address
C01         Raj   Jamnagar Road, Rajkot
C02         Meet  Nehru Road, Jamnagar

The relation is not in 1NF.

• Problem: It is difficult to retrieve the list of customers living in ‘Jamnagar’ from the above table.
• The reason is that the address attribute is a composite attribute which contains the road name as well as the city name in a single cell.
• It is possible that the city name also occurs in the road name.
• In our example, the word ‘Jamnagar’ occurs in both records: in the first record it is part of the road name, and in the second one it is the name of the city.

55
1NF (First Normal Form) [Composite attribute]
Customer
CustomerID Name Address
C01 Raj Jamnagar Road, Rajkot
C02 Meet Nehru Road, Jamnagar

 Solution: Divide composite attributes into number of sub-attributes and insert value in
proper sub-attribute.

Customer
CustomerID Name Road City
C01 Raj Jamnagar Road Rajkot
C02 Meet Nehru Road Jamnagar

56
1NF (First Normal Form) [Multivalued attribute]
Student
RollNo  Name   FailedinSubjects
101     Raj    DS, DBMS
102     Meet   DBMS, DS
103     Jeet   DS, DBMS, DE
104     Harsh  DBMS, DE, DS
105     Nayan  DE, DBMS, DS

The relation is not in 1NF.

• Problem: It is difficult to retrieve the list of students who failed in ‘DBMS’ as well as ‘DS’ but not in other subjects from the above table.
• The reason is that the FailedinSubjects attribute is a multi-valued attribute, so it contains more than one value.

57
1NF (First Normal Form) [Multivalued attribute]
Student (original)
RollNo  Name   FailedinSubjects
101     Raj    DS, DBMS
102     Meet   DBMS, DS
103     Jeet   DS, DBMS, DE
104     Harsh  DBMS, DE, DS
105     Nayan  DE, DBMS, DS

Student
RollNo  Name
101     Raj
102     Meet
103     Jeet
104     Harsh
105     Nayan

Result
RID  RollNo  Subject
1    101     DS
2    101     DBMS
3    102     DBMS
4    102     DS
5    103     DS
…    …       …

• Solution: Split the table into two tables in such a way that
  • the first table contains all attributes except the multi-valued attribute, with the same primary key, and
  • the other table contains the multi-valued attribute, with a primary key of its own.
  • Insert the primary key of the first table into the second table as a foreign key.
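
The split can be sketched in Python as follows. The variable names and the running `RID` counter are assumptions for illustration; the data is the Student table from the slide. Once every subject sits in its own row, the query that was hard on the original table becomes a simple set comparison.

```python
# (RollNo, Name, FailedinSubjects) with a multi-valued third column
students = [
    (101, "Raj", "DS, DBMS"),
    (102, "Meet", "DBMS, DS"),
    (103, "Jeet", "DS, DBMS, DE"),
    (104, "Harsh", "DBMS, DE, DS"),
    (105, "Nayan", "DE, DBMS, DS"),
]

# First table: everything except the multi-valued attribute
student = [(roll, name) for roll, name, _ in students]

# Second table: one atomic subject per row, RollNo kept as a foreign key
result = []
for roll, _, failed in students:
    for subject in failed.split(","):
        result.append((len(result) + 1, roll, subject.strip()))

print(result[:5])

# Students who failed in DBMS and DS but in no other subject
failed_by_roll = {}
for _, roll, subject in result:
    failed_by_roll.setdefault(roll, set()).add(subject)
only_ds_dbms = [r for r, subs in failed_by_roll.items() if subs == {"DS", "DBMS"}]
print(only_ds_dbms)  # [101, 102]
```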
58
2NF (Second Normal Form)
• Conditions for 2NF

It is in 1NF and no partial dependency exists in the relation.

• A relation R is in second normal form (2NF)
  • if and only if it is in 1NF and
  • every non-key attribute is fully dependent on the primary key

Partial Dependency: When a proper subset of a candidate key determines non-prime attribute(s).
Prime attribute: an attribute that is part of some candidate key/primary key
In other words,
A → B is called a partial dependency if and only if-
A is a proper subset of some candidate key
B is a non-prime attribute.
However, when a relation has no composite candidate key, no partial dependency can exist.
59
Consider a relation- R ( V , W , X , Y , Z ) with functional dependencies-
VW → XY
Y→V
WX → YZ

The possible candidate keys for this relation are-
VW , WX , WY

From here,
• Prime attributes = { V , W , X , Y }
• Non-prime attributes = { Z }

Now, if we observe the given dependencies-
• There is no partial dependency.
• This is because there exists no dependency where an incomplete candidate key determines any non-prime attribute.

Thus, we conclude that the given relation is in 2NF.
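
The same reasoning can be checked mechanically. The sketch below is illustrative only: `closure`, `candidate_keys` and `partial_dependencies` are assumed helper names, and attributes are single letters. It finds the candidate keys by brute force, derives the prime attributes, and then looks for a proper subset of a key whose closure reaches a non-prime attribute.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # super key of a key already found
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def partial_dependencies(attributes, fds):
    """List (part_of_key, non_prime_attributes) pairs that violate 2NF."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    found = []
    for key in keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                hit = (closure(part, fds) - set(part)) & non_prime
                if hit:
                    found.append(("".join(part), "".join(sorted(hit))))
    return found

# R(V, W, X, Y, Z) with VW → XY, Y → V, WX → YZ
F = [("VW", "XY"), ("Y", "V"), ("WX", "YZ")]
print(["".join(sorted(k)) for k in candidate_keys("VWXYZ", F)])  # ['VW', 'WX', 'WY']
print(partial_dependencies("VWXYZ", F))  # []
```

An empty list means no partial dependency was found, so the relation is in 2NF.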


60
2NF (Second Normal Form)
CustomerID  AccountNO  AccessDate  Balance  BranchName
C01         A01        01-01-2017  50000    Rajkot
C02         A01        01-03-2017  50000    Rajkot
C01         A02        01-05-2017  25000    Surat
C03         A02        01-07-2017  25000    Surat

• FD1: {CustomerID, AccountNO} → {AccessDate, Balance, BranchName}
• FD2: AccountNO → {Balance, BranchName}
• Balance and BranchName are partially dependent on the primary key, so the above relation is not in 2NF.
• Problem: In the case of a joint account, multiple customers share one account.
• If an account ‘A01’ is operated jointly by two customers, say ‘C01’ and ‘C02’, then the data values for the attributes Balance and BranchName are duplicated in the two tuples of customers ‘C01’ and ‘C02’.
61
2NF (Second Normal Form)
Solution: Decompose the relation in such a way that the resultant relations do not have any partial FD.
• Remove the partially dependent attributes from the relation that violates 2NF.
• Place them in a separate relation along with the prime attribute on which they are fully dependent.
• The primary key of the new relation will be the attribute on which it is fully dependent.
• Keep the other attributes as they are in that table, with the same primary key.

Original relation
CstID  ActNO  AccessDate  Balance  Branch
C01    A01    01-01-2017  50000    Rajkot
C02    A01    01-03-2017  50000    Rajkot
C01    A02    01-05-2017  25000    Surat
C03    A02    01-07-2017  25000    Surat

Table 1
ActNO  Balance  Branch
A01    50000    Rajkot
A02    25000    Surat

Table 2
CstID  ActNO  AccessDate
C01    A01    01-01-2017
C02    A01    01-03-2017
C01    A02    01-05-2017
C03    A02    01-07-2017
62
3NF (Third Normal Form)
• Conditions for 3NF
It is in 2NF and there is no transitive dependency.

Transitive dependency: if A → B and B → C, then A → C is a transitive dependency. It violates 3NF when B is not a super key and C is a non-prime attribute.

• A relation R is in third normal form (3NF)
  • if and only if it is in 2NF and
  • every non-key attribute is non-transitively dependent on the primary key
OR
• A relation R is in third normal form (3NF)
  • if and only if it is in 2NF and
  • at least one of the following conditions holds for each non-trivial functional dependency A → B
    • A is a super key
    • B is a prime attribute

63
Consider a relation- R ( A , B , C , D , E ) with functional dependencies-
A → BC
CD → E
B→D
E→A
The possible candidate keys for this relation are-
A , E , CD , BC

From here,
• Prime attributes = { A , B , C , D , E }
• There are no non-prime attributes

Now,
• It is clear that there are no non-prime attributes in the relation.
• In other words, all the attributes of relation are prime attributes.
• Thus, all the attributes on RHS of each functional dependency are prime
attributes.

Thus, we conclude that the given relation is in 3NF.
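
The second form of the 3NF condition translates directly into a check over the given FDs. The following sketch is an illustration under stated assumptions: the helper names are hypothetical, attributes are single letters, and only the listed FDs are tested, which is the usual textbook procedure.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def violations_3nf(attributes, fds):
    """FDs whose left side is not a super key and whose right side has a non-prime attribute."""
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        new_attrs = set(rhs) - set(lhs)  # ignore the trivial part
        is_super_key = closure(lhs, fds) == set(attributes)
        if new_attrs and not is_super_key and not new_attrs <= prime:
            bad.append((lhs, rhs))
    return bad

# R(A, B, C, D, E) with A → BC, CD → E, B → D, E → A
F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
print(["".join(sorted(k)) for k in candidate_keys("ABCDE", F)])  # ['A', 'E', 'BC', 'CD']
print(violations_3nf("ABCDE", F))  # []
```

B → D passes only because D is a prime attribute; B itself is not a super key.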

64
3NF (Third Normal Form)
AccountNO  Balance  BranchName  BranchAddress
A01        50000    Rajkot      Kalawad Road
A02        40000    Rajkot      Kalawad Road
A03        35000    Rajkot      Kalawad Road
A04        25000    Rajkot      Kalawad Road

• FD1: AccountNO → {Balance, BranchName, BranchAddress} and
• FD2: BranchName → BranchAddress
• So AccountNO → BranchAddress (using the transitivity rule)
• Problem: In this relation, the branch address is stored repeatedly for each account of the same branch, which occupies more space.

65
3NF (Third Normal Form)
Solution: Decompose the relation in such a way that the resultant relations do not have any transitive FD.

• Remove the transitively dependent attributes from the relation that violates 3NF.
• Place them in a new relation along with the non-prime attributes due to which the transitive dependency occurred.
• The primary key of the new relation will be the non-prime attributes due to which the transitive dependency occurred.
• Keep the other attributes in the original table with the same primary key, and add the primary key of the new relation to it as a foreign key.

Original table
ANO Balance BName BAddress
A01 50000 Rajkot Kalawad Road
A02 40000 Rajkot Kalawad Road
A03 35000 Rajkot Kalawad Road
A04 25000 Rajkot Kalawad Road

Table 1
BName BAddress
Rajkot Kalawad Road

Table 2
ANO Balance BName
A01 50000 Rajkot
A02 40000 Rajkot
A03 35000 Rajkot
A04 25000 Rajkot
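
As a rough illustration of this decomposition, the Python sketch below splits the account rows into a branch table and an account table, then joins them back on the branch name. The variable names are illustrative and not from the slides.

```python
accounts = [
    ("A01", 50000, "Rajkot", "Kalawad Road"),
    ("A02", 40000, "Rajkot", "Kalawad Road"),
    ("A03", 35000, "Rajkot", "Kalawad Road"),
    ("A04", 25000, "Rajkot", "Kalawad Road"),
]

# Table 1: one row per branch, keyed by the branch name.
branch = {bname: baddress for _, _, bname, baddress in accounts}

# Table 2: account rows without the transitively dependent address.
account = [(ano, balance, bname) for ano, balance, bname, _ in accounts]

# Joining Table 2 with Table 1 on the branch name rebuilds the original rows.
rejoined = [(ano, balance, bname, branch[bname]) for ano, balance, bname in account]

print(branch)                # {'Rajkot': 'Kalawad Road'}
print(rejoined == accounts)  # True
```

The address is now stored once per branch instead of once per account, and no information is lost.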

66
BCNF (Boyce-Codd Normal Form)
 Conditions for BCNF
A relation R is in BCNF if and only if-
1. The relation is already in 3NF.
2. For each non-trivial functional dependency A → B, A is a super key of the relation.

Consider a relation- R ( A , B , C ) with the functional dependencies-


A→B
B→C
C→A

The possible candidate keys for this relation are-


A,B,C

Now, we can observe that LHS of each given functional dependency is a candidate key.
Thus, we conclude that the given relation is in BCNF.
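
The same check can be sketched in a few lines of Python. This is an illustrative sketch only: `closure` and `bcnf_violations` are hypothetical helper names, and each FD is assumed to be a pair of attribute sets.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != schema]

schema = set("ABC")
fds = [(set("A"), set("B")), (set("B"), set("C")), (set("C"), set("A"))]
print(bcnf_violations(schema, fds))  # []
```

An empty list means every determinant is a super key, which matches the conclusion that the relation is in BCNF.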

67
BCNF (Boyce-Codd Normal Form)
• FD1: {Student, Language} → Faculty
• FD2: Faculty → Language
• So {Student, Language} → Language also follows through Faculty (transitivity rule).

Student Language Faculty
Mita JAVA Patel
Nita VB Shah
Sita JAVA Jadeja
Gita VB Dave
Rita VB Shah
Nita JAVA Patel
Mita VB Dave
Rita JAVA Jadeja

Here, one faculty teaches only one subject, but a subject may be taught by more than one faculty.

In FD2, the determinant is Faculty, which is not a super key. So the table is not in BCNF.

• Problem: In this relation, if one student takes more than one language with a different guide (faculty) for each, records are stored repeatedly for each student, language and guide (faculty) combination, which occupies more space.
68
BCNF (Boyce-Codd Normal Form)
Solution: Decompose the relation in such a way that the resultant relations do not have any transitive FD.

Student Language Faculty

• Remove the transitively dependent prime attribute from the relation that violates BCNF.
• Place it in a separate new relation along with the non-prime attribute due to which the transitive dependency occurred.
• The primary key of the new relation will be this non-prime attribute due to which the transitive dependency occurred.
• Keep the other attributes in the original table with the same primary key, and add a prime attribute of the other relation to it as a foreign key.

Table 1
Faculty Language
Patel JAVA
Shah VB
Jadeja JAVA
Dave VB

Table 2
Student Faculty
Mita Patel
Nita Shah
Sita Jadeja
Gita Dave
Rita Shah
Nita Patel
Mita Dave
Rita Jadeja
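
The two dependencies and the decomposition can be checked against the sample rows. The Python sketch below is illustrative only: `fd_holds` is a hypothetical helper that tests a dependency on the data actually present, which can confirm an example but cannot prove that a dependency holds in general.

```python
rows = {
    ("Mita", "JAVA", "Patel"), ("Nita", "VB", "Shah"),
    ("Sita", "JAVA", "Jadeja"), ("Gita", "VB", "Dave"),
    ("Rita", "VB", "Shah"), ("Nita", "JAVA", "Patel"),
    ("Mita", "VB", "Dave"), ("Rita", "JAVA", "Jadeja"),
}
STUDENT, LANGUAGE, FACULTY = 0, 1, 2

def fd_holds(rows, lhs, rhs):
    """True if equal values in columns lhs always come with equal values in columns rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        value = tuple(row[i] for i in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

print(fd_holds(rows, [STUDENT, LANGUAGE], [FACULTY]))  # True
print(fd_holds(rows, [FACULTY], [LANGUAGE]))           # True
print(fd_holds(rows, [LANGUAGE], [FACULTY]))           # False

# Decompose into (Faculty, Language) and (Student, Faculty), then join on Faculty.
table1 = {(faculty, language) for _, language, faculty in rows}
table2 = {(student, faculty) for student, _, faculty in rows}
rejoined = {(student, language, faculty)
            for student, faculty in table2
            for faculty2, language in table1 if faculty == faculty2}
print(rejoined == rows)  # True
```

The join returns exactly the original eight rows, so this decomposition is lossless for the sample data.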
69
Multivalued dependency (MVD)
• For a dependency X → Y, if multiple values of Y exist for a single value of X, then the table may have a multivalued dependency.
• A multivalued dependency (MVD) is denoted by →→ and is represented as X →→ Y.

studentID subject activity
101 DS Cricket
101 DBMS Cricket
101 DS Football
101 DBMS Football

70
4NF (Fourth Normal Form)
• A relation R is in fourth normal form (4NF)
  • if and only if it is in BCNF and
  • it has no non-trivial multivalued dependency X →→ Y in which X is not a super key
• The following table is not in 4NF: STU_ID →→ COURSE and STU_ID →→ HOBBY

STU_ID COURSE HOBBY


21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey

71
4NF (Fourth Normal Form)
studentID subject activity
101 DS Cricket
101 DBMS Cricket
101 DS Football
101 DBMS Football

Decompose

studentID subject
101 DS
101 DBMS

studentID activity
101 Cricket
101 Football
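
One way to see why this split is safe: the table satisfies studentID →→ subject exactly when it equals the join of its two projections. The Python sketch below illustrates that test; the helper name `mvd_holds` is an assumption for illustration.

```python
rows = {
    (101, "DS", "Cricket"), (101, "DBMS", "Cricket"),
    (101, "DS", "Football"), (101, "DBMS", "Football"),
}

def mvd_holds(rows):
    """Check studentID ->> subject by comparing the table with the join of its projections."""
    subjects = {(sid, subject) for sid, subject, _ in rows}
    activities = {(sid, activity) for sid, _, activity in rows}
    joined = {(sid, subject, activity)
              for sid, subject in subjects
              for sid2, activity in activities if sid == sid2}
    return joined == rows

print(mvd_holds(rows))                                # True
print(mvd_holds(rows - {(101, "DBMS", "Football")}))  # False
```

With one combination missing, subject and activity are no longer independent of each other, so the MVD fails and the two-table split would invent a row on re-joining.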

72
Functional dependency & Multivalued dependency
• A table can have both functional dependencies and multivalued dependencies together.
• studentID → address
• studentID →→ subject
• studentID →→ activity

studentID address subject activity


101 C. G. Road DS Cricket
101 C. G. Road DBMS Cricket
101 C. G. Road DS Football
101 C. G. Road DBMS Football

73
Functional dependency & Multivalued dependency

studentID address subject activity


101 C. G. Road DS Cricket
101 C. G. Road DBMS Cricket
101 C. G. Road DS Football
101 C. G. Road DBMS Football

Decompose

studentID subject
101 DS
101 DBMS

studentID address
101 C. G. Road

studentID activity
101 Cricket
101 Football

74
5NF (Fifth Normal Form)
• A relation R is in fifth normal form (5NF)
  • if and only if it is in 4NF and
  • it cannot be decomposed into any number of smaller tables without loss of data.
• If a table can be decomposed further to eliminate redundancy and anomalies, then re-joining the decomposed tables by means of candidate keys must neither lose original data nor create new records. In simple words, joining two or more decomposed tables should not lose records or create new records.
• R1 ⋈ R2 ⋈ R3 ⋈ … ⋈ Rn = R

75
Example
Consider an example of different Subjects taught by different lecturers and the lecturers taking classes for different
semesters.
Note: Please consider that Semester 1 has Mathematics, Physics and Chemistry and Semester 2 has only Mathematics in its
academic year!!

• In the above table, Rose takes both the Mathematics and Physics classes for Semester 1, but she does not take the Physics class for Semester 2.
• In this case, the combination of all three fields is required to identify valid data.
• Imagine we want to add a new class, Semester 3, but do not know which subject it will have or who will be taking it.
• We would simply be inserting a new entry with Class as Semester 3 and leaving Lecturer and Subject as NULL. As discussed above, it is not good to have such entries. Moreover, since all three columns together act as the primary key, we cannot leave the other two columns blank!

76
• Hence we have to decompose the table in such a way that it satisfies all the rules up to 4NF.
• And when we join the decomposed tables using their keys, the join should yield the correct records. Here, we can represent each lecturer's subject area and classes in a better way. We can divide the above table into three: (SUBJECT, LECTURER), (LECTURER, CLASS), (SUBJECT, CLASS).

Now, each of the combinations is in a separate table. If we need to identify who is teaching which subject to which semester, we need to join the three tables on their keys and get the result.
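
The effect of joining two versus three of these tables can be sketched in Python. The rows below are hypothetical sample data chosen to match the description (the lecturer "Alex" is invented for illustration); they are not the table from the slide.

```python
# Hypothetical (Subject, Lecturer, Class) rows, used only for illustration.
rows = {
    ("Mathematics", "Rose", "Semester 1"),
    ("Physics", "Rose", "Semester 1"),
    ("Mathematics", "Rose", "Semester 2"),
    ("Chemistry", "Alex", "Semester 1"),
}

subject_lecturer = {(s, l) for s, l, _ in rows}
lecturer_class = {(l, c) for _, l, c in rows}
subject_class = {(s, c) for s, _, c in rows}

# Joining only two of the projections (on Lecturer) creates a spurious record.
two_way = {(s, l, c)
           for s, l in subject_lecturer
           for l2, c in lecturer_class if l == l2}
print(sorted(two_way - rows))  # [('Physics', 'Rose', 'Semester 2')]

# Joining the third projection as well filters the spurious record out again.
three_way = {row for row in two_way if (row[0], row[2]) in subject_class}
print(three_way == rows)  # True
```

This is why all three tables are needed: any two of them alone cannot reproduce the original combinations.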

77
Points to remember
Point-1
Remember the following diagram which implies-
• A relation in BCNF will surely be in all the lower normal forms (3NF, 2NF and 1NF).
• A relation in 3NF will surely be in 2NF and 1NF.
• A relation in 2NF will surely be in 1NF.

Point-2
The above diagram also implies-
• BCNF is stricter than 3NF.
• 3NF is stricter than 2NF.
• 2NF is stricter than 1NF.

78
Point-03:

While determining the normal form of any given relation,


• Start checking from BCNF.
• This is because if the relation is found to be in BCNF, then it will surely be in all the lower normal forms (3NF, 2NF and 1NF).
• If the relation is not in BCNF, then move towards the outer circles of the diagram and check the other normal forms in the order they appear: 3NF, then 2NF, then 1NF.

Point-04:

• In a relational database, a relation is always in First Normal Form (1NF) at least.

Point-05:

• Singleton keys are those that consist of only a single attribute.


• If all the candidate keys of a relation are singleton candidate keys, then it will always be in 2NF at least.
• This is because no partial dependency can exist.
• A single-attribute candidate key has no non-empty proper subset, so it either appears in full in a dependency or does not appear at all.
• Thus, an incomplete candidate key will never determine a non-prime attribute.

79
Point-06

• Third Normal Form (3NF) is considered adequate for normal relational database design.
• Every binary relation (a relation with only two attributes) is always in BCNF.
• BCNF is free from redundancies arising out of functional dependencies (zero redundancy).
• A relation with only trivial functional dependencies (in other words, with no non-trivial functional dependencies) is always in BCNF.

80
Summary
Normal Form   Description
1NF    A relation is in 1NF if every attribute contains only atomic values.
2NF    A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key {no PA → NPA, i.e. no part of a key determines a non-prime attribute}.
3NF    A relation is in 3NF if it is in 2NF and no transitive dependency exists.
BCNF   A relation is in BCNF if it is in 3NF and, for each non-trivial functional dependency A → B, A is a super key.
4NF    A relation is in 4NF if it is in BCNF and has no multivalued dependency.
5NF    A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining is lossless {i.e. the relation cannot be decomposed further with a lossless join}.

81
Step by step: Candidate key
Consider another relation R(A, B, C, D, E) having the functional dependencies:
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D

• First, find the attribute with no incoming edge, which is A here. A does not appear on the right-hand side of any FD, so it is not determined by any other attribute. It must therefore be part of every candidate key, because no closure of a set of other attributes contains A.
• Now try A alone and find its closure: {A}+ = {A, B, C}
• Now try combinations of A with the other left-hand-side attributes: {AC}+ = {A, B, C}
• Now try {AD}+ = {A, B, C, D, E}
• Now try {AE}+ = {A, B, C, D, E}
• All the cases have now been considered, so the candidate keys of the relation are AD and AE.
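
These steps can be sketched in Python as follows. This is an illustrative sketch, not the slides' own procedure: the names `closure` and `candidate_keys` are assumptions, and FDs are written as pairs of attribute sets. It starts from the attributes that never appear on a right-hand side and adds other attributes until the closure covers the whole relation.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    determined = set().union(*(rhs for _, rhs in fds))
    core = schema - determined      # attributes with no incoming edge
    others = sorted(schema - core)
    keys = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = core | set(extra)
            if any(key <= candidate for key in keys):
                continue            # already contains a smaller key
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return core, keys

schema = set("ABCDE")
fds = [(set("A"), set("BC")), (set("C"), set("B")),
       (set("D"), set("E")), (set("E"), set("D"))]
core, keys = candidate_keys(schema, fds)
print(sorted(core))                        # ['A']
print(["".join(sorted(k)) for k in keys])  # ['AD', 'AE']
```

Starting from the must-have attributes keeps the number of combinations to test small.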

82
Decomposition of relation
2NF
Step 1: Create a separate relation for each partial dependency.
Step 2: Remove the right-hand-side attributes of the partial dependency from the relation that is being decomposed.
Step 3: Always create one table for the candidate key (CK) if it is not already part of a relation.

Example: R(A,B,C)
B → C
Solution:
key: AB
step 1: R1(B,C)
step 2: R(A,B) or R2(A,B)

R
A B C
a 1 X
b 2 Y
a 3 Z
c 3 Z
d 3 Z
e 3 Z

R2
A B
a 1
b 2
a 3
c 3
d 3
e 3

R1
B C
1 X
2 Y
3 Z

Why?
AB is the key
B may be null
So B → C ?
null → C
83
Examples
Assume a relation R (A, B, C, D, E) with the following set of functional dependencies;
F = {AB → C, B → D, E → D}
The key for this relation is ABE. Then, all three given FDs are partial dependencies, viz., AB → C, B
→ D, and E → D.
Step 1: Create separate tables for the partial dependencies; hence, R1 (ABC), R2 (BD) and R3 (ED).

Step 2: Remove the RHS of these three partial FDs from R; hence, R4 (A, B, E).
Step 3: Is the candidate key part of any relation? Yes, R4 (ABE) contains it, so no extra table is needed.

Thus, we have four tables R1 (ABC), R2 (BD), R3 (ED) and R4 (ABE).
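
The three steps can be sketched in Python under simplifying assumptions: the relation has a single candidate key, and every partial dependency is listed directly in F (chains such as AD → GH followed by H → J are not handled). The function name `decompose_2nf` is hypothetical.

```python
def decompose_2nf(schema, fds, key):
    """Follow steps 1-3 for a relation with one candidate key (simplified sketch)."""
    relations = []
    remaining = set(schema)
    for lhs, rhs in fds:
        non_prime = rhs - key
        if lhs < key and non_prime:      # lhs is a proper subset of the key
            relations.append(lhs | rhs)  # Step 1: one relation per partial dependency
            remaining -= non_prime       # Step 2: drop its right-hand side from R
    if not any(remaining <= rel for rel in relations):
        relations.append(remaining)      # Step 3: keep a table that holds the key
    return relations

schema = set("ABCDE")
fds = [(set("AB"), set("C")), (set("B"), set("D")), (set("E"), set("D"))]
result = decompose_2nf(schema, fds, key=set("ABE"))
print(["".join(sorted(rel)) for rel in result])  # ['ABC', 'BD', 'DE', 'ABE']
```

The output corresponds to R1 (ABC), R2 (BD), R3 (ED) and R4 (ABE).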

1. R(A B C D E)
AB → C
D → E
CK = {ABD}
R1(ABC) R2(DE) R3(ABD)

2. R(A B C D E)
A → B
B → E
C → D
CK = {AC}
R1(ABE) R2(CD) R3(AC)

3. R(A B C D E F G H I J)
AB → C
AD → GH
BD → EF
A → I
H → J
CK = {ABD}
R1(ABC) R2(ADGHJ) R3(BDEF) R4(AI) R5(ABD)
84
3NF
Step 1: Create a separate relation for each transitive dependency.
Step 2: Remove the right-hand-side attributes of the transitive dependency from the relation that is being decomposed.
Step 3: Always create one table for the candidate key (CK) if it is not already part of a relation.

Example: R(A,B,C)
A → B
B → C
Solution:
key: A
step 1: R1(B,C)
step 2: R(A,B) or R2(A,B)

R
A B C
a 1 X
b 1 X
c 1 X
d 2 Y
e 2 Y
f 3 Z
g 3 Z

R2
A B
a 1
b 1
c 1
d 2
e 2
f 3
g 3

R1
B C
1 X
2 Y
3 Z

85
Example 1: R(A,B,C,D,E)
A → B
B → E
C → D
Solution:
key: AC
step 1: R1(A,B,E) (still not in 3NF)
--- R11(A,B)
--- R12(B,E)
step 2: R2(C,D)
step 3: R3(A,C)

Example 2: R(A B C D E F G H I J)
AB → C
A → DE
B → F
F → GH
D → IJ
Solution:
key: AB
step 1: R1(A,D,E,I,J) (still not in 3NF)
--- R11(A,D,E)
--- R12(D,I,J)
step 2: R2(B,F,G,H) (still not in 3NF)
--- R21(B,F)
--- R22(F,G,H)
step 3: R3(A,B,C) (AB keeps C with itself)

Example 3: R(A,B,C,D,E)
AB → C
B → D
D → E
Solution:
Key: AB
Step 1: R1(BDE)
--- R11(BD)
--- R12(DE)
Step 2: R2(ABC)

Example 4: R(ABCDEFGHIJ)
AB → C
AD → GH
BD → EF
A → I
H → J
Key: ABD
R1(ABCI) = R11(ABC) R12(AI)
R2(ADGHJ) = R21(ADGH) R22(HJ)
R3(BDEF)
R4(ABD)
86
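
The final relations in these examples can also be reached by a synthesis-style shortcut: make one relation per functional dependency, then add a relation for the key if no relation contains it. The Python sketch below assumes the given FDs already form an irreducible set; the function name `synthesize_3nf` is illustrative.

```python
def synthesize_3nf(fds, key):
    """One relation per FD, plus a key relation if needed (FDs assumed irreducible)."""
    relations = []
    for lhs, rhs in fds:
        rel = lhs | rhs
        if any(rel <= existing for existing in relations):
            continue  # already covered by a larger relation
        relations = [r for r in relations if not r <= rel]
        relations.append(rel)
    if not any(key <= rel for rel in relations):
        relations.append(set(key))
    return relations

def names(relations):
    return ["".join(sorted(rel)) for rel in relations]

# R(A,B,C,D,E) with A → B, B → E, C → D and key AC
fds = [(set("A"), set("B")), (set("B"), set("E")), (set("C"), set("D"))]
result = synthesize_3nf(fds, key=set("AC"))
print(names(result))  # ['AB', 'BE', 'CD', 'AC']
```

For R(A,B,C,D,E) with AB → C, B → D, D → E and key AB, the same function yields ABC, BD and DE, with no separate key relation because ABC already contains the key.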
BCNF
Step 1: Create a separate relation for each violating dependency.
Step 2: Remove the right-hand-side attributes of the dependency from the relation that is being decomposed.
Step 3: Always create one table for the candidate key (CK) if it is not already part of a relation.

Example: R(A,B,C)
AB → C
C → B {violation}
Solution:
key: AB, AC
step 1: R1(C,B)
step 2: R(A,C) or R2(A,C)

We can decompose the relation with reference to only one key; the key selected here is AC.

R
A B C
a 1 X
b 2 Y
c 2 Z
c 2 Z
d 3 W
e 3 W
g 3 W

R2
A C
a X
b Y
c Z
c Z
d W
e W
g W

R1
C B
X 1
Y 2
Z 2
W 3
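
A single decomposition step of this kind can be sketched in Python. This is an illustrative sketch rather than a full BCNF algorithm: it splits on the first violating dependency only, and `bcnf_split` is a hypothetical name.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_split(schema, fds):
    """Split on the first FD whose left side is not a super key, or return None."""
    for lhs, rhs in fds:
        if not rhs <= lhs and closure(lhs, fds) != schema:
            r1 = lhs | rhs              # Step 1: relation for the violating FD
            r2 = schema - (rhs - lhs)   # Step 2: remove its right-hand side from R
            return r1, r2
    return None

schema = set("ABC")
fds = [(set("AB"), set("C")), (set("C"), set("B"))]
r1, r2 = bcnf_split(schema, fds)
print(sorted(r1), sorted(r2))  # ['B', 'C'] ['A', 'C']
```

The result matches R1(C,B) and R2(A,C). Note that AB → C can no longer be checked inside either relation, which is the usual price of a BCNF decomposition.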
87
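Find the highest Normal Form of the relation

1. R(ABCDEFGH)
AB → C
A → DE
B → F
F → GH
Key: AB
1NF

2. R(ABCDE)
CE → D
D → B
C → A
Key: CE
1NF

3. R(ABCDEF)
AB → C
DC → AE
E → F
Key: ABD, BCD
1NF

4. R(ABCDEFGHI)
AB → C
BD → EF
AD → GH
A → I
Key: ABD
1NF

5. R(ABCDE)
AB → CD
D → A
BC → DE
Key: AB, BD, BC
3NF

6. R(ABCDEFGH)
BC → ADE
D → B
Key: BCFGH, CDFGH
1NF (BC → ADE is a partial dependency: BC is a proper subset of the key BCFGH, and A and E are non-prime attributes, so the relation is not in 2NF)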
88
7. R(VWXYZ)
X → YV
Y → Z
Z → Y
VW → X
Key: VW, XW

8. R(ABCDEF)
ABC → D
ABD → E
CD → F
CDF → B
BF → D
Key: ABC, ACD

9. R(ABC)
A → B
B → C
C → A
Key: A, B, C
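
Exercises of this kind can be checked with a short program. The Python sketch below is illustrative only: the function names and the `->` text format read by `parse` are assumptions, and every relation is taken to be in 1NF already. It tests BCNF first, then 3NF, then 2NF.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys

def highest_normal_form(schema, fds):
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    # Non-trivial dependencies whose determinant is not a super key.
    bad = [(lhs, rhs - lhs) for lhs, rhs in fds
           if rhs - lhs and closure(lhs, fds) != schema]
    if not bad:
        return "BCNF"
    if all(rhs <= prime for _, rhs in bad):
        return "3NF"
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if closure(part, fds) - prime:
                    return "1NF"  # part of a key determines a non-prime attribute
    return "2NF"

def parse(text):
    """Turn 'AB->C, A->DE' into [({'A', 'B'}, {'C'}), ({'A'}, {'D', 'E'})]."""
    pairs = (item.split("->") for item in text.split(","))
    return [(set(lhs.strip()), set(rhs.strip())) for lhs, rhs in pairs]

print(highest_normal_form(set("ABCDEFGH"), parse("AB->C, A->DE, B->F, F->GH")))  # 1NF
print(highest_normal_form(set("ABCDE"), parse("AB->CD, D->A, BC->DE")))          # 3NF
print(highest_normal_form(set("ABC"), parse("A->B, B->C, C->A")))                # BCNF
```

The three calls correspond to R(ABCDEFGH) with key AB, R(ABCDE) with keys AB, BD and BC, and the cyclic R(ABC).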

89
4NF
Step 1: Create a separate relation for each violating multivalued dependency.
Step 2: Remove the right-hand-side attribute of the dependency from the relation that is being decomposed.
Step 3: Always create one table for the candidate key (CK) if it is not already part of a relation.

Example: R(Person, Mobile, Hobby)
Person →→ Mobile ("Person multi-determines Mobile")
Person →→ Hobby

Solution:
key: Person
step 1: R1(Person, Mobile)
step 2: R(Person, Hobby)

90
Decomposition of a Relation- The process of breaking up or dividing a single relation into two or more sub relations is called decomposition of a relation.

Goal: elimination of redundancy by decomposing a relation into several relations.


Properties of Decomposition-

The following two properties must be followed when decomposing a given relation-
1. Lossless decomposition-

Lossless decomposition ensures-


• No information is lost from the original relation during decomposition.
• When the sub relations are joined back, the same relation is obtained that was decomposed.
• If the join produces extra tuples (spurious tuples), the decomposition is not lossless.
**Every decomposition must always be lossless.

2. Dependency Preservation-

Dependency preservation ensures-


• None of the functional dependencies that hold on the original relation are lost.
• The sub relations still hold or satisfy the functional dependencies of the original relation.
91
Types of Decomposition- Decomposition of a relation can be completed in the following two ways-

1. Lossless Join Decomposition-


• Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.
• This decomposition is called lossless join decomposition when the join of the sub relations results in the
same relation R that was decomposed.
• For lossless join decomposition, we always have-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R

where ⋈ is a natural join operator


No extraneous tuples should appear after joining of the sub-relations.

92
Consider the following relation R( A , B , C )-

The two sub relations are-

Now, let us check whether this decomposition is lossless or not.
For lossless decomposition, we must have-
R1 ⋈ R2 = R

• This relation is same as the original relation R.


• Thus, we conclude that the above decomposition is lossless
join decomposition.
93
2. Lossy Join Decomposition-

• Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.


• This decomposition is called lossy join decomposition when the join of the sub relations does not result in
the same relation R that was decomposed.
• The natural join of the sub relations is always found to have some extraneous tuples.
• For lossy join decomposition, we always have-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R
where ⋈ is a natural join operator

Consider that this relation is decomposed into two sub relations, R1( A , C ) and R2( B , C )-

94
Now, let us check whether this decomposition is lossy or not.
For lossy decomposition, we must have-
R1 ⋈ R2 ⊃ R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2, we get-

This relation is not the same as the original relation R and contains some extraneous tuples.
Clearly, R1 ⋈ R2 ⊃ R.
Thus, we conclude that the above decomposition is a lossy join decomposition.
NOTE-

• Lossy join decomposition is also known as careless decomposition.


• This is because extraneous tuples get introduced in the natural join of the sub-relations.
• Extraneous tuples make the identification of the original tuples difficult.
95
Lossy decomposition
Ano Balance Bname
A01 5000 Rajkot
A02 5000 Surat

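Decomposed tables (joined on Balance):

Ano Balance
A01 5000
A02 5000

Balance Bname
5000 Rajkot
5000 Surat

Joined table (not the same as the original table; extra records appear):

Ano Balance Bname
A01 5000 Rajkot
A01 5000 Surat (extra record)
A02 5000 Rajkot (extra record)
A02 5000 Surat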

96
Lossless decomposition
Ano Balance Bname
A01 5000 Rajkot
A02 5000 Surat

Decomposed tables (joined on Ano):

Ano Balance
A01 5000
A02 5000

Ano Bname
A01 Rajkot
A02 Surat

Joined table (same as the original table; no extra records):

Ano Balance Bname
A01 5000 Rajkot
A02 5000 Surat
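
The difference between the two decompositions can be reproduced with a small natural-join sketch in Python. The helper names `project`, `natural_join` and `same_rows` are illustrative, and rows are represented as dictionaries.

```python
def project(rows, attrs):
    """Project dict rows onto attrs, dropping duplicate rows."""
    out = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in out:
            out.append(projected)
    return out

def natural_join(left, right):
    """Combine rows that agree on every attribute they have in common."""
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in set(l) & set(r)):
                merged = {**l, **r}
                if merged not in out:
                    out.append(merged)
    return out

def same_rows(a, b):
    order = lambda row: sorted(row.items())
    return sorted(a, key=order) == sorted(b, key=order)

account = [
    {"Ano": "A01", "Balance": 5000, "Bname": "Rajkot"},
    {"Ano": "A02", "Balance": 5000, "Bname": "Surat"},
]

# Splitting on Balance: the shared attribute is not unique, so extra rows appear.
lossy = natural_join(project(account, ["Ano", "Balance"]),
                     project(account, ["Balance", "Bname"]))
# Splitting on Ano: the shared attribute is a key, so the join is exact.
lossless = natural_join(project(account, ["Ano", "Balance"]),
                        project(account, ["Ano", "Bname"]))

print(len(lossy), same_rows(lossy, account))        # 4 False
print(len(lossless), same_rows(lossless, account))  # 2 True
```

The join on Balance returns four rows, two of them spurious, while the join on Ano returns exactly the original two rows.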

97
Dependency Preserving Decomposition
• A decomposition D = { R1, R2, R3, …, Rn } of R is dependency preserving with respect to a set F of functional dependencies if

(F1 ∪ F2 ∪ … ∪ Fn)+ = F+

where Fi is the set of dependencies in F+ that involve only the attributes of Ri.

Consider a relation R with a set F of functional dependencies.
If R is decomposed into R1 with FDs f1 and R2 with FDs f2, then there can be three cases:

(f1 ∪ f2)+ = F+ : Decomposition is dependency preserving.
(f1 ∪ f2)+ ⊂ F+ : Not dependency preserving.
(f1 ∪ f2)+ ⊃ F+ : This case is not possible.

• For dependency preservation, every dependency of R must be satisfied by at least one decomposed table or be derivable from the dependencies of the decomposed tables.
• If a relation R is decomposed into relations R1 and R2, then each dependency of R must either be a part of R1 or R2 or be derivable from the combination of the functional dependencies of R1 and R2.

98
Exercise
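Problem: Let R (A, B, C, D) be a relation with F = {A → B, B → C, C → D, D → B}. R is decomposed into R1(A, B), R2(B, C) and R3(B, D). Check whether the decomposition is dependency preserving or not.
Solution:
Candidate dependencies of each sub relation, checked against F+:

R1(A,B): A → B holds; B → A does not hold. F1 = {A → B}
R2(B,C): B → C holds; C → B holds (C → D and D → B). F2 = {B → C, C → B}
R3(B,D): B → D holds (B → C and C → D); D → B holds. F3 = {B → D, D → B}

(F1 ∪ F2 ∪ F3)+ = F+ (C → D is recovered from C → B and B → D)

Yes, the decomposition is dependency preserving.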

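99
• Problem: Let R (A, B, C, D) be a relation with F = {AB → CD, D → A}. R is decomposed into R1(A, D) and R2(B, C, D). Check whether the decomposition is dependency preserving or not.

Candidate dependencies of each sub relation, checked against F+:

R1(A,D): A → D does not hold; D → A holds. F1 = {D → A}
R2(B,C,D): B → CD, C → BD, D → BC, BC → D and CD → B do not hold; BD → C holds (D → A gives A, then AB → CD gives C). F2 = {BD → C}

AB → CD is not preserved: under F1 ∪ F2, {AB}+ = {A, B}.

No, the decomposition is not dependency preserving.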
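
Both exercises can be checked without listing every projected dependency by hand. The Python sketch below uses the common textbook test that grows the closure of each left side one sub relation at a time; the function names are illustrative, and FDs are written as pairs of attribute sets.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def lost_dependencies(fds, decomposition):
    """Return the FDs of F that cannot be derived from the projected dependencies."""
    lost = []
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for rel in decomposition:
                # What the attributes already reached inside rel determine within rel.
                gained = closure(result & rel, fds) & rel
                if not gained <= result:
                    result |= gained
                    changed = True
        if not rhs <= result:
            lost.append((lhs, rhs))
    return lost

# Exercise 1: F = {A → B, B → C, C → D, D → B}, split into AB, BC, BD
f1 = [(set("A"), set("B")), (set("B"), set("C")),
      (set("C"), set("D")), (set("D"), set("B"))]
print(lost_dependencies(f1, [set("AB"), set("BC"), set("BD")]))  # []

# Exercise 2: F = {AB → CD, D → A}, split into AD, BCD
f2 = [(set("AB"), set("CD")), (set("D"), set("A"))]
lost = lost_dependencies(f2, [set("AD"), set("BCD")])
print(len(lost))  # 1, namely AB → CD
```

An empty result means the decomposition is dependency preserving; any FD returned is one that can only be enforced by joining the sub relations.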

100
Points to remember:
• BCNF decomposition is always lossless but not always dependency preserving.
• Sometimes, going for BCNF may not preserve functional dependencies.
• So, go for BCNF only if the lost functional dependencies are not required else normalize till 3NF only.
• There exist many more normal forms even after BCNF like 4NF and more.
• But in the real world database systems, it is generally not required to go beyond BCNF.
• Lossy decomposition is not allowed in 2NF, 3NF and BCNF.
• So, if the decomposition of a relation has been done in such a way that it is lossy, then the decomposition
will never be in 2NF, 3NF and BCNF.
• Unlike BCNF, Lossless and dependency preserving decomposition into 3NF and 2NF is always possible.
• A prime attribute can be transitively dependent on a key in a 3NF relation.
• A prime attribute can not be transitively dependent on a key in a BCNF relation.
• If a relation consists of only singleton candidate keys and it is in 3NF, then it must also be in BCNF.
• If a relation consists of only one candidate key and it is in 3NF, then the relation must also be in BCNF.

101
