DE-Module 4
MODULE DISCUSSION:
Functional Dependency
▪ Definition and Types
▪ Armstrong's axioms / Inference Rules
▪ Closure
• Closure of Attributes set
• Candidate Key
• Minimal Cover / Canonical Cover / Irreducible Set
Normalization
▪ Definition
▪ Anomalies of un-Normalized Relation
▪ Need of Normalization
▪ Benefits of Normalization
▪ Types of Normalization
1. 1NF
2. 2NF
3. 3NF
4. BCNF
5. 4NF
▪ Normal form
• Identifying highest Normal Form
• Dependency Preserving
• Lossless Join Decomposition
• Conversion of Normal Forms
28-03-2025
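FD: α → β (β is dependent on α)
Table 1 (α, β): (2, 3), (1, 4), (2, 6)
α → β does not hold, because α = 2 maps to both 3 and 6.
Table 2 (α, β): (2, 3), (1, 4), (2, 3)
α → β holds, because equal α values always have equal β values.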
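The check on the two sample tables above can be sketched in a few lines of Python. This is only an illustration: the function name `fd_holds` and the column names `a` and `b` (standing in for α and β) are assumptions, not part of the module.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True

# The two sample tables, with a standing for alpha and b for beta
table_1 = [{"a": 2, "b": 3}, {"a": 1, "b": 4}, {"a": 2, "b": 6}]
table_2 = [{"a": 2, "b": 3}, {"a": 1, "b": 4}, {"a": 2, "b": 3}]

print(fd_holds(table_1, ["a"], ["b"]))  # False
print(fd_holds(table_2, ["a"], ["b"]))  # True
```

A check like this can only refute an FD on the given rows; whether the FD holds in general is a property of the schema design.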
Functional dependencies are of two types: Trivial and Non-Trivial.
Trivial FD:-
“An FD α → β is said to be a trivial FD (TFD) if and only if β is a subset of α (β ⊆ α).”
Examples:-
Roll No → Roll No
Roll No, Name → Roll No
A TFD is always valid, so there is no need to check it further.
Advantages of FD:
✔ Functional dependencies help to avoid data redundancy.
✔ They help you to find the facts regarding the database design.
Primary Rules (Armstrong's axioms)
Rule-1 Reflexivity
x→ y if y ⊆ x (i.e. TFD)
Rule-2 Transitivity
if x→ y & y→ z
then x→ z
Rule-3 Augmentation
if x→ y
then xz→ yz (i.e. z is a common attribute(s))
Secondary Rules
Rule-4 Union
if x→ y & x→ z
then x→ yz
Rule-5 Decomposition / Splitting
if x→ yz
then x→ y & x→ z
Rule-6 Pseudo Transitivity
if x→ y & yz→ a
then xz→ a
Rule-7 Composition
if x→ y & w→ v
then xw→ yv
Important terms
Super Key:- “A set of attributes whose closure contains all the attributes of the given relation”.
Candidate Key:- “A minimal super key, i.e. a super key none of whose proper subsets is a super key”.
Prime Attributes:- “A prime attribute is a member of at least one candidate key”.
Non-Prime Attributes:- “A non-prime attribute is not a prime attribute, that is, it is not a member of any candidate key”.
A + ={A,B,C,D,E}
B + ={B,C,D,E}
C + ={C,D,E}
D + ={D,E}
E + ={E}
AD + ={A,D,B,C,E}
Here,
A, AB, AD, AC, AE, ABC, ADE, AEC, ... : all 2^4 = 16 attribute sets that contain A are super keys, and only ‘A’ is the candidate key.
Q.2 R(ABCDE), FD:-{A → B, D → E} find all the super and candidate keys using attribute
closure set?
Is any other candidate key possible?
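The attribute-closure method used in these questions can be sketched as follows. This is a minimal brute-force illustration, not an optimized algorithm; the helper names `closure` and `candidate_keys` and the representation of FDs as `(lhs, rhs)` string pairs are assumptions made for the example. It is run on the relation of Q.2.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Try attribute sets smallest first; keep the minimal super keys."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys

# Q.2: R(ABCDE) with A -> B and D -> E
fds = [("A", "B"), ("D", "E")]
print(sorted(closure("A", fds)))     # ['A', 'B']
print(candidate_keys("ABCDE", fds))  # ['ACD']
```

Every superset of a candidate key returned here is a super key. The search is exponential in the number of attributes, which is acceptable only for small classroom examples.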
Step 2: If none of the prime attributes (here A, C, and D) appears on the right-hand side of any FD, then no further candidate key can be found.
Minimal Cover, Steps:-
1. Apply the splitting rule so that every RHS has a single attribute
2. Remove extraneous attributes from the LHS
3. Remove redundant FDs
FD:-{AB → C, C → AB, B → C, ABC → AC, A → C, AC → B}
Step 1:-{AB→C, C →A, C →B, B →C, ABC →A, ABC →C, A →C, AC →B} # Apply splitting rule
Step 2:-{B→C, C →A, C →B, B →C, A →C, C →B} # Remove extraneous attributes from the LHS (ABC →A and ABC →C become trivial and drop out)
Step 3:-{C →A, B →C, A →C, C →B} # Remove redundant FDs
Minimal cover FD:-{C →A, B →C, A →C, C →B}
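The three steps above can be sketched in Python as follows. This is one possible illustration under stated assumptions: the function names are hypothetical, FDs are `(lhs, rhs)` string pairs, and attributes are tested in alphabetical order. A minimal cover is not unique in general, so a different order can produce a different but equivalent result.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: split every RHS into single attributes.
    split = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Step 2: drop an LHS attribute when the remaining LHS still determines the RHS.
    reduced = []
    for lhs, a in split:
        lhs = set(lhs)
        for attr in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {attr}, split):
                lhs.remove(attr)
        reduced.append((frozenset(lhs), a))
    # Drop trivial FDs and exact duplicates created by step 2.
    unique = []
    for fd in reduced:
        if fd[1] not in fd[0] and fd not in unique:
            unique.append(fd)
    # Step 3: drop an FD when the remaining FDs already imply it.
    cover = list(unique)
    for fd in unique:
        rest = [g for g in cover if g != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return cover

fds = [("AB", "C"), ("C", "AB"), ("B", "C"), ("ABC", "AC"), ("A", "C"), ("AC", "B")]
cover = minimal_cover(fds)
print(sorted("".join(sorted(lhs)) + "->" + a for lhs, a in cover))
# ['A->C', 'B->C', 'C->A', 'C->B']
```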
Normalization
Student
Student ID | Student Name | Credits | Department Name | Building Name | Room No
1 | Rahul | 5 | CSE | B1 | 201
2 | Jiya | 8 | CSE | B1 | 201
3 | Rohan | 7 | EE | B2 | 204
4 | Ipsita | 4 | ECE | B10 | 206
5 | Swati | 5 | IT | B7 | 207
6 | Juli | 6 | ME | B7 | 289
7 | Jagat | 7 | CE | B9 | 278
8 | Chandra | 9 | IT | B7 | 207
Normalization is the process of reducing this data redundancy problem.
Due to data redundancy, there are three main problems in a larger schema:-
1. Insertion Anomaly
2. Update Anomaly
3. Deletion Anomaly
Data Anomaly:- “When the same data is stored in multiple places and we update one copy but forget to update the others, the copies hold different values and we cannot say which one is correct. This is called a data anomaly.”
1. Insertion Anomaly:-
Suppose we want to insert new department information, i.e. department name, building name, and room no. It cannot be stored until some student joins that department, because every row needs a Student ID.
2. Update Anomaly:-
Suppose the CSE department has shifted from B1 to C1 and its Room No from 201 to 203. Every row of a CSE student must be updated; missing one row leaves inconsistent data.
3. Deletion Anomaly:-
Suppose a student has passed out from the CSE department and we want to delete that student's record. If that student is the last one in the department, the department's building and room information is deleted as well.
Normalization is a database design technique that reduces data redundancy and eliminates
undesirable characteristics like Insertion, Update and Deletion Anomalies.
Normalization is the process of making a relation free from insertion, update, and deletion anomalies and saving space by reducing redundant or duplicate data.
How to solve Insertion, Update and Deletion Anomalies?
Normalization rules divide larger tables into smaller tables and link them using relationships.
Student (PK: Student ID, FK: Department Name)
Student ID | Student Name | Credits | Department Name
1 | Rahul | 5 | CSE
2 | Jiya | 8 | CSE
3 | Rohan | 7 | EE
4 | Ipsita | 4 | ECE
5 | Swati | 5 | IT
6 | Juli | 6 | ME
7 | Jagat | 7 | CE
8 | Chandra | 9 | IT
Dept (PK: Department Name)
Department Name | Building Name | Room No
CSE | B1 | 201
EE | B2 | 204
ECE | B10 | 206
ME | B7 | 289
CE | B9 | 278
IT | B7 | 207
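The split into a Student table and a Dept table can be mimicked with plain Python data structures. This is only a sketch of the idea using the sample rows above; the variable names are illustrative assumptions and no real DBMS is involved.

```python
# Rows of the un-normalized Student table from the earlier slide
rows = [
    (1, "Rahul", 5, "CSE", "B1", 201),
    (2, "Jiya", 8, "CSE", "B1", 201),
    (3, "Rohan", 7, "EE", "B2", 204),
    (4, "Ipsita", 4, "ECE", "B10", 206),
    (5, "Swati", 5, "IT", "B7", 207),
    (6, "Juli", 6, "ME", "B7", 289),
    (7, "Jagat", 7, "CE", "B9", 278),
    (8, "Chandra", 9, "IT", "B7", 207),
]

# Student keeps the department name as a foreign key into Dept
student = [(sid, name, cr, d) for sid, name, cr, d, _, _ in rows]
# Dept stores each department's building and room exactly once
dept = {d: (building, room) for _, _, _, d, building, room in rows}
print(len(rows), len(dept))  # 8 6

# The CSE move from the update-anomaly example now touches a single entry
dept["CSE"] = ("C1", 203)
joined = [(sid, name, cr, d) + dept[d] for sid, name, cr, d in student]
print(joined[0])  # (1, 'Rahul', 5, 'CSE', 'C1', 203)
print(joined[1])  # (2, 'Jiya', 8, 'CSE', 'C1', 203)
```

Because the building and room are stored once per department, both CSE students see the new location after one update.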
Benefits of Normalization
▪ Reduces storage space
▪ Reduces data redundancy
▪ Solves insertion, update, and deletion anomalies
▪ Simplifies the queries
▪ It is important for OLTP (Online Transaction Processing) systems, where inserts, updates, deletes and queries are issued very frequently by end users
▪ Simplifies the database structure
Decomposing a large schema is not easy, so the important point is how to decompose the table so that no data is lost. Therefore, we need lossless decomposition.
First Normal Form (1NF)
[Figure: ER diagram. Student (Sid, Sname, Sadd, Phno) takes Course (Cid, Cname). Sadd is a composite attribute made of Hno, City, and State; Phno is a multivalued attribute.]
Definition:- “A relation is in first normal form if and only if the domains of all its attributes are atomic, i.e. each column contains only atomic values.”
Atomic values:- “Values that cannot be further decomposed into smaller pieces by the DBMS.”
Convert to 1NF:- (Student)
First Method
1. Decompose the composite attributes.
2. Make a different tuple for each value of a multivalued attribute.
Rules:-
1. Each column should have unique name
2. No ordering to rows and columns
3. No duplicate rows/ tuples/ records
Second Method
Student
Second Normal Form (2NF)
Rules:-
1. It is in 1NF
2. There is no partial dependency in the relation, i.e. no proper subset of any candidate key determines a non-prime attribute.
Example:- R(ABCDEF), FD:- {A → B, B → C, C → D, D →E}
ABCDEF+ = {A,B,C,D,E,F}
AF+ = {A,B,C,D,E,F}, so the Candidate Key is AF
Prime attributes:- A, F
Non-Prime attributes:- B,C,D,E
A→B (A is a PA and a proper subset of the candidate key, B is an NPA)
Here partial dependency exists so it is not in 2NF
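The partial-dependency test in this example can be sketched in Python. The helpers `closure`, `candidate_keys`, and `partial_dependencies` are hypothetical names for this illustration, and the candidate-key search is a brute force meant only for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys

def partial_dependencies(relation, fds):
    """Proper subsets of a candidate key that determine non-prime attributes."""
    keys = candidate_keys(relation, fds)
    prime = set("".join(keys))
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                extra = closure(part, fds) - set(part) - prime
                if extra:
                    found.append(("".join(part), "".join(sorted(extra))))
    return found

fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
print(candidate_keys("ABCDEF", fds))        # ['AF']
print(partial_dependencies("ABCDEF", fds))  # [('A', 'BCDE')]
```

An empty result from `partial_dependencies` means the relation satisfies the 2NF condition; here A alone determines the non-prime attributes B, C, D, E, so the relation is not in 2NF.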
Q.6 R(A,B,C,D), FD:-{AB →CD, C →A, D →B} Check whether the given relational schema is in
2NF or not ?
Q.7 R(A,B,C,D), FD:-{A →B,B →C, C →D} Check whether the given relational schema is in 2NF
or not ?
Q.8 R(A,B,C,D), FD:-{A →B,B →D} Check whether the given relational schema is in 2NF or not ?
Note:- “If every CK has only a single attribute, then the relation is definitely in 2NF.”
Drawbacks
✔ Data redundancy
✔ Update Anomaly
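Third Normal Form (3NF):-
Student
Sid | Sname | DOB | State | Country | Pincode | Credit
1 | Rahul | 1998 | Odisha | IN | 1224 | C1
[Figure: the data redundancy in this table is removed by splitting it into Student (holding an FK) and Country (holding the PK).]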
Rules:-
1. It is in 2NF
2. It does not contain any transitive dependency for non-prime attributes.
Transitive Dependency
if A →B & B →C, then A →C (here B and C are non-prime attributes)
A table is in 3NF if and only if for each of its non-trivial FDs at least one of the following conditions holds
1. LHS is a Super Key
2. RHS is a prime attribute
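The two-condition 3NF test above can be sketched in Python. The function names are hypothetical, and the sketch checks only the FDs that are listed, one RHS attribute at a time. It is run here on Q.7 and Q.9 of this module.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys

def violations_3nf(relation, fds):
    """FDs whose LHS is not a super key and whose RHS attribute is not prime."""
    prime = set("".join(candidate_keys(relation, fds)))
    bad = []
    for lhs, rhs in fds:
        lhs_is_super_key = closure(lhs, fds) == set(relation)
        for a in rhs:
            if a in lhs:
                continue  # trivial part of the FD
            if not lhs_is_super_key and a not in prime:
                bad.append((lhs, a))
    return bad

# Q.7: R(ABCD) with A -> B, B -> C, C -> D
print(violations_3nf("ABCD", [("A", "B"), ("B", "C"), ("C", "D")]))
# [('B', 'C'), ('C', 'D')]

# Q.9: R(ABCDE) with A -> B, B -> C, C -> D, D -> A
q9 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(candidate_keys("ABCDE", q9))  # ['AE', 'BE', 'CE', 'DE']
print(violations_3nf("ABCDE", q9))  # []
```

In Q.9 every attribute is prime, so condition 2 holds for every FD and no violation is reported.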
Q.7 R(A,B,C,D), FD:- {A → B, B → C, C → D}
Q.8 R(A,B,C,D,E,F), FD:{AB → CDEF, BD → F}
Q.9 R(A,B,C,D,E), FD:-{A →B, B →C, C →D, D →A}
Note:- “Combination of prime and non-prime attributes always tends to non-prime attributes.”
Boyce-Codd Normal Form (BCNF)
3NF cannot solve the data redundancy problem when there are multiple overlapping candidate keys such as
AB
BC
CD
3NF cannot handle this type of case, so we need BCNF.
Rules:-
1. It is in 3NF
2. For each non-trivial FD X →Y, X must be a super key (every candidate key is also a super key).
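Rule 2 can be checked mechanically with attribute closure. The sketch below is an illustration with hypothetical function names; it tests only the listed FDs, and it is run on Q.10 of this module.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def violations_bcnf(relation, fds):
    """Non-trivial FDs whose LHS is not a super key of the relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(relation)
    ]

# Q.10: R(ABCD) with A -> B, B -> C, C -> A
fds = [("A", "B"), ("B", "C"), ("C", "A")]
print(sorted(closure("A", fds)))     # ['A', 'B', 'C']
print(violations_bcnf("ABCD", fds))  # [('A', 'B'), ('B', 'C'), ('C', 'A')]
```

None of A, B, or C reaches D, so none of the three left-hand sides is a super key and every listed FD violates BCNF.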
Q.10 R(A,B,C,D), FD:- {A → B, B → C, C → A}
Q.11 R(A,B,C,D,E), FD:-{A →BCDE, BC →ACE, D →E}
Q.12 R(A,B,C,D,E), FD:- {AB →DCE, D →A} find highest normal form.
Q.13 R(A,B,C,D,E,F,G,H), FD:-{ABC → DE, E →GH, H →G, G →H, ABCD →EF} find
highest normal form.
Normal form hierarchy: 1NF ⊃ 2NF ⊃ 3NF ⊃ BCNF
Q.10 R(A,B,C,D), FD:- {AB → CD, AC → BD, BC → D}
Q.11 R(A,B,C,D,E), FD:-{AB → CDE, D →BE}, CK:- AB,AD
Q.12 R(A,B,C,D,E), FD:- {AE →BC, AC →D, CD →BE, D →E}, CK:- AD, AC, AE
Q.13 R(A,B,C,D), FD:- {AB →C, ABD →C, ABC →D, AC →D}, CK:- AB
Q.14 R(A,B,C,D), FD:- {A →BCD, BC →AD, D →B}, CK:- A, BC, CD
Q.15 R(A,B,C), FD:- {A →B, B →AC}, CK:- A, B
Important Points
❖If all CK are simple or single attribute then it would be in 2NF.
❖If all attributes of a relation are prime attributes, then it would be in 3NF.
❖If relation is in 3NF and all CKs are simple then it is in BCNF
Fourth Normal Form (4NF):-
Rules:-
1. It should be in BCNF
2. It should not have any non-trivial multivalued dependency whose LHS is not a super key
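Decomposition
Decomposition means dividing a relation into multiple sub-relations, and it should be
1. Dependency Preserving
2. Lossless
1. Dependency Preserving
R(A,B,C): (1,1,1), (2,1,2), (3,2,1), (4,2,2)
R1(A,B): (1,1), (2,1), (3,2), (4,2)
R2(B,C): (1,1), (1,2), (2,1), (2,2)
FD:- F={A →B, A →C, A →BC, BC →A}
FD:- F1={A →B}
FD:- F2={ }
Let G = F1 ∪ F2.
If G ≡ F (G covers F and F covers G), then the decomposition is dependency preserving.
Here G = {A →B} does not imply A →C or BC →A, so this decomposition is not dependency preserving.
Dependency preserving means that the dependency set of the original relation is preserved after the decomposition of this relation.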
Example:- R(A,B,C,D,E)
FD:-{A →B, B →C, C →D, D →A}
R1(A,B,C) R2(C,D,E)
For R1(A,B,C):
A+ ={A,B,C,D}, so A →BC
B+ ={B,C,D,A}, so B →CA
C+ ={C,D,A,B}, so C →AB
AB+ ={A,B,C,D} (It is duplicate because A → C and AB → C)
BC+ ={B,C,D,A}
AC+ ={B,C,D,A}
For R2(C,D,E):
C+ ={A,B,C,D}, so C →D
D+ ={A,B,D,C}, so D →C
E+ ={E}
CE+ ={E,C,D,A,B}
FD:-F={A →B, B →C, C →D, D →A}
Check each FD of F against the closures computed from G = F1 ∪ F2:
A+ ={A,B,C,D}
B+ ={A,B,C,D}
C+ ={A,B,C,D}
D+ ={D,C,A,B}
Note:- “Hence G covers F and F covers G. G is a superset of F. Each FD of F is a member of G”.
Sometimes a BCNF decomposition is not dependency preserving, but up to 3NF a dependency-preserving decomposition is always possible.
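The dependency-preservation check can be sketched in Python without listing F1 and F2 explicitly. This is an illustration of one standard way to test it, not necessarily the exact procedure used in the slides: for every FD X → Y in F, X is repeatedly extended using only what each sub-relation can "see", and Y must end up inside the result. The function names are assumptions.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserves_dependencies(fds, parts):
    """True if every FD in fds follows from the FDs projected onto parts."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                # attributes derivable inside this sub-relation only
                gained = closure(result & set(part), fds) & set(part)
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True

# R(ABCDE) split into R1(ABC) and R2(CDE)
cycle = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(preserves_dependencies(cycle, ["ABC", "CDE"]))  # True

# R(ABC) split into R1(AB) and R2(BC)
f = [("A", "B"), ("A", "C"), ("A", "BC"), ("BC", "A")]
print(preserves_dependencies(f, ["AB", "BC"]))        # False
```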
Q.16 R(A,B,C,D,E), FD:- {A →BCD, B →AE, BC →AED, D →E, C →DE}
R1(A,B), R2(B,C), and R3(C,D,E).
Q.17 R(A,B,C,D), FD:- {A →B, C → D}
R1(A,C) and R2(B,D)
Q.19 R(A,B,C,D,E), FD:- {A →BCDE, BC →AED, D →E}
R1(A,B), R2(B,C), and R3(C,D,E)
Q.20 R(A,B,C,D), FD:-{A →B, B →C, C →D, D →A}
R1(A,B), R2(B,C), and R3(C,D)
2. Lossless Join Decomposition
R(A,B,C): (1,1,1), (2,1,2), (3,2,1), (4,3,2)
R1(A,B): (1,1), (2,1), (3,2), (4,3)
R2(B,C): (1,1), (1,2), (2,1), (3,2)
R → Decomposition → R1, R2 → Join → R
Note:- “After the join we must get the same records back: no record lost and no extra records.”
Property 1:- “The union of the attributes of the decomposed relations must be equal to the attributes of the actual relation.”
Attribute(R1) ∪ Attribute(R2) = Attribute(R)
Property 2:- “The decomposed relations must share at least one attribute; otherwise the natural join becomes a Cartesian product.”
Attribute(R1) ∩ Attribute(R2) ≠ Φ
Property 3:- “If the common attributes form a super key of either R1 or R2, then it is definitely a lossless join decomposition.”
Attribute(R1) ∩ Attribute(R2) → Attribute(R1)
OR
Attribute(R1) ∩ Attribute(R2) → Attribute(R2)
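For a decomposition into two relations, Properties 2 and 3 can be tested with attribute closure. The sketch below uses a hypothetical function name `is_lossless` and, as an assumed sample input, the relation R(A,B,C,D) with FDs A → B and B → C that appears in the conversion example of this module.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary test: the common attributes must determine all of r1 or all of r2."""
    common = set(r1) & set(r2)
    if not common:
        return False  # Property 2 fails: the join would be a Cartesian product
    reach = closure(common, fds)
    return set(r1) <= reach or set(r2) <= reach

fds = [("A", "B"), ("B", "C")]
print(is_lossless("ABC", "AD", fds))  # True: common attribute A determines R1
print(is_lossless("AB", "CD", fds))   # False: no common attribute
```

Property 1 (the attribute sets together cover R) is assumed to hold for the inputs and is not re-checked here.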
Example:-
R(A,B,C): (1,1,1), (2,1,2), (3,2,1), (4,3,2)
R1(A,B): (1,1), (2,1), (3,2), (4,3)
R2(A,C): (1,1), (2,2), (3,1), (4,2)
Natural Join R1 ⋈ R2: (1,1,1), (2,1,2), (3,2,1), (4,3,2)
R1 ⋈ R2 = R
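The same comparison can be done at the data level by projecting R and joining the projections again. The helpers `project` and `natural_join` below are small hand-written illustrations, not library functions; the rows are the sample tuples of R(A,B,C) used in this part of the module.

```python
def project(rows, cols):
    """Projection of dict rows onto cols, as a set of tuples."""
    return {tuple(row[c] for c in cols) for row in rows}

def natural_join(left, left_cols, right, right_cols):
    """Natural join of two sets of tuples on their common column names."""
    common = [c for c in left_cols if c in right_cols]
    out_cols = list(left_cols) + [c for c in right_cols if c not in common]
    result = set()
    for lt in left:
        lrow = dict(zip(left_cols, lt))
        for rt in right:
            rrow = dict(zip(right_cols, rt))
            if all(lrow[c] == rrow[c] for c in common):
                merged = {**lrow, **rrow}
                result.add(tuple(merged[c] for c in out_cols))
    return result

R = [dict(zip("ABC", t)) for t in [(1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2)]]
original = project(R, "ABC")

# Split on A (unique in every row): the join gives R back
via_a = natural_join(project(R, "AB"), "AB", project(R, "AC"), "AC")
print(via_a == original)          # True

# Split on B (repeats in two rows): the join creates extra tuples
via_b = natural_join(project(R, "AB"), "AB", project(R, "BC"), "BC")
print(len(via_b), len(original))  # 6 4
```

The two extra tuples in the second case, (1,1,2) and (2,1,1), are the spurious records that make the R1(A,B), R2(B,C) decomposition lossy.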
Example:- R(A,B,C,D,E,F)
FD:-{AB → C,C → D,D → EF, F → A,D → B}
D:-{ABC, CDE, EF}
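When a decomposition has more than two relations, as in this example, the binary test is not enough. One standard technique is the tableau (chase) test, sketched below as an illustration; the slides do not say which method was intended, and the function name `chase_lossless` is an assumption.

```python
def chase_lossless(relation, parts, fds):
    """Tableau test: lossless if some row becomes all distinguished symbols."""
    cols = list(relation)
    # Row i holds the distinguished symbol "a" for attributes of parts[i]
    # and a row-specific symbol (i, attribute) everywhere else.
    rows = [{c: "a" if c in part else (i, c) for c in cols}
            for i, part in enumerate(parts)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r in rows:
                for s in rows:
                    if r is s or any(r[c] != s[c] for c in lhs):
                        continue
                    for c in rhs:
                        if r[c] != s[c]:
                            # Equate the two symbols, keeping "a" if present
                            old, new = (s[c], r[c]) if r[c] == "a" else (r[c], s[c])
                            for t in rows:
                                if t[c] == old:
                                    t[c] = new
                            changed = True
    return any(all(r[c] == "a" for c in cols) for r in rows)

fds = [("AB", "C"), ("C", "D"), ("D", "EF"), ("F", "A"), ("D", "B")]
print(chase_lossless("ABCDEF", ["ABC", "CDE", "EF"], fds))  # False
```

On this input no row of the tableau becomes fully distinguished, so the test reports that the decomposition {ABC, CDE, EF} is not lossless under the given FDs.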
Conversion of Normal Forms (From 1NF to 2NF)
R(A,B,C,D), FD:-{A →B, B →C}
ABCD+ ={A,B,C,D}
AD+ ={A,B,C,D}, so the CK is AD (as A →B, B →C)
So, Prime attributes:- A and D
A →B (PD is present so, this relation is not in 2NF)
A+ ={A,B,C} So decompose into R1(A,B,C) and R2(A,D)
R1(A,B,C):
A+ ={A,B,C} So, A →BC
B+ ={B,C} So, B →C
C+ ={C}
Therefore, F1:-{A →BC, B →C}
Note:- If the CK is a single attribute then there will be no PD, so R1 is in 2NF
R2(A,D):
A+ ={A,B,C}
D+ ={D}
Therefore, F2:-{ }
It was already in BCNF
Check whether the decomposition is dependency preserving or not: F1 ∪ F2 = F = {A →BC, B →C}
It is dependency preserving as well as a lossless join decomposition.
Now we can say the relation is in 2NF (as that is the lowest normal form among R1 and R2).
Q.21 R(A,B,C,D), FD:-{A →B, C → D}
Q.22 R(A,B,C,D), FD:-{A →B, B → C, C → D}
Q.23 R(A,B,C,D,E,F,G,H), FD:-{A →BD,B →C,E →FG,AE →H}
Thank you.