Normalization in DBMS
Submitted by – Anu
Submitted to – Dr. Dinesh Gupta
Definition & Purpose
Understanding Normalization
Normalization is the process of organizing data to
minimize redundancy and improve data integrity. It aids in
efficient data management and helps prevent unwanted
anomalies in data.
Photos provided by Unsplash
Normalization Forms
First Normal Form (1NF)
1NF requires that all attributes must
contain only atomic values, ensuring
that each field only holds a single
value.
Example of first normal form :-
This table is not in first normal form because the “colour” column contains multiple values.
Table_Product
Product_id   Colour        Price
1            Black, Red    Rs. 210
2            Green         Rs. 150
3            Red           Rs. 110
4            Green, Red    Rs. 260
5            Black         Rs. 100
After decomposing it into first normal form it looks like:-
Product_id   Price
1            Rs. 210
2            Rs. 150
3            Rs. 110
4            Rs. 260
5            Rs. 100
Product_id   Colour
1            Black
1            Red
2            Green
3            Red
4            Green
4            Red
5            Black
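The split into atomic values can be sketched in Python. This is a minimal illustration, assuming the multi-valued colours are stored as one comma-separated string; the names `products` and `to_1nf` are hypothetical and not part of the original slides.

```python
# Rows of Table_Product before decomposition; the colour field is multi-valued.
products = [
    (1, "Black, Red", "Rs. 210"),
    (2, "Green", "Rs. 150"),
    (3, "Red", "Rs. 110"),
    (4, "Green, Red", "Rs. 260"),
    (5, "Black", "Rs. 100"),
]


def to_1nf(rows):
    """Split rows into a price table and a colour table holding atomic values."""
    prices = [(pid, price) for pid, _, price in rows]
    colours = [
        (pid, colour.strip())
        for pid, colour_list, _ in rows
        for colour in colour_list.split(",")
    ]
    return prices, colours


product_price, product_colour = to_1nf(products)

for row in product_colour:
    print(row)  # one (product_id, colour) pair per line
```

Each product keeps exactly one price row, while a product with several colours gets one colour row per colour.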
Second Normal Form
Rules of 2NF
A table is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key. This eliminates partial dependency and ensures consistency.
Significance of 2NF
2NF helps to further reduce redundancy by organizing data so that each non-key attribute is identified by the whole key of its table, streamlining database management.
SECOND NORMAL FORM :-
This table has a composite primary key, i.e. (customer_id, store_id). The non-key attribute is location. In this case location depends on store_id, which is only part of the primary key.
Table purchase detail
Customer_id   Store_id   Location
1             1          Patna
1             3          Noida
2             1          Patna
3             2          Delhi
4             3          Noida
After decomposing it into second normal form it looks like:-
Table Purchase
Customer_id   Store_id
1             1
1             3
2             1
3             2
4             3
Table Store
Store_id   Location
1          Patna
2          Delhi
3          Noida
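The partial dependency and the resulting split can be checked on the sample rows with a short Python sketch. The helpers `holds` and `project` are hypothetical names introduced for illustration, and a check on sample data only shows that a dependency is not contradicted by these rows, not that it holds in general.

```python
# Sample rows of the purchase detail table.
purchase_detail = [
    {"customer_id": 1, "store_id": 1, "location": "Patna"},
    {"customer_id": 1, "store_id": 3, "location": "Noida"},
    {"customer_id": 2, "store_id": 1, "location": "Patna"},
    {"customer_id": 3, "store_id": 2, "location": "Delhi"},
    {"customer_id": 4, "store_id": 3, "location": "Noida"},
]


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping first-seen order."""
    out = []
    for row in rows:
        item = tuple(row[a] for a in attrs)
        if item not in out:
            out.append(item)
    return out


# location is determined by store_id alone, i.e. by part of the composite key.
partial_dependency = holds(purchase_detail, ["store_id"], ["location"])

# Decompose: keep the key pair in one table, move location next to store_id.
purchase = project(purchase_detail, ["customer_id", "store_id"])
store = project(purchase_detail, ["store_id", "location"])

print(partial_dependency, purchase, store)
```

After the split, each store's location is stored once instead of once per purchase.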
Third Normal Form
Principles of 3NF
A table is in 3NF if it is in 2NF and all transitive dependencies are removed. This means that no non-key attribute depends on another non-key attribute.
Importance of 3NF
By enforcing 3NF, databases ensure that each fact is stored only once, which minimizes anomalies during data manipulation and enhances overall database efficiency.
THIRD NORMAL FORM :-
In the table, book_id determines genre_id and genre_id determines genre type. Therefore book_id determines genre type via genre_id, and we have a transitive functional dependency.
Table Book Details
Book_id   Genre_id   Genre_type   Price
1         1          Fiction      100
2         2          Sports       110
3         1          Fiction      120
4         3          Travel       130
5         2          Sports       140
After decomposing it into third normal form it looks like:-
Table Book
Book_id   Genre_id   Price
1         1          100
2         2          110
3         1          120
4         3          130
5         2          140
Table Genre
Genre_id   Genre_type
1          Fiction
2          Sports
3          Travel
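The two 3NF tables can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table names `book` and `genre`, the column types, and the foreign key are illustrative choices, not something the slides specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces REFERENCES only when enabled
conn.executescript(
    """
    CREATE TABLE genre (
        genre_id   INTEGER PRIMARY KEY,
        genre_type TEXT NOT NULL
    );
    CREATE TABLE book (
        book_id  INTEGER PRIMARY KEY,
        genre_id INTEGER NOT NULL REFERENCES genre(genre_id),
        price    INTEGER NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO genre VALUES (?, ?)",
    [(1, "Fiction"), (2, "Sports"), (3, "Travel")],
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 2, 110), (3, 1, 120), (4, 3, 130), (5, 2, 140)],
)

# Joining on genre_id rebuilds the rows of the original Book Details table.
rebuilt = conn.execute(
    """
    SELECT b.book_id, b.genre_id, g.genre_type, b.price
    FROM book AS b
    JOIN genre AS g ON g.genre_id = b.genre_id
    ORDER BY b.book_id
    """
).fetchall()

# Renaming a genre now touches one row, however many books use that genre.
rows_changed = conn.execute(
    "UPDATE genre SET genre_type = 'Novel' WHERE genre_id = 1"
).rowcount

print(rebuilt, rows_changed)
```

In the unnormalized table the same rename would have to be applied to every book row with that genre, which is where update anomalies come from.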
Boyce-Codd normal form (BCNF) :-
• It is an advanced version of 3NF, which is why it is also referred to as 3.5NF. BCNF is stricter than 3NF. A table complies with BCNF if it is in 3NF and, for every non-trivial functional dependency X -> Y, X is a superkey of the table.
BCNF :-
KEY: {Student, Course}
Functional dependency :
{Student, Course} -> Teacher
Teacher -> Course
Problem: Teacher is not a superkey but determines Course.
Student   Course   Teacher
Aman      DBMS     AYUSH
Aditya    DBMS     RAJ
Abhinav   E-COMM   RAHUL
Aman      E-COMM   RAHUL
Abhinav   DBMS     RAJ
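The BCNF violation can be confirmed on the sample rows with a small Python sketch. The helpers `holds` and `is_superkey` are hypothetical names for illustration, and they only inspect this sample; real functional dependencies come from the rules of the data, not from one set of rows.

```python
# Sample rows of the Student / Course / Teacher table.
rows = [
    ("Aman", "DBMS", "AYUSH"),
    ("Aditya", "DBMS", "RAJ"),
    ("Abhinav", "E-COMM", "RAHUL"),
    ("Aman", "E-COMM", "RAHUL"),
    ("Abhinav", "DBMS", "RAJ"),
]
COLS = {"Student": 0, "Course": 1, "Teacher": 2}


def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not contradicted by any pair of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[COLS[a]] for a in lhs)
        val = tuple(row[COLS[a]] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def is_superkey(rows, attrs):
    """attrs acts as a superkey here if no two rows agree on all of attrs."""
    keys = [tuple(row[COLS[a]] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


teacher_determines_course = holds(rows, ["Teacher"], ["Course"])
teacher_is_superkey = is_superkey(rows, ["Teacher"])

# BCNF is violated when a determinant of a non-trivial FD is not a superkey.
violates_bcnf = teacher_determines_course and not teacher_is_superkey
print(violates_bcnf)
```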
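After decomposing it into Boyce-Codd normal form it looks like :-
Student   Teacher
Aman      AYUSH
Aditya    RAJ
Abhinav   RAHUL
Aman      RAHUL
Abhinav   RAJ
Teacher   Course
AYUSH     DBMS
RAJ       DBMS
RAHUL     E-COMM
The split follows the violating dependency Teacher -> Course, so Teacher appears in both tables and joining them on Teacher gives back exactly the original rows. Splitting into (Student, Course) and (Course, Teacher) would be lossy, because joining on Course would pair Aman with both AYUSH and RAJ for DBMS.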
Fourth normal form (4NF) :-
• Fourth normal form is a level of database normalization in which a table has no non-trivial multivalued dependencies other than those determined by a candidate key.
• It builds on the first three normal forms (1NF, 2NF, 3NF) and the Boyce-Codd Normal Form (BCNF).
Fourth Normal Form
KEY : {Student, Major, Hobby}
MVD : Student -> -> Major, Student -> -> Hobby
Student   Major        Hobby
Aman      Management   Football
Aman      Management   Cricket
Raj       Management   Football
Raj       Medical      Football
Ram       Management   Cricket
Aditya    Btech        Football
Abhinav   Btech        Cricket
After decomposing it into fourth normal form it looks like:-
Student   Major
Aman      Management
Raj       Management
Raj       Medical
Ram       Management
Aditya    Btech
Abhinav   Btech
Student   Hobby
Aman      Football
Aman      Cricket
Raj       Football
Ram       Cricket
Aditya    Football
Abhinav   Cricket
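A short Python sketch can check that majors and hobbies vary independently per student in the sample rows, and that the two smaller tables join back to the original. The function name `mvd_holds` is hypothetical, and the check only covers this sample data.

```python
# Sample rows of the Student / Major / Hobby table.
original = {
    ("Aman", "Management", "Football"),
    ("Aman", "Management", "Cricket"),
    ("Raj", "Management", "Football"),
    ("Raj", "Medical", "Football"),
    ("Ram", "Management", "Cricket"),
    ("Aditya", "Btech", "Football"),
    ("Abhinav", "Btech", "Cricket"),
}


def mvd_holds(rows):
    """Check that, per student, the rows are every major paired with every hobby."""
    by_student = {}
    for student, major, hobby in rows:
        majors, hobbies, pairs = by_student.setdefault(student, (set(), set(), set()))
        majors.add(major)
        hobbies.add(hobby)
        pairs.add((major, hobby))
    # pairs is always a subset of majors x hobbies, so equal sizes mean equal sets.
    return all(len(p) == len(m) * len(h) for m, h, p in by_student.values())


# Project onto the two smaller tables.
student_major = {(s, m) for s, m, _ in original}
student_hobby = {(s, h) for s, _, h in original}

# Natural join on Student.
rejoined = {
    (s, m, h)
    for s, m in student_major
    for s2, h in student_hobby
    if s == s2
}

print(mvd_holds(original), rejoined == original)
```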
FIFTH NORMAL FORM(5NF)
• A table is said to be in 5NF if and only if:
• It is in 4NF.
• If the table is decomposed further to eliminate redundancy and anomalies, re-joining the decomposed tables by means of candidate keys must not lose any original data or produce any new records.
• In simple words, joining two or more decomposed tables should neither lose records nor create new records.
FIFTH NORMAL FORM
KEY : {Seller, Company, Product}
MVD : Seller -> -> Company, Product
Product is related to Company
Seller    Company             Product
Aman      Coca cola company   Thumps Up
Aditya    Unilever            Ponds
Aditya    Unilever            Axe
Aditya    Unilever            Lakme
Abhinav   P&G                 Vicks
Abhinav   Pepsico             Pepsi
After decomposing it into fifth normal form it looks like:-
Seller    Product
Aman      Thumps Up
Aditya    Ponds
Aditya    Axe
Aditya    Lakme
Abhinav   Vicks
Abhinav   Pepsi
Seller    Company
Aman      Coca cola company
Aditya    Unilever
Abhinav   P&G
Abhinav   Pepsico
Company             Product
Coca cola company   Thumps Up
Unilever            Ponds
Unilever            Axe
Unilever            Lakme
Pepsico             Pepsi
P&G                 Vicks
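The "no lost records, no new records" condition can be checked with a small Python sketch that joins the three tables back together. The set names are hypothetical; the data is the sample from the slides.

```python
# Original Seller / Company / Product rows.
original = {
    ("Aman", "Coca cola company", "Thumps Up"),
    ("Aditya", "Unilever", "Ponds"),
    ("Aditya", "Unilever", "Axe"),
    ("Aditya", "Unilever", "Lakme"),
    ("Abhinav", "P&G", "Vicks"),
    ("Abhinav", "Pepsico", "Pepsi"),
}

# The three pairwise projections.
seller_product = {(s, p) for s, _, p in original}
seller_company = {(s, c) for s, c, _ in original}
company_product = {(c, p) for _, c, p in original}

# Joining only two tables on Seller invents rows for sellers with several companies.
two_way = {
    (s, c, p)
    for s, p in seller_product
    for s2, c in seller_company
    if s == s2
}
spurious = two_way - original

# Joining all three tables filters the invented rows out again.
three_way = {row for row in two_way if (row[1], row[2]) in company_product}

print(sorted(spurious))
print(three_way == original)
```

In this sample the Seller-Product and Seller-Company tables alone are not enough: the Company-Product table is what removes the invented combinations for a seller who works with more than one company.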
Normalization Process
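Step   Description                         Example
1NF    Ensure atomicity                    Split the multi-valued colour column into one row per colour
2NF    Remove partial dependencies         Move location into a separate Store table
3NF    Eliminate transitive dependencies   Move genre type into a separate Genre table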
Benefits of Normalization
Data Integrity
Normalization helps maintain accuracy and consistency of data, preventing data anomalies during data manipulation.
Efficiency in Queries
Normalized databases eliminate unnecessary data duplication, which keeps tables smaller and generally speeds up updates and many lookups, although some queries need extra joins.
Reduced Redundancy
By organizing data efficiently, normalization minimizes the amount of duplicate data stored in the database.
Summary of Benefits
40% Importance of Data Integrity
Normalization significantly improves overall data integrity by minimizing redundancy and potential inconsistencies.
30% Minimizing Duplication
It prevents duplicate data storage, which in turn optimizes storage and reduces update anomalies.
20% Faster Queries
Normalization allows for faster data retrieval and more efficient query execution due to streamlined data structures.
10% Simplified Updates
A normalized database structure is easier to maintain and manage, reducing the potential for errors.
Thank you