Department of Computer Engineering
Faculty of Diploma Studies
UNIT 4
Database Management System
(09CE1302)
- Surabhi Palkar (Assistant Professor)
Normalization of Database Tables
A large database defined as a single relation may result in data
duplication. This repetition of data may result in:
▪ Relations that grow very large.
▪ Data that is hard to maintain and update, because a change
involves searching many records in the relation.
▪ Wasted and poorly utilized disk space and resources.
▪ A higher likelihood of errors and inconsistencies.
To handle these problems, we should analyze and decompose the
relations with redundant data into smaller, simpler, and well-structured
relations that satisfy desirable properties.
Normalization is the process of decomposing relations into relations
with fewer attributes.
Database Management System (09CE1302) Prof. Surabhi Palkar
What is Normalization?
▪ Normalization is the process of organizing the data in the
database.
▪ Normalization is used to minimize the redundancy from a
relation or set of relations. It is also used to eliminate
undesirable characteristics like Insertion, Update, and Deletion
Anomalies.
▪ Normalization divides a larger table into smaller tables and
links them using relationships.
▪ Normal forms are used to reduce redundancy in database
tables.
Why do we need Normalization?
▪ The main reason for normalizing the relations is removing
anomalies.
▪ Failure to eliminate anomalies leads to data redundancy and can
cause data integrity and other problems as the database grows.
▪ Data modification anomalies can be categorized into three types:
▪ Insertion Anomaly: a new tuple cannot be inserted into a
relation because some of the required data is missing.
▪ Deletion Anomaly: deleting data results in the unintended loss
of some other important data.
▪ Update Anomaly: updating a single data value requires
multiple rows of data to be updated.
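The three anomalies can be seen in a small sketch. The relation below (customer_id, store_id, purchase_location) and all of its values are made-up sample data, and the helper names are hypothetical; it is only meant to illustrate the definitions above.

```python
# Hypothetical un-normalized relation: (customer_id, store_id, purchase_location).
# The key is (customer_id, store_id); the sample values are invented.
purchases = [
    (1, 1, "Los Angeles"),
    (1, 3, "San Francisco"),
    (2, 1, "Los Angeles"),
    (3, 2, "New York"),
]

def rows_to_update(rows, store_id):
    """Count the rows that must change when one store's location changes."""
    return sum(1 for row in rows if row[1] == store_id)

def known_locations(rows):
    """Map each store to the location recorded for it in the relation."""
    return {store: location for _, store, location in rows}

# Update anomaly: one logical change touches several rows.
update_count = rows_to_update(purchases, 1)

# Deletion anomaly: deleting customer 3's only purchase also erases
# the only record of where store 2 is located.
after_delete = [row for row in purchases if row[0] != 3]
lost_stores = set(known_locations(purchases)) - set(known_locations(after_delete))

# Insertion anomaly: a new store with no customer yet has no value for
# customer_id, which is part of the key, so the row cannot be inserted.
new_store_row = (None, 4, "Chicago")
can_insert = all(value is not None for value in new_store_row[:2])
```

With this sample data, changing the location of store 1 touches two rows, deleting customer 3 loses store 2, and the new store cannot be inserted.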
Normal Forms
• In database management systems (DBMS), normal forms are a
series of guidelines that help to ensure that the design of a
database is efficient, organized, and free from data
anomalies.
• There are several levels of normalization, each with its own
set of guidelines, known as normal forms.
First Normal Form (1NF)
1NF (First Normal Form) Rules:
• Each table cell should contain a single value.
• Each record needs to be unique.
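As a rough illustration of the two 1NF rules, the sketch below splits a multi-valued cell into one row per value. The column names, the sample rows, and the to_1nf helper are hypothetical and are not part of the course material.

```python
# Hypothetical rows: the emp_phone cell holds several values, which violates 1NF.
unnormalized = [
    {"emp_id": 1, "emp_name": "Asha", "emp_phone": "111, 222"},
    {"emp_id": 2, "emp_name": "Ravi", "emp_phone": "333"},
]

def to_1nf(rows, column):
    """Emit one row per value so that every cell holds a single value."""
    result = []
    for row in rows:
        for value in row[column].split(","):
            new_row = dict(row)
            new_row[column] = value.strip()
            if new_row not in result:  # keep each record unique
                result.append(new_row)
    return result

normalized = to_1nf(unnormalized, "emp_phone")
```

After the split, (emp_id, emp_phone) identifies each record, since emp_id alone is no longer unique.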
Second Normal Form (2NF)
2NF (Second Normal Form) Rules
• Rule 1- Be in 1NF
• Rule 2- No non-key attribute is functionally dependent on a
proper subset of any candidate key (no partial
dependency)
• 2NF eliminates redundant data by requiring that each
non-key attribute be fully dependent on the primary key.
• This means that each non-key column should depend on the
whole primary key, and not on just a part of it.
Second Normal Form (2NF)
• Consider a table with the columns [Customer ID], [Store ID], and
[Purchase Location]. It has a composite primary key [Customer ID,
Store ID]. The non-key attribute is [Purchase Location].
• In this case, [Purchase Location] only depends on [Store ID], which
is only part of the primary key.
• Therefore, this table does not satisfy second normal form.
Second Normal Form (2NF)
• To bring this table to second normal form, we break it into two
tables: one that keeps [Customer ID, Store ID], and [TABLE_STORE], which
holds [Store ID, Purchase Location].
• This removes the partial functional dependency that we initially had.
Now, in the table [TABLE_STORE], the column [Purchase Location] is fully
dependent on the primary key of that table, which is [Store ID].
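The decomposition can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: table_store follows the slide, while the name table_purchase and every sample value are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_store (
    store_id          INTEGER PRIMARY KEY,
    purchase_location TEXT NOT NULL
);
-- table_purchase is a hypothetical name for the table that keeps the key pair.
CREATE TABLE table_purchase (
    customer_id INTEGER NOT NULL,
    store_id    INTEGER NOT NULL REFERENCES table_store(store_id),
    PRIMARY KEY (customer_id, store_id)
);
""")
conn.executemany(
    "INSERT INTO table_store VALUES (?, ?)",
    [(1, "Los Angeles"), (2, "New York"), (3, "San Francisco")],
)
conn.executemany(
    "INSERT INTO table_purchase VALUES (?, ?)",
    [(1, 1), (1, 3), (2, 1), (3, 2)],
)

# A join rebuilds the original three-column table, so no information is lost.
rows = conn.execute("""
    SELECT p.customer_id, p.store_id, s.purchase_location
    FROM table_purchase AS p
    JOIN table_store AS s ON s.store_id = p.store_id
    ORDER BY p.customer_id, p.store_id
""").fetchall()

# Each location is stored once, so changing it is a single-row update.
cursor = conn.execute(
    "UPDATE table_store SET purchase_location = ? WHERE store_id = ?",
    ("San Diego", 1),
)
updated_rows = cursor.rowcount
```

Because each store's location now lives in exactly one row, the update touches one row no matter how many customers bought from that store.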
Third Normal Form (3NF)
3NF (Third Normal Form) Rules
• Rule 1- Be in 2NF
• Rule 2- Has no transitive functional dependencies
A relation is in third normal form if it holds at least one of
the following conditions for every non-trivial functional
dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part
of some candidate key.
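The two conditions can be checked mechanically once the functional dependencies and candidate keys are known. The sketch below uses a hypothetical relation R(A, B, C) with A → B and B → C; the function names closure and violates_3nf are illustrative and not taken from any library.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def violates_3nf(all_attrs, fds, candidate_keys):
    """List the non-trivial dependencies X -> Y that fail both 3NF conditions."""
    prime = set().union(*candidate_keys)  # attributes in some candidate key
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:  # trivial dependency, nothing to check
            continue
        is_super_key = closure(lhs, fds) == set(all_attrs)  # condition 1
        rhs_is_prime = (rhs - lhs) <= prime                 # condition 2
        if not (is_super_key or rhs_is_prime):
            violations.append((lhs, rhs))
    return violations

# Hypothetical relation R(A, B, C) with A -> B and B -> C; candidate key {A}.
attrs = {"A", "B", "C"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
violations = violates_3nf(attrs, fds, [{"A"}])
```

Here B → C is reported: B is not a super key and C is not a prime attribute, which is the transitive dependency A → B → C that 3NF rules out.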
Third Normal Form (3NF)
• Super keys of the employee table:
{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP},
and so on
• Candidate key: {EMP_ID}
• Non-prime attributes: in this table, all attributes except
EMP_ID are non-prime.
Third Normal Form (3NF)
• Here, EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is
dependent on EMP_ID.
• The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively
dependent on the super key (EMP_ID). This violates the rule of third
normal form.
• That is why we need to move EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as its primary key.
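The move to <EMPLOYEE_ZIP> can be sketched with Python's built-in sqlite3 module. The column names follow the slides; the table name employee, the column types, and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_zip (
    emp_zip   TEXT PRIMARY KEY,
    emp_state TEXT NOT NULL,
    emp_city  TEXT NOT NULL
);
-- employee is a hypothetical name for the table that keeps the remaining columns.
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    emp_zip  TEXT NOT NULL REFERENCES employee_zip(emp_zip)
);
""")
# Invented sample data: two employees share one zip code.
conn.executemany(
    "INSERT INTO employee_zip VALUES (?, ?, ?)",
    [("360005", "Gujarat", "Rajkot"), ("400001", "Maharashtra", "Mumbai")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", "360005"), (2, "Ravi", "360005"), (3, "Meera", "400001")],
)

# Joining on EMP_ZIP rebuilds the original five-column table.
employee_view = conn.execute("""
    SELECT e.emp_id, e.emp_name, e.emp_zip, z.emp_state, z.emp_city
    FROM employee AS e
    JOIN employee_zip AS z ON z.emp_zip = e.emp_zip
    ORDER BY e.emp_id
""").fetchall()

# State and city are stored once per zip code, not once per employee.
zip_rows = conn.execute("SELECT COUNT(*) FROM employee_zip").fetchone()[0]
```

In both tables every non-prime attribute now depends directly on that table's key, so the transitive dependency through EMP_ZIP is gone.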