NORMALIZATION IN DATABASE MANAGEMENT SYSTEMS
Database normalization
• Normalization is the process of reorganizing data in a database so that it
meets two basic requirements:
i. There is no redundancy of data.
ii. Data dependencies are logical.
• Normalization usually involves dividing a database into two or
more tables and defining relationships between the tables.
Purpose of normalisation
• Minimise redundancy in data
• Remove insert, delete and update anomalies during database activities
• Reduce the need to reorganise data when it is modified or enhanced.
• Normalisation reduces a complex user view into a number of subgroups
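The anomalies listed above can be made concrete with a small sketch. The rows and the replacement city value are illustrative assumptions, not data from a real system. Because the studio's city is stored on every movie row, changing one row leaves two conflicting cities, and deleting a studio's only movie also loses the studio.

```python
# Hypothetical unnormalised rows: the studio's city is repeated per movie.
rows = [
    {"studio": "Marvel", "movie": "The Avengers", "city": "New York"},
    {"studio": "Marvel", "movie": "Captain America", "city": "New York"},
    {"studio": "DCEU", "movie": "Suicide Squad", "city": "Gotham"},
]

# Update anomaly: only one of the two Marvel rows is changed.
rows[0]["city"] = "Los Angeles"  # assumed new value, for illustration only
marvel_cities = {r["city"] for r in rows if r["studio"] == "Marvel"}

# Delete anomaly: removing DCEU's only movie also removes the studio itself.
rows = [r for r in rows if r["movie"] != "Suicide Squad"]
known_studios = {r["studio"] for r in rows}

print(sorted(marvel_cities))  # conflicting cities now stored for one studio
print(sorted(known_studios))  # studios that are still recorded
```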
Levels of normalization based on the amount of redundancy in the database
• Various levels of normalization are:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
1st Normal Form (1NF)
• First Normal Form requires that all the attributes in a relation have
atomic domains. The values in an atomic domain are indivisible units.
• 1NF decomposition:
a. Place all items that appear in the repeating group in a new table
b. Designate a primary key for each new table produced.
c. Duplicate in the new table the primary key of the table from which the
repeating group was extracted or vice versa.
Example of a table not in 1NF:
Studio   Director      Movies
Marvel   Kevin Feige   The Avengers, Captain America
DCEU     Zack Snyder   Batman Vs Superman, Suicide Squad
• This table contains attribute values which are not single. This is not in
normalised form.
• To make it into 1NF we have to decompose the table into atomic elements.
Table in 1NF after eliminating the repeating group:
Studio   Director      Movies
Marvel   Kevin Feige   The Avengers
Marvel   Kevin Feige   Captain America
DCEU     Zack Snyder   Batman Vs Superman
DCEU     Zack Snyder   Suicide Squad
Now it is in 1NF.
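As a minimal sketch of the 1NF step, the repeating group can be expanded into one row per atomic value. The helper name `to_1nf` and the dictionary layout are assumptions made for this illustration.

```python
def to_1nf(rows, repeating):
    """Expand a list-valued column into one row per atomic value."""
    flat = []
    for row in rows:
        for value in row[repeating]:
            new_row = dict(row)         # copy the non-repeating attributes
            new_row[repeating] = value  # keep a single, atomic value
            flat.append(new_row)
    return flat

unnormalised = [
    {"Studio": "Marvel", "Director": "Kevin Feige",
     "Movies": ["The Avengers", "Captain America"]},
    {"Studio": "DCEU", "Director": "Zack Snyder",
     "Movies": ["Batman Vs Superman", "Suicide Squad"]},
]

flat = to_1nf(unnormalised, "Movies")
for row in flat:
    print(row["Studio"], "|", row["Director"], "|", row["Movies"])
```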
Second Normal Form – 2NF
• Prime attribute − an attribute that is part of a candidate key is known as
a prime attribute.
• Non-prime attribute − an attribute that is not part of any candidate key is
said to be a non-prime attribute.
In second normal form, the table is in 1NF and every non-prime attribute is
fully functionally dependent on the whole of each candidate key; there must
not be any partial dependency on only part of a key.
Example of a table not in 2NF:
Studio   Movie                Budget   City
Marvel   Avengers             100      New York
Marvel   Captain America      120      New York
DCEU     Batman Vs Superman   150      Gotham
DCEU     Suicide Squad        75       Gotham
• Here the primary key is (Studio, Movie), and City depends only on the
Studio and not on the whole key.
• So, this is not in 2NF.
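The partial dependency can be checked against the sample rows with a small helper. This is only a sketch: `fd_holds` is a hypothetical name, and a dependency that holds in one sample of data is evidence of, not proof of, a dependency in the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, val) != val:
            return False
    return True

rows = [
    {"Studio": "Marvel", "Movie": "Avengers", "Budget": 100, "City": "New York"},
    {"Studio": "Marvel", "Movie": "Captain America", "Budget": 120, "City": "New York"},
    {"Studio": "DCEU", "Movie": "Batman Vs Superman", "Budget": 150, "City": "Gotham"},
    {"Studio": "DCEU", "Movie": "Suicide Squad", "Budget": 75, "City": "Gotham"},
]

city_partial = fd_holds(rows, ["Studio"], ["City"])      # part of the key suffices
budget_partial = fd_holds(rows, ["Studio"], ["Budget"])  # needs the whole key
print(city_partial, budget_partial)
```

City is determined by Studio alone, which is the partial dependency, while Budget needs the full (Studio, Movie) key.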
Solution of 2NF
Old Scheme {Studio, Movie, Budget, City}
New Scheme {Movie, Studio, Budget}
New Scheme {Studio, City}
Movie                Studio   Budget
The Avengers         Marvel   100
Captain America      Marvel   120
Batman Vs Superman   DCEU     150
Suicide Squad        DCEU     75

Studio   City
Marvel   New York
DCEU     Gotham
Now both tables are in 2NF.
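One way to sketch this decomposition is with Python's built-in `sqlite3` module. The table and column names (`studio`, `movie`) are assumptions chosen for the illustration. Joining the two tables on the studio reproduces the rows of the old scheme, which suggests that no information was lost by the split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE studio (
    studio TEXT PRIMARY KEY,
    city   TEXT NOT NULL
);
CREATE TABLE movie (
    movie  TEXT NOT NULL,
    studio TEXT NOT NULL REFERENCES studio(studio),
    budget INTEGER NOT NULL,
    PRIMARY KEY (studio, movie)
);
""")
conn.executemany("INSERT INTO studio VALUES (?, ?)",
                 [("Marvel", "New York"), ("DCEU", "Gotham")])
conn.executemany("INSERT INTO movie VALUES (?, ?, ?)", [
    ("The Avengers", "Marvel", 100),
    ("Captain America", "Marvel", 120),
    ("Batman Vs Superman", "DCEU", 150),
    ("Suicide Squad", "DCEU", 75),
])

# Rebuild the old scheme {Studio, Movie, Budget, City} with a join.
rejoined = conn.execute("""
    SELECT m.studio, m.movie, m.budget, s.city
    FROM movie AS m JOIN studio AS s ON s.studio = m.studio
    ORDER BY m.studio, m.movie
""").fetchall()
for row in rejoined:
    print(row)
```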
Third normal form 3 NF
• This form dictates that every non-key attribute of a table must depend
directly on a candidate key, i.e. there can be no interdependencies among
non-key attributes.
• For a table to be in 3NF, there are two requirements:
• The table should be in second normal form.
• No non-key attribute is transitively dependent on the primary key.
Example of a table not in 3NF
Studio      StudioCity   CityTemp
Marvel      New York     96
DCEU        Gotham       99
Fox         New York     96
Paramount   Hollywood    95
Here Studio is the primary key, and both StudioCity and CityTemp depend on
the Studio.
1. Primary Key {Studio}
2. {Studio} → {StudioCity}
3. {StudioCity} → {CityTemp}
4. {Studio} → {CityTemp}
5. CityTemp transitively depends on Studio and hence violates 3NF.
This is called a transitive dependency.
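The transitive dependency can be traced with an attribute-closure computation, sketched here under the assumption that the only given dependencies are Studio → StudioCity and StudioCity → CityTemp. The function name `closure` is chosen for this illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD once its whole left side is already determined.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    ({"Studio"}, {"StudioCity"}),
    ({"StudioCity"}, {"CityTemp"}),
]

from_studio = closure({"Studio"}, fds)
from_city = closure({"StudioCity"}, fds)
print(sorted(from_studio))
print(sorted(from_city))
```

Studio reaches CityTemp only through StudioCity, and StudioCity does not determine Studio, so StudioCity is a non-key attribute that determines another non-key attribute.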
Solution of 3NF
Old Scheme {Studio, StudioCity, CityTemp}
New Scheme {Studio, StudioCity}
New Scheme {StudioCity, CityTemp}
Studio      StudioCity
Marvel      New York
DCEU        Gotham
Fox         New York
Paramount   Hollywood

StudioCity   CityTemp
New York     96
Gotham       99
Hollywood    95
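The two new schemes are projections of the old one. A minimal sketch, using the rows of the table that violates 3NF and a hypothetical helper `project`:

```python
def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples."""
    return sorted({tuple(row[a] for a in attrs) for row in rows})

old_scheme = [
    {"Studio": "Marvel", "StudioCity": "New York", "CityTemp": 96},
    {"Studio": "DCEU", "StudioCity": "Gotham", "CityTemp": 99},
    {"Studio": "Fox", "StudioCity": "New York", "CityTemp": 96},
    {"Studio": "Paramount", "StudioCity": "Hollywood", "CityTemp": 95},
]

studio_city = project(old_scheme, ["Studio", "StudioCity"])
city_temp = project(old_scheme, ["StudioCity", "CityTemp"])
print(studio_city)
print(city_temp)
```

After the split, the temperature of New York is stored once instead of once per studio located there.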
Boyce Codd Normal Form (BCNF) – 3.5NF
• BCNF does not allow dependencies between attributes that belong to
candidate keys.
• BCNF is a refinement of the third normal form: it drops 3NF's restriction
to non-key attributes, so every determinant of a non-trivial functional
dependency must be a candidate key.
• Third normal form and BCNF can differ only when all of the following
conditions are true:
• The table has two or more candidate keys
• At least two of the candidate keys are composed of more than one attribute
• The keys are not disjoint, i.e. the composite candidate keys share some
attributes
Example of BCNF
Scheme {MovieTitle, MovieID, PersonName, Role, Payment}
Key1 {MovieTitle, PersonName}
Key2 {MovieID, PersonName}
MovieTitle           MovieID   PersonName          Role             Payment
The Avengers         M101      Robert Downey Jr.   Tony Stark       200m
The Avengers         M101      Chris Evans         Steve Rogers     120m
Batman Vs Superman   D101      Ben Affleck         Bruce Wayne      180m
Batman Vs Superman   D101      Henry Cavill        Clark Kent       125m
A walk to remember   P101      Mandy Moore         Jamie Sullivan   50m
The dependency between MovieID and MovieTitle violates BCNF.
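A BCNF check can be sketched by testing whether the left side of each dependency is a superkey. The dependency list below is an assumption read off the two keys in the example (MovieID and MovieTitle determine each other, and either one together with PersonName determines Role and Payment); `bcnf_violations` is a hypothetical helper name.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(schema, fds):
    """List the non-trivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != schema
    ]

schema = frozenset({"MovieTitle", "MovieID", "PersonName", "Role", "Payment"})
fds = [
    (frozenset({"MovieID"}), frozenset({"MovieTitle"})),
    (frozenset({"MovieTitle"}), frozenset({"MovieID"})),
    (frozenset({"MovieID", "PersonName"}), frozenset({"Role", "Payment"})),
]

violations = bcnf_violations(schema, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```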
Solution of BCNF
Place the two candidate primary keys in separate entities.
Place each of the remaining data items in one of the resulting entities
according to its dependency on the primary key.
New Scheme {MovieID, PersonName, Role, Payment}
New Scheme {MovieID, MovieTitle}
MovieID   PersonName          Role             Payment
M101      Robert Downey Jr.   Tony Stark       200m
M101      Chris Evans         Steve Rogers     120m
D101      Ben Affleck         Bruce Wayne      180m
D101      Henry Cavill        Clark Kent       125m
P101      Mandy Moore         Jamie Sullivan   50m

MovieID   MovieTitle
M101      The Avengers
D101      Batman Vs Superman
P101      A walk to remember
Fourth Normal Form (4NF)
• Fourth normal form (4NF) is a level of database normalization in which
every non-trivial multivalued dependency is determined by a candidate key.
• It builds on the first three normal forms (1NF, 2NF and 3NF) and the
Boyce-Codd Normal Form (BCNF). In addition to meeting the requirements of
BCNF, a table must not store more than one independent multivalued fact
about the same entity.
Example of 4NF
Scheme {MovieName, ScreeningCity, Genre}
Movie                ScreeningCity   Genre
The Avengers         Los Angeles     Sci-Fi
The Avengers         New York        Sci-Fi
Batman vs Superman   Santa Cruz      Drama
Batman vs Superman   Durham          Drama
A Walk to remember   New York        Romance
• Many movies can have the same genre, and many cities can screen the same
movie.
• So this table violates 4NF.
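A multivalued dependency X →→ Y can be tested on sample rows by checking that, for each value of X, every observed Y value appears with every observed value of the remaining attributes. The helper `mvd_holds` is a hypothetical name, and the check only speaks about the sample rows, not the schema in general.

```python
from collections import defaultdict
from itertools import product

def mvd_holds(rows, x, y, z):
    """Check X ->> Y in rows, where z lists the remaining attributes."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in x)
        groups[key].add((tuple(row[a] for a in y), tuple(row[a] for a in z)))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # Within one X group, Y and Z values must combine freely.
        if pairs != set(product(ys, zs)):
            return False
    return True

rows = [
    {"MovieName": "The Avengers", "ScreeningCity": "Los Angeles", "Genre": "Sci-Fi"},
    {"MovieName": "The Avengers", "ScreeningCity": "New York", "Genre": "Sci-Fi"},
    {"MovieName": "Batman vs Superman", "ScreeningCity": "Santa Cruz", "Genre": "Drama"},
    {"MovieName": "Batman vs Superman", "ScreeningCity": "Durham", "Genre": "Drama"},
    {"MovieName": "A Walk to remember", "ScreeningCity": "New York", "Genre": "Romance"},
]

movie_to_city = mvd_holds(rows, ["MovieName"], ["ScreeningCity"], ["Genre"])
city_to_movie = mvd_holds(rows, ["ScreeningCity"], ["MovieName"], ["Genre"])
print(movie_to_city, city_to_movie)
```

MovieName →→ ScreeningCity holds in these rows although MovieName alone is not a key of the table, which is the situation 4NF rules out.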
Solution of 4NF
Move the two multi-valued relations to separate tables.
Identify a primary key for each of the new entities.
New Scheme {MovieName, ScreeningCity}
New Scheme {MovieName, Genre}
MovieName            ScreeningCity
Batman vs Superman   Santa Cruz
The Avengers         Los Angeles
A Walk to remember   New York
Batman vs Superman   Durham
The Avengers         New York

MovieName            Genre
Batman vs Superman   Drama
The Avengers         Sci-Fi
A Walk to remember   Romance
We split the table into two tables with one multivalued dependency in each.
Fifth normal form
• Fifth normal form (5NF), also known as project-join normal form (PJ/NF), is
a level of database normalization designed to reduce redundancy in
relational databases that record multi-valued facts, by isolating
semantically related multiple relationships.
• A table is said to be in 5NF if and only if every non-trivial join
dependency in it is implied by the candidate keys.
Example of 5NF
Theatre   Company     Movie
T1        Paramount   A walk to remember
T2        Marvel      The Avengers
T2        Marvel      Age of Ultron
T2        Marvel      Dr. Strange
T3        DCEU        Batman Vs Superman
T4        Sony        Spiderman Homecoming
• Here each Movie is related to a Company, and
MVD: Theatre →→ Company, Movie
Solution of 5NF
Theatre   Company
T1        Paramount
T2        Marvel
T3        DCEU
T4        Sony

Theatre   Movie
T2        The Avengers
T1        A walk to remember
T2        Age of Ultron
T2        Dr. Strange
T3        Batman vs Superman
T4        Spiderman Homecoming
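A join dependency holds when joining the projections gives back exactly the original rows. The sketch below checks this for the two tables of the solution; the helper names `project` and `natural_join` and the row representation (frozensets of attribute/value pairs) are assumptions for this illustration, and the check covers only these sample rows.

```python
def project(rows, attrs):
    """Project each row (a frozenset of pairs) onto the given attributes."""
    return {frozenset((a, dict(row)[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Combine rows that agree on every attribute they share."""
    joined = set()
    for left_row in left:
        for right_row in right:
            merged = dict(left_row)
            if all(merged.get(a, v) == v for a, v in right_row):
                merged.update(dict(right_row))
                joined.add(frozenset(merged.items()))
    return joined

table = [
    {"Theatre": "T1", "Company": "Paramount", "Movie": "A walk to remember"},
    {"Theatre": "T2", "Company": "Marvel", "Movie": "The Avengers"},
    {"Theatre": "T2", "Company": "Marvel", "Movie": "Age of Ultron"},
    {"Theatre": "T2", "Company": "Marvel", "Movie": "Dr. Strange"},
    {"Theatre": "T3", "Company": "DCEU", "Movie": "Batman Vs Superman"},
    {"Theatre": "T4", "Company": "Sony", "Movie": "Spiderman Homecoming"},
]
original = {frozenset(row.items()) for row in table}

theatre_company = project(original, ["Theatre", "Company"])
theatre_movie = project(original, ["Theatre", "Movie"])
lossless = natural_join(theatre_company, theatre_movie) == original
print(len(theatre_company), len(theatre_movie), lossless)
```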