Normalization of Database
Normalization Rule
Normalization rules are divided into the following normal forms.
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. Boyce-Codd Normal Form (BCNF)
Student Table:
Student   Age   Subject
Adam      15    Biology, Maths
Alex      14    Maths
Stuart    17    Maths
In First Normal Form, no row may have a column that holds more than one value, such as a
comma-separated list. Such data must instead be split into multiple rows, so Adam's entry in
the Student table becomes one row for Biology and one row for Maths.
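The split into multiple rows can be sketched in Python. This is only an illustration: the rows are those of the Student table, while the function name to_first_normal_form and the dictionary layout are assumptions made for the example.

```python
# Rows of the Student table; Subject holds a comma-separated list for Adam,
# which violates First Normal Form.
students = [
    {"Student": "Adam", "Age": 15, "Subject": "Biology, Maths"},
    {"Student": "Alex", "Age": 14, "Subject": "Maths"},
    {"Student": "Stuart", "Age": 17, "Subject": "Maths"},
]


def to_first_normal_form(rows, column):
    """Return new rows in which `column` holds exactly one value per row."""
    result = []
    for row in rows:
        for value in row[column].split(","):
            new_row = dict(row)              # copy, so the input is untouched
            new_row[column] = value.strip()  # one atomic value per row
            result.append(new_row)
    return result


normalized = to_first_normal_form(students, "Subject")
for row in normalized:
    print(row["Student"], row["Age"], row["Subject"])
```

Adam now appears in two rows, one per subject, and every column holds a single value.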
Before we learn about the Second Normal Form, we need to understand the following terms:
Prime attribute: an attribute that is part of a candidate key.
Non-prime attribute: an attribute that is not part of any candidate key.
In Second Normal Form, every non-prime attribute must be fully functionally dependent on the
candidate key. That is, if X → A holds, where X is the key and A is a non-prime attribute,
then there must be no proper subset Y of X for which Y → A also holds.
In the Student_Project relation, the key attributes are Stu_ID and Proj_ID. According to the
rule, the non-key attributes Stu_Name and Proj_Name must depend on both key attributes
together, not on either one individually. But Stu_Name is determined by Stu_ID alone and
Proj_Name by Proj_ID alone. This is called partial dependency, which is not allowed in Second
Normal Form. Removing it means splitting the relation so that each non-key attribute is stored
with the key attribute that determines it.
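A partial dependency can be spotted mechanically. The sketch below is a simplification that assumes a single candidate key and inspects only the dependencies it is given, without deriving implied ones. The function name partial_dependencies is hypothetical; the attribute names come from the Student_Project relation.

```python
def partial_dependencies(candidate_key, dependencies):
    """List dependencies where a non-prime attribute depends on part of the key.

    `dependencies` is a list of (lhs, rhs) pairs of attribute sets.
    Assumption: one candidate key, and only the listed dependencies are checked.
    """
    key = frozenset(candidate_key)
    found = []
    for lhs, rhs in dependencies:
        lhs = frozenset(lhs)
        non_prime = set(rhs) - key
        if lhs < key and non_prime:  # lhs is a proper subset of the key
            found.append((sorted(lhs), sorted(non_prime)))
    return found


key = {"Stu_ID", "Proj_ID"}
fds = [
    ({"Stu_ID"}, {"Stu_Name"}),    # Stu_ID -> Stu_Name
    ({"Proj_ID"}, {"Proj_Name"}),  # Proj_ID -> Proj_Name
]
violations = partial_dependencies(key, fds)
for lhs, rhs in violations:
    print(lhs, "->", rhs)
```

Both dependencies are reported, which matches the observation that Stu_Name and Proj_Name each depend on only one part of the key.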
For a relation to be in Third Normal Form, it must be in Second Normal Form and satisfy the
following:
o No non-prime attribute is transitively dependent on the key.
o For any non-trivial functional dependency X → A, either X is a superkey or A is a prime
attribute.
In the Student_detail relation, Stu_ID is the key and the only prime attribute. City can be
determined by Stu_ID as well as by Zip. Zip is not a superkey, and City is not a prime
attribute. Additionally, Stu_ID → Zip → City, so a transitive dependency exists.
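The rule "X is a superkey or A is a prime attribute" can be tested with an attribute-closure computation. The following is a minimal sketch, assuming a reduced Student_detail relation with only Stu_ID, Zip and City. The function names are hypothetical, and only the listed dependencies are inspected.

```python
def closure(attributes, dependencies):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_normal_form_violations(relation, candidate_keys, dependencies):
    """Return (X, A) pairs where X is not a superkey and A is not prime."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in dependencies:
        is_superkey = closure(lhs, dependencies) >= set(relation)
        for attribute in sorted(set(rhs) - set(lhs)):  # skip trivial parts
            if not is_superkey and attribute not in prime:
                violations.append((sorted(lhs), attribute))
    return violations


# Reduced relation (assumption): only the attributes named in the text.
relation = {"Stu_ID", "Zip", "City"}
keys = [{"Stu_ID"}]
fds = [({"Stu_ID"}, {"Zip"}), ({"Zip"}, {"City"})]

violations = third_normal_form_violations(relation, keys, fds)
print(violations)
```

The only dependency reported is Zip → City, the one that makes City transitively dependent on Stu_ID.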
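Student_Detail Table:
Student_id   Student_name   DOB   Street   City   State   Zip
To remove the transitive dependency, the address columns move to a separate table keyed by
Zip, and Student_Detail keeps only Zip as a reference to it.
Address Table:
Zip   Street   City   State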
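Moving the address columns into their own table can be illustrated with plain Python dictionaries. This is a sketch under stated assumptions: the sample rows are invented for the example, and the helper names project and join_on_zip are hypothetical.

```python
# Invented sample rows (assumption); column names follow the tables in the text.
student_detail = [
    {"Student_id": 1, "Student_name": "Adam", "DOB": "2008-03-14",
     "Street": "Oak Street", "City": "Springfield", "State": "IL", "Zip": "62701"},
    {"Student_id": 2, "Student_name": "Alex", "DOB": "2009-07-02",
     "Street": "Oak Street", "City": "Springfield", "State": "IL", "Zip": "62701"},
    {"Student_id": 3, "Student_name": "Stuart", "DOB": "2006-11-23",
     "Street": "Elm Avenue", "City": "Madison", "State": "WI", "Zip": "53703"},
]

STUDENT_COLUMNS = ("Student_id", "Student_name", "DOB", "Zip")
ADDRESS_COLUMNS = ("Zip", "Street", "City", "State")


def project(rows, columns):
    """Keep only `columns` from each row and drop duplicate results."""
    unique = []
    for row in rows:
        projected = {column: row[column] for column in columns}
        if projected not in unique:
            unique.append(projected)
    return unique


def join_on_zip(students, addresses):
    """Rebuild the wide rows by looking up each student's Zip."""
    by_zip = {address["Zip"]: address for address in addresses}
    return [{**student, **by_zip[student["Zip"]]} for student in students]


students = project(student_detail, STUDENT_COLUMNS)
addresses = project(student_detail, ADDRESS_COLUMNS)
rejoined = join_on_zip(students, addresses)

print(len(student_detail), "wide rows,", len(addresses), "address rows")
print(rejoined == student_detail)
```

The address shared by two students is stored once in the address table, and joining on Zip reproduces the original rows, so no information is lost by the split.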
Example: Suppose a company wants to store the complete address of each employee and creates
a table named employee_details that looks like this:
Here, emp_state, emp_city and emp_district depend on emp_zip, and emp_zip depends on emp_id.
That makes the non-prime attributes (emp_state, emp_city and emp_district) transitively
dependent on the super key (emp_id), which violates the rule of 3NF.
To make this table comply with 3NF, we have to break it into two tables to remove the
transitive dependency:
employee_zip table:
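Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form.
BCNF states that for every non-trivial functional dependency X → A, X must be a superkey.
In the decomposed student relations, Stu_ID → Zip
and
Zip → City
each have a left-hand side that is the key of its own relation, so both relations are in BCNF.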
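The BCNF condition, that the left-hand side of every non-trivial dependency is a superkey, can be checked relation by relation. The sketch below assumes the reduced attribute set Stu_ID, Zip, City and the relation split used earlier. It is a simplification: only listed dependencies that lie entirely inside a relation are tested. The names closure and is_bcnf are hypothetical.

```python
def closure(attributes, dependencies):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(relation, dependencies):
    """True if every listed non-trivial dependency in `relation` has a superkey on the left."""
    relation = set(relation)
    for lhs, rhs in dependencies:
        lhs, rhs = set(lhs), set(rhs)
        if not (lhs | rhs) <= relation:
            continue  # dependency mentions attributes outside this relation
        if rhs <= lhs:
            continue  # trivial dependency
        if not closure(lhs, dependencies) >= relation:
            return False  # lhs does not determine the whole relation
    return True


fds = [({"Stu_ID"}, {"Zip"}), ({"Zip"}, {"City"})]

print(is_bcnf({"Stu_ID", "Zip", "City"}, fds))  # single wide relation
print(is_bcnf({"Stu_ID", "Zip"}, fds))          # student part after the split
print(is_bcnf({"Zip", "City"}, fds))            # zip-code part after the split
```

The wide relation fails because Zip does not determine Stu_ID, while both relations produced by the split pass.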
Example: Suppose there is a company wherein employees work in more than one department.
They store the data like this:
To make the table comply with BCNF, we can break it into three tables like this:
emp_nationality table:
emp_id   emp_nationality
1001     Austrian
1002     American
emp_dept table:
emp_dept_mapping table:
emp_id   emp_dept
1001     Production and planning
Functional dependencies:
This is now in BCNF because in both functional dependencies the left-hand side is a key.
Comparison of BCNF and 3NF
BCNF or 3NF?
o Relations in BCNF and 3NF
    Relations in BCNF: no repetition of information
    Relations in 3NF: information may still be repeated
o Decomposition in BCNF and in 3NF
    It is always possible to decompose a relation into relations in 3NF such that
        the decomposition is lossless
        dependencies are preserved
    It is always possible to decompose a relation into relations in BCNF such that
        the decomposition is lossless
        the information is not repeated
    A BCNF decomposition does not always preserve dependencies.