0% found this document useful (0 votes)
1 views

Normalization of Database

Database normalization is a technique for organizing data to eliminate redundancy and anomalies through a systematic multi-step process. It includes various normal forms: First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and Boyce-Codd Normal Form (BCNF), each with specific rules to ensure data integrity and logical storage. The document explains these normal forms with examples and emphasizes the importance of removing data duplication and ensuring proper data dependencies.

Uploaded by

Rasel
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
1 views

Normalization of Database

Database normalization is a technique for organizing data to eliminate redundancy and anomalies through a systematic multi-step process. It includes various normal forms: First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and Boyce-Codd Normal Form (BCNF), each with specific rules to ensure data integrity and logical storage. The document explains these normal forms with examples and emphasizes the importance of removing data duplication and ensuring proper data dependencies.

Uploaded by

Rasel
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 10

Normalization of Database

Database Normalization is a technique of organizing the data in the database.


 Normalization is a systematic approach of decomposing tables to eliminate data
redundancy and undesirable characteristics like Insertion, Update and Deletion
Anomalies.
 It is a multi-step process that puts data into tabular form by removing duplicated data
from the relation tables.

Normalization is used for mainly two purpose,


 Eliminating redundant (useless) data
 Ensuring data dependencies make sense i.e data is logically stored

Normalization Rule
Normalization rule are divided into following normal form.
1. First Normal Form
2. Second Normal Form
3. Third Normal Form

Email: meraj09034@gmail.com Page. 1


4. BCNF
First Normal Form (1NF)
As per First Normal Form, no two Rows of data must contain repeating group of information
i.e each set of column must have a unique value, such that multiple columns cannot be used to
fetch the same row. Each table should be organized into rows, and each row should have a
primary key that distinguishes it as unique.
The Primary key is usually a single column, but sometimes more than one column can be
combined to create a single primary key. For example consider a table which is not in First
normal form.

Student Table:
Student Age Subject
Adam 15 Biology, Maths
Alex 14 Maths
Stuart 17 Maths

In First Normal Form, any row must not have a column in which more than one value is saved,
like separated with commas. Rather than that, we must separate such data into multiple rows.

Email: meraj09034@gmail.com Page. 2


Student Table following 1NF will be:
Student Age Subject
Adam 15 Biology
Adam 15 Maths
Alex 14 Maths
Stuart 17 Maths
Using the First Normal Form, data redundancy increases, as there will be many columns with
same data in multiple rows but each row as a whole will be unique.

Second Normal Form (2NF)

Before we learn about the second normal form, we need to understand the following
 Prime attribute − An attribute, which is a part of the prime-key, is known as a prime
attribute.
 Non-prime attribute − An attribute, which is not a part of the prime-key, is said to be a
non-prime attribute.
If we follow second normal form, then every non-prime attribute should be fully
functionally dependent on prime key attribute. That is, if X → A holds, then there should not
be any proper subset Y of X, for which Y → A also holds true.

Email: meraj09034@gmail.com Page. 3


Example:

We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID.
According to the rule, non-key attributes, i.e. Stu_Name and Proj_Name must be dependent
upon both and not on any of the prime key attribute individually. But we find that Stu_Name
can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is
called partial dependency, which is not allowed in Second Normal Form.

Email: meraj09034@gmail.com Page. 4


We broke the relation in two as depicted in the above picture. So there exists no partial
dependency.

Third Normal Form (3NF)

For a relation to be in Third Normal Form, it must be in Second Normal form and the following
must satisfy −
 No non-prime attribute is transitively dependent on prime key attribute.
 For any non-trivial functional dependency, X → A, then either −
o X is a superkey or,
o A is prime attribute.

We find that in the above Student_detail relation, Stu_ID is the key and only prime key
attribute. We find that City can be identified by Stu_ID as well as Zip itself. Neither Zip is a
superkey nor is City a prime attribute. Additionally, Stu_ID → Zip → City, so there
exists transitive dependency.

Email: meraj09034@gmail.com Page. 5


To bring this relation into third normal form, we break the relation into two relations as follows

For example, consider a table with following fields.

Student_Detail Table :
Student_id Student_name DOB Street city State Zip

Email: meraj09034@gmail.com Page. 6


In this table Student_id is Primary key, but street, city and state depends upon Zip. The
dependency between zip and other fields is called transitive dependency. Hence to apply 3NF,
we need to move the street, city and state to new table, with Zip as primary key.
New Student_Detail Table :
Student_id Student_name DOB Zip

Address Table :
Zip Street city state

The advantage of removing transitive dependency is,


 Amount of data duplication is reduced.
 Data integrity achieved.

Example: Suppose a company wants to store the complete address of each employee, they create
a table named employee_details that looks like this:

emp_id emp_name emp_zip emp_state emp_city emp_district


1001 John 282005 UP Agra Dayal Bagh
1002 Ajeet 222008 TN Chennai M-City
1006 Lora 282007 TN Chennai Urrapakkam
1101 Lilly 292008 UK Pauri Bhagwan
1201 Steve 222999 MP Gwalior Ratan

Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip}…so on


Candidate Keys: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime as they are not part of any
candidate keys.

Here, emp_state, emp_city & emp_district dependent on emp_zip. And, emp_zip is dependent on
emp_id that makes non-prime attributes (emp_state, emp_city & emp_district) transitively
dependent on super key (emp_id). This violates the rule of 3NF.

To make this table complies with 3NF we have to break the table into two tables to remove the
transitive dependency:

Email: meraj09034@gmail.com Page. 7


employee table:

emp_id emp_name emp_zip


1001 John 282005
1002 Ajeet 222008
1006 Lora 282007
1101 Lilly 292008
1201 Steve 222999

employee_zip table:

emp_zip emp_state emp_city emp_district

282005 UP Agra Dayal Bagh

222008 TN Chennai M-City

282007 TN Chennai Urrapakkam

292008 UK Pauri Bhagwan

222999 MP Gwalior Ratan

Boyce and Codd Normal Form (BCNF)

Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict terms.
BCNF states that −

 For any non-trivial functional dependency, X → A, X must be a super-key.


In the above image, Stu_ID is the super-key in the relation Student_Detail and Zip is the super-
key in the relation ZipCodes. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

Email: meraj09034@gmail.com Page. 8


Which confirms that both the relations are in BCNF.

Example: Suppose there is a company wherein employees work in more than one department.
They store the data like this:

emp_id emp_nationality emp_dept dept_type dept_no_of_emp


1001 Austrian Production and planning D001 200
1001 Austrian stores D001 250
1002 American design and technical support D134 100
1002 American Purchasing department D134 600

Functional dependencies in the table above:

emp_id -> emp_nationality


emp_dept -> {dept_type, dept_no_of_emp}

Candidate key: {emp_id, emp_dept}


The table is not in BCNF as neither emp_id nor emp_dept alone are keys.

To make the table comply with BCNF we can break the table in three tables like this:
emp_nationality table:

emp_id emp_nationality
1001 Austrian
1002 American
emp_dept table:

emp_dept dept_type dept_no_of_emp


Production and planning D001 200
stores D001 250
design and technical support D134 100
Purchasing department D134 600

emp_dept_mapping table:

emp_id emp_dept
1001 Production and planning

Email: meraj09034@gmail.com Page. 9


1001 stores
1002 design and technical support
1002 Purchasing department

Functional dependencies:

emp_id -> emp_nationality


emp_dept -> {dept_type, dept_no_of_emp}
Candidate keys:
For first table: emp_id
For second table: emp_dept
For third table: {emp_id, emp_dept}

This is now in BCNF as in both the functional dependencies left side part is a key.
Third Normal Form Comparison of BCNF and 3NF
 BCNF or 3NF?
o Relations in BCNF and 3NF
 Relations in BCNF: no repetition of information
 Relations in 3NF: problem of repetition of information
o Decomposition in BCNF and in 3NF
 It is always possible to decompose a relation into relations in 3NF and
 the decomposition is lossless
 dependencies are preserved
 It is always possible to decompose a relation into relations in BCNF and
 the decomposition is lossless
 the information is not repeated

Email: meraj09034@gmail.com Page.


10

You might also like