Normalization
Definition: Normalization is the process of organizing the attributes and
tables of a database to minimize redundancy and undesirable dependencies
between attributes.
Goal: To reduce data duplication and improve data integrity.
First Normal Form (1NF)
Definition: A table is in 1NF if every column holds only atomic values (no
repeating groups or arrays) and every row is unique.
Example:
Student_ID  Name   Subjects
1           Alice  Math, English
2           Bob    Science, History
1NF Conversion: Break down repeating groups into separate rows.
Student_ID  Name   Subject
1           Alice  Math
1           Alice  English
2           Bob    Science
2           Bob    History
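The 1NF conversion above can be sketched in Python. This is a minimal illustration, assuming the Subjects field is a comma-separated string; the function name to_1nf and the dictionary layout are hypothetical choices made for this sketch.

```python
# Unnormalized rows: the Subjects column packs several values into one field.
students = [
    {"Student_ID": 1, "Name": "Alice", "Subjects": "Math, English"},
    {"Student_ID": 2, "Name": "Bob", "Subjects": "Science, History"},
]


def to_1nf(rows):
    """Emit one row per (student, subject) so every field holds a single value."""
    flat = []
    for row in rows:
        # Assumption: subjects are separated by commas, with optional spaces.
        for subject in row["Subjects"].split(","):
            flat.append({
                "Student_ID": row["Student_ID"],
                "Name": row["Name"],
                "Subject": subject.strip(),
            })
    return flat


rows_1nf = to_1nf(students)
for r in rows_1nf:
    print(r["Student_ID"], r["Name"], r["Subject"])
```

Each input row yields one output row per subject, which matches the converted table shown above.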
Second Normal Form (2NF)
Definition: A table is in 2NF if it is in 1NF and every non-key attribute
depends on the whole primary key (no partial dependency on only part of a
composite key).
Example (for 2NF conversion):
Student_ID  Course   Instructor   Instructor_Contact
1           Math     Dr. Smith    555-1234
1           English  Dr. Johnson  555-5678
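A partial dependency in this example can be spotted with a small check. The sketch below assumes the primary key is the pair (Student_ID, Course), which the notes do not state explicitly. The helper name determines is hypothetical, and the third data row is a made-up addition used only to show the repetition. Note that a check on sample data can only disprove a dependency; it cannot prove that one holds in general.

```python
def determines(rows, lhs, rhs):
    """Return True if equal values in the lhs columns always go with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores val the first time key is seen, then returns the stored value.
        if seen.setdefault(key, val) != val:
            return False
    return True


enrollments = [
    {"Student_ID": 1, "Course": "Math", "Instructor": "Dr. Smith",
     "Instructor_Contact": "555-1234"},
    {"Student_ID": 1, "Course": "English", "Instructor": "Dr. Johnson",
     "Instructor_Contact": "555-5678"},
    # Hypothetical extra row (not in the notes): a second student taking Math.
    {"Student_ID": 2, "Course": "Math", "Instructor": "Dr. Smith",
     "Instructor_Contact": "555-1234"},
]

# Course alone fixes the instructor columns: a partial dependency on the assumed key.
partial = determines(enrollments, ["Course"], ["Instructor", "Instructor_Contact"])

# Student_ID alone does not fix the instructor.
by_student = determines(enrollments, ["Student_ID"], ["Instructor"])

print(partial, by_student)
```

Because Course by itself fixes Instructor and Instructor_Contact, those columns repeat for every student who takes the same course, which is the redundancy 2NF removes.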
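2NF Conversion: Remove the partial dependency. Instructor and
Instructor_Contact depend on Course alone, not on the full key
(Student_ID, Course), so they move to a table keyed by Course.
Table 1: Students
Student_ID  Course
1           Math
1           English
Table 2: Course Instructors
Course   Instructor   Instructor_Contact
Math     Dr. Smith    555-1234
English  Dr. Johnson  555-5678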
Third Normal Form (3NF)
Definition: A table is in 3NF if it is in 2NF and has no transitive
dependencies (i.e., every non-key attribute depends directly on the primary
key, not on another non-key attribute).
Example:
Student_ID  Course   Instructor   Instructor_Dept
1           Math     Dr. Smith    Math Dept
1           English  Dr. Johnson  English Dept
3NF Conversion: Remove transitive dependencies. Instructor_Dept depends on
Instructor, which in turn depends on Course, so each dependency gets its own
table.
Table 1: Students
Student_ID  Course
1           Math
1           English
Table 2: Instructors
Instructor   Department
Dr. Smith    Math Dept
Dr. Johnson  English Dept
Table 3: Courses
Course   Instructor
Math     Dr. Smith
English  Dr. Johnson
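The three tables can be tried out with Python's built-in sqlite3 module. This is a sketch only: the lowercase table and column names, the choice of primary keys, and the foreign-key references are assumptions made for illustration, not part of the notes. The join at the end rebuilds the original four-column table, showing that the split loses no information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE instructors (
        instructor TEXT PRIMARY KEY,
        department TEXT NOT NULL
    );
    CREATE TABLE courses (
        course     TEXT PRIMARY KEY,
        instructor TEXT NOT NULL REFERENCES instructors(instructor)
    );
    CREATE TABLE students (
        student_id INTEGER NOT NULL,
        course     TEXT NOT NULL REFERENCES courses(course),
        PRIMARY KEY (student_id, course)
    );
    INSERT INTO instructors VALUES
        ('Dr. Smith', 'Math Dept'), ('Dr. Johnson', 'English Dept');
    INSERT INTO courses VALUES
        ('Math', 'Dr. Smith'), ('English', 'Dr. Johnson');
    INSERT INTO students VALUES (1, 'Math'), (1, 'English');
""")

# Join the three tables back into the original wide rows.
rebuilt = conn.execute("""
    SELECT s.student_id, s.course, c.instructor, i.department
    FROM students AS s
    JOIN courses AS c ON c.course = s.course
    JOIN instructors AS i ON i.instructor = c.instructor
    ORDER BY s.rowid
""").fetchall()

for row in rebuilt:
    print(row)
```

With this layout, an instructor's department is stored once, so changing it means updating a single row in the instructors table.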
Key Takeaways:
o Normalization reduces redundancy and improves data integrity.
o The first three normal forms are 1NF, 2NF, and 3NF; higher forms such as
BCNF exist but are not covered here.
o Each form builds on the previous one, adding stricter rules and removing
more data anomalies.