Normalization
Overview
What is normalization?
What are the normal forms?
How to normalize relations?
Introduction
Question:
How many, and what, relations (tables) should be
used to store my data?
Is the current relation free of problems?
Stud_ID Stud_Name Course_ID Course_Name Instructor Office Room Credit
224 Waters CIS20 Intro CIS Greene CBA001 205G 5
224 Waters CIS40 Database Mgt Hong CBA908 311S 5
224 Waters CIS50 Sys.Analysis Purao CBA700 139S 5
351 Byron CIS30 COBOL Hong CBA908 629G 3
351 Byron CIS50 Sys.Analysis Purao CBA700 139S 5
421 Smith CIS20 Intro CIS Greene CBA001 205G 5
421 Smith CIS30 COBOL Hong CBA908 629G 3
421 Smith CIS50 Sys.Analysis Purao CBA700 139S 5
Normalization
Normalization is the process of producing a set
of related relations (tables) with desirable
properties, given the data requirements of a
domain
The goal is to remove redundancy and other data
modification (insertion, update and deletion)
problems
Usually done by dividing a table into two or more tables
Using the normal forms as a guide
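The redundancy and update problems named above can be made concrete with a few rows from the introductory student table. This is only a minimal sketch: the rows are a subset of that table's columns, and the new office value CBA999 is a made-up value used for illustration.

```python
# Rows taken from the introductory student/course table (subset of columns).
enrollments = [
    {"Stud_ID": 224, "Course_ID": "CIS50", "Instructor": "Purao", "Office": "CBA700"},
    {"Stud_ID": 351, "Course_ID": "CIS50", "Instructor": "Purao", "Office": "CBA700"},
    {"Stud_ID": 421, "Course_ID": "CIS50", "Instructor": "Purao", "Office": "CBA700"},
]

# Redundancy: the same instructor/office fact is stored once per enrolled student.
copies = sum(1 for row in enrollments if row["Instructor"] == "Purao")

# Update problem: changing the office in only one row leaves the table inconsistent.
enrollments[0]["Office"] = "CBA999"  # hypothetical new office, for illustration only
offices_for_purao = {row["Office"] for row in enrollments if row["Instructor"] == "Purao"}

print(copies)                     # 3
print(sorted(offices_for_purao))  # ['CBA700', 'CBA999']
```

After the partial update, one instructor appears to have two offices. Storing the instructor's office in a separate table would leave a single row to change.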
Normal Forms
Normal forms are guidelines
(steps) for the normalization
process
DK/NF
...
5th Normal Form (5NF)
4th Normal Form (4NF)
Boyce-Codd Normal Form (BCNF)
3rd Normal Form (3NF)
2nd Normal Form (2NF)
1st Normal Form (1NF)
Unnormalized Form (UNF)
Normalization – 1NF
A table is in 1NF if
it satisfies the definition of a relation
Review: what are the characteristics of a
relation?
No “repeating groups” (columns)
Repeating Groups
Customer ID  First Name  Surname    Telephone Number
123          Robert      Ingram     555-861-2025
456          Jane        Wright     555-403-1659
                                    555-776-4100
789          Maria       Fernandez  555-808-9633

Customer ID  First Name  Surname    Tel. No. 1    Tel. No. 2    Tel. No. 3
123          Robert      Ingram     555-861-2025
456          Jane        Wright     555-403-1659  555-776-4100
789          Maria       Fernandez  555-808-9633
Transforming to 1NF
Transforming to rows, rather than columns
Customer ID First Name Surname Telephone Number
123 Robert Ingram 555-861-2025
456 Jane Wright 555-403-1659
456 Jane Wright 555-776-4100
789 Maria Fernandez 555-808-9633
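The transformation to rows can be sketched in a few lines of Python. This is an illustration, not a prescribed method: the list-valued telephone field stands in for the repeating group, and the function name to_first_normal_form is a hypothetical helper.

```python
# Unnormalized rows: the telephone field holds a list (a repeating group).
unnormalized = [
    (123, "Robert", "Ingram", ["555-861-2025"]),
    (456, "Jane", "Wright", ["555-403-1659", "555-776-4100"]),
    (789, "Maria", "Fernandez", ["555-808-9633"]),
]

def to_first_normal_form(rows):
    """Emit one row per (customer, telephone number) pair."""
    return [
        (cust_id, first, surname, phone)
        for cust_id, first, surname, phones in rows
        for phone in phones
    ]

first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row)
```

The result has four rows, with customer 456 appearing twice. Customer ID alone no longer identifies a row, so the key of the 1NF table becomes the pair (Customer ID, Telephone Number).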
Transforming to 1NF: Example
Another example
UNF 1NF
Problems in 1NF
A 1NF table has essentially the same problems as
a spreadsheet table
Redundancy
Data may be inconsistent after modification
Higher Normal Forms
Normal forms higher than 1NF deal with
functional dependency
Identify the normal form level by
analyzing the functional dependencies
between attributes (fields)
Functional Dependency
If each value of attribute A is associated with only
one value of attribute B, we say
A determines B
Or, B is dependent on A
Denoted as: A → B
Functional dependence describes relationships
between attributes (not relations)
Note: A (or B) could be a set of fields
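The definition above can be tested against sample data. The sketch below assumes rows are stored as Python dicts; the function name holds is a hypothetical helper, and the rows are taken from the introductory student table. A check like this can only refute a dependency: a dependency is a rule about the domain, so passing on a sample does not prove it.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs.

    rows: list of dicts; lhs and rhs: tuples of attribute names,
    so either side may be a set of fields.
    """
    seen = {}
    for row in rows:
        a = tuple(row[name] for name in lhs)
        b = tuple(row[name] for name in rhs)
        if seen.setdefault(a, b) != b:
            return False  # same A value paired with two different B values
    return True

rows = [
    {"Course_ID": "CIS20", "Instructor": "Greene", "Office": "CBA001"},
    {"Course_ID": "CIS40", "Instructor": "Hong", "Office": "CBA908"},
    {"Course_ID": "CIS50", "Instructor": "Purao", "Office": "CBA700"},
    {"Course_ID": "CIS30", "Instructor": "Hong", "Office": "CBA908"},
]

print(holds(rows, ("Course_ID",), ("Instructor",)))  # True
print(holds(rows, ("Instructor",), ("Office",)))     # True
print(holds(rows, ("Instructor",), ("Course_ID",)))  # False: Hong teaches CIS40 and CIS30
```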
Functional Dependency Examples
Dependency example
For each SSN, there is only one corresponding
first name (or last name), so:
SSN determines FirstName
SSN → FirstName
Non-dependency example
Each instructor teaches multiple courses, so:
InstructorId does not determine CourseNumber
Functional Dependency and Keys
By definition, a primary key (candidate key)
functionally determines all other attributes
Primary key
Surrogate key
Composite primary key
Dependency diagram
ISBN → Title, PubDate, ListPrice
Functional Dependency Exercise
CustomerNum → CustomerName?
{Street, City, State} → Zip?
CustomerName → Balance?
State (?) Zip
RepNum ( ? ) CustomerName
Normalization – 2NF
A relation is in 2NF, if
It is in 1st normal form, and
All non-key attributes are functionally dependent on the
whole primary key (full dependency)
No partial dependency
This implies that a 1NF relation with a single-attribute
primary key (candidate key) is automatically in 2NF
Partial dependency
A non-key attribute is dependent on part of a primary key
Example: in (A, B, C, D) with primary key {A, B}, B → D is a partial dependency
A Relation in 1NF but Not in 2NF
Partial dependency
Transforming to 2NF
Identify primary key (PK)
If PK consists of only one field, then it is in 2NF
If PK is a composite key, then analyze the functional dependencies
between each part of the primary key and the non-key attributes
Move the attributes involved in a partial dependency to another relation
(A, B, C, D) is decomposed into:
(A, B, C)
(B, D)
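The decomposition step can be sketched with relational projection. The combined order_lines table below is an assumption made for illustration: it plays the role of (A, B, C, D) with Ord_ID as A, Prod_ID as B, Ord_Quan as C and Prod_Desc as D, and its values are borrowed from the furniture order example. The helper name project is hypothetical.

```python
# 1NF relation with composite key (Ord_ID, Prod_ID).
# Prod_Desc depends on Prod_ID alone, which is a partial dependency.
order_lines = [
    {"Ord_ID": 1006, "Prod_ID": 7, "Ord_Quan": 2, "Prod_Desc": "Dining Table"},
    {"Ord_ID": 1006, "Prod_ID": 5, "Ord_Quan": 2, "Prod_Desc": "Writer's Desk"},
    {"Ord_ID": 1006, "Prod_ID": 4, "Ord_Quan": 1, "Prod_Desc": "Entertainment Center"},
    {"Ord_ID": 1007, "Prod_ID": 11, "Ord_Quan": 4, "Prod_Desc": "4-Dr Dresser"},
    {"Ord_ID": 1007, "Prod_ID": 4, "Ord_Quan": 3, "Prod_Desc": "Entertainment Center"},
]

def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

# (A, B, C): attributes that depend on the whole key stay with the key.
order_item = project(order_lines, ("Ord_ID", "Prod_ID", "Ord_Quan"))
# (B, D): the partially dependent attribute moves out with its determinant.
product = project(order_lines, ("Prod_ID", "Prod_Desc"))

print(len(order_item))  # 5
print(len(product))     # 4
```

The description of product 4 was stored twice in the combined table and is stored once after the split.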
Transforming to 2NF: Example
Order
Order_ID  Order_Date  Cust_ID  Cust_Name          Cust_Address
1006      10/24/2004  2        Value Furniture    Plano, TX
1007      10/25/2004  6        Furniture Gallery  Boulder, CO

Product
Prod_ID  Prod_Desc             Prod_Finish    Unit_Price
4        Entertainment Center  Natural Maple  650.00
5        Writer's Desk         Cherry         325.00
7        Dining Table          Natural Ash    800.00
11       4-Dr Dresser          Oak            500.00

OrderItem
Ord_ID  Prod_ID  Ord_Quan
1006    7        2
1006    5        2
1006    4        1
1007    11       4
1007    4        3
Problems in 2NF
Again, redundancy and potential
inconsistency
Order_ID  Order_Date  Cust_ID  Cust_Name          Cust_Address
1006      10/24/2004  2        Value Furniture    Plano, TX
1007      10/25/2004  6        Furniture Gallery  Boulder, CO
1008      11/1/2004   2        Value Furniture    Plano, TX
Normalization – 3NF
A relation is in 3NF, if
It is in 2nd normal form, and
All non-key attributes are functionally dependent on the
candidate keys, and only on the candidate keys
No transitive dependency
Transitive dependency
If A → B and B → C, then A → C
A B C D
A Relation in 2NF but Not in 3NF
Identify the primary key (PK) and look for
transitive dependencies
Order_ID  Order_Date  Cust_ID  Cust_Name          Cust_Address
1006      10/24/2004  2        Value Furniture    Plano, TX
1007      10/25/2004  6        Furniture Gallery  Boulder, CO
1008      11/1/2004   2        Value Furniture    Plano, TX
Transitive dependency: Order_ID → Cust_ID and Cust_ID → Cust_Name, Cust_Address
Transforming to 3NF
Move the attributes involved in transitive
dependency to another relation
Order_ID  Order_Date  Cust_ID  Cust_Name          Cust_Address
1006      10/24/2004  2        Value Furniture    Plano, TX
1007      10/25/2004  6        Furniture Gallery  Boulder, CO
1008      11/1/2004   2        Value Furniture    Plano, TX

Order
Order_ID  Order_Date  Cust_ID
1006      10/24/2004  2
1007      10/25/2004  6
1008      11/1/2004   2

Customer
Cust_ID  Cust_Name          Cust_Address
2        Value Furniture    Plano, TX
6        Furniture Gallery  Boulder, CO
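The same split can be sketched in Python, using the three order rows shown above. The helper name project and the dict-based row layout are assumptions made for illustration. The last step joins the two tables on Cust_ID to check that no information was lost.

```python
# 2NF relation: the customer attributes depend on Cust_ID, not directly on Order_ID.
orders_2nf = [
    {"Order_ID": 1006, "Order_Date": "10/24/2004", "Cust_ID": 2,
     "Cust_Name": "Value Furniture", "Cust_Address": "Plano, TX"},
    {"Order_ID": 1007, "Order_Date": "10/25/2004", "Cust_ID": 6,
     "Cust_Name": "Furniture Gallery", "Cust_Address": "Boulder, CO"},
    {"Order_ID": 1008, "Order_Date": "11/1/2004", "Cust_ID": 2,
     "Cust_Name": "Value Furniture", "Cust_Address": "Plano, TX"},
]

def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

# Cust_ID stays in Order as a foreign key; its dependents move to Customer.
order = project(orders_2nf, ("Order_ID", "Order_Date", "Cust_ID"))
customer = project(orders_2nf, ("Cust_ID", "Cust_Name", "Cust_Address"))

# Joining the two tables on Cust_ID rebuilds the original rows.
customer_by_id = {c["Cust_ID"]: c for c in customer}
rejoined = [{**o, **customer_by_id[o["Cust_ID"]]} for o in order]

print(len(order), len(customer))  # 3 2
print(rejoined == orders_2nf)     # True
```

Customer 2 now appears in one Customer row instead of two Order rows, so a change of address touches a single row.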
Some Practical Tips
If there are attributes of two different entities in one table,
there are usually problems
To identify the normal form of a table, determine the primary key
first; then look for partial dependencies and transitive
dependencies
Generally, relations in third normal form are considered
to be well formed; going higher may introduce unnecessary
structural complexity that makes queries less efficient
Very often tables are kept in lower normal forms
(denormalization), depending on design requirements
Normalization Exercise 1-1
Which normal form is the above table in?
A. 1NF
B. 2NF
C. 3NF
D. UNF
Normalization Exercise 1-2
Which normal form is the above table (2nd one) in?
A. 1NF
B. 2NF
C. 3NF
D. UNF
Normalization Exercise 1-3
Which normal form is the above table (first one) in?
A. 1NF
B. 2NF
C. 3NF
D. UNF
Normalization Exercise 1-4
Final database design with 3 tables
Summary
3NF: If the tables are in 2NF, and every non-key attribute is
dependent on the key, the whole key, and nothing but the key
From 2NF to 3NF: eliminate transitive dependencies
2NF: If the tables are in 1NF, and every non-key attribute is
dependent on the key, the whole key
From 1NF to 2NF: eliminate partial dependencies
1NF: If the tables are relations
From UNF to 1NF: split repeating groups into separate rows
UNF
Summary
Key concepts
Normalization
Normal forms
Functional dependency