Database Normalization
What is Database Normalisation? Why do we need Database Normalisation? How do we perform Database Normalisation?
What is Normalisation
Normalisation uses a set of restrictions to exclude the undesirable properties from database design.
Data analysis
Data design
The aim is to determine the logical database structure, represented by a set of relations (tables), their attributes and their keys.
How to do Normalisation
Normalisation produces a more natural representation by applying a sequence of normal forms:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth and Fifth Normal Form (4NF and 5NF)
1NF_CUSTOMER_ORDER
Order_No  Area    S_Off   C_No  C_Name     P_No  P_Name      QTY
01        W York  Leeds   C1    Nippers    P1    Pantaloons  100
01        W York  Leeds   C1    Nippers    P2    Pantaloons  50
02        Middl   Oxford  C2    Tots-Gear  P1    Pantaloons  100
02        Middl   Oxford  C2    Tots-Gear  P5    Pinafore    200
03        Middl   Oxford  C2    Tots-Gear  P3    Socks       50
04        Middl   Oxford  C9    Kid-Naps   P3    Socks       50
An employee is identified by the Employee_No. Name and Home_Address each represent a single-valued fact about the employee.
Employee_No  Name      Home_Address
...          R Bloggs  25 High Street, Leeds
500          J Smith   5 Low Street, Leeds
512          J Smith   25 High Street, Leeds

Employee_No → Name
Employee_No → Home_Address
Employee_No → Name, Home_Address
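X → Y: X determines Y.
??? Name → Employee_No, Name → Home_Address
Neither holds in the employee data: J Smith appears with two employee numbers (500 and 512) and two home addresses.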
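The definition of X → Y can be checked mechanically against sample rows. The following is a minimal sketch, not part of the original slides: the helper name `fd_holds` is hypothetical, and it uses only the two J Smith rows, whose Employee_No values are present in the example. A sample can only refute a dependency; it cannot prove that one holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts).

    The FD holds when every combination of lhs values is paired
    with exactly one combination of rhs values.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Only the two J Smith rows from the employee example.
employees = [
    {"Employee_No": 500, "Name": "J Smith", "Home_Address": "5 Low Street, Leeds"},
    {"Employee_No": 512, "Name": "J Smith", "Home_Address": "25 High Street, Leeds"},
]

print(fd_holds(employees, ["Employee_No"], ["Name", "Home_Address"]))  # True
print(fd_holds(employees, ["Name"], ["Employee_No"]))                  # False
print(fd_holds(employees, ["Name"], ["Home_Address"]))                 # False
```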
The same relation without the rows for order 01:
Order_No  Area    S_Off   C_No  C_Name     P_No  P_Name      QTY
02        Middl   Oxford  C2    Tots-Gear  P1    Pantaloons  100
02        Middl   Oxford  C2    Tots-Gear  P5    Pinafore    200
03        Middl   Oxford  C2    Tots-Gear  P3    Socks       50
04        Middl   Oxford  C9    Kid-Naps   P3    Socks       50
2NF (Method)
The second normalisation of a relation (2NF):
Step 1: for each partial FD X → Y on a key, form a new relation (X, Y), where X is the key.
Step 2: form a relation from the original key and the remaining attributes.
2NF (Example)
Example: attributes (Order_No, Area, S_Off, C_No, C_Name, P_No, P_Name, QTY), with key (Order_No, P_No)
Partial FDs on the key: Order_No → C_No, C_Name, Area, S_Off; P_No → P_Name
Step 1: (X, Y): 2NF_Product(P_No, P_Name); 2NF_CUSTOMER_ORDER(Order_No, C_No, C_Name, Area, S_Off)
Step 2: (original key, others): 2NF_ORDER_LINE(Order_No, P_No, QTY)
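The two steps can be sketched at the schema level. This is an illustrative sketch rather than a general normalisation algorithm: `decompose_2nf` is a hypothetical helper, the FDs are supplied by hand, and the entry (Order_No, P_No) → QTY is an assumption implied by the key rather than stated in the example.

```python
def decompose_2nf(attributes, key, fds):
    """Split a 1NF schema using the partial FDs on its key.

    fds maps a determinant (tuple of attributes) to the list of
    attributes it determines. Returns (key, attributes) pairs,
    one per resulting relation.
    """
    relations = []
    moved = set()
    for lhs, rhs in fds.items():
        # A partial FD has a determinant that is a proper subset of the key.
        if set(lhs) < set(key):
            relations.append((lhs, list(lhs) + list(rhs)))  # Step 1: (X, Y)
            moved.update(rhs)
    # Step 2: the original key plus every attribute that was not moved out.
    rest = [a for a in attributes if a in key or a not in moved]
    relations.append((tuple(key), rest))
    return relations


ATTRIBUTES = ["Order_No", "Area", "S_Off", "C_No", "C_Name", "P_No", "P_Name", "QTY"]
KEY = ("Order_No", "P_No")
FDS = {
    ("Order_No",): ["C_No", "C_Name", "Area", "S_Off"],
    ("P_No",): ["P_Name"],
    ("Order_No", "P_No"): ["QTY"],  # assumed: the full key determines QTY
}

for rel_key, rel_attrs in decompose_2nf(ATTRIBUTES, KEY, FDS):
    print(rel_key, rel_attrs)
```

The three relations printed correspond to 2NF_CUSTOMER_ORDER, 2NF_Product and 2NF_ORDER_LINE.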
2NF (Example)
Results:
1NF_CUSTOMER_ORDER(Order_No, Area, S_Off, C_No, C_Name, P_No, P_Name, QTY)
is replaced by:
2NF_Product(P_No, P_Name)
2NF_CUSTOMER_ORDER(Order_No, C_No, C_Name, Area, S_Off)
2NF_ORDER_LINE(Order_No, P_No, QTY)
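To see what the split does to the data, each 2NF relation can be computed as a projection of the 1NF rows with duplicates removed. This is a sketch under assumptions: the six rows are the sample order data as read from the slide layout, and `project` is a hypothetical helper name.

```python
COLUMNS = ("Order_No", "Area", "S_Off", "C_No", "C_Name", "P_No", "P_Name", "QTY")

# Sample rows of 1NF_CUSTOMER_ORDER (row layout reconstructed, so treat as an assumption).
DATA = [
    ("01", "W York", "Leeds", "C1", "Nippers", "P1", "Pantaloons", 100),
    ("01", "W York", "Leeds", "C1", "Nippers", "P2", "Pantaloons", 50),
    ("02", "Middl", "Oxford", "C2", "Tots-Gear", "P1", "Pantaloons", 100),
    ("02", "Middl", "Oxford", "C2", "Tots-Gear", "P5", "Pinafore", 200),
    ("03", "Middl", "Oxford", "C2", "Tots-Gear", "P3", "Socks", 50),
    ("04", "Middl", "Oxford", "C9", "Kid-Naps", "P3", "Socks", 50),
]


def project(rows, attrs):
    """Relational projection: keep attrs, drop duplicate tuples, preserve order."""
    idx = [COLUMNS.index(a) for a in attrs]
    out = []
    for row in rows:
        t = tuple(row[i] for i in idx)
        if t not in out:
            out.append(t)
    return out


product = project(DATA, ["P_No", "P_Name"])
customer_order = project(DATA, ["Order_No", "C_No", "C_Name", "Area", "S_Off"])
order_line = project(DATA, ["Order_No", "P_No", "QTY"])

print(len(product), len(customer_order), len(order_line))  # 4 4 6
```

Each product name and each order's customer details are now stored once instead of once per order line.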
3NF (Example)
Example:
2NF_CUSTOMER_ORDER(Order_No, C_No, C_Name, Area, S_Off), with Order_No → C_No and C_No → C_Name, Area, S_Off
This relation can be split into two relations: one identified by the key (Order_No) and another identified by the non-key attribute (C_No).
3NF (Method)
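The third normalisation method (3NF):
Step 1: for a transitive FD X → Y → Z, form a new relation (Y, Z), where Y is the key.
Step 2: remove Z from the original relation; Y stays behind as the link to the new relation.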
2NF_CUSTOMER(C_No, C_Name, Area, S_Off)
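The split on C_No can be sketched on the sample data. This is an illustration under assumptions: the four rows are the order and customer values from the running example, and `split_transitive`, `orders` and `customers` are hypothetical names. The rejoin at the end checks that the two relations together still carry the original information.

```python
COLUMNS = ("Order_No", "C_No", "C_Name", "Area", "S_Off")
ROWS = [
    ("01", "C1", "Nippers", "W York", "Leeds"),
    ("02", "C2", "Tots-Gear", "Middl", "Oxford"),
    ("03", "C2", "Tots-Gear", "Middl", "Oxford"),
    ("04", "C9", "Kid-Naps", "Middl", "Oxford"),
]


def split_transitive(rows, columns, y, z):
    """For X -> Y -> Z, return (rows with Z removed, distinct (Y, Z) rows)."""
    yi = columns.index(y)
    zi = [columns.index(a) for a in z]
    keep = [i for i in range(len(columns)) if i not in zi]
    remaining = [tuple(r[i] for i in keep) for r in rows]
    lookup = {}
    for r in rows:
        # One entry per Y value; Y -> Z guarantees the Z values agree.
        lookup[r[yi]] = tuple(r[i] for i in zi)
    new_relation = [(k,) + v for k, v in lookup.items()]
    return remaining, new_relation


orders, customers = split_transitive(ROWS, COLUMNS, "C_No", ["C_Name", "Area", "S_Off"])

# Rejoin on C_No to confirm that nothing was lost by the split.
by_c_no = {c[0]: c[1:] for c in customers}
rejoined = [o + by_c_no[o[1]] for o in orders]

print(orders)
print(customers)
print(rejoined == ROWS)  # True
```

Customer C2 placed two orders, but its name, area and sales office are stored only once after the split.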
P_No  Ware_house  Bin_No  QTY
P2    WH3         B2      3000
P5    WH4         B9      50
P5    WH4         B10     50
P5    WH4         B11     50

Key: (P_No, Bin_No)
If each type of product must be stored at only one site, i.e. P_No → Ware_house, then we have: P_No → Ware_house; (P_No, Bin_No) → QTY
P_No values: P1, P1, P2, P5, P5, P5
Is P_No a key?
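The question can be explored on the sample rows: a key must be unique across rows, while a determinant only needs to fix the value of another attribute. The sketch below is an assumption-laden illustration: it uses only the four stock rows with complete values in the example, and `is_unique` and `fd_holds` are hypothetical helper names. Uniqueness in a sample suggests a candidate key but does not prove one.

```python
STOCK = [
    {"P_No": "P2", "Ware_house": "WH3", "Bin_No": "B2", "QTY": 3000},
    {"P_No": "P5", "Ware_house": "WH4", "Bin_No": "B9", "QTY": 50},
    {"P_No": "P5", "Ware_house": "WH4", "Bin_No": "B10", "QTY": 50},
    {"P_No": "P5", "Ware_house": "WH4", "Bin_No": "B11", "QTY": 50},
]


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    values = [tuple(r[a] for a in attrs) for r in rows]
    return len(values) == len(set(values))


def fd_holds(rows, lhs, rhs):
    """True if each combination of lhs values maps to one combination of rhs values."""
    seen = {}
    for r in rows:
        x = tuple(r[a] for a in lhs)
        y = tuple(r[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


print(is_unique(STOCK, ["P_No"]))               # False: P5 appears three times
print(is_unique(STOCK, ["P_No", "Bin_No"]))     # True
print(fd_holds(STOCK, ["P_No"], ["Ware_house"]))  # True
```

In this sample P_No determines Ware_house without being a key on its own, which is exactly the situation the normal forms are meant to exclude.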
Summary: Normalisation
Unnecessary repetition of data causes an inability to represent information, update anomalies, and excessive database size. Normal forms are restrictions that exclude this unnecessary repetition. Normal forms (NFs) are defined in terms of functional dependence.
Reading: 4.10-4.14
Exercises: 4.8-4.17