Fundamental of Database
System
Chapter Five
Functional dependency
and
Normalization
Overview
Normalization is the process of efficiently
organizing data in a database.
The normalization process has two goals:
eliminating redundant data (storing the
same data in more than one table)
ensuring data dependencies make sense
(only storing related data in a table)
Together these reduce the amount of space a
database consumes and ensure that data
is logically stored.
Cont…
Normalization reduces data redundancy and improves data integrity.
It gives the database an optimal structure.
It ensures atomic data.
It eliminates data inconsistencies and anomalies:
insertion
deletion
update
Functional
Dependency
Functional dependency: describes the
relationship between attributes in a
relation.
If A and B are attributes of a relation R, B
is functionally dependent on A (denoted A → B)
if each value of A in R is associated with
exactly one value of B in R.
Determinant: the attribute or set of
attributes on the left-hand side of the arrow (here, A).
Identifying the candidate key for a relation:
recognize the attribute (or group of attributes)
that uniquely identifies each row in the
relation. All attributes that are not
part of the primary key (non-primary-key
attributes) should be functionally
dependent on the key.
Functional Dependency
The essence of this idea is that
if the existence of something, call it A,
implies that B must exist and have a
certain value,
then we say that "B is functionally
dependent on A."
We also say "A functionally determines B"
or "B is a function of A."
It is also expressed by the statement
"If A, then B."
In mathematics, it is common to say things
like
"y is a function of x" or "y = f(x)".
The determining value, x, is called the argument;
the determined value, y or f(x), is called the
result.
In a database, functional dependency holds among
attributes:
a set of attributes X functionally determines a
set of attributes Y
if each value of X determines a unique value for Y.
Functional Dependency …
X → Y holds if,
whenever two tuples have the same value
for X,
they must have the same value for Y.
FDs are derived from the real-world
constraints on the attributes.
They are properties of the database
intension (the schema), not its extension (a particular instance).
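The "same X, same Y" rule can be tested mechanically on a given set of tuples. The sketch below is a minimal illustration, assuming a relation is stored as a list of Python dicts; the function name `fd_holds` and the sample rows are hypothetical and not from the slides. Because FDs belong to the intension, such a check can only show that an instance violates a dependency; it cannot prove the dependency holds for every future instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the instance `rows` does not violate lhs -> rhs."""
    seen = {}  # maps each X value to the Y value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True


# Hypothetical sample instance (values invented for illustration).
staff = [
    {"Staff_No": "S1", "SName": "Abebe", "Branch_No": "B1"},
    {"Staff_No": "S2", "SName": "Sara", "Branch_No": "B1"},
    {"Staff_No": "S3", "SName": "Abebe", "Branch_No": "B2"},
]

print(fd_holds(staff, ["Staff_No"], ["Branch_No"]))  # True
print(fd_holds(staff, ["SName"], ["Branch_No"]))     # False
```

The second call fails because two staff members share the name "Abebe" but work in different branches.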
Functional Dependency…
Example
Constraint: the type of wine served depends on the type of dinner.
Dinner    Type of Wine
Meat      Red
Fish      White
Cheese    Rose
We say Wine is functionally dependent on Dinner (Dinner → Wine).
A → B is a partial dependency if there is
some attribute that can be removed from
A and the dependency still holds.
Full functional dependency: if A and B are
attributes of a relation, B is fully
functionally dependent on A if B is
functionally dependent on A but not on any
proper subset of A.
Ex.
{Staff_No, SName} → Branch_No    partial
Staff_No → Branch_No             full
Partial Dependency
If an attribute that is not a member of the
primary key depends on only part of a
composite primary key, then that attribute
is partially functionally dependent on the
primary key.
Let {A, B} be the primary key and C a non-key
attribute.
If {A, B} → C and B → C both hold,
then C is partially functionally
dependent on {A, B}.
Full Functional Dependency
If an attribute that is not a member of the
primary key depends on the whole of a
composite primary key and not on any part of
it, then that attribute is fully functionally
dependent on the primary key.
Let {A, B} be the primary key and C a non-key
attribute.
If {A, B} → C holds, but neither A → C nor
B → C holds,
then C is fully functionally dependent on
{A, B}.
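The partial/full distinction can be checked at the schema level from a declared list of FDs. The following is a sketch under stated assumptions: FDs are written as (left side, right side) pairs of attribute tuples, and the helper names `closure` and `dependency_kind` are invented here. It uses the standard attribute-closure computation and then tries every proper subset of the determinant.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def dependency_kind(lhs, attr, fds):
    """Classify lhs -> attr as 'none', 'partial', or 'full'."""
    if attr not in closure(lhs, fds):
        return "none"
    # A proper subset of the determinant that still determines attr
    # makes the dependency partial.
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if attr in closure(subset, fds):
                return "partial"
    return "full"


# Declared FDs: the Staff example, plus the generic {A, B} -> C case.
fds = [
    (("Staff_No",), ("Branch_No",)),
    (("A", "B"), ("C",)),
]

print(dependency_kind(("Staff_No", "SName"), "Branch_No", fds))  # partial
print(dependency_kind(("Staff_No",), "Branch_No", fds))          # full
print(dependency_kind(("A", "B"), "C", fds))                     # full
```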
Transitive Dependency
In mathematics and logic, a transitive
relationship is:
"If A implies B, and if also B implies C, then A
implies C."
Example:
If Mr X is a Human, and if every Human is an
Animal,
then Mr X must be an Animal.
In general, for functional dependencies:
If A functionally determines B, and
B functionally determines C,
then A functionally determines C.
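Since a functional dependency behaves like a function, transitivity corresponds to composing two functions. The sketch below models each dependency as a Python dict; the attribute names SSN, DNUMBER, DNAME and all values are assumptions chosen for illustration only.

```python
# Hypothetical data: SSN -> DNUMBER and DNUMBER -> DNAME.
ssn_to_dnumber = {"111": 5, "222": 5, "333": 4}
dnumber_to_dname = {5: "Research", 4: "Administration"}

# Composing the two mappings yields the transitive dependency SSN -> DNAME.
ssn_to_dname = {
    ssn: dnumber_to_dname[dnumber]
    for ssn, dnumber in ssn_to_dnumber.items()
}

print(ssn_to_dname["111"])  # Research
print(ssn_to_dname["333"])  # Administration
```

Each SSN still maps to exactly one DNAME, which is what "A functionally determines C" requires.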
Example
The value of an employee's social security
number uniquely determines the employee
name:
SSN → ENAME
ENAME is functionally dependent on SSN.
A combination of SSN and PNUMBER values
uniquely determines the number of hours the
employee currently works on the project:
{SSN, PNUMBER} → HOURS
HOURS is fully functionally dependent on {SSN,
PNUMBER}.
Normalization
Normalization is a step-by-step process.
Normal forms are the output of each
step of normalization.
Steps of normalization:
Unnormalized Form (UNF):
Identify all data elements.
First Normal Form (1NF):
Find the key with which you can find all
data.
Remove any repeating groups.
Normalization…
Second Normal Form (2NF):
Remove part-key dependencies (partial
dependencies).
Make all data dependent on the whole key.
Third Normal Form (3NF):
Remove non-key dependencies (transitive
dependencies).
Make all data dependent on nothing but the key.
All these normal forms are based on the
functional dependencies among the
attributes of a relation.
Normalization…
Normalization of data helps us to
minimize modification anomalies.
PRODUCT
Sid   Pid   Status   City        Quality
01    p1    1        Dire Dawa   100
01    p2    1        Dire Dawa   200
02    p1    1        Dire Dawa   50
Primary key = Sid + Pid
Modification anomalies
Insertion: a failure or difficulty in inserting
some data.
We cannot insert product information until it
has a supplier, or vice versa.
Update: a failure or difficulty in updating some
data.
Changing Sid "01" to "03" requires changing
every PRODUCT record for that supplier.
Delete: a failure or difficulty in deleting some
data.
To delete one supplier we must also delete all
products that are supplied by that supplier.
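The anomalies can be reproduced on the PRODUCT rows with a few lines of Python. This is only a sketch: it assumes City is a fact about the supplier (that is, Sid → City), and the new city value "Harar" and the helper `cities_of` are invented for the illustration.

```python
product = [
    {"Sid": "01", "Pid": "p1", "Status": 1, "City": "Dire Dawa", "Quality": 100},
    {"Sid": "01", "Pid": "p2", "Status": 1, "City": "Dire Dawa", "Quality": 200},
    {"Sid": "02", "Pid": "p1", "Status": 1, "City": "Dire Dawa", "Quality": 50},
]


def cities_of(rows, sid):
    """Set of distinct City values stored for one supplier."""
    return {r["City"] for r in rows if r["Sid"] == sid}


# Update anomaly: supplier 01 moves, but only one of its rows is changed,
# so the table now stores two conflicting cities for the same supplier.
updated = [dict(r) for r in product]
updated[0]["City"] = "Harar"
print(len(cities_of(updated, "01")))  # 2

# Deletion anomaly: removing supplier 02's only product row also removes
# everything the table recorded about supplier 02.
remaining = [r for r in product if not (r["Sid"] == "02" and r["Pid"] == "p1")]
print(cities_of(remaining, "02"))  # set()
```

Storing supplier facts once, in a table keyed by Sid alone, would remove both problems.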
Normal forms
First Normal Form (1NF)
Disallows
composite attributes, multivalued attributes, and
nested relations: attributes whose values for an
individual tuple are non-atomic (sets or lists of values).
Two ways to reach 1NF:
1. Put each repeating group into a separate table and
connect them with a primary key-foreign key
relationship.
2. Move the repeating groups to new rows by
repeating the non-repeating attributes, known as
"flattening" the table. Then find the key with which
you can find all data.
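The second way, flattening, can be sketched as follows. The nested input format (a `courses` list inside each student record) is an assumption made for this example; the attribute names follow the student table used in these slides.

```python
# Hypothetical unnormalized records: "courses" is a repeating group of
# (course id, grade) pairs stored inside each student record.
unnormalized = [
    {"Stud_id": 1, "Stud_name": "Abdi", "courses": [(10, "A"), (20, "A")]},
    {"Stud_id": 3, "Stud_name": "Tsion", "courses": [(10, "C")]},
]


def flatten(records):
    """Emit one row per (student, course), repeating the student attributes."""
    rows = []
    for rec in records:
        for course_id, grade in rec["courses"]:
            rows.append({
                "Stud_id": rec["Stud_id"],
                "Stud_name": rec["Stud_name"],
                "Cours_id": course_id,
                "Grade": grade,
            })
    return rows


flat = flatten(unnormalized)
print(len(flat))  # 3
```

After flattening, no single attribute identifies a row any more; the key becomes the combination {Stud_id, Cours_id}.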
First Normal Form (1NF)
Stud_id  Cours_id  Stud_name  Cours_name  Grade  Teacher  Phone no.
1        10        Abdi       Db          A      F        0910234567
2        10        Chala      Db          B      F        0910234567
1        20        Abdi       Java        A      X        0911234567
2        20        Chala      Java        A      X        0911234567
3        10        Tsion      Db          C      F        0910234567
4        10        Hana       Db          C      F        0910234567
4        20        Hana       Java        C      X        0911234567

Stud_id  Stud_name
1        Abdi
2        Chala
3        Tsion
4        Hana

Cours_id  Cours_name  Teacher  Phone no.
10        Db          F        0910234567
20        Java        X        0911234567

Stud_id  Cours_id  Grade
1        10        A
2        10        B
1        20        A
2        20        A
3        10        C
4        10        C
4        20        C

Teacher  Phone no.
F        0910234567
X        0911234567
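The smaller tables above are projections of the flat table with duplicate rows removed. The sketch below reproduces that step; the function `project` and the column name `Phone` are naming choices of this example, and the course table here keeps only Teacher as a reference to the teacher table.

```python
def project(rows, attrs):
    """Relational projection: keep `attrs` and drop duplicate rows."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


columns = ["Stud_id", "Cours_id", "Stud_name", "Cours_name",
           "Grade", "Teacher", "Phone"]
data = [
    (1, 10, "Abdi", "Db", "A", "F", "0910234567"),
    (2, 10, "Chala", "Db", "B", "F", "0910234567"),
    (1, 20, "Abdi", "Java", "A", "X", "0911234567"),
    (2, 20, "Chala", "Java", "A", "X", "0911234567"),
    (3, 10, "Tsion", "Db", "C", "F", "0910234567"),
    (4, 10, "Hana", "Db", "C", "F", "0910234567"),
    (4, 20, "Hana", "Java", "C", "X", "0911234567"),
]
flat = [dict(zip(columns, row)) for row in data]

student = project(flat, ["Stud_id", "Stud_name"])
course = project(flat, ["Cours_id", "Cours_name", "Teacher"])
enrollment = project(flat, ["Stud_id", "Cours_id", "Grade"])
teacher = project(flat, ["Teacher", "Phone"])

print(len(student), len(course), len(enrollment), len(teacher))  # 4 2 7 2
```

Each student name, course name, and phone number is now stored once instead of once per enrollment.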
First Normal Form (1NF)…
There are three main techniques to achieve first
normal form for such a relation:
1. Remove the attribute DLOCATIONS that
violates 1NF and place it in a separate relation
DEPT_LOCATIONS along with the primary key
DNUMBER of DEPARTMENT. The primary key of
this relation is the combination {DNUMBER,
DLOCATION}.
A distinct tuple in DEPT_LOCATIONS exists for
each location of a department. This decomposes
the non-1NF relation into two 1NF relations.
First Normal Form (1NF) …
2. Expand the key so that there will be a separate tuple in
the original DEPARTMENT relation for each location of a
DEPARTMENT, as shown in Figure (c).
In this case, the primary key becomes the combination
{DNUMBER, DLOCATION}.
This solution has the disadvantage of introducing
redundancy in the relation.
3. If a maximum number of values is known for the
attribute (for example, if it is known that at most three
locations can exist for a department), replace the
DLOCATIONS attribute by three atomic attributes:
DLOCATION1, DLOCATION2, and DLOCATION3.
This solution has the disadvantage of introducing null values
if most departments have fewer than three locations.
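The first technique, moving DLOCATIONS into its own relation, can be sketched as below. The department names and locations are invented sample values, and `split_locations` is a hypothetical helper; only the attribute names DNUMBER, DLOCATIONS, DLOCATION, and DEPT_LOCATIONS come from the slides.

```python
# Hypothetical non-1NF DEPARTMENT rows: DLOCATIONS holds several values.
department = [
    {"DNUMBER": 1, "DNAME": "Headquarters", "DLOCATIONS": ["Adama"]},
    {"DNUMBER": 5, "DNAME": "Research", "DLOCATIONS": ["Adama", "Harar", "Jimma"]},
]


def split_locations(rows):
    """Technique 1: move DLOCATIONS into a separate DEPT_LOCATIONS relation."""
    departments = []
    dept_locations = []
    for row in rows:
        # DEPARTMENT keeps every attribute except the multivalued one.
        departments.append({k: v for k, v in row.items() if k != "DLOCATIONS"})
        # DEPT_LOCATIONS gets one tuple per (department, location) pair.
        for location in row["DLOCATIONS"]:
            dept_locations.append(
                {"DNUMBER": row["DNUMBER"], "DLOCATION": location}
            )
    return departments, dept_locations


departments, dept_locations = split_locations(department)
print(len(departments), len(dept_locations))  # 2 4
```

DNAME is stored once per department, and a department may have any number of locations without nulls or extra columns.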
First Normal Form (1NF)…
The first technique is generally considered best:
it does not suffer from redundancy, and
it is general, placing no limit on the maximum
number of values.
Definition: a table (relation) is in 1NF
if:
There are no duplicated rows in the table
(each row has a unique identifier).
Each cell is single-valued
(i.e., there are no repeating groups).
Entries in a column (attribute, field) are of
the same kind.
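The three conditions in this definition can be turned into a rough check on sample data. This is a sketch under simplifying assumptions: a table is a list of dicts with identical columns, "single-valued" is approximated as "not a Python collection", and "same kind" as "same Python type". The function name `is_1nf` is hypothetical.

```python
def is_1nf(rows, key):
    """Check the listed 1NF conditions on a list-of-dicts table."""
    column_types = {}
    for row in rows:
        for name, value in row.items():
            if isinstance(value, (list, tuple, set, frozenset, dict)):
                return False  # repeating group or other non-atomic cell
            column_types.setdefault(name, set()).add(type(value))
    if any(len(kinds) > 1 for kinds in column_types.values()):
        return False  # a column mixes different kinds of entries
    # No duplicated rows, and the key identifies each row uniquely.
    whole_rows = {tuple(sorted(row.items())) for row in rows}
    key_values = {tuple(row[a] for a in key) for row in rows}
    return len(whole_rows) == len(rows) and len(key_values) == len(rows)


good = [
    {"Stud_id": 1, "Cours_id": 10, "Grade": "A"},
    {"Stud_id": 1, "Cours_id": 20, "Grade": "A"},
]
bad = [{"Stud_id": 1, "Courses": [10, 20]}]

print(is_1nf(good, ["Stud_id", "Cours_id"]))  # True
print(is_1nf(bad, ["Stud_id"]))               # False
```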
Second Normal Form (2NF)
A relation is in 2NF if it is in 1NF and no
non-key attribute is partially dependent on
part of the primary key. Removing partial
dependencies results in a set of relations
in Second Normal Form.
Any table that is in 1NF and has a single-
attribute (i.e., a non-composite) key is
automatically also in 2NF.
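A simple way to spot 2NF violations is to scan the declared FDs for a determinant that is a proper part of the primary key. The sketch below assumes FDs are given as (left side, right side) pairs and inspects only the FDs as written, so dependencies that are merely implied by others are not detected. The FD list for the student table is an assumption of this example.

```python
def partial_dependencies(primary_key, fds):
    """Declared FDs where a proper part of the key determines non-key attributes."""
    key = set(primary_key)
    violations = []
    for lhs, rhs in fds:
        non_key = set(rhs) - key
        if set(lhs) < key and non_key:  # "<" means proper subset
            violations.append((tuple(lhs), tuple(sorted(non_key))))
    return violations


# Assumed FDs for the flat student/course table.
fds = [
    (("Stud_id",), ("Stud_name",)),
    (("Cours_id",), ("Cours_name", "Teacher")),
    (("Stud_id", "Cours_id"), ("Grade",)),
    (("Teacher",), ("Phone",)),
]

for lhs, rhs in partial_dependencies(("Stud_id", "Cours_id"), fds):
    print(lhs, "->", rhs)
# ('Stud_id',) -> ('Stud_name',)
# ('Cours_id',) -> ('Cours_name', 'Teacher')

# With a single-attribute key there is no proper part to depend on.
print(partial_dependencies(("Stud_id",), fds))  # []
```

Teacher → Phone is not reported because Teacher is not part of the key; it is a transitive dependency, which is the concern of 3NF.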
Other Levels of Normalization
Boyce-Codd Normal Form (BCNF):
A table is in BCNF if it is in 3NF and if every
determinant is a candidate key.
Violations of BCNF are rare in practice,
so most tables in 3NF are also in BCNF.
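The BCNF condition can be checked against a declared FD list once the candidate keys are known. The sketch below accepts a determinant when it contains a candidate key (that is, when it is a superkey), which is the usual textbook reading of the rule. The relation, its FDs, and its candidate keys are a hypothetical example in which each teacher teaches exactly one course.

```python
def bcnf_violations(fds, candidate_keys):
    """Declared non-trivial FDs whose determinant is not a superkey."""
    keys = [set(k) for k in candidate_keys]
    violations = []
    for lhs, rhs in fds:
        determinant = set(lhs)
        if set(rhs) <= determinant:
            continue  # trivial dependency, always allowed
        if not any(k <= determinant for k in keys):
            violations.append((tuple(lhs), tuple(rhs)))
    return violations


# Hypothetical relation R(Stud_id, Cours_name, Teacher).
fds = [
    (("Stud_id", "Cours_name"), ("Teacher",)),
    (("Teacher",), ("Cours_name",)),
]
candidate_keys = [("Stud_id", "Cours_name"), ("Stud_id", "Teacher")]

print(bcnf_violations(fds, candidate_keys))
# [(('Teacher',), ('Cours_name',))]
```

This relation is in 3NF because every attribute belongs to some candidate key, yet Teacher is a determinant that is not a key, so it is one of the rare tables that is in 3NF but not in BCNF.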
Fourth Normal Form (4NF):
A table is in 4NF if it is in BCNF and if it has
no non-trivial multi-valued dependencies.
Other Levels of
Normalization…
Fifth Normal Form (5NF),
also called "Projection-Join Normal Form" (PJNF):
A table is in 5NF
if it is in 4NF and
if every join dependency in the table is a consequence
of the candidate keys of the table.
Domain-Key Normal Form (DKNF):
A table is in DKNF if every constraint on the table is a logical
consequence of the definition of keys and
domains.
It is free from all modification anomalies.
Problems
Problems associated with
normalization:
Requires sample data to reveal the problems
May reduce the performance of the system
Is time-consuming
Is difficult to design and apply
Is prone to human error
In short
1st normal form
No repeating fields in the table
All attributes depend on the key.
2nd normal form
All attributes depend on the whole key.
3rd normal form
All attributes depend on nothing but the
key.