
Database Normalization and Functional Dependencies

The document outlines the topics covered in Week 8 of a Database Management Systems course, focusing on functional dependencies, normalization, and SQL operations. It explains the importance of normalization in reducing data anomalies and inconsistencies through various normal forms (1NF, 2NF, 3NF, BCNF). Additionally, it provides examples of functional dependencies and normalization processes using tables related to orders and course schedules.

Uploaded by

clafedersin

CENG 3005

Database Management Systems


Week 8

• Functional Dependencies
• Normal Forms
OUTLINE
• SQL
• Ordering
• Set Operations
• Inner Join/Left Outer Join/Right Outer Join/Full Join
• Aggregate Functions
• Modifying the database (Delete/Insert/Update)
• Views
• Stored procedures
• Functional Dependencies
• Normal Forms
Today and next week we will learn:
 What is normalization? Why do we need it?
 What does functional dependency mean? Why do we need it?
 What role does it play in the database design process?
 What do the normal forms 1NF, 2NF, 3NF, BCNF mean?
 How can lower normal forms be transformed into higher normal forms?
 How do some situations require denormalization to generate information efficiently?
Watch [Link] (1NF)
Watch [Link] (2NF, 3NF)
Functional Dependencies (FDs), and how to find keys using them: [Link] i4sk4h2lhtU
Functional Dependencies
(formal definition)
 Definition: A functional dependency (FD) on a relation schema R is a constraint X -> Y, where X and Y are subsets of the attributes/columns of R.
 An FD X -> Y is satisfied in an instance r of R if, for every pair of tuples t and s in r: if t and s agree on all attributes in X, then they must also agree on all attributes in Y.
 In other words, for a dependency to hold on a table R, any two rows that have the same values for the left-side attribute set (X) must have the same values for the right-side attribute set (Y).
 If you know the left side, the right side is determined!
• SSN -> SSN, Name, Address
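The pairwise definition above can be checked mechanically on a concrete table. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the function name `fd_holds` and the sample rows are hypothetical and invented only for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs is satisfied by every pair of rows."""
    seen = {}  # maps each X-value combination to the Y-values first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two rows agree on X but differ on Y
    return True


# Hypothetical sample rows, invented for illustration only.
people = [
    {"SSN": "111", "Name": "Ann", "Address": "1 Main St"},
    {"SSN": "222", "Name": "Bob", "Address": "1 Main St"},
    {"SSN": "333", "Name": "Ann", "Address": "9 Oak Ave"},
]

print(fd_holds(people, ["SSN"], ["Name", "Address"]))  # True
print(fd_holds(people, ["Name"], ["Address"]))         # False
```

Note that a single instance can only refute an FD. An FD is a constraint on the schema, so passing this check on one table does not prove that the FD holds in general.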
Functional Dependencies
(Examples)
• Address -> ZipCode
– Stony Brook’s ZIP is 11733
• ArtistName -> BirthYear
– Picasso was born in 1881
• Autobrand -> Manufacturer, Engine type
– Pontiac is built by General Motors (Manufacturer) with a gasoline engine (Engine type)
• Author, Title -> PublDate
– Shakespeare’s Hamlet was published in 1600
Entailment, Closure, Equivalence
 Definition: If F is a set of FDs on schema R and f is another FD on R, then F entails f if every instance r of R that satisfies every FD in F also satisfies f
 Ex: F = {A -> B, B -> C} and f is A -> C
• If Streetaddr -> Town and Town -> Zip then Streetaddr -> Zip
 Definition: The closure of F, denoted F+, is the set of all FDs entailed by F
 Definition: F and G are equivalent if F entails G and G entails F (that is, F entails every FD in G, and G entails every FD in F)
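Entailment and equivalence can be tested without listing all of F+. The sketch below uses the standard attribute-closure technique, which these slides do not spell out: F entails X -> Y exactly when Y lies inside the set of attributes determined by X. The function names are hypothetical, and attributes are assumed to be single letters so that a string like "BC" stands for the set {B, C}.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def entails(fds, fd):
    """F entails X -> Y exactly when Y is inside the closure of X under F."""
    lhs, rhs = fd
    return set(rhs) <= closure(lhs, fds)


def equivalent(f, g):
    """F and G are equivalent when each entails every FD of the other."""
    return all(entails(f, fd) for fd in g) and all(entails(g, fd) for fd in f)


F = [("A", "B"), ("B", "C")]
G = [("A", "BC"), ("B", "C")]  # assumed second FD set, for illustration

print(entails(F, ("A", "C")))  # True
print(entails(F, ("C", "A")))  # False
print(equivalent(F, G))        # True
```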


Properties of Functional Dependencies
• Reflexivity: If Y ⊆ X then X -> Y (trivial FD)
– Name, Address -> Name
• Augmentation: If X -> Y then XZ -> YZ
– If Town -> Zip then Town, Name -> Zip, Name
• Transitivity: If X -> Y and Y -> Z then X -> Z


Derived inference rules

 Union: if X-> Y and X-> Z, then X-> YZ.

 Decomposition: if X-> YZ, then X-> Y and X-> Z.

 Pseudotransitivity: if X-> Y and WY-> Z, then WX-> Z.
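Each derived rule follows from reflexivity, augmentation, and transitivity alone. One possible derivation of each is sketched here; other orderings of the steps work equally well.

```latex
\begin{align*}
\textbf{Union:}\quad
  & X \to Y,\; X \to Z && \text{(given)} \\
  & X \to XY           && \text{(augment } X \to Y \text{ with } X\text{)} \\
  & XY \to YZ          && \text{(augment } X \to Z \text{ with } Y\text{)} \\
  & X \to YZ           && \text{(transitivity)} \\[6pt]
\textbf{Decomposition:}\quad
  & X \to YZ           && \text{(given)} \\
  & YZ \to Y           && \text{(reflexivity, since } Y \subseteq YZ\text{)} \\
  & X \to Y            && \text{(transitivity; } X \to Z \text{ is symmetric)} \\[6pt]
\textbf{Pseudotransitivity:}\quad
  & X \to Y,\; WY \to Z && \text{(given)} \\
  & WX \to WY          && \text{(augment } X \to Y \text{ with } W\text{)} \\
  & WX \to Z           && \text{(transitivity)}
\end{align*}
```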


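Generating F+

F (functional dependency list for your table): AB -> C, A -> D, D -> E

1. A -> D, augmentation with B: AB -> BD
2. AB -> C and AB -> BD, union: AB -> BCD
3. D -> E, augmentation with BC: BCD -> BCDE
4. AB -> BCD and BCD -> BCDE, transitivity: AB -> BCDE
5. AB -> BCDE, decomposition: AB -> CDE

Thus, F+ includes AB -> BD, AB -> BCD, AB -> BCDE, and AB -> CDE (along with the FDs in F itself and all trivial FDs)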
If you understood Functional Dependency so far, Normalization is so easy!
Why do we need to normalize our databases?
We need a criterion for determining a table's degree of vulnerability to
1) logical inconsistencies and
2) anomalies.
Normalization of DB Tables
 Normalization
– Process for evaluating and correcting table structures
• determines the optimal assignments of attributes to entities
– Normalization provides micro view of entities
• focuses on characteristics of specific entities
• may yield additional entities
– Works through a series of stages called normal forms
• 1NF -> 2NF -> 3NF -> 4NF (optional)
– The higher the normal form, the slower the database response can be
• more joins are required to answer end-user queries

 So if the response is slower, why do database people normalize?
1. Reduce uncontrolled data redundancies
• Help eliminate data anomalies
2. Produce controlled redundancies to link tables
Normalization Forms
(simple simple simple)

“Data depends on the key [1NF], the whole key [2NF], and nothing but the key [3NF]”

“A table is in BCNF if all the arrows in its FDs come out of a candidate key” [BCNF]
Normalization
 is the process for evaluating and correcting table structures to minimize data redundancies
– reduces data anomalies
 works through a series of stages called normal forms
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
 Edgar Codd, inventor of the Relational Model, introduced the normal forms in the 1970s
Normalization Example
 We have a table named R1, containing orders in an online store
 Each entry in the table represents an item on a particular order
 Primary key is {Order, Product}
 Columns
– Order
– Product
– Customer
– Address
– Quantity
– UnitPrice
Functional Dependencies
All the functional dependencies for table R1 {Order, Product, Customer, Address, Quantity, UnitPrice}

1. Each order is for a single customer: {Order} -> {Customer}
2. Each customer has a single address: {Customer} -> {Address}
3. Each product has a single price: {Product} -> {UnitPrice}
4. From FDs 1 and 2 and transitivity: {Order} -> {Address}

Contains partial dependency: Address depends on only part of the primary key
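Partial dependencies like this one can be found by testing every proper subset of the primary key. The sketch below is one possible way to do it, using the attribute-closure technique; the helper names are hypothetical, and it assumes {Order, Product} is the only candidate key of R1.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """Pairs (subset, attrs): a proper subset of the key determines non-key attrs."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            dependent = closure(subset, fds) - set(key)
            if dependent:
                found.append((set(subset), dependent))
    return found


# FDs 1 to 3 from the slide; FD 4 follows from them by transitivity.
fds = [
    (["Order"], ["Customer"]),
    (["Customer"], ["Address"]),
    (["Product"], ["UnitPrice"]),
]
key = ["Order", "Product"]

for subset, dependent in partial_dependencies(key, fds):
    print(sorted(subset), "->", sorted(dependent))
# ['Order'] -> ['Address', 'Customer']
# ['Product'] -> ['UnitPrice']
```

Both reported dependencies violate 2NF, which is why the table has to be split.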
Normalization to 3NF
 R{Order, Product, Customer, Address, Quantity, UnitPrice} has now been split into 3 relations

R1={Order, Customer, Address}
R3={Product, UnitPrice}
R4={Order, Product, Quantity}
 R3 and R4 are in 3NF
 BUT!!! R1 has a transitive FD on its key!!
 To remove this transitive FD from R1:
{Order} -> {Customer} -> {Address}
 We decompose R1 into
– {Order, Customer}
– {Customer, Address}
Normalization
 1NF:
– {Order, Product, Customer, Address, Quantity, UnitPrice}
 2NF (no partial dependency on key)
– {Product, UnitPrice}
– {Order, Product, Quantity}
– {Order, Customer, Address}
 3NF (no transitive dependence on a key)
– {Product, UnitPrice}
– {Order, Product, Quantity}
– {Order, Customer}
– {Customer, Address}
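A decomposition is only useful if the original table can be rebuilt from its parts. The following sketch projects a small order table onto the four 3NF relations above and joins them back. The rows and the helper names `project` and `natural_join` are hypothetical, invented only for illustration.

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicate tuples like a relation."""
    unique = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in sorted(unique)]


def natural_join(left, right):
    """Join two lists of dict rows on every column name they share."""
    joined = []
    for a in left:
        for b in right:
            shared = a.keys() & b.keys()
            if all(a[c] == b[c] for c in shared):
                joined.append({**a, **b})
    return joined


# Hypothetical order lines, invented for illustration only.
r1 = [
    {"Order": 1, "Product": "pen", "Customer": "Ann", "Address": "1 Main St",
     "Quantity": 2, "UnitPrice": 3},
    {"Order": 1, "Product": "ink", "Customer": "Ann", "Address": "1 Main St",
     "Quantity": 1, "UnitPrice": 8},
    {"Order": 2, "Product": "pen", "Customer": "Bob", "Address": "9 Oak Ave",
     "Quantity": 5, "UnitPrice": 3},
]

prices = project(r1, ["Product", "UnitPrice"])
lines = project(r1, ["Order", "Product", "Quantity"])
orders = project(r1, ["Order", "Customer"])
customers = project(r1, ["Customer", "Address"])

rebuilt = natural_join(natural_join(natural_join(lines, prices), orders), customers)
columns = list(r1[0])
same = project(rebuilt, columns) == project(r1, columns)

print(len(prices), len(lines), len(orders), len(customers))  # 2 3 2 2
print(same)  # True
```

Each price and each address is now stored once, yet joining the four relations returns exactly the original rows.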

Another example: Course
Schedules
 Consider a relation, Schedule, which stores information about the times of the various schedules of courses
 For example: labs for first years
 Each course has several schedules
 Only one schedule (of any course at all) takes place at any given time
 Each student taking a course is assigned to a single schedule for it
FDs in the Course Schedule Relation
 Candidate keys: {Student, Course} and {Student, Time}
 Schedule has the following non-trivial FDs
– {Student, Course} -> {Time}
– {Time} -> {Course}
 Since every attribute is part of a candidate key, no non-key attribute has a partial dependency (2NF) or a transitive dependency (3NF), so the Schedule table is in 3NF
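The 3NF claim can be checked against the stricter BCNF rule quoted earlier (every arrow must come out of a candidate key). This is a minimal sketch that takes the candidate keys from the slide as given; the function names are hypothetical. It suggests that Schedule passes the 3NF test but that {Time} -> {Course} still violates BCNF, because Time alone is not a key.

```python
def bcnf_violations(fds, candidate_keys):
    """FDs X -> Y that are non-trivial and whose X contains no candidate key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not any(key <= lhs for key in candidate_keys)
    ]


def third_nf_violations(fds, candidate_keys):
    """BCNF violations that also put a non-prime attribute on the right side."""
    prime = set().union(*candidate_keys)  # attributes that appear in some key
    return [
        (lhs, rhs)
        for lhs, rhs in bcnf_violations(fds, candidate_keys)
        if not (rhs - lhs) <= prime
    ]


keys = [{"Student", "Course"}, {"Student", "Time"}]
fds = [
    ({"Student", "Course"}, {"Time"}),
    ({"Time"}, {"Course"}),
]

print(third_nf_violations(fds, keys))  # []
print(bcnf_violations(fds, keys))      # [({'Time'}, {'Course'})]
```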

The Schedule Relation
Student   Course        Time
John      Databases     12:00
Mary      Databases     12:00
Richard   Databases     15:00
Richard   Programming   10:00
Can you find the candidate keys?
Candidate keys: {Student, Course} and {Student, Time}
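The candidate keys can also be found by brute force: try attribute sets from smallest to largest and keep those whose closure covers the whole relation. This is a sketch under the assumption that the only FDs are {Student, Course} -> {Time} and {Time} -> {Course}; the function names are hypothetical, and the search is exponential, so it only suits small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    all_attrs = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


fds = [
    (["Student", "Course"], ["Time"]),
    (["Time"], ["Course"]),
]
keys = candidate_keys(["Student", "Course", "Time"], fds)

print([sorted(key) for key in keys])
# [['Course', 'Student'], ['Student', 'Time']]
```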
Anomalies in Schedule
 INSERT anomalies
– What if there is a new student with no class? You can’t add an empty schedule item
 UPDATE anomalies
– Moving the 12:00 class to 9:00 means changing two rows
 DELETE anomalies
– Deleting Rebecca removes one class (time/date) from the schedule

Student   Course        Time
John      Databases     12:00
Mary      Databases     12:00
Richard   Databases     15:00
Richard   Programming   10:00
Mary      Programming   10:00
Rebecca   Programming   13:00
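The update and delete anomalies can be seen directly on the rows of the table. The sketch below is only an illustration: it stores the six rows above as tuples and counts what each operation would touch.

```python
schedule = [
    ("John", "Databases", "12:00"),
    ("Mary", "Databases", "12:00"),
    ("Richard", "Databases", "15:00"),
    ("Richard", "Programming", "10:00"),
    ("Mary", "Programming", "10:00"),
    ("Rebecca", "Programming", "13:00"),
]

# UPDATE anomaly: moving the 12:00 class touches every row that repeats it.
rows_to_change = [row for row in schedule if row[2] == "12:00"]
print(len(rows_to_change))  # 2

# DELETE anomaly: removing Rebecca also removes the only record of a class time.
sessions_before = {(course, time) for _, course, time in schedule}
remaining = [row for row in schedule if row[0] != "Rebecca"]
sessions_after = {(course, time) for _, course, time in remaining}
print(sessions_before - sessions_after)  # {('Programming', '13:00')}
```

The fact that Programming runs at 13:00 exists only in Rebecca's row, so deleting the student loses it.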
