Database Notes
Lecturer:
Asharul Islam
Major Topics
Databases
Data concepts
Normalization
Guidelines for master file/database relation design
Making use of database
Data warehouses
Learning Objectives
Upon the successful completion of this chapter the
student will be able to:
Understand database concepts.
Use normalization to efficiently store data in a database.
Use databases for presenting data.
Understand the concept of data warehouses.
Databases
A database is a central source of data meant to be shared by
many users for a variety of applications.
The heart of a database is the DBMS (database
management system), which allows
The creation,
Modification, and
Updating of the database,
The retrieval of data, and
The generation of reports.
The person who ensures that the database meets its
objectives is called the database administrator.
Objectives of Effective Databases
The effectiveness objectives of the database include:
1) Ensuring that data can be shared among users for a
variety of applications.
2) Maintaining accurate and consistent data.
3) Ensuring all data required for current and future
applications will be readily available.
4) Allowing the database to evolve as the needs of the users
grow.
5) Allowing users to construct their personal view of the
data without concern for the way the data are
physically stored.
Database Concepts
The three important database concepts are:
1) Reality
The real world is referred to as reality.
2) Data
Collection of facts and figures about people, events,
objects or places.
Data collected will eventually be stored in a file or
database.
3) Metadata
Data about data.
The information that describes data is referred to as
Metadata.
Relationship between Reality, Data
and Metadata
Entities (Fundamental)
An entity is a distinct collection of data for one person, place,
thing, or event about which the organization wishes to
maintain data.
Entities become files or database tables.
An entity is represented by a rectangle.
Examples:
Employee
Student
Semester
Appointment
Software Package
Patient etc.
Figure: Student entity with the entity subtype Internship; relationship ends marked as one, many, or none.
Examples of 1:1 Relationship
Example of 1:M Relationship
Example of M:N Relationship
Self-Join
An entity that has a
relationship connecting to
itself forms a self-join.
A self-join occurs when a record
has a relationship with
another record in the same
file.
It is also called a recursive
relationship. Example: Employee reports to Employee.
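The "Employee reports to Employee" relationship can be sketched with Python's built-in sqlite3 module. The table name employee, the columns emp_id, name, and manager_id, and the sample rows are assumptions made for illustration; they do not come from these notes.

```python
import sqlite3

def employees_with_managers():
    """Return (employee, manager) name pairs using a self-join."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: manager_id points back into the same table.
    conn.executescript("""
        CREATE TABLE employee (
            emp_id     INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            manager_id INTEGER REFERENCES employee(emp_id)
        );
        INSERT INTO employee VALUES (1, 'Asha', NULL);
        INSERT INTO employee VALUES (2, 'Bilal', 1);
        INSERT INTO employee VALUES (3, 'Chen', 1);
    """)
    # The same table appears twice under two aliases: e (employee) and m (manager).
    rows = conn.execute("""
        SELECT e.name, m.name
        FROM employee AS e
        JOIN employee AS m ON e.manager_id = m.emp_id
        ORDER BY e.emp_id
    """).fetchall()
    conn.close()
    return rows
```

The employee with no manager is left out of the result because an inner join keeps only rows that have a matching manager record.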
ER Symbols and Their Meaning
Attributes, Records, and Keys
Attributes:
An attribute is a characteristic of an entity, sometimes
called a “field” or “data item”.
Attributes are represented in the database as data items.
Data items can have alphabetic, numeric, or alpha-
numeric values.
Records:
A record is a collection of data items that have
something in common with the entity described.
Keys:
A key is one of the data items in a record that is used to identify the
record.
A record has a
primary key and
may have many
attributes.
Key Types
There are several types of keys:
1) Primary key
Uniquely identifies the record.
2) Secondary key
Cannot uniquely identify the record.
3) Concatenated key
A combination of two or more data items that together
uniquely identify the record.
4) Foreign key
A data item in one record that is the primary key of another
record.
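The four key types can be sketched in one small schema using Python's built-in sqlite3 module. The tables student, course, and enrollment, their columns, and the sample rows are hypothetical. The secondary key is modeled here as a non-unique index, which is one common way to support it, not the only one.

```python
import sqlite3

def build_key_demo():
    """Create a hypothetical schema that shows each key type."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,   -- primary key
            last_name  TEXT NOT NULL
        );
        -- Secondary key: useful for lookups, but several students may share it.
        CREATE INDEX idx_student_last_name ON student(last_name);

        CREATE TABLE course (
            course_id TEXT PRIMARY KEY
        );
        CREATE TABLE enrollment (
            student_id INTEGER NOT NULL REFERENCES student(student_id),  -- foreign key
            course_id  TEXT    NOT NULL REFERENCES course(course_id),    -- foreign key
            grade      TEXT,
            PRIMARY KEY (student_id, course_id)   -- concatenated key
        );
        INSERT INTO student VALUES (1, 'Khan'), (2, 'Khan'), (3, 'Lee');
        INSERT INTO course VALUES ('DB101');
        INSERT INTO enrollment VALUES (1, 'DB101', 'A'), (2, 'DB101', 'B');
    """)
    return conn
```

Searching by last_name = 'Khan' returns two records, which is why it can only serve as a secondary key. Inserting a second row for student 1 in DB101 is rejected because the concatenated key must stay unique.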
Metadata
Data about the data in
the file or database.
It describes the name
given and the length
assigned to each data
item.
It also describes the
length and composition
of each of the records.
Metadata includes a
description of what the
value of each data item
looks like.
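As a small illustration, a DBMS can be asked for its own metadata. The sketch below uses Python's built-in sqlite3 module and SQLite's PRAGMA table_info; the customer table and its columns are hypothetical. SQLite records the declared lengths (such as VARCHAR(40)) as part of the type name but does not enforce them, so they are purely descriptive here.

```python
import sqlite3

def describe_table(conn, table):
    """Return (column name, declared type) pairs: data about the data."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
# Hypothetical table used only to have something to describe.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name VARCHAR(40), zip CHAR(5))"
)
metadata = describe_table(conn, "customer")
```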
File and Its Types
A file contains groups of records used to provide information
for operations, planning, management, and decision making.
Files can be used for storing data for an indefinite period of
time, or they can be used to store data temporarily for a
specific purpose.
There are several types of files:
1) Master file
2) Table file
3) Transaction file
4) Report file
Master and Table Files
Master files:
Contain records for a group of entities.
Contain all information about a data entity.
Each record of a Master file generally contains a primary key
and several secondary keys.
Examples:
Patient records
Customer records
Personnel file
Table files:
A table file contains data used to calculate more data or
performance measures.
It is usually only read, not updated, by a program.
For example: Tax table
Transaction and Report Files
Transaction files:
A transaction file is used to enter changes that update the master
file and produce reports.
Suppose a newspaper subscriber master file needs to be updated;
the transaction file would contain
the subscriber number, and a transaction code such as E for
extending the subscription, C for canceling the subscription, or A for
address change.
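The newspaper subscriber example can be sketched in Python as follows. The codes E, C, and A come from the example above; the field names (subscriber_number, months_left, active, address, months) and the use of a dict as the master file are assumptions made for illustration.

```python
def apply_transactions(master, transactions):
    """Apply E/C/A transactions to a subscriber master file held as a dict."""
    for txn in transactions:
        number, code = txn["subscriber_number"], txn["code"]
        if number not in master:
            raise KeyError(f"unknown subscriber {number}")
        record = master[number]
        if code == "E":      # extend the subscription
            record["months_left"] += txn["months"]
        elif code == "C":    # cancel the subscription
            record["active"] = False
            record["months_left"] = 0
        elif code == "A":    # address change
            record["address"] = txn["address"]
        else:
            raise ValueError(f"unknown transaction code {code!r}")
    return master
```

Each transaction carries only the key and the change, so the transaction file stays small while the master file holds the full subscriber record.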
Report files:
A report file is used when a report must be printed but
no printer is available, e.g., when the printer is busy.
Sending the output to a file rather than a printer is called
spooling. Later, when the device is ready, the document can be
printed.
It is useful because users can take files to other computer
systems and output to specialty devices.
Relational Database Structures
A relational database structure consists of one or more
two-dimensional tables, which are referred to as relations.
The rows of a table represent records, also called
tuples.
The columns contain attributes, and the set of values an
attribute can take is called its domain.
Maintaining the tables in a relational database structure is
comparatively easy.
Another advantage is that queries are handled efficiently.
Normalization
Normalization is the transformation of complex
user views and data to a set of smaller, stable, and
easily maintainable data structures.
Normalization creates data that are stored only once
in a file, i.e., it eliminates redundancy.
The exception is key fields, which are repeated to link files.
It provides ideal data storage for database systems.
Three Steps of Normalization
The three steps of data
normalization are:
FIRST: Remove all repeating groups
and identify the primary key.
SECOND: Remove all partial
dependencies by ensuring that all
nonkey attributes are fully
dependent on the primary key.
THIRD: Remove any transitive
dependencies, i.e., nonkey attributes
that are dependent on other nonkey
attributes.
Data Model Diagrams
Also called a bubble diagram.
It shows the associations between data
elements.
Although it is possible to draw these
relationships with an E-R diagram,
it is sometimes easier to use the
simpler bubble diagram to model the
data.
Each entity is enclosed in an ellipse
Arrows are used to show the
relationships
A single arrow line represents one.
A double arrow line represents many.
First Normal Form (1NF)
Remove any repeating groups.
All repeating groups are moved into a new table.
Foreign keys are used to link the tables.
When a relation contains no repeating groups, it is in the first
normal form.
Keys must be included to link the relations.
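A minimal Python sketch of this step, assuming a hypothetical unnormalized order record whose items field is the repeating group. The field names and sample data are illustrative only.

```python
def to_first_normal_form(unnormalized_orders):
    """Move the repeating group of items into its own relation."""
    orders, order_items = [], []
    for order in unnormalized_orders:
        orders.append({"order_no": order["order_no"], "customer": order["customer"]})
        for item in order["items"]:  # the repeating group
            order_items.append({
                "order_no": order["order_no"],  # key that links back to orders
                "item_no": item["item_no"],
                "item_desc": item["item_desc"],
                "qty": item["qty"],
            })
    return orders, order_items

# Hypothetical unnormalized input: one order with two items nested inside it.
sample = [{
    "order_no": 1,
    "customer": "Acme",
    "items": [
        {"item_no": "A7", "item_desc": "Stapler", "qty": 2},
        {"item_no": "B3", "item_desc": "Ink", "qty": 1},
    ],
}]
orders, order_items = to_first_normal_form(sample)
```

After the split, every row holds single values only, and order_no appears in both relations so the rows can be joined again.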
Second Normal Form (2NF)
Remove any partial dependencies.
A partial dependency exists when data depend on only
part of a concatenated key.
One relation is created for the data that depend on only
part of the key and another for data that depend on
the whole key.
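A minimal Python sketch of removing a partial dependency, assuming a hypothetical relation keyed by (order_no, item_no) in which item_desc depends on item_no alone while qty depends on the whole key. The names and data are illustrative only.

```python
def to_second_normal_form(order_items):
    """Split out attributes that depend on only part of the key (order_no, item_no)."""
    items = {}        # item_no -> item record; item_desc depends on item_no alone
    order_lines = []  # qty depends on the whole key
    for row in order_items:
        items[row["item_no"]] = {"item_no": row["item_no"], "item_desc": row["item_desc"]}
        order_lines.append(
            {"order_no": row["order_no"], "item_no": row["item_no"], "qty": row["qty"]}
        )
    return list(items.values()), order_lines

# Hypothetical input in first normal form.
order_items = [
    {"order_no": 1, "item_no": "A7", "item_desc": "Stapler", "qty": 2},
    {"order_no": 2, "item_no": "A7", "item_desc": "Stapler", "qty": 5},
    {"order_no": 2, "item_no": "B3", "item_desc": "Ink", "qty": 1},
]
items, order_lines = to_second_normal_form(order_items)
```

The description "Stapler" is now stored once in items instead of once per order line.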
Third Normal Form (3NF)
Remove any transitive dependencies.
A transitive dependency exists when a nonkey attribute depends
on another nonkey attribute rather than on the key, so the relation
contains data that are not part of the entity.
The problem with transitive dependencies is updating
the data:
a single data item may be present on many records.
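A minimal Python sketch of removing a transitive dependency, assuming a hypothetical orders relation in which customer_name depends on customer_no, a nonkey attribute, rather than on the key order_no. The names and data are illustrative only.

```python
def to_third_normal_form(orders):
    """Remove the transitive dependency order_no -> customer_no -> customer_name."""
    customers = {}    # customer_no -> customer record, stored once
    slim_orders = []  # orders keep only the foreign key customer_no
    for row in orders:
        customers[row["customer_no"]] = {
            "customer_no": row["customer_no"],
            "customer_name": row["customer_name"],
        }
        slim_orders.append({"order_no": row["order_no"], "customer_no": row["customer_no"]})
    return slim_orders, list(customers.values())

# Hypothetical input: the customer name is repeated on every order.
orders = [
    {"order_no": 1, "customer_no": 77, "customer_name": "Acme"},
    {"order_no": 2, "customer_no": 77, "customer_name": "Acme"},
    {"order_no": 3, "customer_no": 88, "customer_name": "Bolt"},
]
slim_orders, customers = to_third_normal_form(orders)
```

A change of customer name now touches one customer record instead of every order for that customer.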
Entity-Relationship Diagram and
Record Keys
The entity-relationship diagram may be used to determine
record keys:
When the relationship is one-to-many, the primary key of the
file at the one end of the relationship should be contained as a
foreign key on the file at the many end of the relationship.
A many-to-many relationship should be divided into two one-
to-many relationships with an associative entity in the middle.
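The associative-entity rule can be sketched with Python's built-in sqlite3 module. The tables employee and project, the associative table assignment, and the sample rows are hypothetical.

```python
import sqlite3

def projects_for(employee_name):
    """List project titles for one employee through the associative entity."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- Associative entity: one row per (employee, project) pairing.
        -- It sits at the "many" end of two one-to-many relationships,
        -- so it carries both primary keys as foreign keys.
        CREATE TABLE assignment (
            emp_id  INTEGER NOT NULL REFERENCES employee(emp_id),
            proj_id INTEGER NOT NULL REFERENCES project(proj_id),
            PRIMARY KEY (emp_id, proj_id)
        );
        INSERT INTO employee VALUES (1, 'Asha'), (2, 'Bilal');
        INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Inventory');
        INSERT INTO assignment VALUES (1, 10), (1, 20), (2, 10);
    """)
    rows = conn.execute("""
        SELECT p.title
        FROM employee AS e
        JOIN assignment AS a ON a.emp_id = e.emp_id
        JOIN project AS p ON p.proj_id = a.proj_id
        WHERE e.name = ?
        ORDER BY p.title
    """, (employee_name,)).fetchall()
    conn.close()
    return [title for (title,) in rows]
```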
Guidelines For Master
File/Database Relation Design
Guidelines for creating database relations are:
1) Each separate entity should have its own master file or
database relation. Do not combine two distinct entities on
one file.
2) A specific, nonkey data field should exist on only one
master file or relation.
3) Each master file or database relation should have
programs to Create, Read, Update, and Delete
(CRUD) the records.
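Guideline 3 can be sketched as four small functions over one relation, using Python's built-in sqlite3 module. The patient table and the function names are hypothetical.

```python
import sqlite3

def open_patient_db():
    """Create an in-memory database with a hypothetical patient relation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patient (patient_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn

def create_patient(conn, patient_no, name):
    conn.execute("INSERT INTO patient VALUES (?, ?)", (patient_no, name))

def read_patient(conn, patient_no):
    # Returns a (patient_no, name) tuple, or None when no such record exists.
    return conn.execute(
        "SELECT patient_no, name FROM patient WHERE patient_no = ?", (patient_no,)
    ).fetchone()

def update_patient(conn, patient_no, name):
    conn.execute("UPDATE patient SET name = ? WHERE patient_no = ?", (name, patient_no))

def delete_patient(conn, patient_no):
    conn.execute("DELETE FROM patient WHERE patient_no = ?", (patient_no,))
```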
Integrity Constraints
There are three integrity constraints that help to ensure
that the database contains accurate data:
1) Entity integrity constraints, which govern the
composition of primary keys.
2) Referential integrity, which governs the nature of
records in a one-to-many relationship.
3) Domain integrity, which is used to validate the data.
(1): Entity Integrity Constraints
Entity integrity constraints are rules for primary keys.
These rules state that:
The primary key cannot have a null value.
If the primary key is a composite key, none of the fields in the
key can contain a null value.
Some databases allow you to define a unique constraint or a
unique key.
This unique key identifies only one record.
The difference between a unique key and a primary key is
that a unique key may contain a null value.
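These rules can be tried out with Python's built-in sqlite3 module. The member table and its columns are hypothetical. NOT NULL is written out on the primary key because SQLite, for historical reasons, does not reject nulls in every kind of primary key unless told to. How many null values a unique key accepts varies between database systems; SQLite accepts several.

```python
import sqlite3

def try_insert(conn, sql, params):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE member (
        member_no TEXT NOT NULL PRIMARY KEY,  -- primary key: never null
        email     TEXT UNIQUE                 -- unique key: null is permitted
    )
""")
insert = "INSERT INTO member VALUES (?, ?)"
ok_first      = try_insert(conn, insert, ("M1", None))             # null unique key accepted
ok_null_email = try_insert(conn, insert, ("M2", None))             # accepted again in SQLite
null_pk       = try_insert(conn, insert, (None, "a@example.com"))  # null primary key rejected
duplicate_pk  = try_insert(conn, insert, ("M1", "b@example.com"))  # duplicate primary key rejected
```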
(2): Referential Integrity Constraints
Referential integrity governs the nature of records in a one-to-many relationship.
Referential integrity means that every foreign key value in the child
table must have a matching record in the parent table. (Note
that the table connected to the “one” end of the relationship is called the parent and the table
connected to the “many” end of the relationship is called the child.)
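A small sketch of the parent/child rule using Python's built-in sqlite3 module. The tables department (parent) and employee (child) and the helper names are hypothetical. SQLite checks foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

def open_db():
    """Create a parent table (department) and a child table (employee)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
    conn.executescript("""
        CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE employee (
            emp_id  INTEGER PRIMARY KEY,
            name    TEXT NOT NULL,
            dept_id INTEGER NOT NULL REFERENCES department(dept_id)
        );
        INSERT INTO department VALUES (1, 'Accounts');
    """)
    return conn

def add_employee(conn, emp_id, name, dept_id):
    """Return True if the child row is accepted, False if it has no matching parent."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

An employee assigned to department 1 is accepted; one assigned to a department number that does not exist in the parent table is rejected.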
Implementation of Referential
Integrity constraints
There are two implementation approaches:
(3): Domain Integrity Constraints
Domain integrity constraints define rules that ensure that
only valid data are stored on database records.
Domain integrity has two forms:
1) Check constraints,
Which are defined at the table level.
Example: DATE OF PURCHASE is always less than or equal to
the current date.
2) Rules,
Which are defined as separate objects at database level.
May be used within a number of fields.
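The two forms can be sketched in plain Python. The DATE OF PURCHASE check comes from the example above; the rule_non_negative rule, the quantity and unit_price fields, and the function names are hypothetical. In a real DBMS these checks would be declared in the database rather than in application code.

```python
from datetime import date

def check_purchase_date(date_of_purchase, today=None):
    """Check constraint: DATE OF PURCHASE must not be later than the current date."""
    today = today or date.today()
    return date_of_purchase <= today

def rule_non_negative(value):
    """A rule defined once as a separate object so that several fields can share it."""
    return value >= 0

# Hypothetical fields that reuse the same rule.
FIELD_RULES = {"quantity": rule_non_negative, "unit_price": rule_non_negative}

def validate_record(record, today=None):
    """Return the names of the fields that violate a constraint."""
    errors = [field for field, rule in FIELD_RULES.items() if not rule(record[field])]
    if not check_purchase_date(record["date_of_purchase"], today):
        errors.append("date_of_purchase")
    return errors
```

The today parameter exists only so the check can be exercised with a fixed date.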
Making Use Of The Database
The steps below should be followed in sequence to
ensure accurate retrieval and presentation of data.
1) Choose a relation from the database.
2) Join two relations together (Join).
3) Project columns from the relation (Projection).
4) Select rows from the relation (Selection).
5) Derive new attributes (Derivation).
6) Index or sort rows.
7) Calculate totals and performance measures.
8) Present data.
The first and last steps must be done, but the six steps in
between are optional, depending on how data are to be used.
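The steps above can be sketched as a single query plus a total, using Python's built-in sqlite3 module. The customer and sale tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

def sales_report(region):
    """Return (rows, total) for one region, following the steps in order."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT, region TEXT);
        CREATE TABLE sale (sale_no INTEGER PRIMARY KEY, cust_no INTEGER,
                           qty INTEGER, unit_price REAL);
        INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Bolt', 'West');
        INSERT INTO sale VALUES (10, 1, 2, 5.0), (11, 1, 1, 20.0), (12, 2, 4, 2.5);
    """)
    rows = conn.execute("""
        SELECT c.name,                          -- 3) project columns
               s.qty * s.unit_price AS amount   -- 5) derive a new attribute
        FROM sale AS s                          -- 1) choose a relation
        JOIN customer AS c                      -- 2) join two relations
          ON c.cust_no = s.cust_no
        WHERE c.region = ?                      -- 4) select rows
        ORDER BY amount DESC                    -- 6) sort rows
    """, (region,)).fetchall()
    conn.close()
    total = sum(amount for _, amount in rows)   # 7) calculate totals
    return rows, total                          # 8) present data (left to the caller)
```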