Overview of Database Management Systems
Module - I
Introduction
Characteristics of Database approach
Actors on the scene
Workers behind the scene
Advantages of using DBMS approach
Data models, schemas and instances
Three-schema architecture and data independence
ER Model: Using High-Level Conceptual Data Models for Database Design
Entity types, Entity sets, Attributes and Keys
Relationships, Relationship types
Roles and Structural Constraints
Weak Entity Types.
Introduction
Data are known facts that can be recorded and that have an implicit meaning. A database is a
collection of logically related data.
Properties of a database:
o It represents some aspect of the real world (the miniworld).
o It is a logically coherent collection of data with some inherent meaning.
o It is designed, built and populated with data for a specific purpose.
A DBMS is a general purpose software system facilitating each of the following (with respect to
a database):
Defining a database
o Specifying data types, structures, and constraints of the data to be stored in the
database.
Constructing the database
o the process of storing the data on some storage medium (e.g., magnetic disk)
that is controlled by the DBMS
Manipulating the database
o querying the database to retrieve specific data, updating the database to reflect
changes in the miniworld, and generating reports
Sharing a database
o allowing multiple users and programs to access the database "simultaneously"
Maintaining the database
o allowing the system to evolve as requirements change over time
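The functions listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the student table, its columns, and the sample rows are hypothetical, and the sharing function is not shown because a single in-memory connection has only one user.

```python
import sqlite3

# In-memory database; the table, columns, and rows are illustrative only.
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structures, and constraints.
conn.execute(
    """
    CREATE TABLE student (
        sid   INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        major TEXT
    )
    """
)

# Constructing: store the data under the control of the DBMS.
conn.executemany(
    "INSERT INTO student (sid, name, major) VALUES (?, ?, ?)",
    [(1, "Asha", "CS"), (2, "Ravi", "Math")],
)

# Manipulating: update to reflect a change in the miniworld, then query.
conn.execute("UPDATE student SET major = ? WHERE sid = ?", ("CS", 2))
cs_students = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE major = 'CS' ORDER BY sid"
    )
]

# Maintaining: let the structure evolve as requirements change.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
conn.commit()
```

After the ALTER TABLE step the existing rows are still present; only the description of the table has grown by one column.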
In DBMS
A database system is self-describing.
A complete definition and description of the database structure and constraints is
stored in the DBMS catalog (this information is known as metadata).
Application software need not be changed when the description of the database changes.
A general purpose DBMS software package is not written for a specific database
application.
It must therefore refer to the catalog to know the structure of the files in a specific
database.
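As a small illustration of the self-describing nature, the sketch below reads a table's structure back from the catalog instead of hard-coding it. It assumes SQLite, whose catalog is the sqlite_master table and whose column details are exposed through PRAGMA table_info; the course table and the describe helper are hypothetical names introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (cid TEXT PRIMARY KEY, title TEXT NOT NULL)")


def describe(connection, table):
    """Return (column name, declared type) pairs read from the catalog."""
    # Look the table up in the catalog first, so that only a known name
    # is ever placed into the PRAGMA statement.
    known = {
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"unknown table: {table}")
    return [
        (row[1], row[2])
        for row in connection.execute(f"PRAGMA table_info({table})")
    ]


course_columns = describe(conn, "course")
```

Nothing in describe is specific to the course table; the same function works for any table the catalog knows about.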
In DBMS
DBMS provides an abstract view of the data that hides the details.
A DBMS provides users with a conceptual representation of data that does not include
the details of how data is stored and how the operations are implemented.
o Program-data independence
o Program-operation independence
Program-data independence
The definition and description of the database is stored in the catalog (as
metadata), so application software does not have to change when the description of the database changes.
Program-operation independence
In object-oriented and object-relational systems, users can define operations on data as part of the
database definitions.
An operation is specified in two parts:
o The interface (or signature) of an operation includes the operation name and the data types of its arguments.
o The implementation of the operation is specified separately and can be changed
without affecting the interface.
User application programs can operate on the data by invoking these operations
through their names and arguments, regardless of how the operations are
implemented.
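The split between interface and implementation can be sketched in plain Python. The GradeReport type and the calculate_gpa operation are hypothetical names chosen for illustration; they are not taken from any particular DBMS.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class GradeReport:
    grades: List[float]  # grade points for one student; illustrative only


def calculate_gpa(report: GradeReport) -> float:
    """Interface: the name calculate_gpa, one GradeReport argument, a float result."""
    # Implementation, version 1: a plain arithmetic mean.
    return sum(report.grades) / len(report.grades)


def calculate_gpa_v2(report: GradeReport) -> float:
    """Same interface, different implementation (a more careful float sum)."""
    return math.fsum(report.grades) / len(report.grades)


# Application code depends only on the operation name and its argument.
report = GradeReport(grades=[3.0, 4.0, 3.5])
gpa = calculate_gpa(report)

# Swapping in the second implementation leaves the caller untouched.
calculate_gpa = calculate_gpa_v2
gpa_after_swap = calculate_gpa(report)
```

The calling line is identical before and after the swap, which is the point of program-operation independence.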
A database has many users, and different users may have different requirements, i.e. each
user may require a different view of the database.
A view may be a subset of the database or it may contain virtual data that is derived
from the database files but is not explicitly stored.
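Both kinds of view can be sketched with SQLite views. The enrolled table, the view names, and the rows below are assumptions made only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE enrolled (sid INTEGER, cid TEXT, grade REAL);
    INSERT INTO enrolled VALUES (1, 'DB101', 3.0);
    INSERT INTO enrolled VALUES (1, 'OS201', 4.0);
    INSERT INTO enrolled VALUES (2, 'DB101', 3.5);

    -- A view that is a subset of the stored data.
    CREATE VIEW db101_roster AS
        SELECT sid FROM enrolled WHERE cid = 'DB101';

    -- A view with virtual data: the average is derived, never stored.
    CREATE VIEW student_average AS
        SELECT sid, AVG(grade) AS avg_grade FROM enrolled GROUP BY sid;
    """
)

roster = [row[0] for row in conn.execute("SELECT sid FROM db101_roster ORDER BY sid")]
averages = dict(conn.execute("SELECT sid, avg_grade FROM student_average"))
```

A user of student_average sees an avg_grade column even though no such column exists in any stored table.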
Database Designers
They are responsible for identifying the data to be stored and for choosing an
appropriate structure to represent and store this data.
They also define views for different categories of users. The final design must be able to
support the requirements of all the user sub-groups.
End-users
These are the people who access the database for querying, updating, and report
generation. They are the main reason for the database's existence.
Tool Developers
They design and implement tools: the software packages that facilitate database
modeling and design.
These tools include packages for database design, performance monitoring, natural
language or graphical interfaces, prototyping, simulation, and test data generation.
Tools are often developed by different vendors and can be purchased separately.
1. Controlled Redundancy
2. Restricting Unauthorized Access
3. Providing Persistent Storage for Program Objects
4. Providing Storage Structures for Efficient Query Processing
5. Providing Backup & Recovery
6. Multiple user interfaces
7. Representing Complex Relationships among data
8. Enforcing Integrity Constraints
9. Permitting Inferencing and Action Using Rules
1. Controlled Redundancy
In the file processing approach, each user defines and implements the files needed and
software applications to manipulate those files.
The various files are likely to have different formats, the programs may be written in
different languages, and the same information may be duplicated in several files.
Data redundancy leads to
o wasted storage space,
o duplication of effort (when multiple copies of a datum need to be updated),
o a higher likelihood of the introduction of inconsistency.
A good database design stores each logical data item in only one place, which ensures consistency and
saves storage.
Sometimes, however, controlled redundancy is necessary to improve performance.
The DBMS should be able to control this redundancy and maintain consistency through
checks specified during database design.
3. Flexibility
It may be necessary to change the structure of a database as requirements change.
A modern DBMS allows certain types of changes to the structure of the database without
affecting the stored data and the existing application programs.
4. Availability of Up-to date information
Availability of immediate up to date information is essential for many transaction
processing applications.
In a DBMS, an update applied to the database by one user can immediately be seen by other
users.
5. Economies of scale
The DBMS approach permits consolidation of data and applications, thus reducing the
amount of wasteful overlap between the activities of data processing personnel. It also
reduces the storage space required.
The database approach provides data abstraction. Data abstraction refers to the suppression
of details of data organization and storage and the highlighting of the essential features for an
improved understanding of data. The structure of a database includes the data types, relationships,
and constraints that should hold on the data. A data model is a collection of concepts that can
be used to describe the structure of a database, and thus it provides the necessary means to
achieve this abstraction. Most data models also include a set of basic operations for specifying
retrievals and updates on the database. Data models may also include the dynamic aspect or
behavior of a database application. Concepts to specify behavior are fundamental to
object-oriented data models but are also being incorporated in more traditional relational data models
(e.g. stored procedures).
The description of a database is called the database schema. It is specified during database
design and is not expected to change frequently. A schema diagram displays the structure of each
record type but not the actual instances of records. Each object in the schema is known as a
schema construct, e.g. a student table. A schema diagram displays only some aspects of a schema,
such as the names of record types and data items and some types of constraints; other aspects are
not specified.
The data in the database at a particular moment is called a database state or snapshot. This
state is also called the current set of occurrences or instances in the database. Many database
states can be constructed to correspond to a particular database schema. When a new database is
defined, its state is the empty state with no data. When the database is first populated, it
enters its initial state. The DBMS is partly responsible for ensuring that every state of the
database is a valid state, that is, a state that satisfies the structure and constraints specified in
the schema. Hence, specifying a correct schema to the DBMS is extremely important. The DBMS
stores the descriptions of the schema constructs and constraints in the catalog (metadata).
The schema is called the intension and a database state is called an extension of the schema. Changes in application
requirements result in schema evolution.
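The difference between the schema and the states it admits can be sketched with SQLite. The student table and its CHECK constraint are assumptions introduced for illustration; snapshot is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema (intension): structure plus constraints.
conn.execute(
    "CREATE TABLE student ("
    "sid INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "year INTEGER CHECK (year BETWEEN 1 AND 4))"
)


def snapshot(connection):
    """Return the current database state (extension) as a list of rows."""
    return list(
        connection.execute("SELECT sid, name, year FROM student ORDER BY sid")
    )


empty_state = snapshot(conn)  # the state right after definition

conn.execute("INSERT INTO student VALUES (1, 'Asha', 2)")
initial_state = snapshot(conn)  # the state after first population

# An update that would produce an invalid state is rejected by the DBMS.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 7)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

later_state = snapshot(conn)
```

The schema is written once, while snapshot returns a different value each time the data changes; the rejected insert leaves the state exactly as it was.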
Three-Schema Architecture
The schema in DBMS can be described at three levels:
o Internal level has an internal schema
o Conceptual level has a conceptual schema
o External level includes a number of external schemas or user views
The information about all three schemas is stored in the system catalog.
Internal Schema
The internal schema specifies complete details of storage and access paths for the database.
File organization on the disk should be decided e.g. hashing, indexing etc. The process of
arriving at a good physical database schema is called physical database design.
Conceptual Schema
The conceptual schema (or logical schema) describes the structure of the database. The
conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints. A
representational data model is used to describe the conceptual schema when a database is
implemented. The process of arriving at a good conceptual database schema is called
conceptual database design.
External Schema
The external schemas allow data access to be customized at the level of individual users and
hide the rest of the details from those users. Any database has exactly one conceptual schema and
one physical schema but may have many external schemas to support different users.
Each external schema consists of a collection of one or more views or tables. The
external schema design is guided by end user requirements, e.g. courseinfo (cid, fname, sid).
Important points
The three schemas are only descriptions of data; the stored data actually exists only at the
physical level.
The DBMS must transform a request specified on an external schema into a request against
the conceptual schema, and then into a request on the internal schema for processing
over the stored database.
The processes of transforming requests and results between levels are called mappings.
Data Independence
One of the most important benefits of using three schema architecture is its support for data
independence. Applications are insulated from how data are structured and stored. Data
independence is the capacity to change the schema at one level of a database without changing
the schema at the next higher level.
o Logical data independence
o Physical data independence
Important points
Physical data independence exists in most databases.
But logical data independence is hard to achieve.
Data independence occurs because when the schema is changed at some level, the
schema at higher level remains unchanged; only the mapping between the two levels is
changed.
Two levels of mappings create an overhead during compilation or execution of a query
or program, leading to inefficiencies in the DBMS.
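A minimal sketch of both kinds of data independence, assuming SQLite and a hypothetical student table with a cs_names user view. The logical case shown here (adding a column) is one of the easy ones; changes that remove or restructure data that a view depends on are harder to hide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "CS"), (2, "Ravi", "Math"), (3, "Meena", "CS")],
)

# External level: a user view defined on top of the conceptual schema.
conn.execute("CREATE VIEW cs_names AS SELECT name FROM student WHERE major = 'CS'")

# The application's query is written once and never edited below.
QUERY = "SELECT name FROM cs_names ORDER BY name"
baseline = conn.execute(QUERY).fetchall()

# Physical data independence: add an access path at the internal level.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
after_index = conn.execute(QUERY).fetchall()

# Logical data independence: extend the conceptual schema with a column.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_new_column = conn.execute(QUERY).fetchall()
```

In both cases the query text and the view definition stay the same; only the lower-level description changed.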
Entity-Relationship Model
ER Model
A database can be modeled as:
o a collection of entities having attributes
o relationships among entities
Entity, Entity type, Entity set
Attributes, Type of attributes
Keys
Relationships, Relationship type
Roles
Constraints
Entities
An entity is an object that has existence and is distinguishable from other objects.
Physical existence
o Person, car, employee etc.
Conceptual existence
o Company, job, university course
An entity lies within the scope of the business world being modeled. Each entity has attributes
– the particular properties that describe it.
Attributes
Attributes are properties used to describe an entity. For example, an EMPLOYEE entity may have
a Name, SSN, Address, Sex, and BirthDate. A specific entity will have a value for each of its
attributes. For example, a specific employee entity may have Name='Ram', SSN='123456789',
Address='731, RR Nagar, Bangalore, Karnataka', Sex='M', BirthDate='09-JAN-65'. Each attribute
has a data type associated with it, e.g. integer, string, date. Each attribute must have a
name that is unique within its entity type.
Types of Attributes
Simple
Each entity has a single atomic value for the attribute. For example, SSN or Sex.
Composite
The attribute may be composed of several components. The value of a composite attribute is
the concatenation of the values of its simple attributes. Composition may form a hierarchy
in which some components are themselves composite. If the components of a composite
attribute have to be referred to individually, it is necessary to store the components separately. If a composite
attribute is referred to only as a whole, there is no need to subdivide it into component attributes,
e.g. if an address is only ever used as a whole, there is no need to divide it.
For Example:
Address (Apt#, House#, Street, City, State, ZipCode, Country)
Name (FirstName, MiddleName, LastName).
Single-valued
The attribute has only a single value for a particular entity, e.g. DOB.
Multi-valued
An entity may have multiple values for that attribute.
A multi-valued attribute may have lower and upper bounds on the number of values.
For example, Color of a CAR: {Color}
phone_numbers: {phone number}
Stored attributes
Stored in the database
Derived attributes
Can be computed from other related attributes
E.g. Birthdate: stored
Age: derived (from Birthdate)
Joining date: stored
Years of Experience: derived (from Joining date)
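A derived attribute is computed on demand from a stored one. The sketch below derives Age from a stored Birthdate; the function name age_on and the dates used are illustrative assumptions.

```python
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Derive Age (in completed years) from the stored Birthdate."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


birth_date = date(1965, 1, 9)  # the stored attribute
age_day_before = age_on(birth_date, date(2024, 1, 8))
age_on_birthday = age_on(birth_date, date(2024, 1, 9))
```

Because Age is not stored, it can never become stale; the cost is that it must be recomputed each time it is needed.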
Null values
A particular entity may not have a value for a particular attribute, either because
o the value is not applicable, or
o the value is unknown.
In such cases the special value NULL is used,
e.g. not every employee has a fax number.
Complex attributes
Composite attributes can be nested arbitrarily.
Components of a composite attribute can be shown in () and multivalued attribute can
be shown in {}.
For example:
{Address_phone (
{phone (Area_code, phone_number)},
Address (Street_Address
(Number, Street, Apartment_No), City, State, Zip)
) }
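The nesting in this notation maps naturally onto nested record types, with () becoming a record and {} becoming a list. The sketch below uses Python dataclasses; the class names follow the attribute names above, while Person and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phone:  # (Area_code, phone_number)
    area_code: str
    phone_number: str


@dataclass
class StreetAddress:  # (Number, Street, Apartment_No)
    number: str
    street: str
    apartment_no: str


@dataclass
class Address:  # (Street_Address(...), City, State, Zip)
    street_address: StreetAddress
    city: str
    state: str
    zip_code: str


@dataclass
class AddressPhone:  # ({phone(...)}, Address(...))
    phones: List[Phone]  # {} in the notation: multi-valued
    address: Address


@dataclass
class Person:
    name: str
    address_phones: List[AddressPhone]  # the outer {}: multi-valued


# Sample values are invented for illustration.
home = AddressPhone(
    phones=[Phone("080", "5550101"), Phone("080", "5550102")],
    address=Address(
        StreetAddress("731", "Main Road", "2B"), "Bangalore", "Karnataka", "560001"
    ),
)
person = Person(name="Ram", address_phones=[home])
phone_count = sum(len(ap.phones) for ap in person.address_phones)
```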
Entity Type
An entity type defines a collection of entities that have the same attributes.
Each entity type is described in the database by its name and attributes.
Each entity type must have a name that is unique across the entire model and has a
consistent meaning across the modelling team and the end users.
For example,
o EMPLOYEE is an entity type
o PROJECT is an entity type.
Entity set
The collection of all entities of a particular entity type in the database at any point of
time is called an entity set.
The entity set is usually referred to using the same name as the entity type.
For example,
EMPLOYEE refers to both
o ‘type of entity’
o ‘set of entity’
Notation
An entity type is shown as a rectangular box enclosing the entity type name.
Attributes are shown as ovals attached to the entity type by straight lines.
Key attributes
An attribute of an entity type for which each entity must have a unique value is called a
key attribute of the entity type.
For example, SSN of EMPLOYEE
(A key attribute identifies each entity of an entity type uniquely.)
Uniqueness of the key attribute must hold for every extension of the entity type.
It is not a property of one particular extension of the entity type; it is a constraint on all
extensions of the entity type.
A key attribute may be composite.
For example,
Registration is a composite key of the CAR entity type with components (Number,
State).
Some entity types have more than one key attribute; each of those attributes is a key
on its own, and they are called candidate keys. All key attributes should be
underlined in the ER diagram. The key that is selected serves as the primary key. The other candidate keys
are alternate keys. An entity type may have no key; it is then called a weak entity type.
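A uniqueness check over one extension can be sketched as follows. The is_key helper and the CAR rows are hypothetical. Because a key is a constraint on every extension, a check like this can show that an attribute set is not a key, but a passing result on one extension does not prove that it is one.

```python
def is_key(entity_set, attributes):
    """Return True if the attribute combination is unique in this extension."""
    seen = set()
    for entity in entity_set:
        value = tuple(entity[name] for name in attributes)
        if value in seen:
            return False  # two entities share the same value
        seen.add(value)
    return True


# One extension of the CAR entity type; the rows are invented.
cars = [
    {"Number": "ABC123", "State": "KA", "Make": "Ford"},
    {"Number": "ABC123", "State": "TN", "Make": "Honda"},
    {"Number": "XYZ789", "State": "KA", "Make": "Ford"},
]

number_alone = is_key(cars, ["Number"])
registration = is_key(cars, ["Number", "State"])
```

Here Number alone repeats across states, while the composite (Number, State) is unique in this extension.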
$A : E \to P(V)$
An attribute A of entity type E whose value set is V is a function from E to the power set P(V) of V.
For a composite attribute,
$V = P(V_1) \times P(V_2) \times \cdots \times P(V_n)$
where $V_1, V_2, \ldots, V_n$ are the value sets of the components of the composite attribute.
Relationships
A relationship is an association among two or more entities.
Whenever an attribute of one entity type refers to the attribute of another entity type,
some relationship exists.
Specifically a relationship relates two or more distinct entities with a specific meaning
o For example, manager of the DEPARTMENT refers to an employee who manages
the department.
Relationship Type
Relationships of the same type are grouped into a relationship type.
o For example, the WORKS_FOR relationship type, in which EMPLOYEE and
DEPARTMENT entities participate,
o or the MANAGES relationship type, in which EMPLOYEE and DEPARTMENT
entities participate.
More than one relationship type can exist with the same participating entity types.
For example, MANAGES and WORKS_FOR are distinct relationship types between EMPLOYEE
and DEPARTMENT, with different meanings and different relationship instances.
Relationship Set
A relationship set is a set of relationships of the same type.
A relationship set and a relationship type are referred to by the same name.
The relationship set R is a set of relationship instances $r_i$, where each $r_i$ associates n
individual entities $(e_1, e_2, \ldots, e_n)$ and each entity $e_j$ in $r_i$ is a member of entity type $E_j$,
$1 \le j \le n$.
E.g. each employee and each department participates in the WORKS_FOR relationship.
Some instances in the WORKS_FOR relationship set, which represents the relationship type
WORKS_FOR between EMPLOYEE and DEPARTMENT.
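The definition of a relationship set translates directly into a set of tuples. In the sketch below the entity identifiers (e1, d1, and so on) and the helper name is_valid_relationship_set are hypothetical.

```python
# Entity sets, with entities represented by invented identifiers.
employees = {"e1", "e2", "e3"}
departments = {"d1", "d2"}

# The WORKS_FOR relationship set: each instance associates one
# EMPLOYEE entity with one DEPARTMENT entity.
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}


def is_valid_relationship_set(instances, *entity_sets):
    """Check that the j-th entity of every instance belongs to the j-th entity set."""
    return all(
        len(instance) == len(entity_sets)
        and all(entity in members for entity, members in zip(instance, entity_sets))
        for instance in instances
    )


valid = is_valid_relationship_set(works_for, employees, departments)

# An instance that refers to a department not in the entity set is invalid.
invalid = is_valid_relationship_set({("e1", "d9")}, employees, departments)
```

The number of entity sets passed in is the degree n of the relationship type; here n is 2.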
Binary Relationship
A relationship type in which two entity types participate.
WORKS_FOR is a binary relationship type; the participating entity types are EMPLOYEE and
DEPARTMENT.
Ternary Relationship
A relationship type in which three entity types participate.
SUPPLY is a ternary relationship type; the participating entity types are SUPPLIER, PROJECT and
PARTS.
Role names
Each entity type that participates in a relationship type plays a particular role in the
relationship.
Role name signifies the role that a participating entity plays in each relationship
instance.
For example,
o In the WORKS_FOR relationship type, EMPLOYEE plays the role of employee or
worker and DEPARTMENT plays a role of department or employer.
Each participating entity type name can be used as role name.
Recursive relationship
In some cases, the same entity type participates more than once in a relationship type, playing
a different role each time.
In such cases, role names become necessary for distinguishing the meaning of each
participation.
Such relationship types are called recursive relationships.
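A recursive relationship can be sketched with named roles. The SUPERVISION relationship type, its role names (supervisor and supervisee), and the employee identifiers are assumptions chosen for illustration; the notes themselves do not name an example.

```python
from typing import NamedTuple


class Supervision(NamedTuple):
    """One instance of a recursive relationship on EMPLOYEE."""

    supervisor: str  # EMPLOYEE in the supervisor role
    supervisee: str  # EMPLOYEE in the supervisee role


supervision = {
    Supervision(supervisor="e1", supervisee="e2"),
    Supervision(supervisor="e1", supervisee="e3"),
    Supervision(supervisor="e2", supervisee="e4"),
}


def supervisees_of(employee):
    """Employees for whom the given employee plays the supervisor role."""
    return sorted(r.supervisee for r in supervision if r.supervisor == employee)


def supervisor_of(employee):
    """The supervisor of the given employee, or None if there is none."""
    return next((r.supervisor for r in supervision if r.supervisee == employee), None)


team_of_e1 = supervisees_of("e1")
boss_of_e4 = supervisor_of("e4")
boss_of_e1 = supervisor_of("e1")
```

Without the role names, the pair (e1, e2) would not say who supervises whom; e2 appears in both roles across different instances.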
Constraints on Relationships
Structural constraints on relationship types (also known as ratio constraints) are
determined from the miniworld situation.
o Maximum cardinality (or cardinality ratio)
o Minimum cardinality (also called the participation constraint or existence dependency
constraint)
Total Participation
Total participation is the constraint that every entity in the entity set participates in at least
one relationship in the relationship set.
Total participation is also called existence dependency.
It is shown in the ER diagram by a double line linking the entity type to the relationship.
For example
o Every employee must work in some department
o Every employee must work on some project
Partial Participation
Partial participation is the constraint under which some entities may not participate in any
relationship in the relationship set.
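Whether a given extension satisfies total participation can be checked mechanically. The sketch below assumes a WORKS_FOR relationship set with invented identifiers; the participation helper is hypothetical. As with keys, the constraint itself comes from the miniworld, so the check only reports what holds in one database state.

```python
employees = {"e1", "e2", "e3"}
departments = {"d1", "d2", "d3"}

# WORKS_FOR instances as (employee, department) pairs; invented data.
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}


def participation(entity_set, relationship_set, position):
    """Classify participation of an entity set in this relationship state.

    position is the index of the entity set within each instance tuple.
    """
    participating = {instance[position] for instance in relationship_set}
    return "total" if entity_set <= participating else "partial"


employee_participation = participation(employees, works_for, 0)
department_participation = participation(departments, works_for, 1)
```

Every employee appears in some instance, so EMPLOYEE participates totally in this state; department d3 has no employees, so DEPARTMENT participates only partially.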
3. An entity type EMPLOYEE with attributes SSN, Name, Sex, Address, Salary, Birth_date,
Department and Supervisor. Name and Address are composite; SSN is the key attribute. The Projects on
which an employee is working and the Number_Of_Hours worked on each are also recorded.