Unit-I-Data Models
of DBMS
By
Dr. Rifaqat Ali
Department of Mathematics and Scientific Computing
National Institute of Technology Hamirpur
Email id : rifaqatali@nith.ac.in
Schemas, Subschema and Instances:
Schema:
A schema is a plan of the database that gives the names of the entities
and attributes and the relationships among them.
A schema includes the definition of the database name, the record
types and the components that make up the records.
Alternatively, it is defined as a framework into which the values of
the data items are fitted.
The values fitted into the framework change regularly, but the
format of the schema remains the same.
For Example: Consider the database consisting of three files ITEM,
CUSTOMER and SALES.
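A minimal sketch of what such a schema could look like, using Python's built-in sqlite3 module. The source only names the three files, so every column name below is a hypothetical choice made for illustration.

```python
import sqlite3

# Hypothetical columns: the slides name only the files ITEM, CUSTOMER and SALES.
SCHEMA = """
CREATE TABLE item     (item_id INTEGER PRIMARY KEY, description TEXT, price REAL);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales    (sale_id INTEGER PRIMARY KEY,
                       item_id INTEGER REFERENCES item(item_id),
                       customer_id INTEGER REFERENCES customer(customer_id),
                       quantity INTEGER);
"""

def describe_schema(conn):
    """Return {table name: [column names]} read from the database catalog."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: [col[1] for col in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables}

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```

The printed names and columns are the "plan"; no data values have been stored yet.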
A schema can be divided into two categories: (i) logical
schema and (ii) physical schema.
(i) The logical schema is concerned with exploiting the data
structures offered by the DBMS so that the schema becomes
understandable to the computer. It is important because programs use it
to construct applications.
(ii) The physical schema is concerned with the manner in which the
conceptual database gets represented in the computer as a stored
database. It is hidden behind the logical schema and can usually be
modified without affecting the application programs.
Subschema:
A subschema is a subset of the schema and has the same
properties that a schema has.
It identifies the subset of areas, sets, records, and data names
defined in the database schema that is available to user sessions.
The subschema allows the user to view only that part of the
database that is of interest to them.
The subschema defines the portion of the database as seen by
the application programs, and different application programs can have
different views of the data stored in the database.
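In a relational DBMS, one common way to approximate a subschema is a view. The sketch below assumes a hypothetical customer table and a hypothetical sales-desk user who should see contact details but not credit limits; none of these names come from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT,
                       phone TEXT, credit_limit REAL);
INSERT INTO customer VALUES (1, 'Asha', '555-0101', 5000.0);
INSERT INTO customer VALUES (2, 'Ravi', '555-0102', 12000.0);
-- Subschema for the hypothetical sales-desk user: contact details only.
CREATE VIEW customer_contact AS
    SELECT customer_id, name, phone FROM customer;
""")

cur = conn.execute("SELECT * FROM customer_contact ORDER BY customer_id")
visible_columns = [d[0] for d in cur.description]  # columns this user can see
rows = cur.fetchall()
print(visible_columns)
print(rows)
```

The credit_limit column still exists in the schema, but it is not part of what this user's view exposes.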
Instances: The data in the database at a particular moment in time is
called an instance or a database state.
• In a given instance, each schema construct has its own current set of
instances.
• Every time we update (i.e., insert, delete or modify) the value of a data
item in a record, one state of the database changes into another state.
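The difference between schema and instance can be sketched as follows: each update moves the database to a new state while the column layout stays fixed. The item table and its columns are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, description TEXT)")

def columns(conn):
    """The schema part: column names of the item table."""
    return [c[1] for c in conn.execute("PRAGMA table_info(item)")]

def state(conn):
    """The instance part: all rows currently stored in the item table."""
    return conn.execute("SELECT * FROM item ORDER BY item_id").fetchall()

schema_before = columns(conn)
state_0 = state(conn)                                   # empty state
conn.execute("INSERT INTO item VALUES (1, 'pen')")
state_1 = state(conn)                                   # after insert
conn.execute("UPDATE item SET description = 'pencil' WHERE item_id = 1")
state_2 = state(conn)                                   # after modify
conn.execute("DELETE FROM item WHERE item_id = 1")
state_3 = state(conn)                                   # after delete
schema_after = columns(conn)

print(state_0, state_1, state_2, state_3)
print(schema_before == schema_after)
```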
Three Level Architecture of Database Systems:
• The three level architecture was suggested by ANSI/SPARC.
• The database is divided into three levels:
i. External level,
ii. Conceptual level
iii. Internal (Physical) level
(i) Internal (Physical) Level:
This level describes the physical storage of the data, i.e., how the
data is actually stored.
This level is not relational because data is stored according to
various coding schemes instead of in tabular form.
This is the low-level representation of the entire database.
The internal level is concerned with the following aspects: storage
space allocation, access paths, and data compression and encryption
techniques.
(ii) Conceptual Level:
It is also known as the logical level and describes the overall logical
structure of the whole database for a community of users.
This level is relational because the data visible at this level will be
relational tables and the operators will be relational operators.
This level represents the entire contents of the database in an abstract
form in comparison with the physical level.
The conceptual schema is defined at this level; it hides the actual
physical storage and concentrates on the relational model of the
database.
(iii) External Level:
It is concerned with individual users and describes the actual view of
the data seen by each of them.
It is defined by the DBA for every user.
The remaining part of the database is hidden from that user. This
means the user can access only that part of the database for which
s/he is authorized by the DBA.
Different Mappings in Three Level Architecture of DBMS:
Each user is able to access the same data but has a different
customized view of the data as per their requirements.
Changes to the physical storage organization, e.g., moving the
database to a new storage device, do not affect the internal structure
of the database.
To use the database, the user need not be concerned with the
physical data storage details.
The conceptual structure of the database can be changed by the DBA
without affecting any user.
The database storage structure can be changed by the DBA without
affecting the user's view.
Data Independence:
It is defined as the ability of a database system to change the
schema at one level without requiring a change to the
schema at the next higher level.
It can also be defined as the immunity of the application
programs to changes in the physical representation and access
techniques of the database.
The above definition says that the application programs do not
depend on any particular physical representation or access
technique of the database.
Data independence is of two types: physical and logical.
1. Physical Data Independence:
The physical storage structures used for storing the data could
be changed without changing the conceptual view or any of the
external views.
Only the mapping between the conceptual and internal level is
changed.
2. Logical Data Independence:
The conceptual schema can be changed without changing the
existing external schemas.
Only the mapping between the external and conceptual levels is
changed; it absorbs all the changes made to the conceptual
schema.
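Both kinds of independence can be sketched with sqlite3. The example assumes a hypothetical student table, an external view student_names, and an application query that reads only from that view. Adding an index stands in for a physical change, and renaming the base table and its columns stands in for a conceptual change absorbed by redefining the view (the external/conceptual mapping).

```python
import sqlite3

# The application's query: it only knows the external view, not the base tables.
APP_QUERY = "SELECT name FROM student_names ORDER BY id"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
INSERT INTO student VALUES (1, 'Asha', 'Shimla');
INSERT INTO student VALUES (2, 'Ravi', 'Hamirpur');
CREATE VIEW student_names AS SELECT id, name FROM student;
""")
baseline = conn.execute(APP_QUERY).fetchall()

# Physical change: add an access path. Conceptual and external schemas are untouched.
conn.execute("CREATE INDEX idx_student_name ON student(name)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Conceptual change: restructure the base table, then update only the mapping (the view).
conn.executescript("""
DROP VIEW student_names;
CREATE TABLE person (person_id INTEGER PRIMARY KEY, full_name TEXT, city TEXT);
INSERT INTO person SELECT id, name, city FROM student;
DROP TABLE student;
CREATE VIEW student_names AS
    SELECT person_id AS id, full_name AS name FROM person;
""")
after_logical = conn.execute(APP_QUERY).fetchall()

print(baseline, after_physical, after_logical)
```

APP_QUERY is never edited; only the index and the view definition change around it.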
Data Models
• A data model is used to describe the structure of the database
including data types, relationships and the constraints that apply
on the data.
• A data model helps in understanding the meaning of the data
and ensures that the following are understood:
The data requirements of each user.
The use of data across various applications.
The nature of data independent of its physical representations.
• A data model supports communication between the users and
database designers.
Characteristics of Data Models:
Diagrammatic representation of the data model.
Simplicity in designing i.e., Data and their relationships can be
expressed and distinguished easily.
Application independent, so that different applications can share
it.
Data representation must be without duplication.
Bottom-up approach must be followed.
Consistency and structure validation must be maintained.
Types of Data Models:
• In the relational model, the data is represented in the form of tables; the
word "table" is used interchangeably with the word "relation".
• Each table consists of rows, also known as tuples (a tuple represents a
collection of information about an item, e.g., a student record), and columns,
also known as attributes (an attribute represents a characteristic of an
item, e.g., a student's name and phone number).
Advantages: structural independence, improved conceptual simplicity,
ad hoc query capability, and a powerful DBMS.
Disadvantages: substantial hardware and software overhead, and it can
facilitate poor design and implementation.
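The table terminology can be illustrated with a small sketch. The student table, its columns and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, phone_no TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "555-0101"), (2, "Ravi", "555-0102")])

cur = conn.execute("SELECT * FROM student ORDER BY roll_no")
attributes = [d[0] for d in cur.description]  # columns of the relation
tuples = cur.fetchall()                       # rows of the relation

print("attributes:", attributes)
for t in tuples:
    print("tuple:", t)
```

Each tuple carries exactly one value per attribute, which is what makes the table a relation.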
(ii) Object Based Data Models: These models are used in
describing the data at the logical and user view levels.
These models allow the users to implicitly specify the constraints
on the data.
Further categorised into four types:
a) Entity Relationship Model (ER-Model)
b) Object Oriented Model
c) Semantic Data Model
d) Functional Data Model
a) Entity Relationship Model (ER-Model)
• The E-R model is a high level conceptual data model developed by Chen in
1976 to facilitate database design.
• The E-R model is a generalization of earlier commercial models
such as the hierarchical and network models.
• It also allows the representation of the various constraints as well as their
relationships.
• Advantages: it is conceptually simple, an effective communication tool and
can be integrated with the relational data model.
• Disadvantages: limited constraint representation, limited
relationship representation, no data manipulation language and loss of
information content.
b) Object Oriented Model
• It is a logical data model that captures the semantics of objects as supported in
object-oriented programming.
• It is based on collection of objects, attributes and relationships which together
form the static properties.
• An object is a collection of data and methods. When different objects of same
type are grouped together they form a class. This model is used basically for
multimedia applications as well as data with complex relationships.
Advantages: Capability to handle various data types, improved data access,
and improved productivity.
Disadvantages: Not suitable for all applications, No precise definition, and
Difficult to maintain.
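The idea of an object as a collection of data and methods, with objects of the same type forming a class, can be sketched as below. The Student class and its fields are hypothetical and only illustrate the terminology.

```python
from dataclasses import dataclass, field

@dataclass
class Student:
    """A class groups objects of the same type: each has data and methods."""
    name: str                                   # data (attribute)
    marks: list = field(default_factory=list)   # data (attribute)

    def add_mark(self, mark: float) -> None:    # method
        self.marks.append(mark)

    def average(self) -> float:                 # method
        return sum(self.marks) / len(self.marks) if self.marks else 0.0

# Two objects of the same class.
asha = Student("Asha")
ravi = Student("Ravi")
asha.add_mark(80)
asha.add_mark(90)
print(asha.average(), ravi.average())
```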
c) Semantic Data Model
• These models are used to express greater interdependencies among
entities of interest.
• These interdependencies enable the models to represent the semantics of
the data in the database.
d) Functional Data Model
• The functional data model describes those aspects of a system concerned with
the transformation of values: functions, mappings, constraints and functional
dependencies.
Comparison of Various Data Models
(iii) Physical Data Models: These models provide the concepts that
describe the details of how the data is stored in the computer, along
with its record structures, access paths and ordering.
Further divided into two types:
a) Unifying Model: It is expressed in the Unified Modeling Language (UML).
Class diagrams are used to capture the logical structure of a domain as a set
of classes, their features (attributes), and the relationships
(associations) between them.
b) Frame Memory Model: It is a virtual view of secondary storage that
can be implemented with reasonable overhead to support database
record storage and access requirements.
Note that: Only specialized or professional users can use these models.
Database Languages:
The general-purpose database language common to relational database systems is
Structured Query Language (SQL).
It is divided into the following four parts: DDL, DML, DCL, and TCL.
1. Data Definition Language (DDL): The database language used to
define, drop, and alter (change) database
objects, such as tables, views, and users, is known as Data Definition Language
(DDL).
For example:
DDL to create a ZONE table.
Create table zone (zoneid integer, zonename char(20));
DDL to alter the ZONE table, changing the data type of zoneid from
integer to char(1) (MySQL-style syntax; the exact form varies between DBMSs).
Alter table zone modify column zoneid char(1);
DDL to drop (delete) a ZONE table.
Drop table zone;
2. Data Manipulation Language (DML): The database language used to
insert, modify, delete, or retrieve data in tables or
views is known as Data Manipulation Language (DML).
For example:
DML to insert data in a ZONE table.
Insert into zone values(1, 'North');
DML to change the value of zonename from 'North' to 'East' for
zoneid = 1.
Update zone set zonename = 'East' where zoneid = 1;
DML to delete a record of zoneid 1 from ZONE table.
Delete from zone where zoneid = 1;
DML to retrieve a record of zoneid 1 from ZONE table.
Select * from zone where zoneid = 1;
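The statements above can be tried end to end with Python's built-in sqlite3 module, as in the sketch below. SQLite is an assumption made here for convenience; it does not support the "modify column" form of Alter table, so that statement is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create the ZONE table.
conn.execute("CREATE TABLE zone (zoneid INTEGER, zonename CHAR(20))")

# DML: insert, then retrieve.
conn.execute("INSERT INTO zone VALUES (1, 'North')")
after_insert = conn.execute("SELECT * FROM zone WHERE zoneid = 1").fetchall()

# DML: update, then retrieve.
conn.execute("UPDATE zone SET zonename = 'East' WHERE zoneid = 1")
after_update = conn.execute("SELECT * FROM zone WHERE zoneid = 1").fetchall()

# DML: delete, then retrieve.
conn.execute("DELETE FROM zone WHERE zoneid = 1")
after_delete = conn.execute("SELECT * FROM zone WHERE zoneid = 1").fetchall()

# DDL: drop the ZONE table.
conn.execute("DROP TABLE zone")

print(after_insert, after_update, after_delete)
```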
Note that: Data Sub Language (DSL): The combination of Data Definition
Language and Data Manipulation Language is known as Data Sub Language
(DSL).
Difference between Data Independence and Structural Independence: