DBMS Chapter2 Concepts and Architecture
DBMS Chapter2 Concepts and Architecture
According to C.J. Date (one of the leading database experts), a data model is an
abstract, self-contained, logical definition of the objects, operators, and so forth,
that together constitute the abstract machine with which users interact. The
objects allow us to model the structure of data; the operators allow us to model its
behavior.
Types of Data Models
1. High Level- Conceptual data model.
2. Low Level – Physical data model.
3. Relational or Representational
4. Object-oriented Data Models:
5. Object-Relational Models:
1. High Level-conceptual data model: User level data model is the high level or
conceptual model. This provides concepts that are close to the way that many
users perceive data.
2 .Low level-Physical data model: provides concepts that describe the details of
how data is stored in the computer model. Low level data model is only for
Computer specialists not for end-user.
3. Representation data model: It is between High level & Low level data model
Which provides concepts that may be understood by end-user but that are not too
far removed from the way data is organized by within the computer.
1. Relational Model
The Relational Model uses a collection of tables both data and the relationship
among those data. Each table has multiple columns and each column has a unique
name.
Relational database comprising of two tables.
Advantages
1. The main advantage of this model is its ability to represent data in a simplified
format.
2. The process of manipulating record is simplified with the use of certain key
attributes used to retrieve data.
3. Representation of different types of relationship is possible with this model.
2. Network Model
The data in the network model are represented by collection of records and
relationships among data are represented by links, which can be viewed as
pointers.
Advantages:
1. Representation of relationship between entities is implemented using pointers
which allows the representation of arbitrary relationship
2. Unlike the hierarchical model it is easy.
3. Data manipulation can be done easily with this model.
3. Hierarchical Model
A hierarchical data model is a data model which the data is organized into a tree
like structure. The structure allows repeating information using parent/child
relationships: each parent can have many children but each child has one parent.
All attributes of a specific record are listed under an entity type.
Advantages:
1. The representation of records is done using an ordered tree, which is
natural method of implementation of one–to-many relationships.
2. Proper ordering of the tree results in easier and faster retrieval of records.
Allows the use of virtual records. This result in a stable database especially when
modification of the data base is made
A schema diagram can display only some aspects of a schema like the name of
record type, data type, and constraints. Other aspects can't be specified through
the schema diagram. For example, the given figure neither show the data type
of each data item nor the relationship among various files.
In the database, actual data changes quite frequently. For example, in the given
figure, the database changes whenever we add a new grade or add a student.
The data at a particular moment of time is called the instance of the database.
DBMS Architecture ( three Schema Architecture)
A commonly used view of data approach is the three-level architecture
suggested by the ANSI/SPARC (American National Standards
Institute/Standards Planning and Requirements Committee). ANSI/SPARC
proposed an architectural framework for databases.
The three levels of the architecture are three different views of the data:
External Schema - individual user view
Conceptual Schema- Logical or community user view Physical Schema -
Internal or storage view
The three level database architecture allows a clear separation of the information
meaning (conceptual view) from the external data representation and from the
physical data structure layout. A database system that is able to separate the
three different views of data is likely to be flexible and adaptable.
The External Schema is the view that the individual user of the database has. This
view is often a restricted view of the database and the same database may
provide a number of different views for different classes of users.
The Conceptual schema (sometimes called the logical schema) describes the
stored data in terms of the data model of the DBMS. In a relational DBMS, the
conceptual schema describes all relations that are stored in the database.
It hides physical storage details, concentrating upon describing entities, data
types, relationships, user operations, and constraints.
The physical schema specifies additional storage details. Essentially, the physical
schema summarizes how the relations described in the conceptual schema are
actually stored on secondary storage devices such as disks and tapes.
It tells us what data is stored in the database and how.
Mappings
Process of transforming request and results between three level it's called mapping.
1. Conceptual/Internal Mapping
2. External/Conceptual Mapping
1. Conceptual/Internal Mapping:
The conceptual/internal mapping defines the correspondence between the
conceptual view and the store database.
It specifies how conceptual record and fields are represented at the internal level.
It relates conceptual schema with internal schema.
If structure of the store database is changed.
If changed is made to the storage structure definition-then the conceptual/internal
mapping must be changed accordingly, so that the conceptual schema can remain
invariant.
There could be one mapping between conceptual and internal levels.
2. External/Conceptual Mapping:
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers characteristic of being able to modify the schema at one
level of the database system without altering the schema at the next higher level.
For instance, the following statement in the SQL language is used to create the
account table:
create table account
(accountnumber number(10),
balance number(8));
The storage definition language (SDL), is used to specify the internal schema. The
mappings between the two schemas may be specified in either one of these
languages.
The view definition language (VDL), to specify user views and their mappings to
the conceptual schema, but in most DBMSs the DDL is used to define both
conceptual and external schemas.
In addition, it updates a special set of tables called the data dictionary or data
directory.
A data dictionary contains metadata—that is, data about data. The schema of a
table is an example of metadata. A database system consults the data dictionary
before reading or modifying actual data.
Data Manipulation Language
Data manipulation is
• The retrieval of information stored in the database
• The insertion of new information into the database
• The deletion of information from the database
• The modification of information stored in the database
Data Manipulation Language (DML) statements are used for managing data within
schema objects. Some examples:
o SELECT - retrieve data from the a database
o INSERT - insert data into a table
o UPDATE - updates existing data within a table
o DELETE - deletes all records from a table
The query in the SQL language finds the name of the customer whose customer-id
is 192:
select customername
from customer
where customerid = 192;
The query specifies that those rows from the table customer where the customerid
is 192 must be retrieved.
DBMS Interfaces
User-friendly interfaces provided by a DBMS may include the following.
Menu Based Interfaces for Web Clients or Browsing. These interfaces present the
user with lists of options, called menus, that lead the user through the formulation
of a request. Menus do away with the need to memorize the specific commands
and syntax of a query language; rather, the query is composed step by step by
picking options from a menu that is displayed by the system. Pull-down menus are
a very popular technique in Web-based user interfaces. They are also often used in
browsing interfaces, which allow a user to look through the contents of a database
in an exploratory and unstructured manner.
Forms Based Interfaces. A forms-based interface displays a form to each user.
Users can fill out all of the form entries to insert new data, or they fill out only
certain entries, in which case the DBMS will retrieve matching data for the
remaining entries. Forms are usually designed and programmed for naive users as
interfaces to canned transactions.
Graphical User Interfaces. A graphical interface (GUI) typically displays a schema
to the user in diagrammatic form. The user can then specify a query by
manipulating the diagram. In many cases, GUIs utilize both menus and forms.
Most GUIs use a pointing device, such as a mouse, to pick certain parts of the
displayed schema diagram.
Natural Language Interfaces. These interfaces accept requests written in English
or some other language and attempt to "understand" them. A natural language
interface usually has its own "schema," which is similar to the database conceptual
schema, as well as a dictionary of important words. The natural language interface
refers to the words in its schema, as well as to the set of standard words in its
dictionary, to interpret the request.
Interfaces for Parametric Users. Parametric users, such as bank tellers, often
have a small set of operations that they must perform repeatedly. Systems
analysts and programmers design implement a special interface for naive users.
Usually, a small set of abbreviated commands is included, with the goal of
minimizing the number of keystrokes required for each request.
Interfaces for the DBA. Most database systems contain privileged commands that
can be used only by the DBA's staff. These include commands for creating
accounts, setting system parameters, granting account authorization, changing a
schema, and reorganizing the storage structures of a database.
the top part of Figure, It shows interfaces for the DBA staff, casual users who work
with interactive interfaces to formulate queries, application programmers who
create programs using some host programming languages, and parametric users
who do data entry work by supplying parameters to predefined transactions. The
DBA staff works on defining the database and tuning it by making changes to its
definition using the DDL and other privileged commands.
The DDL compiler processes schema definitions, specified in the DDL, and stores
descriptions of the schemas (meta-data) in the DBMS catalog. The catalog
includes information such as the names and sizes of files, names and data types of
data items, storage details of each file, mapping information among schemas, and
constraints.
Casual users and persons with occasional need for information from the database
interact using some form of interface, which we call the interactive query
interface. These queries are parsed and validated for correctness of the query
syntax, the names of files and a query compiler that compiles them into an
internal form. This internal query is subjected to query optimization, the query
optimizer is concerned with the rearrangement and possible reordering of
It is now common to have the client program that accesses the DBMS running on
a separate computer from the computer on which the database resides. The
former is called the client computer running a DBMS client software and the
latter is called the database server. In some cases, the client accesses a middle
computer, called the application server, which in turn accesses the database
server.
Database System Utilities
In addition to possessing the software modules just described, most DBMSs have
database utilities that help the DBA manage the database system. Common
utilities have the following types of functions:
Loading. A loading utility is used to load existing data files—such as text
files or sequential files—into the database. Usually, the current (source) format of
the data file and the desired (target) database file structure are specified
to the utility, which then automatically reformats the data and stores it
in the database.
Backup. A backup utility creates a backup copy of the database, usually by
dumping the entire database onto tape or other mass storage medium. The
backup copy can be used to restore the database in case of disk failure.
Database storage reorganization. This utility can be used to reorganize a set
of database files into different file organizations, and create new access paths
to improve performance.
Performance monitoring. Such a utility monitors database usage and provides
statistics to the DBA. The DBA uses the statistics in making decisions
such as whether or not to reorganize files or whether to add or drop indexes
to improve performance.
Other utilities may be available for sorting files, handling data compression,
monitoring access by users, interfacing with the network, and performing other
functions.
Architectures for DBMSs have followed trends similar to those for general
computer system architectures. Earlier architectures used mainframe computers to
provide the main processing for all system functions, including user application
programs and user interface programs, as well as all the DBMS functionality. The
reason was that most users accessed such systems via computer terminals that
did not have processing power and only provided display capabilities. Therefore, all
processing was performed remotely on the computer system, and only display
information and controls were sent from the computer to the display terminals,
which were connected to the central computer via various types of
communications networks.
As prices of hardware declined, most users replaced their terminals with PCs and
workstations. At first, database systems used these computers similarly to how
they had used display terminals, so that the DBMS itself was still a centralized
DBMS in which all the DBMS functionality,
application program execution, and
user interface processing were carried out
on one machine.
The Figure illustrates the physical
components in a centralized
architecture. Gradually,
DBMS systems started to exploit
the available processing power at
the user side, which led to
client/server DBMS architectures.
Two-tier Client / Server architecture is used for User Interface program and
Application Programs that runs on client side. An interface called ODBC(Open
Database Connectivity) provides an API that allow client side program to call the
dbms. Most DBMS vendors provide ODBC drivers. A client program may connect to
several DBMS's. In this architecture some variation of client is also possible for
example in some DBMS's more functionality is transferred to the client including
data dictionary, optimization etc. Such clients are called Data server.
Three-tier Client / Server Architecture
Centralized Distributed
The first is the data model on which the DBMS is based. The
main data model used in many current commercial DBMSs is
the relational data model. The object data model has been
implemented in some commercial systems but has not had
widespread use. Many legacy applications still run on
database systems based on the hierarchical and network data
models.
The third criterion is the number of sites over which the database
is distributed. A DBMS is centralized if the data is stored at a
single computer site. A centralized DBMS can support multiple
users, but the DBMS and the database reside totally at a single
computer site. A distributed DBMS (DDBMS) can have the actual
database and DBMS software distributed over many sites,
connected by a computer network. Homogeneous DDBMSs use
the same DBMS software at all the sites, whereas heterogeneous
DDBMSs can use different DBMS software at each site.
The fourth criterion is cost. It is difficult to propose a classification
of DBMSs based on cost. Today we have open source (free) DBMS
products like MySQL and PostgreSQL that are supported by third-
party vendors with additional services. The main RDBMS products
are available as free examination 30-day copy versions as well as
personal versions,
We can also classify a DBMS on the basis of the types of access
path options for storing files. One well-known family of DBMSs is
based on inverted file structures. Finally, a DBMS can be general
purpose or special purpose. When performance is a primary
consideration, a special-purpose DBMS can be designed and built for
a specific application; such a system cannot be used for other
applications without major changes. Many airline reservations and
telephone directory systems developed in the past are special-
purpose DBMSs. These fall into the category of online transaction
processing (OLTP) systems, which must support a large number of
concurrent transactions without imposing excessive delays