IT Series > A Fundamental Study of Database Management Systems
2.7 Database Development Process
The development of a complete database application is a lengthy and complicated process.
Different strategies can be used to develop database applications which are as follows:
2.7.1 General Strategies
A database application is developed to satisfy the requirements of the user. It is very
important to understand these requirements in detail. The application should be developed
according to the expectations of the user. Different techniques, such as interviews, are used
to find the requirements and needs of users. The requirements should be defined as early as
possible in the development process. There are two strategies to develop a database application:
1. Top-Down Development
2. Bottom-Up Development
1. Top-Down Development
This strategy starts with general issues and moves to specific issues. First of all, it is
important to find out the general goals of the organization and the means by which these goals
can be achieved. The requirements that must be satisfied to reach these goals are then defined.
This study gives an abstract data model of the system.
The user moves to detailed and specific issues using this model. This process identifies
a particular database and the related applications to be developed. Finally, the high-level data
model is transformed into low-level models. All identified systems, databases and applications
are developed.
2. Bottom-Up Development
This strategy starts with specific issues and moves to general issues. The user begins by
identifying a specific system to be developed. The requirements are found by studying the
existing system and by interviewing different users.
2.8 System Development Life Cycle
The system development life cycle (SDLC) is a conventional way to develop an information system.
It consists of many steps and involves different persons. The steps of SDLC are as follows:
2.8.1 Preliminary Investigation
Preliminary investigation is the first phase of SDLC. Its main objective is to identify
deficiencies and requirements in the user's current environment. An important result of the
preliminary investigation is whether the system to be developed is feasible or not.
Feasibility is determined on the following parameters:
* Whether technical resources or technology capable of handling the user's requirements
are available in the developer's organization or in the market.
* Whether the system is cost effective economically or financially.
* How effectively the user will operate this software once installed.
A feasibility study report is produced at the end of this phase. A final acceptance of the
proposed system is taken from the user. The next phase is started when the proposed system
is accepted.
Figure: Phases of System Development Life Cycle
2.8.2 Requirement Analysis
In this phase, the current business system is studied in detail to find out how it works
and where improvements are required. It includes a detailed study of the various operations
performed by the system and their relationships within and outside the system. The analyst
and the user work closely during the complete analysis phase. A detailed document called the
requirement specifications is prepared at the end of this phase.
2.8.3 System Design
The requirement analysis phase provides the requirements of the system. The next
phase is to design the new system to satisfy these requirements. The design phase states how
the system will meet the requirements identified during the analysis phase, as mentioned
in the requirements specifications.
Some activities performed during the design phase are as follows:
* Identification of data entry forms along with the data elements
* Identification of reports and outputs of the new system
* Design of the forms or displays as expected in the system. This may be done on paper or
on a computer display using any design tool.
* Identification of data elements and tables for database creation
* Procedures for deriving the output from the given input
The document produced at the end of this activity is called the design specification. The
detailed design specification is given to the programmers to start software development.
2.8.4 Software Development
In this phase, the actual coding of the programs is done. Programs are tested using dummy
data. Programmers also prepare the documentation related to the programs. The documentation
explains how and why a certain procedure was coded in a specific way.
2.8.5 System Testing
After the programs are tested individually, the system is tested as a whole. During
the system testing phase, all software modules are integrated and tested to ensure that they
run according to the specifications. Special test data is prepared as input for processing.
The results are examined to ensure that they are correct.
2.8.6 System Implementation
In this phase, the developed system is installed for use. The following activities are
performed before the actual usage of the system:
* User personnel are trained to operate the system.
* The data files needed by the system are constructed.
2.8.7 System Maintenance
The system may become less useful if any change occurs in the user environment. The
software may be modified for its effective use. The activity of system maintenance may vary
depending on the scale of modifications and enhancements.
2.9 Staged Database Design Approach
Another way to design an information system is known as the staged database design
approach. It is a top-down approach. It begins by analyzing the general requirements of the
organization. As the process continues, these requirements are analyzed in more detail. The
steps in this approach are as follows:
1. Analyze User Environment
The first step in designing a database is to understand and analyze the current user
environment. The designer should closely study the current system and its outputs. He
should also interview different users to know their current and future requirements.
2. Develop Logical Data Model
The designer develops a logical data model for the organization after analyzing the
user environment. This data model consists of all entities, attributes and relationships. The
designer also determines the following things:
* Types of applications and transactions
* Types of database access
* Volume of transactions
* Volume of data
* Frequency of database access
* Budgetary restrictions
* Performance requirements
3. Choose a DBMS
The designer chooses a particular database management system on the basis of the logical
data model. The selected DBMS should satisfy all requirements and constraints identified in
the logical data model.
4. Map Logical Model to DBMS
The designer maps the logical data model to the available data structures of the selected
database management system.
5. Develop Physical Model
The designer creates the exact layout of data according to the facilities of the selected DBMS
and the available software and hardware resources.
6. Evaluate Physical Model
The designer evaluates the physical model by checking the performance of applications
and transactions. The designer may implement a particular portion of the database to validate
the user views and performance effectively.
7. Perform Tuning
Tuning is performed to improve the performance of the database. Different modifications
are made to the physical model if required.
8. Implement Physical Model
The designer implements the physical model if the evaluation is satisfactory. The database
becomes functional.
Figure: Staged Database Design
The above figure shows that different steps can be repeated at different stages of the
development process. For example, the database designer can review the user environment
during the development of the logical model. While mapping the logical model to the physical
model, he can change the selection of DBMS. Similarly, if an error occurs and he has to tune
the system, he may need to change the DBMS or remap the logical model. Even at the last step,
he may review all steps from the very beginning.
2.10 Design Tools
Design tools are used to describe the design process in a standard way. A standard tool
is important because it provides a standard notation for designing specific systems. If there
is no standard tool, everyone may use different design notations that can be difficult for
others to understand. This situation can result in a more critical problem if both persons are
working on the same system.
2.10.1 Data Flow Diagram
A data flow diagram (DFD) shows the flow of data through an organization. It is used to design
systems graphically. DFD is very simple and it hides the complexities of the system.
2.10.1.1 Advantages of DFD
DFD provides the following advantages:
* It provides freedom from committing to the technical implementation of the system
too early.
* It helps in further understanding of the interrelationships of systems and subsystems.
* It is helpful in communicating current system knowledge to users.
* It helps in the analysis of the proposed system to determine if all the data and processes
have been defined.
2.10.1.2 Limitations of DFD
DFD has the following limitations:
* DFD does not provide any way of expressing decision points.
* DFD is only focused on the flow of information.
2.10.1.3 Symbols in DFD
Data flow diagram uses the following symbols:
Data Flow
The data flow symbol is used to express the flow of information from one entity to
another entity in the system. Data flow is a pipeline through which packets of information
flow. An arrow labeled with the name of the data is used for data flow.
Data Store
The data store symbol is used as a repository for the storage of data. It indicates that
data is permanently stored in the database. It is expressed with a rectangle that is open on the
right. The left width of the rectangle is drawn with double lines.
External Entity
An entity that interacts with the system from outside the system is called an external
entity. The external entities interact with the system in two ways. They may receive data
from the system or may produce data for the system. It is expressed as a rectangle.
Collector
The collector symbol is used to express several data flow connections terminating at a
single location. It is used to show the convergence of data to a single point.
Separator
The separator is used to separate data from a single source to multiple sinks.
Context diagrams are also used to represent and depict a system. A DFD is a more
detailed representation of the system. A context diagram only deals with the boundary of the
system. It gives the overall picture of a system. It represents the software model as a single
large process with input and output data, displayed by incoming and outgoing
arrows respectively. It focuses on the main data flow in a system. It establishes the
relationships between the external entities of a system and the system via the data (flows) that
they exchange. To enable a DFD to start from a context diagram, the single bubble of the context
diagram is partitioned into more bubbles to reveal the details of processing in the system.
The data flow diagrams are basically process-oriented. They are used to model the
functions performed by a system.
Figure: A data flow diagram shows how data moves through the existing system.
2.11 Database Administrator (DBA)
DBA is an important person in the development of any information system. He is
responsible for the design, operation and management of the database. He must be technically
[...] negotiating agreements etc.
2.11.1 Functions of DBA
The main responsibilities of the database administrator are as follows:
1. Preliminary Database Planning
The DBA may participate in preliminary database planning if appointed early.
2. Identifying User Requirements
The DBA identifies the current user environment. He closely studies the current system and
its outputs. He also interviews different users to know their current and future requirements.
3. Developing & Maintaining Data Dictionary
Data dictionary is a very useful collection of data about the database. The DBA stores data
item names, sources, meanings and usage in the data dictionary. The data dictionary is revised
regularly to keep it up to date as the project continues.
4. Designing Logical Model
After analyzing the user environment, the DBA develops a logical data model for the
organization. This data model consists of all entities, attributes and relationships.
5. Choosing a DBMS
The DBA chooses a particular database management system on the basis of the logical data
model. The selected DBMS should satisfy all requirements and constraints identified in the
logical data model.
6. Developing Physical Model
The DBA creates the exact layout of data according to the facilities of the selected DBMS and
the available software and hardware resources.
7. Creating & Loading Database
After developing the physical model, the DBA creates the structure of the database by using
the DBMS. He also loads the data into the database.
8. Developing User Views
10. Developing & Enforcing Data Standard
Data in the database must be inserted according to the standards required by the organization.
The DBA ensures that the inserted data always conforms to these standards. The user interface
should guide the user to insert proper data. For example, the text fields may contain default
text to make it easy for the user to judge the type of data to be inserted. Integrity means
that the database must always satisfy the rules that apply to the real world. For example, if an
employee can only work in one department, the database should not allow an employee to be
registered in two departments. Consistency means that two different pieces of data cannot
contradict each other. For example, if the date of birth of an employee is 7/8/78 at one place and [...]
11. Developing Operating Procedures
The DBA should establish procedures for different operations. The procedures include
security and authorization, recording hardware and software failures, taking performance
measurements, shutting down the database properly, and restarting and recovering after a failure.
12. Training the Users
The DBA should train end users, application programmers and other users so that they
can use the system properly.
13. Helping Database Users
The database administrator helps the database users by:
* Making sure that the data they require is available
* Assisting them in using the system correctly
14. Defining Backup and Recovery Procedures
If any portion of the database is damaged by human error or by hardware failure, it should be
restored as soon as possible. It is important to back up the information of the database on a
backup server so that it may be recovered in case of emergency.
15. Monitoring Performance
The DBA should monitor the performance of the database and tune it when required. He
should add or change the indexes. In some cases, the DBA may have to
change the physical model and reload the database.
2.12 Data Administrator (DA)
The need for a data administrator arises in a very large organization where many
[...]. The data administrator is responsible for the whole information resource [...]
functions. He also controls and manages the database, establishes data standards, communicates
[...], prepares the logical design, develops the data dictionary and plans the development of [...]
2.13 Data Dictionary
Data dictionary is a repository of information that describes the logical structure of the
database. It contains record types, data item types and data aggregates etc. Data dictionaries
in some systems store the database schema and can be used to create and process the database.
Data dictionary contains metadata. Metadata is the data about the data stored in the database.
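The idea of metadata can be illustrated with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as the DBMS, which the text does not prescribe, and a hypothetical employee table. SQLite keeps its own catalog of tables, which plays the role of a simple integrated data dictionary.

```python
import sqlite3

# In-memory database; the employee table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL, birthdate TEXT)"
)

# sqlite_master is SQLite's built-in catalog: data about the tables themselves.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(tables)   # names of the tables known to the DBMS
print(columns)  # column names and declared types of employee
```

The queries return information about the structure of the database, not the employee records themselves.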
2.13.1 Uses of Data Dictionary
Different uses of data dictionary are as follows:
* Information about Data: It is used to collect and store information about data in a
central location. It helps the management to get control over data as a resource.
* Communication with Users: It provides great help in communication as it stores the
exact meanings of data items. An exact definition of each item should be stored in the
data dictionary that can be used in case of any problem.
* Record of Change in Database Structure: It keeps track of changes to the database
structure. Changes such as the creation of a new data item or the modification of data item
descriptions should be recorded in the data dictionary.
* Determining the Impact of Change: Data dictionary records each item and its
relationships. The DBA can see the effects of a change.
* Recording Access Control Information: Data dictionary stores all information about
different authorized users. It also contains the types of access for all users.
* Audit Information: It can also keep a record of each access to the database. This
information can later be used for audit purposes.
2.13.2 Types of Data Dictionaries
Different types of data dictionaries are as follows:
1. Integrated Data Dictionary
A data dictionary that is part of the DBMS is called an integrated data dictionary. It performs
many functions throughout the life of the database, not only in the design phase. There are two
types of integrated data dictionary:
* Active Data Dictionary: The integrated data dictionary is called active if it is checked
by the DBMS every time a database is accessed. It is always consistent with the actual
database structure. It is automatically maintained by the system.
* Passive Data Dictionary: The integrated data dictionary is called passive if it is not
used in day-to-day database processing.
2. Freestanding Data Dictionary
A data dictionary that is available without a particular DBMS is called a freestanding
data dictionary. It can be a commercial product or a simple file developed by the designer.
Many CASE packages provide a data dictionary tool. It is preferable in the initial design stages
before choosing any particular DBMS.
2.14 Logical Database Design
The logical database design contains the definition of the data to be stored in the database.
It also contains the rules and information about the structure and type of data. All entities,
their attributes and their relationships are described in the logical model. It is the complete
description of the data stored in the database.
2.14.1 Logical Database Design Process
An overview of logical database design process is as follows:
Represent Entities
↓
Represent Relationships
↓
Merge the Relations
↓
Normalize the Relations
Figure: Logical Design Process
1. Represent Entities
Each entity in an E-R diagram is represented as a relation in the relational model. In this
process, the name of the entity becomes the name of the relation. The identifier of the entity type
becomes the primary key of the relation. The remaining attributes of the entity type become
non-key attributes of the relation.
The following example explains the process of converting an entity into a relation:
Figure: EMPLOYEE entity
EMPLOYEE (EmployeeID, Name, Address, Birthdate)
Figure: EMPLOYEE Relation
In the above example, the EMPLOYEE entity is converted into a relation. The attributes of
the entity are fields of the relation. The data model describes EmployeeID as an identifier, so
it is underlined. The above relation uses EmployeeID as its primary key.
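The conversion of the EMPLOYEE entity can be sketched in code. The sketch assumes SQLite through Python's sqlite3 module as the DBMS. The column types and the sample rows are hypothetical and are not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The entity name becomes the relation name; the identifier becomes the primary key.
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,  -- identifier of the entity
        name        TEXT,                 -- non-key attributes
        address     TEXT,
        birthdate   TEXT
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ali', 'Lahore', '1978-08-07')")

# A second row with the same primary key value is rejected by the DBMS.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Sara', 'Karachi', '1980-01-15')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_rejected, row_count)
```

The primary key guarantees that each employee is stored only once in the relation.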
2. Represent Relationships
Each relationship in an E-R diagram must also be represented in the relational model. The
representation depends on the nature of the relationship. In some cases, a relationship is
represented by making the primary key of one relation a foreign key of another relation. In
some cases, a separate relation is created to represent a relationship.
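Both ways of representing a relationship can be sketched as follows. The sketch assumes SQLite through Python's sqlite3 module. The department, project and works_on tables are hypothetical and are used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);

    -- Relationship as a foreign key: each employee belongs to one department.
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        dept_id     INTEGER REFERENCES department(dept_id)
    );

    CREATE TABLE project (project_id INTEGER PRIMARY KEY, title TEXT);

    -- Relationship as a separate relation: employees work on many projects.
    CREATE TABLE works_on (
        employee_id INTEGER REFERENCES employee(employee_id),
        project_id  INTEGER REFERENCES project(project_id),
        PRIMARY KEY (employee_id, project_id)
    );
""")

conn.execute("INSERT INTO department VALUES (10, 'Accounts')")
conn.execute("INSERT INTO employee VALUES (1, 'Ali', 10)")

# An employee that refers to a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Sara', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

A foreign key is usually enough for a one-to-many relationship. A separate relation such as works_on is needed when both sides can have many related rows.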
3. Merge the Relations
In some cases, there may be redundant relations. It means that two or more relations
may describe the same entity type. The redundant relations must be merged to remove the
redundancy. This process is also known as view integration. Suppose there are two relations
as follows:
EMP1 (EMPNO, NAME, ADDRESS, PHONE)
EMP2 (EMPNO, ENAME, EMP_ADDR, EMP_JOB_CODE, EMP_DOB)
The above relations EMP1 and EMP2 describe the same entity EMPLOYEE. They can be
merged into one relation. The result of merging the above relations is as follows:
EMP (EMPNO, NAME, ADDRESS, PHONE, EMP_JOB_CODE, EMP_DOB)
The new relation contains the attributes of both relations without any repeated attributes.
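The merging step can be sketched as a small function. The function merge_relations and the synonym table are hypothetical helpers. The sketch assumes that the designer has already decided that ENAME means NAME and EMP_ADDR means ADDRESS.

```python
def merge_relations(first, second, synonyms):
    """Return the attributes of `first`, followed by the new attributes of `second`.

    `synonyms` maps an attribute name in `second` to the name used in `first`.
    """
    merged = list(first)
    for attribute in second:
        canonical = synonyms.get(attribute, attribute)
        if canonical not in merged:  # skip attributes that are already present
            merged.append(canonical)
    return merged


emp1 = ["EMPNO", "NAME", "ADDRESS", "PHONE"]
emp2 = ["EMPNO", "ENAME", "EMP_ADDR", "EMP_JOB_CODE", "EMP_DOB"]

# Assumed synonyms: different names that describe the same attribute.
synonyms = {"ENAME": "NAME", "EMP_ADDR": "ADDRESS"}

emp = merge_relations(emp1, emp2, synonyms)
print(emp)
```

The hard part of view integration is deciding which attributes are synonyms. That decision is made by the designer and cannot be derived from the names alone.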
4. Normalize the Relations
The relations created in the previous steps may have some unnecessary redundancy.
Certain anomalies or errors may arise while updating these relations. The process of
normalization refines these relations to avoid these problems.
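One normalization step can be illustrated with a small sketch. The data is hypothetical. The sketch assumes that the department name depends only on the department number, so storing it with every employee is redundant.

```python
# Unnormalized rows: (employee_id, name, dept_id, dept_name).
# The department name is repeated for every employee of that department.
rows = [
    (1, "Ali", 10, "Accounts"),
    (2, "Sara", 10, "Accounts"),
    (3, "Omar", 20, "Sales"),
]

# Decompose into two relations so that each fact is stored only once.
employee = [(emp_id, name, dept_id) for emp_id, name, dept_id, _ in rows]
department = sorted({(dept_id, dept_name) for _, _, dept_id, dept_name in rows})

print(employee)
print(department)
```

After the decomposition, renaming a department changes one row in department. In the original rows the same change had to be repeated for every employee, which is how update anomalies arise.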
2.15 Physical Database Design
Physical design is the last stage of the database design process. The major objective of
physical database design is to implement the database as a set of records, files, indexes and
other data structures.
2.15.1 Major Inputs to Database Design
Three major inputs to physical database design are as follows:
1. Logical Database Structure
These are the structures developed during logical database design, such as normalized relations.
2. User Processing Requirements
It includes the size and frequency of database usage, response time, security, backup
and recovery etc.
3. Characteristics of DBMS
It includes the characteristics of the DBMS and other components of the computer operating
environment.
2.15.2 Components of Physical Database Design
Different components of physical database design are as follows:
1. Data Volume and Usage Analysis
It is used to estimate the size or volume and the usage patterns of the database. The estimate
of database size is used to select the physical storage devices. It is also used to determine the
costs of storage. The estimate of usage patterns is used to select file organization and access
methods. It is also used to plan for the use of indexes and a strategy for data distribution.
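A rough volume estimate can be sketched as simple arithmetic. The function estimate_table_bytes, the row counts, the row sizes and the 30 percent overhead are all assumptions for illustration. Real figures depend on the DBMS and its storage format.

```python
def estimate_table_bytes(row_count, avg_row_bytes, overhead_percent=30):
    """Estimate the storage for one table using integer arithmetic.

    overhead_percent is an assumed allowance for indexes and free space.
    """
    return row_count * avg_row_bytes * (100 + overhead_percent) // 100


# Hypothetical tables: (expected number of rows, average row size in bytes).
tables = {
    "employee": (50_000, 200),
    "department": (100, 100),
}

estimates = {name: estimate_table_bytes(rows, size) for name, (rows, size) in tables.items()}
total_bytes = sum(estimates.values())

print(estimates)
print(total_bytes)
```

An estimate like this is only a starting point for selecting storage devices. It should be revised when real data volumes become known.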
2. Data Distribution Strategy
Many organizations are using distributed computer networks nowadays. These
organizations face a significant problem in physical database design. The problem is that they
have to decide and select the nodes or sites in the network at which data will be located physically.
The basic data distribution strategies are as follows:
i. Centralized
In this strategy, all data is located at a single site. It is simple and easy to manage. This
strategy has three disadvantages:
* Data is not readily accessible at remote sites.
* Data communication costs may be very high.
* The database system fails totally when the central system fails.
ii. Partitioned
In this strategy, the database is divided into partitions or fragments. Each partition is
assigned to a particular site. The major advantage of this strategy is that data is moved closer
to the local user. Data becomes more easily accessible.
iii. Replicated
In this strategy, a full copy of the database is assigned to more than one site in the
network. This strategy maximizes local access. But it creates update problems because each
database change must be reliably processed and synchronized at all sites.
iv. Hybrid
In this strategy, the database is divided into critical and non-critical fragments. The
critical fragments are stored at multiple sites. The non-critical fragments are stored at one site
only.
3. File Organization
File organization is a technique for physically arranging the records of a file on
secondary storage devices. The system designer must recognize several constraints when selecting
a file organization. These constraints include the following:
* Physical characteristics of secondary storage devices
* Available operating systems and file management software
* User requirements for storing and accessing data
Criteria to Select File Organization
The criteria for selecting a file organization are as follows:
* Fast access for data retrieval
* High throughput for processing transactions
* Efficient use of storage space
* Protection from failure or data loss
* Minimizing need for data re-organization
* Security from unauthorized use
Organization Methods
The files are organized on storage media using the following methods:
a. Sequential Files
The records in sequential file organization are stored in sequence. A sequence means
the records are stored one after the other. The records can be retrieved only in the sequence in
which they were stored. The principal storage medium for sequential files is magnetic tape.
The major disadvantage of sequential access is that it is very slow. If the last record
is to be retrieved, all preceding records are read before reaching the last record.
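The cost of sequential access can be seen in a small sketch. The file is modelled here as an in-memory list of hypothetical records. The function sequential_find counts how many records are read before the requested one is found.

```python
def sequential_find(records, key):
    """Read records one after the other until the key is found.

    Returns the matching record (or None) and the number of records read.
    """
    reads = 0
    for record in records:
        reads += 1
        if record[0] == key:
            return record, reads
    return None, reads


# Hypothetical file of 1000 records stored in the order they were written.
records = [(number, f"record-{number}") for number in range(1, 1001)]

first, first_reads = sequential_find(records, 1)
last, last_reads = sequential_find(records, 1000)

print(first_reads, last_reads)
```

Retrieving the last record requires reading every record before it, which is why sequential access is slow for finding a single record.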
b. Direct or Random Files
The records in direct file organization are not stored in a particular sequence. A key
value of a record is used to determine the location to store the record. Each record is accessed
directly without going through the preceding records.
This file organization is suitable for storing data on disk. Direct file organization is
much faster than sequential file organization for finding a specific record.
A problem known as synonym may occur in this type of file. The problem occurs if
the same address is calculated to store two or more records.
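The calculation of an address from a key can be sketched as follows. The division-remainder method used here is only one possible choice and is an assumption, as are the key values and the number of storage locations.

```python
from collections import defaultdict


def address_of(key, bucket_count):
    """Compute a storage address from a key using the remainder of a division."""
    return key % bucket_count


keys = [101, 205, 312, 408]   # hypothetical record keys
bucket_count = 7              # assumed number of storage locations

# Group the keys by the address calculated for them.
keys_by_address = defaultdict(list)
for key in keys:
    keys_by_address[address_of(key, bucket_count)].append(key)

# Synonyms are keys that were given the same address.
synonyms = {addr: group for addr, group in keys_by_address.items() if len(group) > 1}

print(dict(keys_by_address))
print(synonyms)
```

Here the keys 205 and 408 receive the same address, so they are synonyms. A real system needs a rule for such cases, for example storing the second record in the next free location.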
c. Indexed Sequential Files
In indexed sequential file organization, records are stored in ascending or descending
order. The order is based on a value called the key. Additionally, indexed file organization
maintains an index in a file.
An index consists of key values and the corresponding disk address for each record in
the file. The index refers to the place on a disk where a record is stored. The index file is updated
whenever a record is added to or deleted from the file.
The records in indexed file organization can be accessed sequentially as well as
randomly or directly. The records in this file type require more space on storage
media. This method is slower than direct file organization as it requires an index
search.
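Both kinds of access can be sketched with a sorted list and a list of keys that acts as the index. The records are hypothetical and are held in memory, so list positions stand in for disk addresses.

```python
import bisect

# Hypothetical records, stored in ascending order of their key.
records = [(105, "Omar"), (101, "Ali"), (103, "Sara")]
records.sort(key=lambda record: record[0])

# The index holds the key values; position i in the index refers to records[i].
index_keys = [key for key, _ in records]


def find(key):
    """Direct access: search the index, then fetch the record at that position."""
    position = bisect.bisect_left(index_keys, key)
    if position < len(index_keys) and index_keys[position] == key:
        return records[position]
    return None


direct_hit = find(103)                      # direct access through the index
direct_miss = find(104)                     # key that is not in the file
in_sequence = [key for key, _ in records]   # sequential access in key order

print(direct_hit, direct_miss, in_sequence)
```

The index search is the extra step that makes this method slower than direct file organization, but it also keeps the records available in key order.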
4. Indexes
An index is a table that is used to determine the location of rows in a table. Indexes are
used to speed up the sorting and searching process. The performance of the database is
improved with these indexes. The index may be created on a primary key, secondary key or
foreign key etc.
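The effect of an index can be observed with a short sketch. It assumes SQLite through Python's sqlite3 module. The employee table, its data and the index name idx_employee_name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(number, f"name-{number}", number % 5) for number in range(1, 101)],
)

# Create an index on a secondary key (the name column).
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

# EXPLAIN QUERY PLAN describes how SQLite intends to run the query.
# Each row is (id, parent, notused, detail); the detail text names any index used.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT employee_id FROM employee WHERE name = ?",
    ("name-42",),
).fetchall()
uses_index = any("idx_employee_name" in row[3] for row in plan)

print(uses_index)
```

The exact wording of the plan differs between SQLite versions, so the sketch only checks whether the index name appears in it.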
5. Integrity Constraints
Database integrity means the correctness and consistency of data. It is another form of
database protection. Integrity is related to the quality of data. Integrity is maintained with the
help of integrity constraints. These constraints are the rules that are designed to keep data
consistent and correct. They act like a check on the incoming data. It is very important that a
database maintains the quality of the data stored in it. The DBMS provides several mechanisms to
enforce the integrity of the data.
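Some of these mechanisms can be sketched as follows. The sketch assumes SQLite through Python's sqlite3 module. The employee table, the rule that a salary cannot be negative, and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,              -- a name must always be given
        salary      INTEGER CHECK (salary >= 0) -- assumed rule: no negative salary
    )
""")


def try_insert(row):
    """Try to insert a row; return True if the DBMS accepts it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, "Ali", 50000)),   # valid row
    try_insert((2, None, 40000)),    # violates NOT NULL
    try_insert((3, "Sara", -10)),    # violates CHECK
]

print(results)
```

Only the first row is stored. The constraints act as a check on the incoming data, so incorrect rows never reach the table.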