Chapter - 7 Distributed Database System

Distributed databases allow data to be shared across multiple computer systems while maintaining the appearance of a centralized database. This document discusses distributed database concepts, including data fragmentation, replication, and allocation across sites. It also covers types of distributed database systems like homogeneous and heterogeneous, and architectural components like transaction processors and data processors. Key topics are distribution transparency and how it handles fragmentation, location, and mapping transparency, as well as transaction transparency and its use of protocols like two-phase commit.

Uploaded by

dawodyimer

Chapter - 7

Distributed Database Systems

Outline
1 Distributed Database Concepts

2 Data Fragmentation, Replication and Allocation

3 Types of Distributed Database Systems

4 Query Processing

5 Concurrency Control and Recovery

6 3-Tier Client-Server Architecture

Distributed Database Concepts
Definitions:
• A distributed database (DDB) is a collection of multiple logically related databases distributed over a computer network.
• A distributed database management system (DDBMS) is a software system that manages a distributed database while making the distribution transparent to the user.
Advantages of Distributed Database Systems
1. Management of distributed data with different levels of transparency:
– Distribution transparency:
• The physical placement of data (files, relations, etc.) is not known to the user.

[Figure: five sites (Site 1 to Site 5) connected by a communications network]
Con…
− Network transparency: Users do not have to worry about operational details of the network.
− Location transparency: The command used to perform a task is independent of the location of the data and of the site where the command is issued.
− Naming transparency: Once an object (file, relation, etc.) is named, it can be accessed unambiguously from any location.
− Replication transparency: Copies of data may be stored at multiple sites without the user having to be aware of them.
• This is done to minimize access time to the required data.
− Fragmentation transparency: A relation may be segmented horizontally (a subset of its tuples) or vertically (a subset of its columns) without the user having to be aware of the fragments.
Advantages of Distributed Database Systems
2. Increased reliability and availability:
− Reliability is the probability that the system is running (not down) at a given point in time.
− Availability is the probability that the system is continuously available (usable or accessible) during a time interval.
− A distributed database system has multiple nodes (computers); if one fails, the others are still available to do the job.
3. Improved performance:
− A DDBMS fragments the database to keep data closer to where it is needed most.
− This reduces data management (access and modification) time significantly.
4. Easier expansion (scalability):
− New nodes (computers) can be added at any time without changing the entire configuration.
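The availability argument in point 2 can be made concrete with a small calculation. This is only a sketch under an assumption the slides do not state: each node is up independently with the same probability `p`. The function name `availability_any` is hypothetical.

```python
def availability_any(p: float, n: int) -> float:
    """Probability that at least one of n nodes is up.

    Assumes (for illustration only) that nodes fail independently
    and that each one is up with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # All n nodes are down with probability (1 - p) ** n.
    return 1.0 - (1.0 - p) ** n


# One node that is up 90% of the time versus three such replicas.
single = availability_any(0.9, 1)
triple = availability_any(0.9, 3)
```

Real failures are rarely independent (a network partition can take out several sites at once), so the result is best read as an upper bound.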
Disadvantages of Distributed Database Systems
– Complexity

– Cost

– Security

– Integrity control more difficult

– Lack of standards

– Lack of experience

– Database design more complex

Types of Distributed Database Systems
Homogeneous
• All sites of the database system have an identical setup, i.e., the same database system software.
• The underlying operating systems can be a mixture of Linux, Windows, Unix, etc.
• For example, all sites run Oracle, or all run DB2, or all run Sybase or some other database system.
Advantages
• Easy to use
• Easy to manage
• Easy to design
Disadvantages
• Difficult for most organizations to enforce a homogeneous environment
[Figure: five sites, all running Oracle on a mix of Windows, Unix, and Linux hosts, connected by a communications network]
Heterogeneous
• Different sites (data centers) may run different DBMS products, possibly with different underlying data models.
• Translations are required to allow for:
o Different hardware.
o Different DBMS products.
o Different hardware and different DBMS products.

[Figure: five sites running different DBMS types (object-oriented, relational, hierarchical, network) on Unix, Windows, and Linux hosts, connected by a communications network]
Heterogeneous
• Advantages
– Large volumes of data from different data centers can be brought together in one global center.
– Remote access is done using the global schema.
– Different DBMSs may be used at each node.
• Disadvantages
– Difficult to manage.
– Difficult to design.
Federated Database Management Systems

• A federated database system (FDBS) is a collection of cooperating database systems that are autonomous and possibly heterogeneous. Sources of heterogeneity include:
• Differences in data models: relational, object-oriented, hierarchical, network, etc.
• Differences in constraints: each site may have its own data access and processing constraints.
• Differences in query languages: sites may use different languages or different versions of the same language, for example SQL-89 at one site and SQL-92 at another.

Multidatabase system (MDBS): a distributed DBMS in which each site maintains complete autonomy.
Distributed Processing and Distributed Database
DBMS Architectures
DDBMS Components
Computer workstations
• Form the network system.
Network hardware and software
• Components that reside in each workstation.
Communications media
• Carry the data from one workstation to another.
Transaction processor (TP)
• Receives and processes the application’s data requests.
Data processor (DP)
• Stores and retrieves data located at the site.
• Also known as data manager (DM).
DDBMS protocol
• Determines how the DDBMS will:
– Interface with the network to transport data and commands between DPs and TPs.
– Synchronize all data received from DPs (TP side) and route retrieved data to the appropriate TPs (DP side).
– Ensure common database functions in a distributed system: security, concurrency control, backup, and recovery.
• Single-Site Processing, Single-Site Data (SPSD)
– All processing is done on a single CPU or host computer.
– All data are stored on the host computer’s local disk.
– The DBMS runs on the host computer and is accessed by dumb terminals.
– Typical of most mainframe and minicomputer DBMSs.
– Also typical of first-generation single-user microcomputer databases.
[Figure: Nondistributed (centralized) DBMS]
• Multiple-Site Processing, Single-Site Data (MPSD)
– Requires a network file server on which conventional applications are accessed through a LAN.
– Client/server architecture.
• Multiple-Site Processing, Multiple-Site Data (MPMD)
– Fully distributed DBMS with support for multiple DPs and TPs at multiple sites.
– Homogeneous DDBMS: integrates only one type of centralized DBMS over the network.
– Heterogeneous DDBMS: integrates different types of centralized DBMSs over a network.
Distributed DB Transparency
– Distribution transparency
– Transaction transparency
– Failure transparency
– Performance transparency
– Heterogeneity transparency
Distribution Transparency
• Distribution transparency allows us to manage a physically
dispersed database as though it were a centralized database.

• Three Levels of Distribution Transparency

– Fragmentation transparency

– Location transparency

– Local mapping transparency

Distribution Transparency
• Example :
Employee data (EMPLOYEE) are distributed over three locations: New York, Atlanta,
and Miami.
Depending on the level of distribution transparency support, three different cases of
queries are possible:
Distribution Transparency
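• Case 1: DB Supports Fragmentation Transparency
The user does not need to know that EMPLOYEE is fragmented, so the query names neither fragments nor locations.
SELECT * FROM EMPLOYEE WHERE EMP_DOB < '01-JAN-1940';

• Case 2: DB Supports Location Transparency
The user must name the fragments (E1, E2, E3) but not their locations.
SELECT * FROM E1 WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E2 WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E3 WHERE EMP_DOB < '01-JAN-1940';

• Case 3: DB Supports Local Mapping Transparency
The user must name both the fragments and the nodes where they are stored. NODE is not standard SQL; it is used here only to show that the location is specified.
SELECT * FROM E1 NODE NY WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E2 NODE ATL WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E3 NODE MIA WHERE EMP_DOB < '01-JAN-1940';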
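The difference between Case 1 and Case 2 can be imitated on a single machine. The sketch below is only an analogy: it uses Python's `sqlite3` module with three tables standing in for the fragments E1, E2, and E3, and a view named EMPLOYEE that hides them. A real DDBMS would keep the fragments at separate sites. The columns `EMP_NUM` and `EMP_NAME`, the sample rows, and the helper `born_before` are assumptions made for illustration, and dates are stored as ISO strings so that text comparison orders them correctly.

```python
import sqlite3

# One in-memory database plays the role of all three sites.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Three fragments of EMPLOYEE (hypothetical columns and rows).
for fragment in ("E1", "E2", "E3"):
    cur.execute(
        f"CREATE TABLE {fragment} ("
        "EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT, EMP_DOB TEXT)"
    )
cur.execute("INSERT INTO E1 VALUES (1, 'Ana', '1938-05-02')")  # New York
cur.execute("INSERT INTO E2 VALUES (2, 'Ben', '1951-11-17')")  # Atlanta
cur.execute("INSERT INTO E3 VALUES (3, 'Cal', '1935-01-30')")  # Miami

# The view gives users one logical relation, as in Case 1.
cur.execute(
    "CREATE VIEW EMPLOYEE AS "
    "SELECT * FROM E1 UNION ALL SELECT * FROM E2 UNION ALL SELECT * FROM E3"
)


def born_before(cutoff: str) -> list:
    """Query the logical relation without naming any fragment."""
    rows = cur.execute(
        "SELECT EMP_NUM, EMP_NAME FROM EMPLOYEE "
        "WHERE EMP_DOB < ? ORDER BY EMP_NUM",
        (cutoff,),
    )
    return rows.fetchall()


result = born_before("1940-01-01")
```

Querying E1, E2, and E3 directly with UNION would correspond to Case 2; nothing in this sketch models the node names of Case 3.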
Transaction Transparency
• Transaction transparency ensures that database transactions will maintain the database’s integrity and consistency.

• Related Concepts:
– Remote Requests

– Remote Transactions

– Distributed Transactions

– Distributed Requests
A Remote Request
• Allows us to access data to be processed by a single remote database processor.
A Remote Transaction
• Composed of several requests, but may access data at only a single site.
A Distributed Transaction
• Allows a transaction to reference several different (local or remote) DP sites.
A Distributed Request
• References data from several remote DP sites.
• Allows a single request to reference a physically partitioned table.

[Figure: Example 2, a distributed request]
Transaction Transparency
• The two-phase commit protocol relies on a DO-UNDO-REDO protocol and a write-ahead protocol:
– DO performs the operation and records the “before” and “after” values in the transaction log.
– UNDO reverses an operation, using the log entries written by the DO portion of the sequence.
– REDO redoes an operation, using the log entries written by the DO portion of the sequence.
– The write-ahead protocol forces the log entry to be written to permanent storage before the actual operation takes place.
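The DO, UNDO, and REDO operations and the write-ahead rule can be sketched in a few lines. This is a toy model, not how any particular DBMS implements logging: a Python list stands in for the log on permanent storage, a dictionary stands in for the database, and the class name `LoggedStore` is hypothetical.

```python
class LoggedStore:
    """Toy key-value store with a DO-UNDO-REDO log (illustration only)."""

    def __init__(self):
        self.log = []   # stand-in for the transaction log on permanent storage
        self.data = {}  # stand-in for the database itself

    def do(self, key, new_value):
        # Write-ahead: record the "before" and "after" values in the log
        # first, and only then change the data.
        before = self.data.get(key)  # None means "key did not exist"
        self.log.append((key, before, new_value))
        self.data[key] = new_value

    def undo(self):
        # Walk the log backwards and restore each "before" value.
        for key, before, _after in reversed(self.log):
            if before is None:
                self.data.pop(key, None)
            else:
                self.data[key] = before

    def redo(self):
        # Walk the log forwards and reapply each "after" value.
        for key, _before, after in self.log:
            self.data[key] = after


store = LoggedStore()
store.do("x", 1)
store.do("x", 2)
store.do("y", 5)
```

Because `None` marks a missing key, this sketch cannot store `None` as a real value; a fuller version would use a dedicated sentinel.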
Two-Phase Commit Protocol
• The two-phase commit protocol defines the operations between two types of nodes:
– The coordinator
– One or more subordinates (cohorts)

Two-Phase Commit Protocol
• The protocol is implemented in two phases:
• Phase 1: Preparation
– The coordinator sends a PREPARE TO COMMIT message to all subordinates.
– The subordinates receive the message, write the transaction log using the write-ahead protocol, and send an acknowledgement message to the coordinator.
– The coordinator makes sure that all nodes are ready to commit; otherwise it aborts the transaction.
Two-Phase Commit Protocol
• Phase 2: The Final Commit
– The coordinator broadcasts a COMMIT message to all subordinates and waits for the replies.
– Each subordinate receives the COMMIT message, then updates the database using the DO protocol.
– The subordinates reply with a COMMITTED or NOT COMMITTED message to the coordinator.
– If one or more subordinates did not commit, the coordinator sends an ABORT message, forcing them to UNDO all changes.
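The two phases can be traced with a small in-process simulation. It follows the description on these slides, including the Phase 2 abort path, and leaves out everything a real implementation needs (network messages, timeouts, persistent logs, recovery). The names `Subordinate` and `two_phase_commit` and the `can_prepare` and `can_commit` flags are hypothetical.

```python
class Subordinate:
    """A participant whose behavior is fixed in advance for the demo."""

    def __init__(self, name, can_prepare=True, can_commit=True):
        self.name = name
        self.can_prepare = can_prepare
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        # Phase 1: a real DP would force its log to disk here
        # (write-ahead protocol) before acknowledging.
        self.state = "PREPARED" if self.can_prepare else "NOT PREPARED"
        return self.can_prepare

    def commit(self):
        # Phase 2: apply the update (DO) and report the outcome.
        self.state = "COMMITTED" if self.can_commit else "NOT COMMITTED"
        return self.can_commit

    def abort(self):
        # A real DP would UNDO its changes from the log here.
        self.state = "ABORTED"


def two_phase_commit(subordinates):
    """Play the coordinator; return 'COMMITTED' or 'ABORTED'."""
    # Phase 1: send PREPARE TO COMMIT to everyone and collect the replies.
    acknowledgements = [s.prepare() for s in subordinates]
    if not all(acknowledgements):
        for s in subordinates:
            s.abort()
        return "ABORTED"

    # Phase 2: broadcast COMMIT and wait for COMMITTED / NOT COMMITTED.
    replies = [s.commit() for s in subordinates]
    if not all(replies):
        for s in subordinates:
            s.abort()
        return "ABORTED"
    return "COMMITTED"


outcome = two_phase_commit([Subordinate("NY"), Subordinate("ATL")])
```

Passing `Subordinate("MIA", can_prepare=False)` in the list exercises the Phase 1 abort; `can_commit=False` exercises the Phase 2 abort.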
Performance Transparency and
Query Optimization

• Query optimization must provide distribution transparency as well as replica transparency.
• Replica transparency refers to the DDBMS’s ability to hide the existence of multiple copies of data from the user.
• Query optimization algorithms are based on two principles:
– Selection of the optimum execution order
– Selection of the sites to be accessed, to minimize communication costs

• Operation Modes of Query Optimization
– Automatic query optimization: the DDBMS finds the most cost-effective access path without user intervention.
– Manual query optimization: the optimization is selected and scheduled by the end user or programmer.

• Timing of Query Optimization
– Static query optimization takes place at compilation time.
– Dynamic query optimization takes place at execution time.

• Optimization Techniques by Type of Information Used
– Statistically based query optimization uses statistical information about the database.
– Rule-based query optimization is based on a set of user-defined rules that determine the best query access strategy.
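The second principle above, choosing which sites to access so that communication costs stay low, can be shown with a deliberately small sketch. It assumes each fragment is replicated at a known list of sites and that the cost of reaching each site from the query's origin is a single number. Both inputs, and the function name `cheapest_sites`, are hypothetical, and a real optimizer would also weigh execution order and data volumes.

```python
def cheapest_sites(replicas, cost_from_origin):
    """Pick, for every fragment, the replica site that is cheapest to reach.

    replicas:         fragment name -> list of sites holding a copy
    cost_from_origin: site name -> communication cost from the querying site
    """
    plan = {}
    for fragment, sites in replicas.items():
        plan[fragment] = min(sites, key=lambda site: cost_from_origin[site])
    return plan


# Hypothetical setup: the query is issued in New York (cost 0 to itself).
replicas = {"E1": ["NY", "ATL"], "E2": ["ATL"], "E3": ["MIA", "NY"]}
cost_from_origin = {"NY": 0, "ATL": 5, "MIA": 8}
plan = cheapest_sites(replicas, cost_from_origin)
```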
Distributed Database Design

• The design of a distributed database introduces three new issues:
– How to partition the database into fragments.
– Which fragments to replicate.
– Where to locate those fragments and replicas.
Data Fragmentation
• Data fragmentation allows us to break a single object into two or more segments or fragments.
• Three types of fragmentation strategies:
– Horizontal fragmentation
– Vertical fragmentation
– Mixed fragmentation
• Horizontal fragmentation: each fragment consists of a subset of the tuples of a relation.
– A fragment is the equivalent of a SELECT statement with the WHERE clause on a single attribute.
• Vertical fragmentation: each fragment consists of a subset of the attributes of a relation.
– Equivalent to the PROJECT operation. Each fragment also keeps the key attribute so that the relation can be rebuilt by joining the fragments.
• Mixed fragmentation: a horizontal fragment that is subsequently vertically fragmented, or a vertical fragment that is then horizontally fragmented.
– A mixed fragment is defined using the Selection and Projection operations of the relational algebra.
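The three strategies can be tried out on a relation held as a list of Python dictionaries. This is a sketch for illustration: the EMPLOYEE rows, the attribute names, and the helper functions are all made up, and `EMP_NUM` is assumed to be the key.

```python
# A tiny EMPLOYEE relation (hypothetical rows and attributes).
EMPLOYEE = [
    {"EMP_NUM": 1, "EMP_NAME": "Ana", "EMP_LOC": "NY", "EMP_DOB": "1938-05-02"},
    {"EMP_NUM": 2, "EMP_NAME": "Ben", "EMP_LOC": "ATL", "EMP_DOB": "1951-11-17"},
    {"EMP_NUM": 3, "EMP_NAME": "Cal", "EMP_LOC": "MIA", "EMP_DOB": "1935-01-30"},
]


def horizontal(rows, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [dict(row) for row in rows if predicate(row)]


def vertical(rows, attributes, key="EMP_NUM"):
    """Projection: keep the listed attributes plus the key."""
    keep = [key] + [a for a in attributes if a != key]
    return [{a: row[a] for a in keep} for row in rows]


def mixed(rows, predicate, attributes):
    """Horizontal fragmentation followed by vertical fragmentation."""
    return vertical(horizontal(rows, predicate), attributes)


def join_on_key(left, right, key="EMP_NUM"):
    """Rebuild tuples from two vertical fragments that share the key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


ny_rows = horizontal(EMPLOYEE, lambda r: r["EMP_LOC"] == "NY")
names = vertical(EMPLOYEE, ["EMP_NAME"])
details = vertical(EMPLOYEE, ["EMP_LOC", "EMP_DOB"])
ny_names = mixed(EMPLOYEE, lambda r: r["EMP_LOC"] == "NY", ["EMP_NAME"])
rebuilt = join_on_key(names, details)
```

Horizontal fragments are recombined with a union and vertical fragments with a join on the key, which is why every vertical fragment carries `EMP_NUM`.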
Data Replication
• Data replication refers to the storage of data copies at multiple sites served by a computer network.
– Replication enhances data availability and response time, reducing communication and total query costs.
• Mutual Consistency Rule
– All copies of a data fragment must be identical.
– The DDBMS must ensure that a database update is performed at all sites where replicas exist.
• Replication Conditions
– A fully replicated database stores multiple copies of all database fragments at multiple sites.
– A partially replicated database stores multiple copies of some database fragments at multiple sites.
• Factors for Data Replication Decision
– Database Size
– Usage Frequency
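The mutual consistency rule above can be illustrated with a toy replicated fragment in which every write is applied to every copy. This is a sketch only: the class `ReplicatedFragment` is hypothetical, the copies are plain dictionaries in one process, and a real DDBMS would wrap the update in a distributed transaction (for example with two-phase commit) so that it succeeds at all sites or at none.

```python
class ReplicatedFragment:
    """One fragment with a copy at each listed site (illustration only)."""

    def __init__(self, sites):
        self.copies = {site: {} for site in sites}

    def write(self, key, value):
        # Mutual consistency: the update goes to every site with a replica.
        for copy in self.copies.values():
            copy[key] = value

    def read(self, site, key):
        # Any replica can serve a read, for example the nearest one.
        return self.copies[site][key]

    def mutually_consistent(self):
        # True when all copies hold exactly the same data.
        copies = list(self.copies.values())
        return all(copy == copies[0] for copy in copies)


fragment = ReplicatedFragment(["NY", "ATL", "MIA"])
fragment.write("emp:1", "Ana")
value_at_miami = fragment.read("MIA", "emp:1")
```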
Data Allocation
• Data allocation describes the process of deciding where to locate data.
• Data Allocation Strategies
– Centralized: the entire database is stored at one site.
– Partitioned: the database is divided into several disjoint parts (fragments) and stored at several sites.
– Replicated: copies of one or more database fragments are stored at several sites.
• Data allocation algorithms take into consideration a variety of factors:
– Performance and data availability goals.
– Size, number of rows, and the number of relations that an entity maintains with other entities.
– Types of transactions to be applied to the database, and the attributes accessed by each of those transactions.
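As a rough illustration of a partitioned allocation, the sketch below places each fragment at the site that accesses it most often. It is not an algorithm from these slides: it looks at only one of the factors listed above (usage by site), and the function name `allocate` and the access counts are hypothetical.

```python
def allocate(access_counts):
    """Place each fragment at the site with the most accesses to it.

    access_counts: fragment name -> {site name: accesses per period}
    Returns:       fragment name -> chosen site
    """
    return {
        fragment: max(by_site, key=by_site.get)
        for fragment, by_site in access_counts.items()
    }


# Hypothetical workload figures.
access_counts = {
    "E1": {"NY": 90, "ATL": 10},
    "E2": {"NY": 5, "ATL": 40, "MIA": 5},
    "E3": {"MIA": 70, "NY": 20},
}
placement = allocate(access_counts)
```

A replicated strategy would extend this by also storing copies at other sites with heavy read traffic, at the price of keeping those copies consistent.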
Database system architectures
• Parallel versus Distributed Architectures
– Multiprocessor system architectures are commonly classified as follows:
– Shared memory (tightly coupled) architecture: multiple processors share secondary (disk) storage and also share primary memory.
– Shared disk (loosely coupled) architecture: multiple processors share secondary (disk) storage, but each has its own primary memory.
– Shared nothing (massively parallel processing, MPP) architecture: each processor is part of a complete system, with its own memory and disk storage.
Some different database system architectures:
• Parallel database architectures: (a) shared memory; (b) shared disk; (c) shared nothing.
• Centralized database architecture.
• A truly distributed database architecture.
[Figures: shared nothing architecture; centralized database; distributed database]
Client/Server vs. DDBMS
• Client/server architecture refers to the way in which computers
interact to form a system.
Reference architecture for a DDBMS
Questions ?
