Unit - 4 - Transaction Processing

Write about Transaction Processing Concepts.
Introduction to transaction processing:
• A transaction can be defined as a group of tasks.
• A single task is the minimum processing unit which cannot be divided further.
• The concept of transaction provides a mechanism for describing logical units of database
processing.
• Transaction processing systems are systems with large databases and hundreds of concurrent
users executing database transactions.
• Examples of such systems include airline reservation, banking, credit card processing, stock
markets, and so on.
Single-User versus Multiuser Systems:
• One criterion for classifying a database system is according to the number of users who can
use the system concurrently.
• A DBMS is single-user if at most one user at a time can use the system, and it is multiuser if
many users can use the system at a time.
• Single-user DBMSs are mostly restricted to personal computer systems; most other DBMSs are
multiuser.
• For example, an airline reservations system is used by hundreds of travel agents and
reservation clerks concurrently.
• Multiple users can access databases simultaneously because of the concept of
multiprogramming, which allows the computer to execute multiple programs or processes at
the same time.
• If only a single central processing unit (CPU) exists, it can actually execute at most one process
at a time. However, multiprogramming operating systems execute some commands from one
process, then suspend that process and execute some commands from the next process and
so on.
• A process is resumed at the point where it was suspended whenever it gets its turn to use the
CPU again.
• Hence, concurrent execution of processes is actually interleaved, as illustrated in the figure
below, which shows two processes A and B executing concurrently in an interleaved fashion. If
the computer system has multiple hardware processors (CPUs), parallel processing of multiple
processes is possible, as illustrated by processes C and D in the same figure.
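The interleaving described above can be sketched in a few lines of Python. This is only an illustration, not how an operating system is implemented: each "process" is a generator, and the hypothetical helper run_interleaved plays the role of a single CPU that runs one step of a process, suspends it, and moves on to the next.

```python
def process(name, steps):
    """A toy process: each yield is one 'command' executed on the CPU."""
    for i in range(1, steps + 1):
        yield f"{name}{i}"

def run_interleaved(processes):
    """Round-robin on one simulated CPU: run one step, suspend, go to the next process."""
    trace = []
    queue = list(processes)
    while queue:
        proc = queue.pop(0)
        try:
            trace.append(next(proc))   # resume exactly where the process was suspended
            queue.append(proc)         # suspend it again and put it at the back
        except StopIteration:
            pass                       # process has finished; drop it
    return trace

print(run_interleaved([process("A", 3), process("B", 2)]))
# ['A1', 'B1', 'A2', 'B2', 'A3']
```

Only one step runs at any moment, yet both processes make progress, which is what interleaved concurrency means on a single CPU.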
Write about Various Transaction Operations.
• A transaction is an executing program that forms a logical unit of database processing.
• A transaction includes one or more database access operations – these can include insertion,
deletion, modification, or retrieval operations.
• At the lowest level, every database access operation is carried out through two basic
operations called read and write.
• The DBMS will generally maintain a number of buffers in main memory that hold database
disk blocks containing the database items being processed.
• read_item(X): Reads a database item named X into a program variable.
• write_item(X): Writes the value of program variable X into the database item named X.
Steps involved in read_item(X):
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory.
3. Copy item X from the buffer to the program variable named X.
Steps involved in write_item(X):
1. Find the address of the disk block that contains item X.
2. Copy the disk block into a buffer in main memory.
3. Copy item X from the program variable named X into its correct location in the buffer.
4. Store the updated block from the buffer back to disk.
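The steps of read_item(X) and write_item(X) can be sketched as follows. This is a minimal model under stated assumptions: the "disk" and the buffer pool are plain dictionaries, the block layout is made up for illustration, and the updated block is written back immediately (a real DBMS may defer step 4).

```python
# Hypothetical layout: block address -> items stored in that block.
disk = {0: {"X": 80, "Y": 20}, 1: {"Z": 5}}
buffers = {}  # main-memory buffers: block address -> copy of the block

def find_block(item):
    """Step 1: find the address of the disk block that contains the item."""
    for addr, block in disk.items():
        if item in block:
            return addr
    raise KeyError(item)

def read_item(item):
    addr = find_block(item)
    if addr not in buffers:                 # Step 2: copy the block into a buffer
        buffers[addr] = dict(disk[addr])
    return buffers[addr][item]              # Step 3: copy the item to a program variable

def write_item(item, value):
    addr = find_block(item)
    if addr not in buffers:                 # Step 2: copy the block into a buffer
        buffers[addr] = dict(disk[addr])
    buffers[addr][item] = value             # Step 3: update the item in the buffer
    disk[addr] = dict(buffers[addr])        # Step 4: store the updated block on disk

X = read_item("X")
write_item("X", X - 5)
print(disk[0]["X"])  # 75
```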

Write about the Need of Concurrency Control. Or Write about various problems that occur due to
lack of concurrency control.
Several problems can occur when concurrent transactions execute in an uncontrolled manner. The
following are the types of problems we may encounter when such transactions run concurrently.
The Lost Update Problem:
• This problem occurs when two transactions that access the same database items have their
operations interleaved in a way that makes the value of some database items incorrect.
• Suppose the transactions T1 and T2 are submitted at approximately the same time, and
suppose that their operations are interleaved as shown in the above figure (a); then the final
value of item X is incorrect because T2 reads the value of X before T1 changes it in the
database, and hence the updated value resulting from T1 is lost.
• For example, if X = 80 at the start, N = 5, and M = 4, the final result should be X = 79; but in
the interleaving of operations shown in the above figure (a), it is X = 84 because the update in
T1 that removed the five seats from X was lost.
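The numbers in this example can be reproduced with a small simulation. It is a sketch under assumptions: T1 subtracts N from X and T2 adds M to X, each transaction keeps its own local copy of X, and the function name run and the schedule encoding are hypothetical.

```python
def run(schedule, x=80, n=5, m=4):
    """Execute (transaction, operation) steps against a single item X."""
    db = {"X": x}
    local = {}                                   # each transaction's program variable
    for txn, op in schedule:
        if op == "read":
            local[txn] = db["X"]
        elif op == "update":
            local[txn] += -n if txn == "T1" else m   # T1: X - N, T2: X + M
        elif op == "write":
            db["X"] = local[txn]
    return db["X"]

serial = [("T1", "read"), ("T1", "update"), ("T1", "write"),
          ("T2", "read"), ("T2", "update"), ("T2", "write")]

# T2 reads X before T1 has written it, so T1's update is overwritten.
interleaved = [("T1", "read"), ("T1", "update"),
               ("T2", "read"), ("T2", "update"),
               ("T1", "write"), ("T2", "write")]

print(run(serial))       # 79
print(run(interleaved))  # 84
```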
The Temporary Update (or Dirty Read) Problem:
• This problem occurs when one transaction updates a database item and then the transaction
fails for some reason.
• The updated item is accessed by another transaction before it is changed back to its original
value.
• The above figure (b) shows an example where T1 updates item X and then fails before
completion, so the system must change X back to its original value. Before it can do so,
however, transaction T2 reads the temporary value of X, which will not be recorded
permanently in the database because of the failure of T1. This type of problem is known as the
dirty read problem.
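A minimal sketch of the dirty read, assuming X = 80 and that T1 subtracts 5 before failing (the values and the function name are illustrative only):

```python
def dirty_read_demo(x=80, n=5):
    db = {"X": x}
    old_value = db["X"]        # T1 remembers the original value for rollback
    db["X"] = old_value - n    # T1 writes X but has not committed
    seen_by_t2 = db["X"]       # T2 reads the temporary (dirty) value
    db["X"] = old_value        # T1 fails; the system restores the original value
    return seen_by_t2, db["X"]

print(dirty_read_demo())  # (75, 80)
```

T2 has worked with the value 75, which was never permanently recorded in the database.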
The Incorrect Summary Problem:
• If one transaction is calculating an aggregate summary function on a number of records while
other transactions are updating some of these records, the aggregate function may calculate
some values before they are updated and others after they are updated.
• For example, suppose that a transaction T3 is calculating the total number of reservations on
all the flights; meanwhile, transaction T1 is executing.
• If the interleaving of operations shown in the above figure (c) occurs, the result of T3 will be
off by an amount N because T3 reads the value of X after N seats have been subtracted from
it but reads the value of Y before those N seats have been added to it.
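The effect can be checked with a small worked example. The starting values X = 80, Y = 20 and N = 5 are assumptions chosen for illustration; T1 moves N seats from X to Y while T3 sums X and Y.

```python
def incorrect_summary(x=80, y=20, n=5):
    db = {"X": x, "Y": y}
    db["X"] -= n                       # T1: subtract N seats from X
    t3_total = db["X"] + db["Y"]       # T3: reads X after the update, Y before it
    db["Y"] += n                       # T1: add N seats to Y
    correct_total = db["X"] + db["Y"]  # what a non-interleaved T3 would compute
    return t3_total, correct_total

print(incorrect_summary())  # (95, 100)
```

The total computed by T3 is off by exactly N = 5.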
Write about different types of Failures.
Failures are generally classified as transaction, system, and media failures. There are several possible
reasons for a transaction to fail in the middle of execution:
1. A computer failure (system crash):
• A hardware, software, or network error occurs in the computer system during transaction
execution.
• Hardware crashes are usually media failures – for example, main memory failure.
2. A transaction or system error:
• Some operations in the transaction may cause it to fail, such as integer overflow or division by
zero.
• Transaction failure may also occur because of erroneous parameter values or because of a
logical programming error.
3. Local errors or exception conditions detected by the transaction:
• During transaction execution, certain conditions may occur that necessitate cancellation of the
transaction.
• For example, data for the transaction may not be found, or a bank account may have
insufficient balance for a withdrawal.
4. Concurrency control environment:
• The concurrency control method may decide to abort the transaction, to be restarted later,
because several transactions are in a state of deadlock.
5. Disk failure:
• Some disk blocks may lose their data because of a read or write malfunction or because of a
disk read/write head crash.
6. Physical problems and catastrophes:
• This refers to an endless list of problems that includes power or air-conditioning failure, fire,
theft, overwriting disks or tapes by mistake, etc.
Write about various Transaction States.
• A transaction is an atomic unit of work that is either completed in its entirety or not done at
all.
• For recovery purposes, the system needs to keep track of when the transaction starts,
terminates, and commits or aborts.
• Therefore, the recovery manager keeps track of the following operations:
o begin_transaction: This marks the beginning of transaction execution.
o read or write: These specify read or write operations on the database items that are
executed as part of a transaction.
o end_transaction: This specifies that read and write transaction operations have ended
and marks the end of transaction execution.
o commit_transaction: This signals a successful end of the transaction so that any
changes (updates) executed by the transaction can be safely committed to the
database and will not be undone.
o rollback (or abort): This signals that the transaction has ended unsuccessfully, so that
any changes or effects that the transaction may have applied to the database must be
undone.
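The operations listed above can be sketched as a small class. This is an illustration under assumptions, not a real recovery manager: the database is a dictionary, every item written already exists in it, old values are kept in a simple in-memory undo list, and the status labels are hypothetical.

```python
class Transaction:
    def __init__(self, db):
        self.db = db
        self.undo_log = []            # (item, old value) pairs, oldest first
        self.status = "not started"

    def begin_transaction(self):
        self.status = "active"

    def read(self, item):
        return self.db[item]

    def write(self, item, value):
        self.undo_log.append((item, self.db[item]))  # remember the old value
        self.db[item] = value

    def commit_transaction(self):
        self.undo_log.clear()         # changes are kept and will not be undone
        self.status = "committed"

    def rollback(self):
        for item, old in reversed(self.undo_log):    # undo in reverse order
            self.db[item] = old
        self.undo_log.clear()
        self.status = "aborted"

db = {"X": 80, "Y": 20}
t = Transaction(db)
t.begin_transaction()
t.write("X", t.read("X") - 5)
t.write("Y", t.read("Y") + 5)
t.rollback()
print(db)  # {'X': 80, 'Y': 20}
```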
Desirable Properties (ACID properties) of Transactions:
Transactions should possess several properties, often called the ACID properties. They should be
enforced by the concurrency control and recovery methods of the DBMS. The following are the ACID
properties:
1. Atomicity: A transaction is an atomic unit of processing. It is either performed in its entirety or not
performed at all.
2. Consistency preservation: A transaction is consistency preserving if its complete execution takes
the database from one consistent state to another.
3. Isolation: A transaction should appear as though it is being executed in isolation from other
transactions. That is, the execution of a transaction should not be interfered with by any other
transaction executing concurrently.
4. Durability or permanency: The changes applied to the database by a committed transaction must
persist in the database. These changes must not be lost because of any failure.
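Atomicity can be observed with Python's built-in sqlite3 module. The sketch below is only an illustration: the table name, account names, and the helper functions make_bank and transfer are assumptions, and an in-memory database is used, so durability is not demonstrated. Using the connection as a context manager commits the transaction if the block succeeds and rolls it back if an exception is raised.

```python
import sqlite3

def make_bank():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
    conn.commit()
    return conn

def balance(conn, name):
    return conn.execute(
        "SELECT balance FROM account WHERE name = ?", (name,)).fetchone()[0]

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither does."""
    try:
        with conn:  # commit on success, rollback if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
            if balance(conn, src) < 0:
                raise ValueError("insufficient balance")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        return True
    except ValueError:
        return False

conn = make_bank()
print(transfer(conn, "A", "B", 30), balance(conn, "A"), balance(conn, "B"))   # True 70 80
print(transfer(conn, "A", "B", 500), balance(conn, "A"), balance(conn, "B"))  # False 70 80
```

The second transfer fails after the first UPDATE has already run, and the rollback leaves both balances unchanged.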
Write about various Concurrency Control Techniques and different types of Locks.
Concurrency Control Technique: Some of the main techniques used to control concurrent execution
of transactions are based on the concept of locking data items.
A lock is a variable associated with a data item that describes the status of the item with respect to
possible operations that can be applied to it.
Generally, there is one lock for each data item in the database. Locks are used as a means of
synchronizing the access by concurrent transactions to the database items.
Types of locks: Several types of locks are used in concurrency control such as binary locks and
shared/exclusive locks.
• Binary Locks: A binary lock can have two states or values: locked and unlocked (or 1 and 0, for
simplicity).
o A distinct lock is associated with each database item X. If the value of the lock on X is
1, item X cannot be accessed by a database operation that requests the item. If the
value of the lock on X is 0, the item can be accessed when requested. We refer to the
current value (or state) of the lock associated with item X as LOCK(X).
o Two operations, lock_item and unlock_item, are used with binary locking.
o lock_item(X): A transaction requests access to an item X by first issuing a lock_item(X)
operation. If LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1
(the transaction locks the item) and the transaction is allowed to access item X.
o unlock_item(X): When the transaction is through using the item, it issues an
unlock_item(X) operation, which sets LOCK(X) to 0 (unlocks the item) so that X may be
accessed by other transactions.
o Hence, a binary lock enforces mutual exclusion on the data item; i.e., at most one
transaction can hold the lock on an item at a time.
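The lock_item and unlock_item operations can be sketched as a small lock table. This is a single-threaded model under assumptions: a transaction that must wait is placed on a queue instead of being blocked, and the class name BinaryLockTable and its return values are hypothetical.

```python
class BinaryLockTable:
    def __init__(self):
        self.lock = {}      # item -> 0 (unlocked) or 1 (locked)
        self.waiting = {}   # item -> queue of waiting transactions

    def lock_item(self, item, txn):
        """Return True if txn now holds the lock, False if it has to wait."""
        if self.lock.get(item, 0) == 0:
            self.lock[item] = 1
            return True
        self.waiting.setdefault(item, []).append(txn)
        return False

    def unlock_item(self, item):
        """Release the lock; hand it to the next waiting transaction, if any."""
        self.lock[item] = 0
        queue = self.waiting.get(item, [])
        if queue:
            next_txn = queue.pop(0)
            self.lock[item] = 1      # the waiting transaction locks the item
            return next_txn
        return None

table = BinaryLockTable()
print(table.lock_item("X", "T1"))  # True: T1 locks X
print(table.lock_item("X", "T2"))  # False: T2 must wait
print(table.unlock_item("X"))      # T2: the lock passes to T2
```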
• Shared/Exclusive (or Read/Write) Locks:
o Shared lock: These locks are referred to as read locks.
o If a transaction T has obtained a shared lock on data item X, then T can read X but
cannot write X.
o Multiple shared locks can be placed simultaneously on a data item.
o Exclusive lock: These locks are referred to as write locks.
o If a transaction T has obtained an exclusive lock on data item X, then T can both read
and write X.
o Only one exclusive lock can be placed on a data item at a time.
o This means that a single transaction exclusively holds the lock on the item.
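The compatibility rules for shared and exclusive locks can be written as a single check. The function name can_grant and the mode labels "S" and "X" are assumptions made for this sketch.

```python
def can_grant(requested, held_by_others):
    """requested: 'S' (shared) or 'X' (exclusive).
    held_by_others: lock modes currently held on the item by other transactions."""
    if not held_by_others:
        return True                  # an unlocked item can be locked in either mode
    # Only shared locks are compatible with each other.
    return requested == "S" and all(mode == "S" for mode in held_by_others)

print(can_grant("S", ["S", "S"]))  # True: many readers may share the item
print(can_grant("X", ["S"]))       # False: a writer must wait for the readers
print(can_grant("S", ["X"]))       # False: a reader must wait for the writer
```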
• Two-Phase Locking (2PL):
o A transaction is said to follow the two-phase locking protocol if all locking operations
(read_lock, write_lock) precede the first unlock operation in the transaction.
o Such a transaction can be divided into two phases: an expanding or growing (first)
phase, during which new locks on items can be acquired but none can be released;
and a shrinking (second) phase, during which existing locks can be released but no
new locks can be acquired.
o Locking is an essential operation in the protocol that provides permission to read or
write a data item.
o If every transaction in a schedule follows the two-phase locking protocol, the schedule
is guaranteed to be serializable. The basic protocol does not, however, rule out
deadlock.
o The protocol involves three main activities:
▪ (i) Lock Acquisition
▪ (ii) Modification of Data
▪ (iii) Release of Lock
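Whether a single transaction obeys the two-phase rule can be checked by scanning its operations in order. The sketch below assumes the operations are given as (operation, item) pairs using the names read_lock, write_lock, and unlock; the function name follows_2pl is hypothetical.

```python
def follows_2pl(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for op, _item in operations:
        if op == "unlock":
            shrinking = True                      # the shrinking phase has begun
        elif op in ("read_lock", "write_lock") and shrinking:
            return False                          # a new lock in the shrinking phase
    return True

two_phase = [("read_lock", "Y"), ("write_lock", "X"),
             ("unlock", "Y"), ("unlock", "X")]
not_two_phase = [("read_lock", "Y"), ("unlock", "Y"),
                 ("write_lock", "X"), ("unlock", "X")]

print(follows_2pl(two_phase))      # True
print(follows_2pl(not_two_phase))  # False
```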
• Deadlocks:
o A deadlock is a condition in which two (or more) transactions in a set are waiting
simultaneously for locks held by some other transaction in the set.
o Neither transaction can continue because each transaction in the set is on a waiting
queue, waiting for one of the other transactions in the set to release the lock on an
item.
o Thus, a deadlock is an impasse that may result when two or more transactions are
each waiting for locks to be released that are held by the other.
o Transactions whose lock requests have been refused are queued until the lock can be
granted.
o A deadlock is also called a circular waiting condition where two transactions are
waiting (directly or indirectly) for each other.
o Thus, in a deadlock, two transactions are mutually excluded from accessing the next
record required to complete their transactions.
o Example: Consider two transactions A and B, where transaction A accesses data items X
and Y and transaction B accesses data items Y and X. Transaction A has acquired the lock
on X and is waiting to acquire the lock on Y, while transaction B has acquired the lock
on Y and is waiting to acquire the lock on X. Neither of them can execute further, so a
deadlock exists.
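The circular waiting condition can be detected by looking for a cycle among the "waits for" relationships. The sketch below uses a wait-for graph, a common detection technique that these notes do not describe in detail; the function name has_deadlock and the dictionary encoding are assumptions.

```python
def has_deadlock(wait_for):
    """wait_for maps each transaction to the set of transactions it is waiting for.
    A deadlock exists exactly when this graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    color = {}

    def visit(txn):
        color[txn] = GRAY
        for other in wait_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GRAY:         # reached a transaction already on the path
                return True
            if state == WHITE and visit(other):
                return True
        color[txn] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))

# A holds X and waits for Y (held by B); B holds Y and waits for X (held by A).
print(has_deadlock({"A": {"B"}, "B": {"A"}}))   # True
print(has_deadlock({"A": {"B"}, "B": set()}))   # False
```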
• Time-Stamp Methods for Concurrency control:
o A timestamp is a unique identifier created by the DBMS to identify a transaction and its
relative starting time.
o Typically, timestamp values are assigned in the order in which the transactions are
submitted to the system.
o So, a timestamp can be thought of as the transaction start time.
o Therefore, time stamping is a method of concurrency control in which each
transaction is assigned a transaction timestamp.
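Timestamp assignment can be sketched with a simple counter. The class name TimestampManager and its methods are hypothetical; a real DBMS might use the system clock instead of a counter.

```python
import itertools

class TimestampManager:
    def __init__(self):
        self._counter = itertools.count(1)   # 1, 2, 3, ... in submission order
        self.timestamp = {}                  # transaction -> timestamp

    def begin(self, txn):
        """Assign the next timestamp to a newly submitted transaction."""
        self.timestamp[txn] = next(self._counter)
        return self.timestamp[txn]

    def is_older(self, a, b):
        """A smaller timestamp means the transaction started earlier."""
        return self.timestamp[a] < self.timestamp[b]

manager = TimestampManager()
manager.begin("T1")
manager.begin("T2")
print(manager.is_older("T1", "T2"))  # True
```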
• Multiversion concurrency control (MVCC):
o Multi-version protocol aims to reduce the delay for read operations.
o It maintains multiple versions of data items. Whenever a write operation is performed,
the protocol creates a new version of the data item, so that read operations can
proceed without conflict using an older version.
o The newly created version contains the following information −
▪ Content − This field contains the data value of that version.
▪ Write_timestamp − This field contains the timestamp of the transaction that
created the new version.
▪ Read_timestamp − This field contains the largest timestamp among the
transactions that have read this version.
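The version fields above can be sketched as follows. This is a simplified model under assumptions: the class names Version and MultiVersionItem are hypothetical, the initial version is given write timestamp 0, the read timestamp is tracked as the largest timestamp of any reader, and the rules a real protocol uses to abort late writers are left out.

```python
class Version:
    def __init__(self, content, write_timestamp):
        self.content = content
        self.write_timestamp = write_timestamp
        self.read_timestamp = write_timestamp   # no later reader yet

class MultiVersionItem:
    def __init__(self, initial_value):
        self.versions = [Version(initial_value, 0)]

    def write(self, ts, value):
        """Every write creates a new version instead of overwriting the old one."""
        self.versions.append(Version(value, ts))

    def read(self, ts):
        """Read the newest version written at or before timestamp ts."""
        visible = [v for v in self.versions if v.write_timestamp <= ts]
        version = max(visible, key=lambda v: v.write_timestamp)
        version.read_timestamp = max(version.read_timestamp, ts)
        return version.content

x = MultiVersionItem(80)
x.write(10, 75)      # the transaction with timestamp 10 writes X = 75
print(x.read(5))     # 80: an older transaction still sees the old version
print(x.read(12))    # 75: a newer transaction sees the new version
```

The reader with timestamp 5 is not blocked by the writer, which is the reduced read delay this protocol aims for.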
Benefits of multiversion concurrency control (MVCC):
• Less need for database locks
With MVCC, the database can allow multiple transactions to read and write data without
locking the entire database.
• Fewer issues with multiple transactions trying to access the same data
MVCC helps reduce conflicts between transactions accessing the same data.
• Faster access to read data
Since MVCC allows multiple transactions to read data at the same time, it improves the speed
of reading data.
• Records are still protected during write operations
MVCC ensures that data is protected from being changed by other transactions while a
transaction is making changes to it.
• Fewer database deadlocks
Deadlocks occur when two or more transactions are waiting for each other to release a lock,
causing the system to come to a halt. MVCC can reduce the number of these occurrences.
Drawbacks of Multiversion concurrency control (MVCC):
• Concurrent update control methods can be challenging to implement.
• The database can become bloated with multiple versions of records, which increases its overall
size.
Write about Characterizing schedules based on Serializability.
In database management systems, concurrency control is an important aspect of maintaining the
consistency of data. The most widely used correctness criterion for concurrent execution is
serializability, which characterizes schedules by their ability to produce the same results as if
the transactions were executed one at a time in some order.
• Serializability is a concept that is used to ensure that the concurrent execution of multiple
transactions does not result in inconsistencies or conflicts in a database management system.
• In other words, it ensures that the results of concurrent execution of transactions are the same
as if the transactions were executed one at a time in some order.
• A schedule is considered to be serializable if it is equivalent to some serial schedule, which is
a schedule where all transactions are executed one at a time.
• This means that if a schedule is serializable, it does not result in any inconsistencies or conflicts
in the database.
Types of Schedules:
There are two types of schedules: serial schedules and concurrent schedules.
A serial schedule is one where all transactions are executed one at a time, and a concurrent schedule
is one where the operations of multiple transactions are interleaved.
Two operations conflict if they belong to different transactions, access the same data item, and at
least one of them is a write.
There are two main notions of serializability, each with its own test −
Conflict serializability − A schedule is conflict serializable if it can be transformed into some serial
schedule by swapping adjacent non-conflicting operations, that is, if it orders every pair of
conflicting operations in the same way as that serial schedule.
View serializability − A schedule is view serializable if it is view equivalent to some serial schedule:
each transaction reads the same values, and the final write on each item is made by the same
transaction, as in that serial schedule.
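Conflict serializability is commonly tested with a precedence graph: draw an edge from Ti to Tj whenever an operation of Ti conflicts with a later operation of Tj, and check that the graph has no cycle. The sketch below assumes a schedule is a list of (transaction, operation, item) triples with "R" for read and "W" for write; the function name is hypothetical.

```python
def is_conflict_serializable(schedule):
    # Build the precedence graph from all pairs of conflicting operations.
    edges = {}
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.setdefault(ti, set()).add(tj)

    # The graph is acyclic if we can keep removing nodes with no incoming edge.
    nodes = {txn for txn, _, _ in schedule}
    while nodes:
        free = [n for n in nodes
                if not any(n in edges.get(m, ()) for m in nodes)]
        if not free:
            return False          # every remaining node lies on or behind a cycle
        nodes -= set(free)
    return True

serial = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X")]
lost_update = [("T1", "R", "X"), ("T2", "R", "X"), ("T1", "W", "X"), ("T2", "W", "X")]

print(is_conflict_serializable(serial))       # True
print(is_conflict_serializable(lost_update))  # False
```

The second schedule has edges in both directions between T1 and T2, so no serial order can respect all of its conflicts.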
Write about Characterizing schedules based on Recoverability.
Recoverability refers to the ability of a system to restore its state in the event of a failure. The
recoverability of a system is directly impacted by the type of schedule that is used.
A serial schedule is always recoverable, as there is only one transaction executing at a time, and it
is easy to determine the state of the system at any given point in time.
A concurrent schedule may or may not be recoverable, as it can be much more difficult to determine
the state of the system at any given point in time; whether it is recoverable depends on the order in
which transactions read each other's writes and commit.
A schedule is called a recoverable schedule if every transaction that performs a dirty read from an
uncommitted transaction delays its own commit until that uncommitted transaction has either
committed or rolled back.
Types of recoverable schedules:
There are three types of recoverable schedules which are explained below with relevant examples −
• Cascading schedules
• Cascadeless Schedules
• Strict Schedules.
Recoverable schedule:
First, let us see an example of a recoverable schedule.
T1          T2
R(X)
W(X)
            W(X)
            R(X)
commit
            Commit
Here, transaction T2 is reading value written by transaction T1 and the commit of T2 occurs after the
commit of T1. Hence, it is a recoverable schedule.
• Cascading Schedule:
o A cascading schedule is classified as a recoverable schedule.
o A recoverable schedule is a schedule in which the commit of a transaction that has read an
item from an uncommitted transaction is delayed until that transaction either commits or
rolls back.
o A cascading rollback is a type of rollback in which the failure of one transaction causes
the rollback of the other transactions that depend on it.
o The main disadvantage of cascading rollback is that it wastes CPU time, because the work
done by the rolled-back transactions is lost.
Given below is an example of a cascading schedule −
T1          T2          T3          T4
Read(A)
Write(A)
            Read(A)
            Write(A)
                        Read(A)
                        Write(A)
                                    Read(A)
                                    Write(A)
Failure
The above schedule leads to a cascading rollback: because T1 fails, T2 is rolled back; the rollback
of T2 causes T3 to roll back, and the rollback of T3 causes T4 to roll back.
• Cascadeless Schedule:
o When a transaction is not allowed to read data until the last transaction which has
written it is committed or aborted, these types of schedules are called cascadeless
schedules.
Given below is an example of a cascadeless schedule −
T1          T2
R(X)
W(X)
            W(X)
commit
            R(X)
            Commit
Here, the updated value of X is read by transaction T2 only after the commit of transaction T1. Hence,
the schedule is a cascadeless schedule.
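• Strict Schedule:
o A schedule is strict if a transaction can neither read nor write an item until the last
transaction that wrote it has committed or aborted.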
Given below is an example of a strict schedule −
T1          T2
R(X)
            R(X)
W(X)
commit
            W(X)
            R(X)
            Commit
Here, transaction T2 reads and writes the updated or written value of transaction T1 only after the
transaction T1 commits. Hence, the schedule is a strict schedule.
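The three classes of schedules can be checked mechanically. The sketch below is a simplified classifier under stated assumptions: a schedule is a list of (transaction, operation, item) triples with "R", "W", "C" (commit), and "A" (abort); a strict schedule is taken to mean that no item is read or written until the last transaction that wrote it has finished; and the restoring of old values after an abort is not modelled. The function name classify is hypothetical.

```python
def classify(schedule):
    """Return (recoverable, cascadeless, strict) for a schedule."""
    finished, committed = set(), set()
    last_writer = {}       # item -> transaction that most recently wrote it
    reads_from = {}        # transaction -> writers whose uncommitted data it read
    recoverable = cascadeless = strict = True
    for txn, op, item in schedule:
        if op in ("R", "W"):
            writer = last_writer.get(item)
            uncommitted = (writer is not None and writer != txn
                           and writer not in finished)
            if uncommitted:
                strict = False                 # touched an item with a pending write
                if op == "R":
                    cascadeless = False        # dirty read
                    reads_from.setdefault(txn, set()).add(writer)
            if op == "W":
                last_writer[item] = txn
        elif op == "C":
            # Committing before a transaction we read from makes recovery impossible.
            if any(w not in committed for w in reads_from.get(txn, ())):
                recoverable = False
            committed.add(txn)
            finished.add(txn)
        elif op == "A":
            finished.add(txn)
    return recoverable, cascadeless, strict

dirty_but_ordered = [("T1", "W", "X"), ("T2", "R", "X"), ("T1", "C", None), ("T2", "C", None)]
commit_too_early = [("T1", "W", "X"), ("T2", "R", "X"), ("T2", "C", None), ("T1", "C", None)]
read_after_commit = [("T1", "W", "X"), ("T2", "W", "X"), ("T1", "C", None),
                     ("T2", "R", "X"), ("T2", "C", None)]
all_after_commit = [("T1", "W", "X"), ("T1", "C", None), ("T2", "W", "X"),
                    ("T2", "R", "X"), ("T2", "C", None)]

print(classify(dirty_but_ordered))   # (True, False, False)
print(classify(commit_too_early))    # (False, False, False)
print(classify(read_after_commit))   # (True, True, False)
print(classify(all_after_commit))    # (True, True, True)
```

Each class is contained in the previous one: every strict schedule is cascadeless, and every cascadeless schedule is recoverable.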
