Concurrency Control in DBMS
In a database management system (DBMS), allowing transactions to run
concurrently has significant advantages, such as better system resource utilization
and higher throughput.
However, it is crucial that these transactions do not conflict with each other.
The ultimate goal is to ensure that the database remains consistent and accurate.
For instance, if two users try to book the last available seat on a flight at the same
time, the system must ensure that only one booking succeeds.
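To make the flight-seat example concrete, here is a minimal sketch in Python using SQLite. The seats table and its columns are hypothetical, invented only for illustration; the point is that the booking is a single conditional update, so whichever transaction applies it first gets the seat, and the other sees that no row was changed and fails cleanly.

import sqlite3

# Hypothetical schema for illustration: one row per seat, booked is 0 or 1.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE seats (seat_no TEXT PRIMARY KEY, booked INTEGER DEFAULT 0)")
con.execute("INSERT INTO seats VALUES ('12A', 0)")
con.commit()

def book_seat(seat_no):
    # The WHERE clause makes the check and the update one atomic step:
    # the row is updated only if the seat is still free.
    cur = con.execute(
        "UPDATE seats SET booked = 1 WHERE seat_no = ? AND booked = 0",
        (seat_no,),
    )
    con.commit()
    return cur.rowcount == 1  # True only for the booking that won

print(book_seat("12A"))  # True  - the first booking succeeds
print(book_seat("12A"))  # False - the seat is already taken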
Concurrency control is the mechanism in a Database Management System (DBMS) that preserves the consistency and integrity of data when multiple transactions access or modify it at the same time. It provides the rules for interleaving concurrent operations so that the ACID properties, isolation in particular, are maintained.
By implementing concurrency control, a DBMS allows transactions to execute concurrently while avoiding issues such as deadlocks, race conditions, and conflicts between operations. The main goal is to ensure that simultaneous transactions do not lead to data conflicts or violate the consistency of the database.
Concurrent Execution and Related Challenges in DBMS
In a multi-user system, several users can access and work on the same database at
the same time. This is known as concurrent execution, where the database is used
simultaneously by different users for various operations.
For instance, one user might be updating data while another is retrieving it.
When multiple transactions are performed on the database simultaneously, their operations are executed in an interleaved manner. The DBMS must control this interleaving so that the actions of one user do not interfere with or corrupt the actions of another; this is what maintains the consistency of the database. However, managing such simultaneous operations is challenging, and certain problems can arise if it is not handled properly.
These challenges need to be addressed to ensure smooth and error-free concurrent
execution.
Concurrent Execution can lead to various challenges:
Dirty Reads: One transaction reads uncommitted data from another transaction,
leading to potential inconsistencies if the changes are later rolled back.
Lost Updates: When two or more transactions update the same data simultaneously, one update may overwrite the other, causing data loss (a short sketch of this problem appears after this list).
Inconsistent Reads (also called non-repeatable reads): A transaction may read the same data item more than once during its execution, and the value may change between reads because another transaction modified it, leading to inconsistent results.
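To make the lost-update problem concrete, the sketch below uses two Python threads as stand-ins for transactions. Both perform an unprotected read-modify-write on a shared balance; the short sleep only widens the window between the read and the write so the overwrite is easy to reproduce.

import threading
import time

balance = 100  # shared data item

def deposit(amount):
    global balance
    current = balance           # read
    time.sleep(0.01)            # widen the gap between read and write
    balance = current + amount  # write: may overwrite a concurrent update

t1 = threading.Thread(target=deposit, args=(50,))
t2 = threading.Thread(target=deposit, args=(50,))
t1.start(); t2.start()
t1.join(); t2.join()

print(balance)  # expected 200, but typically prints 150: one deposit was lost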
Why is Concurrency Control Needed?
Consider the difference between executing transactions with and without it:
Without Concurrency Control: Transactions interfere with each other, causing
issues like lost updates, dirty reads or inconsistent results.
With Concurrency Control: Transactions are properly managed (e.g., using locks or timestamps) to ensure they execute in a consistent, isolated manner, preserving data accuracy (a locked version of the earlier sketch follows the next paragraph).
Concurrency control is critical to maintaining the accuracy and reliability of
databases in multi-user environments. By preventing conflicts and inconsistencies
during concurrent transactions, it ensures the database remains consistent and
correct, even under high levels of simultaneous activity.
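Continuing the deposit sketch from earlier, the version below adds a lock that plays the role of the exclusive lock a DBMS would take on the data item. Holding the lock across the whole read-modify-write serializes the two transactions, so neither update can be lost; this is only a minimal illustration of the idea, not how a real lock manager is implemented.

import threading
import time

balance = 100
lock = threading.Lock()

def deposit(amount):
    global balance
    with lock:                      # exclusive access to the data item
        current = balance           # read
        time.sleep(0.01)
        balance = current + amount  # write: no other writer can interleave

t1 = threading.Thread(target=deposit, args=(50,))
t2 = threading.Thread(target=deposit, args=(50,))
t1.start(); t2.start()
t1.join(); t2.join()

print(balance)  # always 200: no update is lost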
Recoverable and Cascadeless Schedules in Concurrency Control
1. Recoverable Schedules
A recoverable schedule ensures that a transaction commits only if all the
transactions it depends on have committed. This avoids situations where a
committed transaction depends on an uncommitted transaction that later fails,
leading to inconsistencies.
o Concurrency control ensures recoverable schedules by keeping track
of which transactions depend on others. It makes sure a transaction
can only commit if all the transactions it relies on have already
committed successfully. This prevents issues where a committed
transaction depends on one that later fails.
o Techniques like strict two-phase locking (2PL) enforce recoverability by holding write locks until commit, so no transaction reads data written by an uncommitted transaction; alternatively, the system can delay the commit of a dependent transaction until the transactions it read from have safely committed (a small checker for these schedule properties is sketched after this list).
2. Cascadeless Schedules
A cascadeless schedule avoids cascading rollbacks, which occur when the
failure of one transaction causes multiple dependent transactions to fail.
o Concurrency control techniques such as strict 2PL or strict timestamp
ordering guarantee cascadeless schedules by ensuring that transactions
read only committed data.
o By delaying read or write operations until the transaction they depend
on has committed, cascading rollbacks are avoided.
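The checker mentioned above is sketched below. A schedule is represented as a list of (transaction, action, item) tuples, where the action is 'R', 'W', or 'C' for commit; this representation is an assumption made for illustration, not a standard API. Reading data written by a still-uncommitted transaction makes the schedule non-cascadeless, and committing before the transaction it read from has committed makes it non-recoverable.

def analyze(schedule):
    last_writer = {}   # item -> transaction that wrote it most recently
    committed = set()  # transactions that have committed so far
    dirty_reads = []   # (reader, writer) pairs where the writer was uncommitted
    cascadeless = True

    for txn, action, item in schedule:
        if action == 'W':
            last_writer[item] = txn
        elif action == 'R':
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                dirty_reads.append((txn, writer))  # read of uncommitted data
                cascadeless = False
        elif action == 'C':
            committed.add(txn)
            # Recoverability: every transaction this one read uncommitted
            # data from must already be committed when this one commits.
            for reader, writer in dirty_reads:
                if reader == txn and writer not in committed:
                    return {'recoverable': False, 'cascadeless': False}
    return {'recoverable': True, 'cascadeless': cascadeless}

# T2 reads T1's uncommitted write but commits after T1: recoverable, not cascadeless.
s1 = [('T1', 'W', 'x'), ('T2', 'R', 'x'), ('T1', 'C', None), ('T2', 'C', None)]
# T2 commits before T1: not recoverable.
s2 = [('T1', 'W', 'x'), ('T2', 'R', 'x'), ('T2', 'C', None), ('T1', 'C', None)]
print(analyze(s1))  # {'recoverable': True, 'cascadeless': False}
print(analyze(s2))  # {'recoverable': False, 'cascadeless': False}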
Advantages of Concurrency
In general, concurrency means that more than one transaction can be active in the system at the same time. The advantages of a concurrent system are:
Waiting Time: The time a transaction spends in the ready state before it actually gets to execute. Because work from several transactions can be interleaved, concurrency leads to less waiting time.
Response Time: The time between submitting a request and receiving the first response. Concurrency leads to a shorter response time.
Resource Utilization: The fraction of system resources (CPU, disk, memory) that is actually kept busy. Because multiple transactions can run in parallel, concurrency leads to higher resource utilization.
Efficiency: The amount of useful work produced for a given amount of input resources. Concurrency leads to higher efficiency.
Disadvantages of Concurrency
Overhead: Implementing concurrency control requires additional overhead, such
as acquiring and releasing locks on database objects. This overhead can lead to
slower performance and increased resource consumption, particularly in systems
with high levels of concurrency.
Deadlocks: Deadlocks occur when two or more transactions wait for each other to release resources, forming a circular dependency that prevents any of them from completing. Deadlocks can be difficult to detect and resolve, and they result in reduced throughput and increased latency (a wait-for-graph detection sketch appears at the end of this section).
Reduced concurrency: Locking and other controls can limit how many users or applications can access the same data simultaneously, which lowers effective parallelism and slows performance in heavily loaded systems.
Complexity: Implementing concurrency control can be complex, particularly in
distributed systems or in systems with complex transactional logic. This
complexity can lead to increased development and maintenance costs.
Inconsistency: If concurrency control is relaxed for performance (for example, by allowing reads of uncommitted data at weak isolation levels), a transaction that is later rolled back can leave other transactions having worked with values that never officially existed. A long-running transaction may also cause others to wait for extended periods, leading to data staleness and reduced accuracy.
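The deadlock-detection sketch referenced earlier is given below. Many systems detect deadlocks with a wait-for graph: an edge T1 -> T2 means transaction T1 is blocked on a lock held by T2, and a cycle in this graph is exactly the circular dependency described above. The graph used here is a made-up example, and the cycle test is a plain depth-first search.

def has_cycle(wait_for):
    # Depth-first search for a cycle in the wait-for graph.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY                        # on the current DFS path
        for u in wait_for.get(t, []):
            if color.get(u, WHITE) == GRAY:    # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK                       # fully explored, no cycle via t
        return False

    return any(color[t] == WHITE and visit(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1: a deadlock the DBMS must break,
# typically by aborting one of the two transactions as the victim.
print(has_cycle({'T1': ['T2'], 'T2': ['T1']}))  # True
print(has_cycle({'T1': ['T2'], 'T2': []}))      # False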