
Understanding Dirty Data in Transactions

This document discusses the dirty data problem in transaction management. The dirty data problem occurs when a transaction reads uncommitted, or dirty, data written by another uncommitted transaction. This can lead to inconsistent database states if the transaction that wrote the dirty data aborts. The document discusses how the dirty data problem can occur in two-phase locking (2PL) schedules and timestamp-based schedules. It also discusses solutions like cascading rollback and schedules that avoid cascading rollback, such as commit-rule schedules. The document covers other topics in transaction management like recoverable schedules, group commit, deadlocks, and deadlock prevention techniques.

Uploaded by

aurchichowdhury

Chapter 19 (TCDS)

Transaction Management
Sukarna Barua
Associate Professor, CSE, BUET
3/21/2024
The Dirty-Data Problem
▪ Dirty data: Data is dirty if it has been written by a transaction that has not yet committed.
▪ Dirty data may reside in:
▪ Buffers
▪ Disk
▪ Both
▪ Dirty-data problem: Reading dirty data can leave the database in an inconsistent state.
▪ Suppose T1 reads dirty data written by T2.
▪ T1 commits.
▪ T2 later aborts.
▪ Now T1’s read, and every write T1 based on it, rests on a value that never officially existed.
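
The scenario above can be traced with a toy in-memory store. This is only a sketch under stated assumptions: the `Store` class, its method names, and the values 100 and 150 are hypothetical and are not taken from the slides.

```python
class Store:
    """Toy store (hypothetical): committed values plus uncommitted writes."""

    def __init__(self, initial):
        self.committed = dict(initial)  # last committed value of each element
        self.dirty = {}                 # element -> (writer, uncommitted value)

    def write(self, txn, key, value):
        self.dirty[key] = (txn, value)

    def read(self, key):
        """Return (value, writer); writer is None if the value is committed."""
        if key in self.dirty:
            writer, value = self.dirty[key]
            return value, writer
        return self.committed[key], None

    def commit(self, txn):
        for key, (writer, value) in list(self.dirty.items()):
            if writer == txn:
                self.committed[key] = value
                del self.dirty[key]

    def abort(self, txn):
        for key, (writer, _) in list(self.dirty.items()):
            if writer == txn:
                del self.dirty[key]


store = Store({"A": 100, "B": 0})
store.write("T2", "A", 150)       # T2 writes A but has not committed
value, writer = store.read("A")   # T1 reads T2's dirty value
store.write("T1", "B", value)     # T1 bases a write on the dirty value
store.commit("T1")                # T1 commits
store.abort("T2")                 # T2 later aborts, so A stays at 100
```

After these steps the committed state holds A = 100 but B = 150, so B reflects a value of A that was never committed.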

The Dirty-Data Problem
▪ Dirty data in a 2PL schedule: Consider the schedule shown below.
▪ T1 writes A, which is updated in the buffer.
▪ T2 reads A, obtaining the updated value from the buffer.
▪ T1 aborts.
▪ The value of A that T2 read is now inconsistent!
▪ T2’s subsequent actions can lead to an inconsistent state in the database.
▪ Plain 2PL therefore allows dirty reads, because T1 may release its lock on A before it commits.

The Dirty-Data Problem
▪ Dirty data in a timestamp-based schedule: Assume no commit bit is used.
▪ T2 writes B.
▪ T1 reads B (the value written by T2).
▪ Later, T2 attempts to write C, but this write is physically unrealizable [ Why? ]
▪ Because 𝑅𝑇(𝐶) > 𝑇𝑆(𝑇2): a transaction with a later timestamp has already read C.
▪ T2 aborts.
▪ This makes T1’s earlier read of B inconsistent.

▪ The timestamp protocol with a commit bit does not allow dirty reads!
▪ An attempted dirty read is delayed until the writing transaction commits or aborts.
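
The two checks mentioned above can be written as small predicates. This is a minimal sketch, not a full scheduler: the function names and the timestamp values are hypothetical, and only the read rule and the "write too late" test are covered.

```python
def write_is_unrealizable(ts_writer, rt_x):
    """A write comes too late if a later transaction has already read X,
    that is, if RT(X) > TS(T)."""
    return rt_x > ts_writer


def read_decision(ts_reader, wt_x, commit_bit):
    """Decide what happens when a transaction tries to read X.

    wt_x is the write time of X; commit_bit is True once the transaction
    that last wrote X has committed.
    """
    if ts_reader < wt_x:
        return "abort"   # the read itself is physically unrealizable
    if not commit_bit:
        return "wait"    # dirty value: delay until the writer commits or aborts
    return "read"


# Hypothetical timestamps: TS(T2) = 150, TS(T1) = 200.
too_late = write_is_unrealizable(ts_writer=150, rt_x=200)
dirty_attempt = read_decision(ts_reader=200, wt_x=150, commit_bit=False)
clean_read = read_decision(ts_reader=200, wt_x=150, commit_bit=True)
```

With the commit bit in place, T1's attempt to read B is answered with "wait" rather than being served the uncommitted value.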

Solution to Dirty-Data Problem
▪ Cascading rollback: If a transaction T aborts, then every transaction that has read dirty data written by T must also be aborted.
▪ Algorithm for cascading rollback:
- When a transaction T aborts:
- Determine all transactions that have read a value written by T.
- Abort all of these transactions.
- To undo the effects of the aborted transactions, use the log to restore the former values in the disk blocks where required.
- Repeat the procedure recursively for every aborted transaction.
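
The recursive step of the algorithm above amounts to a graph traversal over the "reads from" relation. The following is a sketch only: `cascading_abort_set` and the `reads_from` mapping are hypothetical names, and restoring old values from the log is not shown.

```python
from collections import deque


def cascading_abort_set(aborted, reads_from):
    """Return every transaction that must abort when `aborted` aborts.

    reads_from maps each reader to the set of transactions whose
    uncommitted writes it has read (an assumed bookkeeping structure).
    """
    # Invert the relation: writer -> transactions that read from it.
    readers_of = {}
    for reader, writers in reads_from.items():
        for writer in writers:
            readers_of.setdefault(writer, set()).add(reader)

    to_abort = {aborted}
    queue = deque([aborted])
    while queue:
        txn = queue.popleft()
        for reader in readers_of.get(txn, ()):
            if reader not in to_abort:
                to_abort.add(reader)   # the reader saw dirty data, so it aborts
                queue.append(reader)   # and its own readers must be checked too
    return to_abort


reads_from = {"T2": {"T1"}, "T3": {"T2"}, "T4": {"T5"}}
victims = cascading_abort_set("T1", reads_from)
```

Here T2 read from T1 and T3 read from T2, so aborting T1 drags both along, while T4 is unaffected.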

▪ In a timestamp-based scheduler:
▪ The commit bit prevents the dirty-data problem.
▪ So cascading rollback is not required.

Recoverable Schedules
▪ Recoverable schedule:
▪ A schedule is recoverable if each transaction commits only after every transaction from which it has read has committed.

▪ Say T1 reads data written by T2, and T1 commits before T2 commits. The schedule is not recoverable. Why?
- If T2 now aborts, T1's read becomes inconsistent.
- However, T1 has already committed, so it can no longer be undone by cascading rollback.

▪ Requirement:
▪ If T1 remains committed after recovery and T1 read a value written by T2, then T2 must also remain committed after recovery.
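
The definition can be turned into a small checker. This is a sketch under assumptions: a schedule is represented as a list of hypothetical `(transaction, action, element)` tuples with actions "r", "w" and "c", and schedules containing aborts are not handled.

```python
def is_recoverable(schedule):
    """True if every committed reader commits after the writers it read from."""
    commit_pos = {txn: pos for pos, (txn, action, _) in enumerate(schedule)
                  if action == "c"}
    last_writer = {}
    for txn, action, item in schedule:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is None or writer == txn or txn not in commit_pos:
                continue  # nothing read from another txn, or reader never commits
            # The writer must commit, and must do so before the reader.
            if writer not in commit_pos or commit_pos[writer] > commit_pos[txn]:
                return False
    return True


# T1 reads T2's write and commits first: not recoverable.
bad = [("T2", "w", "A"), ("T1", "r", "A"), ("T1", "c", None), ("T2", "c", None)]
# Same actions, but T1 commits after T2: recoverable.
good = [("T2", "w", "A"), ("T1", "r", "A"), ("T2", "c", None), ("T1", "c", None)]
```

`is_recoverable(bad)` is False and `is_recoverable(good)` is True, matching the argument above.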

Recoverable Schedules
▪ Example 1: The following schedule is recoverable and conflict-serializable. [Why?]
▪ T2 reads B which was written by T1.
▪ T2 commits after T1.

▪ Example 2: The following schedule is recoverable but not conflict-serializable. [Why?]


▪ T2 reads B which was written by T1.
▪ T2 commits after T1.

▪ Example 3: The following schedule is not recoverable but conflict-serializable. [Why?]


▪ T2 reads B which was written by T1.
▪ T2 commits before T1.

Schedules Avoiding Cascading Rollback
▪ Recoverable schedules may still require cascading rollback.
▪ Consider the following schedule.

▪ If T1 has to roll back, T2 must be rolled back too. [Possible because T2 has not committed]

▪ Condition for avoiding cascading rollback:
▪ A schedule avoids cascading rollback if transactions may read only values written by committed transactions.
- Remember that "committed" means the commit log record has been flushed to disk.
- Such schedules are known as ACR schedules.
- An ACR schedule forbids reading dirty data.
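
The ACR condition is easy to test mechanically. The sketch below assumes a schedule is a list of hypothetical `(transaction, action, element)` tuples with actions "r", "w" and "c"; the two sample schedules are illustrative and are not the figures from the slides.

```python
def is_acr(schedule):
    """True if no transaction reads a value written by an uncommitted one."""
    last_writer = {}
    committed = set()
    for txn, action, item in schedule:
        if action == "c":
            committed.add(txn)
        elif action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False  # dirty read: the writer has not committed yet
    return True


# T2 reads B before T1 commits: recoverable, but not ACR.
dirty = [("T1", "w", "A"), ("T1", "w", "B"), ("T2", "w", "A"),
         ("T2", "r", "B"), ("T1", "c", None), ("T2", "c", None)]
# T2 reads B only after T1's commit: ACR.
clean = [("T1", "w", "A"), ("T1", "w", "B"), ("T2", "w", "A"),
         ("T1", "c", None), ("T2", "r", "B"), ("T2", "c", None)]
```

Only the position of T1's commit relative to T2's read differs between the two schedules, and that alone decides whether the schedule is ACR.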

Schedules Avoiding Cascading Rollback
▪ Which of the following three schedules are ACR?

▪ Answer: None
▪ Is the following schedule ACR?

▪ Answer: Yes! [Why?]


▪ T2’s read of B happens after T1’s commit.

Managing Rollbacks Using Locking
▪ Strict locking: A 2PL transaction must not release any exclusive lock until the transaction has committed or aborted and the commit or abort log record has been flushed to disk. [ The transaction may release shared locks before commit/rollback ]

▪ Strict 2PL schedule: A schedule of transactions that follow the strict locking rule is called a strict 2PL schedule.
▪ P1: Every strict 2PL schedule is ACR. [ Why? ]
▪ A transaction cannot read a value of X written by T1 until T1 releases its exclusive lock, which happens only after T1 commits.
▪ P2: Every strict 2PL schedule is serializable. [ Why? ]
▪ A strict 2PL schedule is equivalent to the serial schedule in which each transaction runs instantaneously at the time it commits.
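
The strict locking rule can be checked on a trace of lock actions. This is a rough sketch: the action names ("slock", "xlock", "unlock", "commit", "abort") are hypothetical, the two-phase property itself is not verified, and the flush of the commit record to disk is assumed to coincide with the "commit" action.

```python
def follows_strict_locking(actions):
    """True if no exclusive lock is released before its owner commits or aborts."""
    finished = set()    # transactions that have committed or aborted
    exclusive = set()   # (txn, item) pairs currently held in exclusive mode
    for txn, op, item in actions:
        if op in ("commit", "abort"):
            finished.add(txn)
        elif op == "xlock":
            exclusive.add((txn, item))
        elif op == "unlock" and (txn, item) in exclusive:
            if txn not in finished:
                return False          # exclusive lock released too early
            exclusive.discard((txn, item))
    return True


strict = [("T1", "xlock", "A"), ("T1", "commit", None), ("T1", "unlock", "A")]
early = [("T1", "xlock", "A"), ("T1", "unlock", "A"), ("T1", "commit", None)]
shared = [("T1", "slock", "A"), ("T1", "unlock", "A"), ("T1", "commit", None)]
```

The `shared` trace passes the check because the rule restricts only exclusive locks.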

Relationship Between Schedule Types
▪ Containments and non-containments among classes of schedules!

Group Commit
▪ Under the following condition, we can avoid reading dirty data even if we do not flush every commit record to disk immediately:
▪ As long as we flush log records to disk in the order they were written, we can release locks as soon as the commit record is written to the log buffer!

▪ Group commit rule:
▪ Do not release locks until the transaction finishes and its commit log record at least appears in the log buffer.
▪ Flush log records in the order in which they were created in the buffer.

Group Commit Example
▪ Suppose T1 writes X, finishes, and writes COMMIT to the log buffer (not to disk).
▪ T1 releases its lock on X.
▪ T2 reads X, finishes, and writes COMMIT to the log buffer.
▪ Note that neither T1 nor T2 can be considered committed yet, as their COMMIT records have not been flushed to disk.
▪ Two cases when the recovery manager processes the log:
▪ T1 is committed on disk: T2's read is not a problem, as it read a value from a committed transaction.
▪ T1 is not committed on disk: T2's read could be a problem; however, T2 is not committed either [Why?], and hence there is no dirty read.
▪ Because COMMIT records are flushed to disk in the order in which they were written to the log, T2's COMMIT cannot reach disk before T1's.
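
The ordering argument can be checked by enumerating every possible crash point. In this sketch the log buffer is a plain list and a crash is modelled by keeping only a prefix of it; the record format and the function name are hypothetical.

```python
# Log buffer in creation order for the example above.
log_buffer = [("T1", "UPDATE X"), ("T1", "COMMIT"), ("T2", "COMMIT")]


def committed_after_crash(log_buffer, flushed_count):
    """Transactions whose COMMIT record reached disk before the crash.

    Records are flushed strictly in order, so the disk always holds a
    prefix of the buffer.
    """
    on_disk = log_buffer[:flushed_count]
    return {txn for txn, record in on_disk if record == "COMMIT"}


# One outcome per crash point: 0, 1, 2 or 3 records flushed.
outcomes = [committed_after_crash(log_buffer, k)
            for k in range(len(log_buffer) + 1)]
```

In every outcome where T2 counts as committed, T1 does too, so the recovery manager never sees a committed T2 that read from an uncommitted T1.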

Deadlocks
▪ Deadlock: A situation in which each of several transactions is waiting for a resource held by one of the others, so none can make progress.

▪ Previously discussed deadlock situations:
▪ A two-phase-locked schedule can result in a deadlock.
▪ Upgrading locks can result in a deadlock.

▪ Approaches to deadlock:
▪ Deadlock detection and treatment:
- Allow deadlocks to happen.
- Detect a deadlock and resolve it by rolling back transactions.
▪ Deadlock prevention:
- Allow only conservative schedules that cannot deadlock.

Deadlock Detection by Timeout
▪ Timeout mechanism:
▪ Put a limit on how long a transaction may be active.
▪ If a transaction exceeds this limit, roll it back. [ assume it is deadlocked ]

▪ If the typical duration of a transaction is 1 ms:
▪ A timeout of one minute is long enough that, in practice, only deadlocked transactions exceed it.
▪ Once a deadlocked transaction is rolled back, the others may be able to continue before their own timeouts expire.
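
A timeout check needs only the start time of each active transaction. The sketch below is illustrative: the function name, the start times, and the 60-second limit are assumptions, and times are plain floats in seconds rather than real clock readings.

```python
def timed_out(start_times, now, limit):
    """Return, in sorted order, the transactions active for longer than `limit`."""
    return sorted(txn for txn, started in start_times.items()
                  if now - started > limit)


# Hypothetical start times in seconds.
start_times = {"T1": 0.0, "T2": 58.5, "T3": 59.999}
victims = timed_out(start_times, now=61.0, limit=60.0)
```

Only T1 has exceeded the limit here, so it alone is presumed deadlocked and rolled back.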

Waits-for Graph
▪ Waits-for graph:
▪ A node for each transaction that currently holds a lock or is waiting for one.
▪ An edge from node T to node U if there is some database element X such that:
▪ U holds a lock on X,
▪ T is waiting for a lock on X, and
▪ T cannot get a lock on X in its desired mode unless U releases its lock on X.
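
The edge rule above can be applied directly to a lock table. This sketch assumes only shared ("S") and exclusive ("X") modes, where two locks conflict unless both are shared; the `holders` and `waiters` structures are hypothetical.

```python
def build_waits_for(holders, waiters):
    """Return the set of edges (T, U) meaning "T waits for U".

    holders and waiters map each element to a list of (txn, mode) pairs,
    where mode is "S" (shared) or "X" (exclusive).
    """
    edges = set()
    for item, waiting in waiters.items():
        for t, wanted in waiting:
            for u, held in holders.get(item, []):
                # T is blocked by U only if the two modes conflict.
                if t != u and (wanted == "X" or held == "X"):
                    edges.add((t, u))
    return edges


holders = {"A": [("T1", "X")], "B": [("T2", "S"), ("T3", "S")]}
waiters = {"A": [("T2", "S")], "B": [("T1", "X"), ("T4", "S")]}
edges = build_waits_for(holders, waiters)
```

T4's shared request on B is compatible with the shared locks already held, so it contributes no edge; T1 and T2 wait for each other, which is a cycle.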

Waits-for Graph
▪ Deadlock detection using the waits-for graph: There is a deadlock if and only if there is a cycle in the waits-for graph.
▪ There is a cycle: No transaction in the cycle can make progress, so there is a deadlock.
▪ There is no cycle: At least one transaction is not waiting for any other; it can finish and release its locks, its incoming edges are deleted, and the argument repeats. Hence, no deadlock.

▪ Deadlock prevention using the waits-for graph:
▪ Roll back a transaction if its lock request would create a cycle in the waits-for graph.
▪ Aggressive approach!
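
Both uses of the graph reduce to a cycle test. The following sketch uses a depth-first search with three node colours; the function names and the edge-set representation are assumptions made for illustration.

```python
def has_cycle(edges):
    """True if the directed graph given as a set of (T, U) edges has a cycle."""
    graph = {}
    for t, u in edges:
        graph.setdefault(t, set()).add(u)
        graph.setdefault(u, set())

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, done
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:            # back edge: cycle found
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


def request_would_deadlock(edges, new_edges):
    """Prevention check: would adding the new waits-for edges close a cycle?"""
    return has_cycle(set(edges) | set(new_edges))


current = {("T1", "T2"), ("T2", "T3")}
safe = request_would_deadlock(current, {("T4", "T1")})
unsafe = request_would_deadlock(current, {("T3", "T1")})
```

For detection, `has_cycle` is run on the current graph; for prevention, `request_would_deadlock` is evaluated before a transaction is allowed to wait, and the requester is rolled back if it returns True.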

Waits-for Graph: Example

▪ Example: Consider the schedule shown above.
▪ After Step 7, the waits-for graph is acyclic.
▪ After Step 8, the waits-for graph would have a cycle.
▪ So the action is denied.
▪ T1 is rolled back [ preventative ].

Waits-for Graph: Example
▪ Example: Consider the previous schedule.
▪ The waits-for graph after T1 is rolled back is shown below.
▪ Now T2 and T4 can continue and finish.
▪ After T2 releases its locks, T3 can continue and finish.
▪ At some point T1 is restarted, but it cannot obtain its locks until T2, T3, and T4 finish and release the locks that T1 requires.

Deadlock Prevention by Element Ordering
▪ Locking rules in element ordering:
▪ Database elements are ordered beforehand (say, lexicographically).
▪ Every transaction must request its locks on database elements in that order. [e.g., lexicographically]
▪ Example: Say transaction T needs locks on elements A, B, C, D.
- Then it must request the locks in the order A, B, C, D.
- It cannot request them in any other order, such as A, D, B, C.
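
The ordering rule can be enforced, or merely checked, with a few lines. This sketch assumes lexicographic order on element names, as in the example above; both function names are hypothetical.

```python
def lock_order(elements):
    """Order in which a transaction should request locks on `elements`."""
    return sorted(elements)


def respects_element_order(requests):
    """True if a sequence of lock requests is strictly increasing."""
    return all(a < b for a, b in zip(requests, requests[1:]))


allowed = respects_element_order(["A", "B", "C", "D"])
forbidden = respects_element_order(["A", "D", "B", "C"])
plan = lock_order(["D", "A", "C", "B"])
```

A transaction that knows its full lock set in advance can simply request the locks in the order returned by `lock_order`.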

Deadlock Prevention by Element Ordering
▪ Why does locking in element order prevent deadlock?
▪ Say there is a deadlock 𝑇1 → 𝑇2 → ⋯ → 𝑇𝑛 → 𝑇1.
▪ T1 has locked A1 but is waiting for A2, T2 has locked A2 but is waiting for A3, and so on, and Tn has locked An but is waiting for A1.
▪ Each transaction requests a later element only after locking an earlier one, so A1 < A2 < ⋯ < An in the element order.
▪ This is not possible, because Tn would then be requesting A1 after locking An, which violates the element ordering rule. [ proof by contradiction ]

Deadlock Prevention by Element Ordering
▪ Previously, we showed a schedule causing deadlock for the following four transactions (left figure).
▪ T2 and T4 do not follow the element ordering rule. [Why?]
▪ Rewrite the lock requests of T2 and T4 (right figure).

Deadlock Prevention by Element Ordering
▪ Modified schedule after rewriting the lock requests of T2 and T4 according to the element ordering rule.
▪ Now no deadlock arises. [Check why?]
▪ All transactions finish successfully.

Deadlock Prevention by timestamp
▪ Deadlock prevention by waits-for graph is time-consuming:
▪ Each time a lock request arrives, the scheduler has to check for a cycle, and the graph can be large.

▪ An alternative approach is timestamping.

▪ Timestamping: Associate a timestamp with each transaction.
▪ This timestamp is for deadlock detection only.
▪ It is different from the timestamp used by the concurrency-control mechanism.
▪ If a transaction is rolled back and restarted due to deadlock, its timestamp remains the same.

Deadlock Prevention by timestamp
▪ Use of timestamps in deadlock prevention: Assume a transaction T has to wait for a lock that is held by another transaction U. Either of the following two policies can be used to prevent deadlocks:

▪ Wait-die scheme:
(a) [T waits] If T is older than U, then T is allowed to wait for U.
(b) [T dies] If T is younger than U, then T is rolled back.

▪ Wound-wait scheme:
(a) [T wounds U] If T is older than U, then U is rolled back; its locks are released and granted to T.
(b) [T waits] If T is younger than U, then T is allowed to wait for U.
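
Both policies are a single comparison of timestamps. In this sketch a smaller timestamp means an older transaction and timestamps are assumed to be distinct; the function names and the returned strings are hypothetical.

```python
def wait_die(ts_t, ts_u):
    """T (timestamp ts_t) requests a lock held by U (timestamp ts_u)."""
    return "T waits" if ts_t < ts_u else "T dies"


def wound_wait(ts_t, ts_u):
    """Same situation under the wound-wait policy."""
    return "T wounds U" if ts_t < ts_u else "T waits"


# Older T (timestamp 1) meets younger U (timestamp 2), and the reverse.
older_requests = (wait_die(1, 2), wound_wait(1, 2))
younger_requests = (wait_die(2, 1), wound_wait(2, 1))
```

In both schemes it is always the younger transaction that is rolled back; the schemes differ only in which side of the conflict is allowed to wait.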

Deadlock Prevention by timestamp
▪ Example schedule in wait-die scheme:
▪ T1, T2, T3, T4 are in timestamp order [ T1 is the oldest transaction ].
▪ Step 2: T2 dies.
▪ Step 4: T4 dies.
▪ Step 10: T2 waits.

Deadlock Prevention by timestamp
▪ Example schedule in wound-wait scheme:
▪ T1, T2, T3, T4 are in timestamp order [ T1 is the oldest transaction ].
▪ Step 2: T2 waits.
▪ Step 4: T4 waits.
▪ Step 5: T1 wounds T3.

Deadlock Prevention by timestamp
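▪ Why does the wait-die scheme work?
▪ Under this policy, an older transaction can wait for a younger transaction, but not vice versa.
▪ So every edge in the waits-for graph goes from an older transaction to a younger one; in particular, if T1 waits for T2, then T2 cannot wait for T1.
▪ A cycle would require some transaction to be older than itself. Hence, no cycle in the waits-for graph can occur!

▪ Why does the wound-wait scheme work?
▪ Under this policy, a younger transaction can wait for an older transaction, but not vice versa.
▪ So every edge in the waits-for graph goes from a younger transaction to an older one; in particular, if T1 waits for T2, then T2 cannot wait for T1.
▪ A cycle would require some transaction to be younger than itself. Hence, no cycle in the waits-for graph can occur!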

Wait-Die vs Wound-Wait
▪ Comparison of deadlock prevention methods:
▪ In the wait-die and wound-wait schemes:
▪ Older transactions kill newer transactions.
▪ A killed transaction restarts with its old timestamp, so it eventually becomes the oldest transaction and is guaranteed to complete.
▪ Hence there is no starvation.
▪ In the waits-for graph mechanism:
▪ “Starvation” may happen.
▪ A transaction is rolled back and restarted.
▪ The transaction falls into deadlock again.
▪ Thus it is rolled back again.
▪ This can repeat indefinitely, starving the transaction.

Wait-Die vs Wound-Wait
▪ Frequency of rollbacks:
▪ Wait-die:
- The younger transaction is killed when it asks for a lock held by an older one.
- This can happen frequently near the start of transactions.
- More rollbacks will happen.
▪ Wound-wait:
- The younger transaction is killed when an older transaction asks for a lock it holds.
- This situation is less common [ if transactions acquire their locks near the beginning ].
- Rollbacks would be rare in wound-wait.

Wait-Die vs Wound-Wait
▪ Amount of work wasted in rollbacks:
▪ Wait-die:
- The younger transaction is killed when it asks for a lock, presumably near the beginning of the transaction.
- So rollback aborts transactions that have only just begun.
- Little completed work is wasted.

▪ Wound-wait:
- When a transaction is killed, it may already have performed many actions.
- The transaction may even be close to finishing.
- Rollback then wastes a large amount of work.
