Dekker's Algorithm and Memory Consistency

This document discusses shared memory consistency models on multiprocessor systems. It begins by explaining sequential consistency, which requires that memory accesses across different processors be interleaved arbitrarily while preserving program order within each processor. Optimizations like write buffering can violate sequential consistency on multiprocessors by allowing reads to see writes out of order. The document examines how Dekker's mutual exclusion algorithm could fail under this reordering and explores approaches for restoring sequential consistency.

Uploaded by

Deepak Chakrasali

Shared Memory Consistency Models: A Tutorial
By Sarita Adve & Kourosh Gharachorloo
Review by Jim Larson

Outline

Shared Memory on a Uniprocessor

Optimizations on a Uniprocessor

Extending to a Multiprocessor: Sequential Consistency

Extending to a Multiprocessor: Does Sequential Consistency Matter?

Restoring Sequential Consistency

Conclusion

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 0, Flag2 = 0

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 1, Flag2 = 0

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 1, Flag2 = 1

The critical section is protected.

It works the same if Process 2 runs first: Process 2 enters its critical section.

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 1, Flag2 = 0

Arbitrary interleaving of processes

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 1, Flag2 = 1

Arbitrary interleaving of processes: both processes are blocked (neither enters its critical section), but no harm!
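The interleaving argument above can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes a single shared memory with no buffering, models each process as "write own flag, then read the other flag", and runs every interleaving that keeps each process's own order. The helper names (`interleavings`, `run`) are illustrative only.

```python
from itertools import combinations

# Each process writes its own flag, then reads the other process's flag.
P1 = [("write", "Flag1"), ("read", "Flag2")]
P2 = [("write", "Flag2"), ("read", "Flag1")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [(1, next(ia)) if i in slots else (2, next(ib)) for i in range(n)]

def run(schedule):
    """Execute one interleaving against a single shared memory (assumed: no buffering)."""
    memory = {"Flag1": 0, "Flag2": 0}
    entered = {1: False, 2: False}
    for pid, (op, loc) in schedule:
        if op == "write":
            memory[loc] = 1
        else:
            # The read is the If test: enter only when the other flag is still 0.
            entered[pid] = memory[loc] == 0
    return entered[1], entered[2]

results = [run(s) for s in interleavings(P1, P2)]
both = sum(1 for r in results if r == (True, True))
neither = sum(1 for r in results if r == (False, False))
print(len(results), both, neither)
```

Under these assumptions there are six interleavings; in none of them do both processes enter, and in four of them neither enters, which is the "both blocked, but no harm" case.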

Optimization: Write Buffer with Bypass

Speedup: a write takes 100 cycles, buffering takes 1 cycle. So buffer the write and keep going.
Problem: what about a read from a location with a buffered write pending?
(Single-processor case)

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 0, Flag2 = 0
Write buffer: Flag1 = 1

Write Buffering

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 0, Flag2 = 0
Write buffer: Flag2 = 1, Flag1 = 1

Write Buffering

Optimization: Write Buffer with Bypass

Speedup: a write takes 100 cycles, buffering takes 1 cycle.
Rule: if a WRITE is issued, buffer it and keep executing.
Unless: there is a READ from the same location (subsequent WRITEs don't matter); then wait for the WRITE to complete.
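The rule can be sketched as a toy model of one processor with a write buffer. This is an illustration under stated assumptions, not the hardware mechanism from the tutorial: the class name `BufferedProcessor`, the FIFO drain, and the stall counter are all hypothetical, and cycle costs are left out.

```python
class BufferedProcessor:
    """Toy single-processor model of a write buffer with read bypass."""

    def __init__(self):
        self.memory = {}    # locations default to 0 when read
        self.buffer = []    # pending (location, value) writes, oldest first
        self.stalls = 0     # how many reads had to wait for a buffered write

    def write(self, loc, value):
        # Rule: buffer the write and keep executing.
        self.buffer.append((loc, value))

    def read(self, loc):
        pending = [i for i, (p, _) in enumerate(self.buffer) if p == loc]
        if pending:
            # Unless: a read of a buffered location waits for that write.
            # Assumption: the buffer drains in FIFO order up to that write.
            self.stalls += 1
            last = pending[-1]
            for p, v in self.buffer[: last + 1]:
                self.memory[p] = v
            del self.buffer[: last + 1]
        # Reads of other locations bypass the buffered writes.
        return self.memory.get(loc, 0)

cpu = BufferedProcessor()
cpu.write("Flag1", 1)
other = cpu.read("Flag2")   # bypasses the pending write to Flag1
same = cpu.read("Flag1")    # stalls until the write to Flag1 completes
print(other, same, cpu.stalls)
```

On a single processor the read of the same location always sees the program's latest write, so the optimization is invisible to the program.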

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

STALL!

Memory: Flag1 = 0, Flag2 = 0
Write buffer: Flag2 = 1, Flag1 = 1

Write Buffering
Rule: if a WRITE is issued, buffer it and keep executing.
Unless: there is a READ from the same location (subsequent WRITEs don't matter); then wait for the WRITE to complete.

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 1, Flag2 = 0
Write buffer: Flag2 = 1

Write Buffering

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 0, Flag2 = 0

Does this work for multiprocessors?

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 0, Flag2 = 0

Does this work for multiprocessors?

We assume it does!
What does that mean?

Sequential Consistency for Multiprocessors

Sequential consistency requires that the result of any execution be the same as if the memory accesses executed by each processor were kept in order and the accesses among different processors were interleaved arbitrarily.

"...appears as if a memory operation executes atomically or instantaneously with respect to other memory operations" (Hennessy and Patterson, 4th ed.)
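One way to read the definition: a candidate global order of memory accesses is acceptable only if it is an interleaving in which each processor's accesses stay in program order. The sketch below checks exactly that property. The function name `respects_program_order` and the string encoding of operations are assumptions made for illustration.

```python
def respects_program_order(global_order, programs):
    """Return True if global_order interleaves the programs with each kept in order.

    global_order: list of (processor_id, operation) pairs.
    programs: dict mapping processor_id to that processor's list of operations.
    """
    position = {pid: 0 for pid in programs}
    for pid, op in global_order:
        program = programs[pid]
        # The next access seen from this processor must be its next one in program order.
        if position[pid] >= len(program) or program[position[pid]] != op:
            return False
        position[pid] += 1
    # Every access of every processor must appear exactly once.
    return all(position[pid] == len(programs[pid]) for pid in programs)

programs = {
    1: ["W Flag1", "R Flag2"],
    2: ["W Flag2", "R Flag1"],
}
interleaved = [(1, "W Flag1"), (2, "W Flag2"), (1, "R Flag2"), (2, "R Flag1")]
reordered = [(1, "R Flag2"), (1, "W Flag1"), (2, "W Flag2"), (2, "R Flag1")]
print(respects_program_order(interleaved, programs))
print(respects_program_order(reordered, programs))
```

The first order is a legal interleaving; the second moves Process 1's read ahead of its own write, so it is rejected.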

Understanding Ordering

Program Order

Compiled Order

Interleaving Order

Execution Order

Reordering

Writes reach memory, and reads see memory, in an order different from that in the program.

Caused by the processor

Caused by multiprocessors (and caches)

Caused by compilers

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Memory: Flag1 = 0, Flag2 = 0

Multiprocessor Case
Rule: if a WRITE is issued, buffer it and keep executing.
Unless: there is a READ from the same location (subsequent WRITEs don't matter); then wait for the WRITE to complete.

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Write buffer (Processor 1): Flag1 = 1
Memory: Flag1 = 0, Flag2 = 0

Multiprocessor Case

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Write buffer (Processor 1): Flag1 = 1
Memory: Flag1 = 0, Flag2 = 0
Write buffer (Processor 2): Flag2 = 1

Multiprocessor Case

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Write buffer (Processor 1): Flag1 = 1
Memory: Flag1 = 0, Flag2 = 0
Write buffer (Processor 2): Flag2 = 1

Multiprocessor Case

What happens on a processor stays on that processor.

Dekker's Algorithm: Global Flags Init to 0

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Write buffer (Processor 1): Flag1 = 1
Memory: Flag1 = 0, Flag2 = 0
Write buffer (Processor 2): Flag2 = 1

Processor 2 knows nothing about the write to Flag1, so it has no reason to stall!
Rule: if a WRITE is issued, buffer it and keep executing.
Unless: there is a READ from the same location (subsequent WRITEs don't matter); then wait for the WRITE to complete.
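The failure can be reproduced with a small deterministic model. This is a sketch under assumptions, not the tutorial's hardware: each `Processor` object (a hypothetical name) has a private write buffer, applies the bypass rule only against its own buffer, and shares one `memory` dictionary with the other processor.

```python
class Processor:
    """Toy processor with a private write buffer and the bypass rule."""

    def __init__(self, memory):
        self.memory = memory   # shared by all processors
        self.buffer = []       # private: other processors cannot see it

    def write(self, loc, value):
        # Rule: buffer the write and keep executing.
        self.buffer.append((loc, value))

    def read(self, loc):
        # Stall only if THIS processor has a pending write to the same location.
        if any(p == loc for p, _ in self.buffer):
            self.drain()
        return self.memory[loc]

    def drain(self):
        # Complete all buffered writes in order.
        for loc, value in self.buffer:
            self.memory[loc] = value
        self.buffer.clear()

memory = {"Flag1": 0, "Flag2": 0}
p1, p2 = Processor(memory), Processor(memory)

p1.write("Flag1", 1)                 # sits in Processor 1's buffer
p2.write("Flag2", 1)                 # sits in Processor 2's buffer
p1_enters = p1.read("Flag2") == 0    # no local write to Flag2, so no stall
p2_enters = p2.read("Flag1") == 0    # no local write to Flag1, so no stall
p1.drain()
p2.drain()
print(p1_enters, p2_enters, memory)
```

Both reads bypass the buffered writes and see 0, so both processes enter the critical section even though each processor obeyed the uniprocessor rule.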

Another way to look at the problem: reordering of reads and writes (loads and stores).

Consider the instructions in these processes.

Process 1::            Process 2::
Flag1 = 1              Flag2 = 1
If (Flag2 == 0)        If (Flag1 == 0)
    critical section       critical section

Simplify as: WX and RY for Process 1 (write Flag1, read Flag2); WY and RX for Process 2 (write Flag2, read Flag1).

 #   1st  2nd  3rd  4th
 1   WX   RY   WY   RX
 2   WX   RY   RX   WY
 3   WX   WY   RY   RX
 4   WX   RX   RY   WY
 5   WX   WY   RX   RY
 6   WX   RX   WY   RY
 7   RY   WX   WY   RX
 8   RY   WX   RX   WY
 9   WY   WX   RY   RX
10   RX   WX   RY   WY
11   WY   WX   RX   RY
12   RX   WX   WY   RY
13   RY   WY   WX   RX
14   RY   RX   WX   WY
15   WY   RY   WX   RX
16   RX   RY   WX   WY
17   WY   RX   WX   RY
18   RX   WY   WX   RY
19   RY   WY   RX   WX
20   RY   RX   WY   WX
21   WY   RY   RX   WX
22   RX   RY   WY   WX
23   WY   RX   RY   WX
24   RX   WY   RY   WX

There are 4! = 24 possible orderings.

If either WX < RX or WY < RY, then the critical section is protected (correct behavior).

18 of the 24 orderings are OK.

But the other 6 are trouble!
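The count can be reproduced by brute force. The sketch below is illustrative: it uses the slide's names WX, RY, WY, RX, and adds one check of its own (`in_program_order`, a hypothetical helper) to see whether any troublesome ordering keeps each process's write ahead of its read.

```python
from itertools import permutations

OPS = ("WX", "RY", "WY", "RX")   # Process 1: WX then RY; Process 2: WY then RX

def protected(order):
    """Critical section is protected if either write precedes the matching read."""
    pos = {op: i for i, op in enumerate(order)}
    return pos["WX"] < pos["RX"] or pos["WY"] < pos["RY"]

def in_program_order(order):
    """True if each process's write comes before its own read."""
    pos = {op: i for i, op in enumerate(order)}
    return pos["WX"] < pos["RY"] and pos["WY"] < pos["RX"]

orderings = list(permutations(OPS))
ok = [o for o in orderings if protected(o)]
trouble = [o for o in orderings if not protected(o)]
trouble_in_program_order = [o for o in trouble if in_program_order(o)]

print(len(orderings), len(ok), len(trouble), len(trouble_in_program_order))
```

None of the six troublesome orderings keeps both processes in program order, so they can only arise when a read is reordered ahead of an earlier write.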

Consider another example...

Global Data Initialized to 0

Process 1::        Process 2::
Data = 2000;       While (Head == 0) {;}
Head = 1;          LocalValue = Data

Memory interconnect: (empty)
Memory: Head = 0, Data = 0

Write bypass: a general interconnect to multiple memory modules means the order in which writes arrive in memory is indeterminate.

Global Data Initialized to 0

Process 1::        Process 2::
Data = 2000;       While (Head == 0) {;}
Head = 1;          LocalValue = Data

Leaving Process 1: Data = 2000
Memory interconnect: (empty)
Memory: Head = 0, Data = 0

Global Data Initialized to 0

Process 1::        Process 2::
Data = 2000;       While (Head == 0) {;}
Head = 1;          LocalValue = Data

Leaving Process 1: Head = 1
Memory interconnect: Data = 2000
Memory: Head = 0, Data = 0

Global Data Initialized to 0

Process 1::        Process 2::
Data = 2000;       While (Head == 0) {;}
Head = 1;          LocalValue = Data

Memory interconnect: Data = 2000
Memory: Head = 1, Data = 0

Global Data Initialized to 0

Process 1::        Process 2::
Data = 2000;       While (Head == 0) {;}
Head = 1;          LocalValue = Data

Memory interconnect: (empty)
Memory: Head = 1, Data = 2000

Write bypass: a general interconnect to multiple memory modules means the order in which writes arrive in memory is indeterminate.
Fix: a write must be acknowledged before another write (or read) from the same processor.
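The effect of the fix can be seen in a tick-based toy model. Everything here is an assumption for illustration: the function `run`, the latency values (the write to Data takes 3 ticks to reach its module, the write to Head takes 1), and treating the acknowledgement as arriving on the same tick the write reaches memory.

```python
def run(wait_for_ack, data_latency=3, head_latency=1):
    """Return the value Process 2 reads from Data.

    Latencies are hypothetical tick counts for a write to reach its memory module.
    """
    memory = {"Data": 0, "Head": 0}
    in_flight = []   # (arrival_tick, location, value) writes inside the interconnect
    program = [("Data", 2000, data_latency), ("Head", 1, head_latency)]
    next_issue = 0   # earliest tick at which Process 1 may issue its next write
    tick = 0
    while True:
        # Process 1: issue the next write when allowed.
        if program and tick >= next_issue:
            loc, value, latency = program.pop(0)
            in_flight.append((tick + latency, loc, value))
            # With the fix, wait until this write has arrived (acknowledged).
            next_issue = tick + latency if wait_for_ack else tick + 1
        # Interconnect: deliver every write whose arrival tick has come.
        for arrival, loc, value in in_flight:
            if arrival <= tick:
                memory[loc] = value
        in_flight = [w for w in in_flight if w[0] > tick]
        # Process 2: spin on Head, then read Data.
        if memory["Head"] != 0:
            return memory["Data"]
        tick += 1

without_fix = run(wait_for_ack=False)
with_fix = run(wait_for_ack=True)
print(without_fix, with_fix)
```

Without the acknowledgement, the write to Head overtakes the write to Data and Process 2 reads the stale value; with it, Head cannot be issued until Data has arrived.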

Global Data Initialized to 0

Process 1::        Process 2::
Data = 2000;       While (Head == 0) {;}
Head = 1;          LocalValue = Data

Memory interconnect: (empty)
Memory: Head = 0, Data = 0