Congestion Control
What is congestion?
Methods for dealing with congestion
TCP congestion control
Review of Flow Control
Receiver advertises window in ACK
packets
Receiver window size = amount of free
space in receive buffer
Sender will limit number of unACK’ed
packets to receiver window
Invariant: every packet that arrives at
receiver can be buffered
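The sender-side rule above can be sketched as a single check. This is a minimal illustration that counts packets rather than bytes; the function name `can_send` and its parameters are hypothetical, not part of any TCP implementation.

```python
def can_send(last_sent: int, last_acked: int, recv_window: int) -> bool:
    """Return True if one more packet still fits in the advertised window."""
    unacked = last_sent - last_acked  # packets currently in flight
    return unacked < recv_window


# Three packets in flight against a window of 5: the sender may transmit.
print(can_send(last_sent=8, last_acked=5, recv_window=5))
# A window of 0 blocks the sender even with nothing in flight.
print(can_send(last_sent=8, last_acked=8, recv_window=0))
```

Because the sender never has more unACKed packets than the advertised window, every arriving packet finds buffer space, which is the invariant stated above.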
Flow Control
[Figure: sender/receiver timeline; the receiver advertises recv window = 5, later recv window = 0]
Flow Control, Cont’d
[Figure: sender/receiver timeline, continued; the receiver advertises recv window = 2 twice]
Flow Control Justification
TCP flow control is conservative
Only send data if receiver is sure to have space
for it
Consider aggressive alternative
Send data optimistically, hoping receiver has
space for it
If receiver can’t buffer packet, simply discard
Which is more efficient?
Efficiency Metrics
Transmission speed
Get as many bytes from sender to receiver as
quickly as possible
Network utilization
Maximize “goodput”
Fraction of bytes spent on delivering new, useful
data
E.g. delayed ack hurts transmission speed,
but improves network utilization
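As a rough illustration of the goodput definition above, the fraction can be computed from byte counts. The function name and the three-way split of traffic (new data, retransmissions, headers) are assumptions made for this sketch.

```python
def goodput_fraction(new_data_bytes: int, retransmitted_bytes: int, header_bytes: int) -> float:
    """Fraction of all bytes sent that carried new, useful data."""
    total = new_data_bytes + retransmitted_bytes + header_bytes
    return new_data_bytes / total if total else 0.0


# 8000 bytes of new data out of 10000 bytes on the wire.
print(goodput_fraction(8000, 1000, 1000))
```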
Congestion
What is congestion?
Higher rate of inputs to a router than outputs
What are effects of congestion?
Delays
Loss
What layer does congestion occur at?
Network layer
So why are we talking about it now?
Congestion, simple case
Assume:
Sender transmits at full line rate (i.e. receiver window
infinite)
Instant, free, precise loss notification
No other traffic on network
[Figure: Sender to Router at 10 Mbps; Router to Receiver at 5 Mbps]
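A back-of-the-envelope sketch of what happens at the router in this simple case: input arrives at 10 Mbps, output drains at 5 Mbps, and the difference either queues or is dropped. The buffer size used here is an assumed value for illustration; the slide does not specify one.

```python
def backlog_bits(arrival_bps: float, service_bps: float, seconds: float, buffer_bits: float) -> tuple:
    """Return (queued_bits, dropped_bits) after `seconds` of constant-rate input."""
    excess = max(0.0, arrival_bps - service_bps) * seconds
    queued = min(excess, buffer_bits)   # the queue fills first
    dropped = excess - queued           # anything beyond the buffer is lost
    return queued, dropped


# 10 Mbps in, 5 Mbps out, for 1 second, with an assumed 1 Mbit buffer.
queued, dropped = backlog_bits(10e6, 5e6, 1.0, 1e6)
print(queued, dropped)
```

The queued bits show up as delay and the dropped bits as loss, the two effects of congestion listed earlier.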
Congestion, two senders
Assume:
Each sender transmits at 1/2 line rate
[Figure: Sender 1 and Sender 2 feed a shared Router, which forwards to Receiver 1 and Receiver 2; the links are labeled 5 Mbps, 20 Mbps, and 20 Mbps]
Congestion and Delays
During congestion, delay is much greater
than ordinary delay
Since loss notification is imperfect, may
retransmit packets still in the queue!
[Figure: Sender and Router; the router queue still holds the original packet when its retransmission arrives]
Congestion Problems
Excessive queueing delays
Wasted network capacity on retransmissions
Could have been used by other flows
Situation gets worse with multi-hop paths
Wasteful retransmit of prematurely timed out packets
How bad does it get?
1000-fold reduction in bandwidth!
Congestion Control
Congestion control involves two tasks:
Detect congestion
Limit sending rate
Today we look at the TCP approach to
both tasks
TCP Congestion Control
Idea
Assumes best-effort network
FIFO or FQ
Each source determines network capacity for
itself
Implicit feedback
ACKs pace transmission (self-clocking)
Challenge
Determining initial available capacity
Adjusting to changes in capacity in a timely
manner
TCP Congestion Control
Basic idea
Add notion of congestion window
Effective window is smaller of
Advertised window (flow control)
Congestion window (congestion control)
Changes in congestion window size
Slow increases to absorb new bandwidth
Quick decreases to eliminate congestion
TCP Congestion Control
Specific strategy
Self-clocking
Send data only when outstanding data ACK’d
Equivalent to the send window limitation mentioned earlier
Growth
Add one maximum segment size (MSS) per
congestion window of data ACK’d
It’s really done this way, at least in Linux:
see tcp_cong_avoid in tcp_input.c.
Actually, every ack for new data is treated as an MSS
ACK’d
Known as additive increase
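One common way to get roughly "one MSS per congestion window of data ACK'd" is to add MSS*MSS/cwnd bytes on every ACK, the formula given in RFC 5681. The sketch below uses that formula; it is not the Linux code referenced above, and the MSS value and the one-ACK-per-segment assumption are illustrative.

```python
MSS = 1460  # bytes; an assumed, typical Ethernet-sized segment


def on_ack_additive_increase(cwnd: float) -> float:
    """Grow cwnd by a fraction of an MSS for each ACK of new data."""
    return cwnd + MSS * MSS / cwnd


cwnd = 10 * MSS
# One full window of ACKs (assumes one ACK per segment, no delayed ACKs).
for _ in range(10):
    cwnd = on_ack_additive_increase(cwnd)

print(cwnd / MSS)  # slightly under 11 segments
```

The growth over one window is a little less than a full MSS because cwnd increases during the round, so each later ACK adds slightly less.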
TCP Congestion Control
Specific strategy (continued)
Decrease
Cut window in half when timeout occurs
In practice, set window = window / 2
Known as multiplicative decrease
Additive increase, multiplicative decrease
(AIMD)
Additive Increase/
Multiplicative Decrease
Objective
Adjust to changes in available capacity
Tools
React to observed congestion
Probe channel to detect more resources
Observation
On notice of congestion
Decreasing too slowly will not be reactive enough
On probe of network
Increasing too quickly will overshoot limits
Additive Increase/
Multiplicative Decrease
New TCP state variable
CongestionWindow
Similar to AdvertisedWindow for flow control
Limits how much data source can have in transit
MaxWin = MIN(CongestionWindow,
AdvertisedWindow)
EffWin = MaxWin - (LastByteSent -
LastByteAcked)
TCP can send no faster than the slowest component,
network or destination
Idea
Increase CongestionWindow when congestion goes
down
Decrease CongestionWindow when congestion goes up
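The MaxWin and EffWin formulas above translate directly into code. In this sketch the function name is hypothetical, and clamping the result at zero is an added assumption to cover the case where the window shrinks below the data already in flight.

```python
def effective_window(congestion_window: int, advertised_window: int,
                     last_byte_sent: int, last_byte_acked: int) -> int:
    """Bytes the source may still send right now."""
    max_win = min(congestion_window, advertised_window)
    in_flight = last_byte_sent - last_byte_acked
    # Clamp at zero (assumption): a negative result just means "send nothing".
    return max(0, max_win - in_flight)


# Network-limited: the congestion window (8000) is the smaller of the two.
print(effective_window(8000, 16000, last_byte_sent=5000, last_byte_acked=1000))
# Receiver-limited: the advertised window (4000) is already full.
print(effective_window(8000, 4000, last_byte_sent=5000, last_byte_acked=1000))
```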
Additive Increase/
Multiplicative Decrease
Question
How does the source determine whether or not the
network is congested?
Answer
Timeout signals packet loss
Packet loss is rarely due to transmission error (on wired
lines)
Lost packet implies congestion!
Additive Increase/
Multiplicative Decrease
Algorithm
Increment CongestionWindow
by one packet per RTT
Linear increase
Divide CongestionWindow by
two whenever a timeout occurs
Multiplicative decrease
[Figure: source/destination packet timeline]
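The algorithm above can be sketched as a tiny simulation that produces the sawtooth pattern. It is heavily idealized: the window is counted in packets, the network has a fixed capacity, and a timeout is assumed to occur in exactly the round where the window exceeds that capacity. The names `aimd_trace` and `capacity` are illustrative.

```python
def aimd_trace(capacity: int, rounds: int, start: int = 1) -> list:
    """Return the congestion window (in packets) used in each RTT."""
    cwnd = start
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:
            # Idealized timeout: multiplicative decrease.
            cwnd = max(1, cwnd // 2)
        else:
            # No loss this round: additive increase of one packet per RTT.
            cwnd += 1
    return trace


print(aimd_trace(capacity=8, rounds=12))
```

With a capacity of 8 packets the window climbs linearly to 9, is halved to 4, and starts climbing again.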
Additive Increase/
Multiplicative Decrease
Sawtooth trace
[Figure: sawtooth trace of CongestionWindow (KB, 10 to 70) versus time (seconds, 1.0 to 10.0)]
TCP Start Up Behavior
How should TCP start sending data?
AIMD is good for channels operating at
capacity
AIMD can take a long time to ramp up to
full capacity from scratch
Use Slow Start to increase window
rapidly from a cold start
TCP Start Up Behavior
Initialization of the congestion window
Congestion window should start small
Avoid congestion due to new connections
Start at 1 MSS, reset to 1 MSS with each
timeout (note that timeouts are coarse-grained,
~1/2 sec)
Known as slow start
Slow Start
Objective
Determine initial available capacity
Idea
Begin with CongestionWindow = 1 packet
Double CongestionWindow each RTT
Increment by 1 packet for each ACK
Continue increasing until loss
Result
Exponential growth
Slower than all at once
Used
When first starting connection
When connection times out
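The link between "increment by 1 packet for each ACK" and "double each RTT" can be checked with a short sketch. It assumes every packet in the window is ACKed individually within one RTT and that no loss occurs; the function name is illustrative.

```python
def slow_start_round(cwnd: int) -> int:
    """One RTT of slow start: each of the cwnd packets is ACKed, each ACK adds 1."""
    acks = cwnd
    for _ in range(acks):
        cwnd += 1
    return cwnd


cwnd = 1
history = [cwnd]
for _ in range(5):
    cwnd = slow_start_round(cwnd)
    history.append(cwnd)

print(history)  # window at the start of each RTT
```

A window of n packets produces n ACKs, so adding one packet per ACK doubles the window every round trip.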
TCP Congestion Control
To make up for slow start, ramp up congestion
window quickly
Maintain threshold window size
Use multiplicative increase
When congestion window smaller than threshold
Double window for each window ACK’d
Threshold value
Initially set to maximum window size
Set to 1/2 of current window on timeout
In practice, increase congestion window by one
MSS for each ACK of new data (or N bytes for N
bytes)
Slow Start
How long should the exponential increase
from slow start continue?
New variable: target window size
CongestionThreshold
Estimate network capacity
When CongestionWindow reaches
CongestionThreshold switch to additive
increase
Initial values
CongestionThreshold = 8
CongestionWindow = 1
Loss after transmission 7
CongestionWindow currently 12
Set CongestionThreshold =
CongestionWindow/2 = 6
Set CongestionWindow = 1
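The numbers in this example can be reproduced with a short sketch. It assumes that "transmission 7" means the seventh round trip, that the window is counted in packets, and that doubling is capped at the threshold; the function names are illustrative.

```python
def run_rounds(cwnd: int, threshold: int, rounds: int) -> int:
    """Advance the window through `rounds` loss-free RTTs."""
    for _ in range(rounds):
        if cwnd < threshold:
            cwnd = min(2 * cwnd, threshold)  # slow start: double per RTT
        else:
            cwnd += 1                        # additive increase: +1 per RTT
    return cwnd


def on_timeout(cwnd: int) -> tuple:
    """Return (new_cwnd, new_threshold) after a timeout."""
    return 1, cwnd // 2


# Windows used in rounds 1..7: 1, 2, 4, 8, 9, 10, 11; the next would be 12.
cwnd = run_rounds(cwnd=1, threshold=8, rounds=7)
new_cwnd, new_threshold = on_timeout(cwnd)
print(cwnd, new_cwnd, new_threshold)
```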
Slow Start
Example trace of CongestionWindow
[Figure: CongestionWindow (KB, 10 to 70) versus time (1.0 to 9.0). Annotations: slow start until CW = CT; linear increase; CW flattens out due to loss; timeout: CT = CT/2 = 11, CW = 1]
Problem
Have to wait for timeout
Can lose half CongestionWindow of data
Fast Retransmit and Fast
Recovery
Problem
Coarse-grain TCP timeouts lead to idle periods
Solution
Fast retransmit: use duplicate ACKs to trigger
retransmission
[Figure: sender/receiver timeline. The sender sends packets 1 to 6 and packet 3 is lost. The receiver returns ACK 1, ACK 2, then a duplicate ACK 2 for each of packets 4, 5, and 6. The sender retransmits packet 3 and the receiver answers with ACK 6.]
Fast Retransmit and Fast
Recovery
Send ACK for each segment received
When duplicate ACKs are received
Resend lost segment immediately
Do not wait for timeout
In practice, retransmit on 3rd duplicate
Fast recovery
When fast retransmission occurs, skip slow start
Congestion window becomes 1/2 previous
Start additive increase immediately
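The duplicate-ACK rule can be sketched as a counter over the incoming ACK stream. This sketch assumes packet-numbered cumulative ACKs, where ACK n means packets up to n arrived in order, so the missing packet is n + 1. The function names and the integer halving are illustrative choices.

```python
DUP_ACK_THRESHOLD = 3  # retransmit on the 3rd duplicate, as noted above


def find_fast_retransmits(acks: list) -> list:
    """Return the packet numbers that fast retransmit would resend."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                retransmits.append(ack + 1)  # first packet not yet ACKed
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


def cwnd_after_fast_recovery(cwnd: int) -> int:
    """Fast recovery: halve the window and skip slow start."""
    return max(1, cwnd // 2)


# ACK stream when packet 3 is lost and packets 4, 5, 6 still arrive.
print(find_fast_retransmits([1, 2, 2, 2, 2, 6]))
print(cwnd_after_fast_recovery(16))
```

Waiting for the third duplicate rather than the first avoids retransmitting when packets are merely reordered in the network.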
Fast Retransmit and Fast
Recovery
Results
[Figure: CongestionWindow trace (KB, 10 to 70) versus time (1.0 to 7.0) with fast retransmit and fast recovery]
Fast Recovery
Bypass slow start phase
Increase immediately to one half of the last successful
CongestionWindow (ssthresh)
TCP Congestion Window
Trace
[Figure: congestion window and threshold (0 to 70) versus time (0 to 60). Labeled features: slow start period, additive increase, timeouts, fast retransmission]