On Distributed Communications Networks
BY
PAUL BARAN
Corporation or the official opinion or policy of any of its
governmental or private research sponsors.
munications Networks," The RAND Corporation, Santa Monica,
Calif., paper P-2359; July 5, 1961.
2 IEEE TRANSACTIONS ON COMMUNICATIONS SYSTEMS March
Fig. 2-Definition of redundancy level. [Figure not reproduced;
legible panel labels: R = 1, R = 1.5, R = 2, R = 3, R = 4.]
Fig. 3-An array of stations. [Figure not reproduced.]
it is still possible to draw a line to connect the ith station
to the jth station, the ith and jth stations are said to be
connected.
Node Destruction
Fig. 4 indicates network performance as a function of the
probability of destruction for each separate node. If the
expected "noise" was destruction caused by conventional
hardware failure, the failures would be randomly distributed
through the network. But if the disturbance were caused by
enemy attack, the possible "worst cases" must be considered.
To bisect a 32-link network requires direction of 288 weapons
each with a probability of kill, pk = 0.5, or 160 with a
pk = 0.7, to produce over an 0.9 probability of successfully
bisecting the network. If hidden alternative command is
allowed, then the largest single group would still have an
expected value of almost 50 per cent of the initial stations
surviving intact. If this raid misjudges complete availability
of weapons, complete knowledge of all links in the cross
section, or the effects of the weapons against each and every
link, the raid fails. The high risk of such raids against
highly parallel structures causes examination of alternative
attack policies. Consider the following uniform raid example.
Assume that 2000 weapons are deployed against a 1000-station
network. The stations are so spaced that destruction of two
stations with a single weapon is unlikely. Divide the 2000
weapons into two equal 1000-weapon salvos. Assume any
probability of destruction of a single node from a single
weapon less than 1.0; for example, 0.5. Each weapon on the
first salvo has a 0.5 probability of destroying its target.
But, each weapon of the second salvo has only a 0.25
probability, since one half the targets have already been
destroyed. Thus, the uniform attack is felt to represent a
worst-case configuration.
Such worst-case attacks have been directed against an
18 × 18-array network model of 324 nodes with varying
probability of kill and redundancy level, with results shown
in Fig. 4. The probability of kill was varied from zero to
unity along the abscissa, while the ordinate marks
survivability. The criterion of survivability used is the
percentage of stations not physically destroyed and remaining
in communication with the largest single group of surviving
stations. The curves of Fig. 4 demonstrate survivability as a
function of attack level for networks of
1964 Baran: On Distributed Communications Networks 3
be built using a moderately low redundancy of connectivity
level. Redundancy levels on the order of only three permit
the withstanding of extremely heavy level attacks with
negligible additional loss to communications. Secondly, the
survivability curves have sharp break points. A network of
this type will withstand an increasing attack level until a
certain point is reached, beyond which the network rapidly
deteriorates. Thus, the optimum degree of redundancy can be
chosen as a function of the expected level of attack. Further
redundancy gains little. The redundancy level required to
survive even very heavy attacks is not great; it is on the
order of only three or four times that of the minimum span
network.
Link Destruction
In the previous example we have examined network performance
as a function of the destruction of the nodes (which are
better targets than links). We shall now re-examine the same
network, but using unreliable links. In particular, we want
to know how unreliable the links may be without further
degrading the performance of the network.
Fig. 5 shows the results for the case of perfect nodes; only
the links fail. There is little system degradation caused
even using extremely unreliable links, on the order of 50 per
cent down time, assuming all nodes are working.
Combination Link and Node Destruction
The worst case is the composite effect of failures of both
the links and the nodes. Fig. 6 shows the effect of link
failure upon a network having 40 per cent of its nodes
destroyed. It appears that what would today be regarded as an
unreliable link can be used in a distributed network almost
as effectively as perfectly reliable links. Fig. 7 examines
the result of 100 trial cases in order to estimate the
probability density distribution of system performance for a
mixture of node and link failures. This is the distribution
of cases for 20 per cent nodal damage and 35 per cent link
damage.
Fig. 7-Probability density distribution of largest fraction
of stations in communication: perfect switching, R = 3, 100
cases, 80 per cent node survival, 65 per cent link survival.
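The survivability criterion used in this section (the share of all stations that remain in the largest connected group after an attack) can be estimated with a small Monte Carlo experiment. The sketch below is illustrative only and is not the simulation behind Figs. 4 to 7: it assumes a plain square grid in which each station links to its four nearest neighbours, independent node and link losses, and perfect switching along any surviving path. The function names and the default of 100 trials are assumptions made for the example.

```python
import random
from collections import deque


def largest_group_fraction(side, p_node_kill, p_link_kill, rng):
    """Fraction of all side*side stations in the largest surviving group."""
    # A station survives when its random draw is at or above the kill probability.
    alive = {(r, c) for r in range(side) for c in range(side)
             if rng.random() >= p_node_kill}
    # Assumed topology: each station links to its right and lower neighbour,
    # and each such link survives independently of the others.
    links = {}
    for (r, c) in sorted(alive):
        for nb in ((r, c + 1), (r + 1, c)):
            if nb in alive and rng.random() >= p_link_kill:
                links.setdefault((r, c), []).append(nb)
                links.setdefault(nb, []).append((r, c))
    # Breadth-first search over survivors to size each connected group.
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in links.get(node, ()):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best / (side * side)


def mean_survivability(side, p_node_kill, p_link_kill, trials=100, seed=0):
    """Average the criterion over a number of independent trial raids."""
    rng = random.Random(seed)
    total = sum(largest_group_fraction(side, p_node_kill, p_link_kill, rng)
                for _ in range(trials))
    return total / trials
```

Calling `mean_survivability(18, 0.2, 0.35)` mimics the damage mix quoted for Fig. 7 on an 18 × 18 array, although the grid assumed here is less redundant than the R = 3 network of that figure, so the numbers are not comparable.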
DIVERSITY OF ASSIGNMENT
There is another and more common technique for using
redundancy than in the method described above in which each
station is assumed to have perfect switching ability. This
alternative approach is called "diversity of assignment." In
diversity of assignment, switching is not required. Instead,
a number of independent paths are selected between each pair
of stations in a network which requires reliable
communications. However, there are marked differences in
performance between distributed switching and redundancy of
assignment as revealed by the following Monte Carlo
simulation.
Simulation
In the matrix of N separate stations, each ith station is
connected to every jth station by three shortest but totally
separate independent paths (i = 1, 2, 3, ..., N;
j = 1, 2, 3, ..., N; i ≠ j). A raid is laid against the
network. Each of the preassigned separate paths from the ith
station to the jth station is examined. If one or more of the
preassigned paths survive, communication is said to exist
between the ith and the jth station. The criterion of
survivability used is the mean number of stations connected
to each station, averaged over all stations.
Unlike the distributed perfect switching case, Fig. 8 shows
that there is a marked loss in communications capability with
even slightly unreliable nodes or links. The difference can
be visualized by remembering that fully flexible switching
permits the communicator the privilege of ex post facto
decision of paths. Fig. 8 emphasizes a key difference between
some present-day networks and the fully flexible distributed
network we are discussing.
Comparison with Present Systems
Present conventional switching systems try only a small
subset of the potential paths that can be drawn on a gridded
network. The greater the percentage of potential paths
tested, the closer one approaches the performance of perfect
switching. Thus, perfect switching provides an upper bound of
expected system performance for a gridded network; the
diversity of assignment case provides a lower bound. Between
these two limits lie systems composed of a mixture of
switched routes and diversity of assignment.
Diversity of assignment is useful for short paths,
eliminating the need for switching, but requires
survivability and reliability for each tandem element in
long-haul circuits passing through many nodes. As every
component in at least one out of a small number of possible
paths must be simultaneously operative, high reliability
margins and full standby equipment are usual.
ON FUTURE SYSTEMS
We will soon be living in an era in which we cannot guarantee
survivability of any single point. However, we can still
design systems in which system destruction requires the enemy
to pay the price of destroying n of n stations. If n is made
sufficiently large, it can be shown that highly survivable
system structures can be built, even in the thermonuclear
era. In order to build such networks and systems we will have
to use a large number of elements. We are interested in
knowing how inexpensive these elements may be and still
permit the system to operate reliably. There is a strong
relationship between element cost and element reliability.
To design
a system that must anticipate a worst-case destruction
of both enemy attack and normal system failures, one
can combine the failures expected by enemy attack to-
gether with the failures caused by normal reliability
problems, provided the enemy does not know which
elements are inoperative. Our future systems design
problem is that of building at lowest cost very reliable
systems out of the described set of unreliable elements.
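The remark that attack losses and ordinary reliability failures can be combined, provided the enemy does not know which elements are already inoperative, amounts to treating the two causes as independent. A minimal sketch of that arithmetic follows; the function name and the sample figures are assumptions chosen for illustration, not values from the text.

```python
def operative_fraction(p_attack_loss, p_reliability_loss):
    """Expected fraction of elements still working.

    Assumes the two loss mechanisms act independently, which is the
    condition that the attacker cannot target already-failed elements.
    """
    return (1.0 - p_attack_loss) * (1.0 - p_reliability_loss)


# Illustrative numbers only: 40 per cent lost to attack, 10 per cent down
# for ordinary reasons, leaving 54 per cent of the elements operative.
combined_loss = 1.0 - operative_fraction(0.4, 0.1)
```

The combined loss can then be used as a single probability of element failure when reading survivability curves of the kind discussed earlier.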
In choosing the communications links of the future,
digital links appear increasingly attractive by permitting
low-cost switching and low-cost links. For example,
if "perfect switching" is used, digital links are manda-
tory to permit tandem connection of many separately
connected links without cumulative errors reaching
an irreducible magnitude. Further, the signaling measures
to implement highly flexible switching doctrines always
require digits.
Fig. 8-Diversity of assignment vs perfect switching in a
distributed network. [Figure not reproduced; abscissa: single
node probability of kill, with ticks from 0.1 to 0.6.]
Future Low-Cost All-Digital Communications Links
When one designs an entire system optimized for digits
and high redundancy, certain new communications link
techniques appear more attractive than those common
today. A key attribute of the new media is that it permits
cheap formation of new routes, yet allows transmission
on the order of a million or so bits per second, high enough
to be economic yet low enough to be inexpensively
processed with existing digital computer techniques at the
relay station nodes. Reliability and raw error rates are
secondary. The network must be built with the expectation of
heavy damage anyway. Powerful error removal methods exist.
Some of the communication construction methods that look
attractive for the near future include pulse regenerative
repeater line, minimum-cost or "mini-cost" microwave, TV
broadcast station digital transmission and satellites.
Pulse Regenerative Repeater Line: S. F. B. Morse's
regenerative repeater invention for amplifying weak
telegraphic signals has recently been resurrected and
transistorized. Morse's electrical relay permits
amplification of weak binary telegraphic signals above a
fixed threshold. Experiments by various organizations
(primarily the Bell Telephone Laboratories) have shown that
digital data rates on the order of 1.5 million bits per
second can be transmitted over ordinary telephone line at
repeater spacings on the order of 6000 feet for 22-gage pulp
paper insulated copper pairs. At present, more than 20
tandemly connected amplifiers have been used without retiming
synchronization problems. There appears to be no fundamental
reason why either lines of lower loss, with corresponding
further repeater spacing, or more powerful resynchronization
methods cannot be used to extend link distances to in excess
of 200 miles. Such distances would be desired for a possible
national distributed network. Power to energize the miniature
transistor amplifier is transmitted over the copper circuit
itself.
"Mini-Cost" Microwave: While the price of microwave equipment
has been declining, there are still untapped major savings.
In an analog signal network we require a high degree of
reliability and very low distortion for each tandem repeater.
However, using digital modulation together with perfect
switching we minimize these two expensive considerations from
our planning. We would envision the use of low-power,
mass-produced microwave receiver/transmitter units mounted on
low-cost, short, guyed towers. Relay station spacing would
probably be on the order of 20 miles. Further economies can
be obtained by only a minimal use of standby equipment and
reduction of fading margins. The ability to use alternate
paths permits consideration of frequencies normally troubled
by rain attenuation problems, reducing the spectrum
availability problem. Preliminary indications suggest that
this approach appears to be the cheapest way of building
large networks of the type to be described.
TV Stations: With proper siting of receiving antennas,
broadcast television stations might be used to form
additional high data rate links in emergencies.
Satellites: The problem of building a reliable network using
satellites is somewhat similar to that of building a
communications network with unreliable links. When a
satellite is overhead, the link is operative. When a
satellite is not overhead, the link is out of service. Thus,
such links are highly compatible with the type of system to
be described.
Variable Data Rate Links
In a conventional circuit-switched system each of the tandem
links requires matched transmission bandwidths. In order to
make fullest use of a digital link, the post-error-removal
data rate would have to vary, as it is a function of noise
level. The problem then is to build a communication network
made up of links of variable data rate to use the
communication resource most efficiently.
Variable Data Rate Users
We can view both the links and the entry point nodes of a
multiple-user all-digital communications system as elements
operating at an ever-changing data rate. From instant to
instant the demand for transmission will vary. We would like
to take advantage of the average demand over all users
instead of having to allocate a full peak demand channel to
each. Bits can become a common denominator of loading and we
would like to efficiently handle both those users who make
highly intermittent bit demands on the network and those who
make long-term continuous, low-bit demands.
Common User
In communications, as in transportation, it is most economic
for many users to share a common resource rather than each to
build his own system, particularly when supplying
intermittent or occasional service. This intermittency of
service is highly characteristic of digital communication
requirements. Therefore, we would like to consider one day
the interconnection of many all-digital links to provide a
resource optimized for the handling of data for many
potential intermittent users: a new common-user system.
Fig. 9 demonstrates the basic notion. A wide mixture of
different digital transmission links is combined to form a
common resource divided among many potential users. But each
of these communications links could possibly have a different
data rate. How can links of different data rates be
interconnected?
Fig. 9-All-digital network composed of mixture of links.
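The case made under "Variable Data Rate Users" for serving the average demand rather than reserving a peak channel per user can be illustrated numerically. The sketch below is an assumption-laden toy model, not part of the original study: each user is taken to be either silent or sending at a fixed peak rate, independently of the others, and the shared capacity is sized at a chosen quantile of the total instantaneous demand. The function name, the 0.99 quantile, and the sample count are all illustrative choices.

```python
import random


def shared_vs_dedicated(n_users, peak_bps, duty_cycle,
                        samples=2000, seed=0, quantile=0.99):
    """Compare per-user peak allocation with one shared resource.

    Returns (dedicated_capacity, shared_capacity, mean_demand) in bits/sec.
    """
    rng = random.Random(seed)
    loads = []
    for _ in range(samples):
        # Count how many users happen to be transmitting at this instant.
        active = sum(1 for _ in range(n_users) if rng.random() < duty_cycle)
        loads.append(active * peak_bps)
    loads.sort()
    # Capacity that covers the chosen fraction of sampled instants.
    shared = loads[min(samples - 1, int(quantile * samples))]
    dedicated = n_users * peak_bps
    return dedicated, shared, sum(loads) / samples
```

For example, `shared_vs_dedicated(100, 150.0, 0.1)` compares one hundred intermittent 150 bits/sec users: the dedicated figure is simply one hundred peak channels, while the shared figure tracks the pooled demand, which sits much closer to the average.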
A MODEL ALL-DIGITAL DISTRIBUTED SYSTEM
A future system, incorporating the features outlined in the
preceding section, has been modeled and simulated. The key
attribute of the system is in its switching scheme. But prior
to considering the way in which the system would work, some
thought must be given to message format standardization.
Present common carrier communications networks, used for
digital transmission, use links and concepts originally
designed for another purpose: voice. These systems are built
around a frequency division multiplexing link-to-link
interface standard. The standard between links is that of
data rate. Time division multiplexing appears so natural to
data transmission that we might wish to consider an
alternative approach, a standardized message block as a
network interface standard. While a standardized message
block is common in many computer-communications applications,
no serious attempt has ever been made to use it as a
universal standard. A universally standardized message block
would be composed of perhaps 1024 bits. Most of the message
block would be reserved for whatever type data is to be
transmitted, while the remainder would contain housekeeping
information such as error detection and routing data, as in
Fig. 10.
Fig. 10-Message block. [Figure not reproduced; legible labels:
MESSAGE, 150 BITS/SEC, BUFFER, TIME T1, BUFFER, MESSAGE,
1,540,000 BITS/SEC.]
As we move to the future, there appears to be an increasing
need for a standardized message block for our all-digital
communications networks. As data rates increase, the velocity
of propagation over long links becomes an increasingly
important consideration.³ We soon reach a point where more
time is spent setting the switches in a conventional
circuit-switched system for short holding-time messages than
is required for actual transmission of the data.
Most importantly, standardized data blocks permit many
simultaneous users, each with widely different bandwidth
requirements, to economically share a broad-band network made
up of varied data rate links. The standardized message block
simplifies construction of very high speed switches. Every
user connected to the network can feed data at any rate up to
a maximum value. The user's traffic is stored until a full
data block is received by the first station. This block is
rubber stamped with a heading and return address, plus
additional housekeeping information. Then it is transmitted
into the network.
In order to build a network with the survivability properties
shown in Fig. 4, we must use a switching scheme able to find
any possible path that might exist after heavy damage. The
routing doctrine should find the shortest possible path and
avoid self-oscillatory or "ring-around-the-rosey" switching.
We shall explore the possibilities of building a "real-time"
data transmission system using store-and-forward techniques.
The high data rates of the future carry us into a hybrid zone
between store-and-forward and circuit switching. The system
to be described is clearly store and forward if one examines
the operations at each node singularly. But, the network user
who has called up a "virtual connection" to an end station
and has transmitted messages across the United States in a
fraction of a second might also view the system as a black
box providing an apparent circuit connection across the
country. There are two requirements that must be met to build
such a quasi-real-time system. First, the in-transit storage
at each node should be minimized to prevent undesirable time
delays. Secondly, the shortest instantaneously available path
through the network should be found with the expectation that
the status of the network will be rapidly changing. Microwave
will be subject to fading interruptions and there will be
rapid moment-to-moment variations in input loading. These
problems place difficult requirements upon the switching.
However, the development of digital computer technology has
advanced so rapidly that it now appears possible to satisfy
these requirements by a moderate amount of digital equipment.
What is envisioned is a network of unmanned digital switches
implementing a self-learning policy at each node, without
need for a central and possibly vulnerable control point, so
that over-all traffic is effectively routed in a changing
environment. One particularly simple routing scheme examined
is called the "hot-potato" heuristic routing doctrine and
will be described in detail.
Torn-tape telegraph repeater stations and our mail system
provide examples of conventional store-and-forward switching
systems. In these systems, messages are relayed from station
to station and stacked until the "best" outgoing link is
free. The key feature of store-and-forward transmission is
that it allows a high line occupancy factor by storing so
many messages at each node that there is a backlog of traffic
awaiting transmission. But the price for link efficiency is
the price paid in storage capacity and time delay. However,
it was found that most of the advantages of store-and-forward
switching could be obtained with extremely little storage at
the nodes.
³ 3000 miles at ≈150,000 miles/sec ≈ 50 msec transmission
time, T. 1024-bit message at 1,500,000 bits/sec ≈ 2/3 msec
message time, M. Therefore, T ≫ M.
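The comparison in footnote 3 between transmission time T and message time M can be checked with a few lines of arithmetic. The sketch below uses only the footnote's own inputs; the function names are assumptions for the example. Plain division of the quoted distance by the quoted velocity gives about 20 msec, somewhat below the roughly 50 msec the footnote states, and either value leaves T far larger than M.

```python
def transmission_time_s(distance_miles, velocity_miles_per_s):
    """Propagation time over a link of the given length."""
    return distance_miles / velocity_miles_per_s


def message_time_s(block_bits, rate_bps):
    """Time to clock one message block onto a link."""
    return block_bits / rate_bps


T = transmission_time_s(3000, 150_000)   # 0.02 s by direct division
M = message_time_s(1024, 1_500_000)      # about 0.68 msec, i.e. roughly 2/3 msec
blocks_in_flight = T / M                 # how many blocks fit "on the wire" at once
```

With these inputs a few dozen message blocks can be in transit on a single cross-country link at once, which is why switch setup time and propagation delay dominate over the time needed to send one block.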
Thus, in the system to be described, each node will attempt
to get rid of its messages by choosing alternate routes if
its preferred route is busy or destroyed. Each message is
regarded as a "hot potato," and rather than hold the hot
potato, the node tosses the message to its neighbor who will
now try to get rid of the message.
for each major station of the network allowed to generate
traffic. A column is assigned to each separate link connected
to a node. As it was shown that redundancy levels on the
order of four can create extremely "tough" networks and that
additional redundancy can bring little, only about eight
columns are really needed.
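The hot-potato rule described above (send on the preferred link, and fall back to the next-best link when that one is busy or destroyed) can be sketched as a table lookup. The sketch is an illustration under stated assumptions rather than the switching logic of the system being described: it assumes that each node keeps, for every destination station, one handover number per outgoing link, and that a lower handover number marks a better link. The function name, the tie-break on link number, and the sample row are hypothetical.

```python
def choose_link(handover_row, unavailable):
    """Pick the outgoing link for one message.

    handover_row maps link number -> handover number for the message's
    destination (assumed: lower is better). unavailable is the set of
    links that are currently busy or destroyed. Returns None when every
    link is unavailable.
    """
    # Rank links from best to worst; ties are broken by link number.
    ranked = sorted(handover_row, key=lambda link: (handover_row[link], link))
    for link in ranked:
        if link not in unavailable:
            return link
    return None


# Hypothetical row for one destination on a node with four links.
row = {1: 7, 2: 10, 3: 12, 4: 13}
first_choice = choose_link(row, unavailable=set())
fallback = choose_link(row, unavailable={1})
```

Because the decision uses only the node's own table and the current state of its own links, no central control point is involved, which matches the design goal stated for the switching scheme.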
[Table not reproduced; legible headings: LINK NUMBER (1 to 8),
HANDOVER NUMBER ENTRIES, LINK NUMBER for DECISION CHOICE, BEST
CHOICE (1st to 4th). One partly legible row, labeled F, reads
7 10 12 13.]
The Postman Analogy: The switching process in any
store-and-forward system is analogous to a postman sorting
mail. A postman sits at each switching node. Messages arrive
simultaneously from all links. The postman records bulletins
describing the traffic loading status
can infer the "best" paths to transmit mail to any station