Chapter 01 - Introduction to Distributed Systems

1 Introduction to Distributed System

1.1 Introduction
1.2 Goals of a Distributed System
1.3 Types of Distributed System
1.4 Characterization of Distributed System

Faizur Rashid (Dr.)


1.1. Introduction and Definition
• Before the mid-80s, computers were
  - very expensive (hundreds of thousands or even millions of dollars)
  - very slow (a few thousand instructions per second)
  - not connected to one another
• After the mid-80s, two major developments:
  - cheap and powerful microprocessor-based computers appeared
  - computer networks
    - LANs at speeds ranging from 10 to 1000 Mbps (now even 10 Gbps)
    - WANs at speeds ranging from 64 Kbps to gigabits/sec

• Consequence
  - it became feasible to use a large network of computers to work on the same application, in contrast to the old centralized systems, where there was a single computer with its peripherals
• Definition of a Distributed System
  - distributed system: a collection of independent computers that appears to its users as a single coherent system (Tanenbaum & Van Steen)
  - this definition has two aspects:
    1. hardware: autonomous machines
    2. software: a single-system view for the users

Other Definitions
• A distributed system is a system designed to support the
development of applications and services which can
exploit a physical architecture consisting of multiple,
autonomous processing elements that do not share
primary memory but cooperate by sending
asynchronous messages over a communication network
(Blair & Stefani)
• A distributed system is one that stops you getting any work done when a machine you have never even heard of crashes (Leslie Lamport)

• Why Distributed?
• Resource and Data Sharing
  - printers, databases, multimedia servers, ...
• Availability, Reliability
  - the loss of some instances can be hidden
• Scalability, Extensibility
  - the system grows with demand (e.g., extra servers)
• Performance
  - huge power (CPU, memory, ...) available
• Inherent distribution, communication
  - organizational distribution, e-mail, video

• Characteristics of Distributed Systems
• differences between the computers and the ways they communicate are hidden from users
• users and applications can interact with a distributed system in a consistent and uniform way regardless of location
• distributed systems should be easy to expand and scale
• a distributed system is normally continuously available, even if there may be partial failures

1.2. Goals of a Distributed System
• to support heterogeneous computers and networks and to provide a
single-system view, a distributed system is often organized by
means of a layer of software called middleware that extends over
multiple machines

• Figure: a distributed system organized as middleware; note that the middleware layer extends over multiple machines and offers each application the same interface
• Note: most diagrams in these slides are taken from the textbook
• A distributed system should
  - easily connect users with resources (printers, computers, storage facilities, data, files, Web pages, ...); some of the reasons:
    - economics: sharing resources such as printers and high-speed computers
    - to collaborate and exchange information
    - groupware: software for collaborative editing, teleconferencing, etc.
    - e-commerce: buying and selling goods
  - be transparent: hide the fact that the resources and processes are distributed across multiple computers
  - be open
  - be scalable
• Transparency in a Distributed System
  - a distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent
• Different forms of transparency in a distributed system

Transparency | Description
Access | Hide differences in data representation (endianness, file naming, ...) and how a resource is accessed
Location | Hide where a resource is physically located; where is http://www.prenhall.com/index.html? (naming)
Migration | Hide that a resource may move to another location
Relocation | Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops and moving from place to place
Replication | Hide that a resource is replicated (for availability and performance); all replicas have the same name
Concurrency | Hide that a resource may be shared by several competitive users; a resource must be left in a consistent state, e.g., through locking
Failure | Hide the failure and recovery of a resource

• Openness in a Distributed System
• a distributed system should be open
• we need well-defined interfaces
• interoperability
  - components of different origin can communicate
• portability
  - components work on different platforms
• another goal of an open distributed system is that it should be flexible and extensible: it should be easy to configure the system out of different components, and easy to add new components or replace existing ones
• an open distributed system is a system that offers services according to standard rules that describe the syntax and semantics of those services; e.g., protocols in networks
• standards are a necessity

• in distributed systems, such services are often specified through interfaces, often described using an Interface Definition Language (IDL)
  - an IDL specifies only syntax: the names of the functions, types of parameters, return values, possible exceptions, ...
  - semantics are given informally by means of natural language
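As a rough illustration of the split between syntax and semantics, the sketch below uses Python's typing.Protocol as a stand-in for an IDL (Python is not an IDL, and the FileService interface and InMemoryFileService class are hypothetical names invented for this example). The interface fixes only names, parameter types and return types; what the operations mean is stated in comments, i.e., in natural language.

```python
from typing import Protocol


class FileService(Protocol):
    """Interface only: names, parameter types and return types (syntax)."""

    def read(self, path: str, offset: int, length: int) -> bytes: ...

    def write(self, path: str, offset: int, data: bytes) -> int: ...


class InMemoryFileService:
    """One possible implementation; any class with the same methods conforms."""

    def __init__(self) -> None:
        self._files = {}  # path -> bytearray

    def read(self, path: str, offset: int, length: int) -> bytes:
        # Semantics (informal): return up to `length` bytes starting at `offset`.
        return bytes(self._files[path][offset:offset + length])

    def write(self, path: str, offset: int, data: bytes) -> int:
        # Semantics (informal): overwrite from `offset`, growing the file if needed.
        buf = self._files.setdefault(path, bytearray())
        if offset > len(buf):
            buf.extend(b"\x00" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)


def copy_prefix(service: FileService, src: str, dst: str, length: int) -> int:
    # Client code depends only on the interface, not on the implementation.
    return service.write(dst, 0, service.read(src, 0, length))
```

Because copy_prefix is written against the interface alone, a different implementation of the same interface could be substituted without changing the client, which is the interoperability and portability argument made above.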
• Scalability in Distributed Systems
• a distributed system should be scalable; there are three dimensions
  - size: adding more users and resources to the system
  - geography: users and resources may be far apart
  - administration: the system should be easy to manage even if it spans many administrative organizations
• but a system often exhibits performance problems as it scales up; examples of scalability limitations:

Concept | Example
Centralized services | A single server for all users (mostly for security reasons)
Centralized data | A single on-line telephone book
Centralized algorithms | Doing routing based on complete information

• Scaling Techniques: how to solve scaling problems
  - the problem is mainly one of performance, and arises from limitations in the capacity of servers and networks (for geographical scalability, from high latency and mostly unreliable links)
  - three possible solutions: hiding communication latencies, distribution, and replication

a. Hide Communication Latencies
• try to avoid waiting for responses to remote service requests
• let the requester do other useful work in the meantime
• i.e., construct requesting applications that use asynchronous rather than synchronous communication; when a reply arrives, the application is interrupted
• good for batch processing and parallel applications, since independent tasks can be scheduled while another task is waiting for communication to complete; non-parallel programs can use multithreading
• hiding communication latencies is not, in general, applicable to interactive applications
• for interactive applications, try to reduce communication by moving part of the job to the client; e.g., when filling in a form to access a database, check the entries on the client
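The idea of not blocking on each remote reply can be sketched with Python's asyncio. This is only a minimal illustration: remote_request is a hypothetical stand-in for a real remote call, and the sleep merely models network latency.

```python
import asyncio


async def remote_request(name: str, delay: float) -> str:
    """Hypothetical remote call; the delay models network latency."""
    await asyncio.sleep(delay)
    return f"reply from {name}"


async def fetch_all() -> list:
    # Both requests are issued before either reply arrives, so the
    # two latencies overlap instead of adding up.
    return await asyncio.gather(
        remote_request("server-a", 0.05),
        remote_request("server-b", 0.05),
    )


replies = asyncio.run(fetch_all())
```

With synchronous calls the requester would wait for server-a before contacting server-b; here the total waiting time is roughly that of the slower request.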


(a) a server checking the correctness of field entries


(b) a client doing the job
• e.g., checking the completeness of mandatory fields
• shipping code is now supported in Web applications using Java
Applets and ActiveX controls (with some security issues)
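Option (b) can be sketched as follows. This is an assumption-laden illustration in Python rather than browser code: missing_fields and submit are hypothetical helpers, and send stands for whatever function actually ships the form to the server.

```python
def missing_fields(form: dict, mandatory: list) -> list:
    """Return the mandatory field names that are absent or blank."""
    missing = []
    for name in mandatory:
        value = form.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def submit(form: dict, mandatory: list, send) -> dict:
    # The check runs on the client, so an incomplete form never
    # costs a round trip to the server.
    missing = missing_fields(form, mandatory)
    if missing:
        return {"sent": False, "missing": missing}
    send(form)
    return {"sent": True, "missing": []}
```

The server would normally still validate what it receives; the client-side check only removes avoidable communication.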
b. Distribution
• means splitting a component into smaller parts and spreading those parts across the system
• e.g., DNS, the Domain Name System, which resolves domain names such as the HU.edu.et in faizur@HU.edu.et
• the name space is divided into non-overlapping zones

• an example of dividing the DNS name space into zones
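A toy model of zone-based resolution is sketched below. The zone table, the host name www and the address 192.0.2.10 (taken from a range reserved for documentation) are all invented for illustration; real DNS servers, records and caching are far richer than this.

```python
# Hypothetical zone data: each zone knows only its own hosts and
# which child zones it has delegated.
ZONES = {
    ".": {"delegations": {"et": "et."}, "hosts": {}},
    "et.": {"delegations": {"edu": "edu.et."}, "hosts": {}},
    "edu.et.": {"delegations": {"hu": "hu.edu.et."}, "hosts": {}},
    "hu.edu.et.": {"delegations": {}, "hosts": {"www": "192.0.2.10"}},
}


def resolve(name: str):
    """Walk from the root zone down, following delegations label by label."""
    labels = name.lower().rstrip(".").split(".")  # e.g. ["www", "hu", "edu", "et"]
    zone = "."
    visited = [zone]
    while labels and labels[-1] in ZONES[zone]["delegations"]:
        zone = ZONES[zone]["delegations"][labels.pop()]
        visited.append(zone)
    host = ".".join(labels)
    # Raises KeyError if the final zone does not know the host.
    return ZONES[zone]["hosts"][host], visited


address, path = resolve("www.HU.edu.et")
```

No single zone holds the whole name space: each lookup step touches only the zone responsible for the next label, which is what lets the naming service scale.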


c. Replication
• replicate components across a distributed system to increase
availability and for load balancing, leading to better performance
• replication is decided by the owner of a resource
• caching (a special form of replication) also reduces communication latency; it is decided by the user of a resource, not by its owner
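Client-side caching can be sketched in a few lines. CachingClient is a hypothetical class, and fetch stands for any function that performs the remote access; the sketch ignores invalidation, which is the hard part in practice because cached copies can become stale.

```python
class CachingClient:
    """Keeps local copies of remote results; the client decides to cache."""

    def __init__(self, fetch):
        self._fetch = fetch      # function performing the (slow) remote access
        self._cache = {}
        self.remote_calls = 0    # counts how often the network was used

    def get(self, key):
        if key not in self._cache:
            self.remote_calls += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]
```

A second get for the same key is served locally, so the communication latency is paid only once per key.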

1.3. Types of Distributed System
Three types: distributed computing systems, distributed information
systems, and distributed pervasive/embedded systems
1. Distributed Computing Systems
- used for high-performance computing tasks
- two types: cluster computing and grid computing
Cluster Computing
• a collection of similar workstations or PCs (homogeneous), closely connected by means of a high-speed LAN
• each node runs the same operating system
• used for parallel programming, in which a single compute-intensive program is run in parallel on multiple machines


• Figure: an example of a cluster computing system
• a master node runs middleware (containing libraries for parallel programs) and controls the other compute nodes
• it allocates tasks to the nodes
• it provides an interface to users, etc.
Grid Computing
• “Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)
• high degree of heterogeneity: no assumptions are made concerning
hardware, operating systems, networks, administrative domains,
security policies, etc.
• Globus is a software system for Grid Computing; read about the
Globus Alliance at http://www.globus.org/
2. Distributed Information Systems
• many networked applications
• Problem: interoperability
• at the lowest level: wrap a number of requests into a single larger
request and have it executed as a distributed transaction; all or none
of the requests would be executed
• how to let applications communicate directly with each other, i.e.,
Enterprise Application Integration (EAI)
a. Transaction Processing Systems
• consider database applications
• special primitives are required to program transactions, supplied
either by the underlying distributed system or by the language
runtime system
• exact list of primitives depends on the type of application;
procedure calls, ordinary statements, etc. can also be included
Primitive | Description
BEGIN_TRANSACTION | Mark the start of a transaction
END_TRANSACTION | Terminate the transaction and try to commit
ABORT_TRANSACTION | Kill the transaction and restore the old values
READ | Read data from a file, a table, or otherwise
WRITE | Write data to a file, a table, or otherwise
• The Transaction Model
• the model for transactions comes from the world of business
• a supplier and a retailer negotiate on
  - price
  - delivery date
  - quality, etc.
• until the deal is concluded they can continue negotiating, or one of them can terminate the negotiation
• but once they have reached an agreement they are bound by law to carry out their part of the deal
• transactions between processes are similar to this scenario
• e.g., assume the following banking operation
  - withdraw an amount x from account 1
  - deposit the amount x to account 2
• what happens if there is a problem after the first activity is carried out?
• group the two operations into one transaction; either both are carried out or neither
• we need a way to roll back when a transaction is not completed
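The banking example can be tried out with Python's built-in sqlite3 module, used here only as a convenient single-machine stand-in for a transactional store. The accounts table and the transfer helper are assumptions made for this sketch; the failure is simulated by depositing into an account that does not exist.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Withdraw from src and deposit to dst as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            # A problem after the first activity: the deposit finds no account.
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                raise LookupError("destination account not found")
        return True
    except LookupError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT id, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

When the deposit fails, the withdrawal that was already executed is rolled back, so account 1 keeps its money; either both updates take effect or neither does.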

• e.g., reserving a seat from Manchester to Lalibella through Heathrow and AA Bole airports

(a)                       | (b)
BEGIN_TRANSACTION         | BEGIN_TRANSACTION
reserve Man → Heathrow;   | reserve Man → Heathrow;
reserve Heathrow → Bole;  | reserve Heathrow → Bole;
reserve Bole → Lalibella; | reserve Bole → Lalibella; full ⇒
END_TRANSACTION           | ABORT_TRANSACTION

(a) transaction to reserve three flights commits
(b) transaction aborts when the third flight is full
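Cases (a) and (b) can be mimicked in memory by working on a private copy and publishing it only on commit. The function reserve_all and the seat table are hypothetical, and a real system would also need locking and durable storage, which this sketch leaves out.

```python
def reserve_all(seats: dict, legs: list) -> bool:
    """Reserve one seat on every leg, or none at all."""
    tentative = dict(seats)            # BEGIN_TRANSACTION: private workspace
    for leg in legs:
        if tentative.get(leg, 0) == 0:
            return False               # ABORT_TRANSACTION: discard the copy
        tentative[leg] -= 1
    seats.update(tentative)            # END_TRANSACTION: publish all changes
    return True


legs = ["Man-Heathrow", "Heathrow-Bole", "Bole-Lalibella"]
seats = {"Man-Heathrow": 2, "Heathrow-Bole": 1, "Bole-Lalibella": 0}
aborted = reserve_all(seats, legs)     # third flight is full, as in case (b)
```

After the aborted attempt the seat counts of the first two flights are unchanged, because the tentative reservations were never published.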

• properties of transactions, often referred to as ACID
1. Atomic: to the outside world, the transaction happens indivisibly; a transaction either happens completely or not at all; intermediate states are not seen by other processes
2. Consistent: the transaction does not violate system invariants; e.g., in an internal transfer in a bank, the amount of money in the bank must be the same as it was before the transfer (the law of conservation of money); this may be violated for a brief period of time, but the violation is not visible to other processes
3. Isolated or Serializable: concurrent transactions do not interfere with each other; if two or more transactions are running at the same time, the final result must look as though all transactions ran sequentially in some order
4. Durable: once a transaction commits, the changes are permanent

• Classification of Transactions
• a transaction can be flat, nested or distributed
• Flat Transaction
  - consists of a series of operations that satisfy the ACID properties
  - simple and widely used, but with some limitations
  - does not allow partial results to be committed or aborted
    - i.e., atomicity is also partly a weakness
    - in our airline reservation example, we may want to accept the first two reservations and find an alternative one for the last
  - some transactions may take too much time

• Nested Transaction
  - constructed from a number of subtransactions; it is logically decomposed into a hierarchy of subtransactions; e.g., the flight reservation can be split into three subtransactions, each accessing a different database
  - the top-level transaction forks off children that run in parallel on different machines, to gain performance or for programming simplicity
  - each child may also execute one or more subtransactions
  - permanence (durability) applies only to the top-level transaction; if it aborts, commits by its children must be undone
• Distributed Transaction
  - a flat transaction that operates on data distributed across multiple machines
  - problem: separate algorithms are needed to handle the locking of data and the committing of the entire transaction

(a) a nested transaction


(b) a distributed transaction

1.4. Characterization of Distributed System

• Distributed systems are undergoing a period of significant change, and this can be traced back to a number of influential trends:
  - the emergence of pervasive networking technology;
  - the emergence of ubiquitous computing coupled with the desire to support user mobility in distributed systems;
  - the increasing demand for multimedia services;
  - the view of distributed systems as a utility.
1.4.1 Pervasive networking and the modern Internet
• The modern Internet is a vast interconnected collection of computer networks of many different types, with the range of types increasing all the time and now including, for example, a wide range of wireless communication technologies such as WiFi, WiMAX, Bluetooth (see Chapter 3) and third-generation mobile phone networks.
• The net result is that networking has become a pervasive resource and devices can be connected (if desired) at any time and in any place.
• Programs running on the computers connected to the Internet interact by passing messages, employing a common means of communication.
• The design and construction of the Internet communication mechanisms (the Internet protocols) is a major technical achievement, enabling a program running anywhere to address messages to programs anywhere else and abstracting over the myriad of technologies mentioned above.
• The Internet is also a very large distributed system. It enables users,
wherever they are, to make use of services such as the World Wide
Web, email and file transfer.
• The set of services is open-ended – it can be extended by the
addition of server computers and new types of service. The figure
shows a collection of intranets – subnetworks operated by
companies and other organizations and typically protected by
firewalls.
• The role of a firewall is to protect an intranet by preventing
unauthorized messages from leaving or entering.

Figure: a typical portion of the Internet (labels: intranet, ISP, backbone, satellite link)

1.4.2 Mobile and ubiquitous computing
Technological advances in device miniaturization and wireless networking
have led increasingly to the integration of small and portable computing
devices into distributed systems. These devices include:
• Laptop computers.
• Handheld devices, including mobile phones, smart phones, GPS-enabled
devices, pagers, personal digital assistants (PDAs), video cameras and
digital cameras.
• Wearable devices, such as smart watches with functionality similar to a
PDA.
• Devices embedded in appliances such as washing machines, hi-fi
systems, cars and refrigerators.

• The portability of many of these devices, together with their ability to
connect conveniently to networks in different places, makes mobile
computing possible.
• Mobile computing is the performance of computing tasks while the user is
on the move, or visiting places other than their usual environment.
• Ubiquitous computing is the harnessing of many small, cheap
computational devices that are present in users’ physical environments,
including the home, office and even natural settings.
• The term ‘ubiquitous’ is intended to suggest that small computing devices
will eventually become so pervasive in everyday objects that they are
scarcely noticed.
• That is, their computational behaviour will be transparently and intimately
tied up with their physical function.

1.4.3 Distributed multimedia systems
• Another important trend is the requirement to support multimedia services
in distributed systems.
• Multimedia support can usefully be defined as the ability to support a
range of media types in an integrated manner.
• One can expect a distributed system to support the storage, transmission
and presentation of what are often referred to as discrete media types,
such as pictures or text messages.
• A distributed multimedia system should be able to perform the same functions for continuous media types such as audio and video; that is, it should be able to store and locate audio or video files, to transmit them across the network (possibly in real time as the streams emerge from a video camera), to support the presentation of the media types to the user and optionally also to share the media types across a group of users.

1.4.4 Distributed computing as a utility
• Physical resources such as storage and processing can be made available to networked computers, removing the need for users to own such resources themselves.
• At one end of the spectrum, a user may opt for a remote storage facility for file storage requirements (for example, for multimedia data such as photographs, music or video) and/or for backups.
• Similarly, this approach would enable a user to rent one or more
computational nodes, either to meet their basic computing needs or
indeed to perform distributed computation.
• At the other end of the spectrum, users can access sophisticated data centres (networked facilities offering access to repositories of often large volumes of data to users or organizations) or indeed computational infrastructure using the sort of services now provided by companies such as Amazon and Google.
• Software services can also be made available across the global Internet using this approach. Indeed, many companies now offer a comprehensive range of services for effective rental, including services such as email and distributed calendars.
• Google, for example, bundles a range of business services under the banner Google Apps.
• The term cloud computing is used to capture this vision of computing as a
utility. A cloud is defined as a set of Internet-based application, storage
and computing services sufficient to support most users’ needs, thus
enabling them to largely or totally dispense with local data storage and
application software.
• Clouds are generally implemented on cluster computers to provide the necessary scale and performance required by such services. A cluster computer is a set of interconnected computers that cooperate closely to provide a single, integrated high-performance computing capability.
1.4.5 Focus
• Users are so accustomed to the benefits of resource sharing that they
may easily overlook their significance. We routinely share hardware
resources such as printers, data resources such as files, and resources
with more specific functionality such as search engines.
• Looked at from the point of view of hardware provision, we share
equipment such as printers and disks to reduce costs. But of far greater
significance to users is the sharing of the higher-level resources that play
a part in their applications and in their everyday work and social activities.
For example, users are concerned with sharing data in the form of a
shared database or a set of web pages – not the disks and processors on
which they are implemented.
• Similarly, users think in terms of shared resources such as a search
engine or a currency converter, without regard for the server or servers
that provide these.

• In practice, patterns of resource sharing vary widely in their scope and in
how closely users work together. At one extreme, a search engine on the
Web provides a facility to users throughout the world, users who need
never come into contact with one another directly.
• We use the term service for a distinct part of a computer system that
manages a collection of related resources and presents their functionality
to users and applications.
• The term server is probably familiar to most readers. It refers to a running
program (a process) on a networked computer that accepts requests from
programs running on other computers to perform a service and responds
appropriately.
• The same process may be both a client and a server, since servers
sometimes invoke operations on other servers. The terms ‘client’ and
‘server’ apply only to the roles played in a single request.
1.4.6 Challenges
• Heterogeneity: The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks. Heterogeneity (that is, variety and difference) applies to all of the following:
  - networks;
  - computer hardware;
  - operating systems;
  - programming languages;
  - implementations by different developers.
• Although the Internet consists of many different sorts of network
(illustrated in Figure 1.3), their differences are masked by the fact that all
of the computers attached to them use the Internet protocols to
communicate with one another. For example, a computer attached to an
Ethernet has an implementation of the Internet protocols over the
Ethernet, whereas a computer on a different sort of network will need an
implementation of the Internet protocols for that network.
• Data types such as integers may be represented in different ways on
different sorts of hardware – for example, there are two alternatives for
the byte ordering of integers. These differences in representation must be
dealt with if messages are to be exchanged between programs running
on different hardware.
• Different programming languages use different representations for
characters and data structures such as arrays and records. These
differences must be addressed if programs written in different languages
are to be able to communicate with one another.
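The two byte orderings mentioned above, and the usual remedy of agreeing on one ordering for messages, can be seen with Python's standard struct module. The helper names encode and decode are chosen for this sketch only.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)      # big-endian: most significant byte first
little = struct.pack("<I", value)   # little-endian: least significant byte first


def encode(n: int) -> bytes:
    # "!" selects network byte order (big-endian), whatever the local hardware.
    return struct.pack("!I", n)


def decode(message: bytes) -> int:
    return struct.unpack("!I", message)[0]


# Reading big-endian bytes as if they were little-endian gives a wrong value.
misread = struct.unpack("<I", big)[0]
```

If sender and receiver both use encode and decode, the integer survives the trip between machines with different native orderings; misread shows what happens when they do not agree.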
• Openness: The openness of a computer system is the characteristic that determines whether the system can be extended and reimplemented in various ways. The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and be made available for use by a variety of client programs.
  - Openness cannot be achieved unless the specification and documentation of the key software interfaces of the components of a system are made available to software developers.
• Security: Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users. Their security is therefore of considerable importance.
  - Security for information resources has three components: confidentiality (protection against disclosure to unauthorized individuals), integrity (protection against alteration or corruption), and availability (protection against interference with the means to access the resources).
• In a distributed system, clients send requests to access data managed by servers, which involves sending information in messages over a network. For example:
  1. A doctor might request access to hospital patient data or send additions to that data.
  2. In electronic commerce and banking, users send their credit card numbers across the Internet.
• In both examples, the challenge is to send sensitive information in a message over a network in a secure manner.
• Scalability: Distributed systems operate effectively and efficiently at many different scales, ranging from a small intranet to the Internet.
  - A system is described as scalable if it will remain effective when there is a significant increase in the number of resources and the number of users.
• The design of scalable distributed systems presents the following challenges:
  - controlling the cost of physical resources
  - controlling the performance loss
  - preventing software resources running out
  - avoiding performance bottlenecks
• Failure handling: Failures in a distributed system are partial; that is, some components fail while others continue to function. Therefore the handling of failures is particularly difficult.
  - Detecting failures: Some failures can be detected. For example, checksums can be used to detect corrupted data in a message or a file.
  - Masking failures: Some failures that have been detected can be hidden or made less severe. Two examples of hiding failures:
    1. Messages can be retransmitted when they fail to arrive.
    2. File data can be written to a pair of disks so that if one is corrupted, the other may still be correct.
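Detection by checksum and masking by retransmission can be combined in a small sketch. It uses zlib.crc32 from the Python standard library; the 4-byte framing, the function names and the channel callable (a stand-in for an unreliable network) are assumptions made for this example.

```python
import zlib
from typing import Callable, Optional


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 checksum (4 bytes, big-endian)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def unframe(message: bytes) -> Optional[bytes]:
    """Return the payload, or None if the checksum reveals corruption."""
    checksum, payload = message[:4], message[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None
    return payload


def send_with_retry(
    payload: bytes,
    channel: Callable[[bytes], bytes],
    attempts: int = 3,
) -> Optional[bytes]:
    # Masking: a detected failure is hidden by retransmitting the message.
    for _ in range(attempts):
        received = unframe(channel(frame(payload)))
        if received is not None:
            return received
    return None  # the failure could not be masked and must be tolerated
```

A channel that corrupts only the first transmission is invisible to the caller, who simply receives the payload after the second attempt.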
  - Tolerating failures: Most of the services in the Internet do exhibit failures; it would not be practical for them to attempt to detect and hide all of the failures that might occur in such a large network with so many components.
  - Recovery from failures: Recovery involves the design of software so that the state of permanent data can be recovered or ‘rolled back’ after a server has crashed.
• Concurrency: The process that manages a shared resource could take one client request at a time, but that approach limits throughput. Therefore services and applications generally allow multiple client requests to be processed concurrently.
  - For example, if two concurrent bids at an auction are ‘Smith: $122’ and ‘Jones: $111’, and the corresponding operations are interleaved without any control, then they might get stored as ‘Smith: $111’ and ‘Jones: $122’.
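One common way to control such interleaving is a lock around the operations that belong together. In the sketch below, BidLog is a hypothetical class that stores the bidder and the amount in two separate steps, which is exactly where an uncontrolled interleaving could pair a name with the wrong amount.

```python
import threading


class BidLog:
    """Stores bids as two parallel lists, guarded by one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._names = []
        self._amounts = []

    def record(self, name: str, amount: int) -> None:
        # Both appends happen while holding the lock, so no other
        # thread can slip its own name or amount in between them.
        with self._lock:
            self._names.append(name)
            self._amounts.append(amount)

    def entries(self) -> list:
        with self._lock:
            return list(zip(self._names, self._amounts))
```

With the lock in place every stored pair is either (Smith, 122) or (Jones, 111), regardless of how the threads are scheduled.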
• Transparency
Assignment #1
Q1. Use the World Wide Web as an example to illustrate the
concept of resource sharing, client and server. What are
the advantages and disadvantages of HTML, URLs
and HTTP as core technologies for information
browsing? Are any of these technologies suitable as a
basis for client-server computing in general?
Q2. Describe the trends, focus, and challenges in the context of the
World Wide Web.
Q3. List the three main software components that may fail
when a client process invokes a method in a server
object, giving an example of a failure in each case.
Suggest how the components can be made to tolerate
one another’s failures.
