
Course Name : Distributed Systems

Course Code : CSE 2052


Module 2: Communication

OSI Model
To make it easier to deal with the numerous levels and issues involved in communication, the International Standards Organization (ISO) developed a reference model that clearly identifies the various levels involved, gives them standard names, and points out which level should do which job.

This model is called the Open Systems Interconnection (OSI) Reference Model.
Communication Protocol
The OSI model is designed to allow open systems to communicate.

An open system is one that is prepared to communicate with any other open system by using standard rules that govern the format, contents, and meaning of the messages sent and received.

These rules are formalized and called communication protocols.
Communication Protocol
A protocol is said to provide a communication service, of which there are two types.

With a connection-oriented service, before exchanging data the sender and receiver first explicitly establish a connection, and possibly negotiate specific parameters of the protocol they will use.

With a connectionless service, no setup in advance is needed; the sender just transmits the first message when it is ready.
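As a small illustration (not from the slides), Python's standard socket module exposes both service types directly; the addresses below are placeholders:

import socket

# Connection-oriented (TCP): a connection is explicitly established before data flows.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("example.org", 80))             # connection setup, possibly negotiating parameters
tcp.sendall(b"HEAD / HTTP/1.0\r\nHost: example.org\r\n\r\n")
reply = tcp.recv(4096)                       # data exchanged over the established connection
tcp.close()

# Connectionless (UDP): no setup in advance; the sender just transmits its first message.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"ping", ("192.0.2.1", 9999))     # fire-and-forget datagram to a placeholder address
udp.close()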
Layered Protocols
Computer networks are implemented using the concept of layered protocols.

According to this concept, the protocols of a network are organized into a series of layers in such a way that each layer contains protocols for exchanging data and providing functions in a logical sense with peer entities at other sites in the network.

Entities in adjacent layers interact in a physical sense through the common interface defined between the two layers by passing parameters such as headers, trailers, and data parameters.
Layered Protocols
• A typical message as it appears on the network.
Layered Protocols
• The interface consists of a set of operations that together define the service the layer is prepared to offer.
• Physical layer: deals with standardizing how two computers are connected and how 0s and 1s are represented.
• Data link layer: provides the means to detect and possibly correct transmission errors, as well as protocols to keep a sender and receiver at the same pace.
• Network layer: contains the protocols for routing a message through a computer network, as well as protocols for handling congestion.
Layered Protocols
• Transport layer: mainly contains protocols for directly supporting applications, such as those that establish reliable communication or support real-time streaming of data.
• Session layer: provides support for sessions between applications.
• Presentation layer: prescribes how data is represented in a way that is independent of the hosts on which communicating applications are running.
• Application layer: essentially everything else: e-mail protocols, Web access protocols, file-transfer protocols, and so on.
Communication Protocols
• Protocols are agreements/rules on communication
• Protocols could be connection-oriented or connectionless
Client-Server TCP

a) Normal operation of TCP.


b) Transactional TCP.
Communication Between Processes
• Unstructured communication
• Use shared memory or shared data structures

• Structured communication
• Use explicit messages (IPCs)

• Distributed Systems: both need low-level communication support (why?)
Middleware Protocols
• Middleware: layer that resides between an OS and an application
• May implement general-purpose protocols that warrant their own layers
• Example: distributed commit
Middleware Protocols
• The Domain Name System (DNS) is a distributed service that is used to look up a network address associated with a name, such as the address of a so-called domain name.
• Distributed commit protocols establish that in a group of processes, possibly spread out across a number of machines, either all processes carry out a particular operation, or the operation is not carried out at all.
• This phenomenon is also referred to as atomicity and is widely applied in transactions.
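As a quick illustration of the DNS example above, Python's standard library can query the name-to-address mapping directly; the domain name used here is only an example:

import socket

# Ask the (distributed) DNS service for a network address associated with a name.
print(socket.gethostbyname("example.org"))   # prints an IPv4 address string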
Middleware Protocols
• A distributed locking protocol by which a resource can be
protected against simultaneous access by a collection of
processes that are distributed across multiple machines.
Types of Communication
An electronic mail system is a typical example in which communication is persistent.

With persistent communication, a message that has been submitted for transmission is stored by the communication middleware as long as it takes to deliver it to the receiver.

In contrast, with transient communication, a message is stored by the communication system only as long as the sending and receiving applications are executing.
Types of Communication
The characteristic feature of asynchronous communication is that a sender continues immediately after it has submitted its message for transmission.

This means that the message is (temporarily) stored immediately by the middleware upon submission.

With synchronous communication, the sender is blocked until its request is known to be accepted.
Types of Communication
There are essentially three points where synchronization can take place:

• First, the sender may be blocked until the middleware notifies that it will take over transmission of the request.

• Second, the sender may synchronize until its request has been delivered to the intended recipient.

• Third, synchronization may take place by letting the sender wait until its request has been fully processed, that is, up to the time that the recipient returns a response.
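A minimal sketch (an illustration, not part of the slides) of the asynchronous/synchronous distinction, with a Python queue standing in for the communication middleware and an event used as the acknowledgement:

import queue
import threading

middleware = queue.Queue()            # stands in for the communication middleware

def receiver():
    while True:
        msg, ack = middleware.get()
        # ... process msg ...
        if ack is not None:
            ack.set()                 # signal a synchronous sender that the request was handled

threading.Thread(target=receiver, daemon=True).start()

def send_async(msg):
    middleware.put((msg, None))       # sender continues immediately after submission

def send_sync(msg):
    done = threading.Event()
    middleware.put((msg, done))
    done.wait()                       # sender blocks until the request has been processed

send_async(b"hello")                  # returns at once
send_sync(b"hello")                   # returns only after the receiver has handled the message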
Types of Communication
Client-Server Communication Model
• Structure: group of servers offering service to clients
• Based on a request/response paradigm
• Techniques:
• Socket, remote procedure calls (RPC), Remote Method Invocation (RMI)
Issues in Client-Server Communication
• Addressing

• Blocking versus non-blocking

• Buffered versus unbuffered

• Reliable versus unreliable

• Server architecture: concurrent versus sequential

• Scalability
Remote Procedure Calls
• Goal: Make distributed computing look like centralized computing

• Allow remote services to be called as procedures


• Transparency with regard to location, implementation, language

• Issues
• How to pass parameters

• Bindings

• Semantics in face of errors

• Two classes: integrated into the programming language, or separate
RPC Model
The RPC model is similar to the well-known and well-understood
procedure call model used for the transfer of control and data
within a program in the following manner:
RPC Model
1. For making a procedure call, the caller places arguments to the
procedure in some well-specified location.
2. Control is then transferred to the sequence of instructions
that constitutes the body of the procedure.
3. The procedure body is executed in a newly created execution
environment that includes copies of the arguments given in the
calling instruction.
4. After the procedure's execution is over, control returns to the
calling point, possibly returning a result.
RPC Model
The RPC mechanism is an extension of the procedure call
mechanism in the sense that it enables a call to be made
to a procedure that does not reside in the address space
of the calling process. The called procedure (commonly
called remote procedure) may be on the same computer
as the calling process or on a different computer.
RPC Model
When a remote procedure call is made, the caller and the callee processes interact in the following manner:
1.The caller (commonly known as client process) sends a call
(request) message to the callee (commonly known as server
process) and waits (blocks) for a reply message. The request
message contains the remote procedure's parameters, among
other things.
2. The server process executes the procedure and then returns
the result of procedure execution in a reply message to the client
process.
3. Once the reply message is received, the result of procedure
execution is extracted, and the caller's execution is resumed.
Conventional Procedure Call
a) Parameter passing in a local procedure call: the stack before the call to read
b) The stack while the called procedure is active
Parameter Passing
• Local procedure parameter passing
• Call-by-value
• Call-by-reference: arrays, complex data structures
• Remote procedure calls simulate this through:
• Stubs – proxies
• Flattening – marshalling
• Related issue: global variables are not allowed in RPCs
Client and Server Stubs
• Principle of RPC between a client and server program.
Stubs
• Client makes procedure call (just like a local procedure call) to the
client stub

• Server is written as a standard procedure

• Stubs take care of packaging arguments and sending messages

• Packaging parameters is called marshalling

• Stub compiler generates stubs automatically from specs in an Interface Definition Language (IDL)
• Simplifies programmer task
Steps of a Remote Procedure Call
1. Client procedure calls client stub in normal way

2. Client stub builds message, calls local OS

3. Client's OS sends message to remote OS

4. Remote OS gives message to server stub

5. Server stub unpacks parameters, calls server

6. Server does work, returns result to the stub

7. Server stub packs it in message, calls local OS

8. Server's OS sends message to client's OS

9. Client's OS gives message to client stub

10. Stub unpacks result, returns to client


Example of an RPC
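A compact way to see the steps above in practice is Python's standard xmlrpc machinery, which plays the role of the stubs; the procedure name add and the port number are illustrative choices, not part of the slides. The server and client run as separate processes:

# server.py -- the remote procedure lives in the server's address space
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    return a + b                          # written as an ordinary procedure (steps 5-7 are hidden)

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_function(add, "add")
server.serve_forever()

# client.py -- the call looks like a local procedure call (steps 1-4 and 8-10 are hidden)
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))                    # client stub marshals the arguments, waits for the reply: 5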
Marshalling
• Problem: different machines have different data formats
• Intel: little endian, SPARC: big endian
• Solution: use a standard representation
• Example: external data representation (XDR)
• Problem: how do we pass pointers?
• If it points to a well-defined data structure, pass a copy and the server stub passes a pointer to the local copy
• What about data structures containing pointers?
• Prohibit
• Chase pointers over network
• Marshalling: transform parameters/results into a byte stream
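A tiny sketch of marshalling into a machine-independent byte stream with Python's struct module, forcing big-endian (network) byte order so that little-endian and big-endian hosts agree; the record layout is invented for illustration:

import struct

HEADER = ">idI"                           # int32, float64, string length -- all big-endian

def marshal(count, value, name):
    data = name.encode("utf-8")
    return struct.pack(HEADER, count, value, len(data)) + data

def unmarshal(buf):
    count, value, n = struct.unpack_from(HEADER, buf, 0)
    offset = struct.calcsize(HEADER)
    return count, value, buf[offset:offset + n].decode("utf-8")

print(unmarshal(marshal(7, 3.14, "disk1")))   # (7, 3.14, 'disk1')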


Binding
• Problem: how does a client locate a server?
• Use Bindings

• Server
• Export server interface during initialization
• Send name, version no, unique identifier, handle (address) to binder

• Client
• First RPC: send message to binder to import server interface
• Binder: check to see if server has exported interface
• Return handle and unique identifier to client
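A toy, in-process sketch of the binder's export/import bookkeeping (the names, fields, and server handle are invented for illustration; a real binder is a separate network service):

import uuid

binder = {}                                   # (name, version) -> (unique identifier, handle)

def export_interface(name, version, handle):
    # Server side: register the interface with the binder during initialization.
    unique_id = uuid.uuid4().hex
    binder[(name, version)] = (unique_id, handle)
    return unique_id

def import_interface(name, version):
    # Client side: on the first RPC, ask the binder for the server's handle.
    if (name, version) not in binder:
        raise LookupError("no server has exported this interface")
    return binder[(name, version)]

export_interface("file_server", "1.0", ("10.0.0.5", 6000))   # made-up handle
print(import_interface("file_server", "1.0"))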
Binding: Comments
• Exporting and importing incur overhead

• Binder can be a bottleneck


• Use multiple binders

• Binder can do load balancing


Failure Semantics
• Client unable to locate server: return error

• Lost request messages: simple timeout mechanisms

• Lost replies: timeout mechanisms


• Make operation idempotent
• Use sequence numbers, mark retransmissions

• Server failures: did failure occur before or after operation?


• At least once semantics (SUNRPC)
• At most once
• No guarantee
• Exactly once: desirable but difficult to achieve
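A minimal sketch of how sequence numbers can give at-most-once behaviour for a non-idempotent operation: a retransmitted request carrying an already-seen sequence number gets the cached reply instead of being executed again. The code is illustrative, not any particular RPC system's implementation:

balance = 100
seen = {}                          # client id -> (last sequence number, cached reply)

def deposit(client_id, seq, amount):
    global balance
    last = seen.get(client_id)
    if last is not None and last[0] == seq:
        return last[1]             # retransmission detected: replay the cached reply
    balance += amount              # execute the non-idempotent operation exactly once
    seen[client_id] = (seq, balance)
    return balance

print(deposit("c1", 1, 50))        # 150
print(deposit("c1", 1, 50))        # 150 again -- the duplicate is not re-executed
print(deposit("c1", 2, 50))        # 200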
Failure Semantics
• Client failure: what happens to the server computation?
Referred to as an orphan
• Extermination: log at client stub and explicitly kill orphans
• Overhead of maintaining disk logs

• Reincarnation: divide time into epochs between failures and delete computations from old epochs
• Gentle reincarnation: upon a new epoch broadcast, try to locate owner
first (delete only if no owner)
• Expiration: give each RPC a fixed quantum T; explicitly request extensions
• Periodic checks with client during long computations
Message Passing
A process is a program in execution.

A distributed operating system needs to provide interprocess communication (IPC) mechanisms to facilitate such communication activities.

Interprocess communication basically requires information sharing among two or more processes. The two basic methods for information sharing are as follows:

1. Original sharing, or shared-data approach
2. Copy sharing, or message-passing approach
Message Passing
A message-passing system is a subsystem of a distributed operating system that provides a set of message-based IPC protocols and does so by shielding the details of complex network protocols and multiple heterogeneous platforms from programmers.

It enables processes to communicate by exchanging messages and allows programs to be written by using simple communication primitives, such as send and receive.
Desirable Features of a Good Message-Passing System
1. Simplicity
2. Uniform Semantics
3. Efficiency
4. Reliability
5. Correctness
6. Flexibility
7. Security
8. Portability
Issues in IPC by Message Passing
In the design of an IPC protocol for a message-passing system,
the following important issues need to be considered:
• Who is the sender?
• Who is the receiver?
• Is there one receiver or many receivers?
• Is the message guaranteed to have been accepted by its
receiver(s)?
• Does the sender need to wait for a reply?
Issues in IPC by Message Passing
• What should be done if a catastrophic event such as a node crash or
a communication link failure occurs during the course of
communication?
• What should be done if the receiver is not ready to accept the
message: Will the message be discarded or stored in a buffer? In the
case of buffering, what should be done if the buffer is full?
• If there are several outstanding messages for a receiver, can it
choose the order in which to service the outstanding messages?
Distributed Shared Memory
Two basic primitives for interprocess communication:
• Send (recipient, data)
• Receive (data)

Processes access data in the shared address space through the following two basic primitives:
• data = Read (address)
• Write (address, data)
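A minimal single-process sketch of the two primitive pairs, with a queue standing in for the message channel and a dictionary standing in for the shared address space (the names mirror the primitives listed above):

import queue

mailboxes = {}                                  # message passing: recipient -> mailbox

def send(recipient, data):
    mailboxes.setdefault(recipient, queue.Queue()).put(data)

def receive(recipient):
    return mailboxes[recipient].get()

shared = {}                                     # shared-data approach: address -> data

def write(address, data):
    shared[address] = data

def read(address):
    return shared[address]

send("p2", b"hello"); print(receive("p2"))      # copy sharing (message passing)
write(0x10, 42); print(read(0x10))              # original sharing (shared data)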
Distributed Shared Memory
• Distributed Shared Memory (DSM) provides a virtual address space shared among processes on loosely coupled processors.
• DSM is an abstraction that integrates the local memory of different machines in a network environment into a single logical entity shared by cooperating processes executing on multiple sites.
• The shared memory itself exists only virtually.
• Application programs can use it in the same way as a traditional virtual memory, except, of course, that processes using it can run on different machines in parallel.
Distributed Shared Memory
Design and Implementation Issues of DSM
• Important issues involved in the design and implementation of
DSM systems are as follows:
1. Granularity
2. Structure of shared-memory space
3. Memory coherence and access synchronization
4. Data location and access
5. Replacement strategy
6. Thrashing
7. Heterogeneity
Design and Implementation Issues of DSM
Granularity refers to block size, which is the unit of data sharing and data transfer across the network. Both large- and small-sized blocks have their own advantages and limitations.

Several DSM systems choose the virtual memory page size as the block size so that the MMU (Memory Management Unit) hardware can be used to trigger a DSM block fault.
Design and Implementation Issues of DSM
The structure of the shared-memory space of a DSM system defines the abstract view to be presented to application programmers of that system.

The three commonly used methods for structuring the shared-memory space of a DSM system are:
1. no structuring,
2. structuring by data type,
3. structuring as a database.

The structure and granularity of a DSM system are closely related.
Implementation Issues
• Choice of protocol [affects communication costs]
• Use existing protocol (UDP) or design from scratch
• Packet size restrictions
• Reliability in case of multiple packet messages
• Flow control

• Copying costs are dominant overheads


• Need at least 2 copies per message
• From client to NIC and from server NIC to server

• As many as 7 copies
• Stack in stub – message buffer in stub – kernel – NIC – medium – NIC – kernel – stub – server

• Scatter-gather operations can reduce overheads
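A small sketch of a gather write using socket.sendmsg (available on platforms that support it), which hands the kernel a list of buffers so the header and body need not first be copied into one contiguous user-space buffer; the address and payloads are placeholders:

import socket

header = b"\x00\x01HDR"                  # placeholder protocol header
body = b"payload bytes ..."              # placeholder message body

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(("192.0.2.1", 9999))           # placeholder destination
s.sendmsg([header, body])                # gather: both buffers go out as one datagram
s.close()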
