Chapter 3 - Processes
Introduction
Communication takes place between processes
a process is a program in execution
from OS perspective, management and scheduling of processes is
important
other important issues arise in distributed systems
multithreading to enhance performance by overlapping communication
and local processing
how clients and servers are organized, and server design issues
process or code migration to achieve scalability and to dynamically
configure clients and servers
Threads can be used in both distributed and nondistributed
systems
Threads in Nondistributed Systems
a process has an address space (containing program
text and data) and a single thread of control, as well as
other resources such as open files, child processes,
accounting information, etc.
[Figure: (a) three processes, each with one thread; (b) one process with three threads]
each thread has its own program counter, registers,
stack, and state; but all threads of a process share
address space, global variables and other resources
such as open files, etc.
Threads take turns in running
Threads allow multiple executions to take place in the same
process environment; this is called multithreading
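The split between per-thread state and shared state can be illustrated with a minimal Python sketch. The names `counter`, `worker`, and `run_threads` are hypothetical and chosen only for illustration; the point is that a local variable lives on each thread's own stack, while a global variable is visible to all threads and therefore needs a lock.

```python
import threading

counter = 0                      # global variable: shared by all threads
counter_lock = threading.Lock()  # mutex protecting the shared counter


def worker(increments, results, index):
    """Runs in its own thread, with its own stack and local variables."""
    global counter
    local_total = 0              # local variable: private to this thread
    for _ in range(increments):
        local_total += 1
        with counter_lock:       # shared data needs mutual exclusion
            counter += 1
    results[index] = local_total


def run_threads(num_threads=3, increments=1000):
    """Start the threads, wait for them, and return (per-thread totals, shared total)."""
    global counter
    counter = 0
    results = [0] * num_threads
    threads = [
        threading.Thread(target=worker, args=(increments, results, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, counter
```

Each thread counts its own iterations privately, while all of them contribute to the one shared counter.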
Thread Usage: Why do we need threads?
e.g., a word processor has different parts for
interacting with the user
formatting the page as soon as changes are made
timed savings (for auto recovery)
spelling and grammar checking, etc.
1. Simplifying the programming model: many activities are
going on at once, more or less independently
2. Threads are easier to create and destroy than processes since they
do not have any resources attached to them
3. Performance improves by overlapping activities if there is a lot
of I/O; i.e., to avoid blocking the whole program while waiting for
input or doing calculations, say in a spreadsheet
4. Real parallelism is possible in a multiprocessor system
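The performance argument can be made concrete with a small sketch. It assumes that `time.sleep` is an acceptable stand-in for a blocking I/O operation; `fake_io`, `run_sequential`, and `run_threaded` are hypothetical names, not part of any real application.

```python
import threading
import time


def fake_io(delay):
    """Stand-in for a blocking operation such as a disk read or a network call."""
    time.sleep(delay)


def run_sequential(delays):
    """Perform the blocking operations one after another; return elapsed seconds."""
    start = time.perf_counter()
    for d in delays:
        fake_io(d)
    return time.perf_counter() - start


def run_threaded(delays):
    """Perform each blocking operation in its own thread; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(d,)) for d in delays]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

With several waits of equal length, the threaded version should take roughly as long as one wait, while the sequential version takes the sum of all of them.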
in nondistributed systems, threads with shared data can be used
instead of processes to avoid the context-switching overhead
of interprocess communication (IPC)
[Figure: context switching as the result of IPC]
Thread Implementation
threads are usually provided in the form of a thread package
the package contains operations to create and destroy a thread,
operations on synchronization variables such as mutexes and
condition variables
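As an illustration of these operations, the following sketch uses Python's standard `threading` module as one example of a thread package. The `Mailbox` class and `demo` function are hypothetical; the condition variable lets a consumer thread sleep until a producer has put something in the mailbox.

```python
import threading
from collections import deque


class Mailbox:
    """A queue guarded by a condition variable (which carries its own mutex)."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:                 # acquire the mutex
            self._items.append(item)
            self._cond.notify()          # wake one waiting thread

    def get(self):
        with self._cond:
            while not self._items:       # re-check the condition after each wakeup
                self._cond.wait()        # releases the mutex while sleeping
            return self._items.popleft()


def demo(n=5):
    """Create a consumer thread, feed it n items, and return what it received."""
    box = Mailbox()
    received = []

    def consumer():
        for _ in range(n):
            received.append(box.get())

    t = threading.Thread(target=consumer)   # create a thread
    t.start()
    for i in range(n):
        box.put(i)
    t.join()                                 # wait until the thread has finished
    return received
```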
two approaches of constructing a thread package
a. construct a thread library that is executed entirely in user mode
(the OS is not aware of threads)
cheap to create and destroy threads; just allocate and free
memory
context switching can be done using few instructions; store
and reload only CPU register values
disadvantage: invocation of a blocking system call will block
the entire process to which the thread belongs and all other
threads in that process
b. implement them in the OS's kernel
let the kernel be aware of threads and schedule them
expensive for thread operations such as creation and deletion
since each requires a system call
solution: use a hybrid form of user-level and kernel-level threads, called
lightweight processes (LWPs)
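The user-level approach (a) can be approximated in Python with generators. This is only an analogy, not how a real thread library is built: each generator plays the role of a thread, `yield` plays the role of a voluntary context switch, and the hypothetical `run_round_robin` function is the user-mode scheduler that the OS knows nothing about.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative 'thread': does one unit of work, then yields the CPU."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                      # switch point: control returns to the scheduler


def run_round_robin(tasks):
    """User-mode scheduler: run each ready task until its next yield."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the task where it left off
            ready.append(current)  # still alive: back of the ready queue
        except StopIteration:
            pass                   # task finished: just drop it


def demo():
    log = []
    run_round_robin([task("A", 2, log), task("B", 3, log)])
    return log  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

The sketch also shows the disadvantage named above: if one task made a blocking call instead of yielding, the scheduler would never regain control and every other task would stall.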
Threads in Distributed Systems
Multithreaded Clients
consider a Web browser; fetching each part of a page can be
implemented as a separate thread, each opening its own TCP connection to
the server
each thread can display its results as soon as its part of the page arrives
parallelism can also be achieved with replicated servers, since each
thread's request can be forwarded to a separate replica
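A multithreaded client of this kind might be sketched as follows. No real network access is performed: `fetch_part` is a hypothetical stand-in that sleeps instead of opening a TCP connection, and `load_page` "displays" each part by appending it to a list as soon as it completes.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_part(name, delay):
    """Hypothetical stand-in for fetching one part of a page over its own connection."""
    time.sleep(delay)
    return f"<{name}>"


def load_page(parts):
    """Fetch all parts concurrently; collect each one as soon as it arrives.

    parts maps a part name to its simulated transfer time in seconds.
    """
    shown = []
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        futures = [pool.submit(fetch_part, n, d) for n, d in parts.items()]
        for fut in as_completed(futures):   # yields futures in completion order
            shown.append(fut.result())      # "display" this part immediately
    return shown
```

For example, `load_page({"text": 0.05, "image": 0.3})` returns both parts, normally with the faster one first, without the slow image delaying the text.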
Multithreaded Servers
servers can be constructed in three ways
a. single-threaded process
it gets a request, examines it, carries it out to completion before getting
the next request
b. threads
threads are more important for implementing servers
e.g., a file server
the dispatcher thread reads incoming requests for a file
operation from clients and passes it to an idle worker thread
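c. finite-state machine
used if threads are not available
it gets a request, examines it, and tries to fulfill the request
from cache; otherwise it sends a request to the file system and,
instead of blocking, records the state of the request and goes on
to the next message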
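The dispatcher/worker organization of the multithreaded file server could look roughly like the sketch below. The `handle` function is a hypothetical placeholder for a real file operation, and a `None` value is used here as an assumed shutdown signal; neither comes from the slides.

```python
import queue
import threading


def handle(request):
    """Hypothetical file operation; a real worker might block on the disk here."""
    return request.upper()


def worker(requests, replies):
    """Worker thread: take the next request, carry it out, post the reply."""
    while True:
        req = requests.get()          # an idle worker blocks here
        if req is None:               # sentinel: no more work
            break
        replies.put((req, handle(req)))


def serve(incoming, num_workers=3):
    """Dispatcher: hand each incoming request to the pool of worker threads."""
    requests, replies = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(requests, replies))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    for req in incoming:              # the dispatcher never does the work itself
        requests.put(req)
    for _ in workers:
        requests.put(None)            # one sentinel per worker
    for w in workers:
        w.join()
    results = {}
    while not replies.empty():
        req, res = replies.get()
        results[req] = res
    return results
```

If one worker blocks on a slow operation, only that thread waits; the dispatcher and the other workers keep going.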
Model                    Characteristics
Single-threaded process  No parallelism, blocking system calls
Threads                  Parallelism, blocking system calls (thread only)
Finite-state machine     Parallelism, nonblocking system calls
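The finite-state machine model can be simulated in a single thread. In this sketch the message format (`"request"` and `"disk_done"` tuples) and the function name `run_fsm_server` are assumptions made for illustration. The server never waits: on a cache miss it saves the state of the request in a table and moves on to the next message.

```python
def run_fsm_server(messages, cache):
    """Process messages one at a time without ever blocking.

    messages: iterable of ("request", client, filename)
              or ("disk_done", client, data) tuples.
    Returns (replies, disk_requests): what was sent to clients and to the disk.
    """
    pending = {}        # client -> filename: saved state of requests awaiting the disk
    replies = []
    disk_requests = []
    for kind, client, payload in messages:
        if kind == "request":
            if payload in cache:
                replies.append((client, cache[payload]))   # served from cache
            else:
                pending[client] = payload                  # record state, do not wait
                disk_requests.append((client, payload))    # nonblocking disk request
        elif kind == "disk_done":
            filename = pending.pop(client)                 # restore the saved state
            cache[filename] = payload
            replies.append((client, payload))
    return replies, disk_requests
```

The `pending` table does explicitly what each thread's stack does implicitly in the multithreaded model.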
read about virtualization (the illusion of having more resources than
we actually have): pages 79-82
3.2 Anatomy of Clients
Two issues: user interfaces and client-side software for
distribution transparency
a. User Interfaces
to create a convenient environment for the interaction of
a human user and a remote server; e.g. mobile phones
with simple displays and a set of keys
GUIs are most commonly used
The X Window System (or simply X) as an example
it has the X kernel: the part of the OS that controls the
terminal (monitor, keyboard, pointing device like a
mouse) and is hardware dependent
b. Client-Side Software for Distribution Transparency
in addition to the user interface, parts of the processing
and data level in a client-server application are executed
at the client side
an example is embedded client software for ATMs, cash
registers, etc.
moreover, client software can also include components
to achieve distribution transparency
e.g., replication transparency
assume a distributed system with replicated servers; the
client proxy can send requests to each replica, so the client
application does not need to know that replicas exist
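A client-side proxy for replication transparency might be sketched as follows. The `ReplicaProxy` class is hypothetical, each replica is modelled as a plain callable, and returning the first successful reply is an assumed policy chosen for simplicity; other policies such as voting are equally possible.

```python
class ReplicaProxy:
    """Forwards each request to every replica and returns a single response."""

    def __init__(self, replicas):
        self._replicas = replicas          # list of callables: message -> reply

    def request(self, message):
        responses = []
        for replica in self._replicas:
            try:
                responses.append(replica(message))
            except ConnectionError:
                continue                   # a failed replica is hidden from the caller
        if not responses:
            raise ConnectionError("no replica answered")
        return responses[0]                # assumed policy: first successful reply
```

The client application calls `proxy.request(...)` exactly as it would call a single server.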
3.3 Servers and Design Issues
3.3.1 General Design Issues
a. How to organize servers?
Iterative server
the server itself handles the request and returns the
result
Concurrent server
it passes the request to a separate process or thread and
immediately waits for the next incoming request; e.g., a
multithreaded server
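The two organizations can be compared using Python's standard `socketserver` module: `TCPServer` handles one request at a time (iterative), while `ThreadingTCPServer` hands each request to a new thread (concurrent). The handler, the helper names `start_server` and `ask`, and the upper-casing "service" are hypothetical choices for this sketch.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Reads one line from the client and replies with it in upper case."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line.upper())


def start_server(concurrent):
    """Start an iterative or a concurrent server on a free local port."""
    server_cls = (
        socketserver.ThreadingTCPServer if concurrent else socketserver.TCPServer
    )
    server = server_cls(("127.0.0.1", 0), UpperHandler)   # port 0: OS picks a port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(port, text):
    """Minimal client: send one line, return the reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:
        conn.sendall(text.encode() + b"\n")
        return conn.makefile().readline().strip()
```

A client sees the same behaviour from both variants; the difference only shows when a slow request is in progress and a second client connects.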
b. Where do clients contact a server?
using endpoints (ports) at the machine where the
server is running; each server listens to a specific
endpoint
how do clients know the endpoint of a service?
globally assign endpoints for well-known services; e.g.
FTP is on TCP port 21, HTTP is on TCP port 80
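A client can look up such globally assigned endpoints by service name. The sketch below uses `socket.getservbyname` from the Python standard library, which consults the local services database; the fallback table `WELL_KNOWN_PORTS` and the function name `endpoint_for` are assumptions added for machines where that database is missing.

```python
import socket

# Fallback table with the two examples from the text above.
WELL_KNOWN_PORTS = {"ftp": 21, "http": 80}


def endpoint_for(service):
    """Return the well-known TCP port for a service name."""
    try:
        return socket.getservbyname(service, "tcp")   # local services database
    except OSError:
        return WELL_KNOWN_PORTS[service]              # assumed fallback table
```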
3.4 Code Migration
so far, communication has been concerned with passing data
we may also pass programs, even while they are running and
between heterogeneous systems
code migration also involves moving data:
when a program migrates while running, its status,
pending signals, and other environment variables
such as the stack and the program counter also have
to be moved
Reasons for Migrating Code
to improve performance; move processes from
heavily-loaded to lightly-loaded machines (load
balancing)
to reduce communication: move a client application
that performs many database operations to a server if
the database resides on the server; then send only
results to the client
to exploit parallelism (for nonparallel programs): e.g.,
copies of a mobile program (called a mobile agent, as used
in search engines) moving from site to site
searching the Web
Models for Code Migration
code migration doesn’t only mean moving code; in some
cases, it also means moving the execution status of a
program, pending signals, and other parts of the
execution environment
a process consists of three segments: code segment (set
of instructions), resource segment (references to
external resources such as files, printers, ...), and
execution segment (to store the current execution state
of a process such as private data, the stack, the program
counter)
alternatives for code migration
weak versus strong mobility
is it sender- or receiver-initiated
is it executed at the target process or in a separate process
(for weak mobility); migrate or clone process (for strong
mobility)
Weak Mobility
transfer only the code segment and maybe some
initialization data; in this case a program always starts
from its initial state, e.g., Java applets
execution can be by
the target process (in its own address space like in Java
Applets) but the target process and local resources must
be protected (security) or
by a separate process; still local resources must be
protected (security)
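Weak mobility can be sketched by shipping source text plus initialization data and running it from the start at the target. Everything here is hypothetical (`MOBILE_CODE`, `execute_at_target`), and the restricted namespace is only a gesture at the protection requirement; it is not a real security sandbox.

```python
# The "code segment" as it would arrive over the network (hypothetical example).
MOBILE_CODE = """
def run(data):
    return sum(data) / len(data)
"""


def execute_at_target(source, init_data):
    """Execute migrated code in the target process, starting from its initial state."""
    # Expose only the built-ins the code needs. This limits accidents,
    # but it is NOT a secure sandbox.
    namespace = {"__builtins__": {"sum": sum, "len": len}}
    exec(source, namespace)                  # load the code segment
    return namespace["run"](init_data)       # always starts at the beginning
```

No execution state travels with the code: calling `execute_at_target` twice runs the program twice from scratch.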
Strong Mobility (or process migration)
transfer code and execution segments; helps to migrate a
process in execution; stop execution, move it, and then
resume execution from where it stopped
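The stop, move, resume cycle can be imitated in Python under one loud assumption: Python cannot capture a running stack, so the execution segment is modelled explicitly as a small dictionary. The names `run_steps` and `migrate` are hypothetical, and `pickle` stands in for sending the state over the network.

```python
import pickle


def run_steps(data, state, max_steps):
    """Sum the items of data, at most max_steps items per call.

    state["next_index"] plays the role of the program counter and
    state["partial_sum"] the private data of the computation.
    """
    steps = 0
    while state["next_index"] < len(data) and steps < max_steps:
        state["partial_sum"] += data[state["next_index"]]
        state["next_index"] += 1
        steps += 1
    return state


def migrate(state):
    """Serialize the execution state at the source, rebuild it at the target."""
    wire_bytes = pickle.dumps(state)     # what would travel over the network
    return pickle.loads(wire_bytes)
```

After `migrate`, calling `run_steps` again continues from the saved index instead of starting over, which is the difference from weak mobility.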
migration can be
sender-initiated: initiated by the machine where the code resides or is
currently running; e.g., uploading programs to a server (may
need authentication or that the client is a registered one), or
crawlers sent out to index Web pages
receiver-initiated: initiated by the target machine; e.g., Java applets;
easier to implement
in a client-server model, receiver-initiated is easier to
implement since security issues are minimized; if clients
are allowed to send code (sender-initiated), the server
must know them since they may access resources such as
Summary of models of code migration
Types of Process-to-Resource Bindings
Binding by identifier (the strongest): a resource is
referred to by its identifier; the process requires exactly that
resource; e.g., a URL to refer to a Web page or an FTP
server referred to by its Internet (IP) address
Binding by value (weaker): when only the value of a
resource is needed; in this case another resource can
provide the same value; e.g., standard libraries of
programming languages such as C or Java which are
normally locally available, but their location in the file
system may vary from site to site
Binding by type (weakest): a process needs a resource
of a specific type; e.g., references to local devices, such as
monitors and printers
Resource-to-Machine Bindings
Unattached Resources: can be easily moved with
the migrating program (such as data files
associated with the program)
Fastened Resources: such as local databases and
complete Web sites; moving or copying may be
possible, but very costly
Fixed Resources: intimately bound to a specific
machine or environment such as local devices
and cannot be moved
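Combining the two classifications gives nine cases, each with a preferred way of handling the reference when code migrates. The table in the sketch below follows one commonly cited textbook mapping; it is an assumption here, since the slides themselves do not state it, and the enum and function names are hypothetical.

```python
from enum import Enum


class Binding(Enum):
    BY_IDENTIFIER = "identifier"
    BY_VALUE = "value"
    BY_TYPE = "type"


class Attachment(Enum):
    UNATTACHED = "unattached"
    FASTENED = "fastened"
    FIXED = "fixed"


# Preferred action per case (assumed mapping, not taken from the slides):
# MV = move the resource, CP = copy its value,
# GR = set up a global systemwide reference, RB = rebind to a local resource.
PREFERRED_ACTION = {
    (Binding.BY_IDENTIFIER, Attachment.UNATTACHED): "MV",
    (Binding.BY_IDENTIFIER, Attachment.FASTENED): "GR",
    (Binding.BY_IDENTIFIER, Attachment.FIXED): "GR",
    (Binding.BY_VALUE, Attachment.UNATTACHED): "CP",
    (Binding.BY_VALUE, Attachment.FASTENED): "GR",
    (Binding.BY_VALUE, Attachment.FIXED): "GR",
    (Binding.BY_TYPE, Attachment.UNATTACHED): "RB",
    (Binding.BY_TYPE, Attachment.FASTENED): "RB",
    (Binding.BY_TYPE, Attachment.FIXED): "RB",
}


def preferred_action(binding, attachment):
    """Look up how a resource reference is handled when the code migrates."""
    return PREFERRED_ACTION[(binding, attachment)]
```

For instance, a process bound by type to a printer (a fixed resource) simply rebinds to whatever printer is available at the target machine.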