System Models For Distributed and Cloud Computing
• Unlike a cluster or grid, a P2P network does not use a dedicated
interconnection network.
Distributed File Sharing: content distribution of MP3 music, video, etc. E.g.
Gnutella, Napster, BitTorrent.
• REST supports many data formats, whereas SOAP only allows XML.
• REST supports JSON, which yields smaller payloads and parses faster than
the XML that SOAP requires (a size comparison is sketched after this list).
• REST generally provides better performance, particularly through caching of
information that is static and not frequently changed.
• REST is used by major services such as Amazon and Twitter.
• REST is generally faster and uses less bandwidth.
• SOAP provides robust security through WS-Security, making it useful for
enterprise apps such as banking and financial applications; REST relies on
SSL/TLS (HTTPS) for security.
• SOAP offers built-in retry logic to compensate for failed communications.
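To make the payload-size comparison above concrete, here is a minimal sketch that serializes the same made-up record once as JSON and once inside a SOAP-style XML envelope; the record, element names, and envelope are purely illustrative assumptions, not taken from any real service.

```python
import json

# Hypothetical record used purely for illustration: the same data serialized
# as a JSON payload (typical of REST) and as a SOAP-style XML envelope.
record = {"id": 42, "title": "Distributed Systems", "price": 59.99}

json_payload = json.dumps(record)

xml_payload = (
    '<?xml version="1.0"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body><GetBookResponse>"
    "<id>42</id><title>Distributed Systems</title><price>59.99</price>"
    "</GetBookResponse></soap:Body></soap:Envelope>"
)

print(len(json_payload), "bytes of JSON")  # the JSON form is much smaller
print(len(xml_payload), "bytes of XML")    # the envelope adds overhead
```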
Performance Metrics and Scalability Analysis
• Performance Metrics:
• CPU speed: MHz or GHz, SPEC benchmarks like SPECINT
• Network Bandwidth: Mbps or Gbps
• System throughput: MIPS, TFlops (tera floating-point operations per
second), TPS (transactions per second), IOPS (IO operations per second)
• Other metrics: Response time, network latency, system availability
• Scalability:
• Scalability is the ability of a system to handle a growing amount of work in
an efficient manner, or its ability to be enlarged to accommodate that
growth.
• For example, it can refer to the capability of a system to increase total
throughput under an increased load when resources (typically hardware)
are added.
Scalability
Scale Vertically
To scale vertically (or scale up) means to add resources to a single node in
a system, typically involving the addition of CPUs or memory to a single
computer.
Tradeoffs
There are tradeoffs between the two models. Larger numbers of computers mean
increased management complexity, as well as a more complex programming model
and issues such as throughput and latency between nodes.
Also, some applications do not lend themselves to a distributed computing
model.
In the past, the price difference between the two models has favored "scale
up" computing for those applications that fit its paradigm, but recent
advances in virtualization technology have blurred that advantage, since
deploying a new virtual system/server over a hypervisor is almost always
less expensive than actually buying and installing a real one.
Scalability
• One form of scalability for parallel and distributed systems is:
• Size Scalability
This refers to achieving higher performance or more functionality by
increasing the machine size. Size in this case refers to adding processors,
cache, memory, storage, or I/O channels.
• Scale Horizontally and Vertically
Methods of adding more resources for a particular application fall into two
broad categories:
Scale Horizontally
To scale horizontally (or scale out) means to add more nodes to a system,
such as adding a new computer to a distributed software application. An
example might be scaling out from one Web server system to three.
The scale-out model has created an increased demand for shared data
storage with very high I/O performance, especially where processing of
large amounts of data is required.
Amdahl’s Law
It is typically cheaper to add a new node to a system in order to achieve
improved performance than to perform performance tuning to improve the
capacity that each node can handle. But this approach can have diminishing
returns as indicated by Amdahl’s Law.
Let T be the execution time of the program on a single processor, and let α be
the fraction of the code that must execute sequentially. The total execution
time on n processing nodes is then

Total time = α T + (1 - α) T / n

where the first term is the sequential execution time on a single processor
and the second term is the parallel execution time on n processing nodes.

Speedup, S = T / [α T + (1 - α) T / n] = 1 / [α + (1 - α) / n]

The speedup can never exceed 1 / α, so beyond a point doubling the processing
power improves the speedup only marginally.

System Efficiency, E = S / n = 1 / [α n + (1 - α)]
System efficiency can be rather low if the cluster size is very large.
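As a quick illustration of the formulas above, the following sketch evaluates the speedup and efficiency for an assumed sequential fraction of α = 0.25 (the value is arbitrary, chosen only to show the trend):

```python
def amdahl_speedup(alpha: float, n: int) -> float:
    """Speedup S = 1 / [alpha + (1 - alpha) / n] for sequential fraction alpha."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

def system_efficiency(alpha: float, n: int) -> float:
    """Efficiency E = S / n = 1 / [alpha * n + (1 - alpha)]."""
    return amdahl_speedup(alpha, n) / n

# Assumed example: 25% of the code is sequential.
alpha = 0.25
for n in (1, 2, 4, 64, 256):
    print(f"n={n:4d}  speedup={amdahl_speedup(alpha, n):6.2f}  "
          f"efficiency={system_efficiency(alpha, n):5.2f}")

# The speedup approaches 1/alpha = 4 while the efficiency drops toward zero,
# which is why efficiency is low for very large cluster sizes.
```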
• All hardware, software, and network components may fail. Single points of
failure that bring down the entire system must be avoided when designing
distributed systems.
• High availability is, ultimately, the holy grail of the cloud. For clouds,
availability relates to the time that the datacenter is accessible or delivers
the intended IT service as a proportion of the duration for which the service
is purchased.
• CASE 2
• Same as Case 1 but we have 2 failures instead. Compute availability.
• MTTF = 900 / 2 = 450
• MTTR = 108 / 2 = 54
• Availability = 450 / (450 + 54) = 450 / 504 ≈ 0.893 or 89.3%
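The same calculation expressed as a small sketch, using the Case 2 figures from the slide (900 units of operating time and 108 units of repair time spread over 2 failures):

```python
def availability(mttf: float, mttr: float) -> float:
    """Availability = MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)

# Case 2 figures: 900 units of operation and 108 units of repair, 2 failures.
failures = 2
mttf = 900 / failures   # 450
mttr = 108 / failures   # 54
print(f"Availability = {availability(mttf, mttr):.3%}")  # ~89.3%
```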
Reliability vs. Availability
• Reliability is a measure of the probability that an item will perform its
intended function without failures for a specified interval under stated
conditions.
• Calculate the downtime per year given an availability of four 9’s (i.e. an
uptime of 99.99%).
• This is calculated as:
• (365 × 24) – 0.9999 × (365 × 24) = 8760 – 8759.124 = 0.876 hours ≈ 52
minutes and 34 seconds
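The downtime calculation can be generalized in a short sketch (assuming a non-leap year of 8,760 hours):

```python
HOURS_PER_YEAR = 365 * 24   # 8,760 hours, non-leap year

def downtime_per_year(avail: float) -> float:
    """Hours of downtime per year for a given availability (e.g. 0.9999)."""
    return HOURS_PER_YEAR * (1.0 - avail)

hours = downtime_per_year(0.9999)
minutes, seconds = divmod(round(hours * 3600), 60)
print(f"{hours:.3f} hours = {minutes} min {seconds} s")  # 0.876 h = 52 min 34 s
```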
RESTful Web Service Example
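The original example for this slide is not reproduced here; as a stand-in, below is a minimal sketch of a RESTful service using Flask. The resource name, sample data, and port are illustrative assumptions, not part of the original material.

```python
from flask import Flask, jsonify, abort

app = Flask(__name__)

# Hypothetical in-memory resource used only for illustration.
BOOKS = {1: {"id": 1, "title": "Distributed and Cloud Computing"}}

@app.route("/books/<int:book_id>", methods=["GET"])
def get_book(book_id):
    """GET returns the resource as JSON; unknown ids yield HTTP 404."""
    book = BOOKS.get(book_id)
    if book is None:
        abort(404)
    return jsonify(book)

if __name__ == "__main__":
    app.run(port=5000)   # e.g. curl http://localhost:5000/books/1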
SOAP Web Service Example
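Likewise, the original SOAP example is not reproduced here; the sketch below posts a hand-written SOAP 1.1 envelope using the requests library. The endpoint URL, XML namespace, and operation name are made up for illustration; a real service publishes these details in its WSDL.

```python
import requests

# Hypothetical endpoint and operation, for illustration only.
ENDPOINT = "https://example.com/soap/BookService"

envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetBook xmlns="http://example.com/books">
      <id>1</id>
    </GetBook>
  </soap:Body>
</soap:Envelope>"""

headers = {
    "Content-Type": "text/xml; charset=utf-8",
    "SOAPAction": "http://example.com/books/GetBook",
}

# The response body is itself an XML envelope that the client must parse.
response = requests.post(ENDPOINT, data=envelope, headers=headers, timeout=10)
print(response.status_code)
print(response.text)
```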