Chapter 6 - Cloud Resource Management and Scheduling
n The strategies for resource management for IaaS, PaaS, and SaaS
are different.
¨ The inputs → the offered workload and the policies for admission
control, capacity allocation, load balancing, energy
optimization, and QoS guarantees in the cloud.
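[Figure: block diagram of the controller with three blocks (predictive filter, optimal controller, queuing dynamics) and the signals external traffic, forecast, disturbance, λ(k), ω(k), u*(k), and the weights r and s.]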
The controller uses the feedback regarding the current state and the estimation
of the future disturbance due to environment to compute the optimal inputs
over a finite horizon. r and s are the weighting factors of the performance
index.
Cloud Computing: Theory and Practice.
Dan C. Marinescu Chapter 6 10
Two-level cloud controller
[Figure: two-level control architecture. Applications 1 to n, each with its own SLA, VMs, and application controller, run on a cloud platform that includes monitor and decision components.]
n The actions of the control system should be carried out in a rhythm
that does not lead to instability.
n Adjustments should only be carried out after the performance of the
system has stabilized.
n If an upper and a lower threshold are set, instability occurs when
they are too close to one another, the variations of the workload
are large enough, and the time required to adapt does not allow the
system to stabilize.
n The actions consist of allocation/deallocation of one or more virtual
machines. Sometimes the allocation/deallocation of a single VM
required by one of the thresholds may cause crossing of the other,
another source of instability.
n Use a high and a low threshold versus a high threshold only.
n Algorithm
¨ Compute the integral value of the high and the low threshold as
averages of the maximum and, respectively, the minimum of the
processor utilization over the process history.
¨ Request additional VMs when the average value of the CPU utilization
over the current time slice exceeds the high threshold.
¨ Release a VM when the average value of the CPU utilization over the
current time slice falls below the low threshold.
n Conclusions
¨ Dynamic thresholds perform better than the static ones.
¨ Two thresholds are better than one.
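The two-threshold algorithm above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the "process history" is a list of past time slices, each holding CPU utilization samples in [0, 1], and the names `dynamic_thresholds` and `decide` are hypothetical.

```python
from statistics import mean


def dynamic_thresholds(history):
    """Return (high, low) thresholds from past time slices.

    Assumption: history is a list of slices, each a list of CPU
    utilization samples. High is the average of the per-slice maxima,
    low is the average of the per-slice minima.
    """
    high = mean(max(samples) for samples in history)
    low = mean(min(samples) for samples in history)
    return high, low


def decide(history, current_slice):
    """Compare the average utilization of the current slice with the thresholds."""
    high, low = dynamic_thresholds(history)
    average = mean(current_slice)
    if average > high:
        return "request_vm"   # average utilization exceeds the high threshold
    if average < low:
        return "release_vm"   # average utilization falls below the low threshold
    return "hold"             # between the thresholds: no action


history = [[0.2, 0.6], [0.4, 0.8]]
print(decide(history, [0.9, 0.8]))
```

Because the thresholds are recomputed from the history, they follow the workload instead of staying fixed, which is the property the conclusions attribute to dynamic thresholds.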
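[Figure: utility function showing penalty and reward as a function of the response time R, with marked values 0, R0, R1, R2.]
[Figure: panels (a) and (b); surviving labels are Si1, Si3, Si4, Si5, Si6, vk, vkmax, rk, and rkmax.]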
u Users provide bids for desirable bundles and the price they are
willing to pay.
u Ascending Clock Auction (ASCA) → the current price for each
resource is represented by a “clock” seen by all participants at the
auction.
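[Figure: users u2, u3, ..., uU are represented by proxies that submit the bids x2(t), x3(t), ..., xU(t) to the auctioneer; the auctioneer tests whether Σu xu(t) > 0 and announces the new prices p(t+1).]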
The schematics of the ASCA algorithm; to allow for a single round auction
users are represented by proxies which place the bids xu(t). The auctioneer
determines if there is an excess demand and, in that case, it raises the price of
resources for which the demand exceeds the supply and requests new bids.
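The price-raising loop described in the caption can be sketched as below. This is an illustrative toy, not the ASCA algorithm as published: the proxy rule (keep bidding for the bundle while its cost at current prices does not exceed the user's value), the fixed price increment `step`, and all function and parameter names are assumptions.

```python
def ascending_clock_auction(supply, bids, step=1.0, max_rounds=10_000):
    """Toy clock auction.

    supply: units available per resource.
    bids:   list of (bundle, value); bundle is units wanted per resource,
            value is the most the user pays for the whole bundle
            (hypothetical proxy rule).
    Returns (prices, winners), where winners are indices into bids.
    """
    prices = [0.0] * len(supply)
    for _ in range(max_rounds):
        # A proxy keeps bidding while the bundle is affordable at current prices.
        active = [
            u for u, (bundle, value) in enumerate(bids)
            if sum(p * q for p, q in zip(prices, bundle)) <= value
        ]
        demand = [sum(bids[u][0][r] for u in active) for r in range(len(supply))]
        excess = [d - s for d, s in zip(demand, supply)]
        if all(e <= 0 for e in excess):
            return prices, active          # no excess demand: auction ends
        # Raise the clock only for resources whose demand exceeds supply.
        prices = [p + step if e > 0 else p for p, e in zip(prices, excess)]
    raise RuntimeError("auction did not converge within max_rounds")


prices, winners = ascending_clock_auction(
    supply=[2], bids=[([1], 5.0), ([1], 3.0), ([1], 1.0)]
)
print(prices, winners)
```

All winners face the same final clock price for a resource, which matches the uniform-price property the slides ask of a pricing algorithm.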
Pricing and allocation algorithms
n A pricing and allocation algorithm partitions the set of users into two
disjoint sets, winners and losers.
n Desirable properties of a pricing algorithm:
¨ Be computationally tractable; traditional combinatorial auction algorithms,
e.g., Vickrey-Clarke-Groves (VCG), are not computationally tractable.
¨ Scale well - given the scale of the system and the number of requests
for service, scalability is a necessary condition.
¨ Be objective - partitioning into winners and losers should only be based on
the price of a user's bid; if the price exceeds the threshold then the user
is a winner, otherwise the user is a loser.
¨ Be fair - make sure that the prices are uniform, all winners within a
given resource pool pay the same price.
¨ Indicate clearly at the end of the auction the unit prices for each
resource pool.
¨ Indicate clearly to all participants the relationship between the supply
and the demand in the system.
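The "objective" and "fair" properties can be illustrated with a few lines of code. This is only a sketch under stated assumptions: `partition_bids` is a hypothetical helper, bids for one resource pool are given as a dictionary of per-user prices, and every winner is assumed to pay the threshold price.

```python
def partition_bids(bid_prices, threshold):
    """Split users into winners and losers using only the bid price.

    bid_prices: {user: price offered} for one resource pool (assumed format).
    A user wins only if the bid price exceeds the threshold.
    """
    winners = sorted(u for u, price in bid_prices.items() if price > threshold)
    losers = sorted(u for u, price in bid_prices.items() if price <= threshold)
    # Uniform pricing: every winner in the pool pays the same unit price.
    payments = {u: threshold for u in winners}
    return winners, losers, payments


winners, losers, payments = partition_bids(
    {"u1": 5.0, "u2": 3.0, "u3": 1.0}, threshold=2.0
)
print(winners, losers, payments)
```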
Cloud scheduling algorithms (1/2)
n Scheduling → responsible for resource sharing at several levels:
¨ A server can be shared among several virtual machines.
¨ A virtual machine could support several applications.
¨ An application may consist of multiple threads.
n A scheduling algorithm should be efficient, fair, and starvation-free.
n The objectives of a scheduler:
¨ Batch system → maximize throughput and minimize turnaround time.
¨ Real-time system → meet the deadlines and be predictable.
n Best-effort: batch applications and analytics.
n Common algorithms for best-effort applications:
¨ Round-robin.
¨ First-Come-First-Serve (FCFS).
¨ Shortest-Job-First (SJF).
¨ Priority algorithms.
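The effect of the ordering policy on turnaround time can be seen in a small example. This sketch assumes a single server, non-preemptive execution, and jobs that all arrive at time 0; the function names are hypothetical.

```python
def average_turnaround(burst_times):
    """Average completion time when jobs run back to back in the given order.

    Assumption: all jobs arrive at time 0, so turnaround equals completion time.
    """
    clock, total = 0, 0
    for burst in burst_times:
        clock += burst        # this job completes at the current clock value
        total += clock
    return total / len(burst_times)


def fcfs(jobs):
    """First-Come-First-Serve: keep the arrival order."""
    return list(jobs)


def sjf(jobs):
    """Shortest-Job-First: run the shortest jobs first."""
    return sorted(jobs)


jobs = [6, 3, 3]
print(average_turnaround(fcfs(jobs)), average_turnaround(sjf(jobs)))
```

With these three jobs FCFS gives an average turnaround of 9.0 time units and SJF gives 7.0, since short jobs no longer wait behind the long one.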
Cloud scheduling algorithms (2/2)
n Multimedia applications (e.g., audio and video streaming)
¨ Have soft real-time constraints.
¨ Require statistically guaranteed maximum delay and throughput.
n Real-time applications have hard real-time constraints.
n Scheduling algorithms for real-time applications:
¨ Earliest Deadline First (EDF).
¨ Rate Monotonic Algorithms (RMA).
n Algorithms for integrated scheduling of several classes of
applications (best-effort, multimedia, real-time):
¨ Resource Allocation/Dispatching (RAD).
¨ Rate-Based Earliest Deadline (RBED).
[Figure: classes of applications (best-effort, soft-requirements, hard-requirements, hard real-time) placed on axes that range from loose to strict; one axis is labeled Timing.]
The transmission of packet i of flow a can only start after the packet is
available and the transmission of the previous packet has finished. Here R, S,
and F denote the arrival, start, and finish times of a packet and P its length
(transmission time).
(a) The new packet arrives after the previous one has finished:
$S_a^i(t_a^i) = R_a^i(t_a^i)$.
(b) The new packet arrives before the previous one has finished:
$S_a^i(t_a^i) = F_a^{i-1}(t_a^{i-1})$.
In both cases $F_a^i(t_a^i) = S_a^i(t_a^i) + P_a^i$.
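Both cases reduce to taking the later of the arrival time R and the previous finish time F, then adding P. The sketch below applies that rule to one flow; `schedule_flow` and its argument names are hypothetical, and the times are plain numbers rather than the scheduler's virtual times.

```python
def schedule_flow(arrivals, lengths):
    """Compute start and finish times for the packets of a single flow.

    arrivals[i] plays the role of R for packet i, lengths[i] the role of P.
    S = max(R, previous F) covers cases (a) and (b); F = S + P.
    """
    starts, finishes = [], []
    previous_finish = float("-inf")   # no packet precedes the first one
    for arrival, length in zip(arrivals, lengths):
        start = max(arrival, previous_finish)
        finish = start + length
        starts.append(start)
        finishes.append(finish)
        previous_finish = finish
    return starts, finishes


starts, finishes = schedule_flow(arrivals=[0, 5, 6], lengths=[3, 4, 2])
print(starts, finishes)
```

In this run the second packet arrives at 5, after the first finished at 3, so it starts on arrival (case a); the third arrives at 6 while the second is still in transmission, so it starts at 9 (case b).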
n The root node is the processor and the leaves of this tree are the
threads of each application.
¨ When a virtual machine is not active, its bandwidth is reallocated to the
other VMs active at the time.
¨ When one of the applications of a virtual machine is not active, its
allocation is transferred to the other applications running on the same VM.
¨ If one of the threads of an application is not runnable then its allocation is
transferred to the other threads of the application.
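[Figure: SFQ tree with weights in parentheses: VM1 (1) and VM2 (3) below the root; applications A1 (3), A2 (1), and A3 (1) below the VMs.]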
The SFQ tree for scheduling when two virtual machines VM1 and VM2 run
on a powerful server
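The reallocation rules can be sketched as a recursive split of the processor among the active children of each node, in proportion to their weights. The tree layout below is an assumption made for illustration (A1 and A2 under VM1, A3 under VM2, threads omitted), and the helper names are hypothetical.

```python
def is_active(node, active):
    """A leaf is active if it is in the active set; an inner node if any child is."""
    name, _weight, children = node
    if not children:
        return name in active
    return any(is_active(child, active) for child in children)


def shares(node, active, fraction=1.0):
    """Return {leaf: fraction of the processor} for the active leaves.

    node is (name, weight, children). The share of an inactive child is
    split among its active siblings in proportion to their weights.
    """
    name, _weight, children = node
    if not children:
        return {name: fraction} if name in active else {}
    live = [child for child in children if is_active(child, active)]
    total = sum(child[1] for child in live)
    result = {}
    for child in live:
        result.update(shares(child, active, fraction * child[1] / total))
    return result


# Hypothetical placement of the applications under the two VMs.
tree = ("cpu", 1, [
    ("VM1", 1, [("A1", 3, []), ("A2", 1, [])]),
    ("VM2", 3, [("A3", 1, [])]),
])
print(shares(tree, {"A1", "A2", "A3"}))
print(shares(tree, {"A1", "A2"}))   # VM2 idle: its share moves to VM1
```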
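[Figure. Top: timelines of threads a and b versus the real time t; surviving labels are 0, 12, 24, 36 over 12, 24, 36, 48 for one thread and 0, 3, 6, 9, 12 over 3, 6, 9, 12, 15 for thread b. Bottom: virtual time (ticks 6, 12, 24, 36) versus real time (ticks 0, 3, 15, 18, 21, 24, 36, 48, 60).]
Top → the virtual startup time and the virtual finish time as a function of the real time t
for each activation of threads a and b.
Bottom → the virtual time of the scheduler v(t) as a function of the real time.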
Borrowed virtual time (BVT)
n Objective - support low-latency dispatching of real-time applications,
and weighted sharing of CPU among several classes of applications.
n A thread i has
¨ an effective virtual time, Ei.
¨ an actual virtual time, Ai.
¨ a virtual time warp, Wi.
n The scheduler thread maintains its own scheduler virtual time (SVT)
defined as the minimum actual virtual time of any thread.
n The threads are dispatched in the order of their effective virtual time,
a policy called the Earliest Virtual Time (EVT).
n Context switches are triggered by events such as:
¨ the running thread is blocked waiting for an event to occur.
¨ the time quantum expires.
¨ an interrupt occurs.
¨ a thread becomes runnable after sleeping.
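A compact sketch of the bookkeeping behind these definitions follows. Two relations are not stated on the slide and are taken here as assumptions from the usual formulation of BVT: the effective virtual time is the actual virtual time minus the warp when warping is enabled, and the actual virtual time of a thread advances by the time it ran divided by its weight. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    weight: float
    actual: float = 0.0          # A_i, actual virtual time
    warp: float = 0.0            # W_i, virtual time warp
    warp_enabled: bool = False

    @property
    def effective(self):
        """E_i: assumed to be A_i - W_i when warping, A_i otherwise."""
        return self.actual - (self.warp if self.warp_enabled else 0.0)


def scheduler_virtual_time(threads):
    """SVT: the minimum actual virtual time of any thread."""
    return min(t.actual for t in threads)


def dispatch(threads):
    """Earliest Virtual Time: pick the thread with the smallest E_i."""
    return min(threads, key=lambda t: t.effective)


def run_quantum(thread, quantum):
    """Assumed update rule: A_i grows more slowly for heavier threads."""
    thread.actual += quantum / thread.weight


a, b = Thread("a", weight=2.0), Thread("b", weight=1.0)
for _ in range(3):
    chosen = dispatch([a, b])
    run_quantum(chosen, 6.0)
    print(chosen.name, a.actual, b.actual)
```

A thread with a large warp gets a much smaller effective virtual time than its peers and is therefore dispatched ahead of them when it wakes up, which is how BVT provides low-latency dispatching.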
[Figure: effective virtual time (vertical axis, ticks from 30 to 450) versus real time in mcu (horizontal axis, ticks from 2 to 45).]
The effective virtual time and the real time of the threads a
(solid line) and b (dotted line) with weights wa = 2 wb when
the actual virtual time is incremented in steps of 90 mcu.
[Figure: effective virtual time (vertical axis, ticks from -60 to 450) versus real time in mcu (horizontal axis, ticks from 2 to 45).]
The effective virtual time and the real time of the threads a (solid line), b (dotted line),
and c, which has real-time constraints (thick solid line). Thread c wakes up periodically
at times t = 9, 18, 27, 36, ..., is active for 3 units of time, and has a time warp of 60 mcu.
Cloud scheduling subject to deadlines
n Hard deadlines → if the task is not completed by the deadline, other
tasks which depend on it may be affected and there are penalties; a
hard deadline is strict and expressed precisely as milliseconds, or
possibly seconds.
n First in, first out (FIFO) → the tasks are scheduled for execution
in the order of their arrival.
n Earliest deadline first (EDF) → the task with the earliest deadline
is scheduled first.
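The difference between the two policies shows up when a task with a late deadline arrives first. The sketch below is a simplified illustration: it assumes one server, non-preemptive execution, and tasks that are all available at time 0 and listed in arrival order; the function names are hypothetical.

```python
def fifo(tasks):
    """Run tasks in the order of their arrival."""
    return list(tasks)


def edf(tasks):
    """Run the task with the earliest deadline first."""
    return sorted(tasks, key=lambda task: task[2])


def missed_deadlines(ordered_tasks):
    """Return the names of tasks that finish after their deadline.

    Each task is (name, duration, deadline); tasks run back to back.
    """
    clock, missed = 0, []
    for name, duration, deadline in ordered_tasks:
        clock += duration
        if clock > deadline:
            missed.append(name)
    return missed


tasks = [("t1", 4, 10), ("t2", 2, 3), ("t3", 3, 7)]
print(missed_deadlines(fifo(tasks)))
print(missed_deadlines(edf(tasks)))
```

Under FIFO the short, urgent tasks t2 and t3 wait behind t1 and miss their deadlines; under EDF all three finish in time.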
[Figure: timing diagram in which the worker nodes S1, S2, S3, ..., Sn execute for Δ1, Δ2, Δ3, ..., Δn.]
The timing diagram for the Optimal Partitioning Rule; the algorithm requires
worker nodes to complete execution at the same time. The head node, S0,
distributes the data sequentially to individual worker nodes.
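[Figure: timing diagram in which the head node S0 sends data during Γ1, Γ2, Γ3, ..., Γn and the worker nodes S1, S2, S3, ..., Sn execute for Δ1, Δ2, Δ3, ..., Δn.]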
The timing diagram for the Equal Partitioning Rule; the algorithm assigns an equal
workload to individual worker nodes.
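The two partitioning rules can be compared numerically under a simple cost model. The model is an assumption made for this sketch, not taken from the slides: workers are identical, sending a fraction α of the data takes α·comm (the Γ interval), and processing it takes α·proc (the Δ interval). Requiring all workers to finish together then gives fractions that shrink geometrically by β = proc / (comm + proc). All names are hypothetical.

```python
def finish_times(fractions, comm, proc):
    """Finish time of each worker when the head node sends data sequentially."""
    sent, times = 0.0, []
    for fraction in fractions:
        sent += fraction * comm                 # end of this worker's transfer
        times.append(sent + fraction * proc)    # end of this worker's execution
    return times


def equal_partition(n):
    """Equal Partitioning Rule: every worker gets the same share."""
    return [1.0 / n] * n


def optimal_partition(n, comm, proc):
    """Shares that make all workers finish at the same time (needs comm > 0)."""
    beta = proc / (comm + proc)
    first = (1 - beta) / (1 - beta ** n)        # makes the shares sum to 1
    return [first * beta ** i for i in range(n)]


n, comm, proc = 3, 1.0, 3.0
print(max(finish_times(equal_partition(n), comm, proc)))
print(max(finish_times(optimal_partition(n, comm, proc), comm, proc)))
```

With these hypothetical values the equal split completes at 2.0 time units, while the split with simultaneous completion ends at about 1.73, because the workers served first receive more data and no worker sits idle at the end.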