See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/267561026

SOME CONSIDERATIONS ABOUT REAL TIME MULTIPROCESSOR OPERATING SYSTEMS

Article · July 2008
DOI: 10.13140/2.1.2443.4569

Author: Moshe Pelleh, Peres Academic Center

All content following this page was uploaded by Moshe Pelleh on 27 August 2014.


Some Considerations about Real Time Multiprocessor Operating Systems
Moshe PELLEH
Computer Science Department
Holon Institute of Technology, Israel
[email protected]

Abstract

Most Real Time Operating Systems (RTOS) are designed for single-processor systems, while multiprocessor systems are being developed extensively. There is a variety of algorithms for scheduling tasks on a processor; the most common are RM, EDF and LST. They are optimal for single-processor scheduling, but anomalies (deadline misses) occur when they are used for multiprocessor scheduling. We want to develop an RTOS for multiprocessor systems, with scheduling algorithms that avoid anomalies and achieve maximum schedulable utilization. The maximum schedulable utilization is achieved for a system of periodic preemptable independent tasks (without blocking time).

1. Introduction

Each software unit which is managed or executed by the operating system is called a job or an instance.

The collection of related jobs (or instances) cooperating to execute a function is called a task.

The tasks in the system can be represented by models.

The model in which the jobs (instances) of a task are executed periodically, with a fixed time period, on a regular and continuous basis in order to execute a function (or an application) is called a periodic task.

When it is necessary to react to external random events, we use aperiodic or sporadic tasks. The release time of their instances is random and cannot be predicted.

One way to execute jobs correctly (on time) is to assign each job a priority according to its period, its importance or its criticality, and then let the scheduler (of the operating system) schedule them for execution based on their priorities.

There are a few models (or algorithms) for assigning priorities to jobs or tasks in the system.

In the fixed priority model, each task has its fixed priority, and every job (or instance of the task) always gets this same priority while it is placed in the ready jobs queue. It holds its priority until it terminates. For example, the Rate Monotonic algorithm assigns each task a priority according to its frequency: a higher frequency task gets a higher priority.

In the task level dynamic priority model, each job (of the task), once released for execution, is assigned a priority according to an algorithm, e.g. Earliest Deadline First. This priority does not change until the job terminates. We observe that the task priority changes with each of its instances.

In the job level dynamic priority model, each time the scheduler executes, it recalculates the priorities of all the jobs which are ready to run. This is done according to an algorithm, e.g. Least Slack Time first (also known as Minimum Laxity First), which assigns the newly calculated priorities to the ready jobs. Consequently, a ready or running job changes its priority every time the scheduler executes (and applies the algorithm).

Let us denote any point in time by t, the slack time of a job by S, its absolute deadline by d, and the time left to its execution termination by e. According to the LST algorithm, S = d - t - e, and a job with a smaller S gets a higher priority. The slack is the latest time (relative to t) by which the job must begin or resume its execution in order to meet its deadline. [1,9]

RM, EDF and LST are optimal algorithms for scheduling independent preemptable jobs on a single processor. [2,5,6]

When we try to use those algorithms for scheduling jobs on a multiprocessor system, some anomalies arise.

2. System Model

We use the global approach for dynamic scheduling. In this model, at every moment a single scheduler chooses jobs for execution on m independent identical processors from a single queue. This means that at every moment the m highest priority jobs are selected for execution on the m processors, and that every job can be executed by different processors at different moments (instants), but always on
one processor at the same time. Jobs are preemptable and can migrate between processors. [9,10,11,12]

All the processors use a single common shared memory. We assume that the migration time of jobs between processors (or here, the time of assigning jobs to processors) and the context switch time are negligible. This assumption holds best on a multi-core architecture, in which all the processors are attached to a single chip with a single cache: the connection between CPUs is faster, memory references are faster, and the cache coherency circuitry can operate at a higher clock rate, so that the performance of cache snoop operations is improved.

In our model we have n periodic preemptable independent tasks {τ1, τ2, …, τn} with minimum period Ti, maximum execution time Ci, initial relative deadline Di, and such that Ci,j ≤ Ti,j = Di,j for every job in every task.

Whenever a job (an instance of a task) is released and admitted, it is placed on a global common queue.

We have m identical processors such that m < n.

The utilization of a task is Ui = Ci / Ti.

The utilization of a job is Ui,j = Ci,j / Ti,j.

Let us denote by ∆Ci,j the remaining execution time of a job and by ∆Ti,j its remaining time in the period (its actual relative deadline).

Now, we define ∆Ui,j = ∆Ci,j / ∆Ti,j as the slack utilization (or actual remaining density) of a job.

3. Maximum Slack Utilization First (MSU)

The MSU algorithm for real-time scheduling of dynamic priority jobs on a multiprocessor system, using the global approach, works as follows:

Every moment do:
  Remove finished jobs from the global jobs queue.
  For each newly released job, check if it can be admitted.
  If the answer is positive, add it to the global jobs queue.
  Sort the global jobs queue in non-increasing order of ∆Ui,j.
  Select the first m jobs for execution on the m processors.
End do.

A newly released job can be admitted if the total utilization of the jobs in the global queue (including the new one which was just released) is less than or equal to m.

In this algorithm, a moment can be a clock tick of 5 milliseconds or any other fixed small clock tick.

At every moment, an executing job can migrate to (be assigned to) another processor. An executing job is a member of the global queue.

In a general form:
  m = number of processors in the system.
  n = number of jobs in the global queue.
  T = adjusted common period (common denominator) of the jobs in the global queue.
  C = execution time, adjusted to the common period, of the m jobs due to execute now.
  E = execution time, adjusted to the common period, of the remaining jobs.
  U = temporary needed utilization.

  m*C/T + (n-m)*E/T = U
  m*C + (n-m)*E = U*T
  m*C - U*T = (m-n)*E

According to the MSU algorithm, in the next clock tick we replace C with C-1 and T with T-1. The needed utilization in the next tick is at least U, that is:

  m*(C-1)/(T-1) + (n-m)*E/(T-1) >= U
  m*(C-1) + (n-m)*E >= U*(T-1)
  m*(C-1) - U*(T-1) >= (m-n)*E

But since m*C - U*T = (m-n)*E, we get:

  m*(C-1) - U*(T-1) >= m*C - U*T
  m*(C-1) - m*C >= U*(T-1) - U*T
  -m >= -U
  U >= m

Thus, the temporary needed utilization in each successive clock tick is greater than or equal to the number of processors. Alternatively, using the MSU algorithm, the processors will always be busy whenever the total utilization of the jobs in the global queue is equal to m.

The MSU algorithm is dynamic at the job level; hence it can select, at every given time, the most urgent jobs for execution. A job with a greater ∆Ci,j and/or a smaller ∆Ti,j will be selected first.

These observations imply that no job will miss its deadline.

The case in which the total utilization of the jobs in the global queue is smaller than m is obvious.

The case in which the total utilization of the jobs in the global queue is greater than m is not part of this paper.
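The per-moment loop above can be sketched in C (the language later used for the paper's simulation). This is a minimal sketch under stated assumptions: the job structure, the use of double-precision utilizations and the function names are illustrative, not the paper's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* A ready job, reduced to what one MSU moment needs: remaining execution
   time dC (delta-C) and remaining time to its deadline dT (delta-T). */
struct job {
    int id;
    double dC;   /* remaining execution time */
    double dT;   /* remaining time in the period (actual relative deadline) */
};

/* Slack utilization dU = dC / dT. */
static double slack_util(const struct job *j) { return j->dC / j->dT; }

/* qsort comparator: non-increasing dU, i.e. larger dU first. */
static int by_slack_util_desc(const void *a, const void *b) {
    double ua = slack_util((const struct job *)a);
    double ub = slack_util((const struct job *)b);
    return (ua < ub) - (ua > ub);
}

/* Admission test: a newly released job may join only if the total
   utilization of the queue, including the newcomer, stays <= m. */
int msu_admit(const struct job *q, int n, const struct job *newjob, int m) {
    double total = slack_util(newjob);
    for (int i = 0; i < n; i++) total += slack_util(&q[i]);
    return total <= (double)m;
}

/* One MSU moment: sort the global queue by dU (non-increasing) and pick
   the first m jobs; their ids are written to picked[].  Returns the number
   of jobs picked, i.e. min(m, n). */
int msu_tick(struct job *q, int n, int m, int *picked) {
    qsort(q, (size_t)n, sizeof q[0], by_slack_util_desc);
    int k = n < m ? n : m;
    for (int i = 0; i < k; i++) picked[i] = q[i].id;
    return k;
}
```

A real scheduler would update ∆Ci,j and ∆Ti,j every clock tick and resolve ties deterministically; note that qsort is unstable, so jobs with equal ∆U may be ordered arbitrarily.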
4. Anomaly Examples

4.1 The Liu Exercise [3]

Consider a system that contains five independent periodic tasks A, B, C, D, E and three identical processors P1, P2, P3. The periods of A, B and C are 2 and their execution times are equal to 1. The periods of D and E are 8 and their execution times are 6. The phase of every task is 0.

The LST algorithm (or even the Strict LST) fails to schedule it, because task A misses its deadline at T8. (See Figure 1)

The MSU algorithm schedules it correctly.

The utilization bound here is 0.5 + 0.5 + 0.5 + 0.75 + 0.75 = 3.

      T0   T1   T2   T3   T4   T5   T6   T7   T8
  P1   A    D    D    A    D    D    D    D
  P2   B    E    E    E    B    E    E    E
  P3   C    C    B    C    A    C    B    A

  Figure 1

4.2 The Dhall Effect [7]

We have m identical processors. The application system contains m+1 independent periodic tasks. The first m tasks are identical: their periods are equal to 1 and their execution times are equal to 2ε, where ε is a small number. The period of the last task τm+1 is 1+ε and its execution time is 1.

The EDF and RM algorithms fail to schedule it, because task τm+1 begins at time 2ε, terminates at time 1+2ε and misses its deadline 1+ε. (See Figure 2)

The MSU algorithm schedules it correctly. The utilization bound here is limε→0 (m*(2ε/1) + 1/(1+ε)) = 1.

  [Figure 2: each processor Pi runs its task Ti during [0, 2ε]; task Tm+1 then runs during [2ε, 1+2ε] and misses its deadline 1+ε.]

4.3 The Graham Anomaly [4,8]

A system contains 9 non-preemptable jobs named Ji, for i = 1, 2, …, 9. Their execution times are 3, 2, 2, 2, 4, 4, 4, 4, 9 respectively, their release times are equal to 0, and their deadlines are 12. J1 is the immediate predecessor of J9, and J4 is the immediate predecessor of J5, J6, J7, J8. For all the jobs, Ji has a higher priority than Jk if i < k.

The system is schedulable on 3 processors.

If the jobs are preemptable, the system is not schedulable.

If the jobs are scheduled non-preemptively on 4 processors, the system is not schedulable.

If the execution time of every job is reduced by 1, the system is not schedulable on 3 processors.

If instead of assigning fixed priorities we apply the MSU algorithm with preemption allowed, all these cases are schedulable.

5. Practical Considerations

Till now we have used our model with n independent tasks. In order to achieve independent tasks (without blocking time), we should eliminate the waiting states (for I/O completion, semaphores, mailboxes, etc.) as well as the use of critical sections in the application tasks. A mailbox between 2 tasks can be implemented by a data structure called a circular queue or ring buffer, using neither a semaphore nor a critical section. For more tasks we can use an array of ring buffers. [15] Mailboxes can be used to implement semaphores, where the data passed is a flag called a semaphore. We can substitute the pend operation (wait for message) with the accept operation, which allows a task to read the message if it is available, or immediately returns an error code if the message is not available. Moreover, we can replace mailboxes with global common variables where possible.

A typical task is normally made up of a few functions or missions. For example, a wage task is assembled from the following 3 missions: 1. Get employee data (read it from a file on disk or from a buffer), 2. Calculate the employee salary, 3. Output it (write it to a file on disk or print it). Usually, each mission executes on another processor. We can break the tasks into missions and let the RTOS schedule the missions. These missions are in fact independent tasks.
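The semaphore-free mailbox described in this section can be sketched in C as a single-producer, single-consumer ring buffer. The capacity, message type and function names are illustrative assumptions; only the accept-style, non-waiting interface follows the text.

```c
#include <assert.h>

/* Mailbox between 2 tasks as a ring buffer, as described in the text:
   no semaphore and no critical section are needed, because the producer
   task writes only `head` and the consumer task writes only `tail`.
   The capacity and message type are illustrative choices. */
#define RING_SLOTS 8u          /* power of two keeps the index math simple */

struct mailbox {
    int msg[RING_SLOTS];
    volatile unsigned head;    /* written by the producer only */
    volatile unsigned tail;    /* written by the consumer only */
};

/* Post a message; returns 0 (error) if the ring is full, instead of waiting. */
int mbox_post(struct mailbox *mb, int m) {
    if (mb->head - mb->tail == RING_SLOTS) return 0;   /* full */
    mb->msg[mb->head % RING_SLOTS] = m;
    mb->head++;                                        /* publish */
    return 1;
}

/* The "accept" operation from the text: read a message if one is available,
   otherwise return an error code immediately (no pend / waiting state). */
int mbox_accept(struct mailbox *mb, int *out) {
    if (mb->head == mb->tail) return 0;                /* empty */
    *out = mb->msg[mb->tail % RING_SLOTS];
    mb->tail++;
    return 1;
}
```

On a multiprocessor with relaxed memory ordering, the slot write must be made visible before the increment of `head` (e.g. with a store barrier, or C11 release/acquire atomics); the sketch omits this for brevity.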
6. Simulation Experiments

In order to compare the performance of the MSU algorithm with those of the RM, the EDF and the Strict LST (which runs every moment, i.e. every clock tick), we
developed a simulation in the C language (using Microsoft Visual C++). This simulation was executed on a Pentium M platform with 512 Megabytes of RAM, under the Windows XP operating system.

In the first phase we used a random generator program to create 5 sets of 10,000, 20,000, 30,000, 50,000 and 100,000 synthetic jobs, each with its period, execution time, release time and number of repetitions.

In the second phase we ran our simulated RM, EDF, Strict LST and MSU algorithms to schedule the 5 sets of jobs on 2, 3, 5, 10 and 20 simulated processors, in order to detect anomalies and elapsed execution time, assuming negligible scheduling time.

In the third phase we added clock ticks of 1, 2 and 3 milliseconds and scheduling times of 100, 200 and 300 microseconds to "our second phase system", in order to compare the scheduling overhead between the 9 possible configurations. The choice of parameters is based on the 500 MHz MPC7410 performance [14], which executes on average 1 instruction per cycle (or an instruction per 2 ns). From a strictly mathematical point of view, 20 processors execute 10 instructions per 1 ns. Considering that we have on average 1 memory reference per instruction (because of the MPC7410 RISC architecture), we have 10 memory accesses per 1 ns. Assuming 95% of memory requests are cache hits, we have only 0.5 memory accesses per 1 ns (or an access per cycle). We need a 500 MHz bus frequency and a 500 MHz memory frequency. These numbers will be achieved soon with new technologies like System on Chip architecture, switches like Motorola VXE, or buses like HyperTransport (HT), known as Lightning Data Transport (LDT), a bidirectional serial/parallel high-bandwidth computer bus, etc.

7. Summary

- We introduced some of the most well known real-time scheduling algorithms and their anomalies in the multiprocessor environment. We proposed a new algorithm (MSU) that avoids the anomalies, and we showed comparative examples and simulation experiments that demonstrate it. The disadvantage of the new algorithm is a scheduling overhead similar to that of the strict LST scheduler.
- The RM and EDF algorithms have a high rate of deadline misses, independently of the number of processors used, while the strict LST algorithm improves with the number of processors used; that is, the more processors used, the fewer deadlines missed. With 20 processors or more, the deadline miss rate is quite negligible, but deadline misses might still occur.
- The scheduling overhead is an inverse function of the number of processors used. It approaches the same value for all algorithms as the number of processors increases. (See Figure 3, based on the set of 100,000 jobs.) As Table 1 shows, this behavior is similar for any combination of clock tick and scheduling execution time chosen.
- For a multiprocessor system (as defined in paragraph 2, System Model), with 20 processors or more and hard deadlines that must be respected, it is worth considering the MSU algorithm for scheduling, because it has no anomalies (no deadline misses) and its overhead is similar to the other algorithms' overhead.
- We also showed how to implement semaphores and mailboxes with no waiting states, and how to break tasks into missions in order to create independent tasks that result in a better schedulable utilization.
- Our plan is not optimal for modules with long data-flow tasks with many dependencies between them. On the other hand, a control system with many sensors and actuators communicating with it via interrupts, and short data-flow tasks with few dependencies between them, will strongly benefit from this plan, especially when using short clock tick periods.

  [Figure 3: Scheduling overhead (%) for 2, 3, 5, 10 and 20 processors, for a 3 millisecond clock tick and 200 microsecond scheduler execution (RM, EDF, strict LST, MSU).]

References

[1] Liu, Jane W.S., Real Time Systems, Prentice Hall, 2000.
[2] Liu, Jane W.S., Real Time Systems, Prentice Hall, pp. 67-69, 129-130, 2000.
[3] Liu, Jane W.S., Real Time Systems, Prentice Hall, page 83, exercise 4.4, 2000.
[4] Liu, Jane W.S., Real Time Systems, Prentice Hall, page 83, exercise 4.5, 2000.
[5] Leung, J.Y.T., and J. Whitehead, On the complexity of fixed-priority scheduling of periodic real-time tasks, Performance Evaluation, vol. 2, pp. 237-250, 1982.
[6] Mok, A.K.L., Fundamental Design Problems of Distributed Systems for the Hard Real-Time Environment, Ph.D. thesis, MIT, 1983.
[7] Dhall, S.K., and C.L. Liu, On a real-time scheduling problem, Operations Research, vol. 26, no. 1, pp. 127-140, February 1978.
[8] Graham, R.L., Bounds on multiprocessing timing anomalies, SIAM Journal on Applied Mathematics, vol. 17, no. 2, March 1969.
[9] Pereira Zapata, O.U., and P. Mejia Alvarez, EDF and RM multiprocessor scheduling algorithms: survey and performance evaluation, CINVESTAV-IPN, Seccion de Computacion, September 18, 2004.
[10] Baruah, S., Optimal utilization bounds for the fixed-priority scheduling of periodic task systems on identical multiprocessors, IEEE Transactions on Computers, vol. 53, no. 6, pp. 781-784, June 2004.
[11] Baruah, S.K., and J. Goossens, Rate-monotonic scheduling on uniform multiprocessors, IEEE Transactions on Computers, vol. 52, no. 7, pp. 966-970, July 2003.
[12] Ripps, D., MTOS UX User's Manual, Version 3.1, I.P.I., Jericho, NY 11753, 1994.
[13] Mejia Alvarez, P., E. Levner, and D. Mosse, Power-optimized scheduling server for real-time tasks, IEEE Proceedings of the Eighth Real-Time and Embedded Technology Symposium, San Jose, California, 24-27 September 2002.
[14] Motorola, MPC7410 RISC Microprocessor User's Manual, Rev. 0, 2000.
[15] Laplante, P., Real-Time Systems Design and Analysis, IEEE Press, 2nd edition, pp. 171-172, 1997.
Table 1: Results for the set of 100,000 jobs

Algorithm  Processors  Elapsed Time  Scheduler Instances  Deadlines Missed  Deadlines Missed %  CPU Load %
RM              2         584593          160128               16023             16.02            75.72
RM              3         402375          121576               16677             16.68            77.43
RM              5         237304           96165               19207             19.21            81.67
RM             10         132925           70616               16719             16.72            76.94
RM             20          65101           46606               19136             19.14            82.10
EDF             2         650959          179009                6040              6.04            79.26
EDF             3         452866          137606                6639              6.64            81.14
EDF             5         272570          110980                9148              9.15            85.25
EDF            10         149265           80665                7783              7.78            81.46
EDF            20          75505           55390                9805              9.80            85.57
LST             2         664521          664521                2141              2.14            79.31
LST             3         459944          459944                1925              1.93            81.38
LST             5         279329          279329                1910              1.91            85.56
LST            10         152831          152831                 262              0.26            81.83
LST            20          77827           77827                 109              0.11            86.08
MSU             2         695097          695097                   0              0.00            78.74
MSU             3         476821          476821                   0              0.00            80.95
MSU             5         286679          286679                   0              0.00            85.12
MSU            10         159607          159607                   0              0.00            79.94
MSU            20          79242           79242                   0              0.00            86.30

Scheduler Overhead (%)
(columns: scheduler execution time of 100, 200 and 300 microseconds, for clock ticks of 1, 2 and 3 ms)

                         1 ms clock tick         2 ms clock tick         3 ms clock tick
Algorithm  Processors   100us  200us  300us     100us  200us  300us     100us  200us  300us
RM              2       1.40   2.70   4.10      0.70   1.40   2.10      0.50   0.90   1.40
RM              3       1.00   2.00   3.00      0.50   1.00   1.50      0.30   0.70   1.00
RM              5       0.80   1.60   2.40      0.40   0.80   1.20      0.30   0.50   0.80
RM             10       0.50   1.10   1.60      0.30   0.50   0.80      0.20   0.40   0.50
RM             20       0.40   0.70   1.10      0.20   0.40   0.50      0.10   0.20   0.40
EDF             2       1.40   2.70   4.10      0.70   1.40   2.10      0.50   0.90   1.40
EDF             3       1.00   2.00   3.00      0.50   1.00   1.50      0.30   0.70   1.00
EDF             5       0.80   1.60   2.40      0.40   0.80   1.20      0.30   0.50   0.80
EDF            10       0.50   1.10   1.60      0.30   0.50   0.80      0.20   0.40   0.50
EDF            20       0.40   0.70   1.10      0.20   0.40   0.60      0.10   0.20   0.40
LST             2       5.00  10.00  15.00      2.50   5.00   7.50      1.70   3.30   5.00
LST             3       3.30   6.70  10.00      1.70   3.30   5.00      1.10   2.20   3.30
LST             5       2.00   4.00   6.00      1.00   2.00   3.00      0.70   1.30   2.00
LST            10       1.00   2.00   3.00      0.50   1.00   1.50      0.30   0.70   1.00
LST            20       0.50   1.00   1.50      0.30   0.50   0.70      0.20   0.30   0.50
MSU             2       5.00  10.00  15.00      2.50   5.00   7.50      1.70   3.30   5.00
MSU             3       3.30   6.70  10.00      1.70   3.30   5.00      1.10   2.20   3.30
MSU             5       2.00   4.00   6.00      1.00   2.00   3.00      0.70   1.30   2.00
MSU            10       1.00   2.00   3.00      0.50   1.00   1.50      0.30   0.70   1.00
MSU            20       0.50   1.00   1.50      0.30   0.50   0.70      0.20   0.30   0.50

