2004 SystemC Kernel Extensions For Heterogeneous System Modeling
by
Hiren D. Patel
Virginia Polytechnic Institute and State University,
Blacksburg, VA, U.S.A.
and
Sandeep K. Shukla
Virginia Polytechnic Institute and State University,
Blacksburg, VA, U.S.A.
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic,
mechanical, recording, or otherwise, without written consent from the Publisher
To my friends
Tushar Saxena,
Arush Saxena
and their families
Sandeep K. Shukla
Contents
Dedication v
List of Figures xi
List of Tables xiii
Foreword xv
Preface xix
Acknowledgments xxxi
1. INTRODUCTION 1
1 Motivation 1
2 System Level Design Languages and Frameworks 2
3 Our Approach to Heterogeneous Modeling in SystemC 8
4 Main Contributions of this Book 10
2. BACKGROUND MATERIAL 13
1 System Level Modeling and Simulation Methods 13
2 Models of Computation and Heterogeneous Modeling at System Level 14
3 Ptolemy II: A Heterogeneous Modeling and Simulation Framework 15
4 SystemC: Language and Framework 19
5 Implemented Models of Computation 21
3. SYSTEMC DISCRETE-EVENT KERNEL 31
1 DE Simulation Semantics 31
2 Implementation Specifics 33
3 Discrete-Event Simulation Kernel 34
4 Example: DE Kernel 37
“There never were in the world two opinions alike, no more than two
hairs or two grains; the most universal quality is diversity”
Montaigne, 1533-1592
We have seen a rapid growth of interest in the use of SystemC for
system level modeling since the language first emerged in the late 1990s.
As the language has evolved through several generations of capability,
designers and researchers in a wide variety of systems and semiconductor
design houses, and in many academic institutions, have built models of
systems or partial systems and explored the capabilities and limitations
of SystemC as a base on which to build system-level models. Many
of these models have been used as part of the specification and design
of systems which have subsequently been put into production; in these
cases, we can justifiably conclude that SystemC-based models have seen
“tape-outs” even though the language has been used for answering early,
high-level, system modeling and configuration questions, rather than as
the language of detailed implementation.
Another key role that has been played by the SystemC community has
been to experiment with many concepts of system level modeling using
SystemC as a base. These experiments have included adding mixed-signal, continuous-time model solving to the discrete-time concepts of SystemC; developing various notions of transaction-level modeling at intermediate abstraction levels between untimed functional models and time-, signal- and pin-accurate RTL models; connecting SystemC to network simulators; developing synthesis capabilities; and many other experiments.
efficient for systems such as synchronous data flow that can be statically
scheduled. Finding, implementing, and sharing improved methods and
algorithms within the context of the SystemC community is an excellent
way of meeting the challenge laid down by the reference simulator. This
monograph is a concise summary of how that challenge has been met
by the authors. It will thus be of interest to anyone wanting to extend
SystemC for more efficient multi-domain simulation, or commercial or
internal tools groups wanting to build domain-specific SystemC based
simulators that have improved performance.
In their introduction, the authors give an overview of system-level
design, system level modeling, models of computation, and their ap-
proach to heterogeneous modeling and the major contributions of their
research. This introduction is a useful and quick summary of the thrust
of the research, and serves to motivate the goals of the work as a whole.
This is followed by two key chapters that flesh out the details behind system level modeling approaches and give an analysis of the SystemC discrete-event kernel. They review multi-domain system modeling examples such as Ptolemy II, to illustrate the notion of multiple models of computation, and discuss the details of the SystemC modeling and simulation approach.
The next several chapters are the heart of the research, covering three
different domain models and their implementation by strategic modi-
fications to the SystemC discrete event kernel to allow multiple MoC
domains. These include synchronous data flow, Hoare’s communicating
sequential processes (CSPs), and finite state machines. Each of these
modified kernels is described using extensive examples of implementa-
tion code, discussed in detail in each chapter. Where there is insufficient
room to publish the complete code, and for the convenience of the read-
ers and experimenters, reference is given to the FERMAT lab web site,
where the full source listings have been made available.
The remaining chapters cover diverse topics: an API to the modified simulation kernel; several heterogeneous system model examples, building towards one that involves the CSP, FSM, SDF and DE kernels; and measured efficiency results, which indicate a baseline of the speedup possible through such methods and provide a floor for what is achievable in optimizing domain-specific SystemC. It is notable that those MoCs that are amenable to static scheduling, such as SDF, show by far the greatest speedup, as one would expect.
Finally, two appendices provide a discussion of the QuickThreads
coroutine execution model in the SystemC kernel, and notes on configur-
ing and building the multiple kernel models. A useful set of references to
tion was proposed. The Ptolemy project has been built around this notion of a variety of models of computation, including sequential models such as FSM, discrete-time and continuous-time models, as well as models of interacting and communicating entities such as CSP. Ptolemy II undoubtedly popularized the idea of viewing a complex embedded system as a collection of models that may belong to distinct models of computation (called domains in Ptolemy), and the idea of creating a framework in which efficient co-simulation of models from distinct domains is possible.
Another important work on the use of models of computation in abstracting the functionality of complex heterogeneous systems was done in the context of the ForSyDe project [15, 32] in Sweden, by Axel Jantsch and his group. We distinguish this work from the Ptolemy group's work as the difference between a denotational and an operational view of models of computation. Of course, such a distinction has not been made earlier, but to avoid confusion among readers over these two different classifications of models of computation, we provide a structure for the different systems of classification.
A denotational view looks at a model of computation as a set of
process constructors, and process composition operators, and provides
denotational semantics to the constructors and operators. In [32] Axel
Jantsch builds a classification based on the information abstraction per-
spective. The main models of computation in this view are
1 Untimed Model of Computation: the timing information is abstracted away, so no knowledge of the time taken for computation or for communication between computing processes is assumed. Since this MoC makes no assumptions about time, its process constructors can be used to build processes that capture the computation carried out and the data to be transferred, without any regard to how much time either takes.
between processes within the same domain. The processes within the
domain communicate via rendez-vous protocol, and the communica-
tion is synchronous.
Discrete Event Simulation: in this domain, computation is carried out in cycles, and each cycle consists of a fixpoint computation over a selected subset of variables, which constitute the so-called 'combinational' part of the system; this fixpointing is done through cycles known as 'delta cycles'. Communication in this domain is not specifically distinguished.
As one can see from the listing of these MoCs and their brief descriptions, they are classified according to their computational and communication characteristics from an operational perspective. One can also see the orthogonality: for example, Communicating Sequential Processes can be untimed, timed, or even clock synchronous, as long as the rendez-vous protocol is maintained in the inter-process communication. Similarly, a Finite State Machine can be timed, untimed, or clock synchronous.
Hiren D. Patel
Sandeep K. Shukla
Acknowledgments
We would like to thank Sankar Basu and Helen Gill from the National
Science Foundation for their encouragement and support for building
the FERMAT lab under the NSF CAREER award CCR-0237947 which
served as the home for the project work reported in this book. A part of
the support for this work also came from an NSF NGS grant 0204028.
The graduate studies of Hiren D. Patel are also supported by an SRC
grant from the SRC Integrated Systems Program. Thanks are due to
Professor Rajesh Gupta from the University of California at San Diego
whose continuous encouragement and collaboration helped us along the
way in this project.
We would also like to thank Mark de Jongh from Kluwer Academic
Publishers for working with us and helping to ensure a fast turnaround
time for the book. We also thank Cindy Zitter from Kluwer, for all her
timely help with administrative issues.
We are grateful to Grant E. Martin, Chief Scientist, Tensilica, for
encouraging the project with his foreword for the book, and providing
us with feedback as and when we requested.
The students of FERMAT Lab at Virginia Tech, especially Deepak
Abraham Mathaikutty, Debayan Bhaduri, Syed Suhaib and Nicholas
Uebel have helped the project in different ways. The graduate students
of Computer Synthesis class at Virginia Tech in the Spring of 2003, and
Spring of 2004 had to endure some of this research material in the course,
and we thank them for their enthusiasm about this research.
We thank Edward A. Lee and Stephen Neuendorffer from the Ptolemy II group at the University of California at Berkeley for their inputs on many occasions on the idea of heterogeneous modeling in SystemC. We also thank the Ptolemy II project for being a source of guidance for various nomenclatures and examples used in this project.
INTRODUCTION
1. Motivation
(Figure 1.2: possible modeling mistakes plotted against modeling fidelity, for SystemC with DE + C++ for all other MoCs, DE + SDF + C++ for all other MoCs, DE + SDF + FSM + C++ for all other MoCs, and DE + SDF + FSM + CSP + C++ for all other MoCs.)
Figure 1.2 displays the difficulty of expressing MoC-specific behaviors versus the possible modeling errors made by designers. Users restricted to the DE semantics of SystemC lack facilities to model MoCs distinct from DE, requiring user-level manipulations to obtain the correct behavior of these MoCs. Because of these manipulations, designers may suffer an increased number of modeling errors before achieving correct operation of a model. Hence, Figure 1.2 shows that SystemC with its DE kernel and the full functionality of C++ provides the most modeling expressiveness, but also invites more modeling mistakes. Conversely, modeling expressiveness is significantly reduced when employing MoC-specific kernels for SystemC, but the possible errors are also reduced.
We have therefore chosen the term 'fidelity' as a qualifying attribute for modeling frameworks. Restricting SystemC users to only the implemented MoC-specific structures and styles prevents them from using every feature afforded by free use of C++, but such a restricted framework offers higher fidelity. Fidelity here refers to the capability of the framework to faithfully model a theoretical MoC. It is necessary to move the simulation semantics of such a language away from the semantics of purely Discrete-Event based hardware models; instead, the semantics should provide a way to express and simulate other Models of Computation. The inspiration for such a system is drawn from the success of the Ptolemy II framework in the specification and simulation of heterogeneous embedded software systems [25]. However, since SystemC is targeted to be the language of choice for semiconductor manufacturers as well as system designers, its goals are more ambitious than those of Ptolemy II. Our focus is on developing an extension to the SystemC kernel, and on proposing our extended SystemC as a possible heterogeneous SLDL. We demonstrate the approach by showing language extensions to SystemC for the Synchronous Data Flow (SDF), Communicating Sequential Processes (CSP) and Finite State Machine (FSM) MoCs. A common use of SDF is in Digital Signal Processing (DSP) applications that require stream-based data models; CSP is used for exploring concurrency in models; and FSMs are used for controllers in hardware designs. Besides the primary objective of giving designers better structure for expressing these MoCs, the secondary objective is to gain simulation efficiency by modeling systems in a manner natural to their behavior.
framework also has to be matched to allow for correct and efficient validation of the system via simulation. The framework must provide sufficient behavior to represent the system under investigation, and must also provide a natural modeling paradigm to ease designer involvement in constructing the model. Consequently, many companies develop proprietary modeling and simulation frameworks specific to their purpose, within which their products are modeled, simulated and validated. This makes simulation efficiency a gating factor in reducing the productivity gap. Proprietary solutions often lead to incompatibility when various parts of a model are reusable components purchased or imported as soft-IPs. Standardization of modeling languages and frameworks alleviates such interoperability issues.
To make readers aware of the distinction between a modeling guideline and an enforced modeling framework, we present a pictorial example of how SDF models are implemented in [23, 45] (Figure 1.3) using SystemC's DE kernel. This example shows a Finite Impulse Response (FIR) model with sc_fifo channels, in which each process is of SC_THREAD() type. The model employs the existing SystemC Discrete-Event kernel and requires handshaking between the Stimulus and FIR blocks, and between the FIR and Display blocks. The handshaking dictates which blocks execute, allowing the Stimulus block to prepare sufficient data for the FIR to perform its calculation, followed by a handshake communication with the Display block.
(Figure 1.3: the FIR model as dynamically scheduled SC_THREAD processes communicating over sc_fifo channels.)
The same example with our SDF kernel is shown in Figure 1.4. This model uses SDFports, created specifically for this MoC, for data passing instead of sc_fifos, and no synchronization is required because static scheduling is used. The model with the SDF kernel abandons any need for handshaking communication between the blocks, and uses a scheduling algorithm at initialization time to determine the execution schedule of the blocks.
(Figure 1.4: the Stimulus, FIR and Display blocks connected through SDF ports and SDF channels.)
(Figure: a heterogeneous model in which a Controller IP (Button Control and Distortion Control) in the Discrete-Event MoC interfaces with an Analyzer IP (Filter and Analyzer) in the Data Flow MoC, exchanging tokens at rates of 1 and 4096.)
BACKGROUND MATERIAL
not allow flexibility in creating an object oriented design and limits itself
to mainly RTL and block level models.
(Figure: the system-level design flow, with a design phase of design space exploration, simulation and a model at system level, followed by an implementation phase of hardware/software partitioning, software coding, synthesis, routing and layout.)
and timing since these are integral requirements for embedded systems
that need simultaneous and concurrent operations. Every director provides the notion of a domain in Ptolemy II. To mention a few, Ptolemy II has DE and SDF directors that impart Discrete-Event and Synchronous Data Flow behaviors respectively to the models in their domains. A component-based design approach is used, where models are constructed as a collection of interacting components called actors. Figure 2.2 shows a diagram depicting the actor component and the interaction of a collection of components. Actors in
below taken from [6]. For a full list of MoCs implemented as domains in
Ptolemy II, please refer to [6].
Although SystemC does not have a mature GUI for system-level modeling, besides the efforts in CoCentric System Studio [64], we show in this book that SystemC can also be extended to support MoCs through appropriate kernel development.
Options for the type of processes that can be used in SystemC are SC_METHOD(), SC_THREAD() and SC_CTHREAD().
The SC_CTOR(...) macro defines the constructor for a module, which can register processes of the above three types. The sensitivity list dictates when the module executes upon events on signals/channels. Every primitive signal and channel generates events to indicate that the particular signal/channel has been updated. When events occur on signals or channels in the sensitivity list, the process is expected to fire. In the case of SC_CTHREAD() processes, it is necessary to define a clock edge on which the thread will execute. A user can define more than one kind of process and bind each to its appropriate entry function [Listing 2.1, Line 22]. [Listing 2.1, Line 16] binds entry to an SC_THREAD() process. This means that one instance of SC_MODULE(...) can contain more than one entry function of varying process types. These entry functions behave according to the process type bound to them.
Every SC_MODULE(...) can have port declarations such as happy_port and sad_port [Listing 2.1, Lines 8-9]. Channels/signals are used to connect one module to another via these ports. Input ports such as sc_in<...> receive input from the signals, and output ports sc_out<...> transmit data onto the signals. The ports, like the signals, are templated, allowing their types to vary. The main top-level module has to create signals using the sc_signal<...> declaration, and these must be connected to the corresponding module ports. During the evolution of SystemC, design alterations in terms of functionality, semantics and syntax are expected. For a start, a newer version of SystemC is expected to deprecate the SC_CTHREAD() process entirely, along with wait_until() function calls. We are aware of these expected changes, but specifically adhere to the SystemC standard set by version 2.0.1.
SystemC does not provide a graphical user interface from which one can construct models, though CoCentric System Studio [64] is an attempt at providing this. Comparing with Ptolemy II, a module can be related to an actor, the channels to signals or channels in SystemC, and the directors to the SystemC kernel. Since SystemC extends the C++ language using classes, macros and similar C++ techniques, almost any functionality can be added to SystemC models, allowing construction of most MoCs. However, at this point, this comparison is only valid when
The example in Figure 2.3 is suitable for the SDF paradigm because each function block should execute only when it has enough inputs on its arcs. This is an ideal situation for static scheduling of the model. The current SystemC DE kernel does not allow static scheduling of a system like that in Figure 2.3. However, modeling or designer guidelines can be used to model such a system. This forces an SDF-like system onto a DE kernel, reducing the simulation efficiency of the model.
Notice that in this example there are no wait() calls, even though this is an SC_THREAD() process. Interestingly, this does not breach the modeling paradigm, due to the use of sc_fifo channels [Listing 2.2, Lines 3-4]. Synchronization need not be explicit in SDF models when using sc_fifo channels.
five chairs, one assigned to each philosopher, and five forks laid down on the table. In the middle of the table is a big spaghetti bowl that is replenished continuously (so there is no shortage of spaghetti).
(Figure: the five philosophers PHIL0-PHIL4 arranged around the table, with forks Fork0-Fork4 placed between adjacent philosophers; each philosopher shares a toLeft/toRight fork with each of his neighbors.)
PHIL_i = think_i → requestseat_i → getfork^i_{i-1} → getfork^i_{i+1} → eat_i
       → dropfork^i_{i-1} → dropfork^i_{i+1} → relinquishseat_i → PHIL_i
(fork indices are taken modulo 5).
DP = ||_{i=0}^{i=4} (PHIL_i || FORK_i)
where ||_{i=0}^{i=4} is an obvious indexed form of the parallel composition operator.
Unfortunately, as explained in [28], this solution can deadlock when all the philosophers decide to ask for the fork on their left at the same time. This necessitates an arbitrator, and a solution is given in the form of a footman, invented by C. S. Scholten. In our examples, we use this footman-based solution to illustrate finite state machine control of the dining philosopher system. Here we describe in CSP notation how the footman works. Basically, the footman receives requests for a seat from hungry philosophers, and grants such a request only if fewer than four philosophers are eating at the time. This avoids the deadlock scenario described above. We describe the CSP notational form of the footman, following [28], with a mutually recursive definition. Let us denote the footman as FOOT_n^S for n = 0..4, such that FOOT_n^S denotes the behavior of the footman when n philosophers are seated and the set S contains the indices of those who are seated. The alphabet of FOOT_n^S is given by ∪_{i=0}^{i=4} {requestseat_i, relinquishseat_i}, and hence those are the actions of PHIL_i which need to synchronize with the footman. Now we describe the CSP terms for FOOT_n^S as follows:
FOOT_0^{} = |_{i=0}^{i=4} (requestseat_i → FOOT_1^{{i}})

FOOT_1^{{i}} = (|_{j≠i} (requestseat_j → FOOT_2^{{i,j}}))
             | (relinquish_i → FOOT_0^{})

FOOT_2^{{i,j}} = (|_{k≠i,j} (requestseat_k → FOOT_3^{{i,j,k}}))
               | (|_{l∈{i,j}} (relinquish_l → FOOT_1^{{i,j}−{l}}))

FOOT_3^{{i,j,k}} = (|_{l≠i,j,k} (requestseat_l → FOOT_4^{{i,j,k,l}}))
                 | (|_{x∈{i,j,k}} (relinquish_x → FOOT_2^{{i,j,k}−{x}}))

FOOT_4^{{i,j,k,l}} = |_{x∈{i,j,k,l}} (relinquish_x → FOOT_3^{{i,j,k,l}−{x}})
(Figure 2.6: a two-state FSM with states Even and Odd; input 1 toggles between the states, and input 0 leaves the state unchanged.)
shown in Figure 2.6. The transitions from Even to Odd occur only when the input is 1; if the input is 0, the FSM remains in its current state, satisfying the parity-bit checker. We elaborate our implementation of the FSM kernel in SystemC in Chapter 7.
Note that formally an FSM can be either a Mealy-style FSM or a Moore-style FSM. In the parity-checker example, the output is determined solely by the current state of the machine, and hence it is a Moore-style FSM. We often need Mealy-style FSMs as well. In the current version of the FSM kernel we have not made any distinction between the two, and our FSM structure can support both.
We end this section with the example of the footman discussed in the
previous section as a Moore style FSM.
(Figure 2.7: the footman as a five-state Moore FSM, state0 through state4; the guard seatsTaken >= 4 keeps any further requestseat transitions in state4.)
Figure 2.7 shows five states, of which the initial state is state0. A requestseat from a philosopher causes the FSM to take a transition to state1, allowing the philosopher to sit at the table. This continues through state2 and state3. Once the FSM takes the transition to state4, four seats have already been taken and no more philosophers will sit down to eat. Hence, any further requestseat results in returning to the same state4. After one of the philosophers is done eating, the FSM takes the transition back to state0. For details on the workings of the footman, we refer the reader to the earlier section discussing the CSP Dining Philosophers example.
Chapter 3
1. DE Simulation Semantics
Modification of the existing SystemC Discrete-Event (DE) kernel re-
quires study of the reference implementation source code. This section
provides an implementation overview of the Discrete-Event kernel in
SystemC. We provide code fragments as well as pseudo-code to identify
some aspects of the source code with the assumption that the reader has
basic application-level design knowledge of SystemC. For details on the
QuickThread [35] implementation in SystemC refer to Appendix A.
(Figure: the Discrete-Event kernel's stages. The Initialization stage fills the set of ready-to-run processes; the Evaluate stage runs them, with notify() causing immediate notification; the Update stage processes update events, with notify(SC_ZERO_TIME) causing delayed (delta event) notification and notify(n) causing timed notification as simulation time progresses.)
2. Implementation Specifics
The original DE kernel is a queue-based implementation where the
primary data-structure used is a queue. Most of the code fragments are
extracted from the sc simcontext.cpp file present in the kernel source
directory.
The simulation starts once the sc_start(...) function is called from the modeler's program code. This executes the kernel by proceeding through the initialization phase, where several flags are set and the coroutine packages are prepared for use. Then, all SC_METHOD() and SC_THREAD() processes are made runnable. The runnable queues and process lists are merely lists of pointer handles. Listing 3.1 shows the vector lists used to store pointers to the SC_METHOD(), SC_THREAD(), and SC_CTHREAD() processes. m_delta_events is the list that holds the addresses of the events generated as delta events. The types of the lists used in the kernel are typedefed as shown in Listing 3.1. Using these typedef types, three lists are instantiated for the corresponding SC_METHOD(), SC_THREAD(), and SC_CTHREAD() processes as private members of the class sc_process_table, whereas m_delta_events is simply an sc_pvector<...> list.
We first present the pseudo-code for the simulation kernel, followed by an explanation of the functions and structures present. The pseudo-code is shown in Listing 3.2.
4. Example: DE Kernel
In an effort to clarify the Discrete-Event kernel, we provide an example that exercises some of its properties, such as immediate and delayed notification, as well as the scheduling of all processes.
Listing 3.3 displays the primary entry functions and the SC_METHOD() and SC_THREAD() definitions. We provide a brief description of the processes in this example and how they should function according to this simple program.
P1_meth is an SC_METHOD() process that updates the local variable temp with the value read from signal s2 [Listing 3.3, Line 23]. This value is incremented and written to signal s1, after which an event is scheduled on event e2 after 2 nanoseconds (ns). The signal s1 gets updated during the current delta cycle's Update phase, and the value is ready on that signal at the end of that delta cycle. That completes the execution of the SC_METHOD(). However, P1_meth will only fire when there is a change on signal s2, due to its sensitivity list definition. This process is declared with dont_initialize(), specifying that it should not be executed during the initialization phase.
The remaining processes are all of SC_THREAD() type. P4_thrd and P5_thrd are straightforward in that they wait on events e1 and e2 respectively [Listing 3.3, Lines 46 and 53]. Upon receiving an event, the rest of the code segment is executed until the wait(...) statement is encountered again. At a wait statement, the executing process suspends, allowing other processes to execute. The sole purpose of these two SC_THREAD()s is to print to the screen when the event is notified, along with helper functions identifying the changes on signals. P3_thrd is similar in that it waits for events, except that in this case it waits for either event to be notified.
P2_thrd has an internal counter called local_counter that initializes the s2 signal to -1 during the initialization stage, and increments from there on, writing to the same signal [Listing 3.3, Line 29]. After the increment is done, an immediate notification is sent on event e1. This
15] statement forces the module to proceed with execution only once the condition inside watching(...) is satisfied, which in this case is reset being true. This mechanism ensures that the reset waits until four clock cycles are complete, as shown in Listing 3.4, before the FIR block is executed. Dynamic scheduling is performed on this system with the help of control signals that direct the flow of tokens from the Stimulus to the FIR and finally to the Display block.
Chapter 4
(Figure 4.1: the general class hierarchy, from baseReceiver at the top down to the element class.)
This chapter informs the reader about the organization of our kernel implementation. During the development of alternative kernels for SystemC, several implementation hierarchies and data structures were investigated. In this book we have not made an effort to unify them. This book presents a snapshot of the current status of the project, so that other interested developers can use the concepts and ideas to develop their own multi-MoC kernels. Once the reader goes through the Synchronous Data Flow, Communicating Sequential Processes and Finite State Machine chapters, a distinct difference in class hierarchy from the SystemC kernel development can be noticed. We expect further gradual changes in the implementation hierarchy as we improve our SystemC kernel implementations. Nonetheless, we propose a hierarchy that allows for an extensible design for multi-MoC modeling. We simply lay a foundation that can support this, but do not currently employ it to its maximum potential.
In general, the class hierarchy resembles the class diagram shown in Figure 4.1. Some of the terminology used in Figure 4.1 is borrowed from [25]. It is not necessary to strictly conform to this class hierarchy, because some implementations do not require such a class structure and some require more encapsulation. The terminology used is as follows:
Kernel: A class that allows for creation and simulation of multiple in-
stances of a model represented by a specific MoC.
Node: A representation of a specific function block that exhibits the behavior specified by the MoC. For example, a CSPnode represents a CSP process by encapsulating an instance of CSPReceiver.
Receiver: An encapsulation class to separate the data structure of an
MoC from its communication with the designer and MoC-specific
nodes. The common functionalities can be derived from a baseRe-
ceiver class.
baseReceiver: A class that encapsulates common functionalities and data structures employed by a receiver. Examples are the queues used in the DE and CSP MoCs as runnable lists, and the graph structures used in representing SDF and CSP models.
Element: The most deeply embedded of all the classes in the hierarchy, an element class defines a structure that aids in creating the main data structure used to construct a model for an MoC.
This class hierarchy only provides minimal organization for developing the additional kernels and classes for encapsulation and functionality. Additional classes, if required, should be added for better object-oriented programming practice.
The CSP kernel class diagram in Figure 4.2 illustrates the CSP ker-
nel implementation loosely employing the general implementation class
hierarchy.
A CSP model is best represented as a graph. This graph is represented in CSPReceiver by a list-based data structure built with the Standard Template Library (STL).
Few words about Implementation Class Hierarchy 47
(Figure 4.2: the CSP kernel class diagram, relating baseReceiver, CSPReceiver, CSPnode, CSPnodelist, CSPelement, CSPkernel, sc_thread_process and sc_domains.)
(Figure: the FSM kernel class diagram, relating baseReceiver, FSMReceiver, FSMkernel, sc_method_process and sc_domains.)
MoC-Specific Ports
(Figure 4.4: the port class hierarchy rooted at sc_moc_port<T>.)
Figure 4.4 describes a class hierarchy for ports that accommodates multi-
ple MoC communication. Basic functionality of the ports is implemented
in sc moc port class and specializations are implemented in the derived
classes. All the derived port classes are also polymorphic by making
them template classes.
Listing 4.1 shows the class declaration and definition for sc_moc_port. The private data member of this class is a pointer to an sc_moc_channel object called port. This variable addresses the channel that is bound to this port. The roles of the member functions are shown in Table 4.1. Most of the generic roles of the port are implemented in the base class. If specific functionality is needed for a particular port or channel, it can be added in the derived class.
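As a rough illustration of this base/derived split, the following is a minimal standalone sketch of what sc_moc_port might look like; the member names other than port, and the operator() binding idiom, are our assumptions, not the book's exact code.

```cpp
#include <cstddef>

// Stand-in for sc_moc_channel; the real class carries MoC-specific
// storage (see Figure 4.5).
template <class T>
class sc_moc_channel {
public:
    virtual ~sc_moc_channel() {}
};

// Sketch of sc_moc_port: the base class keeps only the pointer to the
// bound channel; MoC-specific derived ports would add specialized
// behavior on top of this.
template <class T>
class sc_moc_port {
public:
    sc_moc_port() : port(NULL) {}
    // bind the port to a channel (operator() is the usual SystemC idiom)
    void operator()(sc_moc_channel<T>& ch) { port = &ch; }
    bool bound() const { return port != NULL; }
protected:
    sc_moc_channel<T>* port;  // channel bound to this port
};
```

A derived class such as an SDF port would then reuse the stored pointer and add only its MoC-specific read/write behavior.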
MoC-Specific Channels
Similarly, the channels for these MoC-specific ports follow the hierarchy displayed in Figure 4.5. The base class is sc_moc_channel, from which SDFchannel, CSPchannel and FSMchannel are all derived. Basic channel functions are implemented in the base class sc_moc_channel and MoC-specific channel implementations are contained in their respective derived classes. The SDFchannel and FSMchannel are used to transport data.
[Figure 4.5: MoC-specific channel class hierarchy rooted at sc_moc_channel<T>]
We provide SDF examples that employ the SDF ports and channels at our website [36].
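The base/derived channel split described above can be sketched as follows. The FIFO storage member is an assumption for illustration; the book states only that the base class holds the basic channel functions and that SDFchannel and FSMchannel transport data.

```cpp
#include <queue>

// Hedged sketch of the hierarchy in Figure 4.5: common behavior in
// sc_moc_channel, MoC-specific behavior in the derived classes.
template <class T>
class sc_moc_channel {
public:
    virtual ~sc_moc_channel() {}
    void push(const T& v) { store.push(v); }            // generic insert
    T pop() { T v = store.front(); store.pop(); return v; }
    bool empty() const { return store.empty(); }
protected:
    std::queue<T> store;  // untimed FIFO storage (our assumption)
};

// Derived SDF channel: under static SDF scheduling no extra
// synchronization is needed, so the base behavior suffices; shown
// only to mirror the hierarchy.
template <class T>
class SDFchannel : public sc_moc_channel<T> {};
```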
baseReceiver class
baseReceiver currently holds only the receiver type, indicating whether an FSMReceiver or a CSPReceiver has been derived. However, the usage of this base class can be extended to encompass common data structures and helper functions. One such use of the baseReceiver is to implement the data structure required to represent MoCs that need a graph construction. We consider a graph-like structure for SDF, FSM and CSP, and currently we employ individual representations for each kernel, shown in Figure 4.6. Our current implementation does not yet unify these commonly used data structures in the baseReceiver class; we leave that as future work.
[Figure 4.6: graph structures used by the kernels — an FSM state machine (states S0–S3, transitions T0–T5, with signals A and B cycling through green, yellow and red) and a CSP dining-philosophers graph (PHIL0–PHIL4)]
2. Integration of Kernels
Kernel integration is a challenging task, especially for kernels based on MoCs whose simulation semantics differ from the existing DE scheduler semantics of SystemC. The MoC implementation chapters discuss the addition of each MoC to SystemC and document the integration of these different MoCs. MoCs such as SDF and FSM are easy to integrate with the reference implementation and with each other, whereas CSP requires an understanding of QuickThreads [35] and the coroutine packages in SystemC. Integration of the SDF and FSM kernels is relatively straightforward, requiring minor additions to the existing source using Autoconf/Automake [21, 22]. In Appendix B we describe a method of adding newly created classes to the SystemC distribution using Autoconf/Automake. This approach is used for all MoC integration. However, the CSP kernel integration is non-trivial; we describe it in detail in Chapter 6.
Chapter 5
SYNCHRONOUS DATA FLOW KERNEL
IN SYSTEMC
1. SDF MoC
This chapter describes our implementation of the Synchronous Data
Flow (SDF) kernel in SystemC. We present code fragments for the SDF
data structure, scheduling algorithms, kernel manipulations and designer
guidelines for modeling using the SDF kernel along with an example.
The SDF MoC is a subset of the Data Flow paradigm [32]. This paradigm dictates that a program is divided into blocks and arcs, representing functionality and data paths, respectively. The program is represented as a directed graph connecting the function blocks with data arcs. Figure 5.1, taken from [53], shows an example of an SDF graph (SDFG). An SDF model imposes further constraints: a block may be invoked only when sufficient input samples are available to carry out its computation, and blocks with no input data arcs can be scheduled for execution at any time.
In Figure 5.1, the numbers at the head and tail of an arc represent the production rate and consumption rate of the connected blocks, respectively, and the number in the middle is an identification number for the arc, which we call its arc label. An invoked block consumes a fixed number of data samples from each input data arc and likewise emits a fixed number of data samples on each output data arc. The input and output data rates of each data arc of a block are known prior to the invocation of the block, and the arcs behave as infinite FIFO queues. Note that we use the terms function blocks, blocks and nodes interchangeably to refer to the blocks of code symbolized in Figure 5.1 by A, B, C, D, E, F and G.
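The information attached to each arc can be captured in a small record. The following sketch is ours, not the kernel's data structure (that is the edges class introduced later); it is only meant to make the balance condition behind SDF scheduling concrete: over one schedule period, the tokens produced on an arc must equal the tokens consumed.

```cpp
// Hypothetical arc record for an SDFG like Figure 5.1; field names are
// ours, not the kernel's.
struct SDFArc {
    int label;    // arc identification number (arc label)
    int src;      // producing block
    int dst;      // consuming block
    int prod;     // tokens produced per firing of src
    int cons;     // tokens consumed per firing of dst
    int delay;    // initial tokens on the arc
};

// An SDFG is balanced on an arc when, over one period,
// fires(src) * prod == fires(dst) * cons.
bool balanced(const SDFArc& a, int fires_src, int fires_dst) {
    return fires_src * a.prod == fires_dst * a.cons;
}
```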
[Figure 5.1: example SDF graph with blocks A–G, production and consumption rates at the arc ends, and arc labels 1–8]
[Figure 5.2: class diagram — edges (+setname(), +set_prod(), +set_cons()) aggregated by sdf_graph (+prefix, -result : int*, -num_arcs : int, -num_nodes : int, -sdflist : vector<edges*>, -sdf_schedule : vector<edges*>, +sdf_crunch(), +sdf_simulate(), +sdf_create_schedule()), with sc_domains holding -sdf_domain : vector<sdf_graph*>]
The toplevel class is sdf_graph, shown in Listing 5.1. This class holds the information pertaining to a single Synchronous Data Flow Graph (SDFG). The SDFG encapsulates the following information: the number of blocks and arcs in the SDF, access to the executable schedule via sdf_schedule, the repetition vector through result, a string identifying the toplevel SDF by prefix, and a list of the SDF blocks represented by sdflist [Listing 5.1, Line 6 - 13].
All SDF blocks are inserted into the vector<edges*> sdflist. This introduces the edges class, which encapsulates the incoming and outgoing arcs of a particular SDF block, an integer-valued name used in constructing the repetition vector, and text-based names for comparisons. We typedef this class to SDFnode.
Our implementation uses process names for comparison, since the SystemC standard requires each object to have a unique identifying name. When storing the process name in either the sdf_graph class or the edges class shown in Listing 5.3, we append a dot character followed by the name of the process.
1 class edges {
2
3   public:
4     vector<out_edges> out;
5     vector<in_edges> in;
6
7     // constructor / destructor
8     edges();
9     ~edges();
10
11    // member functions
12    void set_name(string name, vector<edges*> &in);  // set name
13    void set_prod(sc_method* to_ptr, int prod);      // set production rate
14    void set_cons(sc_method* from_ptr, int cons);    // set consumption rate
15
16    // variables that will remain public at the moment
17    string name;
18    int mapped_name;
19 };
The struct and class definitions in Listing 5.4 allow us to define an SDF
block shown in Figure 5.3.
[Figure 5.3: an SDF block — an edges object block_name carrying its string name and integer mapped_name]
3. Scheduling of SDF
3.1 Repetition vector: Linear Diophantine
Equations
The first issue in creating an executable schedule is discussed in [38], where Lee et al. describe a method whereby static schedules for these function blocks can be computed at compile time rather than run time. The method utilizes the predefined consumption and production rates to construct a homogeneous system of linear Diophantine [11] equations, represented in the form of a topology matrix Γ. It was shown in [38] that in order to have a solution, Γ must be of rank s − 1, where s is the number of blocks in the SDFG. Solving this system of equations yields a repetition vector for the SDFG. An algorithm that computes a Hilbert basis [11] solves linear Diophantine equations using the Completion procedure [11, 51] to provide an integer-valued Hilbert basis. Moreover, the fact that the rank is s − 1 shows that for SDFGs the Hilbert basis is one-dimensional and contains only one basis element [38].
Solving linear Diophantine equations is crucial for obtaining a valid repetition vector for any SDF graph. A tidy mechanism uses the production and consumption rates to construct 2-variable equations; solving this system of equations yields the repetition vector. The equations have two variables because an arc can only connect two blocks. Though this may seem a simple problem, its simplicity is challenged by the possibility that the solution of the Diophantine equations comes from a real-valued set. Real-valued solutions are unacceptable for SDF, since the numbers of firings of the blocks must be integers. Not only must the values be integers, they must also be non-negative and non-zero, since a strongly connected SDFG cannot have a block that is never fired. Such systems of equations are referred to in mathematics as linear Diophantine equations, and we discuss an algorithmic approach via the Completion procedure, with an added heuristic, to create the repetition vector as presented in [11].
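Before the general procedure, it helps to see the per-arc case. Each arc with production rate prod and consumption rate cons yields the 2-variable homogeneous equation prod·u − cons·v = 0, whose smallest positive integer solution is u = cons/g, v = prod/g with g = gcd(prod, cons). The sketch below illustrates just this building block; it is not the book's Hilbert-basis solver.

```cpp
#include <utility>

// Euclid's algorithm; named gcd_int to avoid clashing with std::gcd.
int gcd_int(int a, int b) {
    while (b) { int t = a % b; a = b; b = t; }
    return a;
}

// Smallest positive solution (u, v) of prod*u - cons*v = 0:
// u firings of the producer balance v firings of the consumer.
std::pair<int, int> minimal_firing_pair(int prod, int cons) {
    int g = gcd_int(prod, cons);
    return std::make_pair(cons / g, prod / g);
}
```

For the first equation of Equation 5.4, 1u − 2v = 0, this gives u = 2, v = 1: the producer must fire twice per firing of the consumer.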
We begin by defining a system of equations parameterized by a = {a_i | i = 1...m}, b = {b_j | j = 1...n} and c, with m, n ∈ N, such that the general form of an inhomogeneous linear Diophantine equation is:

a_1·x_1 + ... + a_m·x_m − b_1·y_1 − ... − b_n·y_n = c

For the first arc of Figure 5.1, the corresponding homogeneous equation is:

1u − 2v + 0w + 0x + 0y + 0z = 0 (5.3)
For the entire SDFG shown in Figure 5.1, the system of equations is given in Equation 5.4. Note that this is a homogeneous system of equations, in which the total number of tokens inserted into the system equals the total number of tokens consumed. Our SDF scheduling implementation requires only homogeneous linear Diophantine equations, so we limit our discussion to them.
1u − 2v + 0w + 0x + 0y + 0z =0 (5.4)
1u + 0v − 2w + 0x + 0y + 0z =0
0u + 1v + 0w + 0x − 1y + 0z =0
0u − 1v + 0w + 1x + 0y + 0z =0
0u + 0v + 2w − 2x + 0y + 0z =0
0u + 0v + 3w + 0x + 0y − 1z =0
0u + 0v + 0w + 0x + 3y − 1z =0
As you can see, this system consists only of 2-variable equations, yielding the topology matrix Γ:
    ⎛ 1 -2  0  0  0  0 ⎞
    ⎜ 1  0 -2  0  0  0 ⎟
    ⎜ 0  1  0  0 -1  0 ⎟
Γ = ⎜ 0 -1  0  1  0  0 ⎟
    ⎜ 0  0  2 -2  0  0 ⎟
    ⎜ 0  0  3  0  0 -1 ⎟
    ⎝ 0  0  0  0  3 -1 ⎠
Solving for X in ΓX = 0 yields the repetition vector for the SDFG in Figure 5.1. A linear Diophantine equation solver [51] solves these equations.
Algorithm 5.2 (Completion procedure, where 0 denotes the zero vector):

{Initialization}
P1 := {(e1, 0), ..., (em, 0)}
M1 := NULL
Q1 := NULL

{Completion step}
Qk+1 := {p + (0, ej) | p ∈ Pk, d(p) > 0, 1 ≤ j ≤ n}
        ∪ {p + (ei, 0) | p ∈ Pk, d(p) < 0, 1 ≤ i ≤ m}
Mk+1 := {p ∈ Qk+1 | d(p) = 0}
Pk+1 := {p ∈ Qk+1 \ Mk+1 | p minimal in Pk+1 ∪ ⋃(i=1..k) Mi}

{Termination}
Pk = NULL?
M := ⋃(i=1..k) Mi
eq1 = 1u − 2v + 0w + 0x + 0y + 0z =0
eq2 = 1u + 0v − 2w + 0x + 0y + 0z =0
eq3 = 0u + 1v + 0w + 0x − 1y + 0z =0
eq4 = 0u − 1v + 0w + 1x + 0y + 0z =0
eq5 = 0u + 0v + 2w − 2x + 0y + 0z =0
eq6 = 0u + 0v + 3w + 0x + 0y − 1z =0
eq7 = 0u + 0v + 0w + 0x + 3y − 1z =0
and steps in processing this system of equations with Algorithm 5.2
are shown in Table 5.2.
In Table 5.2, the proposal vector begins as a vector of all ones with
its computed defect. The next equations to consider are the ones with
not exist. Since our goal is to obtain the firing order regardless of whether the SDFG contains cycles, we employ another algorithm to construct the firing order.
Bhattacharyya, Murthy and Lee [5] developed scheduling algorithms for SDF, one of which determines the firing order of non-trivial (cyclic or acyclic) SDFGs. Before presenting scheduling Algorithm 5.3, we discuss the delay terminology for the SDFG shown in Figure 5.4. Note the production rate at the head of each arc, the consumption rate at the tail of each arc (marked by the arrowhead), the arc label or name in the middle of the arc, and a newly introduced delay(α) = γ, where α is an arc label and γ is the delay on that arc. A delay [5] represents the number of tokens the designer has to inject initially into arc α for the SDF model to execute. The concept of delay is necessary when considering cyclic SDFGs. Because of the additional constraint set by the SDF paradigm that a block executes only when it has sufficient input tokens on its input arcs, a cycle such as the one indicated by arc label 8 in Figure 5.1 causes the SDFG to deadlock without a delay: no block on the cycle can ever be fired. The delay acts as an initial token on that arc to allow simulation to begin. We define a delay on every arc, but omit displaying the delays that are zero, as on arc labels 1, 2 and so on.
Scheduling Algorithm 5.3 creates a firing order using the repetition vector from the Hilbert basis solver. If the firing order is valid, the SDF kernel executes the SDF processes in the correct sequence for the correct number of times. The simulation terminates if either the repetition vector is invalid or the SDFG is inconsistent. An SDFG is inconsistent when the number of times each block appears in the firing order does not match the number of times the block is supposed to be scheduled for execution according to the repetition vector.
If the SDFG is consistent, the first block is popped from ready and appended to the schedule the number of times the block is to be invoked. For each incoming edge of this block the delay on that arc is recalculated. A similar calculation is done for the outgoing edges, except that the state of the arc is incremented by the production rate multiplied by temp_n. In essence, the algorithm looks at all the outgoing edges and the blocks they point to, and proceeds to traverse the graph starting from the blocks pointed to by the outgoing edges of the block just popped off ready. Finally, an inconsistency check is performed: the number of times each block is scheduled has to equal the number of times it is supposed to be fired according to the repetition vector. The algorithm concludes with the correct schedule in S for a consistent SDFG.
This scheduling algorithm yields the schedule in the correct firing order, together with the number of times each block is to be fired; the kernel fires blocks according to this schedule. Acyclic and cyclic SDFGs are both handled correctly. The final executable schedule is stored in sdf_schedule, accessible to the kernel for execution.
[Figure 5.5: FIR model SDFG — Stimulus → FIR → Display with unit production and consumption rates, arc labels 1–3, and delay(3) = 1]
3) with the converted model using the SDF kernel. We edited the source code to remove some std::cout statements so that the output of the DE FIR example matches the output of the SDF FIR model; the functionality of the FIR in DE and SDF remains the same. fir_const.h is not included in the listings, since it is simply a list of constants; it is available in the full source prints at [36].
1 #include <queue>
2 extern sdf_graph sdf1;
3 using namespace std;
4
5 SC_MODULE(stimulus) {
6   edges stimulus_edge;
7   sc_int<8> send_value1;
8   SC_CTOR(stimulus) {
9     stimulus_edge.set_name(name(), sdf1.sdflist);
10    SC_METHOD(entry);
11    send_value1 = 0;
12  }
13  void entry();
14 };
Notice in Listing 5.5 that the Stimulus block has no input, output, or control signal ports, i.e., no declarations of ports using sc_in<...> or sc_out<...>. These are no longer required in an SDF model, because static scheduling does not require control signals and our method of data passing is through STL queue<...> structures. We do not strictly enforce their removal; if the designer wishes to use SystemC channels and ports for data paths, they can also be employed. However, the SDF kernel statically schedules the SDF blocks for execution at compile time, so there is no need for one block to signal the following block when data is ready to be passed on, obviating the need for control signals. For data paths, using SystemC signals and channels generates events, reducing simulation efficiency. Our advised approach is to use STL queue<...> queues to transfer data within the SDF model instead. Only data that has to be passed from one block to another requires an instantiation of a queue, such as the extern queue<int> stimulusQ [Listing 5.6, Line 6]. Instantiation of the stimulus_edge object is crucial in defining this SC_MODULE() as an SDF method process [Listing 5.5, Line 5 & 6]. This object is used to pass the name of the SC_MODULE() and the SDFG to which this block belongs, as shown in [Listing 5.5, Line 9].
The queues used in the FIR model are stimulusQ and firQ, where stimulusQ connects the Stimulus block to the FIR block and firQ connects the FIR block to the Display [Listing 5.8, Line 4 & 5]. However,
1 #include <queue>
2 extern sdf_graph sdf1;
3 using namespace std;
4
5 SC_MODULE(fir) {
6   sc_int<9> coefs[16];
7   sc_int<8> sample_tmp;
8   sc_int<17> pro;
9   sc_int<19> acc;
10  sc_int<8> shift[16];
11  edges fir_edge;
12  SC_CTOR(fir) {
13    fir_edge.set_name(name() /* "process_body" */, sdf1.sdflist);
14    SC_METHOD(entry);
15    #include "fir_const.h"
16    for (int i = 0; i < 15; i++) {
17      shift[i] = 0;
18    }
19  }
20  void entry();
21 };
Every SDF block (an SDF SC_MODULE()) must also have access to the SDFG into which it is to be inserted. This means the globally instantiated sdf_graph object must be available to the SC_MODULE()s, as in [Listing 5.7, Line 2]. The integral part of the SC_MODULE() declaration, however, is the instantiation of the edges object, as shown in [Listing 5.7, Line 11]. This object is accessed by the SDF kernel to determine certain characteristics during scheduling. These characteristics are set by member functions of the edges class. One of the first member functions encountered in the SC_MODULE() declaration is set_name(...) [Listing 5.7, Line 13], which provides the edges object with the module name and the storage list of SDF method processes. The name() function from within the SC_MODULE() returns the name of the current module. In [Listing 5.7, Line 13], sdf1.sdflist is the list where the addresses of the SC_METHOD() processes are stored for access by the SDF kernel. Apart from these alterations, the structure of an SC_MODULE() is similar to that of regular SystemC processes.
Note that the functions used to insert and remove data on the queue<...>s are the STL functions push() and pop() [Listing 5.8, Line 8 & 18]. There is also no check on the number of tokens ready to be received by each block. Naturally, this is not required, since we statically schedule the SDF blocks the appropriate number of times according to their consumption and production rates. However, this burdens the designer with the responsibility of carefully inserting sufficient tokens on the queue<...>s to ensure the simulation does not attempt to use invalid data.
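The queue-based data path amounts to nothing more than the following pattern. The function names are ours (the real blocks are SC_METHOD() entry functions); the point is that the kernel never checks token counts, so the producer's turn in the static schedule must come before the consumer's.

```cpp
#include <queue>

// Data path between two SDF blocks: a plain STL queue, as advised above.
std::queue<int> stimulusQ;  // Stimulus -> FIR path (name from the example)

// Producer side: push one token per firing.
void stimulus_entry(int value) {
    stimulusQ.push(value);
}

// Consumer side: pop one token per firing. The static schedule must
// guarantee the queue is non-empty here; there is no runtime check.
int fir_entry() {
    int sample = stimulusQ.front();
    stimulusQ.pop();
    return sample;
}
```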
The SDF kernel requires the modeler to specify the terminating value, as in the DE kernel example. This is similar to the termination situations posed in [23]. We define a period of the SDF as one complete execution of the SDF model. In this example, since there is a cycle, every period is an execution of the SDF model. We halt the execution after a specified number of samples using sc_stop() [Listing 5.10, Line 9], which tells the kernel that the modeler has requested termination of the simulation.
The toplevel SC_METHOD() process labelled toplevel in [Listing 5.12, Line 2] constructs the SDF graph encapsulating the entire SDF model inside it. The toplevel process can be of any SystemC process type; its entry function has to be written according to that process type, since these processes continue to follow SystemC semantics. The construction of this module is straightforward: pointers to each of the SDF methods are member variables and are initialized to their corresponding objects in the constructor. The important step is constructing the SDFG within the constructor (SC_CTOR()), since the constructor is invoked only once per instantiation of the object. The functions set_prod(...) and set_cons(...) set the arcs of the SDFG. Every SC_MODULE() defines an SDF block whose arcs must be set. The arguments of
these set functions are: the address of the edges object instantiated in the module that the arc points to or from, the production or consumption rate (depending on which function is called), and the delay. We also enforce a global instantiation of an sdf_graph object [Listing 5.11, Line 5] for every SDFG present in the model. Using the schedule class and add_sdf_graph(...), the SDFG is added to a list visible to the overlaying SDF kernel [Listing 5.12, Line 23]. In addition, this toplevel process must have an entry function that calls sdf_trigger() [Listing 5.12,
Line 8], signalling the kernel to process all the SDF methods corresponding to this toplevel SDF module. These guidelines allow for heterogeneity in the models, since the toplevel process can be made sensitive to any signal that is to fire the SDF. The SC_CTHREAD() [Listing 5.11, Line 39] plays this particular role: every three cycles the SDF model is triggered through the signal data. However, the designer has to be careful during multi-MoC modeling with the transfer of data from DE blocks to SDF blocks, because neither the SDF kernel nor the DE kernel verifies that data is available on an STL queue<...> path before triggering an SDF method process. This has to be handled carefully by the designer. These style guides for SDF are natural to the paradigm
mean that a DE model cannot be present within an SDF model, though careful programming is required to ensure the DE block is contained within one SDF block. As a future extension, we are working on a more generic design for hierarchical kernels through an evolved API.
Algorithm 5.4 displays the altered DE kernel. The noticeable change in the kernel is the separation of initialization roles. We find it necessary to separate what we consider two distinct initialization roles:
Pushing processes (of all types) onto the runnable queues, and executing crunch() (see Chapter 3) once.
void de_initialize1()
  perform update on primitive channel registries;
  prepare all THREADs and CTHREADs for simulation;
{END de_initialize1()}

void de_initialize2()
  push METHOD handles onto regular DE METHOD runnable list;
  push all THREADs onto THREAD runnable list;
  process delta notifications;
  execute crunch() to perform Evaluate-Update once.
{END de_initialize2()}

void simulate()
  initialize1();
  if (clock_count mod 2 == 0) then
    set run_sdf to true;
  end if
  execute crunch() until no timed notifications or runnable processes;
  increment clock_count;
{END simulate()}
void crunch()
  while (true) do
    if (there is a user-specified order) then
      pop all methods and threads off the runnable lists and store into temporary lists;
      while (parsed user-specified order is valid) do
        find process handle in temporary lists and execute;
      end while
    else
      execute all remaining processes in the temporary lists;
    end if
    {Evaluate Phase}
    execute all THREAD/CTHREAD processes;
    break when no processes left to execute;
  end while
  {Update Phase}
  update primitive channel registries;
  increment delta counter;
  process delta notifications;
{END crunch()}
1 void sc_domains::sdf_trigger(string topname) {
2
3   string sdfname = topname + ".";
4   sdf_graph* run_this;
5
6   for (int sdf_graphs = 0; sdf_graphs < (signed) model.sdf_domain.size(); sdf_graphs++) {
7     // pointer to a particular SDF graph
8     sdf_graph* process_sdf_graph = model.sdf_domain[sdf_graphs];
9     if (strcmp(process_sdf_graph->prefix.c_str(), sdfname.c_str()) == 0) {
10      run_this = process_sdf_graph;
11      if (run_sdf == true) {
12        // execute the SDF METHODs
13        run_this->sdf_simulate(sdfname);
14        run_sdf = false;
15      } /* END IF */
16    } /* END IF */
17  } /* END FOR */
18 } /* END sdf_trigger */
19
20 sdf_graph* sc_domains::find_sdf_graph(string sdf_prefix) {
21
22  for (int sdf_graphs = 0; sdf_graphs < (signed) model.sdf_domain.size(); sdf_graphs++) {
23    // pointer to a particular SDF graph
24    sdf_graph* process_sdf_graph = model.sdf_domain[sdf_graphs];
25    if (strcmp(process_sdf_graph->prefix.c_str(), sdf_prefix.c_str()) == 0) {
26      return (process_sdf_graph);
27    } /* END IF */
28  } /* END FOR */
29  return NULL;
30 } /* END find_sdf_graph */
Listing 5.16 shows the definition of sdf_trigger(), which calls the sdf_simulate() function responsible for finding the appropriate SDFG (with the helper function find_sdf_graph(...)) and executing the SDF processes corresponding to that SDFG.
The creation of the schedules is encapsulated in the sdf_create_schedule(...) function, which constructs the topology matrix for the Diophantine equation solver, retrieves the solution from the solver, and creates an executable schedule if one exists, as demonstrated in Listing 5.17.
[Figure 5.6: simulation-time comparisons for the three models — (a) FIR, (b) FFT, (c) Sobel]
The graphs in Figure 5.6 present results from these three scenarios. They show that every model demonstrates a significant improvement in simulation time over the original model and the non-threaded model. The three bars on each chart show the time taken, in seconds, for the entire model to execute in the respective modeling scenario: the leftmost bar is the original model and the middle bar the non-threaded model, both using the DE kernel, while the rightmost bar uses the SDF kernel. The bar charts show that increasing the sample size preserves the efficiency indicated by the collected data. The FIR and FFT models yielded approximately 75% improvement in simulation time compared to the original model, and the Sobel model yielded a 53% improvement. Compared to the non-threaded models, the FIR, FFT and Sobel models showed 70%, 57% and 47% improvement in simulation time, respectively. All results show better performance with the SDF kernel than with either the original or the non-threaded models. We investigated the relatively lower improvement for the Sobel model and found that the Input block in Sobel is executed only once, to read in the entire matrix of values. When this is altered to select segments of the matrix, the performance reflects the FIR and FFT results. This indicates that simulation efficiency depends on the number of invocations of the blocks required to perform certain functions. The Sobel model using the DE kernel required many more invocations when collecting segments of the matrix, increasing the signal communication and data passing. Future experiments will compare different matrix segment sizes for this edge-detection algorithm. The percentage improvement over the non-threaded model in [62] is also consistent, with the Sobel edge detection yielding a lower improvement for the same reason.
Chapter 6
COMMUNICATING SEQUENTIAL
PROCESSES KERNEL IN SYSTEMC
Rendez-vous Communication
Implementation of the Communicating Sequential Processes Model of Computation requires an understanding of the rendez-vous communication protocol. Every node or block in a CSP model is a thread-like process that executes continuously unless suspended due to communication. The rendez-vous communication protocol dictates that communication between processes occurs only when both processes are ready to communicate. If either process is not ready to communicate, it suspends until its corresponding process is ready, at which point it is resumed for the transfer of data.
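The suspend-until-partner-ready behavior can be sketched as follows. This is a behavioral model only, written in modern C++ with OS threads and a condition variable; the CSP kernel itself realizes the same protocol with QuickThread coroutines, and the class and member names here are ours.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Behavioral sketch of rendez-vous: put() blocks until the matching
// get() arrives, and vice versa, as in Figure 6.1.
template <class T>
class RendezvousChannel {
    std::mutex m;
    std::condition_variable cv;
    T slot;
    bool full = false;
public:
    void put(const T& v) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !full; });  // wait until slot is free
        slot = v;
        full = true;
        cv.notify_all();                        // wake a pending get()
        cv.wait(lk, [this] { return !full; });  // suspend until consumed
    }
    T get() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return full; });   // suspend until a put()
        T v = slot;
        full = false;
        cv.notify_all();                        // release the producer
        return v;
    }
};

// Small demo: a producer thread puts a value, the caller gets it.
int rendezvous_demo() {
    RendezvousChannel<int> ch;
    std::thread producer([&ch] { ch.put(7); });
    int got = ch.get();
    producer.join();
    return got;
}
```

Whichever side arrives first suspends; the transfer happens only when both sides have reached the channel, which is exactly the protocol the figure illustrates.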
Figure 6.1 illustrates how the rendez-vous protocol works. T1 and T2
are threads that communicate through the channel labelled C1. T1 and
T2 are both runnable and have no specific order in which they are exe-
cuted. Let us consider process point 1, where T1 attempts to put a value
[Figure 6.1: rendez-vous protocol — threads T1 and T2 communicate through CSP channel C1; a put(...) at Point 1 suspends until the matching get(...) at Point 2, and a get(...) at Point 3 suspends until the matching put(...) at Point 4, as time progresses]
1. Implementation Details
We present some design considerations in this section followed by the
data structure employed for CSP and implementation details.
[Figure 6.2: CSP kernel class diagram — CSPkernel, CSPnodelist, CSPnode, CSPReceiver (derived from baseReceiver), CSPelement, sc_thread_process and sc_domains]
[Figure 6.3: a simple CSP model — four processes, CSPnode 1 through CSPnode 4, connected via channels]
csp event, but regular bool data types. This avoids the use of SystemC’s
DE semantics and events.
Other than general set and get functions for the private members of this class, the important member function is the overloaded equals operator. The implementation of this overloaded operator compares the fromNode and toNode to verify that the CSPelement objects on both sides of the equals operator have the same addresses for the fromNode and toNode. If they do, then the particular channel, or CSPelement, that connects the two CSPnodes has been found. The responsibility of a CSPelement is exactly that of a channel; this is a result of adhering to the general implementation hierarchy, where CSP channels are effectively represented by CSPelement objects. Hence, CSPchannel, which is discussed later, inherits from CSPelement. This is the mechanism we employ to search for the channels through which communication occurs. However, it imposes a limitation that there can be at most two channels between the same two CSPnodes: if there existed two channels in the same direction between the same two nodes, then according to the equals operator they would be indistinguishable. Thus, we limit users to only one channel in the same direction between two CSPnodes. We justify this implementation in the following manner:
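The comparison just described can be sketched as below. The class and member names follow the text, but the exact signature and the empty CSPnode stand-in are our assumptions for illustration.

```cpp
// Stand-in for the real CSPnode; here nodes are compared only by address.
class CSPnode {};

// Sketch of the channel-lookup equality: two CSPelement objects are
// "equal" when they connect the same pair of nodes in the same direction.
class CSPelement {
public:
    CSPnode* fromNode;
    CSPnode* toNode;
    CSPelement(CSPnode* f, CSPnode* t) : fromNode(f), toNode(t) {}
    bool operator==(const CSPelement& rhs) const {
        // Same source and same destination => same channel. This is why
        // only one channel per direction is allowed between two nodes:
        // a second one would compare equal and be indistinguishable.
        return fromNode == rhs.fromNode && toNode == rhs.toNode;
    }
};
```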
Figure 6.3 shows a simple CSP model with four CSP processes connected via channels. The analogous representation of this simple CSP model using our data structure is shown in Figure 6.4, which shows how objects of CSPelement are used to construct a CSPG. The list holding the CSPelements is the CSPReceiver; CSPReceiver objects are data members of a CSPnode and are composed of CSPelement objects.
Figure 6.4 shows four CSPnodes and their respective CSPelements, which provide the connections between pairs of CSP processes. The gray boxes display objects of CSPReceiver. The role of the receiver is simply to encapsulate the CSPelements, as shown in Figure 6.4. A simple data structure represents the CSPG: we employ the C++ STL vector<...> class to store the address of every CSPelement inserted in the CSPG and iterate through the list to find the appropriate
[Figure 6.4: data-structure representation of the CSP model — each CSPnode (1–4) owns a CSPReceiver (gray box) holding the CSPelements for its channels; for example, the CSPelement in which CSPnode 1 points to CSPnode 2]
channel for communication when required. Every CSPnode has its own CSPReceiver object containing the CSPelements that address that particular CSP process. Listing 6.3 displays the class definition, describing the elementlst as the container of the CSPelement addresses, along with a private helper function used to traverse the list and identify the requested channel.
We discuss some of the important member functions from this class and their input and output arguments below.
put(...):
Inputs:
Outputs:
get(...):
Inputs:
Outputs:
If a put(...) has already been called, the suspended process that called the
put(...) is scheduled for execution (resumption).
[Figure: class diagram relating CSPelement, sc_moc_channel<T> and
CSPchannel<T> (data member: value).]
Table 6.2. A Few Important Member Functions of the CSP Simulation Class CSPnodelist
arguments: the entry function, the CSPnode object specific for that
SC_CSP_THREAD(), and the CSPnodelist to which it will be added.
This macro calls a helper function that registers the CSP thread process
by inserting it in the CSPnodelist that is passed as an argument.
Invoking the function runcsp(...) initializes the coroutine package, and
the current simulation context is stored in the variable main_cor. The
simulation of the CSP model starts by calling the sc_csp_start(...) func-
tion. Table 6.2 lists some important functions and variables and
whether the CSP kernel or the QuickThread package manages
them. The variable m_cor_pkg is a pointer to the file-static instance of
the coroutine package. This interface to the coroutine package is explained
further in Appendix A. All thread processes must be prepared for
simulation; the role of this preparation is to allocate every thread
its own stack space, as required by the QuickThread package. After this
preparation, the first process is popped from the top of the runlist using
pop_runnable(...) and executed. The thread continues to execute until
it is either blocked by the execution of another thread process or it terminates.
This continues until there are no more processes on the runlist.
void CSPnodelist::csp_trigger() {
  while (true) {
    sc_thread_handle thread_h = pop_runnable()->get_module();
    removefront();
    while (thread_h != 0 && !thread_h->ready_to_run()) {
      thread_h = pop_runnable()->get_module();
      removefront();
    }
    if (thread_h != 0) {
      m_cor_pkg->yield(thread_h->m_cor);
    }

    if (runnable_size() == 0) {
      // no more runnable processes
      break;
    }
  }
}
[Figure: the Dining Philosophers model: five philosopher processes
PHIL0–PHIL4 and five forks Fork0–Fork4 arranged in a ring, connected by
toLeft and toRight channels.]
and picks up the fork to their left. This prevents any philosopher from
eating, since two forks are required to eat, causing the model to deadlock.
We use a simple deadlock avoidance technique in which a footman takes the
philosophers to their respective seats and, if there are already four
philosophers at the table, asks the fifth philosopher to wait, seating him
only after one of the others has finished eating. This is a rudimentary
solution, but it is sufficient for our purpose.
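The footman's rule can be sketched as a simple seat counter. This is only an illustration of the idea; the class and method names are ours, not the book's implementation (which appears later as an FSM-based footman in Chapter 9). Seating at most four of five philosophers guarantees that at least one philosopher can always acquire both forks.

```cpp
#include <cassert>

// Toy sketch of the footman's deadlock-avoidance rule: never seat more
// philosophers than chairs_ (four chairs for five philosophers).
class Footman {
public:
    explicit Footman(int chairs) : chairs_(chairs), seated_(0) {}

    // Returns false when the table is full: the philosopher must wait.
    bool requestSeat() {
        if (seated_ < chairs_) { ++seated_; return true; }
        return false;
    }

    void leaveSeat() { --seated_; }  // a philosopher finished eating

private:
    int chairs_;
    int seated_;
};
```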
We begin by describing the module declaration of a philosopher in
Listing 6.10. The original implementation that we borrow is available
at [60]. That implementation is purely C++ based; we modify it to make a
CSP SystemC example. Each philosopher has a unique id and an object of
ProcInfo. The ProcInfo class is implemented as a debug class that holds the
address and the name of the process, purely for output and debugging. The
full source description includes the implementation of this class, though we
do not describe it here since it is not directly relevant to the
implementation of the CSP kernel in SystemC. There is an instantiation of a
CSPnode called csp, through which we invoke the CSP member functions, and
two CSPports, toRight and toLeft. toRight connects to the CSPchannel that
connects the philosopher to the fork on its right, and toLeft to the one on
its left. Several intermediate functions are defined in this module along
with the main entry function. The entry function, called soln(), is bound to
a CSP process through the SC_CSP_THREAD() macro.
the philosopher enters the state where he attempts to drop the forks by
calling dropfork() described in Listing 6.13.
Dropping the forks is modeled by sending a value on the channel, which is
performed via push(...) on the port. If push(...) is invoked without the
corresponding CSP node at the other end of the channel being ready to accept
the token, the process suspends. Returning to the entry function, after the
forks have been dropped there is a random
void PHIL::dropfork() {
  // drop left first, then right -- not that it matters
  state[id] = 4;
  print_states();
  toLeft.push(*drop, csp);
  state[id] = 5;
  print_states();
  toRight.push(*drop, csp);
};
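The rendezvous behaviour of push(...) described above can be illustrated with standard C++ threads. This is an analogy only: the real kernel suspends QuickThread coroutines cooperatively, whereas this sketch blocks OS threads on a mutex and condition variable; the class name and its interface are ours.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Toy rendezvous channel: push() blocks until the partner calls get(),
// mirroring the suspension behaviour of the CSP channel in the text.
template <typename T>
class RendezvousChannel {
public:
    void push(const T& v) {
        std::unique_lock<std::mutex> lk(m_);
        value_ = v;
        full_ = true;
        cv_.notify_all();
        cv_.wait(lk, [this] { return !full_; });  // suspend until consumed
    }
    T get() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return full_; });   // suspend until produced
        full_ = false;
        cv_.notify_all();                          // resume the producer
        return value_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    T value_{};
    bool full_ = false;
};
```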
Case 1: The philosopher to the right has requested a fork, so the fork
gives itself to the philosopher on the right through the put(...) function.
Case 2: The philosopher to the left has requested this fork, so the fork
gives itself to the requesting philosopher; the fork is still down in
Cases 1 and 2.
Case 4: The fork was given to the philosopher on the right, so it is
requested back from that philosopher.
Case 5: The fork was given to the philosopher on the left, so it is
requested back.
5. Example of Producer/Consumer
A trivial example using CSP is the Producer/Consumer model. This simple
model has two processes, a Producer and a Consumer, and one channel between
them. Communication flows from the Producer to the Consumer. The example is
similar to the simple fifo example in the SystemC distribution, except that
the processes are CSP processes and, instead of an sc_fifo channel between
the processes, there is a CSPchannel.
[Figure: the Producer and Consumer processes connected by one CSPchannel.]
Listing 6.17 shows the module declaration for the PRODUCER class.
Notice an instance of CSPnode and a CSPport. The production pointer
holds the string that the Producer sends to the Consumer one character
at a time [Listing 6.17, Line 5]. In [Listing 6.17, Line 12], the
at(...) member function of the string class returns the character at the
location given by the argument and stores it in the variable ch. This
character is pushed onto the channel by invoking the push(...) member
function on the port that connects the two CSP processes. An instance
of CSPnodelist labelled DP is accessible by both the PRODUCER
and CONSUMER objects.
The if construct makes the Producer repeatedly send the same string:
when the sz string location counter equals the number of characters in
the string, it is reset, so the model runs indefinitely. The constructor of
the PRODUCER module sets the production pointer to a string and invokes
the SC_CSP_THREAD() macro to register this process as a CSP
process.
 1 SC_MODULE(PRODUCER) {
 2
 3   CSPnode csp;
 4   CSPport<char> toConsumer;
 5   string* production;
 6   ProcInfo proc;
 7
 8   void sendChar() {
 9     int sz = 0;
10     while (1)
11     {
12       char ch = production->at(sz);
13       ++sz;
14       toConsumer.push(ch, csp);
15       // csp.send((token)&ch, toConsumer.read());
16       // allow for infinite execution
17       if (sz == (signed) production->size())
18         sz = 0;
19       csp.reschedule();
20     }
21   };
22
23   SC_CTOR(PRODUCER) {
24     production = new string();
25     *production = "This is a test string for Producer/Consumer example :]";
26     SC_CSP_THREAD(sendChar, DP, csp);
27   };
28 };
int sc_main(int argc, char* argv[]) {
  CSPchannel<char> ptoc;              // Channel from Producer to Consumer
  PRODUCER p("Producer");             // Producer instance
  p.toConsumer(ptoc);                 // Bind Producer
  p.csp.set_proc_name("Producer");    // Debug information

  CONSUMER c("Consumer");             // Consumer instance
  c.fromProducer(ptoc);               // Bind Consumer
  c.csp.set_proc_name("Consumer");    // Debug information

  p.csp.points_to(c.csp, ptoc);       // Set direction of channel

  DP.runcsp(DP);                      // Prepare CSP for execution
  sc_csp_start("", &DP);              // Start simulation
  return 0;
};
creating stack space and initializing the stack with the appropriate
function and its arguments. After thread initialization, the threads are
executed by invoking the yield(...) function from sc_cor_pkg, which
switches out the currently executing process and prepares the new process
(passed via the argument of the function) to execute. Suspension
functions such as wait(...) perform this switch to allow other runnable
processes to execute. The QuickThread package uses preswitch for context
switching, which allows for this implementation. A function called
next_cor(...) determines the next thread to execute. Once the
runnable queues are empty, control returns to the main coroutine
identified by main_cor. This main coroutine can also be
suspended, which is what happens when a new thread process is scheduled
for execution; it is resumed after no more thread processes are
runnable.
only show the inclusion of one CSPnodelist (one CSP model) addressed
by the coroutine package. However, we plan to extend this later to
support multiple CSP models using the same coroutine package.
We are considering invocations of the DE kernel through the CSP ker-
nel, which requires altering the initialization code for the sc_simcontext
class. We need to point the m_cor_pkg private member of class
sc_simcontext to the sc_cor_pkg pointer in the CSPnodelist class. This is
performed by invoking cor_pkg() on the CSPnodelist, followed by
an invocation of get_main() to retrieve the main coroutine. We introduce
a new private data member in sc_simcontext called oldcontext, of type
sc_cor*, which we set by invoking the get_demain() member function on the
variable m_cor_pkg. We use oldcontext in the next_cor() function of
class sc_simcontext, as shown in Listing 6.22.
Variable oldcontext is returned when there are no more runnable
threads in the system, similar to the original implementation of the
next_cor() function, where main_cor was returned. The purpose of
saving oldcontext is to allow the simulation to return to the coroutine
that invoked the DE kernel. Suppose a CSP process invokes the DE kernel
for some computation; oldcontext then stores the coroutine of
the calling CSP process. The DE simulation returns to oldcontext once
it has no more processes to execute, resuming the execution of the
calling CSP process.
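The save/restore behaviour of oldcontext can be modelled in miniature. In this toy model, integer ids stand in for sc_cor* coroutine pointers and the method names are invented; only the save-on-entry, restore-on-completion behaviour reflects the text.

```cpp
#include <cassert>
#include <vector>

// Toy model of the oldcontext mechanism: entering the DE kernel saves the
// calling CSP coroutine; finishing the DE run restores it.
struct KernelContext {
    int current;                 // context currently executing
    std::vector<int> saved;      // oldcontext values, innermost last

    void enter_de_kernel(int de_ctx) {
        saved.push_back(current);   // save the calling CSP coroutine
        current = de_ctx;
    }
    void de_kernel_done() {         // no more runnable DE processes
        current = saved.back();     // resume the calling CSP process
        saved.pop_back();
    }
};
```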
We illustrate the invocation of the DE kernel from the CSP kernel in Figure
6.8. The assigned addresses are made up and do not resemble real ad-
dresses in our simulation; we present them merely to clarify
the manner in which oldcontext is used. During initialization of the
CSP model, shown by the CSP block, m_cor_pkg and main_cor are set
to their correct addresses. Every thread process has an m_cor variable
[Figure 6.8: CSP block and DE block annotated with made-up coroutine
addresses: the CSP block's m_cor_pkg (0x8000001) and main_cor (0x8100f00),
per-process m_cor values, and the DE block's oldcontext (0x81b0f01), which
matches the m_cor of the calling process B.]
that holds the coroutine for that particular thread. At some point during
the execution of process B, a DE model is to execute. This
DE model requires the CSP kernel to yield to the DE kernel to simulate
the DE block. Hence, the initialization functions of the DE kernel
are called, where the addresses of the private data members m_cor_pkg
and main_cor are extracted from the CSP kernel and the current sim-
ulation context is saved in oldcontext. Notice that the address in
oldcontext is the same as the m_cor value of process B. According to the
next_cor(...) function definition in Listing 6.22, oldcontext is returned
once there are no more threads to execute, implying that once the DE
simulation model is complete and there are no more events to be updated, the
scheduler returns control to oldcontext, which is the calling CSP thread.
This in effect allows for DE kernel invocations from CSP, as we show via
an implemented example in Chapter 9.
Chapter 7
FINITE STATE MACHINE KERNEL IN SYSTEMC
An FSM consists of:
A set of states,
A start state,
An input alphabet, and
A transition function that maps the current state to its next state.
FSMs are generally represented as graphs, with nodes and with transitions
connecting the nodes under some conditions. Figure 7.1 shows a diagram of a
two-traffic-light system and Figure 7.2 illustrates a Finite State Machine
controller for this system.
[Figure 7.2: the FSM controller with states S0 (A=green, B=red),
S1 (A=yellow, B=red), S2 (A=red, B=green) and S3 (A=red, B=yellow);
transitions T0–T5 carry actions such as A := yellow or B := green, and
S0 and S2 have "no change" self-loops.]
The two traffic lights are represented by A and B and the set of states
contains S0, S1, S2 and S3. The transitions are represented by the
Finite State Machine Kernel in SystemC 127
Table 7.1. Pairs inserted in the FSM kernel's map<...> (addresses are made up)

Key                      Value
toplevel.state.state0    0xf000001
toplevel.state.state1    0xf000011
toplevel.state.state2    0xf000101
toplevel.state.state3    0xf001001
toplevel.state.state4    0xf100001
arrows, and the action associated with each transition is marked in the
dotted ellipses. Suppose S0 is the initial state. Then a transition to S1
causes traffic light A to change from green to yellow and B to remain
red. This is a simple controller example, but FSMs can be extensive
and large. We do not discuss the specifics of Moore and Mealy
machines, since FSMs are foundational to most of engineering; we refer
the reader to [10] for additional reference and continue our
discussion with the implementation details of the FSM kernel in SystemC.
1. Implementation Details
1.1 Data Structure
The FSM kernel's data structure uses a map<...> object from
the C++ STL. A map object is simply a list of pairs, each consisting of a key
and a value. The FSM kernel uses a string and a pointer to the SC_METHOD()
process (via the sc_method_process class, as shown in Listing 7.1) as a
pair entry in the map<...> object. For illustration purposes, Table 7.1
displays the pairs inserted in the data structure; the addresses for the
values are made up. The keys are of type string and each value is the
address of an object of type sc_method_handle. The key field is used
when searching this map<...> object for a particular string; if a pair
entry is found with the corresponding search string, then the value is
returned.
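The key/value lookup can be sketched with a plain std::map. This is a simplification: plain function pointers stand in for sc_method_handle, and the registry class name and its methods are our assumptions; the hierarchical string keys follow the naming convention described later in the chapter.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for a process handle: a plain function pointer.
typedef void (*StateFn)();

static int fired = -1;
static void state0() { fired = 0; }
static void state1() { fired = 1; }

// Sketch of the FSM kernel's map: hierarchical state name -> process handle.
class FSMRegistry {
public:
    void insert(const std::string& key, StateFn fn) { table_[key] = fn; }

    // Search by key; return the registered handler or nullptr when absent.
    StateFn find(const std::string& key) const {
        std::map<std::string, StateFn>::const_iterator it = table_.find(key);
        return it == table_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, StateFn> table_;
};
```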
The FSMReceiver class once again derives from the baseReceiver class.
The baseReceiver holds the type of the receiver derived from it.
Besides the fsmlist private data member, there is an id and a string
variable called currentState. The currentState variable preserves
the current state that the simulation has reached in the FSM. This may
not seem necessary in a pure FSM model. However, in heterogeneous
models, a particular state in the FSM may resume another MoC and return
to the FSM, requiring preservation of the last state that it
had executed. The member functions of class FSMReceiver are standard
functions used to insert elements into the fsmlist and retrieve a particular
element.
state0, state1, state2 and state3, representing the states S0, S1, S2, and
S3, respectively. Each of these entry functions is bound to an SC_METHOD()
process via the SC_FSM_METHOD() macro. Registration of the entry
functions as FSM processes is performed via this macro. The constructor
of SC_MODULE(...) remains the same as in existing SystemC syntax, with
the use of SC_CTOR(...). Notice that the initial state of the FSM is set
within the constructor with fsm_model->setState("toplevel.state.state0").
We preserve the naming conventions of SystemC to target the FSM process
for execution. However, this requires knowledge of the encapsulating
module as well, since the naming convention of SystemC concatenates
the names by taking the module name, adding a dot character at the
end, and appending the entry function name. The hierarchy
of the module is preserved by prefixing the name of the toplevel
module, as shown by toplevel.state.state0.
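The naming rule just described can be sketched with a small helper. The helper's name is our own choice; only the dot-separated concatenation of toplevel, module, and entry-function names comes from the text.

```cpp
#include <cassert>
#include <string>

// Sketch of SystemC's hierarchical naming convention as used by the FSM
// kernel: toplevel name + "." + module name + "." + entry function name.
std::string fsm_process_name(const std::string& toplevel,
                             const std::string& module,
                             const std::string& entry_fn) {
    return toplevel + "." + module + "." + entry_fn;
}
```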
Two instances of light are present: A represents traffic light A
and B represents traffic light B. The colors are enumerated by an enum
SC_MODULE(state) {

  light A;
  light B;
  int random;

  void state0() {
    random = rand();
    cout << "-------------------------------------------------" << endl;
    cout << "S0 -- Random value = " << random << endl;
    A = GREEN;
    B = RED;
    printLight(A, B);
    if (random % 2 == 0)
      fsm_model->setState("toplevel.state.state1");
  };

  void state1() {
    random = rand();
    cout << "-------------------------------------------------" << endl;
    cout << "S1 -- Random value = " << random << endl;
    A = YELLOW;
    B = RED;
    printLight(A, B);
    if (random % 2 == 0)
      fsm_model->setState("toplevel.state.state2");
  };

  void state2() {
    random = rand();
    cout << "-------------------------------------------------" << endl;
    cout << "S2 -- Random value = " << random << endl;
    A = RED;
    B = GREEN;
    printLight(A, B);
    if (random % 2 == 0)
      fsm_model->setState("toplevel.state.state3");
  };

  void state3() {
    random = rand();
    cout << "-------------------------------------------------" << endl;
    cout << "S3 -- Random value = " << random << endl;
    A = RED;
    B = YELLOW;
    printLight(A, B);
    if (random % 2 == 0)
      fsm_model->setState("toplevel.state.state0");
  };

  SC_CTOR(state) {
    fsm_model->setState("toplevel.state.state0");
    SC_FSM_METHOD(state0, fsm_model);
    SC_FSM_METHOD(state1, fsm_model);
    SC_FSM_METHOD(state2, fsm_model);
    SC_FSM_METHOD(state3, fsm_model);
  };
};
[Figure: UML diagram of the sc_domains class and its one-to-one
associations with the kernel objects.]
get access to any other implemented and instantiated kernel, allowing for
interaction between them. This way the Manager object can find a par-
ticular model in a specific MoC and execute it accordingly. This setup
for sc_domains exists to allow for behavioral hierarchy in the future.
Functions             Description
sdf_trigger(...)      Global SDF-specific function to execute the SDF graph.
init_domains(...)     Invokes all initialization member functions for every
                      kernel in sc_domains.
split_processes()     SDF-specific function to split SDF function block
                      processes from regular SystemC method processes.
init_de()             Creates an instance of the DE kernel.
init_sdf()            Initialization function for the SDF kernel; traverses
                      all SDFGs and constructs an executable schedule if one
                      exists.
find_sdf_graph(...)   Helper function to find a particular SDF graph for
                      execution.
initializeCSP()       Prepares the csp kernel so that instances of CSP
                      models can be inserted.
initializeFSM()       Prepares the fsm kernel so that instances of FSM
                      models can be inserted.
get_de_kernel()       Returns the pointer de_kernel.
get_csp_kernel()      Returns the pointer csp_kernel.
get_fsm_kernel()      Returns the pointer fsm_kernel.
void sc_domains::sdf_trigger(string topname) {
  string sdfname = topname + ".";
  sdf_graph* run_this;

  for (int sdf_graphs = 0;
       sdf_graphs < (signed) model.sdf_domain.size(); sdf_graphs++) {
    // pointer to a particular SDF graph
    sdf_graph* process_sdf_graph = model.sdf_domain[sdf_graphs];

    if (strcmp(process_sdf_graph->prefix.c_str(), sdfname.c_str()) == 0) {
      run_this = process_sdf_graph;

      if (run_sdf == true) {
        // execute the SDF METHODs
        run_this->sdf_simulate(sdfname);
        run_sdf = false;
      } /* END IF */
    } /* END IF */
  } /* END FOR */
} /* END sdf_trigger */
their kernel implementation in SystemC. The API is also not fully com-
plete; for example, we do not yet support multiple models for the CSP MoC.
Ideally, we envision the Manager object manipulating the client
QuickThread coroutine instances as well, but these considerations are
still under investigation.
Finally, Listings 8.2, 8.3 and 8.4 show some of the API functions.
Chapter 9
HETEROGENEOUS EXAMPLES
[Figure: the image Converter model: DE blocks handle encrypted image
downloads and uploads (.mtx files) with decryption and encryption stages,
around an SDF Image Format Converter producing .sob output.]
pushing data into the SDF, and the SDF itself, must have access to the
SDFchannel; since MoC-specific channels and ports do not generate
SystemC events, this can be done. The block pushing data into the
SDF model can also have an SDFport that binds to an SDFchannel
from which the SDF component retrieves its inputs. A control signal can
be used to trigger the SDF toplevel process once data is ready on the
channel for the SDF to consume. This is a simple example showing how
we implement a heterogeneous model (the image Converter) using our
improved modeling and simulation framework.
Figure 9.2 shows approximately 13% improvement over the original
model. We attribute the limit on the simulation efficiency increase to
Amdahl's law [41]. The SDF block in this Converter model accounts for only
a small portion of the entire system, allowing only that much im-
provement in total simulation performance. If the SDF component were
responsible for a larger percentage of the original model, then the sim-
ulation efficiency of that model would be significantly higher than its
counterpart created using the DE kernel.
We profiled the Converter model. Taking an approximate percentage
of time spent in the SDF model when using the original reference im-
plementation kernel, we can see from Table 9.1 that the total running
time attributable to the SDF is approximately 14%. This particular
Heterogeneous Examples 141
[Figure 9.3: the footman FSM; each of the five states carries a self-loop
guarded by seatsTaken >= 4.]
The FSM-based footman allows the seating of four philosophers, whereby
each seat allocation signifies a state. A global seat counter called
seatsTaken is updated every time a philosopher is assigned a seat and
when a philosopher leaves his seat. The FSM in Figure 9.3 is combined
with the Dining Philosophers implementation to yield the heterogeneous
example shown in Figure 9.4.
Figure 9.3 presents a state machine diagram showing the functions
of the footman. The initial state is state0. The functionality of every
state is the same except for the next-state transitions. Every state has
a self-loop, meaning that the FSM does not transition to another state
while four philosophers are seated. However, if seats are available, a
seat is allocated and the transition to the next state occurs. This simple
FSM changes the solution of the Dining Philosophers such that every
philosopher gets a turn to eat, while also serving as a deadlock avoidance
mechanism.
The module definition is shown in Listing 9.1 where object s is the
[Figure 9.4: the Dining Philosophers model with the FOOTMAN process in the
center, connected to each of PHIL0–PHIL4 by footman channels in addition to
the toLeft/toRight fork channels.]
state machine. This module has pointers to CSPnode and CSPport
types. The variables with prefix fromPhil are the ports through which
the philosophers communicate with the footman, requiring the footman
void s::state0() {
  if (seatsTaken < 4) {
    ++seatsTaken;
    seatAvailable[0] = true;
    fromPhil0->push(*giveSeat, *csp);
    fsm_model->setState("toplevel.state.state1");
  }
};

void s::state1() {
  if (seatsTaken < 4) {
    ++seatsTaken;
    seatAvailable[1] = true;
    fromPhil1->push(*giveSeat, *csp);
    fsm_model->setState("toplevel.state.state2");
  }
};

void s::state2() {
  if (seatsTaken < 4) {
    ++seatsTaken;
    seatAvailable[2] = true;
    fromPhil2->push(*giveSeat, *csp);
    fsm_model->setState("toplevel.state.state3");
  }
};

void s::state3() {
  if (seatsTaken < 4) {
    ++seatsTaken;
    seatAvailable[3] = true;
    fromPhil3->push(*giveSeat, *csp);
    fsm_model->setState("toplevel.state.state4");
  }
};

void s::state4() {
  if (seatsTaken < numPhil - 1) {
    ++seatsTaken;
    seatAvailable[4] = true;
    fromPhil4->push(*giveSeat, *csp);
    fsm_model->setState("toplevel.state.state0");
  }
};
The seatAvailable array maintains which seats are occupied; the index of
the array maps each philosopher to his particular seat. For example,
seatAvailable[1] refers to the seat that belongs to the philosopher with
id one.
[Figure: a composite heterogeneous model built around the Dining
Philosophers with the FOOTMAN: a DE data generator with a waitState/ready
handshake feeds FIR, FFT and SOBEL blocks under an FSM controller, one
philosopher solves a Producer/Consumer FIFO in DE, and another solves the
RSA encryption algorithm in DE.]
EPILOGUE
[4] D. Berner, S. Suhaib, S. Shukla, and H. Foster, XFM: Extreme Formal Method
for Capturing Formal Specification into Abstract Models, Tech. Report 2003-08,
Virginia Tech, 2003.
[6] Shuvra S. Bhattacharyya, Elaine Cheong, John Davis II, Mudit Goel, Christo-
pher Hylands, Bart Kienhuis, Edward A. Lee, Jie Liu, Xiaojun Liu, Lukito
Muliadi, Steve Neuendorffer, John Reekie, Neil Smyth, Jeff Tsay, Brian Vogel,
Winthrop Williams, Yuhong Xiong, Yang Zhao, and Haiyang Zheng, Heteroge-
neous Concurrent Modelling and Design in Java: Volume 2 - Ptolemy II Software
Architecture, Memorandum UCB/ERL M03/28, July 2003.
[8] N. Chomsky, Three models for the description of language, IRE Transactions on
Information Theory 2 (1956), no. 3, 113–124.
[9] A. Church, The calculi of lambda conversion, Princeton University Press, 1985.
[10] E. Clarke, O. Grumberg, and D. Peled, Model Checking, The MIT Press, 1999.
[23] T. Grotker, S. Liao, G. Martin, and S. Swan, System Design with SystemC,
Kluwer Academic Publishers, 2002.
[32] A. Jantsch, Modeling Embedded Systems And SOC’s - Concurrency and Time
in Models of Computations, Morgan Kaufmann Publishers, 2003.
REFERENCES 157
[35] David Keppel, Tools and Techniques for Building Fast Portable Threads Pack-
ages, Tech. Report UWCSE 93-05-06, University of Washington Department of
Computer Science and Engineering, May 1993.
[40] Formal Systems (Europe) Ltd., The FDR Model Checker, Website:
http://www.fsel.com/, 2004.
[45] B. Niemann, F. Mayer, F. Javier, R. Rubio, and M. Speitel, Refining a High Level
SystemC Model, Kluwer Academic Publishers, 2003, In SystemC: Methodologies
and Applications, Ed. W. Muller and W. Rosenstiel and J. Ruf.
[56] , Truly Heterogeneous Modeling with SystemC, ch. Formal Models and
Methods for System Design, Kluwer Academic Publishers, The Netherlands,
2004.
[58] M. O. Rabin and D. Scott, Finite Automata and Their Decision Problems, IBM
Journal of Research and Development 3 (1959), no. 2, 115–125.
[59] D. Sangiorgi and D. Walker, The pi-calculus: A theory of mobile processes, Cam-
bridge University Press, 2003.
[62] S. Sharad and S. K. Shukla, Efficient Simulation of System Level Models via
Bisimulation Preserving Transformations, Tech. Report 2003-07, FERMAT Lab,
Virginia Tech, 2003.
1. QuickThreads
QuickThread (QT) is a package core, not a standalone thread package by itself.
The difference is that a package core only provides an interface for creating
machine-dependent code for easier portability, whereas a standalone thread
package provides the user with a full implementation of stack space, thread
synchronization and sometimes even scheduling. QuickThread instead lets users
construct non-preemptive thread packages. The goals of the QuickThread core
package are as follows:
To provide an easy API for constructing user-level thread packages. These user
thread packages can be seamlessly ported to architectures supported by
QuickThread.
To separate the execution of threads from their allocation and scheduling.
Synchronization of threads on uniprocessors and multiprocessors has problems
of its own, such as race conditions, violation of stack space access, the need
for locking and mutexes, extra context switches and so on. We do not discuss
these issues here, since our focus is on gaining a basic understanding of
QuickThreads and their implementation in SystemC. The QT package uses
preswitch as its synchronization mechanism. This mechanism functions in the
following manner:
1 Block the currently executing thread.
2 Switch to the new thread's stack and execute some clean-up code for the old
thread.
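The ordering of those two steps can be illustrated with a toy trace. No real stack switching takes place here, and all function names are invented; the point is only that the old thread's clean-up runs after control has moved to the new thread's stack.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy trace of the preswitch ordering: block the old thread, switch to the
// new stack, and only then run the old thread's clean-up code there.
std::vector<std::string> trace;

void block_old()   { trace.push_back("block old"); }
void switch_new()  { trace.push_back("switch to new stack"); }
void cleanup_old() { trace.push_back("cleanup old on new stack"); }

void preswitch() {
    block_old();
    switch_new();
    cleanup_old();   // runs after the switch, on the new thread's stack
}
```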
[Figure: UML class diagram of the coroutine classes: sc_cor_qt (members
m_stack, m_sp, m_pkg, m_stack_size; method stack_protect()) derives from
sc_cor; sc_cor_pkg_qt (member instance_count; methods create(), yield(),
abort(), get_main()) derives from sc_cor_pkg, which holds m_simc pointing to
sc_simcontext; sc_cor_qt's m_sp refers to a qt_t.]
of sc_cor_qt are the constructor, the virtual destructor and the stack_protect
function. The constructor simply initializes all the data members to NULL, and
the destructor frees the memory if m_stack is allocated. The stack_protect
member function, however, is responsible for allocating a stack region and
protecting it with the appropriate privilege.
The class responsible for creating the thread coroutine is sc_cor_pkg_qt,
which inherits from the abstract class sc_cor_pkg. The class definition of
sc_cor_pkg_qt is shown in Listing A.2.
There is only one private data member in this class: a static integer
variable called instance_count. This variable is used to ensure that there is
only one instantiation of the sc_cor_pkg_qt class, further enforced by the
file-static variables static sc_cor_qt main_cor and static sc_cor_qt*
curr_cor. These file-static variables are instantiated in the implementation
file sc_cor_qt.cpp. The coroutine package follows a singleton pattern [19].
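The instance-count guard can be sketched as follows. This is a simplification with a class name of our own: the real sc_cor_pkg_qt also relies on the file-static main_cor and curr_cor variables, and here a boolean flag stands in for whatever enforcement the real constructor performs.

```cpp
#include <cassert>

// Sketch of a singleton guard via a static instance counter: construction
// is only considered valid while no other instance is alive.
class CorPkg {
public:
    CorPkg() { ok_ = (instance_count == 0); ++instance_count; }
    ~CorPkg() { --instance_count; }
    bool valid() const { return ok_; }

    static int instance_count;  // one shared counter across all instances

private:
    bool ok_;
};
int CorPkg::instance_count = 0;
```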
The constructor of this class, shown in Listing A.3, takes a pointer to the
current simulation context, sets it in the abstract class sc_cor_pkg, and
assigns the current coroutine to the address of the main_cor object. No other
instantiations of this class are allowed.
The create(...) function takes the stack size, a function and an argument
list, as mentioned earlier in our discussion of QuickThreads. In this
function, the stack size is set in the sc_cor_qt object, the stack is
allocated memory, and the stack pointer is initialized along with the passing
of the function and arguments. At the end of the create(...) function, the
created thread is returned as a pointer to an sc_cor object. The coroutine
classes employ keywords such as SCAST and RCAST. These are defined in
sc_iostream.h as short forms for static_cast and reinterpret_cast, perhaps
simply for easier use.
The yield(...) function takes a pointer to the sc_cor that indicates the next coroutine
to be executed. The current coroutine is cast to a pointer to
sc_cor_qt, and the QuickThread blocking primitive is invoked via the QT_BLOCK(...)
function. The arguments passed to this primitive are the helper function,
the old coroutine, and the new coroutine. The helper function saves the stack pointer
onto the old thread's stack and resumes execution of the new thread using the preswitch mechanism.
The abort(...) member function causes threads to terminate
(die) in a similar fashion, and the get_main() function returns the main coroutine of
the simulation, allowing the original simulation context to continue. We do not
discuss the abstract classes used to interface with the QT coroutine package;
instead we continue with how SystemC threads invoke the functions
described in this section.
We have described some of the classes, and some of their main functions, that
interact with the QuickThread core package. What is relevant to most
SystemC users, and even more so to developers, is how SystemC integrates these coroutine
packages into its simulation environment. We describe how the Discrete-Event
simulation kernel incorporates thread processes, stepping through the code to detail
how threads are instantiated (up to the coroutine client package calls) and
how they are executed.
Appendix A: QuickThreads in SystemC 165
void
sc_cor_pkg_qt::yield( sc_cor* next_cor )
{
    sc_cor_qt* new_cor = SCAST<sc_cor_qt*>( next_cor );
    sc_cor_qt* old_cor = curr_cor;
    curr_cor = new_cor;
    QT_BLOCK( sc_cor_qt_yieldhelp, old_cor, 0, new_cor->m_sp );
}
sc_thread_handle
sc_simcontext::register_thread_process( const char* name,
                                        SC_ENTRY_FUNC entry_fn,
                                        sc_module* module )
{
    sc_thread_handle handle = new sc_thread_process( name,
                                                     entry_fn,
                                                     module );
    m_process_table->push_back( handle );
    set_curr_proc( handle );
    return handle;
}
the simulation. Within this function the initialization of all processes is performed.
We extract the segment responsible for thread processes only and display it in Listing
A.7.
void
sc_simcontext::initialize( bool no_crunch )
{
    // Some code here ...

    // instantiate the coroutine package
#ifndef WIN32
    m_cor_pkg = new sc_cor_pkg_qt( this );
#else
    m_cor_pkg = new sc_cor_pkg_fiber( this );
#endif
    m_cor = m_cor_pkg->get_main();

    // prepare all thread processes for simulation
    const sc_thread_vec& thread_vec = m_process_table->thread_vec();
    for( int i = thread_vec.size() - 1; i >= 0; -- i ) {
        thread_vec[i]->prepare_for_simulation();
    }

    // Some code here ...

    // make all thread processes runnable
    int size = thread_vec.size();
    for( int i = 0; i < size; ++ i ) {
        sc_thread_handle thread_h = thread_vec[i];
        if( thread_h->do_initialize() ) {
            push_runnable_thread( thread_h );
        }
    }

    // Some code here ...
}
void
sc_simcontext::crunch()
{
    // Some code here ...

    while( true ) {

        // EVALUATE PHASE

        while( true ) {

            // execute method processes

            // Some code here ...

            // execute (c)thread processes

            sc_thread_handle thread_h = pop_runnable_thread();
            while( thread_h != 0 && ! thread_h->ready_to_run() ) {
                thread_h = pop_runnable_thread();
            }

            if( thread_h != 0 ) {
                m_cor_pkg->yield( thread_h->m_cor );
            }

            if( m_error ) {
                return;
            }

            // Some code here ...

            m_runnable->toggle();
        }

        // UPDATE PHASE

        // Some code here ...
    }
}
All threads on the runnable lists are executed. This raises questions
as to how the simulation proceeds from one thread to the next. For example, suppose
a thread switches into the current context and executes; how does another thread
get scheduled for execution after the current thread completes? One of
the important things to understand is how the simulation proceeds to
the next thread, and how a wait() suspension call makes the currently executing pro-
cess suspend and switches another process in for execution. The implementation is
interesting in that it behaves differently when there are suspension calls, when there are
none, and when the entry function for a thread does not contain an infinite loop. The
implementation explains the behaviors experienced in these circum-
stances. Let us first study the function shown in Listing A.10. sc_switch_thread(...)
takes a pointer to the simulation context as its argument. This function invokes the
yield(...) function, passing the return value of the next_cor() function
implemented in the sc_simcontext class, which serves to retrieve the next
runnable coroutine. This next_cor() function is shown in Listing A.11.
inline void
sc_switch_thread( sc_simcontext* simc )
{
    simc->cor_pkg()->yield( simc->next_cor() );
}
The next_cor() function returns the coroutine of the next thread if one is available
on the runnable queues; otherwise it returns the main coroutine, m_cor, which resumes
the main simulation context. This implies that each thread is, in effect, responsible
for invoking the next thread until the simulation has no more threads to execute, at
which point control returns to the main coroutine.
Suspension calls such as wait(...) simply block the currently executing thread by
switching to another thread via sc_switch_thread(...), leaving the QuickThread
core package responsible for saving the state of the old thread. This is also how
main_cor (the main coroutine) is saved: when another thread process is invoked,
the currently executing process is suspended. Therefore, in a normal execution none of these
threads terminates until the end of simulation. Event notifications
that resume a thread put its process handle on the runnable queue so
that the main coroutine can execute it; threads generally terminate
only at the end of simulation.
Suppose the user has an SC_THREAD() process with no infinite loop in
the entry function. Then the thread will execute only once (depending on the
number of suspension points in the entry function) and stall. Suppose instead that the user has
defined an entry function with an infinite loop but no suspension points, and without
the use of an sc_signal or channels that generate sc_events. The expected behavior
is the infinite execution of that one process once it is scheduled. The implementation
details explain why an infinite loop is required in a SystemC thread
process, to prevent the thread from aborting and dying, and why
every thread should also have at least one suspension point, to avoid continuous execution of
that single thread.
Our efforts toward heterogeneity call the current integration of QuickThreads
in SystemC into question. The concern is not with the QuickThread package itself, but with the
client package's interaction with the simulation kernel. The static instance of the
sc_cor_qt object makes it difficult to implement heterogeneity and hierarchy in Sys-
temC. We believe that every Model of Computation requiring threads must be able
to communicate cleanly with the existing Discrete-Event kernel without raising con-
cerns about threading implementations; we are currently investigating this alteration.
Nevertheless, we believe that this short introduction to the QuickThread implementation in
SystemC is valuable to kernel designers in understanding the coroutine implementa-
tion.
Appendix B
Autoconf and Automake
Autoconf and Automake are GNU tools for making scripts that config-
ure source code packages. Autoconf is responsible for creating a configuration script
with information regarding the operating system. This allows adaptation to dif-
ferent UNIX-variant systems without much intervention from the code developers or
the user. Automake is a GNU tool that relies on Autoconf's con-
figuration machinery and creates the Makefile.in files for the code
package. Detailed information on the usage and purpose of Autoconf and Automake
is available at [21, 22]. Our purpose in this appendix is to describe how to add
files to SystemC such that the library created after compilation incorporates the
additional classes introduced in the added files.
We explain this procedure via an example. It is important to save a copy of the
directory that contains the QuickThread package; we do not elaborate on the problem
except to note that updating the configuration files and Makefiles slightly disturbs
the QuickThread package. Suppose we wish to add the CSP kernel, with all class
definitions in a file called sc_csp.h and its implementation in sc_csp.cpp, and suppose
the source is untarred in a directory called systemc-2.0.1-H/. The following
steps add the CSP kernel to SystemC (specific to version 2.0.1).
1 Copy the CSP kernel source files into the kernel directory
systemc-2.0.1-H/src/systemc/kernel/.
2 Edit the Makefile.am in systemc-2.0.1-H/src/systemc/kernel/ to add sc_csp.h
under H_FILES and sc_csp.cpp under CXX_FILES, in a similar fashion to the existing
sources.
3 Save the Makefile.am and move to systemc-2.0.1-H/src/, where systemc.h
resides.
4 Edit the systemc.h file and #include the header file for the CSP kernel. For
example, add the line #include "systemc/kernel/sc_csp.h".
5 Save systemc.h and move to systemc-2.0.1-H/.
6 Run aclocal.
7 Run autoconf. If there are problems due to version discrepancies with Autoconf,
run autoreconf -i instead.
8 Run automake.
9 Compile systemc-2.0.1-H as specified in the installation guidelines. Up-
dating the Makefile.in files causes a compile error when compiling the
QuickThread package. To rectify this, simply remove the existing QuickThread
directory and replace it with the backup.
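The Makefile.am edit in step 2 might look like the following fragment. The "existing ..." entries are placeholders for the file lists already present in the kernel Makefile.am; only the sc_csp.h and sc_csp.cpp lines come from the example above:

```makefile
H_FILES = \
	... existing kernel headers ... \
	sc_csp.h

CXX_FILES = \
	... existing kernel sources ... \
	sc_csp.cpp
```

Automake then generates a Makefile.in that compiles sc_csp.cpp into the library and installs sc_csp.h along with the other kernel headers.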
Users of the installation from systemc-2.0.1-H/ will have access to the classes included
in sc_csp.h. This approach keeps most of the additional source code separate
from the existing implementation; avoiding changes to the original source files altogether is a very
difficult task. Hence, we maintain that newly implemented classes should
remain detached, with their data structures in separate files, and that only the necessary
changes should be made to the original SystemC source files.