Lec7 - Partial Reconfiguration

Partial reconfiguration allows dynamically modifying blocks of logic on an FPGA by downloading partial bit files while the remaining logic continues operating. This enhances flexibility. Challenges include methodologies, tools, and applications. Dividing algorithms into time-exclusive segments and coordinating behavior between configurations is difficult.

Uploaded by

Charan Eswar
Reconfigurable Computing

AEL ZG 554 / ES ZG 554 / MEL ZG 554


Session 7
BITS Pilani
Pawan Sharma
ps@[Link]
Pilani Campus 24/02/2024
Last Lecture

• FPGA case study

BITS Pilani, Pilani Campus
Today’s Lecture

• Partial Reconfiguration
• FPGA Design Flow

Partial Reconfiguration
• Partial Reconfiguration is the ability to dynamically modify blocks of logic by
downloading partial bit files while the remaining logic continues to operate
without interruption.

• Allows designers to change functionality on the fly, eliminating the need to fully
reconfigure and re-establish links, dramatically enhancing the flexibility that
FPGAs offer.
• Allows designers to move to fewer or smaller devices, reduce power, and improve
system upgradability.
• Makes more efficient use of the silicon by loading only the functionality that is
needed at any point in time.
• Useful for self-adaptive hardware systems.

Challenges

• 1. Methodologies: How can hardware modules be efficiently integrated
into a system at runtime? How can this be implemented with present
FPGA technology? And how can runtime reconfigurable systems be
managed at runtime?
• 2. Tools: How can reconfigurable systems be implemented at a high
level of abstraction to increase design productivity?
• 3. Applications: Which applications can benefit considerably from
runtime reconfiguration?

Challenges

• Dividing the algorithm into time-exclusive segments.
• Portions should remain resident for a reasonable amount of time
and perform their tasks relatively independently of other
configurations.
• Coordinating the behavior between configurations of run-time
reconfiguration (RTR) applications.
• Managing fragmentation of the device and communication between
newly placed modules.

FPGA Application and Configuration Layers

Partial Reconfiguration

• Consider a system in which you do not want to use all the modules
at the same time.
• Example: mutually exclusive components in a mobile phone.
• Either BCE or ADF is active at any instant of time.
• Insert a MUX in between to make the selection.
• This is called the spatial multiplexing approach.

Partial Reconfiguration

• A few facts
– specialized circuits can operate at higher speeds versus their general static
counterparts
– chip area is saved by programming only the physical resources that are needed
in each operation phase
– power can be saved by programming only the circuit that is needed, which
allows for static leakage reduction, and by programming optimized circuits,
which allows for dynamic power reduction.
• Dynamic reconfiguration (the function is changed at runtime in response to
application requirements) and partial reconfiguration (modification of an active,
functioning FPGA design by loading a partial configuration or bit file) are key
differentiating capabilities of field programmable gate arrays.
• Self-Reconfiguration means that the FPGA provides an internal port to access the
configuration layer from within the application layer. In case of Xilinx FPGAs, this
port is called ICAP.
• After a full BIT file configures the FPGA, partial BIT files can be
downloaded to modify reconfigurable regions in the FPGA
• A difficult process that must handle side‐effect factors
• So you need another processor to control this process ‐‐ a scheduler and
a placer that can be implemented as part of an operating system
running on a processor.
• The processor can either reside inside or outside the reconfigurable
chip.
• Several applications, including neural nets, template matching, and DNA
sequence matching, are used to illustrate the superiority of run‐time
reconfigurable implementations.

Partial Reconfiguration

• At time T1, module HW1 can be placed on a partially reconfigurable region (PRR), and at
time T2, module HW2 can be placed on the same PRR, because the two are not needed at the
same time. They can be temporally multiplexed on the same PRR.
• The terms dynamic reconfiguration and partial reconfiguration are often used
interchangeably, but they can differ.
• A PR operation can be static or dynamic.
• Not all dynamic reconfigurations are necessarily partial in nature.
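
The temporal multiplexing described above can be pictured with a small toy model. This is only a sketch: the `PRR` class, its `load` method, and the integer time stamps are hypothetical names introduced for illustration and do not correspond to any vendor API. Loading a module stands in for downloading a partial bit file into the region.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PRR:
    """Toy model of one partially reconfigurable region (hypothetical class)."""
    name: str
    loaded: Optional[str] = None  # module currently resident in the region
    history: List[Tuple[int, str]] = field(default_factory=list)

    def load(self, time: int, module: str) -> None:
        """Replace the resident module; models loading a partial bit file."""
        self.loaded = module
        self.history.append((time, module))


prr = PRR("PRR1")
prr.load(1, "HW1")  # at time T1 the region hosts HW1
prr.load(2, "HW2")  # at time T2 the same region hosts HW2 instead

print(prr.loaded)   # HW2
print(prr.history)  # [(1, 'HW1'), (2, 'HW2')]
```

Only one module is resident at a time, which is what distinguishes this from the spatial (MUX-based) approach, where both modules occupy area simultaneously.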

Some terminologies

• PRR: Partially reconfigurable region. Area on the device that undergoes
reconfiguration at run time; it is designated at compile time.
• Static region: area on the device that is not reconfigured during run time
• Reconfiguration module: a module targeted for run time reconfiguration in the
design
• Modes: mutually exclusive implementation of the reconfiguration module
• Configuration: set of co‐existent modes that make up a functional processing
chain

Advantages

• Effective logic density of the chip can be increased
• Reduced reconfiguration time compared
to a full reconfiguration.
• Adaptive hardware systems
• Useful when an interface is
required to persist while
functionality changes.
• Reduced Memory footprint

Desired features of PR

• Supported granularity for reconfiguration:
– modify portions from as small as a single LUT up to
the entire chip.
• Support for runtime relocation. This allows the
same bitstream to be used to configure a circuit in
different locations on the FPGA.
• Reconfiguration time should be negligible.

Desired features

• Faster reconfiguration
• Transparent reconfiguration operation to the
application
• High‐level design tool to support adaptive
application

Managing Dynamic Device Reconfiguration

Partial Reconfiguration Region

PRR1 PRR2 PRR3

PRRs consist of several frames. Each PRR must extend the full height of the device and align
horizontally on a multiple of four slices.

Anchoring Logic

• A single PRR hosts multiple circuits at runtime in a time-multiplexed
manner, so its interfaces to the static region must be handled: special
anchoring logic is necessary.
• Virtex‐II and Virtex‐II Pro devices use internal tri‐state buffers (TBUFs) to
manage this connectivity.
• To support runtime circuit relocation, the relative positions of these TBUFs
must also match for different PRRs.
• The number of TBUFs available on these devices is restricted and their
positions fixed, leading to further restrictions on the size and positions of
PRRs

Virtex‐4
• TBUFs were replaced by bus macros constructed out of LUTs.
• The frame size was also reduced in the Virtex-4.
• Each frame is 1 bit wide and 16 CLBs high and contains 41 32-bit words (1312 bits).
• The reconfigurable region height is also a multiple of 16 CLBs.
• As a result, the floorplanning problem became two-dimensional.
• However, runtime relocation became more difficult.
• The ICAP interface width was also increased from 8 to 32 bits.
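
The frame arithmetic above (41 words of 32 bits giving 1312 bits) and the effect of the wider ICAP port can be checked with a short calculation. This is a rough sketch under stated assumptions: it counts only raw frame data and ignores bitstream headers, commands, and padding, and the 100 MHz clock and 100-frame region used in the example are assumed values, not figures from the slides.

```python
WORDS_PER_FRAME = 41   # from the slide: words per Virtex-4 frame
BITS_PER_WORD = 32     # from the slide: word size in bits
ICAP_WIDTH_BITS = 32   # from the slide: ICAP width after the increase from 8


def frame_bits() -> int:
    """Number of configuration bits in one frame."""
    return WORDS_PER_FRAME * BITS_PER_WORD


def icap_cycles(num_frames: int, icap_width_bits: int = ICAP_WIDTH_BITS) -> int:
    """Idealized clock cycles to push num_frames of raw frame data through ICAP."""
    total_bits = num_frames * frame_bits()
    return -(-total_bits // icap_width_bits)  # ceiling division


def load_time_us(num_frames: int, clock_mhz: float,
                 icap_width_bits: int = ICAP_WIDTH_BITS) -> float:
    """Idealized load time in microseconds for an assumed ICAP clock in MHz."""
    return icap_cycles(num_frames, icap_width_bits) / clock_mhz


print(frame_bits())              # 1312
print(icap_cycles(100))          # 4100 cycles over a 32-bit port
print(icap_cycles(100, 8))       # 16400 cycles over an 8-bit port
print(load_time_us(100, 100.0))  # 41.0 (assumed 100 MHz clock)
```

Under these idealized assumptions, the 32-bit port needs a quarter of the cycles that an 8-bit port would need for the same data.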

Virtex‐5
• The entire device is divided into several rows and
columns.
• Each row essentially corresponds to a clock
region; the device size determines the number of rows.
• The columns, called blocks, span the entire device
height.
• Each block contains a single type of FPGA
primitive arranged in a columnar fashion.
• The FPGA is composed of several tiles, each located where a
block and a row intersect: CLB tiles, DSP tiles, and
BRAM tiles.
• Xilinx uses the term reconfigurable frame to
denote these tiles, and they are the basic (smallest)
unit for PR.
• A frame is always 1 bit wide and extends the height of
one row.
• PRRs must be integer multiples of tiles.

Tile sizes

• One Virtex‐5 CLB tile contains 20 CLBs, one DSP tile contains 8 DSP
slices, and one BRAM tile contains 4 Block RAMs.
• Virtex-6 FPGAs follow the basic architecture of Virtex-5 FPGAs with a
CLB tile containing 40 CLBs, a DSP tile containing 8 DSP slices, and a
BRAM tile containing 8 18 Kbit Block RAMs.
• Xilinx 7-series FPGAs (Artix, Kintex, and Virtex-7) also have a similar
tile architecture with one CLB tile containing 50 CLBs, and DSP and
BRAM tiles containing 10 DSP slices and 10 18 Kbit Block RAMs,
respectively.
• These improved architecture features enable FPGAs to implement
more complex circuits as well as to reduce resource wastage.
• Designers are now able to define multiple PRRs with varying sizes with
different kinds of resources.
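
Because a PRR must be an integer multiple of tiles, a module's resource demand is rounded up per resource type. The sketch below uses the tile sizes quoted above; the function name `tiles_needed` and the example demand (130 CLBs, 12 DSP slices, 5 Block RAMs) are hypothetical. Note that the slides give the BRAM size (18 Kbit) only for Virtex-6 and 7-series, so BRAM counts are not directly comparable across families.

```python
# Resources per tile, as quoted on the slide.
TILE_SIZES = {
    "Virtex-5": {"CLB": 20, "DSP": 8, "BRAM": 4},
    "Virtex-6": {"CLB": 40, "DSP": 8, "BRAM": 8},
    "7-series": {"CLB": 50, "DSP": 10, "BRAM": 10},
}


def tiles_needed(family: str, demand: dict) -> dict:
    """Round each resource demand up to whole tiles and report the leftover."""
    sizes = TILE_SIZES[family]
    plan = {}
    for kind, needed in demand.items():
        per_tile = sizes[kind]
        tiles = -(-needed // per_tile)  # ceiling division
        plan[kind] = {"tiles": tiles, "unused": tiles * per_tile - needed}
    return plan


# Hypothetical module requirements, chosen only for illustration.
demand = {"CLB": 130, "DSP": 12, "BRAM": 5}
for family in TILE_SIZES:
    print(family, tiles_needed(family, demand))
```

For this assumed demand on Virtex-5, the module needs 7 CLB tiles (10 CLBs unused), 2 DSP tiles (4 slices unused), and 2 BRAM tiles (3 Block RAMs unused).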

Zynq 7000

Architecture of runtime reconfigurable system

System Integration – Computation flow

Design Flow – Hardware/Software partitioning

Example

FPGA Design Flow

• A designer defines the target system by
writing source code, drawing circuit
diagrams, and setting parameters,
based on the given specifications and
constraints.
• Typical design entry is source code
written at the register transfer level
(RTL) in a hardware description
language (HDL) such as Verilog HDL
or VHDL.
• Configuration data to implement
the target system on the FPGA chip
are generated through RTL
description, logic synthesis,
technology mapping, and place and
route.

Synthesis

• Synthesis is the process by which the system specifications and
constraints are translated to an implementation (a netlist of connected
components).
• Synthesis is considered to be a key stage in automated design tools
(CAD tools).
• A significant area of research in EDA is the development of tools that
can synthesize hardware from a design written in a high-level
programming language such as C.
• It is believed that these "hardware compilers" will help to decrease
development time, thus shortening the crucial time to market for
designs.

Logic synthesis - Goal

A structured digital system consists of combinational parts separated by
memory elements.
The goal is to provide an implementation of a digital system for a given
platform or for a given target library.
FPGA goal: generation of configuration data.
The implementation must be optimized according to factors such as area,
delay, and power.
BITS Pilani, Pilani Campus


BITS Pilani, Pilani Campus
Synthesis

• Refining the model, from an abstract one to a
detailed one.
• The objective is to maximize some figures of merit of
the circuit to improve its quality.
• Validation consists of verifying the consistency of
the models used during design, as well as some
properties of the original model.
• It can be performed by simulation and
verification.

Optimization

• To improve circuit quality:
– Performance relates to the time required to process
some information.
– Circuit quality relates to the overall area, which should
be minimized for many reasons.
– Circuit quality also relates to testability.

Two Level Logic Synthesis and Optimization
• George Boole’s Boolean algebra laid the roots of logic synthesis
• Shannon’s discovery in 1938 showed that two-valued Boolean algebra can
describe the operation of switching circuits.
• In the early days, logic design involved manipulating the truth table
representations as Karnaugh maps.
• The Karnaugh map–based minimization of logic is guided by a set of rules on
how entries in the maps can be combined.
• A human designer can only work with Karnaugh maps containing four to six
variables.
• The first step toward the automation of logic minimization was the introduction
of the Quine–McCluskey procedure that could be implemented on a computer.
• This exact minimization technique presented the notion of prime implicants and
minimum cost covers that would become the cornerstone of two-level
minimization.
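
The notion of prime implicants can be made concrete with a minimal sketch of the first phase of the Quine–McCluskey procedure: repeatedly merging implicants that differ in exactly one bit until nothing more merges. The second phase, selecting a minimum cost cover from the primes, is deliberately left out. The function names and the example function f(A,B,C) = Σm(0,1,2,3,7) are illustrative choices, not taken from the slides.

```python
from itertools import combinations
from typing import Iterable, Optional, Set


def combine(a: str, b: str) -> Optional[str]:
    """Merge two implicants that differ in exactly one 0/1 position, else None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None


def prime_implicants(minterms: Iterable[int], num_vars: int) -> Set[str]:
    """Return all prime implicants as strings over '0', '1', '-' (don't care)."""
    current = {format(m, f"0{num_vars}b") for m in minterms}
    primes: Set[str] = set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(sorted(current), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= current - used  # implicants that merged with nothing are prime
        current = merged
    return primes


def covers(implicant: str, minterm: int) -> bool:
    """True if the implicant covers the given minterm."""
    bits = format(minterm, f"0{len(implicant)}b")
    return all(p in ("-", b) for p, b in zip(implicant, bits))


primes = prime_implicants([0, 1, 2, 3, 7], 3)
print(sorted(primes))  # ['-11', '0--']
```

With the leftmost character read as A, the two primes correspond to A' and BC, so f = A' + BC. In this small example both primes are needed to cover all minterms; in general a separate covering step chooses among them.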

Two-level and Multi Level Logic

Synthesis approaches

Two different approaches:
– technology-dependent synthesis
– technology-independent synthesis.
• The technology-independent method is most used, because of the large
set of available optimization methods.
• With a technology independent representation, synthesis for FPGA devices
is done in two steps.
– In the first step, all the Boolean equations are minimized, independent
of the function generators used.
– In the second step, the technology mapping process maps the parts of
the Boolean network to a set of LUTs.
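
To see why mapping to LUTs is possible for any small Boolean function, it helps to view a k-input LUT as a stored truth table with 2^k entries. The sketch below packs a function's truth table into an integer and evaluates it by lookup. The names `lut_init` and `lut_eval`, the bit ordering (input 0 is the least significant address bit), and the 3-input majority example are assumptions made for illustration; real tools use their own conventions.

```python
from typing import Callable, Sequence


def lut_init(func: Callable[..., int], k: int) -> int:
    """Pack the truth table of a k-input function into an integer.

    Bit i of the result is the output for input pattern i, where
    input 0 is the least significant bit of i.
    """
    init = 0
    for address in range(2 ** k):
        inputs = [(address >> bit) & 1 for bit in range(k)]
        if func(*inputs):
            init |= 1 << address
    return init


def lut_eval(init: int, inputs: Sequence[int]) -> int:
    """Evaluate the LUT: the inputs form an address into the stored table."""
    address = sum(bit << pos for pos, bit in enumerate(inputs))
    return (init >> address) & 1


def majority(a: int, b: int, c: int) -> int:
    """Example function: 1 when at least two of the three inputs are 1."""
    return (a & b) | (b & c) | (a & c)


init = lut_init(majority, 3)
print(hex(init))                  # 0xe8
print(lut_eval(init, [1, 0, 1]))  # 1
print(lut_eval(init, [1, 0, 0]))  # 0
```

Since the table has an entry for every input pattern, the cost of a LUT does not depend on how complex the function's Boolean expression is, only on its number of inputs.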

Computational Boolean Algebra

• You are familiar with how to manipulate Boolean functions using
Boolean laws, or how to use K-maps to minimize Boolean functions.
• But these techniques are far removed from the computer programs
needed for automated synthesis tools, which represent a Boolean
equation as a data structure that is manipulated by operators.
• K-maps are not sufficient for real designs: simplifying a Boolean
function of 20 variables using a K-map is not feasible.
• We need algorithmic and computational strategies for Boolean
functions as well:
– decomposition strategies: ways to break down complex functions into
simpler ones
– computational strategies: ways to think about Boolean functions that
can be manipulated by programs

• Useful analogy to calculus
– you can represent complex functions like exp(x) using simpler
functions
• if you get to use 1, x, x^2, x^3, x^4, ... as the pieces
• it turns out that exp(x) = 1 + x + x^2/2! + x^3/3! + ...
• Likewise, you have Taylor series (a function is an infinite sum of
terms that are expressed in terms of the function's derivatives)
and Fourier series (expansion of a periodic function in terms of
an infinite sum of sines and cosines), where complicated functions
are represented as pieces of simpler functions.
• Do we have something like this for Boolean functions, where a
complex Boolean function can be split apart into small pieces
that can then be joined together to create a complex design?
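
The calculus side of the analogy can be checked numerically. The sketch below sums the first few terms of the series for exp(x) and compares the result with the library value; the function name `exp_partial_sum` and the choice of x = 1 are illustrative.

```python
import math


def exp_partial_sum(x: float, terms: int) -> float:
    """Sum of the first `terms` terms of 1 + x + x^2/2! + x^3/3! + ..."""
    return sum(x ** n / math.factorial(n) for n in range(terms))


for terms in (1, 2, 4, 8, 12):
    approx = exp_partial_sum(1.0, terms)
    print(terms, approx, abs(approx - math.exp(1.0)))
```

Each added term is a simple piece (a power of x with a fixed coefficient), and the error shrinks as more pieces are included. The question posed in the last bullet is whether Boolean functions admit a comparable decomposition into simple pieces.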
