Overview of Virtex II FPGA Architecture

The Virtex-II FPGA architecture contains configurable logic blocks (CLBs) that provide combinational and synchronous logic functions. Each CLB contains logic cells with look-up tables (LUTs) and flip-flops. The Virtex-II family also includes block RAM memory and dedicated 18x18 multiplier blocks. Digital clock manager blocks provide clock management functions. The CLBs, block RAM, multipliers, and DCMs are interconnected through a programmable switch matrix that provides routing resources between the elements.

Uploaded by

Dhanapackiyam V

Introduction To VIRTEX II

FPGA Architecture

Presented By
[Link]
Virtex Family
• In 2010, Xilinx announced the Virtex-7 family, which is based on a 28 nm design and is reported to deliver a two-fold system performance improvement at 50% lower power. In addition, Virtex-7 doubles the memory bandwidth compared to previous-generation Virtex FPGAs, with 1866 Mbit/s memory interfacing performance and over two million logic cells.
• In 2011, Xilinx began shipping sample quantities of the Virtex-7 2000T FPGA, which combines four smaller FPGAs into a single package by placing them on a special silicon interconnection pad (called an interposer) to deliver 6.8 billion transistors in a single large chip.
• In 2012, using the same 3D technology, Xilinx introduced initial shipments of its Virtex-7 H580T FPGA, a heterogeneous device, so called because it comprises two FPGA dies and one 8-channel 28 Gbit/s transceiver die in the same package.
• As Xilinx introduced new high-capacity 3D FPGAs, including the Virtex-7 2000T and Virtex-7 H580T products, these devices began to outpace the capacity of Xilinx's design software, which led the company to completely redesign its tool set. The result was the Vivado Design Suite, which reduces the time needed for programmable logic and I/O design, and speeds up system integration and implementation compared to the previous software.
General Description
• The Virtex-II family is a platform FPGA developed for high performance from low-density to high-density designs that are based on IP cores and customized modules.
• The family delivers complete solutions for telecommunication, wireless, networking, video, and DSP applications, including PCI, LVDS, and DDR interfaces.
Virtex-II Array
• Virtex-II devices are user-programmable gate arrays with various configurable elements.
• The Virtex-II architecture is optimized for high-density and high-performance logic designs.
• The programmable device consists of input/output blocks (IOBs) and internal configurable logic blocks (CLBs).
Virtex-II Architecture
Elements in Virtex-II
• The internal configurable logic includes four major elements organized in a regular array:
  - Configurable Logic Blocks (CLBs)
  - Block SelectRAM memory
  - Multiplier blocks
  - DCM (Digital Clock Manager)
• Configurable Logic Blocks (CLBs) provide functional elements for combinational and synchronous logic, including basic storage elements.
• Block SelectRAM memory modules provide large 18 Kbit storage elements of dual-port RAM.
• Multiplier blocks are dedicated 18-bit x 18-bit multipliers.
• DCM (Digital Clock Manager) blocks provide self-calibrating, fully digital solutions for clock distribution delay compensation, clock multiplication and division, and coarse- and fine-grained clock phase shifting.
Features
• Densities up to 8 million system gates, with internal clock speeds up to 420 MHz
 Distributed and Block RAM available
 Low Power
• Delay-Locked Loops (DLLs)
• 1.5 V internal (core) operation, with support for common power supply voltages
 High-Performance Interfaces to External
Memory
 Active Interconnect Technology
Naming Conventions
Block Diagram of VIRTEX-II
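[Figure: block diagram of Virtex-II. On-chip resources shown: DCM, distributed RAM, 18 Kb BRAM, shift registers, multiplier, FIFO, CAM. Interfaces shown: SONET/SDH, LVDS, BLVDS backplane, PCI, PCI-X, DDR, DDR SDRAM, DDR SRAM, DDR CAM, QDR.]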
CLB Resources
• The basic resource unit is the logic cell
• 1 CLB contains 2 to 4 logic cells, depending on device family
• Logic cell = 4-input look-up table (LUT) + D flip-flop
• LUT capacity is limited by the number of inputs, not the complexity of the function
• LUTs can be used as ROM or synchronous RAM
• The flip-flop can be configured as a transparent latch in Virtex and Spartan-II

[Figure: a logic cell, drawn as a LUT feeding a flip-flop (FF).]
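
The point that LUT capacity depends on the number of inputs, not on the complexity of the function, can be illustrated with a small software model. This is only a sketch in Python, not vendor code; the names `make_lut4` and `lut4_read` are hypothetical. Any Boolean function of four inputs collapses to the same 16-entry table, and evaluating it is a single table lookup.

```python
def make_lut4(func):
    """Build the 16-entry truth table for any Boolean function of 4 inputs."""
    table = []
    for addr in range(16):
        # bits[0] is the least significant input of the address
        bits = [(addr >> i) & 1 for i in range(4)]
        table.append(func(*bits) & 1)
    return table


def lut4_read(table, i1, i2, i3, i4):
    """Evaluate the LUT: the four inputs form an address into the table."""
    addr = i1 | (i2 << 1) | (i3 << 2) | (i4 << 3)
    return table[addr]


# A simple function and a more "complex" one cost exactly one LUT each.
and4 = make_lut4(lambda a, b, c, d: a & b & c & d)
parity4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Both `and4` and `parity4` occupy 16 bits of storage, which is also why the same structure can serve as a small ROM.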
Closer Look at a CLB Structure
[Figure: two slices side by side. Each slice contains two 4-input look-up tables (inputs G1-G4 and F1-F4), each followed by carry and control logic and a D flip-flop (D, CK, EC, R, S, Q). Slice pins shown: COUT, YB, Y, XB, X, F5IN, BY, SR, CIN, CLK, CE.]

• Each slice has 2 LUT-FF pairs with associated carry logic
• Two 3-state buffers (BUFT) are associated with each CLB
• Each slice has four outputs:
  - Two registered outputs
  - Two non-registered outputs
• Carry logic for fast addition:
  - Two independent carry chains per CLB
CLB (Configurable Logic
Blocks)
• Each CLB is connected to one switch matrix, providing access to general routing resources.
• High level of logic integration
• Wide-input functions:
  - 16:1 multiplexer in 1 CLB or any function
  - 32:1 multiplexer in 2 CLBs (1 level of LUT)
• Fast arithmetic functions:
  - 2 look-ahead carry chains per CLB column
• Addressable shift registers in LUT:
  - 16-b shift register in 1 LUT
  - 128-b shift register in 1 CLB

[Figure: a CLB with four slices, S0 (X0Y0), S1 (X0Y1), S2 (X1Y0), and S3 (X1Y1), two TBUFs, a switch matrix, fast connects, a SHIFT chain, and CIN/COUT carry chains.]
Interconnect Technology
Offered by VIRTEX-II
• Interconnect is an array of switch matrices
• All Virtex-II features can access routing resources through the switch matrix
• Simplifies design and place & route

[Figure: an array of switch matrices, each attached to one element: CLB, IOB, 18 Kb BRAM, 18x18 MULT, or DCM.]
Shift Register
LUT
• Each LUT can be configured as a shift register
  - Serial in, serial out
• Dynamically addressable delay up to 16 cycles
• For programmable pipeline
• Cascade for greater cycle delays
• Use CLB flip-flops to add depth

[Figure: a LUT drawn as an equivalent chain of clock-enabled D flip-flops from IN to OUT, sharing CLK and CE, with the output tap selected by DEPTH[3:0].]
Shift Register Look-Up Table
• High-density integration of shift registers
• DSP applications use SRL16 for delay matching
• CDMA wireless and video applications require shift registers
• Up to 128-b per CLB
• Cascadable output
• Dynamically addressable output
• 16-b per LUT
• Multiple SRLC16 cascadable to any length
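
The dynamically addressable delay described above can be sketched as a small behavioural model. This is an illustrative Python sketch, not the Xilinx primitive itself; the class `SRL16` and the helper `delay_line` are hypothetical names, and the model assumes that the 4-bit address (DEPTH[3:0]) selects which of the 16 stages drives the output.

```python
class SRL16:
    """Behavioural sketch of a 16-stage shift register with an addressable tap."""

    def __init__(self):
        # bits[0] holds the value shifted in most recently
        self.bits = [0] * 16

    def clock(self, d, ce=1):
        """One rising clock edge: shift in d when clock enable is high."""
        if ce:
            self.bits = [d & 1] + self.bits[:-1]

    def q(self, depth):
        """Output tap chosen by the 4-bit depth address (0 = newest stage)."""
        return self.bits[depth & 0xF]


def delay_line(samples, depth):
    """Feed samples through one SRL16 and read the selected tap after each edge."""
    srl = SRL16()
    out = []
    for s in samples:
        srl.clock(s)
        out.append(srl.q(depth))
    return out
```

Changing `depth` between clock edges changes the delay without reloading the register contents, which is what makes the structure useful for delay matching in a pipeline.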


Digital Clock Manager
• High-speed 420 MHz clock generation
• Clock de-skew on-chip and off-chip
• Up to 12 DCMs per device
• Fully digital circuitry

Flexible Frequency Synthesis
• Synthesis outputs: clock 0° and 180° (def.: 4X)

High-Resolution Phase Shifting
• DPS fixed and variable modes

Delay-Locked Loop (DLL)
• Precise clock de-skew
• DLL outputs: clock 0°, 90°, 180°, 270°
• DLL outputs: clock 2X and clock division
• 50/50 duty cycle correction
DCM Features
• Clock De-skew: The DCM generates new system
clocks (either internally or externally to the FPGA),
which are phase-aligned to the input clock, thus
eliminating clock distribution delays.
• Frequency Synthesis: The DCM generates a wide
range of output clock frequencies, performing very
flexible clock multiplication and division.
• Phase Shifting: The DCM provides both coarse phase
shifting and fine-grained phase shifting with dynamic
phase shift control.
Digital Clock Manager: DCM
The DCM has the following general control signals:
• RST input pin: resets the entire DCM
• LOCKED output pin: asserted High when all enabled DCM circuits have locked
• STATUS output pins (active High)

[Figure: DCM symbol. Inputs: CLKIN, CLKFB, RST, DSSEN, PSINCDEC, PSEN, PSCLK. Outputs: CLK0, CLK90, CLK180, CLK270, CLK2X, CLK2X180, CLKDV, CLKFX, CLKFX180, LOCKED, STATUS[7:0], PSDONE. The legend distinguishes clock signals from control signals.]
Frequency Synthesis of DCM
The CLK2X and CLK2X180 outputs double the clock frequency.
The CLKDV output creates divided output clocks with division options of 1.5, 2, 2.5, 3, ..., 7, 7.5, 8, 9, 10, 11, 12, 13, 14, 15, and 16.
CLK2X180 is phase shifted 180 degrees relative to CLK2X.
CLKFX180 is phase shifted 180 degrees relative to CLKFX.
The CLKFX and CLKFX180 outputs can be used to produce clocks at the following frequency:

Freq CLKFX = (M/D) x Freq CLKIN
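
The frequency relationships above reduce to simple arithmetic. The following Python sketch works through them with exact fractions; the function names are hypothetical, and the code deliberately does not check M, D, or the divide value against the legal ranges of the device, since those limits are not given here.

```python
from fractions import Fraction


def clk2x(clkin_mhz):
    """CLK2X and CLK2X180 run at twice the input frequency."""
    return Fraction(clkin_mhz) * 2


def clkdv(clkin_mhz, divide):
    """CLKDV runs at the input frequency divided by the chosen option."""
    return Fraction(clkin_mhz) / Fraction(divide)


def clkfx(clkin_mhz, m, d):
    """CLKFX and CLKFX180: Freq CLKFX = (M/D) x Freq CLKIN."""
    return Fraction(clkin_mhz) * Fraction(m, d)
```

For example, with a 50 MHz input, `clkfx(50, 7, 2)` evaluates to 175 MHz, and `clkdv(100, 2.5)` evaluates to 40 MHz.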


High Resolution Phase
Shifting
• The DCM provides additional control over clock skew through either coarse or fine-grained phase shifting.
• The CLK0, CLK90, CLK180, and CLK270 outputs are each phase shifted by ¼ of the input clock period relative to each other, providing coarse phase control.
• Note that CLK90 and CLK270 are not available in high-frequency mode. Fine-phase adjustment affects all nine DCM output clocks.
• When activated, the phase shift between the rising edges of CLKIN and CLKFB is a specified fraction of the input clock period.
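
To make the quarter-period spacing of the coarse outputs concrete, the short Python sketch below converts each phase to a time offset for a given input frequency. The function name is hypothetical, and the calculation is plain arithmetic rather than a model of the DCM circuitry.

```python
def coarse_phase_offsets_ns(clkin_mhz):
    """Time offset of each coarse-phase output relative to CLK0, in nanoseconds."""
    period_ns = 1000.0 / clkin_mhz
    phases = (("CLK0", 0), ("CLK90", 90), ("CLK180", 180), ("CLK270", 270))
    return {name: period_ns * degrees / 360 for name, degrees in phases}
```

With a 100 MHz input (10 ns period), consecutive outputs are 2.5 ns apart.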
Global
Clocks
• Up to 16 dedicated low-skew clocks
• 16 global clock multiplexers and buffers
• 8 clock nets in each quadrant
• Global clock enable
• Glitch-free switching from one clock to another
• 16 clock pads (can be used as user I/O)
Clock Distribution
• 16 global clock multiplexers
  - Eight on the top
  - Eight on the bottom
• Switch "glitch free" from one clock to the other
• 8 clocks selectable per quadrant
• Unused branches are disabled (power saving)

[Figure: 8 BUFGMUX at the top and 8 BUFGMUX at the bottom of the device drive 16 clocks; at most 8 clocks are routed into each of the NW, NE, SW, and SE quadrants.]
Use Global Buffers to
Reduce Clock Skew
• Global buffers are connected to dedicated routing.
• This routing network is balanced to minimize skew.
• All Xilinx FPGAs have global buffers.

[Figure: a design containing two clock signals, CLK1 and CLK2, each clocking a D flip-flop. One version introduces clock skew between CLK1 and CLK2; the other uses an extra BUFG to reduce skew on CLK2.]
Memory
On-Chip SelectRAM™ Memory

[Figure: "Terabit Memory Continuum" across three tiers: distributed RAM (bytes, 128x1), block RAM (kilobytes, 18 kb blocks), and external RAM/CAM (megabytes, DDR & QDR, up to 400 Mbps/pin). Example uses shown: large FIFOs, packet buffers, video line buffers, cache tag memory, DSP coefficients, small FIFOs, and CAM (deep/wide and shallow/wide).]


Embedded 18 kb Block RAM
• Up to 3 Mb on-chip block RAM
• High internal buffering bandwidth
• Reduced I/O count and more embedded memory

18 Kbit block RAM
• Parity bit locations (parity in/out buses)
• Data width up to 36 bits
• 3 WRITE modes
• Output latches Set/Reset
• True dual-port RAM
• Independent clock (async.) and control
Distributed RAM
• CLB LUT configurable as distributed RAM
• A LUT equals 16x1 RAM
• Implements single and dual ports
• Cascade LUTs to increase RAM size
• Synchronous write
• Synchronous/asynchronous read
• Accompanying flip-flops used for synchronous read

[Figure: distributed RAM primitives built from LUTs. RAM16X1S: D, WE, WCLK, A0-A3, O. RAM32X1S: D, WE, WCLK, A0-A4, O. RAM16X2S: D0, D1, WE, WCLK, A0-A3, O0, O1. RAM16X1D: D, WE, WCLK, A0-A3, SPO, DPRA0-DPRA3, DPO.]
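
The dual-port case with a synchronous write and an asynchronous read can be sketched as a behavioural model. The Python class below borrows the pin names of the RAM16X1D primitive from the figure (D, WE, A, SPO, DPRA, DPO), but the class and its method names are a hypothetical illustration, not generated or vendor code.

```python
class RAM16X1D:
    """Behavioural sketch of a 16x1 dual-port distributed RAM."""

    def __init__(self):
        self.mem = [0] * 16  # one LUT's worth of storage

    def write_clock(self, we, a, d):
        """Rising edge of WCLK: store D at address A only when WE is high."""
        if we:
            self.mem[a & 0xF] = d & 1

    def spo(self, a):
        """Asynchronous read at the write-port address A."""
        return self.mem[a & 0xF]

    def dpo(self, dpra):
        """Asynchronous read at the independent second-port address DPRA."""
        return self.mem[dpra & 0xF]
```

A synchronous read would correspond to registering the value returned by `spo` or `dpo` in the accompanying flip-flop on the next clock edge.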
18 x 18 Embedded Multiplier
• Fast arithmetic functions
• Optimized to implement multiply / accumulate modules
• 18 x 18 signed multiplier
• Fully combinatorial
• Optional registers with CE & RST (pipeline)
• Independent from adjacent block RAM
18 x 18 Multiplier
• Embedded 18-bit x 18-bit multiplier
• 2's complement signed operation
• Multipliers are organized in columns

[Figure: 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and a 36-bit output.]
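
The 2's complement behaviour of an 18 x 18 signed multiply with a 36-bit result can be checked with a few lines of Python. This is a bit-accurate arithmetic sketch only; the function names `to_signed` and `mult18x18` are hypothetical and say nothing about timing or the optional pipeline registers.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mult18x18(data_a, data_b):
    """Multiply two 18-bit two's complement words; return the 36-bit product word."""
    product = to_signed(data_a, 18) * to_signed(data_b, 18)
    return product & ((1 << 36) - 1)
```

The largest magnitude product, (-2^17) x (-2^17) = 2^34, still fits in a signed 36-bit word, so the output never overflows.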
Basic I/O Block Structure
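[Figure: basic I/O block with three register paths. Three-state control: D flip-flop with FF enable (EC), clock, and set/reset (SR). Output path: output D flip-flop with FF enable (EC) and SR. Input path: a direct input plus a registered input through a D flip-flop with FF enable (EC) and SR.]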
I/O Signal Types

I/O signal types:
• Single-ended: LVCMOS, HSTL, SSTL, LVTTL
• Differential: LVDS, Bus LVDS, LVPECL

NOTE: Only the popular I/O types are shown here.


IOB: Double Data Rate
Registers
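• DDR registers can be clocked by:
  - Clock and not(clock), if the duty cycle is 50/50
  - CLK0 and CLK180 DLL outputs

[Figure: timing diagram. DATA_1 carries D1A, D1B, D1C and DATA_2 carries D2A, D2B, D2C, one value per CLK cycle; the double data rate output interleaves them as D1A, D2A, D1B, D2B, D1C.]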
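
The interleaving shown in the timing diagram can be written out as a short Python sketch. It assumes that DATA_1 is sent on the rising edge and DATA_2 on the falling edge of each clock cycle (or, equivalently, on CLK0 and CLK180); the function name `ddr_output` is hypothetical.

```python
def ddr_output(data_1, data_2):
    """Interleave two single-rate streams into one double-data-rate stream."""
    out = []
    for d1, d2 in zip(data_1, data_2):
        out.append(d1)  # value driven on the rising edge (CLK0)
        out.append(d2)  # value driven on the falling edge (CLK180)
    return out
```

Calling `ddr_output(["D1A", "D1B", "D1C"], ["D2A", "D2B", "D2C"])` reproduces the D1A, D2A, D1B, D2B, D1C ordering of the diagram, followed by D2C.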


THANK YOU
