Test-Model based Hierarchical DFT Synthesis
Sanjay Ramnath, Frederic Neuveux, Mokhtar Hirech and Felix Ng
Synopsys Inc., Mountain View, CA 94043
{sramnath,fiedn, hirech, felixng}@synopsys.com
Abstract

With increasing design sizes and adoption of System on a Chip (SoC)
methodology, design synthesis and test automation tools are hitting
capacity and performance bottlenecks. Currently, hierarchical synthesis
flows for large designs lack complete design-for-test (DFT) support.
With this paper, we address a solution, involving the introduction of
test models in a traditional DFT synthesis flow, that we term
Hierarchical DFT Synthesis (HDS). We discuss the use of Core Test
Language (CTL) based test models combined with physical and timing
models to provide a complete flow for chip-level DFT. In doing so we
address some challenges the new flow presents such as Design Rule
Checking (DRC), DFT architecting and optimization. We describe methods
to overcome these challenges thereby presenting a new methodology to
handle complex next generation designs.

1.0 Introduction

Recent advances in manufacturing and methodology allow for larger and
more complex IC designs. In today's environment, circuit sizes
typically exceed a million gates, introducing further complexity to the
design flow in terms of timing, placement and routing. This, coupled
with the increasing relevance of the design-reuse paradigm, suggests
that capacity and performance will soon become major concerns with most
DFT tools. In the past, DFT flows advocated a top-down approach to
synthesize, optimize and insert test logic on flattened designs. Today,
design re-use and System on a Chip (SoC) methodologies [1] are driving
the shift towards hierarchical flows, where pre-assembled blocks are
integrated with control logic to form a complete system. These flows
recommend a bottom-up approach to perform DFT synthesis on large
hierarchical designs. Designers develop blocks concurrently,
synthesizing them and implementing DFT at the front end of the design
process. This enables predictability and facilitates optimization to
minimize the impact of test logic on the design. Once all the blocks
are complete and DFT ready, final assembly integrates them and
addresses DFT at the chip-level. If each of these blocks is over a
million gates large, then reading in the entire design and performing
DFT insertion at the chip-level becomes impractical. In addition we
also have to account for glue logic between these blocks, which might
be significant.

The concept of test modeling presents a solution to this problem. Test
modeling refers to the abstraction of DFT structures embedded in a
design, in the form of a test model. In other words, a test model
encapsulates all DFT information needed by a system integrator. The
proposed HDS flow uses a test model instead of a netlist representation
of the sub-modules during chip-level integration. Thus, we realize a
significant improvement in terms of both capacity and performance since
the size of the abstract model is typically only a small fraction of
the original netlist.

The use of test models instead of complete netlists presents us with a
number of challenges, in terms of reusing existing proven technology.
In this context we address Design Rule Checking (DRC), DFT architecting
and optimization.

DRC can be applied stand-alone, to validate test design rules, or as a
pre/post-processor to DFT synthesis, to extract information for the
purpose of DFT modeling such as sequential cells that violate test
design rules and scan chain information. Our implementation of DRC in a
traditional DFT synthesis flow is simulation based and therefore relies
on the availability of a gate-level netlist. To leverage this
technology we present a technique by which we extract representative
netlists from test models.

DFT Closure, that is to rapidly and predictably meet all DFT
requirements from RTL to GDSII [2], is a mandate for most designs. This
needs to be achieved at every stage in the design process, particularly
at the chip-level. Therefore HDS must avoid the following:

- Timing violations of design rules and constraints due to test logic.
- Placement violation and routing congestion created by scan path
  buffers and scan nets.

Although CTL models provide sufficient details for inserting DFT logic
and connecting scan structures, they lack information that is required
to optimize designs. Design optimization integrated with DFT insertion
is already implemented in one-pass DFT synthesis [3]. The challenge
lies in enabling this technology in the presence of test models.

This paper describes the combined use of test, timing and physical
models to develop a new complete chip-level DFT synthesis flow to
handle complex multi-million gate hierarchical designs.

We use Core Test Language (CTL) to describe test models. CTL is the
modeling language portion of the proposed P1500 standard for Embedded
Core Test [4]. Although the standard is targeted towards SoC
methodologies, this paper illustrates a powerful application of CTL to
enhance traditional DFT flows.

This paper is organized as follows. In section 2 we present an overview
of the classical bottom-up approach to one-pass DFT synthesis. In
section 3 we introduce HDS. In this context we briefly introduce some
key concepts of CTL and present the new flow and its advantages. In
section 4 we discuss the challenges we faced and the techniques we used
to migrate existing technology to the new flow. In section 5 we present
experimental results and conclude in section 6.
0-7803-7607-2/02/$17.00 © 2002 IEEE
2.0 Classical DFT Synthesis

Figure 1 illustrates a traditional one-pass bottom-up DFT synthesis
flow. In this flow, insertion of DFT logic (e.g. test-points) and scan
assembly first takes place at a module level. The module designer then
hands out a DFT-ready block to the system integrator. This testable
block is integrated 'as-is' at higher levels of abstraction. This means
during integration, no further changes are allowed to the DFT
structures within the block.

As sub-design sizes increase, this flow will soon hit capacity and
performance limitations. However it presents several advantages with
respect to predictability and optimization during insertion of DFT
logic. A key point to be noted here is the fact that once DFT is
inserted in the sub-modules, we do not require any information about
them other than the DFT structures. Therefore it is possible to model
the sub-modules in a more compact fashion instead of retaining the
entire netlist during chip-level integration.

Figure 1. Bottom-up DFT synthesis

3.0 Test Model-based Hierarchical DFT Synthesis

3.1 Core Test Language (CTL)

CTL (P1450.6) is being developed as part of the IEEE P1500 standard for
Embedded Core Test. The goal is to define a language to describe all
the necessary information for test pattern reuse and the needs of test
during system integration. Test aspects of a core can be described via
CTL so that the core can be integrated as a black box into a SoC
design. While [4][5] give more details on the language and syntax, we
briefly describe some key concepts that are required for this
discussion.

3.1.1 CTL Structure and Syntax

The information contained in the CTL model for a module is classified
according to configurations (modes) of the module. Figure 2 illustrates
this architecture. Every mode has an associated initialization
sequence. Some modes contain test pattern information while others
contain structural information about the DFT logic included in the
module. For the purpose of this discussion we focus on the InternalTest
mode of operation of the module. This mode allows for the testing of
the internal logic of the module through the DFT structures. The CTL
description for this mode typically contains the following details:

1. Signals and Signal Groups - Define the I/O boundary.
2. Macros - A template that applies data defined by a pattern in a
   certain sequence. The initialization sequence for the mode is
   defined here.
3. Procedures - Define the scan test sequence.
4. Scan Structures - Describe the scan chains.
5. Data Types for various control signals, such as clock, test mode,
   and scan enable.

Figure 2. CTL Structure

The CTL syntax can be illustrated with a simple example of a DFT ready
design shown in Figure 3 and its associated partial CTL model described
in Figure 4. The design comprises one scan chain built with 2
multiplexed scan flip-flops and a synchronization latch. Since CTL is
under development, the syntax is subject to modification.

Figure 3. DFT ready design

The following sections describe the application of test models to HDS.
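
As an informal illustration of the kind of information a CTL mode carries (signals, data types, an initialization sequence and scan structures, as classified in Section 3.1.1), the sketch below holds it in a small Python record. This is not CTL syntax. The class names `ScanChain` and `CtlMode`, the field names, the signal names (`clk`, `tm`, `test_se`, `test_si`, `test_so`) and the data type strings are all assumptions made for this example, loosely following the one-chain, two-flip-flop design described in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ScanChain:
    """Hypothetical stand-in for one scan structure entry of a mode."""
    name: str
    scan_in: str
    scan_out: str
    scan_enable: str
    length: int            # number of scan cells (assumed value below)
    clocks: List[str]


@dataclass
class CtlMode:
    """Hypothetical container for the per-mode information of a test model."""
    name: str
    signals: List[str]                    # I/O boundary of the module
    data_types: Dict[str, str]            # signal name -> role (illustrative strings)
    init_sequence: List[Dict[str, int]]   # values applied to initialize the mode
    scan_chains: List[ScanChain] = field(default_factory=list)

    def signals_of_type(self, data_type: str) -> List[str]:
        """Return the boundary signals that play the given role."""
        return [s for s in self.signals if self.data_types.get(s) == data_type]


# Assumed example: one chain of two multiplexed scan flip-flops.
internal_test = CtlMode(
    name="InternalTest",
    signals=["clk", "tm", "test_se", "test_si", "test_so"],
    data_types={
        "clk": "ScanMasterClock",
        "tm": "TestMode",
        "test_se": "ScanEnable",
        "test_si": "ScanDataIn",
        "test_so": "ScanDataOut",
    },
    init_sequence=[{"tm": 1}],
    scan_chains=[ScanChain("chain1", "test_si", "test_so", "test_se", 2, ["clk"])],
)
```

A system integrator working from such a record needs only the boundary signals and the chain description, which is the property the HDS flow relies on.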
3.2 Hierarchical DFT Synthesis (HDS)
Figure 5 illustrates the HDS flow. During chip-level integration, we
use the test model representation of the sub-modules instead of their
netlist representations. Since the sub-modules are DFT-ready, we do
not allow any changes to their netlists. Test models are useful here
because during integration, we are only concerned with portions of
sub-modules that are important for inserting DFT at the higher level of
abstraction. CTL enables us to either manually create the test models
or integrate the process with automated tools. Thus, the new flow
helps accommodate multi-million gate hierarchical designs.
4.0 Challenges
The following sub-sections discuss the challenges imposed by HDS
in the context of DRC, DFT architecting and optimization. We present
the requirements of each task and describe the techniques devised to
meet these requirements by leveraging as much of the existing
technology as possible. For the sake of simplicity, all examples assume
Multiplexed flip-flop scan style [6].
4.1 Design Rule Checking
4.1.1 Requirements
Our implementation of DRC in a classical DFT synthesis flow relies
on symbolic simulation of a test protocol. A test protocol is a formal
description of the sequence of operations performed while testing a
design. A test protocol for a serial scan design comprises the serial
scan-in, parallel measure and capture, and serial scan-out operations
[7][8]. The key idea is to logic simulate the process of testing a design
and through this simulation verify compliance with scan test design
rules. The symbolic simulator is based on a system of classical
three-valued logic {1, 0, X} in addition to values that enable
simulation of test protocols. Simulation values are propagated as
tokens to establish states in all sequential cells within the design
that are then checked for scan compliance. For example, simulation of a
scan-in operation should establish an arbitrary known state in all
sequential cells within the design. A cell whose state is not
controllable represents a design rule violation. This approach
generalizes the concept of scan design to any sequential cell that can
be controlled and observed through the application of a test protocol.
When DRC is invoked on a DFT-ready design, the test protocol is updated
with details regarding the new scan structures, if any. Figure 6
illustrates the typical DRC flow.
Symbolic simulation requires a gate-level netlist representation.
Modules without such netlists are considered black boxes since we
cannot propagate simulation tokens through them. This includes DFT-
ready sub-modules with only test model representations.
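
To make the token-propagation idea concrete, here is a minimal sketch of a symbolic scan-in check. It is not the DRC engine described in this paper: the function names, the token labels (`t0`, `t1`, ...) and the per-cell `shifts` flag (standing in for "clock and scan enable are controllable during shift") are assumptions chosen for the example. Every cell starts at the unknown value X, one fresh token is applied per shift cycle, and any cell still holding X after scan-in is reported.

```python
from typing import Dict, List, Tuple

X = "X"  # unknown value of the three-valued logic {1, 0, X}


def simulate_scan_in(chain: List[Tuple[str, bool]]) -> Dict[str, str]:
    """Shift symbolic tokens through a chain ordered from scan-in to scan-out.

    Each entry is (cell_name, shifts); shifts is False for a cell that does
    not load its scan input during shift (an assumed way to model a
    violation). One shift cycle is run per cell in the chain.
    """
    state = {name: X for name, _ in chain}
    for cycle in range(len(chain)):
        upstream = f"t{cycle}"          # token applied at the scan-in pin
        next_state = {}
        for name, shifts in chain:
            # A shifting cell loads the previous value of its predecessor.
            next_state[name] = upstream if shifts else state[name]
            upstream = state[name]      # old value feeds the next cell
        state = next_state
    return state


def uncontrollable_cells(chain: List[Tuple[str, bool]]) -> List[str]:
    """Cells whose state is still unknown after scan-in."""
    state = simulate_scan_in(chain)
    return [name for name, _ in chain if state[name] == X]


good_chain = [("ff1", True), ("ff2", True), ("ff3", True)]
bad_chain = [("ff1", True), ("ff2", False), ("ff3", True)]
```

In this sketch a single non-shifting cell also leaves every cell downstream of it at X, which is why the check needs some netlist to propagate tokens through and cannot be run on a pure black box.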
Therefore, for the purpose of DRC, we need to replace each DFT-
ready sub-module with a CTL model by a significantly smaller
equivalent netlist that only represents DFT information described in
the corresponding test models, a process we term DRC modeling. This
netlist should accurately represent the DFT logic in the sub-module
while at the same time preserving the capacity benefit that we realize
by using test models.
4.1.2 DRC for HDS
The following sub-sections detail the DRC modeling mechanism.
We begin by introducing some definitions that characterize the DFT
information that is extracted from a test model representation of a
sub-module for the purpose of DRC modeling. We then use these
definitions to describe the mechanism in detail.

Figure 4. Partial CTL model
Figure 5. High-capacity DFT flow

4.1.2.1 Definitions

a. Scan Segment

This concept of scan segment has been introduced in the context of
hierarchical scan synthesis [9].

A scan segment is a chain containing one or more completely connected
scan cells. A scan chain specified as a scan segment is complete and
atomic in the sense that it cannot be reconfigured. A scan segment has
a scan-in pin si and a scan-out pin so. A scan segment can be made a
part of another scan chain. Therefore in a bottom-up flow, at the
module level each scan chain is a scan segment. At the chip-level
integration, each scan segment in the sub-module can be a part of a
scan chain. Our discussion on DFT architecting will further elaborate
on scan segments and how to structure them. Here we will concentrate on
the application to DRC.

b. Clock domain

An active edge (leading edge or trailing edge) of each clock is
considered to be in a separate clock domain. Both edges of a clock and
clocks with different timing characteristics may be used to control
edge-triggered scan flip-flops of a scan chain. In order to construct
functional scan chains, two adjacent scan flip-flops A and B (A
serially driving B) must adhere to the rule that B must be clocked at
the same time or before A.

c. Clock ordering

The precedence relationships between scan flip-flops imposed by clock
domain timing characteristics are defined at the scan segment level.
Capture and launch times for a scan segment are deduced from the
capture time of its first scan cell (driven by its scan input) and the
launch time of its last scan cell (driving its scan output). Thus this
precedence relationship between scan segments can be respected during
chip-level DFT architecting.

Figure 6. Design Rule Checking

4.1.2.2 DRC Modeling

Figure 7 transcribes the algorithm and Figure 8 illustrates the scheme
followed for DRC modeling. The algorithm makes use of the following
terminology:

$S = \{s_1, s_2, \ldots, s_n\}$ is the set of all scan segments in a
DFT-ready sub-module.

$N_i$ is defined as the number of scan cells in scan segment $s_i$.

$C_m = \{c_1, c_2, \ldots\}$ is the set of all scan master clocks for a
scan segment.

$C_s = \{c_1, c_2, \ldots\}$ is the set of all scan slave clocks for a
scan segment.

Each clock has an associated rise time $r$ and fall time $f$.

$A = \{a_1, a_2, \ldots, a_k\}$ is the set of asynchronous sets/resets
for a scan segment.

$E = \{e_1, e_2, \ldots\}$ is the set of scan enable signals for a scan
segment.

Figure 8a shows the netlist of a sub-module (M) from which a test model
is generated. L1 and L2 denote combinational logic and ff1, ff2 and ff3
are multiplexed D flip-flops. Figure 8b describes the integration of
sub-module M at a higher level of abstraction (design Top). L' and L''
could be combinational or sequential user-defined logic (UDL). At this
level, sub-module M appears as a pure black box. Only the interface and
the test model are visible to the DFT architect. Figure 8c shows the
equivalent netlist resulting from DRC modeling.
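
The adjacency rule from the clock domain definition (when A serially drives B, B must be clocked at the same time as or before A), lifted to scan segments through their capture and launch times, can be sketched as a simple check. The `ScanSegment` class, its field names and the numeric edge times are hypothetical and chosen only for illustration; the paper does not specify a data structure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScanSegment:
    """Hypothetical record for an atomic scan segment."""
    name: str
    length: int           # number of scan cells in the segment
    capture_time: float   # active-edge time of the first cell (driven by scan-in)
    launch_time: float    # active-edge time of the last cell (driving scan-out)


def can_drive(a: ScanSegment, b: ScanSegment) -> bool:
    """True if segment a may serially drive segment b.

    The first cell of b must be clocked at the same time as, or before,
    the last cell of a.
    """
    return b.capture_time <= a.launch_time


def chain_is_valid(chain: List[ScanSegment]) -> bool:
    """Check the rule for every adjacent pair, from scan-in to scan-out."""
    return all(can_drive(a, b) for a, b in zip(chain, chain[1:]))


# Assumed example: one segment clocked at 55 ns, one at 45 ns.
late = ScanSegment("seg_late", length=10, capture_time=55.0, launch_time=55.0)
early = ScanSegment("seg_early", length=8, capture_time=45.0, launch_time=45.0)
```

Placing the later-clocked segment first satisfies the rule, while the opposite order does not; handling the second case (for example with a lock-up element) is outside the scope of this sketch.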
Therefore we replace the black box by a netlist (CTL-to-gates) that
we quickly construct from the test model description, the basic idea
being to minimize the size of the scan segments. DRC then operates
on the modeled netlist.
For DRC, we need to ensure that all control and access pins for
scan segments are completely represented in the model. This is
required for accurate validation. As long as this property is preserved
during the modeling process, there will be no loss of critical
information. For example, a scan segment of any sequential length,
with one asynchronous input, and one clock domain, can be
represented by a single scan cell.
Therefore, we obtain a significant reduction in segment length for
most scenarios. In the worst case, when the number of asynchronous
signals or the number of clocks equals the number of scan cells N in
the segment, there will be no reduction in the segment length. A
similar technique can be applied to other scan styles as well by using
a more complex, technology-independent, pre-defined set of cells.
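
The size reduction can be illustrated with a small calculation. The formula below is an assumption made for this sketch, not the algorithm of Figure 7: it takes the modeled segment length to be the larger of the number of distinct clocks and the number of distinct asynchronous signals, with a minimum of one cell and never more than the original length. It is only meant to agree with the two cases stated in the text (one clock domain and one asynchronous input collapse to a single cell; as many clocks or asynchronous signals as cells gives no reduction).

```python
from typing import Iterable


def modeled_length(n_cells: int,
                   clocks: Iterable[str],
                   async_signals: Iterable[str]) -> int:
    """Assumed length of the DRC-model segment for an n_cells-long segment.

    One representative cell is kept per distinct control pin so that every
    clock and asynchronous set/reset stays visible to the rule checker.
    """
    if n_cells < 1:
        raise ValueError("a scan segment contains at least one scan cell")
    distinct_controls = max(len(set(clocks)), len(set(async_signals)), 1)
    return min(n_cells, distinct_controls)


# One clock domain and one asynchronous input: a single cell suffices.
single = modeled_length(500, ["clk"], ["rst"])
# As many clocks as cells: no reduction.
worst = modeled_length(4, ["c1", "c2", "c3", "c4"], [])
```

Because the modeled segment is shorter than the original, any cycle count tied to segment length has to be adjusted for the duration of rule checking.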
The following considerations figure in this context:

- We do not perform capture checks since they require functional
  information that is not available in the test model. Capture
  checks are more pertinent to automatic test pattern generation.
- In the case of 3-state pins, the model is enhanced with additional
  structures to represent disabling logic.
- We assume that segments in the test-modeled subdesigns are
  correct by construction.
- The DRC model is a transient entity and is removed at the end
  of rules checking. Therefore violations detected on cells of the
  model must be reported at the boundary of the sub-module and
  not on the virtual cells that constitute the model.

Figure 7. DRC modeling algorithm

This technique has the following impact:

- The test protocol needs to be updated before simulation to
  reduce the number of simulation cycles depending on the length
  of the DRC modeled segment. The original value is restored at
  the end of simulation.
- We need to maintain correspondence between the original scan
  segment and the scan segment in the DRC model. This is critical
  to reporting violations on the correct entity.
- One of the functions that DRC performs, when run as a post-
  processor to DFT insertion, is to extract scan structures for
  reporting purposes. Here, care should be taken to extract
  the original scan segment out of the DRC model.

This solution therefore allows us to leverage the current DRC
technology, without modifying the DRC engine.

Figure 8. DRC modeling (a. Sub-module M; b. Hierarchical design with
test model for M; c. DRC Model for M)
4.2 DFT Architecting

4.2.1 Requirements

Given a design encapsulating a set of module instances, each pointing
to a CTL model, the task of DFT architecting involves generating a
complete DFT plan for the design, integrating DFT structures from the
modeled instances. The DFT structures created at the design level must
preserve the testability achieved at the module level and provide
access to the instances from the design ports.

DFT insertion using CTL models must obey the same rules as the
classical flow that works with leaf level cells. Controllability of
clock and asynchronous signals, clock sensitivity, balancing of scan
chains, avoidance of float and contention conditions during scan
shifting are typical constraints that drive DFT architecting. These
constraints require that the CTL model contain information such as:

1. Scan structures
2. Enabling/disabling conditions on tri-state ports
3. Test control ports and purpose
4. Clock dependency
5. Asynchronous control signals

4.2.2 DFT Architecting for HDS

Migrating from a full gate level netlist representation of a sub-module
to a CTL model requires changes to the way we evaluate the design for
DFT architecting. To facilitate this we utilize the concept of a scan
segment as described earlier.

A scan segment is characterized by the following:

1. Scan length
2. Serial scan access ports
3. Scan control signals
4. Clock synchronization
5. Test control signals

In addition to this we might have information about enabling/disabling
conditions for tri-state signals attached to the scan segment.

Recollect that during chip-level integration, each scan segment in the
sub-module can be made a part of a scan chain.

The description of the scan structures in a CTL model identifies the
scan serial inputs, serial outputs and global input ports of the block.
This indicates how the segment should be connected. Information from
the CTL model like CaptureClock, LaunchClocks, DataType, Scan Length
(Figure 4) is modeled as a scan segment object. Therefore we
encapsulate all the information about scan structures available in a
CTL model in the form of scan segments. At the end of this process,
each scan segment becomes analogous to a leaf level scan cell. Once all
the scan segments are extracted from the CTL description of a
sub-module, chip-level DFT architecture and insertion can happen
transparently. By making the scan segment the unit of representation,
we blur the difference between cell instances that are leaf level cells
and instances that are characterized by CTL models.

Proper attention must be paid to disabling logic on the tri-state
signals as this may cause contention or float conditions during scan
shifting if not properly accounted for. In the CTL model, tri-state
logic driving outputs of the sub-module is represented by the
IsDisabledBy property and a boolean equation combining inputs of the
sub-module. This information is translated and stored as boolean
equations in our internal model. During integration, combining these
equations will help identify the enabling and disabling conditions for
drivers connected to the same bus.

4.3 DFT Insertion and Optimization

4.3.1 Requirements

Since DFT insertion calls for design modification, it is important that
the design be optimized in order to meet logical and physical synthesis
constraints. CTL models lack timing and cell information that is
required during optimization. In the example of Figure 3, the
functional output of a scan register or a block is used as a scan
output, thereby increasing the load on the register "FF-T" or the
module output pin. This timing consideration is illustrated in Figure
9.

Figure 9. Chip-level timing path

Design optimization during DFT insertion has been implemented in what
is known today as one-pass DFT synthesis [3]. This flow is part of our
DFT solution and has been proven on a variety of industrial designs. In
order to harness this technology for HDS we require information that is
not available in the CTL model. This includes loads on inputs, drive
strength of outputs, timing path to the closest register, etc.
Therefore we rely on additional models for logical and physical
optimization.

4.3.2 DFT Insertion and Optimization for HDS

4.3.2.1 Logic Optimization

Recently, logic synthesis tools have started using timing models to
cope with increased design sizes. The model used, called Interface
Logic Model (ILM) [10], is well suited for synthesis and timing
analysis. The concept is illustrated in Figure 10 where we show a
module and its ILM representation. The ILM shown here encapsulates the
following information:

1. The cloud of combinational logic going from input ports to the
   sequential cells in the fan-out cone of these ports.
2. The cloud of combinational logic between output ports and
   sequential cells in the fan-in cone of these ports.
3. The cloud of combinational logic between ports.

This partial representation of the design enables us to extract
accurate timing information and design port characteristics.

If the full gate level netlist is available, one-pass DFT synthesis
integrates DFT insertion and design optimization steps to concurrently
fix synthesis design rule and timing constraint violations. For
hierarchical designs, availability of ILM and CTL models for
sub-modules extends the one-pass DFT synthesis approach to be
applicable during chip-level assembly. During chip-level DFT synthesis,
unified models that have both CTL and ILM information are used to
perform DFT synthesis to meet both the timing and test constraints.
Figure 10. ILM model creation
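
The three clouds of logic that an ILM retains can be sketched as a graph traversal. This is an illustrative reading of the description above, not the ILM generation algorithm of [10]: the netlist is reduced to a fan-out map between named nodes, and the function names and the tiny example netlist are assumptions. Logic is kept from each input port up to and including the first sequential cell, and from each output port back to and including the last sequential cell; register-to-register logic is dropped.

```python
from typing import Dict, List, Set


def cone(start: str, edges: Dict[str, List[str]], seq_cells: Set[str]) -> Set[str]:
    """Nodes reachable from start, stopping at sequential cells.

    The first sequential cell met on a path is kept but not traversed.
    """
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                if nxt not in seq_cells:
                    stack.append(nxt)
    return seen


def extract_interface_logic(fanout: Dict[str, List[str]],
                            inputs: List[str],
                            outputs: List[str],
                            seq_cells: Set[str]) -> Set[str]:
    """Return the set of nodes retained by this simplified interface model."""
    # Build the fan-in map by reversing every fan-out edge.
    fanin: Dict[str, List[str]] = {}
    for src, dsts in fanout.items():
        for dst in dsts:
            fanin.setdefault(dst, []).append(src)
    keep = set(inputs) | set(outputs)
    for port in inputs:
        keep |= cone(port, fanout, seq_cells)   # fan-out cone of input ports
    for port in outputs:
        keep |= cone(port, fanin, seq_cells)    # fan-in cone of output ports
    return keep


# Assumed netlist: in1 -> g1 -> ff1 -> g2 -> ff2 -> g3 -> out1, in2 -> g4 -> out2
fanout = {
    "in1": ["g1"], "g1": ["ff1"], "ff1": ["g2"], "g2": ["ff2"],
    "ff2": ["g3"], "g3": ["out1"], "in2": ["g4"], "g4": ["out2"],
}
ilm = extract_interface_logic(fanout, ["in1", "in2"], ["out1", "out2"], {"ff1", "ff2"})
```

In the example, gate `g2` sits between two registers and is the only node discarded; in a real block the discarded internal logic is what makes the model small.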
4.3.2.2 Physical optimization

Timing and placement driven optimization is at the core of today's
physical synthesis tools. In addition to logic optimization based on
library cell data, physical optimization aims at incorporating
parameters such as wire geometry and cell placement to achieve sharper
design analysis and better quality of results.

During chip-level integration, routing scan nets is a significant
portion of the overall routing process. Scan chain ordering [11] is
widely used at the sub-module level as well as at the chip-level to
reduce overall routing congestion and reduce connection lengths,
thereby minimizing the impact of scan nets on timing. Hence,
incorporation of physical data during DFT insertion is critical to
timing closure. This is illustrated in Figure 11.

Figure 11. Physical DFT architecture and optimization

By enhancing the unified ILM & CTL models with physical data, we allow
better optimization of block level scan structures during chip-level
assembly. With a structure similar to the Library Exchange Format (LEF)
representation, the outside geometry of a DFT inserted block and the
position of its scan ports is captured as a physical model. The precise
position of scan ports on the block provides more accurate information
than just the simple block location. As shown in Figure 11, those
models drive DFT synthesis to perform optimal partitioning and ordering
of block scan structures. This reduces routing congestion due to scan
nets. Long scan nets are handled by addition of buffers during timing
driven optimization.

Therefore by unifying CTL, ILM and physical models we avoid iterations
between DFT synthesis, logic optimization and placement, thereby
guaranteeing a fully integrated flow.

5.0 Experimental Results

The techniques described above have been implemented as part of our
commercial DFT Synthesis tool. Two experiments were run to compare the
memory usage and CPU run time between the conventional DFT synthesis
flow and the new flow. The comparisons are based on applying the same
tool to two different flows, the classical DFT synthesis flow versus
the HDS flow using test-models. Table 1 describes the statistics of the
design used for the experiments.

In Experiment 1 we inserted DFT on the design containing 3 hierarchical
sub-modules. We performed DFT synthesis on each block and saved it as
both full gate database and test model representations. During the top
level DFT synthesis we compared the memory consumption and CPU run time
between using full gate database and using test models. The comparison
was done at different stages in the flow. Table 2 describes the
results. We observe 2X improvement in memory and 4X improvement in
run-time.

In Experiment 2 we instantiated the design used in the first experiment
eight times and performed top-level DFT synthesis. The total number of
transistors was around 4 million. Again we compared the memory
consumption and CPU run time between using full gate and test model
representations. Table 3 describes the results. We observe 7X
improvement in memory and 41X improvement in run-time.

The results show that we can obtain significant capacity and
performance benefit by replacing netlists with test-model
representations, especially for large hierarchical designs.

6.0 Conclusion

Capacity and performance bottlenecks are growing concerns for most
commercial EDA tools. In this paper, we present a new hierarchical DFT
synthesis flow based on test, timing and physical models. This work was
motivated by a comprehensive effort to address the capacity issue with
current DFT synthesis flows, and illustrates a powerful application of
the IEEE P1450.6 Core Test Language (CTL) for test modeling.

The combined use of DFT, timing and physical abstractions has enabled
us to solve the capacity bottleneck without compromising on the quality
of results. Core Test Language is a technology enabler for modeling DFT
information. Using ILM abstractions, timing closure is achieved during
top-level DFT synthesis. Physical abstraction enables better ordering
of DFT structures that reduces top-level routing congestion.

The techniques described here have been implemented as part of our
commercial DFT synthesis solution. Several enhancements to the current
implementation are being planned. Some of these are support for
test-models with pass-through control signals (such as clocks, resets),
and support for models with more complex sequential initialization
sequences (as opposed to combinational initialization sequences that
are currently supported).

7.0 References

1. Gupta. R.K, Zorian. Y, "Introducing Core-Based System Design", IEEE
   Design and Test of Computers, December 1997. Pages: 14(4): 15-25.
2. Hayat. F, Williams. T.W, Kapur. R, Hsu. D, "DFT Closure",
   Proceedings of Asian Test Symposium, 2000. Pages: 8-9.
3. Hirech. M and Ramnath. S, "Moving from one-pass scan synthesis to
   one-pass DFT synthesis", European Test Workshop, 2001. 3B.1
4. Kapur. R, Keller. B, Koenemann. B, Lousberg. M, Reuter. P, Taylor. T
   and Varma. P, "P1500-CTL: Towards a Standard Core Test Language",
   Proceedings of 17th IEEE VLSI Test Symposium, 1999. Pages: 489-490.
5. Kapur. R, Lousberg. M, Taylor. T, Keller. B, Reuter. P and Kay. D,
   "CTL the language for describing core-based test", Proceedings of
   International Test Conf, 2001. Pages: 131-139.
6. Scan Synthesis Reference manual, release 2001.08, Synopsys Inc.,
   Mountain View, CA, 2001.
7. Varma. P, "TDRC - A Symbolic Simulation Based Design for Testability
   Rules Checker", Proceedings of International Test Conference, 1990.
   Pages: 1055-1063.
8. Pitty. E.B, Martin. D and Ma. H.-K.T, "A simulation-based
   protocol-driven scan test design rule checker", Proceedings of
   International Test Conference, 1994. Pages: 999-1006.
9. Beausang. J, Ellingham. C and Robinson. M, "Integrating scan into
   hierarchical synthesis methodologies", Proceedings of International
   Test Conference, 1996. Pages: 751-756.
10. Daga. A, Ananthanarayanan. S and Neuveux. F, "Interface Logic
    Models in a Hierarchical SoC Design Flow", Submitted to CICC 2002.
11. Hirech. M, Beausang. J and Gu. X, "A new approach to scan chain
    reordering using physical design information", Proceedings of
    International Test Conference, 1998. Pages: 348-355.
Table 1. Design statistics

                          Area (transistors)
Sub-module #1                          1,123
Sub-module #2                         54,709
Sub-module #3                        455,604
Top level glue logic                   1,375
Total                                512,811

Table 2. Experiment 1

Table 3. Experiment 2