Introduction To Fault Tolerance

Fault tolerance is a non-functional (QoS) requirement that requires a system to continue to operate, even in the presence of faults. Fault tolerance should be achieved with minimal involvement of users or system administrators.

Uploaded by

ankitbhattt

BY:

ANKIT BHATT
ME-VLSI & EMBEDDED

What Is Failure?
A system is said to fail when it cannot meet its promises.
A failure is brought about by the existence of errors in the system.
The cause of an error is called a fault.

Concept of Fault Tolerance

Hardware, software and networks cannot be totally free from failures.
Fault tolerance is a non-functional (QoS) requirement that requires a system to continue to operate, even in the presence of faults.
Fault tolerance should be achieved with minimal involvement of users or system administrators.
Distributed systems can be more fault tolerant than centralized systems, but with more processor hosts, individual faults are generally likely to occur more frequently.
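The last point can be made concrete with a small calculation. This is only a sketch under an assumption the slides do not state: each host is faulty independently with the same probability p. The function name prob_any_fault is hypothetical.

```python
def prob_any_fault(p: float, n: int) -> float:
    """Probability that at least one of n hosts is faulty.

    Assumes (for illustration only) that hosts fail independently,
    each with probability p.
    """
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # The chance of seeing some individual fault grows with the host count.
    for n in (1, 10, 100):
        print(n, round(prob_any_fault(0.01, n), 3))
```

Under this assumption the chance of at least one fault rises quickly with the number of hosts, which is why a distributed system needs to tolerate individual faults rather than hope to avoid them.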

Attributes, Consequences and Strategies

What is a dependable system? Attributes:
Availability
Reliability
Safety
Confidentiality
Integrity
Maintainability

How to distinguish faults? Consequences:
Fault
Error
Failure

How to handle faults? Strategies:
Fault prevention
Fault tolerance
Fault recovery
Fault forecasting

Distributed Systems

Terminology of Fault Tolerance

Fault causes Error, which results in Failure

A fault is a defect within the system.
An error is observed as a deviation from the expected behaviour of the system.
A failure occurs when the system can no longer perform as required (does not meet spec).
Fault tolerance is the ability of a system to provide a service, even in the presence of errors.

Strategies to Handle Faults

Fault avoidance
Techniques that aim to prevent faults from entering the system during the design stage
Fault removal
Methods that attempt to find faults within a system before it enters service
Fault detection
Techniques used during service to detect faults within the operational system
Fault tolerance
Techniques designed to tolerate faults, i.e. to allow the system to operate correctly in the presence of faults

Actions to identify and remove errors:
Design reviews
Testing
Use certified tools
Analysis:
Hazard analysis
Formal methods: proof & refinement

No non-trivial system can be guaranteed free from error
Must have an expectation of failure and make appropriate provision

Fault Models
A fault model identifies targets for testing
A fault model makes analysis possible
Effectiveness measurable by experiments
Different types

Stuck-at faults
Multiple stuck-at faults
Bridging faults

Single Stuck At Fault

Single (line) stuck-at fault
The given line has a constant value (0/1) independent of other signal values in the circuit
Properties
o Only one line is faulty
o The faulty line is permanently set to 0 or 1
o The fault can be at an input or output of a gate
o Simple logical model is independent of technology
o It reduces the complexity of fault-detection

Example:
XOR circuit has 12 fault sites and 24 single stuck-at faults
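As a rough illustration of the single stuck-at model, the sketch below injects each fault into a tiny circuit and lists the input vectors that expose it. The circuit (d = a AND b, y = d OR c) is a hypothetical stand-in chosen for brevity, not the XOR circuit from the slide, and the names simulate and detecting_vectors are made up for this example.

```python
from itertools import product

# Hypothetical circuit: d = a AND b, y = d OR c.
# Every line is a fault site, and each site has two faults (s-a-0, s-a-1).
LINES = ("a", "b", "c", "d", "y")


def simulate(a, b, c, fault=None):
    """Evaluate the circuit; fault is (line, stuck_value) or None."""

    def force(line, value):
        # A stuck-at line ignores the value driven onto it.
        return fault[1] if fault and fault[0] == line else value

    a, b, c = force("a", a), force("b", b), force("c", c)
    d = force("d", a & b)
    return force("y", d | c)


faults = [(line, value) for line in LINES for value in (0, 1)]


def detecting_vectors(fault):
    """Input vectors whose output differs between good and faulty circuit."""
    return [vec for vec in product((0, 1), repeat=3)
            if simulate(*vec) != simulate(*vec, fault=fault)]


if __name__ == "__main__":
    print(len(LINES), "fault sites,", len(faults), "single stuck-at faults")
    for fault in faults:
        print(fault, detecting_vectors(fault))
```

The count follows the same rule as the slide's example: two single stuck-at faults per fault site.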

Multiple Stuck-At Faults

Multiple stuck-at fault
Several single stuck-at faults occur at the same time

Multiple stuck-at faults are usually not considered in practice for two reasons:
o The number of multiple stuck-at faults in a circuit with k lines is 3^k - 1, which is too large a number even for circuits of moderate size
o Tests for single stuck-at faults are known to cover a very high percentage (greater than 99.6%) of multiple stuck-at faults when the circuit is large and has several outputs
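The count in the first reason comes from letting each of the k lines be fault-free, stuck-at-0 or stuck-at-1, and excluding the single all-fault-free combination. A short sketch of the arithmetic follows; the function names are hypothetical, and k = 12 is borrowed from the fault-site count quoted in the XOR example.

```python
def single_stuck_at_count(k: int) -> int:
    """Two faults (stuck-at-0, stuck-at-1) per line."""
    return 2 * k


def multiple_stuck_at_count(k: int) -> int:
    """Each line is fault-free, s-a-0 or s-a-1; drop the all-good case."""
    return 3 ** k - 1


if __name__ == "__main__":
    for k in (4, 12, 20):
        print(k, single_stuck_at_count(k), multiple_stuck_at_count(k))
```

For k = 12 this gives 24 single faults against 531440 fault combinations, which shows how quickly the multiple-fault count outgrows anything that can be enumerated.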

Bridging Fault
Two or more normally distinct points (lines) are
shorted together
Two types of bridging faults:
Input bridging
Can form wired logic or voting model.
Feedback (input-to-output) bridging
Can introduce feedback.
Can cause oscillation or latching.
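A minimal sketch of input bridging under the wired-logic model: two shorted lines are assumed to both take the AND (or the OR) of their driven values. The gate under test (a two-input OR) and the function names are hypothetical choices for illustration.

```python
from itertools import product


def bridged(x, y, model):
    """Values seen on two shorted lines under a wired-logic model."""
    if model == "wired-and":
        v = x & y
    elif model == "wired-or":
        v = x | y
    else:
        raise ValueError(f"unknown bridging model: {model}")
    return v, v


def good(a, b):
    # Hypothetical gate under test: a two-input OR.
    return a | b


def faulty(a, b, model):
    # The inputs are shorted together before they reach the gate.
    a, b = bridged(a, b, model)
    return a | b


def detecting_vectors(model):
    return [vec for vec in product((0, 1), repeat=2)
            if good(*vec) != faulty(*vec, model)]


if __name__ == "__main__":
    for model in ("wired-and", "wired-or"):
        print(model, detecting_vectors(model))
```

In this toy case the wired-AND bridge is exposed by any vector that drives the two inputs to opposite values, while the wired-OR bridge cannot be seen at the output of an OR gate at all, so whether a bridge is detectable depends on the assumed model.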

Transistor Fault
o A MOS transistor is considered an ideal switch.
o Two types of faults are modeled:
Stuck-open - A single transistor is permanently stuck in the open state. This turns the circuit into a sequential one, so detecting a single stuck-open fault requires a sequence of at least two test vectors.
Stuck-on - A single transistor is permanently shorted irrespective of its gate voltage.

Example of Transistor Stuck-Open Fault
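The two-vector requirement can be sketched with a switch-level model of a two-input CMOS NAND gate. This is an illustrative assumption, not the circuit from the original slide figure: the pull-up transistor driven by input a is taken to be stuck open, and a floating output is modeled as keeping its previous value. The class name Nand2 is hypothetical.

```python
class Nand2:
    """Switch-level model of a 2-input CMOS NAND gate (illustrative only)."""

    def __init__(self, pa_stuck_open=False):
        self.pa_stuck_open = pa_stuck_open  # pMOS driven by input a is open
        self.out = None                     # output charge is unknown at first

    def apply(self, a, b):
        # Series nMOS pair conducts only when both inputs are 1.
        pull_down = a == 1 and b == 1
        # Parallel pMOS pair: either transistor can pull the output high.
        pull_up = (a == 0 and not self.pa_stuck_open) or b == 0
        if pull_down:
            self.out = 0
        elif pull_up:
            self.out = 1
        # Otherwise the output floats and retains its previous value.
        return self.out


if __name__ == "__main__":
    good, bad = Nand2(), Nand2(pa_stuck_open=True)
    for vector in [(1, 1), (0, 1)]:  # initialise to 0, then try to pull up
        print(vector, good.apply(*vector), bad.apply(*vector))
```

The first vector (1, 1) sets the output to 0; the second vector (0, 1) should raise it to 1, but in the faulty gate nothing drives the output, so it stays at 0 and the fault is revealed.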

Hardware Faults Classification

Three types of faults:
Transient Faults - disappear after a relatively short time
Example - a memory cell whose contents are changed spuriously due to some electromagnetic interference. Overwriting the memory cell with the right content will make the fault go away.
Permanent Faults - never go away; the component has to be repaired or replaced.
Intermittent Faults - cycle between active and benign states
Example - a loose connection

Fault Tolerance Techniques

Hardware Redundancy
Software Redundancy
Information Redundancy
Time Redundancy

Hardware Redundancy
Extra hardware is added to override the effects of a failed component
Static Hardware Redundancy - for immediate masking of a failure
Example: Use three processors and vote on the result. The wrong output of a single faulty processor is masked
Dynamic Hardware Redundancy - Spare components are activated upon the failure of a currently active component
Hybrid Hardware Redundancy - A combination of static and dynamic redundancy techniques
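The three-processor example of static redundancy can be sketched as a majority voter. This is a software illustration of the idea only; the function name majority_vote and the sample values are hypothetical.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of the replicas.

    Raises RuntimeError when no value has a strict majority, since the
    fault can then no longer be masked.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority among replica outputs")
    return value


if __name__ == "__main__":
    # Two healthy processors agree; the third returns a wrong result.
    print(majority_vote([42, 42, 7]))
```

With three replicas, one wrong output is outvoted by the two that agree; if all three disagree the voter reports the failure instead of guessing.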

Software Redundancy
Multiple teams of programmers write different versions of software for the same function
The hope is that such diversity will ensure that not all the copies will fail on the same set of input data
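A small sketch of this idea, assuming three independently written versions of an integer square root whose results are compared by a voter. All three functions and the voter are hypothetical; the third version contains a deliberate bug (it rounds instead of truncating) so that it fails on some inputs but not others.

```python
import math
from collections import Counter


def isqrt_a(n):
    # Version A: standard library routine.
    return math.isqrt(n)


def isqrt_b(n):
    # Version B: binary search for the largest r with r * r <= n.
    lo, hi = 0, n + 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo


def isqrt_c(n):
    # Version C: deliberately buggy, rounds to nearest instead of down.
    return round(n ** 0.5)


def n_version_isqrt(n):
    """Run all versions and return the strict-majority answer."""
    results = [f(n) for f in (isqrt_a, isqrt_b, isqrt_c)]
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("versions disagree with no majority")
    return value


if __name__ == "__main__":
    print(isqrt_c(8), n_version_isqrt(8))
```

For the input 8 the buggy version answers 3, but the other two agree on 2 and the vote masks the error. The scheme only helps as long as the versions do not share the same mistake on the same input.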

Information Redundancy
Add check bits to original data bits so that an error in the data bits can be detected and even corrected
Error detecting and correcting codes have been developed and are being used
Information redundancy often requires hardware redundancy to process the additional check bits
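One well-known code of this kind is the Hamming(7,4) code, which adds three check bits to four data bits and can correct any single-bit error. The slides do not name a specific code, so this choice is an assumption made for illustration; encode and decode are hypothetical helper names.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword positions 1..7 hold: p1 p2 d1 p4 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position).

    The error position is 0 when no error was detected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # syndrome names the faulty position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    word[4] ^= 1  # flip the bit at position 5 to simulate an error
    print(decode(word))
```

The three extra bits are the information redundancy; the XOR logic that computes and checks them is the additional hardware the last bullet refers to.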

Time Redundancy
Provide additional time during which a failed execution can be repeated
Most failures are transient - they go away after some time
If enough slack time is available, failed unit can recover and redo affected computation
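Time redundancy can be sketched as a retry loop: the computation is repeated a bounded number of times, with the bound standing in for the available slack time. TransientError, FlakyTask and run_with_retries are hypothetical names introduced for this example.

```python
class TransientError(Exception):
    """Stands in for a fault that is expected to go away on its own."""


class FlakyTask:
    """Hypothetical task that fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TransientError("transient fault during execution")
        return "done"


def run_with_retries(task, max_attempts=3):
    """Repeat a failed execution while slack (attempts) remains."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task()
        except TransientError as exc:
            last_error = exc  # assume the fault is transient and try again
    raise RuntimeError("no slack left; fault may be permanent") from last_error


if __name__ == "__main__":
    task = FlakyTask(failures_before_success=2)
    print(run_with_retries(task), "after", task.calls, "attempts")
```

Repetition only masks transient faults. If the attempts run out, the fault is treated as possibly permanent and has to be handled by another technique.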

THANK YOU
