Fast Convolution Algorithms
Overlap-add, Overlap-save
Introduction
y(n) = Σ_{k=0}^{L−1} h(k) x(n − k)                                (1)
Zeros are appended to the shorter of the signal or filter sequence until they are both the same length. If the FFT of the
signal x(n) is term-by-term multiplied by the FFT of the filter h(n), the
result is the FFT of the output y(n). However, the length of y(n) obtained
by an inverse FFT is the same as the length of the input. Because the DFT
or FFT is a periodic transform, the convolution implemented using this FFT
approach is cyclic convolution which means the output of (1) is wrapped or
aliased. The tail of y(n) is added to its head, but that is usually not what
is wanted for filtering or normal convolution and correlation. This aliasing,
the effects of cyclic convolution, can be overcome by appending zeros to
both x(n) and h(n) until their lengths are N + L − 1, and by then using
the FFT. The part of the output that is aliased is zero and the result of the
cyclic convolution is exactly the same as non-cyclic convolution. The cost is
taking the FFT of lengthened sequences, sequences for which about half
the numbers are zero. Now that we can do non-cyclic convolution with the
FFT, how do we account for the effects of sectioning the input and output
into blocks?
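As a concrete sketch of this zero-padding idea, assuming numpy (the random test signal is an illustrative choice; the lengths N = 10 and L = 5 and the constant filter h(n) = 0.2 match the examples in the figures below):

```python
import numpy as np

rng = np.random.default_rng(0)

N, L = 10, 5                      # signal and filter lengths
x = rng.standard_normal(N)
h = np.full(L, 0.2)               # constant FIR filter, as in the figures

# Zero-pad both sequences to length N + L - 1 so the cyclic convolution
# computed via the FFT has no aliased (wrapped) samples.
M = N + L - 1
X = np.fft.fft(x, M)              # the second argument zero-pads to length M
H = np.fft.fft(h, M)
y_fft = np.fft.ifft(X * H).real   # length N + L - 1

y_direct = np.convolve(x, h)      # direct non-cyclic convolution, same length
assert np.allclose(y_fft, y_direct)
```

Without the padding (FFT length N instead of N + L − 1), the last L − 1 output samples would wrap around and add to the head, which is the aliasing described above.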
2.1 Overlap-Add
Because convolution is linear, the output of a long sequence can be calculated by simply summing the outputs of each block of the input. What is
complicated is that the output blocks are longer than the input blocks. This is
dealt with by overlapping the tail of the output from the previous block
with the beginning of the output from the present block. In other words, if
the block length is N and it is greater than the filter length L, the output
from the second block will overlap the tail of the output from the first block
and they will simply be added. Hence the name: overlap-add. Figure 1
illustrates why the overlap-add method works, for N = 10, L = 5.
Combining the overlap-add organization with use of the FFT yields a
very efficient algorithm for calculating convolution that is faster than direct
calculation for lengths above 20 to 50. This cross-over point depends on the
computer being used and the overhead incurred by the FFTs.
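A minimal sketch of FFT-based overlap-add under these assumptions (the function name and test signal are illustrative; the filter FFT is computed once, as discussed in Section 2.3):

```python
import numpy as np

def overlap_add(x, h, block_len):
    """Convolve x with h by summing the convolutions of the input blocks.

    Each block's output is block_len + len(h) - 1 samples long, so its
    tail overlaps the head of the next block's output and is simply added.
    """
    L = len(h)
    M = block_len + L - 1                  # FFT length: no aliasing
    H = np.fft.fft(h, M)                   # filter FFT done once and stored
    y = np.zeros(len(x) + L - 1)
    for start in range(0, len(x), block_len):
        xb = x[start:start + block_len]    # current input block
        yb = np.fft.ifft(np.fft.fft(xb, M) * H).real
        n = min(len(yb), len(y) - start)
        y[start:start + n] += yb[:n]       # overlap and add the tail
    return y

# Check against direct convolution with the text's numbers: block 10, L = 5
rng = np.random.default_rng(1)
x = rng.standard_normal(40)
h = np.full(5, 0.2)
assert np.allclose(overlap_add(x, h, 10), np.convolve(x, h))
```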
2.2 Overlap-Save
Figure 1: Overlap-Add Algorithm. The sequence y(n) is the result of convolving x(n) with an FIR filter h(n) of length 5. In this example, h(n) = 0.2 for n = 0, . . . , 4. The block length is 10, the overlap is 4. As illustrated in the figure, x(n) = x1(n) + x2(n) + · · · and y(n) = y1(n) + y2(n) + · · ·, where yi(n) is the result of convolving xi(n) with the filter h(n).
convolution), the head and the tail overlap, so the FFT length is 14. (In
practice, block lengths are generally chosen so that the FFT length N + L − 1
is a power of 2).
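A corresponding sketch of overlap-save, again assuming numpy: each FFT segment overlaps the previous one by L − 1 input samples, and the first L − 1 outputs of each cyclic convolution are discarded as aliased (the helper name and test signal are illustrative):

```python
import numpy as np

def overlap_save(x, h, block_len):
    """Convolve x with h, keeping only the valid part of each cyclic block.

    Each FFT segment holds block_len + len(h) - 1 input samples; the first
    len(h) - 1 outputs of each cyclic convolution are aliased and discarded,
    leaving block_len valid output samples per segment.
    """
    L = len(h)
    M = block_len + L - 1                  # FFT length (14 in the text's example)
    H = np.fft.fft(h, M)                   # filter FFT done once and stored
    # Prepend L - 1 zeros so the first segment's discarded samples are harmless,
    # and append zeros so the final segment is full length.
    xp = np.concatenate([np.zeros(L - 1), x, np.zeros(block_len)])
    y = np.zeros(len(x) + L - 1)
    for start in range(0, len(y), block_len):
        seg = xp[start:start + M]          # segments overlap by L - 1 samples
        yb = np.fft.ifft(np.fft.fft(seg, M) * H).real
        yk = yb[L - 1:]                    # drop the aliased head, save the rest
        n = min(block_len, len(y) - start)
        y[start:start + n] = yk[:n]
    return y

# Check against direct convolution with the text's numbers: block 10, L = 5
rng = np.random.default_rng(2)
x = rng.standard_normal(40)
h = np.full(5, 0.2)
assert np.allclose(overlap_save(x, h, 10), np.convolve(x, h))
```

The contrast with overlap-add: here the *input* segments overlap and invalid output samples are discarded, whereas in overlap-add the input blocks are disjoint and the overlapping *output* tails are summed.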
Figure 2: Overlap-Save Algorithm. The sequence y(n) is the result of convolving x(n) with an FIR filter h(n) of length 5. In this example, h(n) = 0.2 for n = 0, . . . , 4. The block length is 10, the overlap is 4. As illustrated in the figure, the sequence y(n) is obtained, block by block, from the appropriate block of yi(n), where yi(n) is the result of convolving xi(n) with the filter h(n).
2.3
Because the efficiency of the FFT is O(N log N), the efficiency of the overlap methods for convolution increases with length. Using the FFT for convolution requires one length-N forward FFT, N complex multiplications, and one length-N inverse FFT. The FFT of the filter is done once and
stored rather than done repeatedly for each block. For short lengths, direct convolution will be more efficient. The exact filter length at which the efficiency cross-over occurs depends on the computer and software being used.
If it is determined that the FFT is potentially faster than direct convolution, the next question is what block length to use. Here, there is
a compromise between the improved efficiency of long FFTs and the fact that you are processing many appended zeros that contribute nothing to the output. An empirical plot of multiplications (and, perhaps, additions) per
output point vs. block length will have a minimum that may be several
times the filter length. This is an important parameter that should be optimized for each implementation. Remember that this increased block length
may improve efficiency but it adds a delay and requires memory for storage.
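Such a plot can be sketched with a simple operation-count model. The radix-2 FFT cost estimate (M/2) log2 M complex multiplications, the filter length, and the candidate block lengths below are assumptions for illustration, not measurements:

```python
import math

def mults_per_output(block_len, L):
    """Rough complex-multiplication count per output sample for FFT
    overlap-add: one forward FFT, M pointwise products, and one inverse
    FFT per block of block_len output samples (filter FFT precomputed).
    Uses the common radix-2 cost model of (M/2) log2(M) per FFT."""
    M = block_len + L - 1                  # FFT length for one block
    fft_cost = (M / 2) * math.log2(M)
    return (2 * fft_cost + M) / block_len

L = 64                                     # assumed filter length
counts = {B: mults_per_output(B, L)
          for B in (64, 128, 256, 512, 1024, 2048, 4096)}
best = min(counts, key=counts.get)
print(best, round(counts[best], 1))
```

On this model the minimum falls at a block length several times the filter length, consistent with the remark above; for a real implementation the curve should be measured, since memory behavior and FFT library overhead shift the minimum.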