MMSE Equalizer Design
Phil Schniter
March 6, 2008
[Block diagram: the symbols a[m] are upsampled by P to give a↑[k], filtered by the transmit pulse g[k] and the channel h̃[k] to give m̃[k], and corrupted by additive noise w̃[k] to give ṽ[k]; the receiver filters ṽ[k] with q[k] and downsamples by P to give y↑[k] and then y[m].]
For a trivial channel (i.e., h̃[k] = δ[k]), we know that the use of square-root raised-
cosine (SRRC) pulses at transmitter and receiver suppresses inter-symbol interference
(ISI) and maximizes the received signal-to-noise ratio (SNR) in the presence of white
noise {w̃[k]}. With a non-trivial channel, however, we need to re-visit the design of the
receiver pulse {q[k]}, which is called an “equalizer” when it tries to compensate for the
channel.
Here we design the minimum mean-squared error (MMSE) equalizer coefficients {q[k]}
assuming that the input symbols {a[n]} and the noise {w̃[k]} are white random sequences
that are uncorrelated with each other. This means that

E{a[n] a∗[m]} = σa² δ[n − m],   E{w̃[k] w̃∗[l]} = σw² δ[k − l],   E{a[n] w̃∗[k]} = 0,

for some positive variances σa² and σw². For practical implementation, we will consider a
causal equalizer with length Nq , so that q[k] = 0 for k < 0 and k ≥ Nq . To simplify the
derivation, we combine the transmitted pulse g[k] and the complex-baseband channel h̃[k]
into the “effective channel”
h[k] := g[k] ∗ h̃[k]
and assume that this effective channel is causal with finite length Nh . Throughout, we
assume that the effective channel coefficients {h[k]}, as well as the variances σa² and σw²,
are known. Learning these quantities is a separate (and often challenging) problem.
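To make the effective-channel definition concrete, here is a minimal numerical sketch; the pulse g[k] and channel taps h̃[k] below are made-up placeholders, not values from these notes:

```python
import numpy as np

# Made-up transmit pulse g[k] and complex-baseband channel h~[k]
g = np.array([0.2, 0.7, 1.0, 0.7, 0.2])
h_tilde = np.array([1.0, 0.0, 0.4 - 0.3j])

# Effective channel h[k] := g[k] * h~[k] (discrete convolution);
# it is causal with finite length Nh = len(g) + len(h_tilde) - 1
h = np.convolve(g, h_tilde)
Nh = len(h)
```

Since both factors are causal FIR responses, their convolution is automatically causal with finite length, matching the assumption above.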
Notice that, because the effective channel is causal and length Nh , it can delay the
upsampled input signal a↑ [k] by between 0 and Nh − 1 samples. Since it is difficult to
compensate for this delay with a causal equalizer, we will allow for the presence of end-
to-end system delay. Thus, our goal is to make y[m] ≈ a[m − ∆] for some integer ∆ ≥ 0.
Throughout the design, we assume that ∆ has been chosen for us, although eventually
we shall see how to optimize ∆.
Recall that if y[m] = a[m − ∆], then we will be able to make perfect decisions on the
symbols a[m] from the output sequence y[m]. However, we would never expect a perfect
output in the presence of noise. Thus, we take as our objective the minimization of the
error signal
e[m] := y[m] − a[m − ∆].
In particular, we minimize the mean squared error (MSE)
E := E{|e[m]|2 }.
We saw earlier that, if e[m] can be modelled as a zero-mean Gaussian random variable
(with variance σe² = E), then the symbol error rate (SER) decreases as E/σa² decreases.
Thus, there is good reason to minimize E.
Our eventual goal is to derive an expression for the MSE E from which the equalizer
coefficients can be optimized. But first we notice that, due to the stationarity of {a[m]}
and {w̃[k]} (i.e., the time-invariance of their statistics) and the LTI nature of our filters,
the statistics of {e[m]} will also be time invariant, allowing us to write E = E{|e[0]|2 }.
This allows us to focus on e[0] instead of e[m], which simplifies the development.
The next step is then to find an expression for e[0]. From the block diagram, ṽ[k] = m̃[k] + w̃[k] and the downsampled output obeys y[m] = Σ_k q[k] ṽ[mP − k], so the time-0 error is

e[0] = y[0] − a[−∆] = Σ_{k=0}^{Nq−1} q[k] ṽ[−k] − a[−∆], (5)

while the noisy channel output obeys

ṽ[k] = Σ_l h[k − l] a↑[l] + w̃[k] = Σ_n h[k − nP] a[n] + w̃[k], (7)

where in (7) we used the fact that a↑[l] = 0 when l is not a multiple of P and that a↑[nP] = a[n]. Though (7) is written with an infinite summation, it turns out that most values of n will not contribute. Due to the causality and length-Nh of h[n], the values of n which contribute to ṽ[k] are only those which ensure that 0 ≤ k − nP ≤ Nh − 1, i.e., ⌈(k − Nh + 1)/P⌉ ≤ n ≤ ⌊k/P⌋.
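The upsample-then-filter structure can be verified numerically by comparing a convolution implementation against the direct double sum in (7). All parameter values below are arbitrary test choices:

```python
import numpy as np

rng = np.random.default_rng(0)
P, Nh, N = 2, 5, 50
h = rng.standard_normal(Nh) + 1j * rng.standard_normal(Nh)  # arbitrary effective channel
a = 2.0 * rng.integers(0, 2, N) - 1.0                       # white +/-1 symbols a[0..N-1]

# Upsample by P: a_up[l] = a[n] at l = nP, zero otherwise
a_up = np.zeros(N * P, dtype=complex)
a_up[::P] = a

# Noiseless part of (7): v[k] = sum_n h[k - nP] a[n]
v = np.convolve(h, a_up)[: N * P]

# Only n with 0 <= k - nP <= Nh - 1 contribute, i.e.
# ceil((k - Nh + 1)/P) <= n <= floor(k/P)
for k in range(N * P):
    direct = sum(h[k - n * P] * a[n]
                 for n in range(N) if 0 <= k - n * P < Nh)
    assert np.isclose(v[k], direct)
```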
Next we use a vector formulation to simplify the development. We start by rewriting (5) as

e[0] = [q[0] q[1] · · · q[Nq−1]] [ṽ[0] ṽ[−1] · · · ṽ[1−Nq]]ᵀ − a[−∆] =: qᵀṽ − a[−∆]. (12)

Collecting the contributing symbols into a := [a[0], a[−1], a[−2], . . .]ᵀ and the noise samples into w := [w̃[0], w̃[−1], . . . , w̃[1−Nq]]ᵀ, equation (7) implies that ṽ = Ha + w, where H is the matrix of effective channel coefficients whose (k, n)th entry is h[nP − k]. Defining δ∆ as the column vector with a 1 in the ∆th place and 0's elsewhere, we can write δ∆ᵀa = a[−∆], which yields the final expression for the time-0 error:

e[0] = qᵀ(Ha + w) − δ∆ᵀa = (qᵀH − δ∆ᵀ)a + qᵀw.
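The matrix H is only implicit above, so the indexing convention below ([H]_{k,n} = h[nP − k], with column n multiplying a[−n]) is my reconstruction; the cross-check against a direct convolution confirms it is consistent with the vector form of (5). All numbers are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
P, Nh, Nq = 2, 4, 6
h = rng.standard_normal(Nh)                  # arbitrary real effective channel
Na = (Nq + Nh - 2) // P + 1                  # symbols a[0], ..., a[-(Na-1)] reaching v
a_vec = rng.standard_normal(Na)              # a_vec[n] plays the role of a[-n]

# [H]_{k,n} = h[nP - k]: row k generates v[-k], column n multiplies a[-n]
H = np.zeros((Nq, Na))
for k in range(Nq):
    for n in range(Na):
        if 0 <= n * P - k < Nh:
            H[k, n] = h[n * P - k]

# Cross-check H @ a_vec against convolving the upsampled symbol stream
a_up = np.zeros((Na - 1) * P + 1)
a_up[::P] = a_vec[::-1]                      # time order: a[-(Na-1)], ..., a[0]
conv = np.convolve(h, a_up)                  # noiseless v over times -(Na-1)P .. Nh-1
v_vec = conv[(Na - 1) * P - np.arange(Nq)]   # [v[0], v[-1], ..., v[1-Nq]]
assert np.allclose(H @ a_vec, v_vec)
```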
Squaring this error and taking expectations, the MSE becomes

E = E{(qᵀw + (qᵀH − δ∆ᵀ)a)(wᵀq + aᵀ(Hᵀq − δ∆))∗} (20)
 = E{(qᵀw + (qᵀH − δ∆ᵀ)a)(wᴴq∗ + aᴴ(Hᴴq∗ − δ∆))}, (21)

where in (20) we transposed the scalar quantities on the right (e.g., qᵀw = (qᵀw)ᵀ = wᵀq) and in (21) we distributed the complex conjugate, using the "Hermitian transpose" notation (·)ᴴ := ((·)ᵀ)∗. Expanding (21) gives
E = E{qᵀwwᴴq∗} + E{qᵀwaᴴ(Hᴴq∗ − δ∆)}
  + E{(qᵀH − δ∆ᵀ)awᴴq∗} + E{(qᵀH − δ∆ᵀ)aaᴴ(Hᴴq∗ − δ∆)} (22)
 = qᵀ E{wwᴴ} q∗ + qᵀ E{waᴴ} (Hᴴq∗ − δ∆)
  + (qᵀH − δ∆ᵀ) E{awᴴ} q∗ + (qᵀH − δ∆ᵀ) E{aaᴴ} (Hᴴq∗ − δ∆), (23)
Since the noise is white, the (m, n)th entry of E{wwᴴ} equals E{w̃[−m] w̃∗[−n]} = σw² δ[m − n], so that

E{wwᴴ} = σw² I,

where I denotes the identity matrix. The same reasoning can be used to show

E{aaᴴ} = σa² I.

Similarly,

E{awᴴ} = a matrix whose (m, n)th entry equals E{a[−m] w̃∗[−n]} = 0 (29)
 = 0, (30)

and, by the same token, E{waᴴ} = 0,
Plugging these expectations into (23) gives

E = σw² qᵀq∗ + σa² (qᵀH − δ∆ᵀ)(Hᴴq∗ − δ∆) = σw² ‖q‖² + σa² ‖Hᵀq − δ∆‖²,

which shows that the MSE consists of σw² ‖q‖² (due to noise) plus σa² ‖Hᵀq − δ∆‖² (due to ISI and error in the end-to-end gain). In general, minimizing the sum of these two components requires making a tradeoff between them. For example, setting q = 0 would cancel the noise component but result in an end-to-end gain of zero. Similarly, choosing q such that Hᵀq = δ∆ (if this is even possible) can amplify the noise.
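The tradeoff between the two MSE components can be seen numerically. The H construction and all parameter values below are an arbitrary sketch (the [H]_{k,n} = h[nP − k] indexing is my reconstruction):

```python
import numpy as np

rng = np.random.default_rng(2)
P, Nh, Nq, delta = 2, 4, 6, 1
sigma_a2, sigma_w2 = 1.0, 0.1
h = rng.standard_normal(Nh)

Na = (Nq + Nh - 2) // P + 1
H = np.zeros((Nq, Na))
for k in range(Nq):
    for n in range(Na):
        if 0 <= n * P - k < Nh:
            H[k, n] = h[n * P - k]
e_delta = np.zeros(Na)
e_delta[delta] = 1.0                          # the vector delta_Delta

def mse_terms(q):
    """Noise term sigma_w^2 ||q||^2 and signal term sigma_a^2 ||H^T q - delta||^2."""
    noise = sigma_w2 * np.linalg.norm(q) ** 2
    signal = sigma_a2 * np.linalg.norm(H.T @ q - e_delta) ** 2
    return noise, signal

# q = 0 kills the noise term but leaves the full signal error sigma_a^2 ...
noise0, signal0 = mse_terms(np.zeros(Nq))

# ... while a least-squares solution of H^T q = delta shrinks the signal
# term at the cost of a nonzero noise term
q_zf = np.linalg.lstsq(H.T, e_delta, rcond=None)[0]
noise_zf, signal_zf = mse_terms(q_zf)
```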
To proceed further, we rewrite the MSE as

E = qᵀAq∗ − qᵀb − bᴴq∗ + σa², (34)

where A := σw² I + σa² HHᴴ and b := σa² Hδ∆. By completing the square, the MSE expression (34) can be put into a convenient form, from which the MSE-minimizing q will become readily apparent. To do this, it is essential that the Hermitian positive definite matrix A can be decomposed as A = BBᴴ (e.g., via a Cholesky factorization), so that
E = qᵀBBᴴq∗ − qᵀb − bᴴq∗ + σa² (35)
 = qᵀBBᴴq∗ − qᵀBB⁻¹b − bᴴB⁻ᴴBᴴq∗ + σa² (36)
 = (qᵀB − bᴴB⁻ᴴ)(Bᴴq∗ − B⁻¹b) − bᴴB⁻ᴴB⁻¹b + σa² (37)
 = (Bᴴq∗ − B⁻¹b)ᴴ(Bᴴq∗ − B⁻¹b) + σa² − bᴴA⁻¹b, (38)

where in (38) we used B⁻ᴴB⁻¹ = (BBᴴ)⁻¹ = A⁻¹. The first term in (38) is non-negative, and the remaining terms define Emin := σa² − bᴴA⁻¹b.
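The completing-the-square identity is easy to sanity-check numerically by taking B as a Cholesky factor of A (one valid choice, assumed here) and confirming that (38) reproduces (34). The matrix sizes and values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
Nq, Na, delta = 6, 5, 1
sigma_a2, sigma_w2 = 1.0, 0.1
H = rng.standard_normal((Nq, Na)) + 1j * rng.standard_normal((Nq, Na))
e_delta = np.zeros(Na)
e_delta[delta] = 1.0

A = sigma_w2 * np.eye(Nq) + sigma_a2 * (H @ H.conj().T)   # Hermitian, positive definite
b = sigma_a2 * (H @ e_delta)
B = np.linalg.cholesky(A)                                  # lower triangular, A = B B^H

q = rng.standard_normal(Nq) + 1j * rng.standard_normal(Nq)  # any equalizer

# (34): E = q^T A q* - q^T b - b^H q* + sigma_a^2
E_34 = (q @ A @ q.conj() - q @ b - b.conj() @ q.conj() + sigma_a2).real

# (38): E = ||B^H q* - B^{-1} b||^2 + sigma_a^2 - b^H A^{-1} b
r = B.conj().T @ q.conj() - np.linalg.solve(B, b)
E_38 = np.linalg.norm(r) ** 2 + sigma_a2 - (b.conj() @ np.linalg.solve(A, b)).real

assert np.isclose(E_34, E_38)
```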
Note that the equalizer parameters only affect the first term in (38), which is non-negative. So, to minimize E via choice of q, the best we can do is to set the first term in (38) to zero, at which point the second term specifies the minimum possible E. Thus, the MSE-minimizing equalizer parameters are those which give Bᴴq∗min = B⁻¹b, i.e.,

q∗min = B⁻ᴴB⁻¹b = A⁻¹b, or equivalently qmin = (A⁻¹b)∗,

and the corresponding minimum MSE is

Emin = σa² − bᴴA⁻¹b = σa² (1 − δ∆ᵀ Hᴴ ((σw²/σa²) I + HHᴴ)⁻¹ H δ∆). (43)
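In practice one can skip the Cholesky machinery and obtain the minimizer directly by solving the linear system A q∗ = b. Below, the H construction ([H]_{k,n} = h[nP − k]) and all parameter values are the same arbitrary sketch used earlier:

```python
import numpy as np

rng = np.random.default_rng(4)
P, Nh, Nq, delta = 2, 4, 6, 1
sigma_a2, sigma_w2 = 1.0, 0.1
h = rng.standard_normal(Nh) + 1j * rng.standard_normal(Nh)

Na = (Nq + Nh - 2) // P + 1
H = np.zeros((Nq, Na), dtype=complex)
for k in range(Nq):
    for n in range(Na):
        if 0 <= n * P - k < Nh:
            H[k, n] = h[n * P - k]
e_delta = np.zeros(Na)
e_delta[delta] = 1.0

A = sigma_w2 * np.eye(Nq) + sigma_a2 * (H @ H.conj().T)
b = sigma_a2 * (H @ e_delta)

q_min = np.linalg.solve(A, b).conj()                     # q_min = (A^{-1} b)*
E_min = sigma_a2 - (b.conj() @ np.linalg.solve(A, b)).real

def mse(q):
    """E = sigma_w^2 ||q||^2 + sigma_a^2 ||H^T q - delta||^2."""
    return (sigma_w2 * np.linalg.norm(q) ** 2
            + sigma_a2 * np.linalg.norm(H.T @ q - e_delta) ** 2)

# q_min attains E_min, and random perturbations never do better
assert np.isclose(mse(q_min), E_min)
for _ in range(5):
    dq = 0.01 * (rng.standard_normal(Nq) + 1j * rng.standard_normal(Nq))
    assert mse(q_min + dq) >= E_min - 1e-9
```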
Finally, we can see how the delay ∆ can be optimized. Notice from (43) that the term

δ∆ᵀ Hᴴ ((σw²/σa²) I + HHᴴ)⁻¹ H δ∆

is simply the ∆th diagonal element of the matrix Hᴴ ((σw²/σa²) I + HHᴴ)⁻¹ H, and that Emin decreases as this term gets bigger. Thus, the MSE-minimizing ∆ is simply the index of the maximum diagonal element of the matrix Hᴴ ((σw²/σa²) I + HHᴴ)⁻¹ H.
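The delay rule translates directly into a few lines of code: form the matrix, read off its diagonal, and take the argmax. As before, the H construction and parameters are an arbitrary sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
P, Nh, Nq = 2, 4, 6
sigma_a2, sigma_w2 = 1.0, 0.1
h = rng.standard_normal(Nh) + 1j * rng.standard_normal(Nh)

Na = (Nq + Nh - 2) // P + 1
H = np.zeros((Nq, Na), dtype=complex)
for k in range(Nq):
    for n in range(Na):
        if 0 <= n * P - k < Nh:
            H[k, n] = h[n * P - k]

# M = H^H ((sigma_w^2/sigma_a^2) I + H H^H)^{-1} H; its Delta-th diagonal
# entry is the term in (43), so E_min(Delta) = sigma_a^2 (1 - diag(M)[Delta])
M = H.conj().T @ np.linalg.solve(
        (sigma_w2 / sigma_a2) * np.eye(Nq) + H @ H.conj().T, H)
d = np.diag(M).real
delta_opt = int(np.argmax(d))
E_min = sigma_a2 * (1.0 - d)                 # minimum MSE for each candidate delay

# Cross-check against E_min(Delta) = sigma_a^2 - b^H A^{-1} b per delay
A = sigma_w2 * np.eye(Nq) + sigma_a2 * (H @ H.conj().T)
for delta in range(Na):
    b = sigma_a2 * H[:, delta]
    direct = sigma_a2 - (b.conj() @ np.linalg.solve(A, b)).real
    assert np.isclose(direct, E_min[delta])
```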