Multiple hyperplane detector for implementing the asymptotic Bayesian decision feedback equalizer

S. Chen, L. Hanzo and B. Mulgrew

Department of Electronics and Computer Science

University of Southampton, Southampton SO17 1BJ, U.K.

Department of Electronics and Electrical Engineering

University of Edinburgh, Edinburgh EH9 3JL, U.K.

Abstract— A detector based on multiple-hyperplane partitioning of the signal space is derived for realizing the optimal Bayesian decision feedback equalizer (DFE). It is known that the optimal Bayesian decision boundary separating any two neighbouring signal classes is asymptotically piecewise linear and consists of several hyperplanes when the signal to noise ratio (SNR) tends to infinity. The proposed technique determines these hyperplanes and uses them to partition the observation space. The resulting detector can closely approximate the optimal Bayesian detector, with the advantage of considerably reduced decision complexity.

I. INTRODUCTION

For the class of DFEs that employ a symbol-decision finite-memory structure with a fixed decision delay, the optimal solution is the Bayesian detector [1]–[3]. The complexity of the optimal Bayesian DFE is determined by the factor $M^{n_a}$, where $M$ is the size of the symbol constellation and $n_a$ the channel impulse response (CIR) length. As the complexity of this optimal detector grows rapidly with the size of the symbol set $M$ (and exponentially with the CIR length $n_a$), the conventional or linear-combiner DFE [4]–[6] is often used in practice to provide a trade-off between performance and detector complexity.

For the 2-PAM case, the performance difference between the conventional and optimal Bayesian DFEs has a geometric explanation: a linear-combiner DFE can only partition the observation space with a hyperplane, while the Bayesian detector can do so with a hypersurface [2]. Asymptotically, as the SNR tends to infinity, the Bayesian hypersurface becomes piecewise linear and is made up of a set of hyperplanes [7]. In practice, at large rather than infinite SNR, the Bayesian decision hypersurface can be closely approximated by a multiple-hyperplane form. This motivated our previous research on the multiple-hyperplane detector [8].

Signal space partitioning techniques for binary channels have been developed from different motivations. Kim and Moon [9],[10] developed a novel partitioning design. Their technique determines a set of hyperplanes which separate clusters of noiseless channel states. The convex regions associated with individual states are constructed by intersecting hyperplanes. The overall decision region is then formed from these convex regions. The decision complexity and performance of the detector are controlled during design by a specified minimum separating distance. The main drawback of their design is that it involves extensive computational effort during the design process. In our previous work [8], we proposed a much simpler alternative design that explicitly realizes the asymptotic Bayesian decision boundary.

This paper extends this multiple-hyperplane detector design to $M$-PAM channels. Based on a geometric translation property for the sets of noiseless channel states, the asymptotic Bayesian boundary for separating any two neighbouring signal classes can be deduced, and this allows us to extend the binary case design [8] to the general $M$-PAM case. Similar to the binary case, the design of our multiple-hyperplane detector for $M$-PAM channels is straightforward, and is guaranteed to realize the asymptotic Bayesian DFE detector. Furthermore, the reduction in detector complexity obtained with the signal space partitioning approach is more significant for $M > 2$.

II. THE PROBLEM FORMULATION

We will assume that the real-valued channel generates the received signal samples

$$r(k) = \sum_{i=0}^{n_a-1} a_i s(k-i) + e(k), \tag{1}$$

where $a_i$ are the CIR taps, the Gaussian white noise $\{e(k)\}$ has zero mean and variance $\sigma_e^2$, and the $M$-PAM symbol $s(k)$ takes values from the set $\{s_i = 2i - M - 1,\ 1 \le i \le M\}$. The SNR is defined as $\mathrm{SNR} = \left(\sum_{i=0}^{n_a-1} a_i^2\right)\sigma_s^2/\sigma_e^2$, where $\sigma_s^2$ is the symbol variance. The DFE uses the information present in the noisy observation vector $\mathbf{r}(k) = [r(k)\ r(k-1) \cdots r(k-m+1)]^T$ and the past detected symbol vector $\hat{\mathbf{s}}_b(k) = [\hat{s}(k-d-1) \cdots \hat{s}(k-d-n)]^T$ to produce an estimate $\hat{s}(k-d)$ of $s(k-d)$, where $d$, $m$ and $n$ are the decision delay, the feedforward order and the feedback order, respectively. We will choose $d = n_a - 1$, $m = n_a$ and $n = n_a - 1$, as this choice is sufficient to guarantee the desired linear separability of the different signal classes [5].

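The observation vector $\mathbf{r}(k)$ can be expressed as [5]:

$$\mathbf{r}(k) = \mathbf{F}_1 \mathbf{s}_f(k) + \mathbf{F}_2 \mathbf{s}_b(k) + \mathbf{e}(k), \tag{2}$$

where $\mathbf{s}_f(k) = [s(k) \cdots s(k-d)]^T$, $\mathbf{s}_b(k) = [s(k-d-1) \cdots s(k-d-n)]^T$, $\mathbf{e}(k) = [e(k) \cdots e(k-m+1)]^T$, and

$$\mathbf{F}_1 = \begin{bmatrix} a_0 & a_1 & \cdots & a_{n_a-1} \\ 0 & a_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_1 \\ 0 & \cdots & 0 & a_0 \end{bmatrix} \tag{3}$$

$$\mathbf{F}_2 = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ a_{n_a-1} & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ a_1 & \cdots & a_{n_a-2} & a_{n_a-1} \end{bmatrix} \tag{4}$$

are the $m \times (d+1)$ and $m \times n$ CIR matrices, respectively. Assuming correct past decisions, we have

$$\mathbf{r}(k) = \mathbf{F}_1 \mathbf{s}_f(k) + \mathbf{F}_2 \hat{\mathbf{s}}_b(k) + \mathbf{e}(k). \tag{5}$$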

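Thus the decision feedback translates the original space $\mathbf{r}(k)$ into a new space $\mathbf{r}'(k)$:

$$\mathbf{r}'(k) \triangleq \mathbf{r}(k) - \mathbf{F}_2 \hat{\mathbf{s}}_b(k). \tag{6}$$

Let the $N_f = M^{d+1}$ possible sequences of $\mathbf{s}_f(k)$ be $\mathbf{s}_{f,j}$, $1 \le j \le N_f$. The set of the noiseless channel states in the translated signal space, namely

$$R \triangleq \{\mathbf{r}_j = \mathbf{F}_1 \mathbf{s}_{f,j},\ 1 \le j \le N_f\}, \tag{7}$$

can be partitioned into $M$ conditional subsets according to the value of $s(k-d)$:

$$R^{(i)} \triangleq \{\mathbf{r}_j \in R \mid s(k-d) = s_i\}, \quad 1 \le i \le M. \tag{8}$$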
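To make the state sets in (7) and (8) concrete, the following Python sketch enumerates the noiseless channel states and groups them by the symbol $s(k-d)$. It assumes the choice $d = n_a - 1$, $m = n_a$ made in this section, so that row $j$ of the feedforward CIR matrix holds $j$ zeros followed by $a_0, \ldots, a_{n_a-1-j}$. The function names and the 3-tap CIR values are hypothetical and chosen only for illustration; they are not taken from this paper.

```python
from itertools import product


def pam_symbols(M):
    """M-PAM alphabet s_i = 2i - M - 1 for i = 1..M."""
    return [2 * i - M - 1 for i in range(1, M + 1)]


def channel_state_subsets(cir, M):
    """Group the noiseless states F_1 s_f by the value of s(k-d).

    Assumes d = n_a - 1 and m = n_a, so component j of a state is
    sum_{l=j}^{n_a-1} a_{l-j} * s(k-l).
    """
    n_a = len(cir)
    subsets = {s: [] for s in pam_symbols(M)}
    # seq = (s(k), s(k-1), ..., s(k-d)); its last entry labels the subset.
    for seq in product(pam_symbols(M), repeat=n_a):
        state = tuple(
            sum(cir[l - j] * seq[l] for l in range(j, n_a))
            for j in range(n_a)
        )
        subsets[seq[-1]].append(state)
    return subsets


# Hypothetical 3-tap CIR (not the channel used in this paper).
subsets = channel_state_subsets([1.0, 0.5, 0.25], M=4)
print({s: len(states) for s, states in subsets.items()})
```

For $M = 4$ and $n_a = 3$ this enumerates $N_f = 4^3 = 64$ states, split into four subsets of 16 states each.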

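The optimal Bayesian DFE [3] can now be summarized. The decision variables are given by

$$\eta_i(k) = \sum_{\mathbf{r}_j \in R^{(i)}} \exp\left(-\frac{\|\mathbf{r}'(k) - \mathbf{r}_j\|^2}{2\sigma_e^2}\right), \quad 1 \le i \le M, \tag{9}$$

and the minimum-error-rate decision is defined by

$$\hat{s}(k-d) = s_{i^*} \quad \text{with} \quad i^* = \arg\max_{1 \le i \le M} \{\eta_i(k)\}. \tag{10}$$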
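As an illustration of the Bayesian decision rule, the sketch below evaluates one Gaussian-kernel sum per signal class and picks the class with the largest sum. It assumes equiprobable symbols and the kernel form $\exp(-\|\mathbf{r}' - \mathbf{r}_j\|^2 / 2\sigma_e^2)$; the function name, the toy 2-PAM state sets (derived from a hypothetical CIR $[1.0,\ 0.5]$) and the noise variance are assumptions made for the example.

```python
import math


def bayesian_decision(r_prime, subsets, noise_var):
    """Return the symbol whose subset maximises the sum of Gaussian kernels."""
    def eta(states):
        return sum(
            math.exp(-sum((x - y) ** 2 for x, y in zip(r_prime, st))
                     / (2.0 * noise_var))
            for st in states
        )
    return max(subsets, key=lambda symbol: eta(subsets[symbol]))


# Toy 2-PAM example for a hypothetical CIR [1.0, 0.5] (d = 1, m = 2):
# each state is (s(k) + 0.5*s(k-1), s(k-1)) and is labelled by s(k-1).
toy_subsets = {
    -1: [(0.5, -1.0), (-1.5, -1.0)],
    +1: [(1.5, 1.0), (-0.5, 1.0)],
}
decision = bayesian_decision((1.4, 0.9), toy_subsets, noise_var=0.1)
print(decision)
```

Every decision touches all $N_f$ states, which is the cost that the multiple-hyperplane detector is designed to avoid.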

Table I gives the complexity of this optimal detector.

TABLE I
COMPARISON OF DECISION COMPLEXITY FOR THE FULL BAYESIAN AND MULTIPLE-HYPERPLANE DETECTORS.

| | Full Bayesian | Multiple-hyperplane |
|---|---|---|
| Additions | $2 n_a M^{n_a} - M$ | $(n_a + M - 2)L$ |
| Multiplications | $(n_a + 1) M^{n_a}$ | $n_a L$ |
| $\exp(\cdot)$ evaluations | $M^{n_a}$ | — |

III. MULTIPLE-HYPERPLANE DETECTOR

We first establish a geometric translation property for any two neighbouring subsets of channel states.

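Lemma 1: For $1 \le i \le M-1$, the subset $R^{(i+1)}$ is a translation of $R^{(i)}$ by the amount $2\mathbf{a}_{\mathrm{rev}}$:

$$R^{(i+1)} = R^{(i)} + 2\mathbf{a}_{\mathrm{rev}}, \tag{11}$$

where $\mathbf{a}_{\mathrm{rev}} = [a_{n_a-1} \cdots a_1\ a_0]^T$. Furthermore, $R^{(i)}$ and $R^{(i+1)}$ are linearly separable.

Proof: From the definitions of $R^{(i)}$ and $\mathbf{F}_1$, for any $\mathbf{r}_j \in R^{(i)}$ there exists an $\mathbf{r}_l \in R^{(i+1)}$ such that $\mathbf{r}_l = \mathbf{r}_j + (s_{i+1} - s_i)\mathbf{a}_{\mathrm{rev}} = \mathbf{r}_j + 2\mathbf{a}_{\mathrm{rev}}$, which implies (11). To prove the linear separability, consider the hyperplane

$$f(\mathbf{r} + \mathbf{b}_i) \triangleq \mathbf{w}^T\left(\mathbf{r} + (M - 2i)\mathbf{a}_{\mathrm{rev}}\right) = 0 \tag{12}$$

with $\mathbf{b}_i = (M - 2i)\mathbf{a}_{\mathrm{rev}}$ and $\mathbf{w} = [0\ 0 \cdots 0\ 1/a_0]^T$. Since the last element of every state in $R^{(i)}$ equals $a_0 s_i$ and $\mathbf{w}^T\mathbf{a}_{\mathrm{rev}} = 1$, for any $\mathbf{r}_j \in R^{(i)}$ and any $\mathbf{r}_l \in R^{(i+1)}$ we have $f(\mathbf{r}_j + \mathbf{b}_i) = -1 < 0$ and $f(\mathbf{r}_l + \mathbf{b}_i) = 1 > 0$. Fig. 1 illustrates this lemma graphically.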
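The translation property of Lemma 1 can be checked numerically. The sketch below builds each subset of states for a hypothetical 3-tap CIR (values chosen to be exactly representable in binary floating point, and not taken from this paper), shifts it by twice the reversed CIR vector, and compares the result with the next subset. It assumes $d = n_a - 1$ and $m = n_a$ as in Section II; all names are illustrative.

```python
from itertools import product


def subset_states(cir, M, symbol):
    """Noiseless states with s(k-d) = symbol, assuming d = n_a - 1, m = n_a."""
    n_a = len(cir)
    alphabet = [2 * i - M - 1 for i in range(1, M + 1)]
    states = set()
    for head in product(alphabet, repeat=n_a - 1):
        seq = head + (symbol,)          # (s(k), ..., s(k-d))
        states.add(tuple(
            sum(cir[l - j] * seq[l] for l in range(j, n_a))
            for j in range(n_a)
        ))
    return states


cir, M = [1.0, 0.5, 0.25], 4            # hypothetical CIR, exact in binary
a_rev = cir[::-1]                       # [a_{n_a-1}, ..., a_1, a_0]

# R^(i+1) should equal R^(i) shifted by 2 * a_rev for i = 1..M-1.
translation_holds = all(
    {tuple(x + 2 * a for x, a in zip(st, a_rev))
     for st in subset_states(cir, M, 2 * i - M - 1)}
    == subset_states(cir, M, 2 * (i + 1) - M - 1)
    for i in range(1, M)
)
print(translation_holds)
```

Exact set comparison works here only because every state is a multiple of 0.25; for a general CIR a tolerance-based comparison would be needed.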

A. Asymptotic optimal boundary for two neighbouring classes

[Fig. 1: illustration of Lemma 1, showing the subsets $R^{(i)}$ and $R^{(i+1)}$, the translation $2\mathbf{a}_{\mathrm{rev}}$, a separating hyperplane and the asymptotic Bayesian boundary.]

Although it is always possible to construct a hyperplane to correctly separate $R^{(i)}$ from $R^{(i+1)}$, the optimal decision boundary $\mathcal{B}_i$ that separates $R^{(i)}$ from $R^{(i+1)}$ cannot generally be approximated by a single hyperplane. Without loss of generality, consider $i = M/2$ and the optimal decision boundary $\mathcal{B}_{M/2}$ for separating $R^{(M/2)}$ and $R^{(M/2+1)}$. Because of Lemma 1, when $\mathrm{SNR} \to \infty$ (or $\sigma_e^2 \to 0$), the influence from all the other $R^{(i)}$ for $i \ne M/2$ and $i \ne M/2 + 1$ vanishes much more quickly, and it effectively becomes a two-class problem. We have the following definition [8].

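Definition 1: A pair of opposite-class channel states ($\mathbf{r}^{(+)} \in R^{(M/2+1)}$, $\mathbf{r}^{(-)} \in R^{(M/2)}$) is said to be dominant if, $\forall \mathbf{r}_j \in R^{(M/2)} \cup R^{(M/2+1)}$ with $\mathbf{r}_j \ne \mathbf{r}^{(+)}$ and $\mathbf{r}_j \ne \mathbf{r}^{(-)}$,

$$\|\mathbf{r}_j - \mathbf{r}_0\|^2 > \|\mathbf{r}^{(+)} - \mathbf{r}_0\|^2, \tag{13}$$

where $\cup$ denotes the union operator and

$$\mathbf{r}_0 = \frac{\mathbf{r}^{(+)} + \mathbf{r}^{(-)}}{2}. \tag{14}$$

The following properties of $\mathcal{B}_{M/2}$ are useful in the derivation of a multiple-hyperplane detector (see [7]). A necessary condition for a point $\bar{\mathbf{r}} \in \mathcal{B}_{M/2}$ is

$$\bar{\mathbf{r}} = \frac{\mathbf{r}^{(+)} + \mathbf{r}^{(-)}}{2} + \left(\mathbf{r}^{(+)} - \mathbf{r}^{(-)}\right)^{\perp}, \tag{15}$$

where $\mathbf{x}^{\perp}$ denotes an arbitrary vector in the subspace orthogonal to $\mathbf{x}$, and $\mathbf{r}^{(+)}$ and $\mathbf{r}^{(-)}$ are a pair of dominant states; and the sufficient conditions for $\bar{\mathbf{r}} \in \mathcal{B}_{M/2}$ are

$$\|\bar{\mathbf{r}} - \mathbf{r}^{(+)}\|^2 < \|\bar{\mathbf{r}} - \mathbf{r}_j\|^2, \quad \forall \mathbf{r}_j \in R^{(M/2+1)},\ \mathbf{r}_j \ne \mathbf{r}^{(+)}, \tag{16}$$

$$\|\bar{\mathbf{r}} - \mathbf{r}^{(-)}\|^2 < \|\bar{\mathbf{r}} - \mathbf{r}_l\|^2, \quad \forall \mathbf{r}_l \in R^{(M/2)},\ \mathbf{r}_l \ne \mathbf{r}^{(-)}, \tag{17}$$

$$\|\bar{\mathbf{r}} - \mathbf{r}^{(+)}\|^2 = \|\bar{\mathbf{r}} - \mathbf{r}^{(-)}\|^2. \tag{18}$$

The following lemma describing $\mathcal{B}_{M/2}$ in the asymptotic case of $\sigma_e^2 \to 0$ is a direct consequence of the necessary and sufficient conditions (15)–(18).

Lemma 2: Asymptotically, the optimal decision boundary $\mathcal{B}_{M/2}$ separating $R^{(M/2)}$ and $R^{(M/2+1)}$ is piecewise linear and made up of a set of $L$ hyperplanes. Each of these hyperplanes is defined by a pair of dominant states: the hyperplane is orthogonal to the line connecting the pair of dominant states and passes through the midpoint of the line.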

B. Multiple-hyperplane detector for two neighbouring classes

According to Lemma 2, a multiple-hyperplane detector can be constructed to partition the signal space into the two regions of $\hat{s}(k-d) \le -1$ and $\hat{s}(k-d) \ge 1$, respectively. The detector will consist of $L$ linear discriminant functions and a many-to-one Boolean mapper, similar to the binary case given in [8]. For completeness, the design procedure for this multiple-hyperplane detector is reproduced here with the necessary modifications:

Step 1 Select all the $L$ pairs of dominant channel states from the two subsets $R^{(M/2)}$ and $R^{(M/2+1)}$. For each pair, compute a hyperplane that separates these two opposite-class states.

Step 2 A Boolean logic function is obtained to make a decision based on the location of the observation vector $\mathbf{r}'(k)$ relative to each hyperplane. This is achieved by first defining a convex region associated with each state in a given class, e.g. the class $R^{(M/2+1)}$, and then forming a union of these regions.

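From (15)–(18), it is seen that the pairs of dominant states which define the asymptotic boundary can be selected using:

    $L = 0$;
    FOR $\mathbf{r}_q^{(+)} \in R^{(M/2+1)}$
        FOR $\mathbf{r}_p^{(-)} \in R^{(M/2)}$
            $\mathbf{r}_0 = (\mathbf{r}_q^{(+)} + \mathbf{r}_p^{(-)})/2$;
            $\rho = \|\mathbf{r}_q^{(+)} - \mathbf{r}_0\|^2$;
            IF ( $\|\mathbf{r}^{(+)} - \mathbf{r}_0\|^2 \ge \rho,\ \forall \mathbf{r}^{(+)} \in R^{(M/2+1)}$ )
               AND ( $\|\mathbf{r}^{(-)} - \mathbf{r}_0\|^2 \ge \rho,\ \forall \mathbf{r}^{(-)} \in R^{(M/2)}$ )
                $L$ += 1; $(\mathbf{r}_L^{(+)}, \mathbf{r}_L^{(-)}) = (\mathbf{r}_q^{(+)}, \mathbf{r}_p^{(-)})$;
            END IF
        NEXT $\mathbf{r}_p^{(-)}$
    NEXT $\mathbf{r}_q^{(+)}$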
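The selection loop above can be sketched in Python as follows. This is a minimal illustration, assuming the dominance test is the strict midpoint-distance condition of Definition 1 (no other state may be as close to the midpoint as the pair itself); the function names and the toy 2-PAM states, which come from a hypothetical CIR $[1.0,\ 0.5]$, are not from this paper.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def select_dominant_pairs(plus_states, minus_states):
    """Return all (r_plus, r_minus) pairs that satisfy the dominance test."""
    all_states = list(plus_states) + list(minus_states)
    pairs = []
    for r_plus in plus_states:
        for r_minus in minus_states:
            mid = tuple((x + y) / 2.0 for x, y in zip(r_plus, r_minus))
            rho = sq_dist(r_plus, mid)
            # Dominant if every other state is strictly farther from the
            # midpoint than the pair itself.
            if all(sq_dist(s, mid) > rho
                   for s in all_states if s != r_plus and s != r_minus):
                pairs.append((r_plus, r_minus))
    return pairs


# Toy states for a hypothetical CIR [1.0, 0.5] with 2-PAM symbols.
plus_states = [(1.5, 1.0), (-0.5, 1.0)]      # s(k-d) = +1
minus_states = [(0.5, -1.0), (-1.5, -1.0)]   # s(k-d) = -1
pairs = select_dominant_pairs(plus_states, minus_states)
print(len(pairs))
```

The search costs on the order of $N_s^2$ midpoint tests, each scanning $2N_s$ states, but it runs only once at the design stage.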

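Each pair $(\mathbf{r}_l^{(+)}, \mathbf{r}_l^{(-)})$, $1 \le l \le L$, determines a hyperplane

$$f_l(\mathbf{r}) = \mathbf{w}_l^T \mathbf{r} + b_l = 0 \tag{19}$$

that is a part of the asymptotic optimal decision boundary. The weight vector $\mathbf{w}_l$ and bias $b_l$ of the hyperplane can be computed straightforwardly as

$$\mathbf{w}_l = \frac{2\left(\mathbf{r}_l^{(+)} - \mathbf{r}_l^{(-)}\right)}{\|\mathbf{r}_l^{(+)} - \mathbf{r}_l^{(-)}\|^2} \tag{20}$$

and

$$b_l = -\frac{\left(\mathbf{r}_l^{(+)} - \mathbf{r}_l^{(-)}\right)^T\left(\mathbf{r}_l^{(+)} + \mathbf{r}_l^{(-)}\right)}{\|\mathbf{r}_l^{(+)} - \mathbf{r}_l^{(-)}\|^2}. \tag{21}$$

The hyperplane defined by (20) and (21) is a canonical hyperplane with $\mathbf{r}_l^{(+)}$ and $\mathbf{r}_l^{(-)}$ as its two support vectors [6], and has the property that $f_l(\mathbf{r}_l^{(+)}) = 1$ and $f_l(\mathbf{r}_l^{(-)}) = -1$. The following definition is useful in the optimal multiple-hyperplane partitioning:

Definition 2: A state $\mathbf{r} \in R^{(M/2)} \cup R^{(M/2+1)}$ is said to be sufficiently separable by $f_l$ if $f_l$ can separate $\mathbf{r}$ correctly with a "canonical distance" $|\mathbf{w}_l^T \mathbf{r} + b_l| \ge 1$.

Notice that $\mathbf{r}^{(+)} \in R^{(M/2+1)}$ is sufficiently separable by $f_l$ iff $\mathbf{w}_l^T \mathbf{r}^{(+)} + b_l \ge 1$. Similarly, $\mathbf{r}^{(-)} \in R^{(M/2)}$ is sufficiently separable by $f_l$ iff $\mathbf{w}_l^T \mathbf{r}^{(-)} + b_l \le -1$. Number the states in $R^{(M/2)}$ as $\mathbf{r}_1^{(-)}$ to $\mathbf{r}_{N_s}^{(-)}$ and those in $R^{(M/2+1)}$ as $\mathbf{r}_1^{(+)}$ to $\mathbf{r}_{N_s}^{(+)}$, where $N_s = N_f/M$. All the states in $R^{(M/2)} \cup R^{(M/2+1)}$ are tested to see if they can be separated sufficiently by $f_l$, and this generates the following "separability" matrix:

$$\begin{array}{c|cccccc} & \mathbf{r}_1^{(-)} & \cdots & \mathbf{r}_{N_s}^{(-)} & \mathbf{r}_1^{(+)} & \cdots & \mathbf{r}_{N_s}^{(+)} \\ \hline f_1 & \beta_{1,1} & \cdots & \beta_{1,N_s} & \beta_{1,N_s+1} & \cdots & \beta_{1,2N_s} \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ f_L & \beta_{L,1} & \cdots & \beta_{L,N_s} & \beta_{L,N_s+1} & \cdots & \beta_{L,2N_s} \end{array}$$

where $\beta_{l,j} \in \{0, 1\}$. The rule in generating this matrix is: if a state can sufficiently be separated by $f_l$, the corresponding binary index $\beta_{l,j} = 1$; otherwise $\beta_{l,j} = 0$.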
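The hyperplane construction and the separability test can be sketched together in Python. The code assumes the canonical form in which the weight vector is twice the difference of the two dominant states divided by its squared norm, so that the discriminant evaluates to $+1$ and $-1$ at the two states. The function names, the tolerance and the toy data (three dominant pairs for a hypothetical 2-PAM channel with CIR $[1.0,\ 0.5]$) are assumptions made for illustration.

```python
def canonical_hyperplane(r_plus, r_minus):
    """Weights and bias with f(r_plus) = +1 and f(r_minus) = -1."""
    diff = [p - q for p, q in zip(r_plus, r_minus)]
    norm_sq = sum(x * x for x in diff)
    w = [2.0 * x / norm_sq for x in diff]
    b = -sum(x * (p + q) for x, p, q in zip(diff, r_plus, r_minus)) / norm_sq
    return w, b


def discriminant(plane, r):
    """Evaluate f(r) = w^T r + b."""
    w, b = plane
    return sum(wi * ri for wi, ri in zip(w, r)) + b


def separability_matrix(planes, minus_states, plus_states, tol=1e-9):
    """Row l holds 1 where a state is separated with canonical distance >= 1."""
    return [
        [int(discriminant(pl, r) <= -1.0 + tol) for r in minus_states]
        + [int(discriminant(pl, r) >= 1.0 - tol) for r in plus_states]
        for pl in planes
    ]


# Toy data: hypothetical CIR [1.0, 0.5], 2-PAM, three dominant pairs.
minus_states = [(0.5, -1.0), (-1.5, -1.0)]
plus_states = [(1.5, 1.0), (-0.5, 1.0)]
dominant = [
    ((1.5, 1.0), (0.5, -1.0)),
    ((-0.5, 1.0), (0.5, -1.0)),
    ((-0.5, 1.0), (-1.5, -1.0)),
]
planes = [canonical_hyperplane(p, q) for p, q in dominant]
beta = separability_matrix(planes, minus_states, plus_states)
for row in beta:
    print(row)
```

The small tolerance guards the comparisons against rounding, since the support vectors sit exactly at canonical distance 1.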

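Define the half-space $\mathcal{H}_l \triangleq \{\mathbf{r} \mid f_l(\mathbf{r}) \ge 0\}$. To construct a convex region $\mathcal{C}_q$ covering a state $\mathbf{r}_q^{(+)} \in R^{(M/2+1)}$, select those hyperplanes which can sufficiently separate $\mathbf{r}_q^{(+)}$ and denote $\mathcal{I}_q \triangleq \{l \mid \beta_{l,N_s+q} = 1\}$. Then $\mathcal{C}_q$ is obtained by the intersection of all the $\mathcal{H}_l$ with $l \in \mathcal{I}_q$:

$$\mathcal{C}_q = \bigcap_{l \in \mathcal{I}_q} \mathcal{H}_l. \tag{22}$$

In fact, a subset of the hyperplanes defined by $\mathcal{I}_q$ is enough to construct $\mathcal{C}_q$, provided that every state in $R^{(M/2)}$ can sufficiently be separated by at least one hyperplane in the subset. The overall decision region $\mathcal{C}$ associated with the decision $\hat{s}(k-d) \ge 1$ is simply

$$\mathcal{C} = \bigcup_{q=1}^{N_s} \mathcal{C}_q. \tag{23}$$

The Boolean logic function for the multiple-hyperplane detector is now completed. Define the threshold detector output $y_l(k)$ for a linear discriminant function $f_l(\mathbf{r}'(k))$:

$$y_l(k) \triangleq \begin{cases} 1, & f_l(\mathbf{r}'(k)) \ge 0, \\ 0, & f_l(\mathbf{r}'(k)) < 0. \end{cases} \tag{24}$$

A Boolean logic value $z_q(k)$ indicating whether $\mathbf{r}'(k) \in \mathcal{C}_q$ or not is obtained via a logic AND operation of $\{y_l(k),\ l \in \mathcal{I}_q\}$:

$$z_q(k) \triangleq \bigwedge_{l \in \mathcal{I}_q} y_l(k). \tag{25}$$

A Boolean logic value $Y(k)$ indicating whether $\mathbf{r}'(k) \in \mathcal{C}$ (that is, $\hat{s}(k-d) \ge 1$) or not is obtained via a logic OR operation of $\{z_q(k)\}$ for all $q$:

$$Y(k) \triangleq \bigvee_{q=1}^{N_s} z_q(k). \tag{26}$$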
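The AND/OR mapping can be illustrated with a short Python sketch. It assumes that the index set of each positive-class state is read from the right half of the separability matrix, that each convex region is the intersection of the selected half-spaces, and that the decision region is the union of these convex regions. The hyperplanes and the matrix below are hand-derived for a hypothetical 2-PAM channel with CIR $[1.0,\ 0.5]$; they are illustrative values, not results from this paper.

```python
# Hand-derived toy design for a hypothetical CIR [1.0, 0.5] with 2-PAM.
planes = [((0.4, 0.8), -0.4), ((-0.4, 0.8), 0.0), ((0.4, 0.8), 0.4)]
beta = [[1, 1, 1, 0],
        [1, 0, 0, 1],
        [0, 1, 1, 1]]          # columns: 2 minus-class, then 2 plus-class states
N_s = 2

# Index set of state q: hyperplanes that sufficiently separate that state.
index_sets = [
    [l for l, row in enumerate(beta) if row[N_s + q] == 1]
    for q in range(N_s)
]


def in_plus_region(r_prime):
    """True if r_prime falls in the union of the convex regions."""
    # Threshold outputs of the linear discriminants.
    y = [sum(wi * ri for wi, ri in zip(w, r_prime)) + b >= 0.0
         for w, b in planes]
    # AND within each convex region, then OR across regions.
    z = [all(y[l] for l in idx) for idx in index_sets]
    return any(z)


print(index_sets)
print(in_plus_region((1.5, 1.0)), in_plus_region((0.5, -1.0)))
```

Only the threshold outputs depend on the observation; the index sets are fixed at design time, so the run-time logic reduces to a small Boolean network.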

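C. Multiple-hyperplane detector for $M$ classes

According to Lemma 1, if $f_l(\mathbf{r})$ is a hyperplane that forms a part of the asymptotic decision boundary for separating $R^{(M/2)}$ and $R^{(M/2+1)}$, $f_l(\mathbf{r} + \mathbf{b}_i)$ is a hyperplane that is a part of the asymptotic boundary for separating $R^{(i)}$ and $R^{(i+1)}$, where $\mathbf{b}_i = (M - 2i)\mathbf{a}_{\mathrm{rev}}$. In fact, the asymptotic decision boundary for separating $R^{(i+1)}$ and $R^{(i+2)}$ is the translation of the asymptotic decision boundary for separating $R^{(i)}$ and $R^{(i+1)}$ by an amount $2\mathbf{a}_{\mathrm{rev}}$. Note that

$$f_l(\mathbf{r}'(k) + \mathbf{b}_i) = \mathbf{w}_l^T \mathbf{r}'(k) + \bar{b}_{l,i} = \bar{f}_l(\mathbf{r}'(k)) + \bar{b}_{l,i}, \tag{27}$$

where $\bar{f}_l(\mathbf{r}) = \mathbf{w}_l^T \mathbf{r}$ and $\bar{b}_{l,i} = \mathbf{w}_l^T \mathbf{b}_i + b_l$. To indicate which asymptotic decision boundary is meant, the index $i$, $1 \le i \le M-1$, is used. The half-space defined by the hyperplane $\mathbf{w}_l^T \mathbf{r} + \bar{b}_{l,i} = 0$ is $\mathcal{H}_{l,i} \triangleq \{\mathbf{r} \mid \mathbf{w}_l^T \mathbf{r} + \bar{b}_{l,i} \ge 0\}$, the convex region covering the $q$th state of $R^{(i+1)}$ is $\mathcal{C}_{q,i}$, and the decision region for $\hat{s}(k-d) \ge s_{i+1}$ is $\mathcal{C}_i$. The corresponding Boolean logic value for the linear discriminant function $\bar{f}_l(\mathbf{r}'(k)) + \bar{b}_{l,i}$ is denoted by $y_{l,i}(k)$, that is, the threshold output (24) evaluated at $\mathbf{r}'(k) + \mathbf{b}_i$; the Boolean logic value indicating whether $\mathbf{r}'(k) \in \mathcal{C}_{q,i}$ or not is denoted by $z_{q,i}(k)$, obtained as in (25) from the $y_{l,i}(k)$; and the Boolean logic value indicating whether $\mathbf{r}'(k) \in \mathcal{C}_i$ or not is denoted by $Y_i(k)$, obtained as in (26) from the $z_{q,i}(k)$. The resulting multiple-hyperplane detector can now be summarized. At sample $k$:

    FOR $l = 1$ to $L$
        COMPUTE $\bar{f}_l(k) = \mathbf{w}_l^T \mathbf{r}'(k)$;
    NEXT $l$
    FOR $i = 1$ to $M - 1$
        FOR $l = 1$ to $L$
            COMPUTE $\bar{f}_l(k) + \bar{b}_{l,i}$;
        NEXT $l$
        COMPUTE Boolean logic value $Y_i(k)$;
        IF ( NOT $Y_i(k)$ ) { $\hat{s}(k-d) = s_i$; BREAK; }
        ELSE IF ( $i == M - 1$ ) { $\hat{s}(k-d) = s_M$; BREAK; }
    NEXT $i$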
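The detection loop can be sketched in Python under the assumptions that the translation vector for boundary $i$ is $(M - 2i)$ times the reversed CIR vector and that the translated biases are pre-computed at the design stage. The seven hyperplanes and four index sets below were derived by hand for a hypothetical 4-PAM channel with CIR $[1.0,\ 0.5]$ (so $n_a = 2$, $d = 1$, $m = 2$); they are illustrative values and not the design reported in this paper.

```python
M = 4
symbols = [2 * i - M - 1 for i in range(1, M + 1)]   # [-3, -1, 1, 3]
a_rev = (0.5, 1.0)                                   # reversed hypothetical CIR

# Hand-derived hyperplanes (w_l, b_l) separating the two central subsets.
planes = [
    ((0.4, 0.8), 1.2), ((0.4, 0.8), 0.4),
    ((0.4, 0.8), -0.4), ((0.4, 0.8), -1.2),
    ((-0.4, 0.8), -0.8), ((-0.4, 0.8), 0.0), ((-0.4, 0.8), 0.8),
]
# Index sets of the four states in the upper central subset.
index_sets = [[0, 4, 5, 6], [0, 1, 5, 6], [0, 1, 2, 6], [0, 1, 2, 3]]

# Design stage: b_bar[l][i-1] = w_l^T b_i + b_l with b_i = (M - 2i) * a_rev.
b_bar = [
    [sum(wj * (M - 2 * i) * aj for wj, aj in zip(w, a_rev)) + b
     for i in range(1, M)]
    for w, b in planes
]


def detect(r_prime):
    """Return the detected symbol for the translated observation r_prime."""
    # The L inner products are computed once per decision.
    g = [sum(wj * rj for wj, rj in zip(w, r_prime)) for w, _ in planes]
    for i in range(1, M):
        y = [g[l] + b_bar[l][i - 1] >= 0.0 for l in range(len(planes))]
        in_region = any(all(y[l] for l in idx) for idx in index_sets)
        if not in_region:
            return symbols[i - 1]        # observation lies below boundary i
    return symbols[M - 1]                # above every boundary


print(detect((1.5, -3.0)), detect((0.5, 1.0)))
```

Because the inner products are shared by all $M - 1$ boundaries, each extra boundary costs only $L$ additions plus Boolean logic.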

As all the values of $\bar{b}_{l,i}$ are pre-computed at the design stage, the detector complexity is what is required to compute the $L$ linear discriminant functions, as listed in Table I. Thus the complexity of this multiple-hyperplane detector is $L$ times that of the linear-combiner DFE. As long as $L <$ [...] the optimal Bayesian detector. The $L$ pairs of dominant states are selected from two subsets, which together have $2N_s$ states. Empirically, we have found that usually $L < N_s$.

IV. A SIMULATION EXAMPLE

An example was used to test the multiple-hyperplane detector, in which 4-PAM symbols were transmitted over a 3-tap channel specified by the CIR $\mathbf{a} = [a_0\ a_1\ a_2]^T$ [tap values illegible in the source]. The structure parameters of the DFE were accordingly set to $m = 3$, $d = 2$ and $n = 2$. The channel state set $R$ had $N_f = 64$ states. Five pairs of dominant states were found from the subsets $R^{(2)}$ and $R^{(3)}$, giving rise to 5 separating hyperplanes. The separability matrix is listed in Table II, from which a required Boolean logic function was obtained. For instance, with the states of $R^{(3)}$ numbered 1 to 16 as in Table II, states 1 to 6 are separated from the opposite-class states $R^{(2)}$ by the two hyperplanes $f_1$ and $f_2$; states 7 and 8 require $f_3$ and $f_4$ for separation from $R^{(2)}$; and states 9 to 16 are separated from $R^{(2)}$ by $f_5$. The symbol error rate (SER) performance of this multiple-hyperplane detector is compared with those of the Bayesian and conventional minimum mean square error (MMSE) DFEs in Fig. 2 under different SNR conditions, where it can be seen that there is hardly any SER performance difference between the multiple-hyperplane and full Bayesian detectors. For this example, the full Bayesian detector requires 380 additions, 256 multiplications and 64 $\exp(\cdot)$ function evaluations to detect a symbol. The multiple-hyperplane detector, however, needs only 25 additions and 15 multiplications to make a decision, which is less than 6% of the complexity required by the full Bayesian DFE.
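The operation counts quoted for this example can be reproduced with a few lines of Python. The formulas are an assumed reading of Table I: $2 n_a M^{n_a} - M$ additions, $(n_a + 1) M^{n_a}$ multiplications and $M^{n_a}$ exponentials for the full Bayesian detector, and $(n_a + M - 2)L$ additions and $n_a L$ multiplications for the multiple-hyperplane detector. The function names are illustrative.

```python
def full_bayesian_cost(M, n_a):
    """Operation counts per decision for the full Bayesian DFE."""
    n_states = M ** n_a
    return {
        "additions": 2 * n_a * n_states - M,
        "multiplications": (n_a + 1) * n_states,
        "exp_evaluations": n_states,
    }


def multi_hyperplane_cost(M, n_a, L):
    """Operation counts per decision for the multiple-hyperplane detector."""
    return {
        "additions": (n_a + M - 2) * L,
        "multiplications": n_a * L,
    }


full = full_bayesian_cost(M=4, n_a=3)
reduced = multi_hyperplane_cost(M=4, n_a=3, L=5)
ratio = sum(reduced.values()) / sum(full.values())
print(full, reduced, round(ratio, 3))
```

Counting every operation equally, the ratio is 40/700, which is consistent with the "less than 6%" figure quoted above.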

V. CONCLUSIONS

We have extended a signal space partitioning technique, originally developed for binary channels, to $M$-PAM channels. A scheme is presented to construct a multiple-hyperplane partitioning that is asymptotically optimal. The resulting detector consists of a set of linear discriminant functions and associated Boolean logic values, and it has much lower decision complexity than the Bayesian detector. Although this multiple-hyperplane detector achieves the optimal Bayesian performance only in the asymptotic case of infinite SNR, in practice it can closely approximate the optimal performance under finite SNR conditions.

TABLE II
SEPARABILITY MATRIX FOR THE TWO SUBSETS OF CHANNEL STATES.

| Hyperplane | States of $R^{(2)}$ | States of $R^{(3)}$ |
|---|---|---|
| $f_1$ | 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 |
| $f_2$ | 0 0 0 1 0 0 1 1 1 1 1 1 1 1 1 1 | 1 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 |
| $f_3$ | 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 | 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 |
| $f_4$ | 0 0 0 0 0 0 0 1 0 0 1 1 1 1 1 1 | 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0 0 |
| $f_5$ | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 |

Fig. 2. Performance comparison of the classical MMSE DFE (MMSE), the multiple-hyperplane detector (AB) and the full Bayesian DFE (FB) with detected symbols being fed back. [Plot: log10(Symbol Error Rate) versus Signal to Noise Ratio (dB), 16 to 30 dB.]

REFERENCES

[1] D. Williamson, R.A. Kennedy and G.W. Pulford, “Block decision feedback equalization,” IEEE Trans. Communications, Vol.40, No.2, pp.255–264, 1992.

[2] S. Chen, B. Mulgrew and S. McLaughlin, “Adaptive Bayesian equaliser with decision feedback,” IEEE Trans. Signal Processing, Vol.41, No.9, pp.2918–2927, 1993.

[3] S. Chen, S. McLaughlin, B. Mulgrew and P.M. Grant, “Bayesian decision feedback equaliser for overcoming co-channel interference,” IEE Proc. Communications, Vol.143, No.4, pp.219–225, 1996.

[4] S.U.H. Qureshi, “Adaptive equalization,” Proc. IEEE, Vol.73, No.9, pp.1349–1387, 1985.

[5] S. Chen, B. Mulgrew, E.S. Chng and G. Gibson, “Space translation properties and the minimum-BER linear-combiner DFE,” IEE Proc. Communications, Vol.145, No.5, pp.316–322, 1998.

[6] S. Chen, S. Gunn and C.J. Harris, “Decision feedback equalizer design using support vector machines,” IEE Proc. Vision, Image and Signal Processing, Vol.147, No.3, pp.213–219, 2000.

[7] R.A. Iltis, “A randomized bias technique for the importance sampling simulation of Bayesian equalizers,” IEEE Trans. Communications, Vol.43, No.2/3/4, pp.1107–1115, 1995.

[8] S. Chen, B. Mulgrew and L. Hanzo, “Asymptotic Bayesian decision feedback equalizer using a set of hyperplanes,” IEEE Trans. Signal Processing, Vol.48, No.12, pp.3493–3500, 2000.

[9] Y. Kim and J. Moon, “Delay-constrained asymptotically optimal detection using signal-space partitioning,” in Proc. ICC’98 (Atlanta, USA), 1998.

[10] Y. Kim and J. Moon, “Multi-dimensional signal space partitioning using a minimal set of hyperplanes for detecting ISI-corrupted symbols,”