Hardware Implementation of Adaptive SVD Beamforming Algorithm for MIMO-OFDM Systems


Procedia Engineering 64 ( 2013 ) 84 – 93 Available online at www.sciencedirect.com

1877-7058 © 2013 The Authors. Published by Elsevier Ltd. Open access under CC BY-NC-ND license. Selection and peer-review under responsibility of the organizing and review committee of IConDM 2013 doi: 10.1016/j.proeng.2013.09.079

ScienceDirect

International Conference On DESIGN AND MANUFACTURING, IConDM 2013


Sellathambi. D^{a,*}, Srinivasan. J^{b}, Rajaram. S^{c}

^{a} PG student, ME Communication Systems, Thiagarajar College of Engineering, Madurai, India

^{b} PG student, ME Wireless Technologies, Thiagarajar College of Engineering, Madurai, India

^{c} Associate Professor, Thiagarajar College of Engineering, Madurai, India

Abstract

This paper presents an adaptive hardware design for computing the Singular Value Decomposition (SVD) of the radio communication channel characteristic matrix. The hardware developed is suitable for computing the SVD of matrices up to 4×4 in size, as used in MIMO-OFDM standards such as IEEE 802.11n. The right singular-vector matrix can be fed back to the transmitter for beamforming to improve the error performance when the channel matrix is rank deficient. The time for the SVD of one complex matrix is limited to about 400 ns. When the channels have short coherence time, the information derived by the SVD should be sent from the receiver to the transmitter as soon as possible to maintain the beamforming performance. The algorithms to decompose the channel matrix were implemented with minimum path delay using the Virtex-6 xc6vlx75t-3ff484 FPGA from Xilinx as the target device. The implementation concentrates on utilizing the features of the FPGA to speed up operations and reduce the area required.


Keywords: Beamforming; multiple-input multiple-output (MIMO)-orthogonal frequency division multiplexing (OFDM); singular value decomposition (SVD)

1. Introduction

The increasing demand for high-performance 4G broadband wireless is addressed by the use of multiple antennas at both the base station and subscriber ends. Multiple antenna technologies enable high capacities suited for Internet and multimedia services, and also dramatically increase range and reliability.

* Corresponding author. Tel.: +91-967 787 0854; E-mail address: [email protected]

Multiple antennas at the transmitter and receiver provide diversity in a fading environment. By employing multiple antennas, multiple spatial channels are created, and it is unlikely all the channels will fade simultaneously. At the receiver, multiple antennas are used to separate spatial multiplexing streams and for interference mitigation, which makes aggressive frequency reuse a reality.

Transmitter precoding for multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) is an effective way of leveraging the diversity gains afforded by a multiple-transmit, multiple-receive antenna system in a frequency-selective environment. In point-to-point MIMO systems, a transmitter equipped with multiple antennas communicates with a receiver that has multiple antennas. Most classic precoding results assume narrowband, slowly fading channels, meaning that the channel over a certain period of time can be described by a single channel matrix that does not change rapidly. In practice, such channels can be achieved, for example, through OFDM. The precoding strategy that maximizes the throughput, called the channel capacity, depends on the channel state information available in the system. If the channel matrix is completely known, singular value decomposition (SVD) precoding is known to achieve the MIMO channel capacity. In this approach, the channel matrix is diagonalized by taking an SVD and removing the two unitary matrices through pre- and post-multiplication at the transmitter and receiver, respectively. Then, one data stream per singular value can be transmitted (with appropriate power loading) without creating any interference between streams.
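The diagonalization described above can be illustrated with a small numerical sketch. It assumes a noiseless real-valued 2×2 channel that is built from a known decomposition H = U S V^T, so no SVD routine is needed; the matrices, angles, and function names are hypothetical and serve only as an illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def rotation(theta):
    """2x2 real orthogonal matrix, used here as a stand-in for a unitary matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical channel constructed from a known SVD: H = U * S * V^T.
U = rotation(0.7)
V = rotation(-1.2)
S = [[2.0, 0.0], [0.0, 0.5]]          # singular values 2.0 and 0.5
H = matmul(matmul(U, S), transpose(V))

def transmit_and_receive(symbols):
    """Precode with V, pass through H (no noise), post-multiply with U^T."""
    s = [[x] for x in symbols]         # column vector of data streams
    x = matmul(V, s)                   # pre-multiplication at the transmitter
    y = matmul(H, x)                   # channel
    z = matmul(transpose(U), y)        # post-multiplication at the receiver
    return [row[0] for row in z]

received = transmit_and_receive([1.0, -1.0])
```

Under these assumptions each stream is only scaled by its own singular value: the first entry of `received` is close to 2.0 and the second is close to -0.5, with no cross-stream interference.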

1.1. Literature Review

In the last few years, numerous algorithms have been developed for precoding. SVD-based algorithms are discussed below. The paper in [6] presents a hardware-efficient VLSI architecture for steering matrix computation using a hardware-optimized Givens rotation SVD algorithm. It utilizes bidiagonalization, diagonalization, and Givens rotation to achieve high processing throughput. The resulting VLSI implementation in 0.18 μm technology requires 3.3 μs to complete the SVD of one complex matrix, which is still more than 8 times the critical requirement (i.e., 400 ns) in IEEE 802.11n systems. The paper in [7] is based on an adaptive SVD beamforming algorithm with perturbation theory. Nevertheless, the computational cost is also high, and the algorithm with iterative division will cause performance degradation in practical hardware implementations with severe quantization effects. In [8], first-order perturbation updates are computed recursively, resulting in a highly efficient algorithm that has lower complexity than the earlier least-mean-square (LMS)-based algorithm. In [9], a systolic algorithm for the SVD of arbitrary complex matrices based on the cyclic Jacobi method with parallel ordering is presented. A novel two-step, two-sided unitary transformation scheme, tailored to the use of coordinate rotation digital computer (CORDIC) algorithms for high-speed arithmetic, is employed to diagonalize a complex 2×2 matrix. An expandable array of O(n^2) complex 2×2 matrix processors computes the SVD of an n×n matrix in O(n log n) time.

In [10], a minimum-area matrix decomposition architecture is described that is programmable to perform QRD and SVD with variable precision. In [12], an SVD processor system is presented in which each processing element is implemented using a simple CORDIC unit. The internal recursive loop within the CORDIC module is exploited, with pipelining being used to multiplex the two independent microrotations onto a single CORDIC processor. A matrix decomposition architecture was proposed in [21] according to the Golub-Kahan SVD (GK-SVD) algorithm [22]. It achieves higher processing throughput than [19] with lower hardware cost. Based on the matrix decomposition architecture in [21], a hardware-efficient architecture was proposed in [23] by modifying the GK-SVD algorithm and using a high-speed Givens rotation design.

In hardware implementations, all the elements are represented in finite precision. The orthogonality among the singular vectors (the column vectors of U and V) may be corrupted, which induces interference among the transmitted substreams. The loss of orthogonality caused by quantization error cannot be prevented entirely. However, orthogonality restoration for fixed-point operation can be used to eliminate the damage caused by the deflation stage, preserve most of the orthogonality, and improve the performance. For the architecture of orthogonality restoration, Modified Gram-Schmidt is utilized.
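As an illustration of orthogonality restoration, the following minimal sketch quantizes two orthonormal vectors to a coarse fixed-point grid and then re-orthonormalizes them with Modified Gram-Schmidt. The word length (`frac_bits=4`), the test vectors, and the function names are assumptions chosen for illustration and are not taken from the hardware design.

```python
import math

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def modified_gram_schmidt(vectors):
    """Orthonormalize a list of vectors, one after another."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Project out q from the current residual w, not from the
            # original v; this is what makes the scheme "modified".
            coeff = inner(q, w)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(inner(w, w).real)
        basis.append([wi / norm for wi in w])
    return basis

def quantize(v, frac_bits=4):
    """Round real and imaginary parts to a fixed-point grid (assumed word length)."""
    scale = 1 << frac_bits
    return [complex(round(x.real * scale) / scale,
                    round(x.imag * scale) / scale) for x in v]

# Two exactly orthonormal vectors (hypothetical test data).
exact = [[1 / 3, 2 / 3, 2 / 3],
         [2 / 3, 1 / 3, -2 / 3]]

corrupted = [quantize(v) for v in exact]
restored = modified_gram_schmidt(corrupted)

error_before = abs(inner(corrupted[0], corrupted[1]))
error_after = abs(inner(restored[0], restored[1]))
```

With these inputs the quantized vectors are no longer orthogonal (`error_before` is clearly nonzero), while `error_after` is at the level of floating-point rounding. A fixed-point hardware implementation would of course only restore orthogonality up to its own word length.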

The storage of the left singular vectors before and after the Gram-Schmidt operation is shown in Fig. 1. With dedicated task arrangement, the stored right singular vectors can be output for the Gram-Schmidt operation or for the Hv_i computation. The resulting right singular vectors can be stored after the Gram-Schmidt operation.

2. System model

2.1. Block Diagram


SVD engine is depicted in Fig.2. It consists of six

2.2.

In a MIMO system, assume that the maximum numbers of receiver and transmitter antennas are M_R and M_T, respectively. This means that there are possibly M_R × M_T different sizes of channel matrices (i.e., 1×1, 1×2, ..., M_R × M_T).

Zero Padding Scheme for Square and Nonsquare Channel Matrix: The maximum size of the channel matrix is M_R × M_T in a MIMO system. Hence, it is intuitive to design an SVD engine to support the maximum channel size. A smaller channel matrix can be extended to the maximum-size channel matrix by inserting zeros. If the size of a given matrix is N_R × N_T, the extended channel matrix is

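$$ \mathbf{H}_{M_R \times M_T} = \begin{bmatrix} \mathbf{H}_{N_R \times N_T} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \tag{2} $$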

After extending the original channel matrix by inserting zeros, the SVD operation of the original channel is exactly the same as that of the maximum-size channel matrix, so the extended channel matrix can be processed by the referenced algorithms. Note that the value of d in the algorithm depends on the size of the original channel matrix; therefore, d is still equal to min(N_R, N_T). Fig. 2 shows the block diagram of the zero padding unit. A given channel matrix H of size N_R × N_T is extended to size M_R × M_T by inserting zeros, and the multiplexer is used to construct the positive semidefinite matrix R_1 based on (3). We also apply the zero padding scheme to the singular calculation unit and the partial update unit. According to [7], Fig. 2 illustrates the architecture of the singular calculation unit. Three multiplexers are used to handle the two cases N_R ≥ N_T and N_R < N_T. We employ [9] to realize the partial update unit, as shown in Fig. 7.
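The zero padding scheme can be sketched in a few lines. The example below assumes that the original matrix is placed in the top-left corner of the maximum-size matrix and that the maximum size is 4×4; the function names and the test matrix are hypothetical. It also checks that the Gram matrix H^H H of the padded channel contains the original Gram matrix as its top-left block and zeros elsewhere, which is why the nonzero singular values are unchanged.

```python
def zero_pad_channel(h, max_rx=4, max_tx=4):
    """Embed an NR x NT channel matrix in a max_rx x max_tx zero matrix.

    Placing the original block in the top-left corner is an assumption.
    Returns the padded matrix and d = min(NR, NT) of the ORIGINAL matrix.
    """
    n_rx, n_tx = len(h), len(h[0])
    if n_rx > max_rx or n_tx > max_tx:
        raise ValueError("channel matrix exceeds the maximum supported size")
    padded = [[0j for _ in range(max_tx)] for _ in range(max_rx)]
    for r in range(n_rx):
        for c in range(n_tx):
            padded[r][c] = complex(h[r][c])
    return padded, min(n_rx, n_tx)

def gram(h):
    """R = H^H H for a matrix stored as a list of rows."""
    rows, cols = len(h), len(h[0])
    return [[sum(h[k][i].conjugate() * h[k][j] for k in range(rows))
             for j in range(cols)] for i in range(cols)]

# Hypothetical 2x3 channel (NR = 2, NT = 3).
h_2x3 = [[1 + 1j, 2, 0.5j],
         [0, 1 - 2j, 3]]

h_padded, d = zero_pad_channel(h_2x3)
r_small = gram(h_2x3)        # 3x3 Gram matrix of the original channel
r_padded = gram(h_padded)    # 4x4 Gram matrix of the extended channel
```

Here `d` stays equal to 2, matching the statement that d depends on the original size rather than on the padded size.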

Gram-Schmidt Scheme for Nonsquare Channel Matrix: In [15], we apply the Gram-Schmidt scheme to derive the remaining singular vectors when N_R > N_T. Because the entries of a channel matrix, as well as the entries of its singular vectors, are always complex-valued, the initial vector is set to the unit vector e_k, with the k-th entry being 1. With this setting, we can rewrite [15]

3. Design of the Algorithm

In Algorithm 1, the positive semidefinite matrix R_1 is estimated by a moving average of the recently received signal vectors. In many MIMO-OFDM-based standards, the channel matrix H is already known from channel estimation. Therefore, we can utilize this information to evaluate an accurate R_1 as

$$ \mathbf{R}_1 = \begin{cases} \mathbf{H}^H \mathbf{H}, & N_R \ge N_T \\ \mathbf{H} \mathbf{H}^H, & N_R < N_T \end{cases} \tag{3} $$

and the singular vector pairs are then derived sequentially.

Algorithm 1

Given: H of size N_R × N_T, R_1 from (3), d = min(N_R, N_T).

Step 1: Update and Deflation
  for i = 1 : (d - 1)
    Initialize w_i(0) from R_i(:,1) or from the adjacent subcarrier
    for n = 1, 2, ...: update w_i(n) from w_i(n - 1)
    apply deflation to R_i to obtain R_{i+1}
  end

Step 2: Derivation of u_i and v_i
  if N_R ≥ N_T: v_i = w_i, and u_i is derived from H v_i
  else: u_i = w_i, and v_i is derived from H^H u_i
  end

Step 3: Partial Update for the last pair (u_d, v_d)
  if N_R ≥ N_T: v_d is derived from R_d and tr(R_d), and u_d from H v_d
  else: u_d is derived from R_d and tr(R_d), and v_d from H^H u_d

Step 4: Gram-Schmidt for remaining singular vectors
  if N_R > N_T
    for k = (d + 1) : N_R: start from w_k = e_k and orthogonalize against u_1, ..., u_{k-1} to obtain u_k
  else
    for k = (d + 1) : N_T: start from w_k = e_k and orthogonalize against v_1, ..., v_{k-1} to obtain v_k
  end
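The update-and-deflation idea of Algorithm 1 can be sketched in software as follows. This is only an illustration under stated assumptions: it uses plain power iteration as the update rule (the paper's exact update formula is not reproduced here), treats the case N_R ≥ N_T so that R_1 = H^H H, runs a fixed and generous number of iterations, and omits the partial update and Gram-Schmidt steps. All function names and the test matrix are hypothetical.

```python
import math

def hermitian(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def mat_mul(a, b):
    """Matrix-matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def norm(v):
    """Euclidean norm of a (possibly complex) vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def svd_by_power_deflation(h, iterations=60):
    """Singular triplets of an NR x NT matrix with NR >= NT (sketch only)."""
    n_t = len(h[0])
    r = mat_mul(hermitian(h), h)                  # R_1 = H^H H
    sigmas, us, vs = [], [], []
    for _pair in range(n_t):
        # Initialize w from the largest-norm column of R_i (an assumption;
        # it avoids starting from an all-zero column).
        columns = [[row[c] for row in r] for c in range(n_t)]
        w = max(columns, key=norm)
        start_norm = norm(w)
        if start_norm < 1e-12:                    # nothing left: rank deficient
            break
        w = [x / start_norm for x in w]
        for _step in range(iterations):           # assumed update: power iteration
            rw = mat_vec(r, w)
            rw_norm = norm(rw)
            w = [x / rw_norm for x in rw]
        rw = mat_vec(r, w)
        lam = sum(x.conjugate() * y for x, y in zip(w, rw)).real  # eigenvalue of R_i
        sigma = math.sqrt(max(lam, 0.0))
        hv = mat_vec(h, w)
        sigmas.append(sigma)
        vs.append(w)                              # right singular vector v_i = w_i
        us.append([x / sigma for x in hv])        # left singular vector from H v_i
        # Deflation: R_{i+1} = R_i - lam * v_i v_i^H
        r = [[r[i][j] - lam * w[i] * w[j].conjugate() for j in range(n_t)]
             for i in range(n_t)]
    return sigmas, us, vs

# Hypothetical 3x2 channel whose singular values are 3 and 1.
H_example = [[2, 1], [1, 2], [0, 0]]
sigmas, us, vs = svd_by_power_deflation(H_example)
```

For this test matrix the sketch returns singular values close to 3 and 1, and the sum of sigma_i u_i v_i^H reproduces `H_example`. A hardware design would bound the iteration count far more tightly than this sketch does.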

4. Result and Discussion

4.1. Synthesis Report

Table 1 Device properties

Project File: xil.xise
Module Name: AdaptiveSVD
Target Device: xc6vlx75t-3ff484
Product Version: ISE 14.1
Design Goal: Balanced
Design Strategy: Xilinx Default (unlocked)
Environment: System Settings

Table 2 Device utilization

Logic Utilization            Used     Available   Utilization
Number of occupied Slices    3,149    17,280      18%
Number of bonded IOBs        572      680         84%
Number of DSP48Es            64       64          100%

4.2. Simulation Results

Fig. 3. Simulation Output of Zero Padding Unit


Fig. 5. Simulation Output of Gram Schmidt Unit

4.3. Comparison Result

Table 3 Comparison table

                                  JSSC (D. Markovic et al.)   ACSSC (C. Studer et al.)   ISCAS (C. Senning et al.)   This Work
Supported Antenna Configuration   4x4                         4x4                        4x4                         1x1 to 4x4 (all 16 sizes)
SVD Outputs                       U, S                        U, S, V                    V, S                        U, S, V
Throughput                        50K                         86.4K                      303K                        479K
Frequency                         100 MHz                     133 MHz                    149 MHz                     600 MHz
Power Efficiency                  1.47                        3.5                        N/A                         3.83
Technology                        90 nm                       180 nm                     180 nm                      40 nm

5. Conclusion and Future work

The presented architecture reduces the hardware overhead and the computational complexity of beamforming, which is the most computationally complex module of 3GPP MIMO-OFDM standards. The demand for high-throughput wireless transmission, such as in IEEE 802.11n WLAN systems and IEEE 802.16e WiMAX systems, continues to grow. Our design can achieve 9.8 M channel-matrices/s. Eigenvalues are derived iteratively, so the beamforming is performed in real time, and the method we employed can support all MIMO system configurations. In successive matrix processing, the equivalent processing time required for each matrix can even be reduced to 90 ns. By reducing the complexity of the most computationally demanding module, beamforming and equalization become efficient. In the critical case, we have only 46 μs to derive the precoding matrices of 128 subchannels. In other words, the time for the SVD of one complex matrix is limited to about 400 ns. When the channels have short coherence time, the information derived by the SVD should be sent from the receiver to the transmitter as soon as possible to maintain the beamforming performance. The decomposition time and accuracy will therefore greatly affect the beamforming performance. These design strategies enable the SVD to be applied effectively to high-throughput wireless communication applications. In the future, this work can be extended to larger transmit and receive antenna sets such as 8x8 and 16x16.
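The per-matrix time budget quoted above can be checked with a short calculation. It assumes that the figure of 46 is expressed in microseconds and that the budget is divided evenly over the 128 subchannels:

```latex
\[
T_{\mathrm{SVD}} \approx \frac{46\ \mu\mathrm{s}}{128}
\approx 0.359\ \mu\mathrm{s} = 359\ \mathrm{ns}
\]
```

Under these assumptions the result is of the same order as the limit of about 400 ns stated for the SVD of one complex matrix.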

References

[1] Alamouti, pp. 1451-1458, Oct. 1998.

[2] -division-duplex (FDD), Proc. IEEE 43rd Veh. Technol. Conf., May 1993, pp. 508-511.

[3] Sampath.H and Paulraj.A, IEEE Commun. Lett., vol. 6, no. 6, June 2002, pp. 239-241.

[4] Love.D.J and Heath.R.W, IEEE Trans. Inf. Theory, vol. 51, no. 8, pp. 2967-2976, Aug. 2005.

[6] Foschini.G.J and Gans.M.J, Wireless Pers. Commun., vol. 6, no. 3, pp. 311-335, Mar. 1998.

[7] Willink.T.J, IEEE Trans. Signal Process., pp. 615-622, Feb. 2008.

[8] Sampath.H, Talwar.S, Tellado.T, Erceg.V, and Paulraj.A, fourth-generation MIMO-OFDM: Broadband wireless system: Design, IEEE Commun. Mag., vol. 40, no. 9, pp. 143-149, Sep. 2002.

[9] Hemkumar.N.D and Cavallaro.J.R, Proc. IEEE Int. Symp. Circuits Syst., May 1992, vol. 3, pp. 1061-1064.

[10] Deprettere.F, SVD and Signal Processing: Algorithms, Analysis and Applications. Amsterdam, The Netherlands: Elsevier, 1988.

[11] Cheng Zhou Zhan, Yen-Liang Chen, Iterative Superlinear-Convergence SVD Beamforming Algorithm and VLSI Architecture for MIMO-OFDM Systems, IEEE Trans. Signal Process., vol. 60, no. 6, June 2012.

[12] Liu.Z, Dickson.K, McCanny, A floating-

[13] Yen-Liang Chen, Reconfigurable Adaptive Singular Value Decomposition Engine Design for High-Throughput MIMO-OFDM Systems, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2012.

[14] Raleigh.G.G, IEEE Trans. Commun., vol. 46, pp. 357-366, Mar. 1998.

[15] David J. Love, -Input Multiple-, IEEE Trans. Commun., vol. 51, no. 7, July 2003.

[16] Goldsmith.A, Jafar.S.A, Jindal.N, and Vishwanath.S, Capacity limits of MIMO channels, IEEE J. Sel. Areas Commun., vol. 21, no. 5, pp. 684-702, Jun. 2003.

[17] Specification of IEEE 802.11n physical layer [Online]. Available: http://www.ieee802.org/11/

[18] Markovic.D, Nikolic.B, and -

[19] State Circuits, vol. 42, no. 4, pp. 922-934, Apr. 2007V.

[20] Design and implementation trade-1990.

[21] Senning.C, -efficient steering matrix computation architecture for MIMO communication syst, Proc. IEEE Int. Symp. Circuits Syst., May 2008, pp. 304-307.

[22] Erceg et al., IEEE802.16.3c-01/29r4; http://www.ieee802.org/16/tg3/contrib/802163c-01_29r4.pdf

[23] -OFDM for wireless communications: Signal detection with enhanced channel 1477, Sep. 2002.

[24] FPGA Implementation of Precoding using Low Complexity SVD for MIMO-OFDM Systems

[25] Chiueh.T and P. Y. Tsai, OFDM Baseband Receiver Design for Wireless Communications. New York: Wiley, 2007.

[26] Yen-Liang Chen, Cheng-Zhou Zhan, Reconfigurable Adaptive Singular Value Decomposition Engine Design for High-Throughput MIMO-OFDM Systems, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2012.

[27] Telatar.E, -Bell Labs Internal Tech. Memo, 1995.