• No results found

In this chapter, first a brief discussion on Wirtinger calculus is presented. It is shown to be very useful in dealing with nonholomorphic functions that are obvious in the CVNNs. Most importantly, the well known Cauchy-Riemann equations for holomorphic function becomes a special case of derivatives in Wirtinger calculus. The derivative rules (e.g., product, quotient and chain rule) in Wirtinger calculus comply with those in the ordinary calculus, thereby making it painless to use.

The complex gradient descent and the complex-LM algorithm have been derived using the Wirtinger calculus, which enables performing all computations directly in the complex domain. The use of the Wirtinger calculus also simplifies the deriva- tion. In course of complex-LM derivation, we have encountered a general least squares problem in the complex domain. A solution with proof is presented and the result in utilized in the derivation. We identify that (i) the Wirtinger calculus, (ii) coordination transformation between the conjugate coordinate and the real coordinate system, and (iii) the functional dependency graph of Jacobians greatly simplifies the adaptation of the learning algorithms in RVNNs to the CVNN framework. Computer simulation results are provided to verify the derivations, and it is shown that the complex-LM as well as its variant, the pseudo-complex-LM, have much faster learning convergence. This resembles the convergence behavior of LM algorithm in the RVNNs for which the LM is widely used in various RVNN applications. Therefore, as the application areas of CVNNs become wider the complex-LM expected to be very useful in those applications.

Chapter 4

Complex-Valued Functional Link

Network Design by Orthogonal

Least Squares Method

4.1

Introduction

In Chapter 1, we have seen how complex-valued data naturally arise in various appli- cations, such as array and radar signal processing, magnetic resonance imaging, com- munication systems, signal representation in complex baseband, and processing data in the frequency domain. We have also seen how some real-valued two dimensional data can have better representation as a complex vector field, for example, wind speed and direction and tree representation of hand skeleton in the hand gesture recognition system. The emerging application areas have spurred growing interests of complex- valued neural networks (CVNNs). However, as shown in Chapter 2, unlike real-valued neural networks (RVNNs), a major issue in the CVNNs involves designing/selecting activation functions. It seems that there is no conclusive remark as to how one should choose activation function for a given problem.

Among various architectures of CVNNs, the complex-valued multilayer perceptron (CMLP) (31, 75) is perhaps the most widely used model. Besides the CMLP, there have been a growing interests in developing complex-valued radial basis function (CRBF) networks. One of the early works by (14) extends the real RBF by using complex cen- ters and weights. A potential application of CRBF networks for solving communication

channel equalization with a stochastic gradient training algorithm is demonstrated in (12). In order to design and ensure a parsimonious RBF network by growing and pruning strategies, (28) have proposed a sequential learning algorithm, referred to as complex minimal resource allocation network (CMRAN). Even though the aforemen- tioned approaches employ complex-valued centers, the response of hidden units remains real as the hidden units use Gaussian RBFs with Euclidean norm. Hence, they cannot approximate input-output mapping efficiently, especially for the phase values. This has been overcome with the development of a fully complex-valued RBF (FC-RBF) network with a fully complex-valued activation function (62). It is shown that the FC-RBF outperforms the conventional CRBF networks in its approximation ability. The authors have also proposed an efficient learning scheme for the FC-RBF that use a self-regulatory system (63). The activation function used in FC-RBF, however, is one of the elementary transcendental function having singularities.

Besides the issue of activation function in complex domain, the complex-valued multilayer structures (CMLPs and CRBF networks) share similar problems with the networks in real domain. The Multilayer structures are computationally intensive be- cause of nonlinear projections by the hidden layers (56). Most importantly, their train- ing is difficult due to several problems, including local minima trapping, saturation of activation functions, slow convergence, initial weight dependence, and overfitting due to structural complexity (67).

The problems stated above can be avoided by removing the hidden layers, but with- out giving up the nonlinearity. Instead, nonlinearity is imposed in the input layer by functional expansion of input variables. The resulting model has a flat structure, i.e., single layer network. This alternative approach is known as functional link networks (FLNs) in the literature (e.g., (54)) and advocated for alleviating the previously men- tioned problems of multilayer structures. However, the price to be paid for is the proper choice of functional expansion, for example, polynomials, trigonometric functions, and orthogonal basis functions.

In the real domain, many works on the FLNs can be found encompassing pattern recognition, control applications, and communication channel equalization (17, 53, 55). Only few attempts have been taken towards the application of FLNs for processing complex-valued data. In (56), despite the data being complex, the authors have con- sidered the problem in the real domain separating the real and imaginary parts. This

4.1 Introduction

real-valued perspective does not exploit the well defined complex algebraic rules, and hence, cannot provide efficient modeling.

In this chapter, we present a complex-valued functional link network (CFLN) for solving complex-valued function approximation problems. In order to capture nonlinear relationship between the input and output, multivariate polynomials are considered. In contrast to other possibilities like elementary transcendental functions, polynomials are easy to compute and include higher order cross product terms. Moreover, polynomials are equally well behaving in the real and complex domain, while most of the transcen- dental functions have singularities in the complex domain (31). Since the number of total terms (or monomials) in the multivariate polynomial grows exponentially with the degree and number of input variables, only the relevant monomials are selected in the CFLN by the orthogonal least squares (OLS) method. The Polynomial degree is increased in an incremental way until a negligible improvement is seen from the incre- ment. In this regard, it is worth mentioning that computing with polynomials are also prevalent in other CVNNs because the nonlinear functions are often evaluated by the polynomial expansions. Thus, using polynomials in the input layer does not involve additional computational cost than the other CVNNs.

To the best of our knowledge, our study is the first of its kind in the FLNs that con- siders complex-valued data directly, without separating the real and imaginary parts. The proposed CFLN offers a number of advantages over conventional CVNNs. First, it does not require activation functions which has been a major concern in the complex domain till now. Second, the cost function, i.e., mean squared error (MSE), has a single minimum as it is quadratic in parameters. Therefore, fast learning algorithms such as OLS or recursive least squares (64) can be used to learn the weight parameters in the CFLN. In contrast, learning in the multilayer architectures often becomes difficult due to the local minima problem and slow convergence of error backpropagation. Third, selecting a near optimal set of monomials by the OLS method yields very simple CFLNs with favorable performance.

The remaining sections of the chapter are structured as follows. Section 4.2 provides a brief overview of FLNs in a general framework. A detailed description of CFLN design by the OLS method is presented in Section 4.3. Experimental results comprising a complex-valued function approximation, a channel equalization, and a real-world wind

prediction problem are provided in Section 4.4. Finally, conclusion is given in Section 4.5.

Figure 4.1: Schematic of functional link network (FLN). The original N input variables are expanded by M linearly independent functions and the output is their linear combina- tion.

Related documents