Two-Stage GP - System Design - Pattern recognition using genetic programming for classification

8.2 System Design

8.2.1 Two-Stage GP

The choice of classes for the two stages was based not only on the statistical distribution of the features but also on the previous literature. In [130], Swami and Sadler gave a detailed analysis of AMC for different modulation schemes, using higher order cumulants to classify a large set of modulations, including BPSK, QPSK and QAM(>4). Their results showed that 16QAM and 64QAM were difficult to classify compared to the other modulations. Wong and Nandi [162] also used higher order cumulants to classify these modulations and likewise concluded that QAM(>4) is harder to classify than the other two schemes. Together, these results indicate that separating BPSK, QPSK and QAM(>4) from one another is easier than separating 16QAM from 64QAM. To cope with this problem, we have used two-stage genetic programming.

At the first stage, classification of BPSK, QPSK and QAM(>4) is performed, and at the second stage GP is used again to classify the remaining two classes, 16QAM and 64QAM. Since the first stage has three classes and GP must therefore perform multi-class classification, KNN is used as the fitness evaluator, which makes GP a multi-class classifier. SVM or ANN could also be used at the first stage, but these two classifiers are computationally more complex than KNN. Since training GP is itself computationally expensive, KNN is the preferred choice for fitness evaluation because of its simplicity. At the second stage, KNN, SVM, ANN and the Fisher criterion are used for fitness evaluation of the binary classification problem, producing four solutions, one for each GP combination. As the second stage is independent of the first and devoted solely to the classification of QAM(>4) modulations, the classification accuracy should increase.
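The idea of using KNN as a fitness evaluator can be sketched as follows. This is a minimal illustration only, not the method of Section 7.6.1: it assumes that a candidate tree maps a cumulant vector to a single number and that fitness is the leave-one-out KNN accuracy on that number. The function names, the toy feature values and the two example trees are hypothetical.

```python
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Label x by majority vote among the k nearest 1-D training outputs."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def knn_fitness(tree, features, labels, k=3):
    """Leave-one-out KNN accuracy of the scalar produced by a candidate tree."""
    outputs = [tree(f) for f in features]
    correct = 0
    for i, (out, label) in enumerate(zip(outputs, labels)):
        # Hold out sample i and classify it with the remaining samples.
        rest_x = outputs[:i] + outputs[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        if knn_predict(rest_x, rest_y, out, k) == label:
            correct += 1
    return correct / len(labels)

# Toy (hypothetical) cumulant pairs for the three first-stage classes.
features = [(-2.0, -2.0), (-1.9, -2.1), (-2.1, -1.9),
            (1.0, -1.0), (0.9, -1.1), (1.1, -0.9),
            (-0.68, -0.68), (-0.62, -0.62), (-0.70, -0.65)]
labels = ["BPSK"] * 3 + ["QPSK"] * 3 + ["QAM"] * 3

# Two hypothetical candidate trees: one informative, one constant.
good = knn_fitness(lambda f: f[0] + f[1], features, labels)
bad = knn_fitness(lambda f: 0.0, features, labels)
```

A tree whose output separates the classes scores higher than one whose output carries no class information, which is the property the evolutionary search relies on.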

In addition to the two stages (used to increase the classification accuracy of 16QAM and 64QAM), the proposed system can be divided into training and testing phases. During the training phase, cumulants are given to GP as input features. GP tests different combinations of cumulants during the evolution, and the best combination generated is returned as the final result. GP differs from other machine learning classifiers in that training does not return a trained classifier; instead it returns a solution built from combinations of the input features, which is later tested by an independent classifier. The system model for the training phase is shown in Figure 8.1.
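As an illustration of the input features, the sketch below computes a few second- and fourth-order cumulants from a block of complex baseband symbols, using the standard moment-based definitions. Which cumulants the system actually feeds to GP is not stated in this section, so the set chosen here (C20, C21, C40, C42), the function names and the noise-free toy constellations are assumptions for illustration only.

```python
def moment(signal, p, q):
    """Mixed moment M_pq = E[y^(p-q) * conj(y)^q] of a complex sequence."""
    return sum(y ** (p - q) * y.conjugate() ** q for y in signal) / len(signal)

def cumulants(signal):
    """Return selected second- and fourth-order cumulants of the signal."""
    m20 = moment(signal, 2, 0)
    m21 = moment(signal, 2, 1)
    m40 = moment(signal, 4, 0)
    m42 = moment(signal, 4, 2)
    return {
        "C20": m20,
        "C21": m21,
        "C40": m40 - 3 * m20 ** 2,
        "C42": m42 - abs(m20) ** 2 - 2 * m21 ** 2,
    }

# Noise-free, unit-power toy constellations (assumed for the example).
bpsk = [1 + 0j, -1 + 0j] * 50
qpsk = [1 + 0j, 1j, -1 + 0j, -1j] * 25

bpsk_c = cumulants(bpsk)
qpsk_c = cumulants(qpsk)
```

For these toy inputs, BPSK gives C40 = C42 = -2, while the QPSK constellation {1, j, -1, -j} gives C40 = 1 and C42 = -1, so even this small feature set separates the two schemes. With noisy, finite-length signals the estimates scatter around these values.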

[Figure 8.1 diagram: BPSK, QPSK, 16QAM, 64QAM -> Extracting Cumulants -> Stage 1: GP (KNN) -> Tree1 (KNN), separating BPSK/QPSK/(16QAM,64QAM) -> Stage 2 (16QAM/64QAM): GP (KNN), GP (SVM), GP (ANN), GP (FC) -> Tree2 (KNN), Tree (SVM), Tree (ANN), Tree (FC)]

Figure 8.1: System model showing training phase of two-stage modulation classification.

Training starts with extracting cumulants from the received signal. These cumulants are given as inputs to GP (KNN) at the first stage, where KNN is combined with GP to make GP a multi-class classifier; KNN is used for fitness evaluation during training. At the first stage, three-class classification is performed, as explained earlier. Since BPSK and QPSK are easier to classify than 16QAM and 64QAM, a second classification stage is used to separate 16QAM from 64QAM. On completion of the first stage, GP (KNN) produces Tree1 (solution 1), the best tree found over all GP generations in terms of separation between the three classes. The remaining two modulations are fed into the second stage, where three types of classifiers (KNN, SVM and ANN) are used for fitness evaluation of the GP trees. Details of how these classifiers are used for fitness calculation are given in the next sections. In addition, the Fisher criterion (FC), which can provide a cost-efficient solution for binary classification, is also used at the second stage. Once the training phase is complete, five trees have been produced: one from the first stage and four from the second. In Figure 8.1, the term inside the parentheses denotes the method used for fitness evaluation. For example, in GP (KNN), KNN is used for fitness evaluation of the trees, and the same holds for the other notations.
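To make the Fisher criterion concrete, the sketch below scores the one-dimensional output of a candidate tree with the usual two-class Fisher ratio, J = (mu_a - mu_b)^2 / (var_a + var_b). The exact form used for fitness in this work is described in later sections and may differ; the function name and the toy tree outputs are hypothetical.

```python
from statistics import mean, pvariance

def fisher_criterion(outputs_a, outputs_b):
    """Fisher ratio of two sets of 1-D tree outputs (larger is better)."""
    spread = pvariance(outputs_a) + pvariance(outputs_b)
    gap = mean(outputs_a) - mean(outputs_b)
    if spread == 0:
        # Degenerate case: no within-class scatter at all.
        return float("inf") if gap != 0 else 0.0
    return gap ** 2 / spread

# Hypothetical outputs of two candidate trees on 16QAM and 64QAM samples.
j_separated = fisher_criterion([-0.70, -0.68, -0.66], [-0.64, -0.62, -0.60])
j_overlapping = fisher_criterion([-0.70, -0.62, -0.66], [-0.68, -0.64, -0.60])
```

Because the ratio needs only class means and variances, it avoids training a classifier for every candidate tree, which is consistent with its description as a cost-efficient option for the binary stage.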

Once the training process is complete, the trees produced by GP are tested with test data. Figure 8.2 shows the different stages of the testing phase. We again start by extracting cumulants from the received signals. These cumulants are fed into Tree1, produced by GP (KNN) at the first stage, which performs three-class classification. Since Tree1 was produced by GP (KNN), its performance is tested using KNN (other classifiers can also be used). BPSK and QPSK are separated at this stage, and the remaining two classes are fed into the second stage. The four trees produced by the different GP combinations during the second stage of training are used here, each of them trying to separate 16QAM from 64QAM. The performance of each tree is tested with the same classifier that was used for its fitness evaluation. For example, the tree produced by GP (SVM) is evaluated using an SVM classifier, and the same holds for the other classifiers. For the tree returned by GP (FC), KNN is used for testing because of its simplicity.
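The routing logic of the testing phase can be sketched as a simple cascade. The code below is an illustrative outline only: the two stage classifiers are passed in as callables, and the threshold-based stand-ins used in the example (with hypothetical cut-off values on an assumed (C40, C42) feature pair) merely take the place of the evolved trees and their associated classifiers.

```python
def two_stage_classify(features, stage1, stage2):
    """Route a feature vector through stage 1, then stage 2 if it is QAM(>4)."""
    coarse = stage1(features)      # "BPSK", "QPSK" or "QAM"
    if coarse != "QAM":
        return coarse              # BPSK and QPSK are resolved at stage 1
    return stage2(features)        # "16QAM" or "64QAM"

def toy_stage1(features):
    """Stand-in for Tree1 + KNN: thresholds on C42 (illustrative values)."""
    _, c42 = features
    if c42 < -1.5:
        return "BPSK"
    if c42 < -0.85:
        return "QPSK"
    return "QAM"

def toy_stage2(features):
    """Stand-in for a second-stage tree: one threshold on C42 (illustrative)."""
    _, c42 = features
    return "16QAM" if c42 < -0.65 else "64QAM"

# Hypothetical (C40, C42) test vectors, one per modulation.
samples = [(-2.0, -2.0), (1.0, -1.0), (-0.68, -0.68), (-0.62, -0.62)]
predictions = [two_stage_classify(s, toy_stage1, toy_stage2) for s in samples]
```

Keeping the stages as separate callables mirrors the design described above: any of the four second-stage trees can be swapped in without touching the first stage.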

[Figure 8.2 diagram: BPSK, QPSK, 16QAM, 64QAM -> Extracting Cumulants -> Stage 1: Tree1 (KNN) -> BPSK, QPSK, 16QAM/64QAM -> Stage 2: Tree2 (KNN), Tree (SVM), Tree (ANN), Tree (FC) -> 16QAM, 64QAM]

Figure 8.2: System model of testing phase for modulation classification.

8.2.2 Fitness Evaluation Using KNN

Fitness evaluation using KNN was presented in Section 7.6.1. The same method is used here, so it is not repeated; interested readers are referred to that section.
