CHAPTER 4 The constraint based decomposition approach
4.3. Theoretical framework
4.5.1 The description of the CBD algorithm
4.5.1.3. Classification of more than two classes
The purpose here is to separate various inputs into classes when inputs from more than two classes are presented. Let us consider that patterns from C1, C2, ..., Cn are presented to the net. A 3-class problem is presented in fig. 7. Various approaches are possible.
Fig. 7. A 3 class problem in a 2D input space.
One approach is to solve the multiclass problem iteratively as a set of two-class problems. Firstly, the algorithm separates the pattern set into homogeneous subsets S1, S2, ..., Sn, each of which contains only patterns of the same class. Then, pairs of classes are chosen and separated by training a network using the two-class version of the CBD algorithm. Let us suppose that classes C1 and C2 are chosen to be separated in this first phase. In the example given, the patterns from C3 are ignored in the first instance and classes C1 and C2 are separated. The result of separating C1 from C2 (shown in fig. 8) is a set of hyperplanes determining a set of regions Ri, each region being assigned a class Ck, where k is 1 or 2.
Fig. 8. The first stage in a multiclass separation is solving a 2-class problem. Any two classes are chosen and separated.
Once two classes have been separated, two approaches are possible: iteration on the classes to be separated and iteration on the regions obtained by solving the first two-class problem.
In the first approach, for each remaining class Cj from C3, ..., Cn, each pattern is taken and fed to the net in order to identify the region in which it lies. Let this region be Ri and let the class assigned to this region by the first run of the two-class algorithm be Cik. Subsequently, a two-class problem is solved: separate Cik from Cj in Ri.
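The region lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the helper `region_code`, the two hyperplanes and the `region_class` table are hypothetical and are not taken from the CBD algorithm itself. A region is identified here by the side of each hyperplane on which the pattern falls.

```python
def region_code(x, hyperplanes):
    """Return one bit per hyperplane: 1 if x lies on its positive side, else 0."""
    return tuple(
        int(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0)
        for w, b in hyperplanes
    )

# Hypothetical result of the first two-class stage: two hyperplanes in 2D,
# x0 = 0.5 and x1 = 0.5, each given as a (weights, bias) pair.
hyperplanes = [((1.0, 0.0), -0.5), ((0.0, 1.0), -0.5)]

# Hypothetical class Cik assigned to each region Ri by that stage.
region_class = {(0, 0): 1, (1, 0): 2, (0, 1): 2, (1, 1): 1}

pattern = (0.9, 0.8)  # a pattern from a remaining class, say C3
code = region_code(pattern, hyperplanes)
c_ik = region_class[code]
print(f"pattern lies in region {code}, previously assigned to class C{c_ik}")
```

In this toy setting, the next step would be a two-class problem restricted to the region found: separate the class stored in `c_ik` from the class of `pattern`.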
In the example, let us consider that a pattern from the third class is randomly chosen. Let this be one of the patterns situated in R1. The next problem will be a 2-class problem, to be solved in R1 as shown in fig. 9: separate the white patterns from the dashed patterns.
Fig. 9. Iteration on the patterns. In this case one of the patterns of the remaining class (dashed) has been chosen and it happens to be in R1. Consequently, R1 is split so that the classes in R1 (dashed and white) are separated.
Subsequently, all dashed patterns in R1 will be ignored because they are already separated from the patterns in the class to which R1 was initially assigned. Another dashed pattern will be chosen (perhaps in R2) and another 2-class problem will be solved in a limited region of the input space. The process is continued until all patterns in all remaining classes are considered and the problem is completely solved. A possible solution is presented in fig. 10 and the algorithm used is presented in fig. 11.
In conclusion, this approach involves solving a two-class problem in the entire input space, followed by an iterative check through all the patterns from the remaining classes. For each pattern, if the region in which it lies is not consistent (i.e. contains patterns from more than one class), the separation procedure is called in that particular region.
In the second case, after two classes have been separated and the space divided into consistent regions Ri, each region Ri can be checked for consistency with respect to all classes. If the region is not consistent, the multiclass procedure can be applied recursively. As the regions become smaller at each step, and if the training set contains a finite number of patterns, the process will eventually converge to consistent regions, which can then be labelled with their corresponding class. Such an algorithm is presented in fig. 12.
Algorithm 1. (iteration on classes Cj)
separate C1, C2, ..., Cn, Region
    split the pattern set in consistent subsets (one class in each subset)
    build a pattern set with only two classes C1 and C2
    separate C1 from C2 in Region (the whole space). The result is a set of hyperplanes determining a set of regions Ri, each region having assigned a class Cik.
    for each class Cj with j from 3 to n do
        for each pattern pij in Cj do
            apply pij to the input of the net and see in which region it is classified; let that region be Ri with its corresponding class Cik
            separate Cik from Cj in Ri
end
Fig. 11 Algorithm 1 for a multiclass classification.
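A minimal runnable sketch of the control flow of Algorithm 1 is given below. The two-class CBD procedure is not reproduced here: the function `split` is a hypothetical stand-in that separates classes with axis-aligned hyperplanes placed at median coordinates, and regions are stored as leaves of a small tree. The names `split`, `find_leaf`, `algorithm1` and the toy data are assumptions made only for this illustration.

```python
def split(patterns):
    """Stand-in separator (not CBD): cut with axis-aligned hyperplanes
    until every region holds patterns of a single class."""
    labels = {c for _, c in patterns}
    if len(labels) == 1:
        return {"label": labels.pop(), "patterns": patterns}
    for d in range(len(patterns[0][0])):
        values = sorted({x[d] for x, _ in patterns})
        if len(values) > 1:
            mid = len(values) // 2
            thr = (values[mid - 1] + values[mid]) / 2
            low = [p for p in patterns if p[0][d] <= thr]
            high = [p for p in patterns if p[0][d] > thr]
            return {"dim": d, "thr": thr, "low": split(low), "high": split(high)}
    raise ValueError("identical patterns carry different classes")

def find_leaf(node, x):
    """Feed x through the hyperplanes and return the region (leaf) it lies in."""
    while "label" not in node:
        node = node["high"] if x[node["dim"]] > node["thr"] else node["low"]
    return node

def algorithm1(by_class):
    classes = sorted(by_class)
    c1, c2 = classes[:2]
    # First stage: separate C1 from C2 in the whole space.
    tree = split([(x, c) for c in (c1, c2) for x in by_class[c]])
    for cj in classes[2:]:
        for x in by_class[cj]:
            region = find_leaf(tree, x)
            if region["label"] == cj:
                continue  # this pattern was separated by an earlier split
            # Separate Cik from Cj, using only the patterns lying in Ri.
            inside = region["patterns"] + [
                (y, cj) for y in by_class[cj] if find_leaf(tree, y) is region
            ]
            refined = split(inside)
            region.clear()
            region.update(refined)  # Ri is replaced in place by its sub-regions
    return tree

data = {
    1: [(0.1, 0.1), (0.2, 0.3)],
    2: [(0.8, 0.9), (0.9, 0.7)],
    3: [(0.15, 0.8), (0.85, 0.2)],
}
tree = algorithm1(data)
print(all(find_leaf(tree, x)["label"] == c for c, xs in data.items() for x in xs))
```

Only the regions that actually receive a pattern from a remaining class are revisited; all other regions are never inspected again.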
Algorithm 2. (iteration on regions Ri)
separate C1, C2, ..., Cn, Region
    split the pattern set in consistent subsets (one class in each subset)
    separate C1 and C2 in Region
    for each region Ri of Region do
        if Ri is not consistent then
            separate whatever classes there are in Ri
end
Fig. 12 Algorithm 2 for a multiclass classification.
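The iteration on regions can be sketched in the same spirit. As before, this is only an illustration: `split` is a hypothetical stand-in for the two-class CBD procedure (axis-aligned cuts at median coordinates), and `leaves`, `find_leaf`, `algorithm2` and the toy data are names assumed for this example.

```python
def split(patterns):
    """Stand-in separator (not CBD): cut with axis-aligned hyperplanes
    until every region holds patterns of a single class."""
    labels = {c for _, c in patterns}
    if len(labels) == 1:
        return {"label": labels.pop()}
    for d in range(len(patterns[0][0])):
        values = sorted({x[d] for x, _ in patterns})
        if len(values) > 1:
            mid = len(values) // 2
            thr = (values[mid - 1] + values[mid]) / 2
            low = [p for p in patterns if p[0][d] <= thr]
            high = [p for p in patterns if p[0][d] > thr]
            return {"dim": d, "thr": thr, "low": split(low), "high": split(high)}
    raise ValueError("identical patterns carry different classes")

def find_leaf(node, x):
    """Return the region (leaf) in which x lies."""
    while "label" not in node:
        node = node["high"] if x[node["dim"]] > node["thr"] else node["low"]
    return node

def leaves(node):
    """Enumerate all regions of a tree."""
    if "label" in node:
        return [node]
    return leaves(node["low"]) + leaves(node["high"])

def algorithm2(patterns):
    """patterns: list of (point, class) pairs lying in the current region."""
    classes = sorted({c for _, c in patterns})
    if len(classes) == 1:
        return {"label": classes[0]}
    a, b = classes[:2]
    # Separate two of the classes present in this region.
    tree = split([p for p in patterns if p[1] in (a, b)])
    # Record which patterns (of any class) fall in each resulting region.
    groups = [
        (region, [p for p in patterns if find_leaf(tree, p[0]) is region])
        for region in leaves(tree)
    ]
    for region, inside in groups:
        if len({c for _, c in inside}) > 1:  # region is not consistent
            refined = algorithm2(inside)     # recursive multiclass call
            region.clear()
            region.update(refined)
    return tree

data = [((0.1, 0.1), 1), ((0.2, 0.3), 1), ((0.8, 0.9), 2),
        ((0.9, 0.7), 2), ((0.15, 0.8), 3), ((0.85, 0.2), 3)]
tree = algorithm2(data)
print(all(find_leaf(tree, x)["label"] == c for x, c in data))
```

Each recursive call receives strictly fewer patterns than its caller, because every region produced by the two-class stage excludes at least one pattern of the two separated classes; this mirrors the convergence argument given for a finite training set.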
The problem of multiclass separation can be approached in a different way, which does not involve a first stage of separation of two classes. The previous solutions build a single network provided with output neurons corresponding to each class. When a pattern from a given class is presented to the network, the output neuron corresponding to that particular class is turned on. An alternative approach can build a network for each class. Each net will make the distinction between its own class and any other class. For this purpose, the patterns must be organised in sets corresponding to each class net. Each such set will contain the patterns from one class as the class to be recognised and the patterns from all other classes as the opposite class. Such an algorithm is presented in fig. 13.
Algorithm 3. (for parallel hardware)
separate C1, C2, ..., Cn in Region is
    split the pattern set in consistent subsets (one class in each subset)
    for each class Cj do
        build subset with the patterns in Cj as one class and all the other patterns in a different class Cdiff
        separate_two_classes Cj and Cdiff in Region
Fig. 13 Algorithm 3 for a multiclass classification.
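The one-net-per-class scheme can be sketched as below. Again, `split` is a hypothetical stand-in for the two-class CBD procedure, and the thread pool merely illustrates that the class nets are trained independently of each other; it does not model dedicated parallel hardware. All names and the toy data are assumptions for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def split(patterns):
    """Stand-in separator (not CBD): cut with axis-aligned hyperplanes
    until every region holds patterns with a single label."""
    labels = {c for _, c in patterns}
    if len(labels) == 1:
        return {"label": labels.pop()}
    for d in range(len(patterns[0][0])):
        values = sorted({x[d] for x, _ in patterns})
        if len(values) > 1:
            mid = len(values) // 2
            thr = (values[mid - 1] + values[mid]) / 2
            low = [p for p in patterns if p[0][d] <= thr]
            high = [p for p in patterns if p[0][d] > thr]
            return {"dim": d, "thr": thr, "low": split(low), "high": split(high)}
    raise ValueError("identical patterns carry different labels")

def find_leaf(node, x):
    """Return the region (leaf) in which x lies."""
    while "label" not in node:
        node = node["high"] if x[node["dim"]] > node["thr"] else node["low"]
    return node

def train_class_net(job):
    """Train the net of class Cj: Cj (True) against Cdiff, all the rest (False)."""
    cj, by_class = job
    subset = [(x, c == cj) for c, xs in by_class.items() for x in xs]
    return cj, split(subset)

def algorithm3(by_class):
    # Each job is independent of the others, so they can run concurrently.
    jobs = [(cj, by_class) for cj in by_class]
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(train_class_net, jobs))

def classify(nets, x):
    """Return the classes whose net turns on for pattern x."""
    return [cj for cj, net in nets.items() if find_leaf(net, x)["label"]]

data = {
    1: [(0.1, 0.1), (0.2, 0.3)],
    2: [(0.8, 0.9), (0.9, 0.7)],
    3: [(0.15, 0.8), (0.85, 0.2)],
}
nets = algorithm3(data)
print(all(classify(nets, x) == [c] for c, xs in data.items() for x in xs))
```

Note that each class net keeps its own copy of any hyperplane it needs, even when another class net uses the same one.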
Discussion.
Algorithm 1 can be very efficient if there is a large difference between the numbers of patterns in different classes. The classes with the largest numbers of patterns can then be separated first. If the number of patterns in the remaining classes is not too large, only a few supplementary separations will be needed. In the same case, algorithm 2 can be very inefficient. Suppose there are N regions after the separation of C1 from C2 and just one supplementary class C3 with just one pattern. In this situation, algorithm 2 takes all N regions into consideration and checks the consistency of each of them. Algorithm 1 takes only the pattern in class C3, finds the region the pattern is in and separates only in that region. Algorithm 1 eliminates the useless consistency checks of those regions which are consistent by using the patterns in the remaining classes to identify the regions which are not consistent.
Algorithm 3 separates each class from all the other classes. Thus, each net needs only the hyperplanes that separate its own class from the rest, without any concern for the separation between the other classes. Therefore, each net will in general be smaller than the net obtained with any of the previous approaches. On the other hand, the same hyperplane could be useful in the separation of more than one class. In this case, this hyperplane will be implemented by different units in different class nets. Therefore, the total number of neurons used by this approach would in general be greater than the number of neurons used by the previous approaches.
However, algorithm 3 is very convenient in parallel hardware if a piece of hardware can be allocated to each class-net. The convenience comes from the fact that the training can be done in parallel. Each training of one class net is independent of any other training and the parallelism is fully exploited.