Incremental Learning with IPSL - Accelerating classifier training using AdaBoost within cascade

4.2.1 Variant PSL Discussion

Removing negative training samples from each subsequent node shrinks the overall training set, which raises the concern that the generalization of the final classifier might be compromised. However, this can be addressed by employing the Viola and Jones (2001b) bootstrapping method, provided the training process has a large pool of negative training samples available. Each negative training sample that is removed from subsequent nodes can be replaced by a new negative sample for the next round. The only criterion that a new negative sample has to meet is that it is correctly rejected by all nodes in the current layer.
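The replacement step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis implementation: samples are modelled as plain floats, a node is any function that returns True when it accepts a sample, and the names `rejected_by_all` and `refill_negatives` are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence

# Assumption: a node is modelled as a function returning True when it
# accepts (detects) a sample and False when it rejects it.
Node = Callable[[float], bool]


def rejected_by_all(nodes: Sequence[Node], sample: float) -> bool:
    """True when every node already in the layer votes to reject the sample."""
    return all(not node(sample) for node in nodes)


def refill_negatives(
    nodes: Sequence[Node],
    kept: List[float],
    pool: Iterable[float],
    target_size: int,
) -> List[float]:
    """Top up the negative set with pool samples that all current nodes reject."""
    refilled = list(kept)
    for candidate in pool:
        if len(refilled) >= target_size:
            break
        if rejected_by_all(nodes, candidate):
            refilled.append(candidate)
    return refilled


# Hypothetical 1-D layer with two threshold nodes and a small negative pool.
nodes = [lambda x: x > 0.5, lambda x: x > 0.7]
refilled = refill_negatives(nodes, [0.1, 0.2], [0.6, 0.3, 0.9, 0.4, 0.05], 4)
print(refilled)
```

In this toy run the candidates 0.6 and 0.9 are skipped because at least one node accepts them, so only candidates that every node rejects are admitted to the refreshed negative set.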

The variant PSL structure guarantees during training that a requested percentage of negative training samples is rejected at each layer, which the original PSL structure cannot do. The original PSL structure completes the training of a layer as soon as a single node achieves the specified hit rate and the minimum required false detection rate. However, the false detection rate of a node may not be the same as the false detection rate of the whole layer. The false detection rate of an entire layer is likely to be considerably higher than that of its individual nodes, because different nodes might have learned to reject different subsets of the negative training samples. Since a negative sample is rejected only when all nodes vote unanimously to reject it, a layer correctly classifies only the intersection of the negative subsets rejected by its individual nodes. The variant PSL structure, on the other hand, ensures that whatever false detection rate the last node in a layer achieves will be the false detection rate of the entire layer.
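The gap between node-level and layer-level false detection rates can be illustrated with a small worked example. The numbers below are invented for illustration and are not taken from the experiments: three hypothetical nodes each reject 6 of 10 negatives, but each rejects a different subset.

```python
# Illustrative only: the ten negatives and the per-node rejection sets are invented.
negatives = set(range(10))

# Each hypothetical node rejects 6 of the 10 negatives, but a different subset.
rejected_by_node = [
    {0, 1, 2, 3, 4, 5},
    {2, 3, 4, 5, 6, 7},
    {4, 5, 6, 7, 8, 9},
]

# An original PSL layer rejects a sample only on a unanimous vote,
# so the layer rejects the intersection of the per-node rejection sets.
rejected_by_layer = set.intersection(*rejected_by_node)

# False detections are the negatives that are not rejected.
node_false_detections = [len(negatives - r) for r in rejected_by_node]
layer_false_detections = len(negatives - rejected_by_layer)

print(node_false_detections)   # false detections per node, out of 10
print(layer_false_detections)  # false detections for the whole layer, out of 10
```

Every node on its own lets 4 of the 10 negatives through, while the layer as a whole lets 8 through, because only samples 4 and 5 are rejected by all three nodes.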

4.3 Incremental Learning with IPSL

Section 3.5 highlighted the current shortcomings of the boosting paradigm in providing capabilities to train classifiers incrementally. Classifiers that have previously been trained on a given dataset must be entirely re-trained if new data becomes available that needs to be incorporated into an existing classification rule. For classification domains where new and unique training data is made available at regular intervals, the consequence is that all previous effort invested in training, tuning and optimization is wasted, with no way of capitalizing on previously successful training results. Also, current boosting structures do not cope well with training classifiers on massive datasets due to memory and processing constraints, and they offer no means of training incrementally on smaller partitions of a dataset. Instead, a dataset must be sampled and its size capped in line with the hardware capabilities of the training environment. This section presents a novel approach to incrementally training classifiers of boosted ensembles using the IPSL framework.

[Figure 4.6 diagram not reproduced. Labels: Layer 1 to Layer 4; Nodes 1a-1d, 2a-2d, 3a-3d and 4a-4d; inputs "New positive training samples", "Old negative training samples" and "New negative training samples"; output "Object detected".]

Figure 4.6: The incremental PSL (IPSL) training structure illustrating the creation of additional nodes to the existing base classifier.

Figure 4.6 illustrates the approach used by IPSL to build classifiers incrementally. IPSL uses the same methodology as PSL in attempting to train cascade layers that produce 100% hit rates on positive training data. To achieve this, the IPSL structure takes as input an existing classifier (the base classifier) together with a new, additional positive training dataset. IPSL then evaluates the additional positive training samples against each layer of the base classifier, and if any new positive training samples fail to pass a layer, a new node is trained on those misclassified positive samples. The new node is subsequently appended to the same layer of the base classifier. The misclassified positive training samples are trained against a combination of new negative training samples and all the old negative training samples used in prior training for that particular layer. Algorithm 5 shows the details of IPSL.

The generation of new nodes for each layer continues until all positive training samples have been classified correctly and the layer has learned to reject at least 50% of the total negative training samples. The latter is a modification of the standard PSL requirement for rejecting negative training samples. The standard PSL permitted a layer to finish once a single node had learned to reject 50% of the negative training samples. However, this did not necessarily translate into a 50% rejection of all negative training samples by the entire layer. Stricter per-layer constraints for rejecting negative training samples seemed necessary, since adding further nodes was expected to increase overall false detection rates rapidly. This was anticipated because the existing nodes of a base classifier had not been trained on the new negative training samples. Any new negative training sample that did not attain a unanimous negative detection vote from the existing nodes therefore had no chance of being rejected by the layer through new nodes. Because of this, new nodes had to be able to preserve the false alarm rates of the base classifier and therefore needed to have the old negative training samples available to them. On the other hand, the positive training samples used in previous training rounds do not need to be used in subsequent incremental training, since the nodes of the base classifier have already been trained to classify them as positives.

The characteristic of a PSL classifier that requires a unanimous negative detection vote from all nodes in a layer before a sample is rejected outright by that layer presents a considerable difficulty and limitation for the IPSL structure. The challenge lies in producing a false detection rate that is lower than that of the base classifier. The issue arises if the decision boundary of the base classifier was not sufficiently defined during its training to correctly classify new negative training samples. In such a scenario, an IPSL classifier will not be able to learn to reject the new negative training samples, even though the new nodes appended to the base classifier have been trained to reject them correctly. In its present form, the IPSL structure therefore appears to provide, in theory, a mechanism that at best preserves the false detection rates of a base classifier whilst improving its positive detection rates. The IPSL structure must consequently rely heavily on effective training of the base classifier, particularly with regard to the use of large and diverse negative datasets. Ideally, such datasets need to fully encompass the decision boundary of the negatives, so that additional IPSL positive training only refines the actual decision boundary.
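The limitation can be made concrete with a toy example. The sketch below assumes one-dimensional samples and hand-written threshold nodes, all of which are hypothetical; it only demonstrates the voting rule, under which appending a node can add acceptances to a layer but can never remove one.

```python
from typing import Callable, List

# Assumption: a node returns True when it accepts (detects) a sample.
Node = Callable[[float], bool]


def layer_accepts(nodes: List[Node], sample: float) -> bool:
    """A PSL layer passes a sample unless every node votes to reject it."""
    return any(node(sample) for node in nodes)


# Hypothetical base layer with a single node.
base_layer: List[Node] = [lambda x: x > 0.5]

new_positive = 0.3  # missed by the base layer
new_negative = 0.6  # falsely accepted by the base layer

# Hypothetical new node that accepts the missed positive and
# correctly rejects the new negative when evaluated on its own.
new_node: Node = lambda x: 0.2 < x < 0.4
ipsl_layer = base_layer + [new_node]

print(layer_accepts(base_layer, new_positive), layer_accepts(ipsl_layer, new_positive))
print(new_node(new_negative), layer_accepts(ipsl_layer, new_negative))
```

The appended node recovers the missed positive, so the hit rate improves. The new negative is still accepted by the layer, because the base node's vote to accept it cannot be overridden by the new node's vote to reject it.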

Input : PSL classifier
Output: IPSL classifier
Data  : MN = misclassified negatives, MP = misclassified positives.
        BoostNewPSLNode()    initiate the boosting of a new PSL node until the criterion condition is met.
        AppendNode()         add the new IPSL node to the layer.
        ResetPositiveSet()   re-initialize the positive training set with all the positive samples.
        PopulateCurrMPSet()  fill the positive training set with positive samples that have been misclassified.
        PopulateCurrMNSet()  fill the negative training set with negative samples that have been misclassified.

foreach layer in the cascade do
    totalMP = PopulateCurrMPSet(currMPSet);
    if totalMP > 0 then
        PopulateCurrMNSet(currMNSet);
        while totalMP > 0 and !layerTargetsMet and totalPSLNodes < MAX do
            layerTargetsMet = BoostNewPSLNode(currMNSet, currMPSet);
            if layerTargetsMet then
                currMNSet = ResetPositiveSet();
            else
                totalMP = PopulateCurrMPSet(currMPSet);
                PopulateCurrMNSet(currMNSet);
            end
        end
    end
end

Algorithm 5: IPSL learning
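The outer structure of Algorithm 5 can be sketched in Python as follows. This is a simplified illustration and not the thesis implementation: samples are one-dimensional floats, `train_interval_node` is a hypothetical toy stand-in for BoostNewPSLNode() that performs no boosting, and the misclassified-negative bookkeeping and the layerTargetsMet test are omitted. The names `ipsl_update`, `layer_accepts` and `max_nodes` are likewise assumptions made for the example.

```python
from typing import Callable, List

Node = Callable[[float], bool]  # True means the node accepts (detects) the sample
Layer = List[Node]


def layer_accepts(layer: Layer, sample: float) -> bool:
    """A layer rejects a sample only if all of its nodes reject it."""
    return any(node(sample) for node in layer)


def train_interval_node(mp: List[float], negatives: List[float]) -> Node:
    """Toy stand-in for BoostNewPSLNode(): the widest open interval around
    the first misclassified positive that contains no negative sample."""
    seed = mp[0]
    lo = max((n for n in negatives if n < seed), default=float("-inf"))
    hi = min((n for n in negatives if n > seed), default=float("inf"))
    return lambda x: lo < x < hi


def ipsl_update(
    cascade: List[Layer],
    new_positives: List[float],
    old_negatives: List[List[float]],  # one list of old negatives per layer
    new_negatives: List[float],
    max_nodes: int = 8,
) -> List[Layer]:
    """Append nodes to each layer until it passes all new positives."""
    for index, layer in enumerate(cascade):
        # New nodes train against the old negatives of this layer plus the new ones.
        negatives = old_negatives[index] + new_negatives
        mp = [p for p in new_positives if not layer_accepts(layer, p)]
        while mp and len(layer) < max_nodes:
            layer.append(train_interval_node(mp, negatives))
            mp = [p for p in new_positives if not layer_accepts(layer, p)]
    return cascade


# Hypothetical one-layer base classifier and toy data.
cascade = [[lambda x: 0.5 < x < 0.7]]
positives = [0.6, 0.3, 0.35]
ipsl_update(cascade, positives, [[0.1, 0.9]], [0.45])
print(len(cascade[0]))
```

In this toy run the base node already accepts 0.6, so a single new node covering the interval between the negatives 0.1 and 0.45 is enough to pass the two remaining positives, and the layer still rejects all three negatives.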

4.3.1 IPSL Discussion

IPSL's approach to incremental training, which involves continually appending new nodes to existing base classifiers, raises questions of scalability. For instance, how many rounds of incremental learning can the IPSL structure perform before the number of nodes becomes too cumbersome to execute in real time? Also, if, unlike the positive training samples, all previously used negative training samples need to be used in every subsequent training run, how many rounds of incremental training can be performed before the negative training set becomes computationally too expensive?

In document Accelerating classifier training using AdaBoost within cascades of boosted ensembles : a thesis presented in partial fulfillment of the requirements for the degree of Master of Science in Computer Sciences at Massey University, Auckland, New Zealand (Page 67-71)