3.3 Framework Design
3.3.4 Multi-window Ensemble Learning
The overall processing conducted by our multi-window ensemble learning (MWEL) framework is shown in Algorithm 3.3. When the first window arrives, WC and WM are initialised as empty sets, and the first classifier is trained on the first window. If precision_min and precision_maj satisfy the conditions mentioned above (Algorithm 3.3, line 7), we add this classifier to WC as the first sub-classifier and set its weight to 1. WI is also filled with the instances in this window. For the following sliding windows, whether to train and add a new sub-classifier to WC is determined by the WC updating strategy described in Subsection 3.3.3. If WC is to be updated with a new classifier c trained on the current WB, the instances used to train the old sub-classifiers and the classifiers' weights must be updated together, as shown in Algorithm 3.2. Whether or not WC is updated, we always update WM according to the rules described in Subsection 3.3.2. As shown in line 35 of Algorithm 3.3, WM is updated for the current WB before the next window of instances arrives. Here, we simplify the procedure by saving only the latest minority instances, for space and time efficiency, rather than keeping all minority instances [148] or selecting the instances nearest to the current window [3]. The primary procedure is described in Algorithm 3.1.
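As a minimal sketch of the admission test in line 7 of Algorithm 3.3, the following Python fragment computes a per-class precision and accepts a classifier only when both values exceed 0.5. It assumes the usual definition of precision (correct predictions of a class divided by all predictions of that class) and a binary labelling in which 1 marks the minority class; the function names `class_precision` and `accept_classifier` are hypothetical and do not come from the framework itself.

```python
def class_precision(true_labels, predicted_labels, target_class):
    """Precision for one class: correct predictions of it / all predictions of it."""
    truths_when_predicted = [
        t for t, p in zip(true_labels, predicted_labels) if p == target_class
    ]
    if not truths_when_predicted:
        # Assumption: a class that is never predicted gets precision 0.
        return 0.0
    correct = sum(1 for t in truths_when_predicted if t == target_class)
    return correct / len(truths_when_predicted)


def accept_classifier(true_labels, predicted_labels,
                      minority=1, majority=0, threshold=0.5):
    """Return True when both per-class precisions exceed the threshold."""
    precision_min = class_precision(true_labels, predicted_labels, minority)
    precision_maj = class_precision(true_labels, predicted_labels, majority)
    return precision_min > threshold and precision_maj > threshold
```

Under these assumptions, a classifier that never predicts the minority class is rejected, which is the case the resampling branch of Algorithm 3.3 is meant to handle.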
In particular, when the next window arrives at time step t, it becomes the current window, denoted by WB_t. We must update the weights of all sub-classifiers in WC before we use them on WB_t, so that they fit the concept in that window. We first compute and normalize the similarity between the current window and the existing windows in
Algorithm 3.3: Multi-window ensemble learning (MWEL)
Input: Instances D = {(X_0, l_0), (X_1, l_1), ..., (X_n, l_n)} and size limits Mb of WB, maxWC of WC, maxWM of WM
Output: WM, WC, WI, Weight and predicted labels L = l_0, l_1, ..., l_Mb
 1  Initialise: WB = {}, WM = {}, WC = {}, WI = {} and Weight_i = 1, i = 0, 1, ..., maxWC
 2  foreach sequential instance in D do
 3      WB ← getBatchOfInstances(maxWB)
 4      if |WC| is empty then
 5          c ← trainClassifier(WB)
 6          precision_min, precision_maj, errorRate, L ← classify(WB, c)
 7          if precision_min > 0.5 and precision_maj > 0.5 then
 8              WC = WC ∪ c with Algorithm 3.2
 9              Weight_1 = 1
10          else
11              WB′ ← resample instances in WB using Algorithm 3.4
12              WB′ = WB′ ∪ WM
13              c′ ← trainClassifier(WB′)
14              precision_min, precision_maj, errorRate, L ← classify(WB, c′)
15              if precision_min > 0.5 and precision_maj > 0.5 then
16                  WC = WC ∪ c′ with Algorithm 3.2
17                  Weight_1 = 1
18              end
19          end
20      else
21          foreach window of instances in WI do
22              compute Weight_i for each sub-classifier in WC with Eq. 3.4
23          end
24          precision_min, precision_maj, errorRate, L ← classify(WB, WC)
25          if precision_min < 0.5 or precision_maj < 0.5 then
26              WB′ ← resample instances in WB using Algorithm 3.4
27              WB′ = WB′ ∪ WM
28              c′ ← trainClassifier(WB′)
29              precision_min, precision_maj, errorRate, L ← classify(WB, c′)
30              if precision_min > 0.5 and precision_maj > 0.5 then
31                  WC = WC ∪ c′ with Algorithm 3.2
32              end
33          end
34      end
35      WM ← update the WM using Algorithm 3.1
36  end
WI to modify the weights of the sub-classifiers in WC, based on the following observation:
Observation 3.1. The greater the similarity between window W_i and window W_j, the closer the corresponding concepts within them.
Therefore, those sub-classifiers trained on windows more similar to the current window should be given larger weights, as follows.
\[ \mathit{Weight}_i = \frac{\mathit{Weight}_i}{1 - \mathrm{sim}(WB, WI_i) + e}, \quad i = 1, 2, \ldots, \mathit{maxWC} \tag{3.4} \]
Here, Weight_i is the weight of the i-th sub-classifier in WC, sim(WB, WI_i) is the similarity between the current window and WI_i, and e is a small constant. Moreover, this observation can be used to target reoccurring concept drift, because an existing sub-classifier corresponding to a reoccurring concept will obtain a larger weight, which is expected to provide better results.
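The weight update of Eq. 3.4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the similarities are taken as already computed and normalised to the range [0, 1] (the similarity measure itself is not defined in this subsection), and the function name `update_weights` and the default value of `e` are hypothetical.

```python
def update_weights(weights, similarities, e=1e-6):
    """Apply Eq. 3.4 once: Weight_i <- Weight_i / (1 - sim(WB, WI_i) + e).

    weights      -- current weight of each sub-classifier in WC
    similarities -- sim(WB, WI_i) for the window each sub-classifier was
                    trained on (assumed to lie in [0, 1])
    e            -- small constant from Eq. 3.4 (default is an assumption)
    """
    if len(weights) != len(similarities):
        raise ValueError("one similarity is required per sub-classifier")
    return [w / (1.0 - s + e) for w, s in zip(weights, similarities)]


# Usage: the sub-classifier trained on the more similar window gains more weight.
new_weights = update_weights([1.0, 1.0], [0.9, 0.1])
```

With equal starting weights, the sub-classifier whose training window has similarity 0.9 ends up with a weight roughly nine times that of the one with similarity 0.1, which matches the intent of Observation 3.1.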
For an instance Ins_k in WB, the predicted class label is determined by a weighted majority voting scheme. The output of the weighted majority vote can be either a soft label representing the extent to which Ins_k belongs to a class, or a hard value of 0 or 1 indicating whether Ins_k belongs to the class. We adopt the second method, which outputs a hard class label for each instance. Using c_i(Ins_k) to denote the classification of instance Ins_k by the i-th sub-classifier c_i in WC, we have the following formulation:
\[ l_{Ins_k} = \begin{cases} 1, & \sum_{i=1}^{|c|} \mathit{Weight}_i \cdot c_i(Ins_k) > 0.5 \\ 0, & \text{else} \end{cases} \tag{3.5} \]

Algorithm 3.4: Resample
Input: WB, expected imbalance rate γ
Output: sampled instances S
1  truePositive, trueNegative, falsePositive, falseNegative ← classify(WB)
2  while |Minority| / |Majority| < γ do
3      sampleMinority ← SMOTE(falseNegative)
4      sampleMajority ← randomUnderSample(trueNegative)
5  end
6  S = truePositive ∪ falsePositive ∪ sampleMinority ∪ sampleMajority
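The hard-label weighted vote of Eq. 3.5 can be illustrated with the short Python sketch below. It applies the formula literally: each sub-classifier's 0/1 output is multiplied by its weight, and the sum is compared with 0.5. Whether the weights are rescaled to sum to 1 before voting is not stated in this subsection, so the helper `normalise` is an assumption added for the example, as are the function names.

```python
def normalise(weights):
    """Assumed helper: rescale weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def weighted_vote(weights, votes, threshold=0.5):
    """Eq. 3.5: return 1 if sum_i Weight_i * c_i(Ins_k) > threshold, else 0.

    weights -- weight of each sub-classifier in WC
    votes   -- hard 0/1 output c_i(Ins_k) of each sub-classifier
    """
    if len(weights) != len(votes):
        raise ValueError("one vote is required per sub-classifier")
    score = sum(w * v for w, v in zip(weights, votes))
    return 1 if score > threshold else 0


# Usage: three sub-classifiers with raw weights 6, 3 and 1.
weights = normalise([6.0, 3.0, 1.0])
label = weighted_vote(weights, [1, 0, 0])
```

In this usage example the single highest-weighted sub-classifier outvotes the other two, because its normalised weight alone exceeds the 0.5 threshold.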
After classification of the current WB with the existing sub-classifiers in WC, we compare the predicted label with the true label of each instance in WB not merely to judge