2016 International Conference on Electronic Information Technology and Intellectualization (ICEITI 2016) ISBN: 978-1-60595-364-9
Research and Application on Hybrid-weighted
Association Rule Algorithm
Jing Zang, Wu Ju Li, Jin Hao Chen, Li Dong Fu
and Cheng Hua Li
ABSTRACT
To improve the efficiency of locating the root alarm, a new weighted association rule mining algorithm is proposed based on the attributes and time characteristics of alarm information. The extension measure is used to build the alarm information model, and the similarity between alarm matter-element cases is then calculated with the correlation function. Furthermore, a hybrid-weighted method that combines horizontal and vertical weights is introduced to mine association rules from the alarm cases. A simulation experiment was carried out on a synthetic data set. Compared with the horizontal-weighted algorithm, the hybrid-weighted algorithm effectively reduces the generation of redundant rules.
INTRODUCTION
Many researchers have studied alarm correlation analysis using data mining methods [1]. Mining algorithms without weights [2] generate a certain number of redundant rules, and the horizontal-weighted algorithm [3] does not consider the time characteristics of alarms. To address these disadvantages, a new weighted algorithm is proposed based on the characteristics of network alarms.
_________________________
Jing Zang, Wuju Li, Jinhao Chen, Lidong Fu, College of Information Science and Engineering, Shenyang Ligong University, Liaoning, China
ALARM SIMILARITY MEASUREMENT
Extension Matter Element Model
Based on the extension matter-element model [4], several attributes of the original alarm information are extracted according to the ITU-T X.733 standard [5]. The formal representation of the original alarm information is given by formula (1):
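$$R = \begin{bmatrix} I & \text{Alarm node} & v_1 \\ & \text{Device port} & v_2 \\ & \text{Alarm type} & v_3 \\ & \text{Alarm level} & v_4 \\ & \text{Alarm time} & v_5 \\ & \text{Alarm count} & v_6 \end{bmatrix} \tag{1}$$
Similarity Measurement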
Historical alarm cases are divided into root alarms and secondary alarms (sub-alarms).
The correlation function is used to calculate the similarity between the attributes of the root alarm and those of the sub-alarms, as shown in formula (2).
$$k(x) = \begin{cases} \dfrac{1}{x - M + 1}, & x > M \\ 1, & x = M \\ \dfrac{1}{\sqrt{M - x + 1}}, & x < M \end{cases} \tag{2}$$
M and x denote the attribute values of the root alarm and the sub-alarm, respectively.
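The correlation function can be illustrated with a minimal Python sketch. It assumes one piecewise form, 1/(x - M + 1) above the root value, 1 at the root value, and 1/sqrt(M - x + 1) below it, which reproduces the similarity vectors listed in Table I. The root-alarm values in `ROOT` are hypothetical, inferred from that table rather than stated in the paper, and the function name `correlation` is illustrative.

```python
import math


def correlation(x: float, m: float) -> float:
    """Similarity of a sub-alarm attribute value x to the root value m.

    Assumed piecewise form (not confirmed by the paper's text).
    """
    if x > m:
        return 1.0 / (x - m + 1.0)
    if x < m:
        return 1.0 / math.sqrt(m - x + 1.0)
    return 1.0


# Hypothetical root-alarm attributes, in the column order of Table I:
# node number, alarm link, device type, alarm level, alarm count.
ROOT = (4, 7, 3, 2, 1)

# Attribute values of case I1 from Table I.
case_1 = (9, 13, 4, 5, 5)

similarity = [round(correlation(x, m), 2) for x, m in zip(case_1, ROOT)]
print(similarity)
```

Under these assumptions the printed vector agrees with the similarity vector given for case I1 in Table I.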
WEIGHTED ASSOCIATION RULES ALGORITHM
The Horizontal Weighted Algorithm On Similarity
The new transaction horizontal-weighted algorithm is given by formula (3). The horizontal weight of case-set X is denoted by h_x, and n is the number of cases in the case-set X. The support degree of the case-set is denoted by sup(X). The horizontal-weighted support degree of the case-set X is given by formula (4).
$$h_i = \frac{1}{|I_i|}\sum_{j=1}^{|I_i|} m_{ij}, \qquad h_x = \frac{1}{n}\sum_{I_i \in X} h_i \tag{3}$$
$$\mathrm{wsup}(X) = h_x \cdot \sup(X) \tag{4}$$
Here I_i is an alarm case in I, |I_i| denotes the number of attributes in the transaction, and the similarity vector of the alarm case I_i is denoted by M_i = (m_i1, m_i2, ..., m_ij).
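As a small worked example of the horizontal-weighted support, the sketch below takes the h_i values directly from Table I and, as an assumption, treats h_x as the mean of the h_i of the cases in X. The transaction list reuses the first time window of Table II purely for illustration; the function names are hypothetical.

```python
# Horizontal weights h_i copied from Table I.
H = {"I1": 0.21, "I2": 0.30, "I3": 0.17, "I4": 0.23, "I5": 0.35, "I6": 0.17}

# Illustrative transactions: the first time window of Table II.
TRANSACTIONS = [
    {"I1", "I2", "I3", "I5"},
    {"I5", "I3", "I6"},
    {"I6", "I4", "I5", "I2"},
    {"I2", "I3", "I5"},
]


def sup(case_set):
    """Fraction of transactions that contain every case in case_set."""
    hits = sum(1 for t in TRANSACTIONS if case_set <= t)
    return hits / len(TRANSACTIONS)


def horizontal_weight(case_set):
    """h_x, assumed here to be the mean of h_i over the cases in X."""
    return sum(H[c] for c in case_set) / len(case_set)


def wsup(case_set):
    """Horizontal-weighted support: wsup(X) = h_x * sup(X)."""
    return horizontal_weight(case_set) * sup(case_set)


x = {"I3", "I5"}
print(horizontal_weight(x), sup(x), wsup(x))
```

For X = {I3, I5}, three of the four transactions contain both cases, so sup(X) = 0.75, h_x = 0.26 and wsup(X) is about 0.195.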
Hybrid-Weighted Algorithm
The hybrid-weighted algorithm combines the horizontal and vertical weights [6]. (1) The vertical-weighted support degree is denoted by supp_v(X) and is given by formula (5).
$$\mathrm{supp}_v(X) = \frac{1}{N_v}\sum_{i=1}^{n} v_i \cdot \mathrm{Count}(X_i), \qquad N_v = \sum_{i=1}^{n} v_i N_i \tag{5}$$
The time of a day is divided into n intervals t = {t_1, t_2, ..., t_n}, where t_i denotes a time interval and v_i is its vertical weight. Count(X_i) denotes the number of transactions containing X in time interval t_i. N_v denotes the total number of transactions weighted by the vertical weights, and N_i denotes the total number of transactions in the interval with weight v_i.
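The vertical-weighted support can be sketched in Python as follows. The sketch assumes that the V-weight column of Table II holds the interval weights v_i and that formula (5) has the form supp_v(X) = (sum of v_i * Count(X_i)) / (sum of v_i * N_i); the names `WINDOWS` and `supp_v` are illustrative.

```python
# Time windows from Table II: (vertical weight v_i, transactions in window t_i).
WINDOWS = [
    (0.5, ["I1 I2 I3 I5", "I5 I3 I6", "I6 I4 I5 I2", "I2 I3 I5"]),
    (0.3, ["I5 I4 I6 I1", "I1 I2 I3", "I3 I4 I5 I6", "I6 I1"]),
    (0.7, ["I1 I3 I4", "I4 I3 I5", "I5 I6", "I6"]),
    (0.2, ["I6 I5", "I5 I4 I2", "I3 I5 I2", "I2 I1"]),
]
DB = [(v, [frozenset(t.split()) for t in txs]) for v, txs in WINDOWS]


def supp_v(case_set):
    """Vertical-weighted support of a case-set under the assumed form of Eq. (5)."""
    x = frozenset(case_set)
    # Numerator: each window's count of transactions containing X, scaled by v_i.
    weighted_hits = sum(v * sum(1 for t in txs if x <= t) for v, txs in DB)
    # Denominator N_v: each window's transaction count N_i, scaled by v_i.
    n_v = sum(v * len(txs) for v, txs in DB)
    return weighted_hits / n_v


print(round(supp_v({"I5"}), 3))
```

For X = {I5} the four windows contain 4, 2, 2 and 3 matching transactions, giving a weighted count of 4.6 against N_v = 6.8.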
(2) The hybrid-weighted support degree of a case-set is given by formula (6).
$$\mathrm{msup}(X) = h_x \cdot \mathrm{supp}_v(X) \tag{6}$$
Frequent Caseset
Step 1 Part of the alarm cases are represented as extension matter-elements by Eq. (1). Eq. (2) is used to calculate the similarity and Eq. (3) is used to calculate the horizontal weight; the results are shown in Table I.
Step 2 The transaction database D is generated with the time-window method, and the corresponding vertical weight is added, as described in Table II.
Step 3 Enter the minimum weighted support degree and the minimum confidence, scan the database D, and calculate the vertical-weighted support of each case by Eq. (5). If the vertical-weighted support degree of a case is less than the minimum weighted support degree, the case is removed; the remaining cases form the super-case set.
Step 4 The vertical-weighted support of each case in the super-case set is multiplied by the horizontal weight to obtain the hybrid-weighted support degree by Eq. (6). If the hybrid-weighted support degree is less than the minimum weighted support degree, the case is removed, which gives the weighted frequent 1-caseset.
Step 5 The weighted frequent 2-caseset is generated; a case-set is removed if its hybrid-weighted support is less than the minimum hybrid-weighted support.
Step 6 Repeat Step 5 until all frequent case-sets are obtained.
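The steps above can be sketched as a level-wise search in Python. This is a simplified illustration under stated assumptions, not the authors' implementation: h_x is taken as the mean of the Table I weights of the cases in X, the vertical weights and transactions come from Table II, candidate k-casesets are formed from all combinations of the cases that survived the previous level, and a single threshold `min_wsup` is used for every pruning step. Confidence filtering and rule generation are omitted.

```python
from itertools import combinations

# Table II: (vertical weight, transactions in the time window).
WINDOWS = [
    (0.5, ["I1 I2 I3 I5", "I5 I3 I6", "I6 I4 I5 I2", "I2 I3 I5"]),
    (0.3, ["I5 I4 I6 I1", "I1 I2 I3", "I3 I4 I5 I6", "I6 I1"]),
    (0.7, ["I1 I3 I4", "I4 I3 I5", "I5 I6", "I6"]),
    (0.2, ["I6 I5", "I5 I4 I2", "I3 I5 I2", "I2 I1"]),
]
DB = [(v, [frozenset(t.split()) for t in txs]) for v, txs in WINDOWS]

# Table I: horizontal weight h_i of each alarm case.
H = {"I1": 0.21, "I2": 0.30, "I3": 0.17, "I4": 0.23, "I5": 0.35, "I6": 0.17}


def supp_v(case_set):
    """Vertical-weighted support, Eq. (5) in its assumed form."""
    x = frozenset(case_set)
    hits = sum(v * sum(1 for t in txs if x <= t) for v, txs in DB)
    return hits / sum(v * len(txs) for v, txs in DB)


def msup(case_set):
    """Hybrid-weighted support, Eq. (6): h_x * supp_v(X)."""
    h_x = sum(H[c] for c in case_set) / len(case_set)
    return h_x * supp_v(case_set)


def mine(min_wsup):
    """Return {frequent case-set: hybrid-weighted support}."""
    # Step 3: keep cases whose vertical-weighted support reaches the threshold.
    super_cases = [c for c in sorted(H) if supp_v([c]) >= min_wsup]
    # Step 4: keep cases whose hybrid-weighted support reaches the threshold.
    level = [frozenset([c]) for c in super_cases if msup([c]) >= min_wsup]
    frequent = {}
    k = 1
    # Steps 5 and 6: grow case-sets one case at a time until none survive.
    while level:
        frequent.update({x: msup(x) for x in level})
        k += 1
        items = sorted(set().union(*level))
        level = [
            frozenset(cand)
            for cand in combinations(items, k)
            if msup(cand) >= min_wsup
        ]
    return frequent


result = mine(0.1)
for case_set, score in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(case_set), round(score, 3))
```

With a threshold of 0.1 on this small sample, all six cases pass the vertical-weighted check, only I2 and I5 pass the hybrid-weighted check, and the candidate {I2, I5} is pruned at the second level.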
THE ALGORITHM SIMULATION AND ANALYSIS
In the first experiment, 10000 records are generated by the data synthesizer, and the change in running time is observed as the minimum support degree varies.
The hybrid-weighted support degree algorithm gives full consideration to each attribute, and its running time is lower than that of MINWAL(0), as shown in Fig. 1.
The second experiment is shown in Fig. 2. Because the hybrid-weighted algorithm takes both the vertical and horizontal weights into account, some important but infrequent case-sets can be mined, which reduces the loss of effective case-sets and the generation of invalid case-sets compared with MINWAL(0).
TABLE I. PART OF THE ALARM CASES AND THEIR HORIZONTAL WEIGHTS.

| Case | Node number | Alarm link | Device type | Alarm level | Alarm count | Similarity vector M | H-weight (h_i) |
|------|-------------|------------|-------------|-------------|-------------|---------------------|----------------|
| I1 | 9 | 13 | 4 | 5 | 5 | (0.17, 0.14, 0.50, 0.25, 0.20) | 0.21 |
| I2 | 3 | 9 | 1 | 4 | 10 | (0.70, 0.33, 0.58, 0.33, 0.10) | 0.30 |
| I3 | 10 | 17 | 5 | 3 | 7 | (0.14, 0.09, 0.33, 0.50, 0.14) | 0.17 |
| I4 | 7 | 11 | 2 | 4 | 8 | (0.25, 0.20, 0.71, 0.33, 0.10) | 0.23 |
| I5 | 6 | 12 | 3 | 2 | 6 | (0.33, 0.17, 1.00, 1.00, 0.17) | 0.35 |
| I6 | 8 | 15 | 1 | 3 | 11 | (0.20, 0.11, 0.58, 0.50, 0.09) | 0.17 |
TABLE II. VERTICAL WEIGHTS OF THE ALARM CASES.

| Line | Time interval | Alarm cases | V-weight (suppv(X)) |
|------|---------------|-------------|---------------------|
| 1 | 9:00:00-9:00:20 | I1I2I3I5, I5I3I6, I6I4I5I2, I2I3I5 | 0.5 |
| 2 | 9:00:21-9:00:40 | I5I4I6I1, I1I2I3, I3I4I5I6, I6I1 | 0.3 |
| 3 | 9:00:41-9:01:00 | I1I3I4, I4I3I5, I5I6, I6 | 0.7 |
| 4 | 9:01:01-9:01:20 | I6I5, I5I4I2, I3I5I2, I2I1 | 0.2 |
CONCLUSIONS
1. The historical alarm data are represented by the extension matter-element model, which extracts the key attributes and reduces both the amount of alarm data in the data source and the run time.
2. The correlation function is used to compute the horizontal weight, which reduces the impact of subjective factors on the horizontal-weight set.
3. The hybrid-weighted method is used for mining association rules, which reduces the generation of redundant rules, avoids the loss of some useful association rules, and improves the accuracy of fault location.
ACKNOWLEDGEMENTS
This research was supported by the Public Research Funds of Liaoning (Project No. 2014002006) and the Key Disciplines and Key Laboratory Open Fund of Shenyang Ligong University (Project No. 4771004kfx14).
REFERENCES
1. Tong-Yan Li, Xing-Ming Li. 2011. “Preprocessing Expert System for Mining Association Rules in Telecommunication Networks”, Expert Systems with Applications. 38: pp. 1709–1715.
2. Jin-feng Li, Huai-bin Wang. 2012. “Network Fault Alarm Correlation Analysis Based on Association Rule”. Computer Engineering, 35(5): pp. 44-46.
3. Yuan-cha Liu. “Research on Alarm Correlation Application in Communication Networks”. Tianjin University of Technology. 2012.
4. Yong-xiu He, Ai-ying Dai, Jiang Zhu, Hai-ying He, Fu-rong Li. 2011. “Risk Assessment of Urban Network Planning in China Based on the Matter-element Model and Extension Analysis”. Electrical Power and Energy Systems, 33(1): pp. 775–782.
5. Xiao-jie Leng. 2013. “Multi-domain Distributed Communication Network Fault Diagnosis Based on Alarm Fuzzy Association Rules Parallel Mining”. University of Electronic Science and Technology.