The practical evaluation covers motifs of both similar and variable lengths. Several quality measures are used to analyse the results obtained in the experiments.
The usefulness of the proposed method is demonstrated on data sets from various domains and with different properties, such as the amount of noise, the length of the time series, and the length of the motif.
Moreover, four commonly used motif discovery algorithms, namely Mr. Motif [8], Enumeration motif [5], Mueen-Keogh (MK) [24], and the grammar-based method [6], are applied as benchmarks to evaluate the proposed approach.
Quality Measures. A motif is perfect when it matches all the subsequences in the target class/group and no subsequences outside that class. Several quality measures are used to examine the result of the proposed motif discovery algorithm. Matching a motif against a subsequence has four possible outcomes: True Positive (TP), False Negative (FN), True Negative (TN), and False Positive (FP), which can be counted for each class label. Further quality measures are derived from these four counts. Sensitivity, $Sn = \frac{TP}{TP + FN}$, measures the proportion of subsequences of the target class that are correctly matched by the motif. Specificity, $Sp = \frac{TN}{TN + FP}$, is the proportion of subsequences outside the target class that are not matched by the motif. Precision, $Pr = \frac{TP}{TP + FP}$, is the fraction of the subsequences matched by the motif that belong to the target class. The F-Measure, $F\text{-}M = 2 \cdot \frac{Pr \cdot Sn}{Pr + Sn}$, considers both precision and sensitivity. Additionally, we obtain the correct motif discovery rate $CR = \frac{M^{+}}{M}$, where $M^{+}$ is the number of correctly detected motifs and $M$ is the number of all motifs.
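As a minimal sketch of how these measures can be computed, the following Python code counts the four outcomes for one target class and derives Sn, Sp, Pr, F-M, and CR from them. The function names, the boolean `matched` list, and the convention of returning 0 when a denominator is empty are assumptions made for illustration; they are not taken from the implementation evaluated in this paper.

```python
def ratio(numerator, denominator):
    """Divide safely; an empty denominator yields 0.0 (illustrative convention)."""
    return numerator / denominator if denominator else 0.0


def confusion_counts(true_labels, matched, target):
    """Count (TP, FN, TN, FP) for the motif that represents class `target`.

    true_labels: class label of every subsequence.
    matched:     True where the motif matched the corresponding subsequence.
    """
    tp = fn = tn = fp = 0
    for label, hit in zip(true_labels, matched):
        if label == target:
            if hit:
                tp += 1  # target subsequence, matched
            else:
                fn += 1  # target subsequence, missed
        else:
            if hit:
                fp += 1  # foreign subsequence, wrongly matched
            else:
                tn += 1  # foreign subsequence, correctly ignored
    return tp, fn, tn, fp


def quality_measures(tp, fn, tn, fp):
    """Derive sensitivity, specificity, precision and F-measure from the counts."""
    sn = ratio(tp, tp + fn)
    sp = ratio(tn, tn + fp)
    pr = ratio(tp, tp + fp)
    fm = ratio(2 * pr * sn, pr + sn)
    return {"Sn": sn, "Sp": sp, "Pr": pr, "F-M": fm}


def correct_discovery_rate(num_correct, num_total):
    """CR: correctly detected motifs divided by all motifs."""
    return ratio(num_correct, num_total)


if __name__ == "__main__":
    # Toy data: four subsequences per class, one miss and one false match.
    labels = ["draw"] * 4 + ["point"] * 4
    hits = [True, True, True, False, False, False, True, False]
    print(quality_measures(*confusion_counts(labels, hits, "draw")))
    print(correct_discovery_rate(9, 10))
```

Since the counts are obtained per class label, one would call `confusion_counts` once for each class. How the per-class values are combined into a single table entry is not stated here, so any averaging step is left out of the sketch.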
Test Cases. The Gun data set [25] is gathered from the video surveillance domain. Two types of motifs are included in the data set: gun draw and gun point. For each motif, 100 examples are considered.
The Swedish Leaf data set [25] comes from a project at Linköping University. The data set consists of leaves from 15 different Swedish trees. In our experiments, only 3 types of leaves are considered as motifs.
The AutoSense data is gathered from the ongoing research project "Adaptive energy self-sufficient sensor network for monitoring safety-critical self-service-systems" [26]. The project focuses on monitoring security-critical systems, e.g., identifying criminal attacks on automated teller machines (ATMs). After several relevant attacks on ATMs were identified, these attacks were tested and the results were gathered from sensors connected to the system. The data consists of 24 signals of different lengths, gathered from 8 sensors in 3 different experiments performed on an ATM.
Results. Based on the quality measures described above, the results of the proposed method and of the benchmark methods for motifs of equal length are given in Tab. 1-3.
The MK algorithm [24] detects only a single pair of motifs of similar length, so it is useful when the data contains one motif/pattern. However, the data sets tested here contain motifs/patterns in more than one class. The grammar-based method is also able to identify most of the motifs of equal length, although one needs to find the best parameter combination. For these reasons, only the results of Mr. Motif and Enum. Motif are comparable with our results.

Table 1: Evaluation results of equal length motifs in Gun Data [25]. Sn: Sensitivity, Sp: Specificity, Pr: Precision, F-M: F-Measure, CR: Correct motif discovery rate

| Method | Sn (%) | Sp (%) | Pr (%) | F-M (%) | CR (%) |
|---|---|---|---|---|---|
| SIMD | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 |
| Mr. Motif | 70.0 | 70.0 | 70.0 | 70.0 | 75.0 |
| Enum. Motif | 42.5 | 65.0 | 54.8 | 47.9 | 50.0 |

Table 2: Evaluation results of equal length motifs in Swedish Leaf [25]. Sn: Sensitivity, Sp: Specificity, Pr: Precision, F-M: F-Measure, CR: Correct motif discovery rate

| Method | Sn (%) | Sp (%) | Pr (%) | F-M (%) | CR (%) |
|---|---|---|---|---|---|
| SIMD | 94.9 | 96.8 | 93.3 | 94.1 | 95.0 |
| Mr. Motif | 54.2 | 95.8 | 86.4 | 66.6 | 73.3 |
| Enum. Motif | 55.9 | 85.9 | 66.0 | 55.9 | 55.0 |

Table 3: Evaluation results of equal length motifs in AutoSense [26]. Sn: Sensitivity, Sp: Specificity, Pr: Precision, F-M: F-Measure, CR: Correct motif discovery rate

| Method | Sn (%) | Sp (%) | Pr (%) | F-M (%) | CR (%) |
|---|---|---|---|---|---|
| SIMD | 80.8 | 97.4 | 81.9 | 81.4 | 80.0 |
| Mr. Motif | 25.0 | 95.2 | 42.8 | 31.5 | 54.1 |
| Enum. Motif | 62.5 | 95.7 | 65.2 | 63.8 | 62.5 |
As shown, the SIMD method provides better results than the other algorithms in all three tables. The specificity is high in every table because of the large number of TN. The precision of SIMD is also higher than that of the other methods, which indicates a larger share of correctly found motifs. The F-measure, taken as the overall accuracy measure, is higher for the proposed method than for the others.
Table 4: Evaluation results of the SIMD method for variable length motifs. Sn: Sensitivity, Sp: Specificity, Pr: Precision, F-M: F-Measure, CR: Correct motif discovery rate

| Data | Sn (%) | Sp (%) | Pr (%) | F-M (%) | CR (%) |
|---|---|---|---|---|---|
| Gun | 72.5 | 72.5 | 72.5 | 72.5 | 75.0 |
| Swedish Leaf | 94.9 | 95.8 | 91.8 | 93.3 | 93.3 |
| AutoSense | 83.3 | 97.6 | 83.3 | 83.3 | 73.0 |
In most cases, the Mr. Motif algorithm also provides reasonable results. However, for the AutoSense data its results are not acceptable. This is due to the representation method applied in Mr. Motif, which is unable to handle noise in the data.
Additionally, in the second experiment the aforementioned methods are tested on finding motifs of variable lengths. However, only SIMD is able to discover such motifs. Consequently, we cannot compare our method with the other algorithms in the case of variable-length motifs. The results of the proposed method for motifs of variable lengths are given in Tab. 4. As shown, only the proposed method detects motifs of variable length, which supports the effectiveness of our algorithm. It should be mentioned that these methods were examined against more data sets in [25], but due to lack of space only the results for three data sets are given in this paper.