Prediction Model for Web Caching and Prefetching with Web Usage Mining to optimize web objects 99
CHAPTER-7
EXPERIMENTS AND TEST RESULTS FOR PROPOSED PREDICTION MODEL
This chapter covers all experiments conducted during the current research. The prediction
model for web caching and prefetching consists of three main phases: preprocessing,
sessionization, and pattern discovery and analysis. This chapter discusses the experiments
and associated results for each phase. Different tools and methods are used in the
proposed research for the different phases.
7.1 Preprocessing Experiments and Results
The preprocessing phase is tested both with the approach of the current research and with
approaches from past research by other authors, and the two are then compared. A number of
tests are conducted in this phase, as described below:
(1) Preprocessing Test-1:-
Test Description: - Parse the raw log file into the appropriate fields of the W3C Extended
format. The raw log file is available at the following path on the personal computer:
E:\Dharmendra\logexample\iis.log.
Result: - A sample of the result of this test is given in Table 7.1.
Result Analysis: - The result of this test meets the requirements of the proposed
research and can be used for further processing. A total of 5000 rows are affected
by this test.
Query used: - In Microsoft Log Parser, the appropriate environment has to be set up to
execute a query based on the type of log data.
Select * from e:\dharmendra\logexample\iis.log;
A snapshot of the Microsoft Visual Log Parser tool for Test-1 is shown in Figure 7.1.
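The parsing step of Test-1 can also be illustrated outside Log Parser. The following is a minimal Python sketch, not the tool used in this research: it assumes the log carries the standard W3C Extended `#Fields:` directive, and the function name `parse_w3c_log` and all sample lines are hypothetical.

```python
def parse_w3c_log(lines):
    """Parse W3C Extended log lines into dicts keyed by the #Fields names."""
    fields = []
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The #Fields directive names the columns of the entries that follow.
            fields = line[len("#Fields:"):].split()
        elif line.startswith("#"):
            continue  # other directives such as #Software or #Date
        elif fields:
            values = line.split()
            if len(values) == len(fields):
                rows.append(dict(zip(fields, values)))
    return rows


# Hypothetical sample lines, for illustration only.
sample = [
    "#Software: Microsoft Internet Information Services",
    "#Fields: date time c-ip cs-uri-stem sc-status time-taken",
    "2010-01-01 10:00:00 10.0.0.1 /index.htm 200 120",
    "2010-01-01 10:00:05 10.0.0.2 /logo.gif 304 15",
]
rows = parse_w3c_log(sample)
print(rows[0]["cs-uri-stem"], rows[0]["sc-status"])
```

Each parsed row is a dictionary, so later filtering steps can refer to fields such as `cs-uri-stem` and `sc-status` by name.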
(2) Preprocessing Test-2:-
Test Description: - Remove unnecessary web objects accessed by users.
Result: - A sample of the result of this test is given in Figure 7.2.
Result Analysis: - The result is as expected and can be used for further
processing. A total of 2990 rows are affected by the query below, from a raw log file
covering 15 days of transactions.
Query used: - The following query is executed to get the result.
select LogFilename,date,time,c-ip,s-ip,cs-uri-stem,sc-status,time-taken from
'%.avi' and time-taken >= 600000) and (sc-status=200 or sc-status=304 or sc-status=306))
or ((cs-uri-stem like '%.dat' and time-taken >= 600000) and (sc-status=200 or sc-status=304 or sc-status=306))
Figure 7.3 shows a snapshot of the tool with the query and result of Test-2.
Table 7.1 Log Data in W3C Field Format
Log File Name Row Date Time C-ip s-site s-computer s-ip
(Figure 7.2 Filtered Log Entries)
[3] Preprocessing Test-3:-
Result Analysis: - The result is as expected and can be used for further
processing. A total of 490 unique web objects are found among the 2990 web objects.
Query used: - The following query is executed to get the result.
select distinct cs-uri-stem, count(cs-uri-stem) from e:\dharmendra\logexample\iis.log
where (cs-uri-stem like '%.htm' and (sc-status=200 or sc-status=304 or sc-status=306))
or (cs-uri-stem like '%.asp' and (sc-status=200 or sc-status=304 or sc-status=306))
or ((cs-uri-stem like '%.jpg' and time-taken >= 360000) and (sc-status=200 or sc-status=304 or sc-status=306))
or ((cs-uri-stem like '%.gif' and time-taken >= 360000) and (sc-status=200 or sc-status=304 or sc-status=306))
or ((cs-uri-stem like '%.avi' and time-taken >= 600000) and (sc-status=200 or sc-status=304 or sc-status=306))
or ((cs-uri-stem like '%.dat' and time-taken >= 600000) and (sc-status=200 or sc-status=304 or sc-status=306))
group by cs-uri-stem
(Figure 7.3 Snapshot of Microsoft Visual Log Parser Tool for Test-2)
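For readers without Log Parser, the filtering and grouping logic of the Test-3 query can be sketched in Python. This is an illustrative sketch only: the status codes and time-taken limits are copied from the query above, while the function names, the exact-suffix matching, and the sample records are assumptions.

```python
from collections import Counter

OK_STATUS = {200, 304, 306}
# Extension -> minimum time-taken; 0 means the query sets no time condition.
MIN_TIME = {".htm": 0, ".asp": 0, ".jpg": 360000,
            ".gif": 360000, ".avi": 600000, ".dat": 600000}


def keep(record):
    """Return True when a log record satisfies the where-clause of the query."""
    if int(record["sc-status"]) not in OK_STATUS:
        return False
    for ext, min_time in MIN_TIME.items():
        if record["cs-uri-stem"].endswith(ext):
            return int(record["time-taken"]) >= min_time
    return False


def hit_counts(records):
    """Count hits per web object, like count(cs-uri-stem) with group by."""
    return Counter(r["cs-uri-stem"] for r in records if keep(r))


# Hypothetical records, for illustration only.
records = [
    {"cs-uri-stem": "/index.htm", "sc-status": "200", "time-taken": "120"},
    {"cs-uri-stem": "/index.htm", "sc-status": "304", "time-taken": "80"},
    {"cs-uri-stem": "/logo.gif", "sc-status": "200", "time-taken": "15"},
    {"cs-uri-stem": "/clip.avi", "sc-status": "200", "time-taken": "700000"},
    {"cs-uri-stem": "/old.asp", "sc-status": "404", "time-taken": "30"},
]
counts = hit_counts(records)
print(counts)
```

In this sample the image request is dropped because its time-taken value is below the limit, and the `.asp` request is dropped because of its status code.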
[4] Preprocessing Test-4:-
Test Description: - Remove web objects that do not fulfill the threshold condition.
Result: - A sample of the result of this test is given in Figure 7.5.
Result Analysis: - A total of 120 rows are retrieved by this test; these fulfill the
threshold condition.
Query used: - For Test-4, Microsoft Excel is used. The following steps are
carried out:
(i) First, the MAX function is applied to the data generated by Test-3 to calculate
the highest hit rate. The highest hit rate in the data is 62.
=MAX(A1:A491)
(ii) The threshold value is derived as 10% of the highest hit rate.
=(62*0.10)
(iii) The Advanced Filter feature is used to keep only those records that fulfill
the threshold condition.
(iv) Lastly, the records are arranged in descending order of hit rate using the sorting
feature of Microsoft Excel.
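The four Excel steps above can be sketched in Python as follows. This is a hedged illustration rather than the procedure actually run: the function name, the sample hit counts, and the choice of "greater than or equal to" as the threshold condition are assumptions; only the 10% factor and the maximum of 62 come from the text.

```python
def filter_by_threshold(hit_counts, fraction=0.10):
    """Keep objects whose hit count reaches fraction * highest hit count."""
    highest = max(hit_counts.values())          # step (i): MAX
    threshold = highest * fraction              # step (ii): 10% of the maximum
    kept = {obj: n for obj, n in hit_counts.items() if n >= threshold}  # step (iii)
    ranked = sorted(kept.items(), key=lambda kv: kv[1], reverse=True)   # step (iv)
    return threshold, ranked


# Hypothetical hit counts; only the maximum of 62 is taken from the text.
hits = {"/a.htm": 62, "/b.htm": 7, "/c.gif": 6, "/d.asp": 30}
threshold, ranked = filter_by_threshold(hits)
print(threshold, ranked)
```

With a maximum of 62 the threshold is 6.2, so an object with 6 hits is removed while one with 7 hits is kept.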
is analyzed how important binary objects are in the process of preprocessing. During the
preprocessing stage, one test is carried out to decide the threshold value of binary objects
such as audio and video.
[5] Preprocessing Test-5:-
Test Description: - Decide the threshold value of image and video files.
Tool used: - An online tool is used to determine the load time of images and videos.
Reference: http://www.numion.com/calculators/time.html
(Figure 7.6 Preprocessing Accuracy: clean accuracy (%) by model; Proposed Model 97.6, Other Model 99.4)
(Figure 7.7 Proportion of Text Objects and Binary Objects: number of objects after each test)
Text Objects: 1114 after Test 2, 132 after Test 3, 27 after Test 4
Binary Objects: 1876 after Test 2, 358 after Test 3, 93 after Test 4
7.2 Sessionization Experiments and Results
Table 7.2 Sessionization Result
Total Users: 109
Total Unique IP: 57
Total Sessions: 167
7.3 Pattern Discovery Experiments and Results
In the proposed research, pattern discovery is performed with both the Markov model and the
proposed model. The Markov model accepts web sessions as input and outputs a number of
predicted web objects, depending on the order of the model. A number of tests are
carried out to generate the appropriate output based on the Markov model.
7.3.1 Pattern Discovery Experiments based on Markov Model
[1] Markov Test-1:-
Test Description: - Generate the occurrence matrix, which records how often a
particular web object follows the current state.
Result: - The occurrence matrix is generated (refer to Table 5.3).
Tools Used: - Microsoft Excel is used for this experiment. A macro is
written to count the occurrences.
Macro Code: - The following code is used.
Sub Occurence1()
    Dim c As Long
    Dim r As Long
    Dim i As Long
    Dim colval As Long
    Dim max_col As Long
    Dim max_row As Long
    max_row = Sheet1.UsedRange.Rows.Count
    max_col = Sheet1.UsedRange.Columns.Count
    Dim values(50, 50) As Integer
    ' Count each transition from the object in column c to the object in column c + 1
    For r = 1 To max_row
        For c = 2 To max_col - 1
            If (Sheet1.Cells(r, c) <> Sheet1.Cells(r, c + 1)) Then
                values(Sheet1.Cells(r, c).Value, Sheet1.Cells(r, c + 1).Value) = _
                    values(Sheet1.Cells(r, c).Value, Sheet1.Cells(r, c + 1).Value) + 1
            End If
        Next c
    Next r
    ' Write the counts to the sheet. The header of this outer loop is missing in the
    ' source; the loop bounds and the reset of colval are assumptions.
    For i = 1 To 50
        colval = 0
        For c = 1 To max_col
            Sheet1.Cells(i, colval + 1).Value = values(i, c)
            colval = colval + 1
        Next c
    Next i
End Sub
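The counting logic of the macro can be restated compactly in Python. This is a sketch under assumptions, not the code used in the experiment: each session is assumed to be an ordered list of web object numbers, and the function name and sample sessions are hypothetical.

```python
from collections import defaultdict


def occurrence_matrix(sessions):
    """Count how often each web object is followed by another within a session."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for current, following in zip(session, session[1:]):
            if current != following:  # like the macro, skip repeated objects
                counts[current][following] += 1
    return counts


# Hypothetical sessions, for illustration only.
sessions = [[1, 2, 3], [1, 2, 2, 4], [2, 3]]
counts = occurrence_matrix(sessions)
print({state: dict(row) for state, row in counts.items()})
```

A nested dictionary is used instead of a fixed 50 x 50 array so that the number of web objects does not need to be known in advance.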
[2] Markov Test-2:-
Test Description: - Generate the transition probability matrix based on the current state.
To generate the transition probability matrix, a number of tests are carried out.
(a) Test 1:- Determine the sum of the occurrences from the current state
to all other states.
Tools Used:- Microsoft Excel
Query: - SUM(X:Y), where X and Y are cell references.
Result: - It generates the total number of occurrences from the current state to all
other states.
(b) Test 2:- Generate the transition probability from the current state to all other
states.
Tools Used:- Microsoft Excel
Query: - SUM(X:Y)/N, where N is the sum generated in
Test 1.
Result: - It generates the transition probability value of every cell, from
one cell to another.
(c) Test 3:- Determine the maximum transition probability in order to
predict the next web object.
Tools Used:- Microsoft Excel
Query: - MAX(X:Y)
Result: - Prediction of the next web object.
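The three sub-tests above correspond to a first-order Markov prediction, which can be sketched in Python as follows. The sketch assumes that each count is divided by the row sum of its current state; the function names and the sample counts are hypothetical.

```python
def transition_probabilities(counts):
    """Divide every occurrence count by the row sum of its current state."""
    probs = {}
    for state, row in counts.items():
        total = sum(row.values())          # Test 1: row sum
        if total:
            probs[state] = {nxt: n / total for nxt, n in row.items()}  # Test 2
    return probs


def predict_next(probs, state):
    """Return the next web object with the highest transition probability."""
    row = probs.get(state)
    if not row:
        return None
    return max(row, key=row.get)           # Test 3: maximum of the row


# Hypothetical occurrence counts, for illustration only.
counts = {1: {2: 2}, 2: {3: 2, 4: 1}}
probs = transition_probabilities(counts)
print(predict_next(probs, 2))
```

From state 2 the sample counts give probabilities of 2/3 for object 3 and 1/3 for object 4, so object 3 is predicted.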
(Figure 7.8 Prediction Accuracy of Markov Orders: accuracy (%) against Markov chain order, first through tenth)
Table 7.3 Markov Hit Ratio
Markov Chain | Hit Ratio
(Figure 7.9 Markov Model Hit Ratio: hit ratio against Markov chain order, first through tenth)
7.3.2 Pattern Discovery Experiments based on Proposed Model
In the proposed model, pattern discovery is based on the appropriate formation of web sessions.
A new approach to forming web sessions is introduced in the proposed research: web sessions
are grouped based on distance measurement techniques. The proposed research identifies
several distance measurement techniques relevant to web caching and prefetching. A number
of experiments are conducted for each distance measurement technique.
7.3.2.1 Experiments on Levenshtein Distance Measurement technique
[1] Levenshtein Test -1
Test Description: - Determine the distance measure between web sessions according
to the Levenshtein distance measurement technique.
Tool used: - An online tool is used to determine the distance measure between web
sessions. Reference: http://asecuritysite.com/forensics/simstring
Results: - A matrix of distance values is generated as a result of this test.
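To make the measure concrete, the following Python sketch computes the Levenshtein (edit) distance between two web sessions treated as sequences of object numbers. It is not the online tool referenced above: the function names are hypothetical, and the conversion to a percentage by dividing by the longer session length is an assumption, since the formula used by the tool is not stated.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity_percent(a, b):
    """Assumed normalization: 100 * (1 - distance / longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


# Hypothetical sessions, for illustration only.
print(levenshtein([1, 2, 3], [1, 3]), similarity_percent([1, 2, 3, 4], [1, 2, 3, 4]))
```

Computing this value for every pair of sessions yields a square matrix that can serve as input to the clustering step.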
[2] Levenshtein Test -2
Test Description: - Determine the proximity of different web sessions according to
the Levenshtein measurement technique.
Tool used: - Microsoft Excel is used to determine proximity based on the
conditional formatting option. The matrix generated in the previous test is used as
input.
Results: - As a result of this test, the number of sessions involved in each cluster is
determined for a particular threshold value.
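The grouping step can be sketched in Python. This is an interpretation rather than the documented procedure: it assumes that each session seeds one cluster, that a cluster contains every other session whose value in the matrix is at least the threshold, and that sessions without such a neighbour form no cluster. The function name and the sample matrix are hypothetical.

```python
def clusters_at_threshold(similarity, threshold):
    """For each session, collect the other sessions at or above the threshold."""
    clusters = {}
    for seed, row in similarity.items():
        members = [other for other, value in row.items()
                   if other != seed and value >= threshold]
        if members:
            clusters[seed] = [seed] + sorted(members)
    return clusters


# Hypothetical symmetric similarity matrix (percent), for illustration only.
similarity = {
    1: {2: 55, 3: 20},
    2: {1: 55, 3: 80},
    3: {1: 20, 2: 80},
}
print(clusters_at_threshold(similarity, 50))
print(clusters_at_threshold(similarity, 60))
```

Raising the threshold removes weaker links, so the number of clusters and their sizes can only shrink, which matches the trend visible across thresholds in the result tables.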
[3] Levenshtein Test -3
Test Description: - Determine the accuracy of each pattern.
Tool used: - Microsoft Excel is used to determine the accuracy of each pattern. The accuracy
of a pattern is determined by averaging over every pairwise combination of the web
sessions in the pattern.
Results: - An accuracy value is generated for each pattern.
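The averaging over session pairs can be sketched in Python. This is a minimal sketch assuming that the value averaged is the pairwise similarity from the matrix of Test-1; the function name and the sample values are hypothetical.

```python
from itertools import combinations


def pattern_accuracy(sessions, similarity):
    """Average the similarity over every unordered pair of sessions in a pattern."""
    pairs = list(combinations(sessions, 2))
    return sum(similarity[a][b] for a, b in pairs) / len(pairs)


# Hypothetical pairwise similarities (percent), for illustration only.
similarity = {
    5: {13: 70, 19: 65},
    13: {5: 70, 19: 70},
    19: {5: 65, 13: 70},
}
print(round(pattern_accuracy([5, 13, 19], similarity), 2))
```

A pattern with two sessions has a single pair, so its accuracy equals the similarity of that pair; a pattern with five sessions averages ten pairs.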
[4] Levenshtein Test-4
Test Description: - Determine the mean and standard deviation in order to take
appropriate action.
Tool used: - Microsoft Excel is used to determine the mean and standard deviation
of the patterns generated at a specific threshold value.
Results: - The mean and standard deviation of the patterns are generated as a result of this test.
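As a worked check, the five accuracy values listed for threshold 80 in Table 7.4 can be summarized in Python. The sketch assumes the sample standard deviation (the n - 1 form); the text only states that Microsoft Excel was used, but this form reproduces the reported figures.

```python
import statistics

# Accuracy values of the five patterns at threshold 80 in Table 7.4.
accuracies = [85.33, 88, 85.33, 85.33, 88]

mean = statistics.mean(accuracies)
std = statistics.stdev(accuracies)  # sample standard deviation (n - 1)
print(round(mean, 3), round(std, 2))  # prints 86.398 1.46
```

The result agrees with the mean of 86.398 and the standard deviation of 1.46 reported for that threshold.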
Table 7.4 summarizes the results of all the above tests. The table lists the threshold value, the
web sessions in each cluster, and the mean and standard deviation of all patterns.
Table 7.4 Pattern Discovery based on Levenshtein Distance
Sessions Involved in Each Cluster | Web Objects Referred in That Cluster | Accuracy of Pattern
Threshold: 50; Number of Clusters: 22
1,10 | 2,5,7,8,9,10,12,13,14,15 | 55
2,14,15 | 6,8,9,12,15,5,1,7,10,2,4,14,3 | 38
3,9,18,23 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,13,1 | 55.66
4,7,11,12,15,17 | 2,3,4,6,9,11,12,14,15,8,10,5,7,10 |
5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 68.33
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 79
7,4,15 | 2,4,6,8,9,10,12,14,15,3,1 | 60.33
8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 88
9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 49
10,1,15 | 2,5,7,8,9,10,4,6,12,14,15,3,1 | 46
11,4,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,15,1,7 | 72.5
12,4,11,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 72.5
13,5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 68.33
14,2 | 6,8,9,12,15,2,5 | 50
15,2,4,7,10,11,12,17 | 6,8,9,12,15,2,5,4,10,14,3,11,7,13 |
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 79
17,4,11,12,15 | 2,4,6,8,9,10,12,14,15,3,5,11,1 | 72.5
18,3,9,25 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2,1 | 56.83
19,5,13 | 5,7,9,11,12,13,14,15,2,3,6,10 | 68.33
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
23,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 50
25,18 | 8,9,10,2,3,4,5,6,7,11,12,15,14,13 | 56
Standard Deviation: 13.63; Mean: 63.13
Threshold: 55; Number of Clusters: 18
1,10 | 2,5,7,8,9,10,12,13,14,15,10 | 55
3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,6 | 70
4,11,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 | 72.5
5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 68.33
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 |
8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 88
9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 70
10,1 | 2,5,7,8,9,10 | 55
11,4,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,2,1,7 | 72.5
12,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 78.16
13,5,19 | 5,7,9,11,12,13,14,15,2,3,2,3,8,6 | 68.33
15,4,11,17 | 2,4,6,8,9,10,12,14,15,3 | 77
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 79
17,4,11,12,15 | 2,4,6,8,9,10,12,14,15,3,5,11,1 | 72.5
18,3,9,25 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2,1 | 56.83
19,5,13 | 5,7,9,11,12,13,14,15,2,3,6,10 | 68.33
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
25,18 | 8,9,10,2,3,4,5,6,7,11,12,15,14,13 | 56
Standard Deviation: 10.17; Mean:
Threshold: 60; Number of Clusters: 15
3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,6 | 70
4,11,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 | 72.5
5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 68.33
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 79
8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 88
9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 79
11,4,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,2,1,7 | 72.5
12,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 78.16
13,5 | 5,7,9,11,12,13,14,15,2,3 | 69
15,4,11,17 | 2,4,6,8,9,10,12,14,15,3 | 77
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 79
17,4,11,12,15 | 2,4,6,8,9,10,12,14,15,3,5,11,1 | 72.5
18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 75
19,5 | 5,7,9,11,12,13,14,15,2,3 | 79
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
Standard Deviation: 6.02; Mean:
Threshold: 65; Number of Clusters: 15
3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,6 | 70
4,11,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 | 72.5
5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 68.33
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 79
8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 88
9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 79
11,4,12,17 | 2,4,6,8,9,10,12,14,15,3,5,11,15,2,7 | 78.16
12,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 78.16
13,5 | 5,7,9,11,12,13,14,15,2,3 | 69
15,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 77
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 79
17,4,11,12,15 | 2,4,6,8,9,10,12,14,15,3,5,11,1 | 72.5
18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 75
19,5 | 5,7,9,11,12,13,14,15,2,3 | 79
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
Standard Deviation: 5.93; Mean: 76.84
Threshold: 70; Number of Clusters: 14
3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,6 | 70
4,11,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 | 72.5
5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 79
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 |
8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 88
9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 79
11,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 85.33
12,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 77
15,4 | 2,4,6,8,9,10,12,14,15,3 | 77
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 79
17,4,11,12 | 2,4,6,8,9,10,12,14,15,3,5,11,15,3,9,8,6,10 | 78.16
18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 75
19,5 | 5,7,9,11,12,13,14,15,2,3 | 79
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
Standard Deviation: 5.18; Mean: 78.99
Threshold: 75; Number of Clusters: 13
3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,6 | 70
4,11,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 77
5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 79
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 79
8,20 | 7,6,5,2,1,9,10,12,14,11,13 |
9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 79
11,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 85.33
15,4 | 2,4,6,8,9,10,12,14,15,3 | 77
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 79
17,4,11 | 2,4,6,8,9,10,12,14,15,3 | 85.33
18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 75
19,5 | 5,7,9,11,12,13,14,15,2,3 | 79
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
Standard Deviation: 5.26; Mean: 80.05
Threshold: 80; Number of Clusters: 5
4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 85.33
8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 88
11,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 85.33
17,4,11 | 2,4,6,8,9,10,12,14,15,3 | 85.33
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
Standard Deviation: 1.46; Mean: 86.398
7.3.2.2 Experiments on Needleman Wunsch Distance Measurement technique
[1] Needleman Wunsch Test -1
Tool used: - An online tool is used to determine the distance measure between web
sessions. Reference: http://asecuritysite.com/forensics/simstring
Results: - A matrix of distance values is generated as a result of this test.
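For orientation, the Needleman Wunsch technique scores a global alignment of two sequences by dynamic programming. The Python sketch below is illustrative only and is not the online tool used here: the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions, as the parameters of the tool are not stated.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of two sequences (assumed scoring values)."""
    prev = [j * gap for j in range(len(b) + 1)]   # aligning a prefix of b to nothing
    for i, x in enumerate(a, start=1):
        curr = [i * gap]                          # aligning a prefix of a to nothing
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Hypothetical sessions, for illustration only.
print(needleman_wunsch_score([1, 2, 3], [1, 2, 3]), needleman_wunsch_score([1, 2, 3], [1, 3]))
```

Because the alignment is global, every object of both sessions is either matched or paid for as a gap, so sessions of very different lengths score low.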
[2] Needleman Wunsch Test -2
Test Description: - Determine the proximity of different web sessions according to
the Needleman Wunsch measurement technique.
Tool used: - Microsoft Excel is used to determine proximity based on the
conditional formatting option. The matrix generated in the previous test is used as
input.
Results: - As a result of this test, the number of sessions involved in each cluster is
determined for a particular threshold value.
[3] Needleman Wunsch Test -3
Test Description: - Determine the accuracy of each pattern.
Tool used: - Microsoft Excel is used to determine the accuracy of each pattern. The accuracy
of a pattern is determined by averaging over every pairwise combination of the web
sessions in the pattern.
Results: - An accuracy value is generated for each pattern.
[4] Needleman Wunsch Test-4
Test Description: - Determine the mean and standard deviation in order to take
appropriate action.
Tool used: - Microsoft Excel is used to determine the mean and standard deviation
of the patterns generated at a specific threshold value.
Results: - The mean and standard deviation of the patterns are generated as a result of this test.
Table 7.5 summarizes the results of all the above tests for the Needleman Wunsch distance
measurement technique. The table lists all fields generated as a result of the above
tests.
Table 7.5 Pattern Discovery based on Needleman Wunsch Distance
Sessions Involved in Each Cluster | Web Objects Referred in That Cluster | Accuracy of Pattern
Threshold: 55; Number of Clusters: 25
1,10,11,12,14,21,23,25 | 2,5,7,8,9,10,12,13,14,15,4,3,6,1,11 |
2,3,5,7,10,15,18,24,25 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 58.8
3,2,6,7,8,9,10,12,15,16,18,20,23,25 | 6,8,9,12,15,2,5,3,7,4,10,11,13,14,1 | 58.30
4,5,7,8,9,10,11,12,13,15,17,19,20,22,24 | 5,7,9,11,12,13,14,15,2,3,14,4,6,8,1,10 | 59
5,2,4,9,10,12,13,15,19,20,24 | 6,8,9,12,15,2,5,4,10,14,3,7,11,14,13,11,12 | 59.01
6,3,7,9,10,16,18,20,23 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,13,1 | 60.36
7,2,3,4,6,8,10,15,16,18,19,20,21,22,23,24 | 6,8,9,12,15,2,3,4,5,7,10,11,14,13,1,2 | 56.16
8,3,4,7,9,10,15,18,20,24 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 61.68
9,3,4,5,6,8,10,11,12,14,15,16,17,18,20,24 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 58.83
10,1,2,3,4,5,6,7,8,9,12,15,17,18,20,24 | 2,5,7,8,9,10,6,12,15,3,4,11,14,13,3,1 |
11,1,4,9,12,14,15,17,19 | 2,5,7,8,9,10,4,6,12,14,15,3,11,13,1 | 63.23
12,1,3,4,5,9,10,11,14,15,17,20 | 2,5,7,8,9,10,3,4,6,11,12,15,14,13,1 | 61.05
13,4,5,17,19 | 2,4,6,8,9,10,12,14,15,3,5,7,11,13 | 66.7
14,1,9,11,12,17,19,25 | 2,5,7,8,9,10,6,4,5,11,12,15,14,13,3,1 | 60.07
15,2,3,4,5,7,8,9,10,11,12,16,17,18,20,24 | 6,8,9,12,15,2,5,3,4,7,10,11,14,13,2,1,12 | 59.02
16,3,6,7,9,15,18,20,25 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2,1 | 61.30
17,4,9,10,11,12,13,14,15,19,21,25 | 2,4,6,8,9,10,12,14,15,3,5,7,11,13,15,1 | 59.5
18,2,3,6,7,8,9,10,15,16,20,23,25 | 6,8,9,12,15,2,5,3,4,7,11,14,13,3,4,10,13,12,1 |
19,4,5,7,11,13,14,17,21 | 2,4,6,8,9,10,12,14,15,3,5,7,11,13,1 | 59.5
20,3,4,5,6,7,8,9,10,12,15,16,18,25 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 59.49
21,1,7,17,19,25 | 2,5,7,8,9,10,3,4,6,11,12,14,15,13,1 | 56.06
22,4,7 | 2,4,6,8,9,10,12,14,15,3,11 | 58.33
23,1,3,6,7,18,24,25 | 2,5,7,8,9,10,3,4,6,11,12,15,14,13,1 | 57.75
24,2,4,5,7,8,9,10,15,23 | 6,8,9,12,15,2,5,4,10,14,15,3,7,11,13,1 | 58.13
25,1,2,3,6,14,16,17,18,20,21,23 | 2,5,7,8,9,10,6,12,15,3,4,11,14,13,1,3 | 55.69
Standard Deviation: 2.36368829; Mean: 59.2616
Threshold: 60; Number of Clusters: 23
1,14 | 6,8,9,12,15,2,5,1,7,10 | 71
2,7,15,24 | 2,3,4,6,9,11,12,14,15,8,10,1,13,1 |
3,6,9,10,18,25 | 3,8,7,9,4,6,10,11,12,13,15,5,14,2,1 | 64
4,5,7,9,10,11,12,15,17,20 | 5,7,9,11,12,13,14,15,2,3,4,6,8,10,1 | 62.84
5,4,13,19 | 2,4,6,8,9,10,12,14,15,3,11,13,5,7 | 68.5
6,3,7,16,25 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 63.6
7,2,4,6,8,10,15,16,18 | 6,8,9,12,2,5,4,10,14,15,3,7,11,13,1,14,2 | 59.69
8,7,10,15,20 | 2,3,4,6,9,11,12,14,15,8,5,7,10,13,1 | 64.8
9,3,4,10,12,15,18 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 65.47
10,3,4,7,8,9,15 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 63
11,4,12,14,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 |
12,4,9,11,15,17 | 2,4,6,8,9,10,12,14,15,3,5,7,11,13,1 | 73.26
13,5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 77
14,1,11 | 2,5,7,8,9,10,4,6,12,14,15,3 | 66.33
15,2,4,7,8,9,10,11,12,17,20,24 | 6,8,9,12,15,2,5,4,10,14,3,11,7,1,14,13,2,1 | 61.42
16,6,7,18,25 | 3,8,7,9,4,6,10,11,12,13,15,2,14,5,1 | 64.6
17,4,11,12,15,19,25 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7,13 | 66.52
18,3,7,9,16,25 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 63.66
19,5,13,17 | 5,7,9,11,12,13,14,15,2,3,6,10,4,8,7 | 66.66
20,4,8,15 | 2,4,6,8,9,10,12,14,15,3,7,5,2,1,11 | 70.33
23,25 | 8,2,1,3,4,5,7,9,10,11,12,13 | 65
24,2,15 | 6,8,9,12,15,2,5,4,10,14,3,2,1 |
25,3,6,16,17,18,23 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2,1 | 61.42
Standard Deviation: 4.50; Mean: 65.89
Threshold: 65; Number of Clusters: 21
1,14 | 6,8,9,12,15,2,5,1,7,10 | 71
3,9,18,25 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,1 | 71
4,5,11,12,15,17,20 | 5,7,9,11,12,13,14,15,2,3,4,6,8,10,1 | 67.19
5,4,13,19 | 2,4,6,8,9,10,12,14,15,3,11,13,5,7 | 68.5
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 89
7,10 | 2,5,7,8,9,10,12,13,14,15 | 68
8,20 | 7,6,5,2,1,5,6,9,10,12,14,11,10,9,13,5 | 88
9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 80.66
10,7,15 | 2,3,4,6,9,11,12,14,15,8,10,1 | 65.66
11,4,12,14,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 |
12,4,11,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 79.7
13,5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 77
14,1,11 | 2,5,7,8,9,10,4,6,12,14,15,3 | 66.33
15,4,10,11,12,17 | 2,4,6,8,9,10,12,14,15,3,5,7,13,11 | 71.6
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 89
17,4,11,12,15 | 2,4,6,8,9,10,12,14,15,3,5,11,15,3,1 | 77.8
18,3,9,25 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2,1 | 71
19,5,13 | 5,7,9,11,12,13,14,15,2,3,6,10 | 77
20,4,8 | 2,4,6,8,9,10,12,14,15,3,7,5,1,11 | 71.33
23,25 | 8,2,1,3,4,5,7,9,10,11,12,13 | 65
25,3,18,23 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2,1 | 66.83
Standard Deviation: 7.67; Mean: 74.33
Threshold: 70; Number of Clusters: 17
1,14 | 6,8,9,12,15,2,5,1,7,10 |
Threshold: 75; Number of Clusters: 15
3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,13 | 80.66
4,11,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 | 77.8
5,13,19 | 3,6,9,11,12,13,14,15,2,14,10,5,7,15,8 | 77
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 89
8,20 | 7,6,5,2,1,5,6,9,10,12,14,11,10,9,13,5 | 88
9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 82
11,4,12,17 | 2,4,6,8,9,10,12,14,15,3,5,11,7 | 83.66
12,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 83.66
13,5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 77
15,4 | 2,4,6,8,9,10,12,14,15,3 | 85
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 89
17,4,11,12 | 2,4,6,8,9,10,12,14,15,3,5,11 | 83.66
18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 88
19,5,13 | 5,7,9,11,12,13,14,15,2,3,6,10 | 77
20,8 | 7,6,5,2,1,9,10,12,14,11 |
Standard Deviation: 4.57; Mean: 83.29
Threshold: 80; Number of Clusters: 12
3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,13 | 80.66
4,11,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,1,7 | 77.8
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 89
8,20 | 7,6,5,2,1,5,6,9,10,12,14,11,10,9,13,5 | 88
9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 82
11,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 86.33
12,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 84
15,4 | 2,4,6,8,9,10,12,14,15,3 | 85
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 89
17,4,11,12 | 2,4,6,8,9,10,12,14,15,3,5,11 | 83.66
18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 88
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
Standard Deviation: 3.60; Mean: 85.12
Threshold: 85; Number of Clusters: 10
3,18 | 8,9,10,2,3,4,5,6,7,11,12,15,14,13 | 88
4,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 |
6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 89
8,20 | 7,6,5,2,1,5,6,9,10,12,14,11,10,9,13,5 | 88
11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 91
15,4 | 2,4,6,8,9,10,12,14,15,3 | 85
16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 89
17,4,11 | 2,4,6,8,9,10,12,14,15,3 | 86.33
18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 88
20,8 | 7,6,5,2,1,9,10,12,14,11 | 88
Standard Deviation: 2.56; Mean: 87.39
Threshold: 90; Number of Clusters: 2
11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 91
17,11 | 2,4,6,8,9,10,12,14,15,3,7 | 91
Standard Deviation: 0; Mean: 91
7.3.2.3 Experiments on Smith Waterman Distance Measurement technique
[1] Smith Waterman Test -1
Test Description: - Determine the distance measure between web sessions according
to the Smith Waterman distance measurement technique.
Tool used: - An online tool is used to determine the distance measure between web
sessions. Reference: http://asecuritysite.com/forensics/simstring
Results: - A matrix of distance values is generated as a result of this test.
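For orientation, the Smith Waterman technique scores the best local alignment, that is, the best-matching pair of contiguous regions of two sequences. The Python sketch below is illustrative only and is not the online tool used here: the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score of two sequences (assumed scoring values)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Scores never drop below 0, so an alignment can restart anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Hypothetical sessions, for illustration only.
print(smith_waterman_score([9, 1, 2, 3, 8], [7, 1, 2, 3]))
```

Because only the best shared region is scored, two sessions that contain the same run of objects score highly even when the rest of the sessions differ.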
[2] Smith Waterman Test -2
Test Description: - Determine the proximity of different web sessions according to
the Smith Waterman measurement technique.
Tool used: - Microsoft Excel is used to determine proximity based on the
conditional formatting option. The matrix generated in the previous test is used as
input.
Results: - As a result of this test, the number of sessions involved in each cluster is
determined for a particular threshold value.
[3] Smith Waterman Test -3
Test Description: - Determine the accuracy of each pattern.
Tool used: - Microsoft Excel is used to determine the accuracy of each pattern. The accuracy
of a pattern is determined by averaging over every pairwise combination of the web
sessions in the pattern.
Results: - An accuracy value is generated for each pattern.
[4] Smith Waterman Test-4
Test Description: - Determine the mean and standard deviation in order to take
appropriate action.
Tool used: - Microsoft Excel is used to determine the mean and standard deviation
of the patterns generated at a specific threshold value.
Results: - The mean and standard deviation of the patterns are generated as a result of this test.
Table 7.6 summarizes the results of all the above tests for the Smith Waterman distance
measurement technique. The table lists all fields generated as a result of the above
tests. Figure 7.10 shows the pattern accuracy for all distance measurement techniques
used in the proposed work. The results show that the Smith Waterman distance measurement
technique reaches an accuracy level of 100 percent. Figure 7.11 shows the hit ratio for the
Levenshtein distance measurement technique. Figure 7.12 shows the hit ratio for the Needleman
Wunsch distance measurement technique. Figure 7.13 shows the hit ratio for the
Smith Waterman distance measurement technique. From the hit ratio results, it follows
that the Smith Waterman distance measurement technique gives the best hit ratio, close to
the ideal value of 1.
Table 7.6 Pattern Discovery based on Smith Waterman Distance
| Threshold | Number of Clusters | Sessions Involved in each Cluster | Web Objects Referred in that Cluster | Accuracy of Pattern |
| 50 | 24 | 1,3,4,9,10,11,12,14,15,17,18,25 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8 | |
| | | 2,4,11,14,15,17 | 2,4,6,8,9,10,12,14,15,3,5,1,7 | 65.64 |
| | | 3,1,7,9,18,23,25 | 2,5,7,8,9,10,3,4,6,11,12,14,15,13,3,1 | 52.38 |
| | | 4,1,2,7,10,11,12,15,17 | 2,5,7,8,9,10,6,12,15,3,4,11,14,13,3,1 | 55.61 |
| | | 5,10,13,19 | 2,5,7,8,9,10,12,13,14,15,3,6,11 | 61.88 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 7,3,4,11,13,15,17,18 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 49.85 |
| | | 8,20,21 | 7,6,5,2,1,9,10,12,14,11,13,5,3 | 71.33 |
| | | 9,1,3,18 | 2,5,7,8,9,10,3,4,6,11,12,15,14,13 | 70.16 |
| | | 10,1,4,5,11,14,15,17,19,25 | 2,5,7,8,9,10,4,6,12,14,15,3,11,13,1 | 49.55 |
| | | 11,1,2,4,7,10,12,15,17 | 2,5,7,8,9,10,6,12,15,4,14,3,11,13,1 | 55.61 |
| | | 12,1,4,11,17 | 2,5,7,8,9,10,4,6,12,14,15,3 | 67.5 |
| | | 13,5,7,19 | 5,7,9,11,12,13,14,15,2,3,14,4,6,8 | |
| | | 14,1,2,10 | 2,5,7,8,9,10,6,12,15,13,14 | 70.66 |
| | | 15,1,2,4,7,10,11,17 | 2,5,7,8,9,10,6,12,15,4,14,3,11,13 | 57.78 |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,1,2,4,7,10,11,12,15 | 2,5,7,8,9,10,6,12,15,4,14,3,11,13,1 | 55.61 |
| | | 18,1,3,7,9,23,25 | 2,5,7,8,9,10,3,4,6,11,12,15,14,13,1 | 52.38 |
| | | 19,5,10,13 | 5,7,9,11,12,13,14,15,2,3,8,10,6,9,14 | 61.88 |
| | | 20,8,21 | 7,6,5,2,1,9,10,12,14,11,3 | 71.33 |
| | | 21,8,20 | 7,6,5,2,1,9,10,12,14,11,3 | 71.33 |
| | | 22,23 | 3,4,5,6,8,1,11,12 | 57 |
| | | 23,3,18,22 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 57.16 |
| | | 25,1,3,10,18 | 2,5,7,8,9,10,3,4,6,11,12,15,14,13 | 62.2 |
| | | Standard Deviation | | 13.40 |
| | | Mean | | |
| 55 | 24 | 1,10,12,14,25 | 2,5,7,8,9,10,12,13,14,15,3,4,11,6,1 | 65.12 |
| | | 2,4,11,14,15,17 | 2,4,6,8,9,10,12,14,15,3,5,1,7 | 65.64 |
| | | 3,9,18,25 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3,1 | 75 |
| | | 4,2,11,12,15,17 | 6,8,9,12,15,2,5,4,10,14,3,11,1,7 | 69.06 |
| | | 5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 78.33 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 7,13,18 | 3,6,9,11,12,13,14,15,2,10,8,4,5,7 | 47.66 |
| | | 8,20,21 | 7,6,5,2,1,9,10,12,14,11,13,5,3 | 71.33 |
| | | 9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 90.33 |
| | | 10,1,14,25 | 2,5,7,8,9,10,6,12,15,1,3,4,11,13 | 73.83 |
| | | 11,2,4,12,15,17 | 6,8,9,12,15,2,5,4,10,14,3,11,2,14,1,7 | 69.06 |
| | | 12,1,4,11,17 | 2,5,7,8,9,10,4,6,12,14,15,3 | 67.5 |
| | | 13,5,7,19 | 5,7,9,11,12,13,14,15,2,3,14,4,6,8 | 64.16 |
| | | 14,1,2,10 | 2,5,7,8,9,10,6,12,15,13,14 | 70.66 |
| | | 15,2,4,11,17 | 6,8,9,12,15,2,5,4,10,14,3,7 | |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,2,4,11,12,15 | 6,8,9,12,15,2,5,4,10,14,3,11,4,1 | 69.06 |
| | | 18,3,7,9,25 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8,1 | 61.3 |
| | | 19,5,13 | 5,7,9,11,12,13,14,15,2,3,6,10 | 78.33 |
| | | 20,8,21 | 7,6,5,2,1,9,10,12,14,11,3 | 71.33 |
| | | 21,8,20 | 7,6,5,2,1,9,10,12,14,11,3 | 71.33 |
| | | 22,23 | 3,4,5,6,8,1,11,12 | 57 |
| | | 23,22 | 3,4,5,6,8,1,11,12 | 57 |
| | | 25,1,3,10,18 | 2,5,7,8,9,10,3,4,6,11,12,15,14,13 | 62.2 |
| | | Standard Deviation | | 12.27 |
| | | Mean | | 71.34 |
| 60 | 21 | 1,10,14,25 | 2,5,7,8,9,10,12,13,14,15,6,1,3,4,11 | 72.5 |
| | | 2,14,15 | 6,8,9,12,15,2,5,1,7,10,4,14,3 | 67.33 |
| | | 3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3 | 90.33 |
| | | 4,11,12,15,17 | 2,4,6,8,9,10,12,14,15,3,5,11,3,1,7 | 75.8 |
| | | 5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 78.33 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 7,13,18 | 3,6,9,11,12,13,14,15,2,10,8,4,5,7 | 47.66 |
| | | 8,20 | 7,6,5,2,1,9,10,12,14,11,13 | |
| | | 9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 90.33 |
| | | 10,1 | 2,5,7,8,9,10 | 100 |
| | | 11,4,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 12,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 74.33 |
| | | 13,5,7,19 | 5,7,9,11,12,13,14,15,2,3,14,4,6,8 | 64.16 |
| | | 14,1,2 | 2,5,7,8,9,10,6,12,15 | 77.66 |
| | | 15,2,4,11,17 | 6,8,9,12,15,2,5,4,10,14,3,7 | 73.8 |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,4,11,12,15 | 2,4,6,8,9,10,12,14,15,3,5,11,3,1,7 | 75.8 |
| | | 18,3,7,9 | 3,4,5,6,7,9,10,11,12,15,14,13,2,8 | 70.16 |
| | | 19,5,13 | 5,7,9,11,12,13,14,15,2,3,6,10 | 78.33 |
| | | 20,8 | 7,6,5,2,1,9,10,12,14,11 | 100 |
| | | 25,1 | 2,5,7,8,9,10 | 83 |
| | | Standard Deviation | | 14.04 |
| | | Mean | | 81.15 |
| 65 | 19 | 1,10,14,25 | 2,5,7,8,9,10,12,13,14,15,6,1,3,4,11 | 72.5 |
| | | 2,14 | 6,8,9,12,15,2,5,7,10 | 100 |
| | | 3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3 | 90.33 |
| | | 4,11,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | |
| | | 5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 78.33 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 100 |
| | | 9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 90.33 |
| | | 10,1 | 2,5,7,8,9,10 | 100 |
| | | 11,4,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 13,5 | 5,7,9,11,12,13,14,15,2,3 | 73 |
| | | 14,1,2 | 2,5,7,8,9,10,6,12,15 | 77.66 |
| | | 15,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 84.71 |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,4,11,15 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 18,3,9 | 3,4,5,6,7,9,10,11,12,15,14,13 | 90.33 |
| | | 19,5 | 5,7,9,11,12,13,14,15,2,3 | 100 |
| | | 20,8 | 7,6,5,2,1,9,10,12,14,11 | 100 |
| | | 25,1 | 2,5,7,8,9,10 | 83 |
| | | Standard Deviation | | 9.80 |
| | | Mean | | 89.17 |
| 70 | 19 | 1,10,14,25 | 2,5,7,8,9,10,12,13,14,15,6,1,3,4,11 | 72.5 |
| | | 2,14 | 6,8,9,12,15,2,5,7,10 | 100 |
| | | 3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3 | |
| | | 4,11,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 78.33 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 100 |
| | | 9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 90.33 |
| | | 10,1 | 2,5,7,8,9,10 | 100 |
| | | 11,4,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 13,5 | 5,7,9,11,12,13,14,15,2,3 | 73 |
| | | 14,1,2 | 2,5,7,8,9,10,6,12,15 | 77.66 |
| | | 15,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 84.71 |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,4,11,15 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 18,3,9 | 3,4,5,6,7,9,10,11,12,15,14,13 | 90.33 |
| | | 19,5 | 5,7,9,11,12,13,14,15,2,3 | 100 |
| | | 20,8 | 7,6,5,2,1,9,10,12,14,11 | 100 |
| | | 25,1 | 2,5,7,8,9,10 | 83 |
| | | Standard Deviation | | 9.80 |
| | | Mean | | 89.17 |
| 75 | 18 | 1,10,14,25 | 2,5,7,8,9,10,12,13,14,15,6,1,3,4,11 | 72.5 |
| | | 2,14 | 6,8,9,12,15,2,5,7,10 | 100 |
| | | 3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3 | |
| | | 4,11,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 5,13,19 | 3,6,9,11,12,13,14,15,2,10,5,7,8 | 78.33 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 100 |
| | | 9,3,18 | 3,4,5,6,7,9,10,11,12,15,14,13,8,2 | 90.33 |
| | | 10,1 | 2,5,7,8,9,10 | 100 |
| | | 11,4,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 14,1,2 | 2,5,7,8,9,10,6,12,15 | 77.66 |
| | | 15,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 84.71 |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,4,11,15 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 18,3,9 | 3,4,5,6,7,9,10,11,12,15,14,13 | 90.33 |
| | | 19,5 | 5,7,9,11,12,13,14,15,2,3 | 100 |
| | | 20,8 | 7,6,5,2,1,9,10,12,14,11 | 100 |
| | | 25,1 | 2,5,7,8,9,10 | 83 |
| | | Standard Deviation | | 9.25 |
| | | Mean | | 90.07 |
| 80 | 18 | 1,10,14,25 | 2,5,7,8,9,10,12,13,14,15,6,1,3,4,11 | 72.5 |
| | | 2,14 | 6,8,9,12,15,2,5,7,10 | 100 |
| | | 3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3 | 90.33 |
| | | 4,11,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | |
| | | 5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 100 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 100 |
| | | 9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 92 |
| | | 10,1 | 2,5,7,8,9,10 | 100 |
| | | 11,4,15,17 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 14,1,2 | 2,5,7,8,9,10,6,12,15 | 77.66 |
| | | 15,4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 84.71 |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,4,11,15 | 2,4,6,8,9,10,12,14,15,3,1,7 | 84.71 |
| | | 18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 100 |
| | | 19,5 | 5,7,9,11,12,13,14,15,2,3 | 100 |
| | | 20,8 | 7,6,5,2,1,9,10,12,14,11 | 100 |
| | | 25,1 | 2,5,7,8,9,10 | 83 |
| | | Standard Deviation | | 9.26 |
| | | Mean | | 91.90 |
| 85 | 16 | 1,10,14 | 2,5,7,8,9,10,12,13,14,15,6,1 | 85 |
| | | 2,14 | 6,8,9,12,15,2,5,7,10 | 100 |
| | | 3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3 | 90.33 |
| | | 4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | |
| | | 5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 100 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | 100 |
| | | 8,20 | 7,6,5,2,1,9,10,12,14,11,13 | 100 |
| | | 9,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 92 |
| | | 10,1 | 2,5,7,8,9,10 | 100 |
| | | 11,4,17 | 2,4,6,8,9,10,12,14,15,3,7 | 95.66 |
| | | 14,1,2 | 2,5,7,8,9,10,6,12,15 | 77.66 |
| | | 16,6 | 3,8,7,9,4,6,10,11,12,13,15 | 100 |
| | | 17,4,11 | 2,4,6,8,9,10,12,14,15,3 | 95.66 |
| | | 18,3 | 3,4,5,6,7,9,10,11,12,15,14,13 | 100 |
| | | 19,5 | 5,7,9,11,12,13,14,15,2,3 | 100 |
| | | 20,8 | 7,6,5,2,1,9,10,12,14,11 | 100 |
| | | Standard Deviation | | 6.57 |
| | | Mean | | 95.74 |
| 90 | 16 | 1,10,14 | 2,5,7,8,9,10,12,13,14,15,6,1 | 85 |
| | | 2,14 | 6,8,9,12,15,2,5,7,10 | 100 |
| | | 3,9,18 | 6,4,5,7,9,10,11,12,15,14,13,8,2,3 | 90.33 |
| | | 4,11,17 | 2,4,6,8,9,10,12,14,15,3,7 | 95.66 |
| | | 5,19 | 5,7,9,11,12,13,14,15,2,3,8,6 | 100 |
| | | 6,16 | 5,6,2,3,8,7,9,4,10,11,12,13,15 | |
[Plot: Accuracy of Pattern versus Threshold Value (50 to 100) for the Levenshtein, Needleman Wunsch and Smith Waterman distance measures]
(Figure 7.10 Pattern Accuracy based on Distance Metric)
[Plot: Levenshtein Hit Ratio versus Threshold Value]
(Figure 7.11 Hit Ratio based on Levenshtein Distance Measurement Technique)
[Plot: Needleman Wunsch Hit Ratio versus Threshold Value]
(Figure 7.12 Hit Ratio based on Needleman Wunsch Distance Measurement Technique)
[Plot: Smith Waterman Hit Ratio versus Threshold Value]