
International Journal of Emerging Technology and Advanced Engineering

Website: www.ijetae.com (ISSN 2250-2459, Volume 2, Issue 3, March 2012)


Implementation of an Efficient Web Defacement Detection Technique and Spotting Exact Defacement Location Using Diff Algorithm

1 Tushar Kanti, 2 Vineet Richariya, 3 Vivek Richariya

1 Department of Computer Science & Engineering, L.N.C.T, Bhopal
2 Head of Department, Computer Science & Engineering, L.N.C.T, Bhopal
3 Department of Computer Science & Engineering, L.N.C.T, Bhopal

1[email protected] 2[email protected] 3[email protected]

Abstract— When an attacker changes a page of a website to something other than what was originally there, it is called a web defacement. Website defacement is an attack on a website that changes the visual appearance of the site or a web page [1]. Such attacks are typically the work of system crackers, who break into a web server and replace the hosted website with one of their own. The most common method of defacement is using SQL injection to log on to administrator accounts. A defacement usually replaces an entire page, and the page usually includes the defacer's pseudonym or "hacking codename." Sometimes the defacer makes fun of the system administrator for failing to maintain server security [5]. Most of the time the defacement is harmless; however, it can sometimes be used as a distraction to cover up more sinister actions such as uploading malware or deleting essential files from the server. Web defacement results in extreme embarrassment to the website owner, regardless of the commercial interest in the website [7]. Moreover, persons and companies who are targets of web defacement often have a substantial interest in maintaining the professional image and integrity of the website. This paper proposes a hash-code-based web defacement detection mechanism [1]. We also propose an on-the-spot defacement detection methodology using a diff algorithm, which can be used to recover the original web page.

Keywords- Web defacement, Hash Code, Checksum, Hash Table.

I. INTRODUCTION

Web defacement occurs when an intruder maliciously alters a Web page by inserting or substituting provocative and frequently offensive data. The defacement of an organization's Web site exposes visitors to misleading information until the unauthorized change is discovered and corrected. Web defacement is a major threat to businesses developing an online presence.

Defacement of a Web site can detrimentally affect the credibility and reputation of the organization as a whole. Unlike other attacks, where the hacker hides his activities, in defacement incidents the major goal of the hacker is to gain publicity by demonstrating the weakness of the existing security measures [1]. The damage from a Web defacement incident can be disproportionate, ranging from loss of customer trust to loss of revenue. An e-retailer can lose considerable patronage if its customers feel its e-business is insecure. Financial institutions, which emphasize security and credibility, may experience significant loss of business and integrity due to security breaches in their Web site. Across this spectrum, the loss of consumer confidence and loyalty can have serious negative implications for these organizations.

There is an overwhelming need for a solution that eliminates compromises to the Web server, especially Web page defacement [6]. Ideally, it would prevent the hacker from making any modifications, thereby precluding any possibility of attracting attention. The Web server would never present a defaced page to a user. Equally important, a proactive solution would eliminate any after-the-fact need for recovery and fixes, and be transparent to standard operations.

II. BACKGROUND


People are now able to host databases online, run blogs, forums, and chat, and use many other forms of communication. As technology advances in favor of more potent and efficient means of transferring data, and as the internet becomes more elaborate, so do the hackers. Ordinary non-technical people now have to deal with constantly upgrading, patching, and employing anti-virus software in order to protect themselves from attacks and vulnerabilities. An important and often overlooked aspect of web design is web security: securing a website is an extremely important step in maintaining data integrity and availability of resources. Website defacement is an extremely important topic that warrants as much focus on security as any other area of information technology [12]. If a hacker is able to deface a website, this essentially means that a serious breach has occurred. Many defacers do it as a form of internet graffiti, but once an attacker is inside a website a lot of information can be stolen, such as credit card numbers and other personal information.

Many works have been carried out to counter web defacement using genetic algorithms or the read-only memory concept, but the results are complex to attain. Our work detects the defacement and traces the exact location of the defacement on a particular website.

III. METHOD

In this paper we propose an algorithm for defacement detection [2] as well as for spotting the exact location of a web defacement. We have also implemented a Web browser with built-in defacement detection techniques. We calculate a hash code for defacement detection [2]. Each page in a website is assigned a fixed priority, with the home page having the highest priority, and so on. The frequency of defacement checking is highest for the home page, as it is the page most prone to defacement.
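The priority rule can be illustrated with a small scheduling sketch. It is not taken from the paper: the page paths, the check intervals, and the function name `build_schedule` are hypothetical, and a shorter interval is simply assumed to stand for a higher priority.

```python
import heapq

def build_schedule(intervals, horizon):
    """Return the (time, page) checks that fall due within `horizon` seconds.

    `intervals` maps each page to its check interval in seconds; a shorter
    interval stands for a higher priority (an assumption of this sketch).
    """
    # Min-heap of (next_due_time, page); the earliest check is always on top.
    heap = [(interval, page) for page, interval in intervals.items()]
    heapq.heapify(heap)
    events = []
    while heap and heap[0][0] <= horizon:
        due, page = heapq.heappop(heap)
        events.append((due, page))
        # Re-queue the page for its next check.
        heapq.heappush(heap, (due + intervals[page], page))
    return events

# Hypothetical priorities: the home page is checked most often.
INTERVALS = {"/index.html": 60, "/products.html": 180, "/about.html": 300}
schedule = build_schedule(INTERVALS, horizon=300)
home_checks = sum(1 for _, page in schedule if page == "/index.html")
```

With these assumed intervals the home page is checked five times in the first 300 seconds, while each of the other pages is checked once.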

Web page links    Checksum
p1                c1
p2                c2
p3                c3
p4                c4
p5                c5
p6                c6
p7                c7

Table 1

Here, ci represents the hash code of the web page pi [2]. First, the web page is given as input. An option to "track the web page for defacement" is given to the user. If the user starts tracking, the hash code for the web page is calculated and saved. After a chosen interval of time, the saved web page is revisited, and an option to "check for web defacement" is given to the user. If the user checks, the present hash code for the page is calculated and compared with the saved hash code for the same page. If the two hash codes are the same, the page is not defaced; otherwise it is marked as defaced. An option to "spot the defacement" is then provided so that the user can spot the exact location of the defacement in the page. For this purpose we use a diff algorithm to show the difference between the original page and the defaced page. It can be seen as a comparison between two states of the same web page.
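The track and check options described above can be sketched as follows. This is a minimal illustration rather than the prototype's code: the class name `DefacementTracker` and its methods are hypothetical, the page content is passed in as bytes instead of being fetched, and SHA-256 is an assumed choice because the paper does not say which Secure Hash Algorithm variant it uses.

```python
import hashlib

class DefacementTracker:
    """Keeps the table of page links p_i and their saved hash codes c_i."""

    def __init__(self):
        self.saved = {}  # page link p_i -> stored hash code c_i

    @staticmethod
    def hash_code(content: bytes) -> str:
        # SHA-256 is an assumption; the paper only says "Secure Hash Algorithm".
        return hashlib.sha256(content).hexdigest()

    def track(self, link: str, content: bytes) -> None:
        """'Track the web page for defacement': save the current hash code."""
        self.saved[link] = self.hash_code(content)

    def is_defaced(self, link: str, content: bytes) -> bool:
        """'Check for web defacement': compare the fresh hash with the saved one."""
        return self.hash_code(content) != self.saved[link]

tracker = DefacementTracker()
original = b"<html><body><h1>Welcome</h1></body></html>"
tracker.track("p1", original)

unchanged = tracker.is_defaced("p1", original)
defaced = tracker.is_defaced("p1", b"<html><body><h1>Hacked</h1></body></html>")
```

Here `unchanged` is `False` and `defaced` is `True`. Because any byte-level change alters the hash, a check of this kind only suits static pages.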

IV. IMPLEMENTATION


1) On the basis of web page relevance and its defacement checking frequency, a web page (pi) is selected for defacement checking.

2) Calculate the fresh hash code (nci) for the web page pi chosen in step 1, using the Secure Hash Algorithm.

3) Compare the freshly calculated hash code (nci) of the web page pi with its stored hash code ci in the database.

a) If the result of the comparison (nci == ci) is true, then the web page is not defaced and the process stops.

b) If the result of the comparison (nci == ci) is false, then the web page is defaced or the contents of the page have changed, so go to step 4.

4) Use the diff algorithm to compare the two states of the same web page. The diff algorithm works as follows:

• Comparing the characters of two huge text files is not easy to implement and tends to be slow. Comparing numbers is much easier, so the first step is to compute a unique number for each distinct text line. Identical text lines are assigned identical numbers.

• Some options, applied before these numbers are computed, are useful for certain kinds of text: stripping space characters and comparing case-insensitively.

• The core algorithm itself compares two arrays of numbers; the preparation is done in the private DiffCodes method, using a hash table.

• The comparison is invoked through the methods DiffText and DiffInts.

• The core of the algorithm is built using two methods:

LCS: the divide-and-conquer implementation of the longest common subsequence algorithm.

SMS: finds the Shortest Middle Snake.
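The preparation step in the list above, turning text lines into numbers with a hash table, can be sketched as follows. The function name `diff_codes` and its parameters are hypothetical stand-ins modelled on the description, not the signature of the DiffCodes method itself.

```python
def diff_codes(text, codes, trim_space=False, ignore_case=False):
    """Map each line of `text` to an integer code.

    `codes` is a dict shared between both texts, so identical lines in
    either text receive the same number.
    """
    result = []
    for line in text.splitlines():
        if trim_space:
            line = line.strip()   # ignore leading and trailing spaces
        if ignore_case:
            line = line.lower()   # compare case-insensitively
        if line not in codes:
            codes[line] = len(codes) + 1  # next unused number
        result.append(codes[line])
    return result

codes = {}
page_a = "<h1>Welcome</h1>\n<p>News</p>\n<p>Contact</p>"
page_b = "<h1>Hacked</h1>\n  <p>News</p>\n<p>Contact</p>"
a = diff_codes(page_a, codes, trim_space=True)
b = diff_codes(page_b, codes, trim_space=True)
```

In this example `a` is `[1, 2, 3]` and `b` is `[4, 2, 3]`: only the first line differs, and the indented second line still matches because spaces are trimmed.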

Some of the methods used in the diff algorithm are described below:

A. DiffText(string TextA, string TextB)

Finds the difference between two texts, comparing by text lines without any conversion. An array of Items containing the differences is returned.

B. DiffText(string TextA, string TextB, bool trimSpace, bool ignoreSpace, bool ignoreCase)

Finds the difference between two texts, comparing by text lines with some optional conversions. An array of Items containing the differences is returned.

C. Diff(int[] ArrayA, int[] ArrayB)

Finds the difference between two arrays of integers. An array of Items containing the differences is returned.
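To show what such an array of Items might look like, here is a small sketch of an integer-array diff. It deliberately swaps in a plain dynamic-programming longest common subsequence, which is simpler but slower than the divide-and-conquer LCS/SMS method named above. The `Item` field names are assumptions, since the paper does not list them.

```python
from typing import List, NamedTuple

class Item(NamedTuple):
    start_a: int     # first changed index in array A
    start_b: int     # first changed index in array B
    deleted_a: int   # number of entries removed from A
    inserted_b: int  # number of entries inserted from B

def diff_ints(a: List[int], b: List[int]) -> List[Item]:
    n, m = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    items = []
    i = j = 0
    while i < n or j < m:
        if i < n and j < m and a[i] == b[j]:
            i += 1
            j += 1
            continue
        # Start of a changed region; walk until the next common entry.
        start_a, start_b = i, j
        while (i < n or j < m) and not (i < n and j < m and a[i] == b[j]):
            if j >= m or (i < n and lcs[i + 1][j] >= lcs[i][j + 1]):
                i += 1  # entry of A is not in the common subsequence
            else:
                j += 1  # entry of B is not in the common subsequence
        items.append(Item(start_a, start_b, i - start_a, j - start_b))
    return items

changes = diff_ints([1, 2, 3], [4, 2, 3])
```

For the two arrays above, `changes` holds the single item `Item(start_a=0, start_b=0, deleted_a=1, inserted_b=1)`, which says that line 0 of the original page was replaced by one new line. The start indices are what allow the exact defacement location to be reported.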

The screenshot for the prototype web browser is as follows:

Fig. 1

V. RESULT

Let us now try to evaluate our approach in contrast with the existing approach. We will analyze our application on four parameters:

1. Space complexity

2. Number of defacement detections

3. Time complexity

4. Cost (person-months)


approach we manually have to run the application and get the metrics out [4].

Percentage of defaced pages    Existing algorithm    Our approach
10                             O(log n)              O(log n)
20                             O(log n)              O(log n)
30                             O(log n)              O(log n)
40                             O(log n)              O(log n)
50                             O(n)                  O(log n)
60                             O(n)                  O(log n)
70                             O(n)                  O(log n)
80                             O(n^2)                O(log n)
90                             O(n^2)                O(log n)
100                            O(n^2)                O(log n)

Table 2

Number of defacement detections: The number of defacements detected when running our approach, in contrast with the existing genetic-algorithm-based solution, gives the following results.

Note:

• Total pages: 100

• Defaced pages: 80

• Initially started with 20 defaced pages and then kept on defacing 5 pages every 5 minutes

These results can be seen pictorially as:

Fig. 2

Time complexity: The time required by the existing systems, using the genetic algorithm or the read-only memory concept, can be taken as is from the result sets they have supplied. For our approach, we have to run the application manually and collect the metrics.

Percentage of defaced pages    Existing algorithm    Our approach
10                             O(n)                  O(n log n)
20                             O(n)                  O(n log n)
30                             O(n)                  O(n log n)
40                             O(n)                  O(n log n)
50                             O(n)                  O(n log n)
60                             O(n)                  O(n log n)
70                             O(n)                  O(n log n)
80                             O(n)                  O(n log n)
90                             O(n)                  O(n log n)
100                            O(n)                  O(n log n)

Table 3

Cost: The cost in our case was around 6 to 8 person-months, whereas internet-based research shows that any existing approach for defacement detection will take no less than 24 person-months, because such approaches inherently require more debugging and troubleshooting in later stages [3].

The proposed algorithm was compared with the existing integrity-based web defacement detection methods. Our proposed method was found to detect approximately 45% more defacements. The details of the tests can be illustrated in graphical form as:


VI. CONCLUSION

In this paper, a technique is presented for website defacement detection and recovery using a combination of a hash code and a diff algorithm. Priority-based defacement checking reduces the processing overhead of frequently calculating the checksum of the whole website [2]. Since the proposed framework checks the selected pages more frequently, it reduces the risk from defacement. Integrity-based web defacement detection methods are not feasible for dynamic web pages, and the proposed framework is also based on integrity, so it is applicable to static web pages only. The proposed algorithm, used in conjunction with the proposed prototype web browser, will notify server administrators of possible defacements and help them recover the affected pages.

REFERENCES

[1] Dr. Yona Hollander. Prevent Web Site Defacement. Internet Security Advisor, November/December 200

[2] Implementing a Web Browser with Web Defacement Detection Techniques. World of Computer Science and Information Technology Journal (WCSIT), ISSN: 2221-0741, Vol. 1, No. 7, 307-310, 2011.

[3] Medvet, E.; Fillon, C.; Bartoli, A.; Univ. of Trieste, Trieste. Detection of Web Defacements by means of Genetic Programming. Information Assurance and Security, 2007. IAS 2007.

[4] Andrew Cooks, Martin S Olivier. Curtailing Web Defacement Using a Read-Only Strategy.

[5] Bartoli, A.; Davanzo, G.; Medvet, E.; Univ. of Trieste, Trieste, Italy. The Reaction Time to Web Site Defacements. Internet Computing, IEEE, July-Aug. 2009.

[6] M Bishop. Computer Security, Art and Science. Addison Wesley, Boston, MA, USA, 2003.

[7] C Liu, J Marchewka, J Lu, and C-S Yu. Beyond concern—a privacy-trust-behavioral intention model of electronic commerce. Information and Management, January 2004. doi:10.1016/j.im.2004.01.003.

[8] H Nam, J Kim, S J Honga, and S Lee. Secure checkpointing. Journal of Systems Architecture, 48:237–254, March 2003. doi:10.1016/S1383-7621(02)00137-6.

[9] David Buttler, Daniel Rocco and Ling Liu. "Efficient Web Change Monitoring with Page Digest", May 17–22, 2004, New York, USA. ACM 1581139128/04/0005.

[10] Guohun Zhu and YuQing Miao. "Co-operative Monitor Web Page Based on MD5", LNCS 3033, pp. 179–182, Springer-Verlag Berlin Heidelberg 2004.

[11] Project Gamma. Defaced web site archive. http://defaced.projectgamma.com/.

[12] Anonymous. Approximately 35 South African Web sites cracked simultaneously. SA Computer Magazine, 12(5):12, May/June 2004.

[13] B B Madan, K Goeva-Popstojanova, K Vaidyanathan, and K Trivedi. A method for modeling and quantifying the security attributes of intrusion tolerant systems. Performance Evaluation, 56:167–186, March 2004. doi:10.1016/j.peva.2003.07.008.

[14] J Jacob. The basic integrity theorem. In Computer Security
