Comparison of file sanitization techniques in usb based on average file entropy values

Academic year: 2021

PTTA PERPUSTAKAAN TUNKU TUN AMINAH

COMPARISON OF FILE SANITIZATION TECHNIQUES IN USB BASED ON AVERAGE FILE ENTROPY VALUES

NUR AMANINA ONN

A dissertation submitted in fulfillment of the requirement for the award of the Degree of Master of Computer Science (Information Security)

Faculty of Computer Science and Information Technology
Universiti Tun Hussein Onn Malaysia

DEDICATION

I dedicate this humble effort to my beloved mother and father, whose encouragement and prayers enabled me to achieve this success and honor.

ACKNOWLEDGEMENT

I would like to express my gratitude to the Almighty, Allah Taala, for giving me the opportunity to finish this dissertation and complete my Master's degree. I am also grateful to have an honorable supervisor, Dr Kamaruddin Malik Bin Mohamad, and thank him for his sincere guidance and cooperation.

My deep gratitude goes to my parents, Onn Bin Hj Yusof and Aminah Binti Ahmad, who have supported me since day one. Their contributions will never be forgotten. Special thanks to my friends, Fazilah and Dalila, for all their love and support. Finally, I am thankful to all the people who have contributed to the success of this research.

ABSTRACT

Technology has become so advanced that electronic gadgets are now found in every household. Its rapid growth gives digital devices such as smartphones and laptops large storage capacities, which let people keep a great deal of information such as contact lists, photos, videos and even personal information. When this information is no longer useful, users delete it. However, the growth of technology also lets people recover data that has been deleted. Users often do not realise that their deleted data can be recovered and then used by an unauthorized user. Deleted data is invisible but not gone. This is where file sanitization plays its role. File sanitization is the process of deleting the stored content and overwriting it with different characters. In this research, the methods chosen to sanitize files are Write Zero, Write Zero Randomly and Write Zero Alternately. All of these techniques overwrite data with zero. The best technique is chosen by comparing the average entropy values of the files after they have been overwritten. Write Zero is the only one of these techniques that is provided by widely available software such as WipeFile and BitKiller. No software provides the Write Zero Randomly technique, except for sanitizing a disk using dd. For this reason, Write Zero Randomly and the proposed technique, Write Zero Alternately, are developed using the C programming language in Dev-C++. In this research, sanitization with Write Zero has the lowest average entropy value for text document (TXT), Microsoft Word (DOCX) and image (JPG) files, with 100% of the data in the files zero-filled, compared to Write Zero Randomly and Write Zero Alternately. Write Zero Alternately is the next most efficient in terms of average entropy, with 4.64 bpB against 5.02 bpB for its closest competitor, Write Zero Randomly. This shows that Write Zero is the best sanitization method. These file sanitization techniques are important for maintaining confidentiality against unauthorized users.

ABSTRAK

Nowadays, technology is so advanced that electronic gadgets are found in most homes. The rapid advance of technology gives digital devices such as smartphones and laptops large storage capacities for keeping a great deal of information. When this information is no longer used, users delete it. However, advances in technology also make it possible to retrieve data that has been deleted. In this case, users are unaware that deleted data can be restored and then used by outsiders. This is where file sanitization plays its role. File sanitization is the process of erasing the stored content and replacing it with other characters. The techniques chosen are Write Zero, Write Zero Randomly and Write Zero Alternately. All of these techniques replace data with the zero character. The best technique is chosen by comparing the average entropy values after the files have been overwritten with other characters. Write Zero is the only technique available in software such as WipeFile and BitKiller. No software provides the Write Zero Randomly technique, except for cleaning a disk using dd. Therefore, Write Zero Randomly and Write Zero Alternately are developed using the C programming language in Dev-C++. In this study, sanitization using Write Zero has the lowest average entropy value for text document (TXT), Microsoft Word (DOCX) and image (JPG) files, with 100% of the data in the files filled with zeros. Write Zero Alternately is the next most effective in terms of average entropy, with 4.64 bpB against 5.02 bpB for its closest competitor, Write Zero Randomly. This shows that Write Zero is the best file sanitization technique. All of the techniques are important for maintaining the confidentiality of information against unauthorized outside users.

CONTENTS

TITLE
DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES

CHAPTER 1 INTRODUCTION
1.1 Background Study
1.2 Research Motivation
1.3 Objectives
1.4 Research Limitation
1.5 Significance of Research
1.6 Report Organization

CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Anti-Forensic
2.3 Entropy
2.4 File Sanitization
2.4.1 Overview of Techniques for File Sanitization
2.4.2 Write Zero Method
2.5 Comparative Analysis
2.6 Chapter Summary

CHAPTER 3 RESEARCH METHODOLOGY
3.1 Introduction
3.2 Research Steps
3.2.1 Write Zero Alternately (The Proposed Method)
3.3 Research Framework
3.4 Chapter Summary

CHAPTER 4 IMPLEMENTATION
4.1 Introduction
4.2 Hardware and Software Requirement
4.3 Write Zero Technique
4.3.1 Implementation of Write Zero
4.4 Write Zero Randomly Technique
4.4.1 Implementation of Write Zero Randomly
4.5 Write Zero Alternately (The Proposed Technique)
4.5.1 Implementation of Write Zero Alternately
4.6 Chapter Summary

CHAPTER 5 RESULT AND DISCUSSIONS
5.1 Introduction
5.2 Entropy of Data Set Before Sanitized
5.3 Evaluation of Write Zero Alternately
5.4 Evaluation of Write Zero and Write Zero Randomly (Existing Technique)
5.4.1 Evaluation of Write Zero
5.4.2 Evaluation of Write Zero Randomly
5.5 Comparison of Results for Average File Entropy
5.6 Chapter Summary

CHAPTER 6 CONCLUSION AND FUTURE WORK
6.1 Introduction
6.2 Research Contributions
6.3 Suggestions for Future Work
6.4 Summary

REFERENCES

LIST OF TABLES

Table 2.1 Comparative Analysis of Different File Sanitization Methods
Table 4.1 Hardware and Software Specification
Table 5.1 Average Entropy of 30 TXT Files Before Sanitized
Table 5.2 Average Entropy of 30 DOCX Files Before Sanitized
Table 5.3 Average Entropy of 30 JPG Files Before Sanitized
Table 5.4 Average Entropy of 30 TXT Files after Write Zero Alternately Technique
Table 5.5 Average Entropy of 30 DOCX Files after Write Zero Alternately Technique
Table 5.6 Average Entropy of 30 JPG Files after Write Zero Alternately Technique
Table 5.7 Average Entropy of 30 TXT Files after Write Zero Technique
Table 5.8 Average Entropy of 30 DOCX Files after Write Zero Technique
Table 5.9 Average Entropy of 30 JPG Files after Write Zero Technique
Table 5.10 Average Entropy of 30 TXT Files after Write Zero Randomly Technique
Table 5.11 Average Entropy of 30 DOCX Files after Write Zero Randomly Technique
Table 5.12 Average Entropy of 30 JPG Files after Write Zero Randomly Technique
Table 5.13 Average File Entropy after Sanitized - Write Zero, Write Zero Randomly and Write Zero Alternately

LIST OF FIGURES

Figure 2.1 File with Value of Zero (Lance, 2013)
Figure 2.2 File That Has Been Compressed (Lance, 2013)
Figure 3.1 Research Steps of Comparison For File Sanitization Techniques That Overwrite with Zero
Figure 3.2 Research Framework of File Sanitization
Figure 4.1 TXT and DOCX Files in USB Flash Drive
Figure 4.2 JPG Files in USB Flash Drive
Figure 4.3 Calculate Entropy of File
Figure 4.4 File Entropy Value
Figure 4.5 Sanitize File using WipeFile
Figure 4.6 After Sanitized File 000001.docx
Figure 4.7 Step 1 of Imaging USB Flash Drive
Figure 4.8 Step 2 of Imaging USB Flash Drive
Figure 4.9 Step 3 of Imaging USB Flash Drive
Figure 4.10 Step 4 of Imaging USB Flash Drive
Figure 4.11 Step 5 of Imaging USB Flash Drive
Figure 4.12 Step 6 of Imaging USB Flash Drive
Figure 4.13 Step 7 of Imaging USB Flash Drive
Figure 4.14 Step 8 of Imaging USB Flash Drive
Figure 4.15 Step 9 of Imaging USB Flash Drive
Figure 4.16 Step 10 of Imaging USB Flash Drive
Figure 4.17 Step 1 of Retrieving Sanitized File (DOCX)
Figure 4.18 Step 2 of Retrieving Sanitized File (DOCX)
Figure 4.19 Step 3 of Retrieving Sanitized File (DOCX)
Figure 4.20 Step 4 of Retrieving Sanitized File (DOCX)
Figure 4.21 Sanitized TXT and JPG File
Figure 4.22 Sanitized DOCX File (Write Zero) Cannot Be Opened
Figure 4.23 Sanitized JPG File (Write Zero) Cannot Be Opened
Figure 4.24 Sanitized TXT File (Write Zero) Can Be Opened
Figure 4.25 Algorithm Steps for Write Zero Randomly
Figure 4.26 Code Fragment for Write Zero Randomly
Figure 4.27 Program of Write Zero Randomly (To Find File, Open File and Find Size of File)
Figure 4.28 Program of Write Zero Randomly (To Open File and Do Sanitization Process)
Figure 4.29 Program of Write Zero Randomly (To Close and Remove File)
Figure 4.30 Sanitized DOCX File (Write Zero Randomly) Cannot Be Opened
Figure 4.31 Sanitized JPG File (Write Zero Randomly) Cannot Be Opened
Figure 4.32 TXT File (Write Zero Randomly) Before Sanitized
Figure 4.33 Sanitized TXT File (Write Zero Randomly) Can Be Opened
Figure 4.34 Algorithm Steps for Write Zero Alternately
Figure 4.35 Code Fragment for Write Zero Alternately
Figure 4.36 Program of Write Zero Alternately (To Find File, Open File and Find Size of File)
Figure 4.37 Program of Write Zero Alternately (To Open File and Do Sanitization Process)
Figure 4.38 Program of Write Zero Alternately (To Close and Remove File)
Figure 4.39 Sanitized DOCX File (Write Zero Alternately) Cannot Be Opened
Figure 4.40 Sanitized JPG File (Write Zero Alternately) Cannot Be Opened
Figure 4.41 TXT File (Write Zero Alternately) Before Sanitized
Figure 4.42 TXT File (Write Zero Alternately) After Sanitized
Figure 5.1 Average Entropy of Three Different Files (TXT, DOCX and JPG) Before Sanitized
Figure 5.2 Average File Entropy Before and After Using Write Zero Alternately
Figure 5.3 Average File Entropy Before and After Using Write Zero
Figure 5.4 Average File Entropy Before and After Using Write Zero Randomly
Figure 5.5 Summary Result of Average File Entropy after Sanitized Using Write Zero, Write Zero Randomly and Write Zero Alternately

LIST OF APPENDICES

APPENDIX A Coding

CHAPTER 1

INTRODUCTION

1.1 Background Study

Information security and privacy have become some of the most important issues today. The fast growth of technology has produced a variety of digital devices such as smartphones. Information such as e-mail, databases and spreadsheets is at the core of many companies and needs a large amount of disk space (Alexander, 2018). Today's digital devices offer high storage capacity that lets people keep all sorts of data on them, including photos, music files, videos, contact lists and even personal information. When people no longer need the data, or want to protect their privacy, they delete it in the hope that no one will ever see their private information.

However, not many people are aware that deleting data does not completely erase it. Deletion makes the data invisible, but it is not gone. With today's technology, an unauthorized user can easily recover the data. In addition, a data broker can obtain the private information of financial organizations or business companies with just a smartphone number (Chan, 2017). An extra step is needed to wipe the data completely and prevent its recovery by an unauthorized user. For this reason, file sanitization is a best practice for maintaining the security and privacy of data. File sanitization is the process of permanently deleting the content of the data and making sure that it is unrecoverable by using software to overwrite it with different characters (Srinivasan, 2012).

In this research, all of the sanitization techniques discussed overwrite data with the character zero (0). The techniques are Write Zero, Write Zero Randomly and Write Zero Alternately. File entropy is used to measure the randomness of the files. The best file sanitization technique is determined from the file entropy produced after sanitization: the lower the file entropy, the better the technique that overwrites with the character zero.

1.2 Research Motivation

Data security has become one of the biggest concerns in today's world. Digital devices offer large storage capacity, which allows users to store a great deal of important data on them. Protecting a user's data is crucial to preventing its unauthorized use. For example, a user sells a gadget to a shopkeeper. Beforehand, the user deletes all the important information on the gadget, hoping that no one will find it. However, readily available data recovery tools give the shopkeeper the opportunity to recover the information on the gadget and misuse it.

Sensitive information can end up in the wrong hands if data is not securely deleted. In an investigation, Garfinkel & Abhi (2017) collected 158 hard drives from sources such as computer swap meets, used computer equipment stores and online auction services. They found a great deal of sensitive data, including credit card numbers, financial records, medical data and other personal information. One of the causes they identified is that employees are not well trained in data sanitization techniques. That is why they were able to find sensitive data from a California electronics manufacturer, a Chicago bank's ATM and a supermarket credit card processing terminal.

Verizon Communications Inc. is a telecommunications company from the United States of America. The company has released a series of data breach investigation reports since 2008 (Xu et al., 2014). According to Verizon Communications Inc., a data breach is an incident that results in the exposure or disclosure of data to an unauthorized person (Verizon, 2015). Based on its 2013 investigation report, 62% of data breach incidents take months, and sometimes years, to be discovered. On top of that, 70% of the people who have been exposed to the data are not the data possessors (Xu et al., 2014).

To prevent this, data sanitization is an effective way to overwrite the data, given that data cannot be eliminated just by deleting it. A device that has been sanitized should not contain any usable residual data, and even advanced forensic tools should not be able to recover the sanitized data. It is therefore crucial to find techniques that prevent the recovery of data and so keep one's privacy safe. Several file sanitization techniques can be studied to find the best one, based on the file entropy and the performance of the technique in wiping data.

1.3 Objectives

There are three objectives of this research:

i. To study different file sanitization techniques, namely Write Zero and Write Zero Randomly.

ii. To propose and develop an enhanced file sanitization technique, Write Zero Alternately.

iii. To test the proposed technique, Write Zero Alternately, and compare it with the two existing techniques, Write Zero and Write Zero Randomly, in terms of average file entropy values.

1.4 Research Limitation

This research focuses on three file sanitization techniques that overwrite with zero and are used to wipe files on a USB flash drive. The three techniques are Write Zero, Write Zero Randomly and Write Zero Alternately. Each technique overwrites in a different way: Write Zero overwrites all data with zero, Write Zero Randomly overwrites randomly selected data within the file with zero, and Write Zero Alternately overwrites alternate data within the file with zero.
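
The three overwrite patterns can be illustrated with a minimal C sketch that works on an in-memory buffer rather than on a file. It is not the program developed in this research. The function names are hypothetical, "alternate data" is assumed to mean every other byte starting from the first, and "random data" is assumed to mean a caller-chosen number of randomly picked byte positions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Write Zero: overwrite every byte of the buffer with zero. */
void write_zero(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = 0;
}

/* Write Zero Alternately (assumed pattern): zero every other byte,
   starting with the first, and leave the bytes in between untouched. */
void write_zero_alternately(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i += 2)
        buf[i] = 0;
}

/* Write Zero Randomly (assumed pattern): zero `picks` positions chosen
   with rand(). The same position may be picked more than once, so at
   most `picks` bytes are zeroed. rand() is for illustration only. */
void write_zero_randomly(unsigned char *buf, size_t len,
                         size_t picks, unsigned int seed)
{
    if (len == 0)
        return;
    srand(seed);
    for (size_t k = 0; k < picks; k++)
        buf[(size_t)rand() % len] = 0;
}
```

Under these assumptions, Write Zero leaves no original byte behind, Write Zero Alternately leaves about half of the original bytes, and Write Zero Randomly leaves an amount that depends on how many positions are picked.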

In this study, the application for the proposed technique is developed using the C programming language, whereas the Write Zero technique is applied using existing software, WipeFile. According to a review on Lifewire.com, a website with information on file sanitization software, no software was available for Write Zero Randomly at the time this research was undertaken (Fisher, 2018). Thus, a prototype for that technique is also developed in C.

The file types sanitized are text document (TXT), Microsoft Word (DOCX) and image (JPG). After sanitization, FTK Imager is used to recover the files, and the file entropy is then measured using CrypTool. The results are compared to see the randomness produced by each technique for text, document and image files. The best technique is chosen based on file entropy: the file with the lowest entropy was sanitized by the best method that overwrites data with zero.

1.5 Significance of Research

File sanitization techniques are most useful when a user wants to delete data permanently. The main advantage of this research is that the techniques provide a secure way of removing data. After the data is sanitized, other people should not be able to recover it, even with a data recovery tool. The new technique, Write Zero Alternately, is an enhancement of the existing techniques Write Zero and Write Zero Randomly.

File sanitization can be categorized as an anti-forensic technique. Write Zero Alternately is an improved way of sanitizing files compared to Write Zero Randomly because more data in the file is replaced with zero. Hence, the technique also improves on anti-forensic practice, as more data has been replaced and is much harder to recover. When data is hard to recover, unauthorized users are prevented from misusing it. This sort of cybercrime can therefore be controlled when data is sanitized permanently using a file sanitization technique.

Besides that, file sanitization is very important for organisations that need to keep sensitive information confidential. For organisations, protection of information is paramount. An effective sanitization technique is one of the critical aspects of making sure an organisation's sensitive information is protected against unauthorized people (Kissel et al., 2014). Therefore, a better file sanitization technique provides better confidentiality for organisations. In this research, Write Zero Alternately is more secure than Write Zero Randomly in terms of data privacy because more data is replaced with zero.

1.6 Report Organization

This chapter gives an overview of file sanitization and the motivation for this research. It also lists the objectives, which state the purpose of the research. The research limitation narrows the scope of the comparison between file-wiping techniques, and the significance of the research explains its contribution. Chapter 2 reviews the literature on file sanitization methods and presents a comparative analysis of the three chosen methods. Chapter 3 explains the research process followed throughout this research. Lastly, the research framework is explained to clarify how files are sanitized.

CHAPTER 2

LITERATURE REVIEW

2.1 Introduction

This literature review presents the research and analysis needed to compare the different methods of sanitizing files. It first explains anti-forensics and file sanitization, then describes the existing file sanitization methods in detail. The techniques are then compared to identify their similarities and differences. The aim of this chapter is to provide a better understanding of file sanitization and of the existing techniques that can be used to sanitize files.

2.2 Anti-Forensic

The rapid growth of technology can turn computers and other gadgets into weapons when they are used with malicious intent, such as stealing top secret information. Anti-forensics is a collection of tricks and techniques used with the intention of hindering forensic investigation (Jain & Chhabra, 2014). It is also defined as any effort or action that compromises the availability of evidence to the forensic process (Stüttgen & Cohen, 2013).

Anti-forensic techniques are close to how data sanitization works. Some techniques are applied to storage devices so that data cannot be recovered (Poonia, 2014). Anti-forensics comprises the steps taken to frustrate or avoid a forensic investigation, and its main purpose is to prevent evidence of a crime from being discovered. In addition, open source tools and software have been designed to apply anti-forensic techniques (Jain & Chhabra, 2014). Anti-forensics has four ultimate goals. The first is to avoid detection that some event has taken place. The second is to disrupt the collection of information. The third is to increase the amount of time an examiner needs to spend on a case, and the last is to cast doubt on testimony and the forensic report (Garfinkel, 2017).

Michael Perklin mentioned at DEF CON, the world's longest-running and largest underground hacking conference, that file sanitization is one of three popular techniques. The other two are encryption and physical destruction. These techniques are all categorized as classic anti-forensic techniques (Perklin, 2012). File sanitization is also categorized as artifact sanitization, a category in the original taxonomy of anti-forensics. The other techniques in the artifact sanitization category are disk degaussing, disk sanitization, generic data sanitization, log sanitization, metadata sanitization and registry sanitization (Conlan et al., 2016).

Data sanitization destroys the data stored in memory. Tools such as Eraser and BCWipe have been developed to sanitize files. These tools destroy files by repeatedly overwriting the data in them. Artifact sanitization takes little time and is efficient. However, sanitization has limitations. Some sanitization tasks are quite difficult, such as deleting a file that is contained in the master file table. As a result, some tools cannot completely overwrite the file, leaving residual data and traces of the file (Jain & Chhabra, 2014). Both researchers and practitioners need to truly understand the effects of anti-forensics.

2.3 Entropy

Entropy can be described as a measure of randomness. The concept originates from Claude E. Shannon, who applied it to digital communications. Shannon was interested in finding the theoretical maximum amount by which a digital file can be compressed (Hartman, 2013). Entropy measures the uncertainty of a set of probabilities and the compressibility of information.

Similarly, file entropy represents the data in a selected file: it measures the amount of information contained in the file. For example, once a file has been selected, its entropy value can be obtained simply by applying the file entropy calculation to it. Shannon's entropy is given in equation (1) below:

$$H = -\sum_{i=1}^{n} P_i \log_2 P_i \qquad (1)$$

In equation (1), $i$ indexes the $n$ possible different symbols and $P_i$ is the probability of occurrence of the $i$th symbol. For a given set of symbols, both the number of different symbols and how often each occurs matter. The fewer the distinct symbols and the more often they recur, the lower the entropy of the set of symbols (Medsger & Srinivasan, 2012). When a file is compressed, various bit patterns are replaced with shorter patterns. Consequently, the higher the entropy of the data in a file, the less the file can be compressed (Hartman, 2013).

Applied to the byte values in a file, Shannon's formula produces a result between zero (0) and eight (8), measured in bits per byte (bpB). The closer the entropy is to zero (0), the more orderly and non-random the data is; the closer it is to eight (8), the more random the data is (Lance, 2013). For this experiment, the technique with the lowest entropy, that is, the value closest to zero (0), is the best technique for file sanitization using zero. The ForensicKB website by Lance also gives examples and explanations of file entropy. The simplest example from ForensicKB is a file that has been filled entirely with the value zero.
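
As a worked illustration of equation (1), consider the two extremes of that range. In a file where every byte is zero, a single symbol occurs with probability 1. In a file where all 256 byte values are equally likely, each symbol has probability 1/256. These values follow directly from the formula and are not results of this research:

$$H_{\text{all zero}} = -\left(1 \cdot \log_2 1\right) = 0 \text{ bpB}, \qquad H_{\text{uniform}} = -\sum_{i=1}^{256} \frac{1}{256} \log_2 \frac{1}{256} = \log_2 256 = 8 \text{ bpB}$$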

REFERENCES

Alexander, P. (2018). Choosing the Best Data Storage Solution. Retrieved from https://www.entrepreneur.com/article/172226

Bahl, V., Leong, D., Jiayan, G., Siang, J., & Mei, T. (2012). Secure Data Shredder. Proceedings of the Global Engineering, Science and Technology Conference. Bangladesh. pp. 28-29.

Chan, C. (2017). Putting All Your Data in One Smartphone Basket. Retrieved from https://www.wired.com/story/putting-all-your-data-in-one-smartphone-basket/

Conlan, K., Baggili, I., & Breitinger, F. (2016). Anti-forensics: Furthering digital forensic science through a new extended, granular taxonomy. Digital Investigation, 18, pp. 66–75.

Cretu, G. F., Stavrou, A., Stolfo, S. J., & Keromytis, A. D. (2007). Data Sanitization: Improving the Forensic Utility of Anomaly Detection Systems. Workshop on Hot Topics in System Dependability (HotDep). Edinburgh, UK. pp. 1-6.

Cui, W., Liu, S., Wu, Z., & Wei, H. (2014). How Hierarchical Topics Evolve in Large Text Corpora. IEEE Transactions On Visualization And Computer Graphics, 20(12), pp. 2281–2290.

Diesburg, S., Feldhaus, C. A., Fardan, M. Al, Schlicht, J., & Ploof, N. (2015). Is Your Data Gone? Comparing Perceived Effectiveness of Thumb Drive Deletion Methods to Actual Effectiveness. Cryptography and Security. arXiv preprint arXiv:1512.08986.

Dunn, J. E. (2015). Best disk wiping tools – how to securely clean hard drives, smartphones and SSDs. Retrieved from https://www.techworld.com/security/best-disk-wiping-tools-securely-cleaning-hard-drives-smartphones-ssds-3627310/

Fisher, T. (2016). What is the Write Zero Method? Retrieved from https://www.lifewire.com/what-is-the-write-zero-method-2626052

Fisher, T. (2017). Data Sanitization Methods. Retrieved from https://www.lifewire.com/data-sanitization-methods-2626133

Fisher, T. (2018). Retrieved from https://www.lifewire.com/dban-dariks-boot-and-nuke-review-2619130

Fisher, T. (2018). 35 Free File Shredder Software Programs. Retrieved from https://www.lifewire.com/free-file-shredder-software-programs-2619149

Garfinkel, S. L. (2014). The prevalence of encoded digital trace evidence in the nonfile space of computer media. Journal of Forensic Sciences, 59(5), pp. 1386–1393.

Garfinkel, S. L., & A. S. (2017). Discarded Hard Drives Can Be Dangerous. Retrieved from http://www.computerweekly.com/feature/Discarded-hard-drives-can-be-dangerous

Greg. (2013). Using dd To Repeatedly Erase A Specific Range Of Sectors On The Hard Disk. Retrieved from https://zedt.eu/tech/linux/using-dd-to-repeatedly-erase-a-specific-range-of-sectors-on-the-hard-disk/

Hartman, K. G. (2013). Calculate File Entropy. Retrieved from https://www.kennethghartman.com/calculate-file-entropy/

Hughes, G., & Coughlin, T. (2014). Tutorial on Disk Drive Data Sanitization. Nist Special Publication, pp. 1–15.

Hoffman, C. (2016). You Only Need to Wipe a Disk Once to Securely Erase It. Retrieved from https://www.howtogeek.com/115573/htg-explains-why-you-only-have-to-wipe-a-disk-once-to-erase-it/

Jain, A., & Chhabra, G. S. (2014). Anti-forensics techniques: An analytical review. 2014 7th International Conference on Contemporary Computing (IC3). Noida. pp. 412–418.

Kissel, R., Regenscheid, A., Scholl, M., & Stine, K. (2014). Guidelines for Media Sanitization. US Department of Commerce: National Institute of Standards and Technology.

Lance. (2013). File Entropy Explained. Retrieved from http://www.forensickb.com/2013/03/file-entropy-explained.html

Medsger, J., Srinivasan, A., & Wu, J. (2015). Information Theoretic and Statistical Drive Sanitization Models. Journal of Information Privacy and Security, 11(2), pp. 97–117.

National Science Foundation. (2017). Producing The Digital Body (Home). Retrieved from https://digitalcorpora.org/

Perklin, M. (2012). Anti-Forensics and Anti-Anti-Forensics. Talk at DEF CON, 20. Las Vegas.

Poonia, A. S. (2014). Data Wiping and Anti Forensic Techniques. Compusoft, 3(12), pp. 1374–1376.

Postgraduate, N. (2007). Anti-Forensics: Techniques, Detection and Countermeasures. In 2nd International Conference on i-Warfare and Security. Monterey, California. pp. 77-84.

Savoldi, A., Piccinelli, M., & Gubian, P. (2012). A statistical method for detecting on-disk wiped areas. Digital Investigation, 8(3–4), pp. 194–214.

Singh, B., Saharan, R., Somani, G., & Gupta, G. (2016). Secure File Deletion For Solid State Drives. In Peterson, G., Shenoi, S. Advances in Digital Forensics XII. USA: Springer. pp. 345–346.

Singh, V., Kesharwani, L., Saran, V., Gupta, A. K., & Lal, E. P. (2015). "Efficacy of open source tools for recovery of unconventionally deleted data for forensic consideration". International Journal of Social Relevance & Concern (IJSRC), 3(9), pp. 53-59.

Srinivasan, A. (2012). ERASE- EntRopy-based SAnitization of SEnsitive Data for Privacy Preservation. The 7th International Conference for Internet Technology and Secured Transactions. George Mason University. pp. 427–432.

Stüttgen, J., & Cohen, M. (2013). Anti-forensic resilient memory acquisition. Digital Investigation, 10(13), pp. 105–115.

VandenBrink, R. (2016). Using File Entropy to Identify "Ransomwared" Files. Retrieved from https://isc.sans.edu/forums/diary/Using+File+Entropy+to+Identify+Ransomwared+Files/21351/

Verizon. (2015). 2015 Data Breach Investigations Report. Verizon RISK Team, 51(2), pp. 39–54.

Xu, L., Jiang, C., Wang, J., Yuan, J., & Ren, Y. (2014). Information security in big data: Privacy and data mining. IEEE Access, 2, pp. 1151–1178.

Zhang, P., Niu, S., Huang, Z., & Qin, X. (2017). Adaptive Data Wiping Scheme with Adjustable Parameters for Ext4 File System. Chinese Journal of Electronics,

Zoubek, C., & Sack, K. (2017). Selective deletion of non-relevant data. Digital Investigation, 20, pp. 92–98.
