Perpustakaan Tunku Tun Aminah (PTTA)
COMPARISON OF FILE SANITIZATION TECHNIQUES IN USB BASED ON AVERAGE FILE ENTROPY VALUES
NUR AMANINA ONN
A dissertation submitted in
fulfillment of the requirement for the award of the Degree of Master of Computer Science (Information Security)
Faculty of Computer Science and Information Technology Universiti Tun Hussein Onn Malaysia
DEDICATION
I dedicate this humble effort to my beloved mother and father, whose encouragement and prayers enabled me to achieve this success and honor.
ACKNOWLEDGEMENT
I would like to express my gratitude to the Almighty, Allah Taala, for giving me the opportunity to finish this dissertation and to complete my Master's degree. I am also grateful to have an honorable supervisor, Dr Kamaruddin Malik Bin Mohamad, and I thank him for his sincere guidance and cooperation.
My deep gratitude goes to my parents, Onn Bin Hj Yusof and Aminah Binti Ahmad, who have been supporting me since day one. Their contributions will never be forgotten. Special thanks to my friends, Fazilah and Dalila, for all their love and support. Finally, I am thankful to everyone who has contributed to the success of this research.
ABSTRACT
Technology has advanced to the point that electronic gadgets are found in every household. Digital devices such as smartphones and laptops now offer large storage capacities, which let people keep much of their information on them, such as contact lists, photos, videos and even personal information. When this information is no longer useful, users delete it. However, the growth of technology also makes it possible to recover data that has been deleted. Users often do not realise that their deleted data can be recovered and then used by an unauthorized user: deleted data is invisible but not gone. This is where file sanitization plays its role. File sanitization is the process of erasing the stored content of a file by overwriting it with different characters. In this research, the methods chosen to sanitize files are Write Zero, Write Zero Randomly and Write Zero Alternately. All of these techniques overwrite data with zero. The best technique is chosen by comparing the average entropy values of the files after they have been overwritten. Write Zero is the only one of these techniques provided by existing software such as WipeFile and BitKiller. No software provides the Write Zero Randomly technique, except for sanitizing disks using dd. Therefore, Write Zero Randomly and the proposed technique, Write Zero Alternately, were developed using the C programming language in Dev-C++. In this research, sanitization with Write Zero gives the lowest average entropy value for text documents (TXT), Microsoft Word documents (DOCX) and images (JPG), because 100% of the data in the files is zero-filled, unlike with Write Zero Randomly and Write Zero Alternately. Write Zero Alternately comes next with an average entropy of 4.64 bpB, ahead of its closest competitor, Write Zero Randomly, with 5.02 bpB. This shows that Write Zero is the best sanitization method. These file sanitization techniques are important for keeping data confidential against unauthorized users.
ABSTRAK
Today, technology is so advanced that electronic gadgets are found in most homes. The rapid progress of technology gives digital devices such as smartphones and laptops large storage capacities for keeping a great deal of information. When this information is no longer used, users delete it. However, advances in technology also make it possible to retrieve data that has been deleted. In this case, users do not realise that deleted data can be restored and then used by outsiders. This is where file sanitization plays its role. File sanitization is the process of erasing the stored content and replacing it with other characters. The techniques chosen are Write Zero, Write Zero Randomly and Write Zero Alternately. All of these techniques replace data with the zero character. The best technique is chosen by comparing the average entropy values after the files have been overwritten. Write Zero is the only technique for which software is available, such as WipeFile and BitKiller. No software provides the Write Zero Randomly technique, except for wiping disks using dd. Therefore, Write Zero Randomly and Write Zero Alternately were developed using the C programming language in Dev-C++. In this study, sanitization using Write Zero has the lowest average entropy value for text documents (TXT), Microsoft Word documents (DOCX) and images (JPG), with 100% of the data in the files filled with zero. Next, Write Zero Alternately is more effective in terms of average entropy, at 4.64 bpB, than its closest competitor, Write Zero Randomly, at 5.02 bpB. This shows that Write Zero is the best file sanitization technique. All of the techniques are important for maintaining the confidentiality of information against unauthorized outsiders.
CONTENTS
TITLE
DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES
CHAPTER 1 INTRODUCTION
1.1 Background Study
1.2 Research Motivation
1.3 Objectives
1.4 Research Limitation
1.5 Significance of Research
1.6 Report Organization
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Anti-Forensic
2.3 Entropy
2.4 File Sanitization
2.4.1 Overview of Techniques for File Sanitization
2.4.2 Write Zero Method
2.5 Comparative Analysis
2.6 Chapter Summary
CHAPTER 3 RESEARCH METHODOLOGY
3.1 Introduction
3.2 Research Steps
3.2.1 Write Zero Alternately (The Proposed Method)
3.3 Research Framework
3.4 Chapter Summary
CHAPTER 4 IMPLEMENTATION
4.1 Introduction
4.2 Hardware and Software Requirement
4.3 Write Zero Technique
4.3.1 Implementation of Write Zero
4.4 Write Zero Randomly Technique
4.4.1 Implementation of Write Zero Randomly
4.5 Write Zero Alternately (The Proposed Technique)
4.5.1 Implementation of Write Zero Alternately
4.6 Chapter Summary
CHAPTER 5 RESULT AND DISCUSSIONS
5.1 Introduction
5.2 Entropy of Data Set Before Sanitized
5.3 Evaluation of Write Zero Alternately
5.4 Evaluation of Write Zero and Write Zero Randomly (Existing Technique)
5.4.1 Evaluation of Write Zero
5.4.2 Evaluation of Write Zero Randomly
5.5 Comparison of Results for Average File Entropy
5.6 Chapter Summary
CHAPTER 6 CONCLUSION AND FUTURE WORK
6.1 Introduction
6.2 Research Contributions
6.3 Suggestions for Future Work
6.4 Summary
REFERENCES
LIST OF TABLES
Table 2.1 Comparative Analysis of Different File Sanitization Methods
Table 4.1 Hardware and Software Specification
Table 5.1 Average Entropy of 30 TXT Files Before Sanitized
Table 5.2 Average Entropy of 30 DOCX Files Before Sanitized
Table 5.3 Average Entropy of 30 JPG Files Before Sanitized
Table 5.4 Average Entropy of 30 TXT Files after Write Zero Alternately Technique
Table 5.5 Average Entropy of 30 DOCX Files after Write Zero Alternately Technique
Table 5.6 Average Entropy of 30 JPG Files after Write Zero Alternately Technique
Table 5.7 Average Entropy of 30 TXT Files after Write Zero Technique
Table 5.8 Average Entropy of 30 DOCX Files after Write Zero Technique
Table 5.9 Average Entropy of 30 JPG Files after Write Zero Technique
Table 5.10 Average Entropy of 30 TXT Files after Write Zero Randomly Technique
Table 5.11 Average Entropy of 30 DOCX Files after Write Zero Randomly Technique
Table 5.12 Average Entropy of 30 JPG Files after Write Zero Randomly Technique
Table 5.13 Average File Entropy after Sanitized - Write Zero, Write Zero Randomly and Write Zero Alternately
LIST OF FIGURES
Figure 2.1 File with Value of Zero (Lance, 2013)
Figure 2.2 File That Has Been Compressed (Lance, 2013)
Figure 3.1 Research Steps of Comparison For File Sanitization Techniques That Overwrite with Zero
Figure 3.2 Research Framework of File Sanitization
Figure 4.1 TXT and DOCX Files in USB Flash Drive
Figure 4.2 JPG Files in USB Flash Drive
Figure 4.3 Calculate Entropy of File
Figure 4.4 File Entropy Value
Figure 4.5 Sanitize File using WipeFile
Figure 4.6 After Sanitized File 000001.docx
Figure 4.7 Step 1 of Imaging USB Flash Drive
Figure 4.8 Step 2 of Imaging USB Flash Drive
Figure 4.9 Step 3 of Imaging USB Flash Drive
Figure 4.10 Step 4 of Imaging USB Flash Drive
Figure 4.11 Step 5 of Imaging USB Flash Drive
Figure 4.12 Step 6 of Imaging USB Flash Drive
Figure 4.13 Step 7 of Imaging USB Flash Drive
Figure 4.14 Step 8 of Imaging USB Flash Drive
Figure 4.15 Step 9 of Imaging USB Flash Drive
Figure 4.16 Step 10 of Imaging USB Flash Drive
Figure 4.17 Step 1 of Retrieving Sanitized File (DOCX)
Figure 4.18 Step 2 of Retrieving Sanitized File (DOCX)
Figure 4.19 Step 3 of Retrieving Sanitized File (DOCX)
Figure 4.20 Step 4 of Retrieving Sanitized File (DOCX)
Figure 4.21 Sanitized TXT and JPG File
Figure 4.22 Sanitized DOCX File (Write Zero) Cannot Be Opened
Figure 4.23 Sanitized JPG File (Write Zero) Cannot Be Opened
Figure 4.24 Sanitized TXT File (Write Zero) Can Be Opened
Figure 4.25 Algorithm Steps for Write Zero Randomly
Figure 4.26 Code Fragment for Write Zero Randomly
Figure 4.27 Program of Write Zero Randomly (To Find File, Open File and Find Size of File)
Figure 4.28 Program of Write Zero Randomly (To Open File and Do Sanitization Process)
Figure 4.29 Program of Write Zero Randomly (To Close and Remove File)
Figure 4.30 Sanitized DOCX File (Write Zero Randomly) Cannot Be Opened
Figure 4.31 Sanitized JPG File (Write Zero Randomly) Cannot Be Opened
Figure 4.32 TXT File (Write Zero Randomly) Before Sanitized
Figure 4.33 Sanitized TXT File (Write Zero Randomly) Can Be Opened
Figure 4.34 Algorithm Steps for Write Zero Alternately
Figure 4.35 Code Fragment for Write Zero Alternately
Figure 4.36 Program of Write Zero Alternately (To Find File, Open File and Find Size of File)
Figure 4.37 Program of Write Zero Alternately (To Open File and Do Sanitization Process)
Figure 4.38 Program of Write Zero Alternately (To Close and Remove File)
Figure 4.39 Sanitized DOCX File (Write Zero Alternately) Cannot Be Opened
Figure 4.40 Sanitized JPG File (Write Zero Alternately) Cannot Be Opened
Figure 4.41 TXT File (Write Zero Alternately) Before Sanitized
Figure 4.42 TXT File (Write Zero Alternately) After Sanitized
Figure 5.1 Average Entropy of Three Different Files (TXT, DOCX and JPG) Before Sanitized
Figure 5.2 Average File Entropy Before and After Using Write Zero Alternately
Figure 5.3 Average File Entropy Before and After Using Write Zero
Figure 5.4 Average File Entropy Before and After Using Write Zero Randomly
Figure 5.5 Summary Result of Average File Entropy after Sanitized Using Write Zero, Write Zero Randomly and Write Zero Alternately
LIST OF APPENDICES
APPENDIX A Coding
CHAPTER 1 INTRODUCTION
1.1 Background Study
Information security and privacy have become one of the most important issues today. The fast growth of technology has produced a variety of digital devices such as smartphones. Information such as e-mail, databases and spreadsheets is at the core of many companies, and it needs a large amount of disk space (Alexander, 2018). Today's digital devices offer high storage capacity that enables people to keep all sorts of data on them. Some of the data that people usually keep on their digital devices are photos, music files, videos, contact lists and even personal information. When people no longer want the data, or for the sake of privacy, they delete it in the hope that no one will ever look at their private information.
However, not many people are aware that simply deleting data does not erase it completely. Deleting data makes it invisible, but it is not gone. With today's technology, an unauthorized user can easily recover the data. In addition, data brokers can gain private information of financial organizations or business companies with just a smartphone number (Chan, 2017). An extra step is needed to wipe the data completely and prevent its recovery by an unauthorized user. For this reason, file sanitization is a best practice for keeping data secure and private. File sanitization is the process of permanently deleting the content of a file and making sure that it is unrecoverable by overwriting it with different characters using software (Srinivasan, 2012).
In this research, all of the sanitization techniques discussed overwrite data using the character zero (0). The techniques are Write Zero, Write Zero Randomly and Write Zero Alternately. File entropy is used to measure the randomness of the files. The best file sanitization technique is determined from the file entropy produced after sanitization: the lower the file entropy, the better the technique that overwrites using the character zero.
1.2 Research Motivation
Data security has become one of the highest concerns in today's world. Digital devices offer large storage capacity, allowing users to store much important data on them. Protecting a user's data is crucial to preventing its unauthorized use. For example, suppose a user sells a gadget to a shopkeeper. Beforehand, the user deletes all the important information on the gadget in the hope that no one will find it. However, readily available data recovery tools give the shopkeeper the opportunity to recover the information on the gadget and misuse it.
Sensitive information can end up in the wrong hands if data is not securely deleted. In one investigation, Garfinkel and Abhi (2017) collected 158 hard drives from sources such as computer swap meets, used computer equipment stores and online auction services. They found a large amount of sensitive data, such as credit card numbers, financial records, medical data and other personal information. One of the causes they identified is that employees are not well trained in using data sanitization techniques. That is why they were able to find sensitive data from a California electronics manufacturer, a Chicago bank's ATM and a supermarket credit card processing terminal.
Verizon Communications Inc. is a telecommunication company from the United States of America. The company has released a series of data breach investigation reports since 2008 (Xu et al., 2014). According to Verizon Communications Inc., a data breach is an incident that results in the exposure and disclosure of data to an unauthorized person (Verizon, 2015). Based on its 2013 investigation report, 62% of data breach incidents take months, and sometimes years, to be discovered and exposed. On top of that, 70% of the people who have been exposed to the data are not the data owners (Xu et al., 2014).
To prevent such incidents, data sanitization is an effective method: it overwrites the data, since data cannot be eliminated just by deleting it. A device that has been sanitized should contain no usable residual data, and even advanced forensic tools should not be able to recover the sanitized data. It is therefore crucial to find techniques that prevent the recovery of data, to make sure that one's privacy is safe. Several file sanitization techniques can be studied to find the best one based on the file entropy and on how well the technique wipes data.
1.3 Objectives
There are three objectives of this research:
i. To study two existing file sanitization techniques, Write Zero and Write Zero Randomly.
ii. To propose and develop an enhanced file sanitization technique, Write Zero Alternately.
iii. To test the proposed technique, Write Zero Alternately, and compare it with the two existing techniques, Write Zero and Write Zero Randomly, in terms of average file entropy values.
1.4 Research Limitation
This research focuses on three file sanitization techniques that overwrite with zero and are used to wipe files on a USB flash drive. The three techniques are Write Zero, Write Zero Randomly and Write Zero Alternately. Each technique overwrites in a different way: Write Zero overwrites all data with zero, Write Zero Randomly overwrites data at random positions within the file with zero, and Write Zero Alternately overwrites alternate data within the file with zero.
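The three overwrite patterns can be sketched on an in-memory byte buffer. This is only an illustration, written in Python for brevity rather than in C as the prototypes in this research are. The function names, the use of the byte value 0x00 as "zero", the choice of even offsets for the alternate pattern, and the `fraction` and `seed` parameters of the random pattern are all assumptions made for the example.

```python
import random

ZERO = 0x00  # assumption: "zero" is taken to mean the byte value 0x00


def write_zero(data: bytes) -> bytes:
    """Overwrite every byte with zero."""
    return bytes(len(data))


def write_zero_alternately(data: bytes) -> bytes:
    """Overwrite every other byte (even offsets, an assumption) with zero."""
    out = bytearray(data)
    for i in range(0, len(out), 2):
        out[i] = ZERO
    return bytes(out)


def write_zero_randomly(data: bytes, fraction: float = 0.5, seed=None) -> bytes:
    """Overwrite a randomly chosen share of byte positions with zero.

    `fraction` and `seed` are hypothetical parameters for illustration only.
    """
    out = bytearray(data)
    rng = random.Random(seed)
    count = int(len(out) * fraction)
    # Pick `count` distinct positions and zero them.
    for i in rng.sample(range(len(out)), count):
        out[i] = ZERO
    return bytes(out)


sample = b"confidential data"
print(write_zero(sample))
print(write_zero_alternately(sample))
print(write_zero_randomly(sample, fraction=0.5, seed=1))
```

In this sketch only Write Zero removes every original byte; the other two patterns leave part of the original content in place.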
In this study, the application for the proposed technique is developed using the C programming language, whereas the Write Zero technique is applied using existing software, WipeFile. According to Lifewire.com, a website that reviews file sanitization software, no software was available for Write Zero Randomly at the time this research was undertaken (Fisher, 2018). Thus, a prototype for that technique is also developed in C.
The file types sanitized are text documents (TXT), Microsoft Word documents (DOCX) and images (JPG). After sanitization, FTK Imager is used to recover the files, and the file entropy is then measured using CrypTool. The results are compared to see the randomness produced by each technique for text, document and image files. The best technique is chosen based on the file entropy: the file with the lowest entropy was sanitized by the best method for overwriting data with zero.
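The selection rule described here (average the per-file entropy for each technique, then prefer the lowest average) can be sketched as follows. The entropy values are made-up placeholders rather than results from this research, and the function name and technique labels are hypothetical.

```python
from statistics import mean


def best_technique(entropy_by_technique: dict) -> tuple:
    """Return (technique, average entropy) for the lowest average entropy."""
    averages = {name: mean(values) for name, values in entropy_by_technique.items()}
    best = min(averages, key=averages.get)
    return best, averages[best]


# Placeholder per-file entropy values (illustrative only, not measured data).
measurements = {
    "Technique A": [0.0, 0.0, 0.0],
    "Technique B": [4.0, 5.0, 6.0],
    "Technique C": [5.0, 5.5, 6.0],
}
print(best_technique(measurements))
```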
1.5 Significance of Research
File sanitization techniques are most useful when a user wants to delete data permanently. The main advantage of this research is that the techniques give a secure way of removing data. After the data is sanitized, other people cannot recover it even if they use a data recovery tool. The new file sanitization technique, Write Zero Alternately, is an enhancement of the existing techniques, Write Zero and Write Zero Randomly.
File sanitization can be categorized as an anti-forensic technique. Write Zero Alternately is an improved way of sanitizing files compared to Write Zero Randomly because more of the data in the file is replaced with zero. Hence, the technique also improves on anti-forensic practice, as more data is replaced and the data is much harder to recover. When data is hard to recover, unauthorized users are prevented from misusing it. This sort of cybercrime can therefore be controlled when data is permanently sanitized using a file sanitization technique.
Besides that, file sanitization is very important for organisations that need to keep sensitive information confidential. For organisations, protection of information is paramount. Effective sanitization techniques are a critical part of making sure that an organisation's sensitive information is safely protected against unauthorized people (Kissel et al., 2014). Therefore, better file sanitization techniques provide better confidentiality for organisations. In this research, Write Zero Alternately is more secure than Write Zero Randomly in terms of data privacy, as more data is replaced with zero.
1.6 Report Organization
This chapter gives an overview of file sanitization and the motivation for this research. The objectives are listed to state the purpose of the research, and the research limitations narrow down the scope of the comparison of file wiping techniques. The significance of the research is also explained. Chapter 2 reviews the literature on file sanitization methods and gives a comparative analysis of the three chosen methods. Chapter 3 explains the research process followed throughout this research and the research framework, which clarifies how files are sanitized.
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
This literature review covers the research and analysis needed to compare different methods of sanitizing files. It first explains anti-forensics and file sanitization. Then, existing file sanitization methods are explained thoroughly and compared to identify the similarities and differences between them. The chapter aims to give a better understanding of file sanitization and of the existing techniques that can be used to sanitize files.
2.2 Anti-Forensic
The rapid growth of technology means that computers and other gadgets can become weapons when they are used with malicious intent, such as stealing top secret information. Anti-forensics is a collection of tricks and techniques used with the intention of hindering forensic investigation (Jain & Chhabra, 2014). It is also defined as any effort or action intended to compromise the availability of evidence to the forensic process (Stüttgen & Cohen, 2013).
Anti-forensic techniques are close to how data sanitization works. Some techniques are applied to storage devices so that data cannot be recovered (Poonia, 2014). Anti-forensics consists of steps taken to frustrate or avoid forensic investigation; its main purpose is to prevent evidence of a crime from being discovered. In addition, open source tools and software designed to apply anti-forensic techniques are readily available (Jain & Chhabra, 2014). Anti-forensics has four ultimate goals: first, to avoid detection that some event has taken place; second, to disrupt the collection of information; third, to increase the amount of time an examiner needs to spend on a case; and lastly, to cast doubt on a forensic report or testimony (Garfinkel, 2017).
Speaking at DEF CON, the world's longest-running and largest underground hacking conference, Michael Perklin named file sanitization as one of three popular techniques; the other two are encryption and physical destruction. All three are categorized as classic anti-forensic techniques (Perklin, 2012). File sanitization is also categorized as artifact sanitization, a category from the original taxonomy of anti-forensics. The other members of this category are disk degaussing, disk sanitization, generic data sanitization, log sanitization, metadata sanitization and registry sanitization (Conlan et al., 2016).
Data sanitization destroys the data stored in memory. Tools such as Eraser and BC Wipe have been developed to sanitize files; they destroy files by repeatedly overwriting the data in them. Artifact sanitization is efficient and requires little time. However, sanitization has limitations. Some sanitization tasks are quite difficult, such as deleting a file that is contained in the master file table. As a result, some tools are not able to overwrite a file completely, leaving residual data and traces of the file (Jain & Chhabra, 2014). Both researchers and practitioners need to truly understand the effect of anti-forensics.
2.3 Entropy
Entropy can be described as a measure of randomness. The concept originates with Claude E. Shannon, who applied it to digital communications. Shannon was interested in finding the theoretical maximum amount by which a digital file can be compressed (Hartman, 2013). Entropy measures the uncertainty of a set of probabilities and the compressibility of information.
Similarly, file entropy describes the data in a selected file: it measures the amount of information present in the file. For example, to obtain the entropy of a selected file, one simply applies the entropy calculation to the file's contents. Shannon's entropy is given in equation (1):
$$H = -\sum_{i=1}^{n} P_i \log_2 P_i \tag{1}$$
In equation (1), $i$ indexes the $n$ possible distinct symbols and $P_i$ is the probability of occurrence of the $i$-th symbol. For a given set of symbols, what matters is the number of distinct symbols and how often each occurs: the fewer the distinct symbols and the more often each occurs, the lower the entropy of the set (Medsger & Srinivasan, 2012). When a file is compressed, various patterns of bits are replaced with shorter patterns. Consequently, the higher the entropy of the data in a file, the less the file can be compressed (Hartman, 2013).
For byte-valued data, Shannon's formula produces a value between zero (0) and eight (8). The closer the entropy is to zero, the more orderly and non-random the data is; the closer it is to eight, the more random the data is (Lance, 2013). For this experiment, the technique with the lowest entropy, that is, the value closest to zero, is the best technique for file sanitization using zero. The ForensicKB website by Lance also gives examples and explanations of file entropy. The simplest example is a file that has been filled with the value zero, which has an entropy of 0 because only one symbol occurs.
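As an illustration of equation (1), the entropy of a file can be computed by treating each byte value as a symbol. The sketch below is written in Python and is not the CrypTool calculation used in this research; the function name is an assumption made for the example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte sequence, in bits per byte (equation 1)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        p = count / total             # P_i: relative frequency of one byte value
        entropy -= p * math.log2(p)   # accumulate -P_i * log2(P_i)
    return entropy


print(shannon_entropy(bytes(1024)))        # buffer filled with zero
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

A zero-filled buffer gives 0.0, the minimum, while a buffer in which all 256 byte values are equally frequent gives 8.0, the maximum for byte-valued data.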
REFERENCES
Alexander, P. (2018). Choosing the Best Data Storage Solution. Retrieved from https://www.entrepreneur.com/article/172226
Bahl, V., Leong, D., Jiayan, G., Siang, J., & Mei, T. (2012). Secure Data Shredder. Proceedings of the Global Engineering, Science and Technology Conference. Bangladesh. pp. 28-29.
Chan, C. (2017). Putting All Your Data in One Smartphone Basket. Retrieved from https://www.wired.com/story/putting-all-your-data-in-one-smartphone-basket/
Conlan, K., Baggili, I., & Breitinger, F. (2016). Anti-forensics: Furthering digital forensic science through a new extended, granular taxonomy. Digital Investigation, 18, pp. 66–75.
Cretu, G. F., Stavrou, A., Stolfo, S. J., & Keromytis, A. D. (2007). Data Sanitization: Improving the Forensic Utility of Anomaly Detection Systems. Workshop on Hot Topics in System Dependability (HotDep). Edinburgh, UK. pp. 1-6.
Cui, W., Liu, S., Wu, Z., & Wei, H. (2014). How Hierarchical Topics Evolve in Large Text Corpora. IEEE Transactions on Visualization and Computer Graphics, 20(12), pp. 2281–2290.
Diesburg, S., Feldhaus, C. A., Fardan, M. Al, Schlicht, J., & Ploof, N. (2015). Is Your Data Gone? Comparing Perceived Effectiveness of Thumb Drive Deletion Methods to Actual Effectiveness. Cryptography and Security. arXiv preprint arXiv:1512.08986.
Dunn, J. E. (2015). Best disk wiping tools – how to securely clean hard drives, smartphones and SSDs. Retrieved from https://www.techworld.com/security/best-disk-wiping-tools-securely-cleaning-hard-drives-smartphones-ssds-3627310/
Fisher, T. (2016). What is the Write Zero Method? Retrieved from https://www.lifewire.com/what-is-the-write-zero-method-2626052
Fisher, T. (2017). Data Sanitization Methods. Retrieved from https://www.lifewire.com/data-sanitization-methods-2626133
Fisher, T. (2018). Retrieved from https://www.lifewire.com/dban-dariks-boot-and-nuke-review-2619130
Fisher, T. (2018). 35 Free File Shredder Software Programs. Retrieved from https://www.lifewire.com/free-file-shredder-software-programs-2619149
Garfinkel, S. L. (2014). The prevalence of encoded digital trace evidence in the nonfile space of computer media. Journal of Forensic Sciences, 59(5), pp. 1386–1393.
Garfinkel, S. L., & A. S. (2017). Discarded Hard Drives Can Be Dangerous. Retrieved from http://www.computerweekly.com/feature/Discarded-hard-drives-can-be-dangerous
Greg. (2013). Using dd To Repeatedly Erase A Specific Range Of Sectors On The Hard Disk. Retrieved from https://zedt.eu/tech/linux/using-dd-to-repeatedly-erase-a-specific-range-of-sectors-on-the-hard-disk/
Hartman, K. G. (2013). Calculate File Entropy. Retrieved from https://www.kennethghartman.com/calculate-file-entropy/
Hughes, G., & Coughlin, T. (2014). Tutorial on Disk Drive Data Sanitization. NIST Special Publication, pp. 1–15.
Hoffman, C. (2016). You Only Need to Wipe a Disk Once to Securely Erase It. Retrieved from https://www.howtogeek.com/115573/htg-explains-why-you-only-have-to-wipe-a-disk-once-to-erase-it/
Jain, A., & Chhabra, G. S. (2014). Anti-forensics techniques: An analytical review. 2014 7th International Conference on Contemporary Computing (IC3). Noida. pp. 412–418.
Kissel, R., Regenscheid, A., Scholl, M., & Stine, K. (2014). Guidelines for Media Sanitization. US Department of Commerce : National Institute of Standards and Technology.
Lance. (2013). File Entropy Explained. Retrieved from http://www.forensickb.com/2013/03/file-entropy-explained.html
Medsger, J., Srinivasan, A., & Wu, J. (2015). Information Theoretic and Statistical Drive Sanitization Models. Journal of Information Privacy and Security, 11(2), pp. 97–117.
National Science Foundation. (2017). Producing The Digital Body (Home). Retrieved from https://digitalcorpora.org/
Perklin, M. (2012). Anti-Forensics and Anti-Anti-Forensics. Talk at DEF CON, 20. Las Vegas.
Poonia, A. S. (2014). Data Wiping and Anti Forensic Techniques. Compusoft, 3(12), pp. 1374–1376.
Postgraduate, N. (2007). Anti-Forensics: Techniques, Detection and Countermeasures. In 2nd International Conference on i-Warfare and Security. Monterey, California. pp. 77-84.
Savoldi, A., Piccinelli, M., & Gubian, P. (2012). A statistical method for detecting on-disk wiped areas. Digital Investigation, 8(3–4), pp. 194–214.
Singh, B., Saharan, R., Somani, G., & Gupta, G. (2016). Secure File Deletion For Solid State Drives. In Peterson, G., & Shenoi, S. (Eds.), Advances in Digital Forensics XII. USA: Springer. pp. 345–346.
Singh, V., Kesharwani, L., Saran, V., Gupta, A. K., & Lal, E. P. (2015). Efficacy of open source tools for recovery of unconventionally deleted data for forensic consideration. International Journal of Social Relevance & Concern (IJSRC), 3(9), pp. 53-59.
Srinivasan, A. (2012). ERASE - EntRopy-based SAnitization of SEnsitive Data for Privacy Preservation. The 7th International Conference for Internet Technology and Secured Transactions. George Mason University. pp. 427–432.
Stüttgen, J., & Cohen, M. (2013). Anti-forensic resilient memory acquisition. Digital Investigation, 10(13), pp. 105–115.
VandenBrink, R. (2016). Using File Entropy to Identify "Ransomwared" Files. Retrieved from https://isc.sans.edu/forums/diary/Using+File+Entropy+to+Identify+Ransomwared+Files/21351/
Verizon. (2015). 2015 Data Breach Investigations Report. Verizon RISK Team, 51(2), pp. 39–54.
Xu, L., Jiang, C., Wang, J., Yuan, J., & Ren, Y. (2014). Information security in big data: Privacy and data mining. IEEE Access, 2, pp. 1151–1178.
Zhang, P., Niu, S., Huang, Z., & Qin, X. (2017). Adaptive Data Wiping Scheme with Adjustable Parameters for Ext4 File System. Chinese Journal of Electronics,
Zoubek, C., & Sack, K. (2017). Selective deletion of non-relevant data. Digital Investigation, 20, pp. 92–98.