Lab III: Unix File Recovery – Data Unit Level


New Mexico Tech Digital Forensics Fall 2006


Objectives

- Review of unallocated space and extracting with dls

- Interpret the file system information from the superblock

- Locate files by block number

- Recover files from unallocated blocks

- Understand contiguous and noncontiguous files

- Using the Autopsy Forensic Browser

Procedures

Extracting Unallocated Space

Step 1

These first steps will be a review of the process you performed in Lab II. Launch your “Linux – Forensics” virtual machine.

An image of a disk partition with deleted files to be recovered is provided on /dev/hdb1. The image contains a Linux ext2 file system. The file image.sha1.txt contains the hash value of the image. Another file, fileinfo.txt, contains the file names, sizes in bytes, and hashes of the deleted files you will recover.

Mount /dev/hdb1 to /mnt/recover.

The image file is located in /mnt/recover/lab3

Question 1: Extract unallocated space to the file image.unalloc.dls and explain your process and the commands you used. Do not forget that your job as a forensic investigator is to ensure the integrity of the data.

Begin by verifying the hash of image.dd matches the hash found in image.sha1.txt. Extract unallocated space using dls.

# dls -f linux-ext2 /mnt/recover/lab3/image.dd > /mnt/recover/lab3/image.unalloc.dls

Hash your image of unallocated space and add it to the hash file.

# sha1sum /mnt/recover/lab3/image.unalloc.dls >> /mnt/recover/lab3/image.sha1.txt
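The integrity check can be sketched as follows. This is a minimal example, assuming image.sha1.txt uses the "hash, two spaces, file name" layout that sha1sum itself writes (the lab does not show the file's format). It works on a small stand-in file in a scratch directory, so the paths here are hypothetical rather than the lab's.

```shell
# Work in a scratch directory with a stand-in for image.dd (hypothetical data).
workdir=$(mktemp -d)
printf 'stand-in image contents\n' > "$workdir/image.dd"

# Record the reference hash the way sha1sum writes it: "<hash>  <filename>".
( cd "$workdir" && sha1sum image.dd > image.sha1.txt )

# sha1sum -c re-hashes every file listed and exits non-zero on any mismatch.
if ( cd "$workdir" && sha1sum -c image.sha1.txt > /dev/null 2>&1 ); then
    verify_status="match"
else
    verify_status="mismatch"
fi
echo "image.dd: $verify_status"
```

Checking with sha1sum -c avoids comparing 40-character strings by eye, which is where mistakes tend to creep in.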

Extracting Plaintext from Unallocated Space

Step 2

The first four files you are going to recover are plaintext files containing repeating strings that match the file's name. For example, file01 contains multiple strings of “file01”. You will need to find occurrences of the strings “file01”, “file02”, “file03” and “file04”.

Question 2: Explain the process and commands you use to extract plaintext from the unallocated image files to the file image.unalloc.str. How will you use grep to locate the files you are looking for?

Recall the strings tool you used in Lab 2.

# strings -a -t d /mnt/recover/lab3/image.unalloc.dls > /mnt/recover/lab3/image.unalloc.str

Create a searchlist.txt file and enter the strings “file01”, “file02”, “file03” and “file04”.

# grep -f /mnt/recover/lab3/searchlist.txt /mnt/recover/lab3/image.unalloc.str > /mnt/recover/lab3/results.grep
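As a cross-check on the strings and grep approach, GNU grep can report byte offsets directly from a binary file. The sketch below is an alternative technique, not the lab's method, and it runs against a tiny stand-in file (hypothetical data) rather than image.unalloc.dls.

```shell
# Build a tiny binary stand-in for image.unalloc.dls (hypothetical data):
# three non-text bytes followed by a text string.
sample=$(mktemp)
printf '\000\001\002file01 file01\n' > "$sample"

# -a: treat binary input as text, -b: print byte offsets, -o: print only the match.
# With -o, GNU grep reports the offset of each match itself.
first_match=$(grep -a -b -o 'file01' "$sample" | head -n 1)
echo "$first_match"

# Keep only the number in front of the colon.
first_offset=${first_match%%:*}
```

Note that grep reports where the match begins, while strings -t d reports where the printable run containing it begins, so the two numbers can differ when the search term is not at the start of the string.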

Locating Files by Block Numbers

Step 3

Recall from Lab 2 that the -t d option for the tool strings included the byte offset in the output. The byte offset is printed before the string.

Analyze your results.grep file after running grep and use it to find the byte offset of the first occurrence of the string “file01”.

It is important to realize that this byte offset is associated with image.unalloc.dls (you ran strings against this file) and not the original image.dd file.
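Reading the first offset out of results.grep can also be scripted. The sketch below assumes the usual strings -t d layout (a decimal offset, a space, then the string). The sample lines are made up for illustration; only the offsets 12288 and 13312 come from this lab.

```shell
# Stand-in for results.grep (hypothetical contents in strings -t d layout).
results=$(mktemp)
cat > "$results" <<'EOF'
  12288 file01 file01 file01
  12309 file01 file01 file01
  13312 file02 file02 file02
EOF

# Print the offset column of the first line that mentions file01, then stop.
first_offset=$(awk '/file01/ { print $1; exit }' "$results")
echo "first file01 offset in the .dls file: $first_offset"
```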


You are now going to learn new tools for converting the offset in unallocated space to the offset on the actual disk or image. The sleuthkit tool dcalc is used to convert between units from a .dls file to the units in the complete image. A unit is equivalent to the block or cluster size of the file system. First, you have to convert the byte offset found in the strings output to a block offset. Block sizes are determined by the file system and vary depending on partition size.

To get information on the file system use the sleuthkit tool fsstat.

# fsstat -f linux-ext2 /mnt/recover/lab3/image.dd | less

Look for the number by 'Block Size:'. This number is in bytes.

Question 3: What is the block size for this image? What is the byte offset for the location of the string “file01” in the .dls file? How do you convert this byte offset to a block offset in the .dls file and what is that value?

The block size is 1024 bytes. The first occurrence of the string “file01” is found at byte offset 12288 in the .dls file. Divide this number by the block size 1024 to convert this to a block offset in the .dls file. You should get block number 12.
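The division above can be done with shell arithmetic. This is a minimal sketch using the values from this lab; the variable names are illustrative.

```shell
block_size=1024      # from the 'Block Size:' line of fsstat
byte_offset=12288    # first "file01" hit in results.grep

dls_block=$(( byte_offset / block_size ))     # integer division rounds down
within_block=$(( byte_offset % block_size ))  # 0 means the hit starts the block
echo "dls block $dls_block, offset within block $within_block"
```

The remainder is worth a glance: a non-zero value means the string starts partway into the block, so the block may begin with other data.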

Once you have the block offset for the .dls file you can use dcalc to convert this to a block offset in the original image file. The -u option specifies that you are converting from a .dls block number to the actual block number.

# dcalc -f linux-ext2 -u offset /mnt/recover/lab3/image.dd

NOTE: offset is the block offset value you found in Question 3

The sleuthkit tool dstat can be useful to add a little assurance that you calculated the right block number.

# dstat -f linux-ext2 /mnt/recover/lab3/image.dd offset

NOTE: offset is the block offset from the original image file you found using dcalc.

The output should confirm that the block is unallocated. If it says it is allocated you know for sure you miscalculated.


Extracting Data Blocks

Step 4

The sleuthkit tool dcat is used to extract data from a specified block number. You are going to use this tool to recover the actual file.

To make life easier, the results of running ls -l on the drive before the files were deleted are available to you.

View the fileinfo.txt file and find the filesize of file01.

Notice it is the exact size of a single block on this file system (not very likely to happen normally). dcat extracts data one block at a time so with dcat you should easily be able to recover file01 in its exact form.

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd block > /mnt/recover/lab3/file01

NOTE: block is the new value you found from running dcalc.

Verify the file size for file01 is correct.

# ls -l /mnt/recover/lab3/file*

Take the hash of the file you just recovered and compare it with the hash in fileinfo.txt. They should match!

Partial Block File Sizes

Step 5

Generally, files will not be the exact size of a block or multiples of blocks. The end of the file, most likely, will fall somewhere in the middle of a block.

Question 4: Using the methods you learned above, determine the location of file02 in the image.dd file. How did you do this and what block number did you find?

Analyzing the results.grep file you find that the first occurrence of the string “file02” is at byte offset 13312. Divide this by the block size, 1024, to get a block offset of 13 in the .dls file. Use dcalc to convert this to the block number in image.dd:


# dcalc -f linux-ext2 -u 13 /mnt/recover/lab3/image.dd

File02 is found at block number 271 in the image.dd file.

Question 5: Use dcat to extract the block you found in Question 4. You may have recovered the data in file02, but is this an exact copy of the original file02? What has happened?

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 271 > /mnt/recover/lab3/file02

dcat extracts data in units determined by the block size. Extracting block 271 gives you a file 1024 bytes long, but in the fileinfo.txt file you should see that the original file02 was only 950 bytes in size. The hash of the recovered file will not match the original hash.

Use less to view your file02 file.

You will notice garbage characters at the end that don't actually belong in the original file. You will next use a plaintext editor or a hex editor to open file02 and delete the data that does not belong in the file.

khexedit is a great tool for viewing the file in hex alongside its ASCII representation. A shortcut to khexedit is located on your toolbar. It makes deleting the extra characters easy because the leftmost column lists the byte offset.

Make sure in the 'View' menu that 'Offset in Decimal' is selected. Place your cursor at byte offset 950 in the hex window and delete the remaining characters. The file size of the recovered file and the original should match. Once they do, be sure to compare the hashes.
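The same trim can be done from the command line. The sketch below is an alternative to the hex editor, not the lab's prescribed method. It uses a generated stand-in block (hypothetical data) and also saves the trailing bytes to a separate file so they can be examined rather than discarded.

```shell
file_size=950        # size of file02 listed in fileinfo.txt
block_size=1024

# Stand-in for the dcat output (hypothetical data): 950 bytes of content
# followed by 74 bytes of leftover data.
work=$(mktemp -d)
{ head -c "$file_size" /dev/zero | tr '\0' 'A'
  head -c $(( block_size - file_size )) /dev/zero | tr '\0' 'Z'
} > "$work/file02.block"

# Keep the first 950 bytes as the file, and save the rest for review.
head -c "$file_size" "$work/file02.block" > "$work/file02"
tail -c +$(( file_size + 1 )) "$work/file02.block" > "$work/file02.slack"

trimmed_size=$(wc -c < "$work/file02")
slack_size=$(wc -c < "$work/file02.slack")
echo "file: $trimmed_size bytes, slack: $slack_size bytes"
```

Because the original block is left untouched, the trim can be repeated if the size in fileinfo.txt was misread.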

Question 6: You may not want to delete this extra data without giving it some attention first. What may be useful about this data?

You should recognize the extra data as being the slack of the file you are recovering. Recall from Lab 2 that slack space sometimes contains useful evidence.

Recovering Files in Contiguous Blocks

Step 6

You have recovered two files that were contained in a single block. Larger files will obviously be spread across multiple blocks.

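Question 7: Using the methods you learned above, determine the location of file03 in the image.dd file. How did you do this and what block number did you find?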

Analyzing the results.grep file you find that the first occurrence of the string “file03” is at byte offset 14336. Divide this by the block size, 1024, to get a block offset of 14 in the .dls file. Use dcalc to convert this to the block number in image.dd:

# dcalc -f linux-ext2 -u 14 /mnt/recover/lab3/image.dd

File03 is found at block number 272 in the image.dd file.

Question 8: Notice the filesize of file03 in fileinfo.txt. What can you determine about this file that will be useful for recovering this file?

File03 is 2000 bytes long, which is larger than the block size. This file will occupy two blocks, and both blocks will need to be extracted.
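The block count is the file size divided by the block size, rounded up. A minimal sketch, where blocks_needed is a hypothetical helper name and the sizes are the ones given in this lab:

```shell
block_size=1024

# Round up: a partly used final block still has to be extracted.
blocks_needed() {
    echo $(( ($1 + block_size - 1) / block_size ))
}

file03_blocks=$(blocks_needed 2000)
echo "file03 (2000 bytes) needs $file03_blocks blocks"
echo "file02 (950 bytes) needs $(blocks_needed 950) block"
```

Adding block_size - 1 before dividing is the usual integer trick for rounding up without floating point.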

dcat allows you to specify the number of data units (blocks) to extract. Extracting multiple blocks will be done in consecutive order. Therefore, to recover an entire file in this manner, that file must exist on disk in contiguous blocks.

Use dcat to extract file03.

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd unit_addr num > /mnt/recover/lab3/file03

NOTE: unit_addr is the block number you found and num is the number of blocks you determined the file uses.

Don't forget to modify the file so that the filesize is correct. Compare the hashes.

Recovering Files in Noncontiguous Blocks

Step 7

Files will not always be on disk in contiguous blocks. Notice in your results.grep file that “file04” is found at byte offset 16384 and again at offset 532480. Since the offsets are so far apart, it is a good indication that the file is fragmented.

Question 9: How many blocks will file04 use? Use the results.grep file to determine how the file is fragmented. What blocks do you need to extract?

File04 is 3030 bytes long according to the fileinfo.txt file. Therefore, this file will be located on three different blocks. By looking at the results.grep file it looks like there is one block worth of data starting at byte offset 16384 and two blocks worth of data starting at byte offset 532480. Using dcalc you should find that file04 is located on blocks 274, 778 and 779.

You are going to use dcat to extract file04 in parts.

Question 10: How will you use dcat to extract all the blocks for file04? Name each part file04-p1, file04-p2, etc. (Hint: you should only have to use dcat two times to extract all the data blocks).

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 274 > /mnt/recover/lab3/file04-p1

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 778 2 > /mnt/recover/lab3/file04-p2

Once you have extracted all the blocks associated with file04 you are ready to piece them back together to form the entire file. The easiest method is using the linux cat command.

# cat /mnt/recover/lab3/file04-p* > /mnt/recover/lab3/file04

Modify the file to fit the original filesize and compare hashes.
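Joining and trimming can be combined in one pipeline. This is a sketch under stated assumptions: the part files are generated stand-ins (hypothetical data) with the same sizes as the two dcat outputs, and head -c replaces the manual edit.

```shell
block_size=1024
file_size=3030       # file04 size from fileinfo.txt

# Stand-ins for the two dcat outputs (hypothetical data): one block, then two.
work=$(mktemp -d)
head -c "$block_size" /dev/zero > "$work/file04-p1"
head -c $(( 2 * block_size )) /dev/zero > "$work/file04-p2"

# List the parts explicitly so the order does not depend on glob sorting,
# then cut the result back to the recorded file size.
cat "$work/file04-p1" "$work/file04-p2" | head -c "$file_size" > "$work/file04"

joined_size=$(wc -c < "$work/file04")
echo "file04: $joined_size bytes"
```

Naming the parts explicitly matters once a file has ten or more pieces, because a glob such as file-p* sorts p10 before p2.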

Extra Credit: dd can be used for more than just imaging an entire disk. You can also use it to recover files. Show how you would use dd to recover file03. Construct your command so that you do not need to make any modifications to the file to make the filesize match.

Use dd with a block size of one byte (bs=1) so that skip and count are both measured in bytes. You will need to skip to the byte offset at which file03 is located. Recall file03 is at block 272, so multiply this by 1024 to get the byte offset.

# dd if=/mnt/recover/lab3/image.dd of=/mnt/recover/lab3/file03.dd bs=1 skip=278528 count=2000
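The skip value can be derived rather than typed by hand. The sketch below computes it from the lab's numbers, then exercises the same dd pattern on a small synthetic image (hypothetical data, with the file assumed to start at block 1) so it can be run without image.dd.

```shell
block_size=1024
start_block=272      # block found with dcalc for file03
file_size=2000       # file03 size from fileinfo.txt

skip_bytes=$(( start_block * block_size ))
echo "bs=1 skip=$skip_bytes count=$file_size"

# Try the same pattern on a small synthetic image (hypothetical data):
# four blocks of zeros, with the "file" assumed to start at block 1.
work=$(mktemp -d)
head -c $(( 4 * block_size )) /dev/zero > "$work/demo.dd"
dd if="$work/demo.dd" of="$work/demo.out" bs=1 \
   skip=$(( 1 * block_size )) count="$file_size" 2> /dev/null

recovered_size=$(wc -c < "$work/demo.out")
echo "recovered $recovered_size bytes"
```

A one-byte block size is slow on large extractions, but for a file of a few kilobytes the simplicity of byte-accurate skip and count outweighs the cost.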

Recovery of a Non-plaintext File

Step 8

The last file that exists on the image is a non-plaintext file. Parts of the file may be in plaintext, but it also contains binary data. It is very difficult, at the data unit layer, to recover these types of deleted files if they are noncontiguous on disk. Look at fileinfo.txt to view the filesize of file05. Do the math and you see that this file must exist across 22 blocks.

File05 is a document that discusses a new type of Microsoft keyboard. Use grep to search for occurrences of “keyboard.”

# grep 'keyboard' /mnt/recover/lab3/image.unalloc.str

The searchword “keyboard” is found at block 883 using dcalc. The entire text of the document can be viewed by looking at 3 blocks.

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 883 3 | less

However, you know from the filesize that this is not the entire document. The large filesize compared with the small amount of plaintext is a good indication the original file contains binary data. Also, if you didn't know the filesize, the fact that the plaintext starts in the middle of a block is also a good indication of preceding binary data. This file is noncontiguous on disk so it is very difficult to discover the blocks containing binary data associated with this file. A searchlist would be of little use.

Here is a list of blocks associated with file05:

Direct Blocks: 881 882 883 884 885 886 887 888 11205 11206 11207 11208 11210 101594 101595 101596 101597 101598 101599 101600 101601 101602

Indirect Blocks: 11209

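Recall that an ext2 inode holds 12 direct block pointers. The thirteenth pointer (to block 11209 in this case) refers to an indirect block, which holds the addresses of the remaining data blocks rather than file data, so it is not extracted as part of the file.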
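The split between directly and indirectly addressed blocks can be worked out from the file size. This is a minimal sketch using the 22110-byte size at which the lab trims file05, and relying on ext2 storing block addresses as 32-bit values; the variable names are illustrative.

```shell
block_size=1024
file_size=22110      # size at which the lab trims file05
direct_slots=12      # direct block pointers in an ext2 inode
pointer_size=4       # ext2 block addresses are 32 bits

data_blocks=$(( (file_size + block_size - 1) / block_size ))
pointers_per_block=$(( block_size / pointer_size ))

if [ "$data_blocks" -gt "$direct_slots" ]; then
    via_indirect=$(( data_blocks - direct_slots ))
else
    via_indirect=0
fi

echo "data blocks:          $data_blocks"
echo "reached directly:     $(( data_blocks - via_indirect ))"
echo "reached via indirect: $via_indirect (one indirect block holds up to $pointers_per_block addresses)"
```

This agrees with the block list: 881 through 888 and 11205 through 11208 fill the 12 direct slots, block 11209 is the indirect block, and the remaining 10 data blocks (11210 and 101594 through 101602) are addressed through it.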

Question 11: List the commands you would take to recover file05.

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 881 8 > /mnt/recover/lab3/file05-p1

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 11205 4 > /mnt/recover/lab3/file05-p2


# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 11210 > /mnt/recover/lab3/file05-p3

# dcat -f linux-ext2 /mnt/recover/lab3/image.dd 101594 9 > /mnt/recover/lab3/file05-p4

# cat /mnt/recover/lab3/file05-p* > /mnt/recover/lab3/file05

Use khexedit to delete all characters at and after byte offset 22110.

# cat /mnt/recover/lab3/fileinfo.txt

# sha1sum /mnt/recover/lab3/file05

The sleuthkit tool file can be used to determine the type of a file.

# file /mnt/recover/lab3/file05

Question 12: What type of file is file05? What type of useful information can be found in the other 19 blocks?

File05 is a Microsoft Word Document. By looking at the other 19 blocks of this file you can determine what application and version number was used to create the file (MS Word v. 10 in this case), the owner's username (pmwatso) and so on.

Using the Autopsy Forensic Browser

Step 9

Autopsy is a web front end, written in perl, for the sleuthkit set of tools. Autopsy should be started from the command line using the -d option which specifies the working directory for autopsy. All log and output files from autopsy will be stored in this specified directory.

Make an autopsy working directory

# mkdir /mnt/recover/lab3/autopsy

Start autopsy

# autopsy -d /mnt/recover/lab3/autopsy

The web address autopsy prints to the screen has been bookmarked in your browser.

From your toolbar, launch the mozilla web browser. From the links bar start autopsy.

The first part of setting up a case in autopsy may seem tedious, but can be important if working on a real case.

Click 'New Case'. Name the case Lab3. Use your name for an investigator and click 'New Case' and then 'ok'. When you see the new case listed in the “Case Gallery” click the “OK” button.

Next click 'Add Host'. The hostname is to keep record of what computer is used for doing the investigation. Enter 'vmware-forensics' in the host name field. Enter MST for the Timezone. Click 'Add Host'.

Click 'Ok' and then 'Add Image'. The location is used to specify the image file. Enter /mnt/recover/lab3/image.dd. Keep the Import Method at symlink.

Change the file system type to linux-ext2. Mount point should be set to / . Select 'Calculate the hash value for this image' and click 'Add Image'. Note that autopsy calculates an MD5 hash here, so the value will not match the SHA-1 hash in image.sha1.txt.
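The two hash types are easy to tell apart by length. A brief sketch on a stand-in file (hypothetical data), showing why an MD5 value can never be compared against a SHA-1 value:

```shell
# Stand-in for the image file (hypothetical data).
sample=$(mktemp)
printf 'stand-in image contents\n' > "$sample"

# Keep only the hash column from each tool's output.
md5_hex=$(md5sum "$sample" | awk '{ print $1 }')
sha1_hex=$(sha1sum "$sample" | awk '{ print $1 }')

echo "MD5:   $md5_hex (${#md5_hex} hex digits)"
echo "SHA-1: $sha1_hex (${#sha1_hex} hex digits)"
```

To cross-check the value autopsy displays, run md5sum on image.dd yourself rather than comparing against image.sha1.txt.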

Click 'Ok' twice and you receive a toolbar in the top frame of the browser. Click the 'Image Details' button.

Question 13: What command line tool can be used to generate the same output shown?

fsstat

Next, click the 'Keyword Search' button.

Here you can extract unallocated space as if you used dls at the command line. You can also extract strings from the entire image or just the dls file. You should be able to associate each of the buttons on this page with the tools you have already used on the command line.

Extract strings from the image.dd file. Extract unallocated space to image.dls and then extract strings on that file.

Autopsy stores the resulting files in /mnt/recover/lab3/autopsy/lab3/vmware-forensics/output. You can now do a keyword search on either the original image or unallocated image.

On the original image do a search for the string “file03”.

Notice there is a match at blocks 272 and 273. These should be the same results you got on the command line earlier.

Question 14: There was another match for “file03” in allocated space. When viewing the contents of that block, what does it look like? What can you conclude from this about deleted files?

The contents of this block look like a directory listing. Because you know that these files have been deleted, you can conclude that on an ext2 file system, when a file is deleted, evidence of that file remains in the directory that contained it.

Next, click on the 'Data Unit' tab. For Fragment Number enter 272 and specify the number of fragments as two. The default view is in strings format.

Experiment with the different views by clicking the links for ascii and hex. If this were a real case you would want to take careful notes of your findings.

Click the 'Add Note' button. Type a note such as “Found the beginning of the deleted file, file03 here.” Close the window.

Autopsy can be used to recover files as well. You should be back at the window where you are viewing fragments 272-273.

Click 'Export Contents'. Use the save dialog box to create a new directory /mnt/recover/lab3/autopsy/recovered. Save the file as file03 in the new directory.

Question 15: What tool do you think autopsy is the frontend for that exports the contents of a data block? Is the file you exported an exact copy of the original and if not, why?

'Export Contents' is the frontend for dcat in autopsy. Because it is dcat, exported files will have sizes of full blocks. To recover the file in its original state the extra slack must be deleted.

You will be exploring more of autopsy in future labs. You will also be learning how to recover files at the meta data layer which can make it possible to easily recover noncontiguous files.


Step 10

It is very important to do these next steps so that the lab is properly set up for the person who uses the computer after you.

Unmount any drives you mounted and shut down the VMWare system. In VMWare, revert 'Linux - Forensics' back to the snapshot by clicking the 'Revert' button.

From the c:\vmware-images\Linux - Forensics\ directory remove all files beginning with 'Linux - Forensics-Image'.