
3. Analysis of a Single-Processor System

3.5. Simulation results

3.5.1. The cache simulator

To answer these questions, we wrote a program to simulate the behavior of various types of caches, using the trace data to drive the simulation. We used only the data collected at the file manager level. This allowed us to simulate the effect of a cache on the I/O generated by user processes, without including the effects of I/O generated for maintenance of the file system and the naming system. For the measurements below, the four traces produced nearly indistinguishable results; we report only the results from trace A.

The simulated system mimics the data structures found in the UNIX file system. We simulated only the file manager operations, and included a file block cache instead of a disk block cache. This cache consisted of several fixed-size blocks used to hold recently referenced portions of files. We used an LRU algorithm for block replacement in the cache. The simulator keeps a table of currently open and recently closed files, where each entry in the table includes the file identifier, reference count, file size, statistics about how the file was accessed, and a pointer to a doubly-linked list of blocks from the file that reside in the cache.

In the UNIX system, when a file is closed, any of its blocks that may reside in the cache are not automatically flushed out. This results in a significant performance improvement, since many files are opened again shortly after they are closed, and their blocks may still be found in the cache. We wished to preserve this aspect of the UNIX disk block cache in our simulated file block cache.

As the trace is processed, an open record causes an entry in the file table to be allocated. If the file is already in the file table, the associated reference count is incremented. A close record causes the reference count to be decremented. When the reference count reaches zero, the file table entry is placed on a free list. Any blocks of the file that still reside in the cache remain associated with the file table entry. Consequently, when the simulator must allocate a file table entry to respond to an open record, it searches the free list first. If an entry for the file is found, it is reclaimed, and any file blocks that still remain in the cache are also reclaimed.
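The open and close handling described above can be sketched as follows. This is a minimal illustration, not the simulator's actual code: the names `FileEntry` and `FileTable` are hypothetical, and a Python set stands in for the doubly-linked list of cached blocks.

```python
from collections import OrderedDict


class FileEntry:
    """One file table entry (hypothetical layout for illustration)."""

    def __init__(self, file_id):
        self.file_id = file_id
        self.ref_count = 0
        self.blocks = set()  # logical block numbers of this file still cached


class FileTable:
    def __init__(self):
        self.active = {}           # file_id -> entry with ref_count > 0
        self.free = OrderedDict()  # file_id -> entry with ref_count == 0

    def open(self, file_id):
        entry = self.active.get(file_id)
        if entry is None:
            # Search the free list first so that a recently closed file
            # gets its old entry, and its cached blocks, back.
            entry = self.free.pop(file_id, None)
            if entry is None:
                entry = FileEntry(file_id)
            self.active[file_id] = entry
        entry.ref_count += 1
        return entry

    def close(self, file_id):
        entry = self.active[file_id]
        entry.ref_count -= 1
        if entry.ref_count == 0:
            # The entry moves to the free list; its blocks are not flushed.
            del self.active[file_id]
            self.free[file_id] = entry
        return entry
```

Keeping the block set attached to an entry on the free list is what lets a file that is reopened shortly after being closed find its blocks still in the cache.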

For each read or write record, the range of affected bytes is converted to a logical block number or numbers. The simulator checks to see if each affected block is in the cache. If so, the access is satisfied without any disk manager activity, and the block is brought to the head of the linked list that indicates the LRU order of the cache blocks. If not, a block from the cache free list is allocated. If the free list is empty, the block at the tail of the LRU list of the file block cache is freed and allocated to this file. If the cache is simulating a write-back cache, any modifications to the freed block are written back to disk at this time.
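A minimal sketch of this per-record processing is given below. It is an illustration under stated assumptions rather than the original simulator: the 4096-byte `BLOCK_SIZE`, the names `block_range` and `BlockCache`, and the counters `miss_ios` and `writeback_ios` are all hypothetical, an `OrderedDict` replaces the doubly-linked LRU list, and charging one disk write per written block in write-through mode is an assumption.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # assumed block size in bytes, for illustration only


def block_range(offset, length, block_size=BLOCK_SIZE):
    """Logical block numbers touched by `length` bytes starting at `offset`."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)


class BlockCache:
    def __init__(self, capacity, write_back=True):
        self.capacity = capacity      # number of blocks the cache can hold
        self.write_back = write_back
        # (file_id, block_no) -> dirty flag; the last item is the most
        # recently used block, the first item is the LRU victim.
        self.blocks = OrderedDict()
        self.miss_ios = 0             # blocks accessed but not in the cache
        self.writeback_ios = 0        # modified blocks written to disk

    def access(self, file_id, offset, length, is_write=False):
        for block_no in block_range(offset, length):
            key = (file_id, block_no)
            if key in self.blocks:
                self.blocks.move_to_end(key)  # hit: now most recently used
            else:
                if len(self.blocks) >= self.capacity:
                    # No free block: evict the least recently used one.
                    _, dirty = self.blocks.popitem(last=False)
                    if dirty:
                        self.writeback_ios += 1
                self.blocks[key] = False
                self.miss_ios += 1
            if is_write:
                if self.write_back:
                    self.blocks[key] = True   # defer the write until eviction
                else:
                    self.writeback_ios += 1   # assumed write-through charge
```

For example, with a two-block cache, reading bytes 0 to 99 of a file misses once, a second small read of the same block hits, and a write that straddles the boundary between blocks 0 and 1 adds one more miss and leaves both blocks dirty.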

The principal metric for evaluating cache organization was the I/O ratio, which is similar to the miss ratio. The I/O ratio is a direct indicator of the percentage of I/O avoided due to the cache. It expresses the ratio of the number of block I/O operations performed to the number of block I/O operations requested. An I/O operation was charged each time a block was accessed and not in the cache, or when a modified block was written from the cache back to disk. The I/O ratio is different from the miss ratio in that it effectively counts as missed those I/O operations resulting from the write policy, even though those blocks appear in the cache.
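The two ratios can be written side by side to make the difference explicit. The symbols below are notation introduced here for illustration, not the report's own: $N_{\mathrm{req}}$ is the number of block I/O operations requested, $N_{\mathrm{miss}}$ the number of accesses to blocks not in the cache, and $N_{\mathrm{wb}}$ the number of modified blocks written from the cache back to disk.

```latex
\[
\text{miss ratio} = \frac{N_{\mathrm{miss}}}{N_{\mathrm{req}}},
\qquad
\text{I/O ratio} = \frac{N_{\mathrm{miss}} + N_{\mathrm{wb}}}{N_{\mathrm{req}}}
\]
```

As a purely illustrative calculation with made-up numbers, 1000 requested block operations with 150 misses and 50 write-backs give a miss ratio of 0.15 and an I/O ratio of 0.20; the I/O ratio can never be smaller than the miss ratio.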

A secondary metric was the effective access time. We assigned a time cost to each disk access and computed the total delay that user programs would see when making accesses through the cache. This allowed us to evaluate the effects of varying the access time to the disk storage on performance.
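The text does not give the exact formula used for this metric, so the following is only one plausible sketch. It assumes that the effective access time is the total delay divided by the number of requests, that every disk I/O costs the same `disk_cost_ms`, and that a hypothetical per-request cache cost `hit_cost_ms` defaults to zero.

```python
def effective_access_time(requests, disk_ios, disk_cost_ms, hit_cost_ms=0.0):
    """Average delay per request, in milliseconds (assumed definition).

    requests     -- number of block accesses made by user programs
    disk_ios     -- number of those accesses that caused a disk I/O
    disk_cost_ms -- assumed time cost of one disk access
    hit_cost_ms  -- assumed cost of going through the cache, paid on
                    every request (hypothetical parameter)
    """
    if requests <= 0:
        raise ValueError("requests must be positive")
    total_delay = disk_ios * disk_cost_ms + requests * hit_cost_ms
    return total_delay / requests
```

Varying `disk_cost_ms` while holding the counts fixed corresponds to the experiment described above, in which the access time of the disk storage is changed and its effect on performance is observed.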

Often in the traces, programs made requests in units much smaller than the block size. We counted each of these requests as a separate access, usually satisfied from the cache. Because it more closely simulates the actual performance that programs will see, we chose not to collapse redundant requests from programs even though this results in lower miss and I/O ratios and effective access times.

The results are reported only after the simulator reaches steady state. That is, block accesses and misses that occur before the cache has filled are ignored. Modified blocks left in the cache at the end of the simulation are not forced out, because this would unrealistically increase the miss and I/O ratios.
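The steady-state rule can be sketched as a counter that discards everything seen before the cache first fills. The class name `SteadyStateCounter` and its interface are hypothetical, and treating the access that fills the last free block as part of the warm-up period is an assumption.

```python
class SteadyStateCounter:
    """Counts accesses and misses only once the cache has filled."""

    def __init__(self, capacity):
        self.capacity = capacity  # number of blocks in the simulated cache
        self.warm = False
        self.accesses = 0
        self.misses = 0

    def record(self, hit, blocks_in_cache):
        """Record one block access.

        hit             -- True if the block was found in the cache
        blocks_in_cache -- cache occupancy after this access
        """
        if not self.warm:
            # Still filling: ignore this access, but note when the cache
            # becomes full so that later accesses are counted.
            if blocks_in_cache >= self.capacity:
                self.warm = True
            return
        self.accesses += 1
        if not hit:
            self.misses += 1
```

Without this rule, the compulsory misses that fill an initially empty cache would inflate the reported ratios, and more so for larger caches.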

In document WRL 87 4 pdf (Page 34-36)