Lecture 27
Week 13 – Chapter 4
Long-term Information Storage
Requirements:
1. Must store large amounts of data
2. Information stored must survive the termination of the process using it
Files
• are logical units of information created by processes
• are independent of each other
• each can be thought of as a kind of address space, one that models the disk instead of RAM
• Files are an abstraction mechanism. They provide a way to store information on the disk and read it back later
File system
• The part of the operating system dealing with files is known as the file system
• It covers how files are:
– structured
– named
– accessed
– used
– protected
– implemented, and managed
File Naming
• Many operating systems support two-part file names
• The first part is the identifier or actual name
• The second part is the extension
• An extension can be associated with the application that opens files of that type
• Some file systems distinguish between upper- and lower-case letters, whereas others do not.
File Structure
• Three kinds of files
1. Unstructured sequence of bytes
2. Sequence of fixed-length records
3. Tree of records
File Types
• Regular Files
– are the ones that contain user information
• Directories
– are system files for maintaining the structure of the file system
• Regular files are generally either ASCII files or binary files
– ASCII files consist of lines of text, each terminated by a carriage return (\r), a line feed (\n), or both, depending on the system
[Figure: (a) An executable file of early UNIX]
File Access
• Sequential access
– read all bytes/records from the beginning
– cannot jump around, but can rewind or back up
– convenient when the medium was magnetic tape
• Random access
– bytes/records read in any order
– essential for database systems
– read can be …
• move file marker (seek), then read or …
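The difference between the two access methods can be sketched in a few lines of Python. This is only an illustration: the 16-byte record size, the file name, and the helper names `read_record` and `read_record_sequentially` are assumptions made for the example, not part of the lecture.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes


def read_record(f, n):
    """Random access: move the file marker to record n (seek), then read."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)


def read_record_sequentially(f, n):
    """Sequential access: rewind, then read every record up to and including n."""
    f.seek(0)  # rewind to the beginning
    record = b""
    for _ in range(n + 1):
        record = f.read(RECORD_SIZE)
    return record


# Build a small demo file holding 5 fixed-length records.
path = os.path.join(tempfile.mkdtemp(), "records.bin")
with open(path, "wb") as f:
    for i in range(5):
        f.write(f"record-{i}".encode().ljust(RECORD_SIZE, b"."))

with open(path, "rb") as f:
    direct = read_record(f, 3)
    scanned = read_record_sequentially(f, 3)
```

Both calls return the same record; the random-access version reaches it with one seek, while the sequential version has to read everything in front of it.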
File Attributes
File Operations
1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
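As a rough illustration, the operations listed above map onto the low-level calls in Python's `os` module as follows. The file name and contents are made up for the example, and other systems may expose the same operations under different names.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Create + Open: O_CREAT creates the file; os.open returns a file descriptor.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)

# Write: bytes are placed at the current position (the start of the new file).
os.write(fd, b"hello")

# Seek: move the current position back to the beginning.
os.lseek(fd, 0, os.SEEK_SET)

# Read: fetch up to 5 bytes from the current position.
first = os.read(fd, 5)

# Close: release the descriptor.
os.close(fd)

# Append: with O_APPEND every write goes to the end of the file.
fd = os.open(path, os.O_WRONLY | os.O_APPEND)
os.write(fd, b" world")
os.close(fd)

with open(path, "rb") as f:
    contents = f.read()

# Delete: remove the file's directory entry.
os.remove(path)
exists_after_delete = os.path.exists(path)
```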
Directories
Single-Level Directory Systems
• A single-level directory system
– contains 4 files
Two-level Directory Systems
Hierarchical Directory Systems
File System Layout
• File systems are stored on disks
• Disks are divided into one or more partitions
• Each partition can have a different file system on it
• Sector 0 is called the MBR (Master Boot Record)
– used to boot the computer
– the end of the MBR contains the partition table
• gives the starting and ending addresses of each partition
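To make the partition table concrete, here is a minimal sketch that parses one, assuming the classic PC MBR layout: a 512-byte sector, four 16-byte entries starting at byte 446, and the signature bytes 0x55 0xAA at the end. In that layout each entry stores a starting sector and a sector count, so the ending address is computed. The demo sector is synthetic, and its partition values and the helper names are made up.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # where the partition table starts in the classic PC MBR
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80   # status byte of the active (bootable) partition
# status, 3 bytes CHS start (skipped), type, 3 bytes CHS end (skipped),
# starting sector, sector count
ENTRY_FORMAT = "<B3xB3xII"


def parse_partition_table(mbr):
    """Return a list of dicts describing the non-empty partition entries."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        status, ptype, start, count = struct.unpack_from(ENTRY_FORMAT, mbr, offset)
        if count == 0:
            continue  # unused table slot
        partitions.append({
            "index": i,
            "active": status == ACTIVE_FLAG,
            "type": ptype,
            "start": start,
            "end": start + count - 1,
        })
    return partitions


def build_demo_mbr():
    """Build a synthetic MBR with two partitions (made-up values)."""
    mbr = bytearray(SECTOR_SIZE)
    entries = [(ACTIVE_FLAG, 0x83, 2048, 1000), (0x00, 0x83, 3048, 500)]
    for i, entry in enumerate(entries):
        struct.pack_into(ENTRY_FORMAT, mbr, TABLE_OFFSET + i * ENTRY_SIZE, *entry)
    mbr[510:512] = b"\x55\xaa"
    return bytes(mbr)


parts = parse_partition_table(build_demo_mbr())
```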
File System Layout
• When the computer boots
– the MBR is read in and its program is executed
– the MBR program locates the active partition
• reads in its first block, called the boot block
• executes it
– the program in the boot block loads the operating system contained in that partition
• Superblock
– contains all the key parameters about the file system
– is read into memory when the computer is booted
• Superblock includes
– a magic number to identify the file system type
– the number of blocks in the file system
– other key administrative information
• Free space management
– information about free blocks, kept as a
• linked list, or
• bitmap
• I-nodes
– an array of data structures, one per file
– each tells all about its file
• Root directory
– contains the top of the file system tree
• The remainder of the disk contains all the other directories and files
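The bitmap form of free-space management mentioned above can be sketched as follows. This is a minimal in-memory model, assuming one bit per disk block with 0 meaning free; the class and method names are hypothetical.

```python
class FreeBitmap:
    """One bit per disk block: 0 = free, 1 = in use."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # all blocks start free

    def is_free(self, block):
        return ((self.bits[block // 8] >> (block % 8)) & 1) == 0

    def allocate(self):
        """Mark the first free block as used and return its number."""
        for block in range(self.num_blocks):
            if self.is_free(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("disk full")

    def release(self, block):
        """Mark a block as free again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF


bitmap = FreeBitmap(10)
a = bitmap.allocate()
b = bitmap.allocate()
c = bitmap.allocate()
bitmap.release(b)
d = bitmap.allocate()  # reuses the block that was just released
```

A 10-block disk needs only 2 bytes of bitmap here, which is why the bitmap form is compact compared with a list of free block numbers.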
File allocation methods
1. Contiguous Allocation
2. Linked List Allocation (Chained Allocation)
3. Linked List Allocation Using a Table in Memory
4. I-Nodes
Contiguous Allocation
• Store each file as a contiguous run of disk blocks
• Example
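– on a disk with 1-KB blocks, a 50-KB file would be allocated 50 consecutive blocks
• Advantages
– Simple implementation: only two numbers to remember per file
• disk address of the first block
• number of blocks in the file
– Fast reading performance: the whole file can be read in one operation after a single seek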
• Disadvantage
– Fragmentation
• Solution?
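A small sketch of the bookkeeping, assuming 1-KB blocks as in the example above. The first-fit hole search and the hole list are illustrative assumptions, used only to show how fragmentation can block an allocation even when enough total space is free.

```python
import math

BLOCK_SIZE = 1024  # 1-KB blocks, as in the example


def blocks_needed(file_size):
    """Number of consecutive blocks a file of file_size bytes occupies."""
    return math.ceil(file_size / BLOCK_SIZE)


def disk_block(first_block, num_blocks, offset):
    """Map a byte offset within the file to its disk block number."""
    index = offset // BLOCK_SIZE
    if index >= num_blocks:
        raise ValueError("offset beyond end of file")
    return first_block + index  # only the start address and length are needed


def first_fit(holes, num_blocks):
    """Return the start of the first free run that is large enough, else None."""
    for start, length in holes:
        if length >= num_blocks:
            return start
    return None


n = blocks_needed(50 * 1024)
block = disk_block(first_block=200, num_blocks=n, offset=10_000)

# Hypothetical free runs (start, length) left behind by deleted files:
# 70 blocks are free in total, but no single run can hold 50 blocks.
holes = [(0, 30), (100, 40)]
placement = first_fit(holes, n)
total_free = sum(length for _, length in holes)
```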
• Useful on write-once optical media
– CDs
– DVDs
• All file sizes are known in advance and, once written, files never need to be altered
• Hence no fragmentation develops
Linked List Allocation
• Files can be stored by keeping each one as a linked list of disk blocks
• First word of each block is used as a pointer to the next block.
• The rest of the block is for data.
• Advantages
– No external fragmentation (only internal fragmentation in the last block)
– Every block can be used
• Disadvantages
– Random access is very slow
• To get to block n, the operating system has to start at the beginning and read the n-1 blocks prior to it, one at a time
– The amount of data stored in a block is no longer a power of two, because the pointer takes up a few bytes
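The cost of random access under this scheme can be sketched with a toy in-memory "disk". The block numbers, the contents, and the `END` marker are made up for illustration; each dictionary lookup stands in for one disk read.

```python
END = -1  # assumed marker for "no next block"

# Simulated disk: block number -> (pointer to next block, data).
disk = {
    4: (7, b"AAAA"),
    7: (2, b"BBBB"),
    2: (10, b"CCCC"),
    10: (END, b"DDDD"),
}


def read_block_n(disk, first_block, n):
    """Return (data, prior_reads) for the n-th block of a file, counting from 1."""
    current = first_block
    prior_reads = 0
    for _ in range(n - 1):
        next_block, _unused = disk[current]  # a disk read just to get the pointer
        prior_reads += 1
        if next_block == END:
            raise IndexError("file has fewer blocks than requested")
        current = next_block
    _pointer, data = disk[current]
    return data, prior_reads


data, prior_reads = read_block_n(disk, first_block=4, n=4)
first_data, first_prior = read_block_n(disk, first_block=4, n=1)
```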
Linked List Allocation Using a Table in Memory
• The pointer word from each disk block is moved into a table in memory
• Both disadvantages of the previous method are avoided: the whole block holds data, and the chain can be followed without disk reads
• Such a table in main memory is called a FAT (File Allocation Table)
• Disadvantage:
– The entire table must stay in main memory all the time, and its size grows in proportion to the disk
– E.g. with a 200-GB disk and a 1-KB block size, the table needs 200 million entries
– at 3 to 4 bytes per entry, it would take 600-800 MB of RAM
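A minimal sketch of the idea and of the size arithmetic above. The table contents and the `END` marker are made up, and the size calculation assumes binary units (1 GB = 2^30 bytes) and the 3- or 4-byte entry sizes that reproduce the 600-800 MB figure.

```python
END = -1  # assumed end-of-chain marker

# fat[i] holds the number of the block that follows block i in its file.
fat = [None] * 16
fat[4], fat[7], fat[2], fat[10] = 7, 2, 10, END


def file_blocks(fat, first_block):
    """Follow the chain entirely in memory; no disk reads are needed."""
    blocks = []
    current = first_block
    while current != END:
        blocks.append(current)
        current = fat[current]
    return blocks


def fat_size_bytes(disk_bytes, block_bytes, entry_bytes):
    """One table entry per disk block."""
    return (disk_bytes // block_bytes) * entry_bytes


KB, MB, GB = 2**10, 2**20, 2**30
chain = file_blocks(fat, 4)
small = fat_size_bytes(200 * GB, 1 * KB, 3) // MB   # 3-byte entries
large = fat_size_bytes(200 * GB, 1 * KB, 4) // MB   # 4-byte entries
```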
I-Nodes
• A data structure associated with each file, which lists
– the attributes
– the disk addresses of the file's blocks
• Advantage
– The i-node need only be in memory when the corresponding file is open
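A rough model of the i-node idea: each i-node carries the attributes and block addresses of one file, and only the i-nodes of open files are held in the in-memory table. The field names, the sample values, and the helper functions are hypothetical, and real i-nodes also use indirect blocks, which this sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    size: int                                     # attribute: file size in bytes
    owner: str                                    # attribute: owning user
    blocks: list = field(default_factory=list)    # disk addresses of the file's blocks


# Stand-in for the on-disk i-node array (index = i-node number).
disk_inodes = [
    Inode(size=3000, owner="alice", blocks=[12, 13, 40]),
    Inode(size=500, owner="bob", blocks=[7]),
]

open_inodes = {}  # in-memory table: holds i-nodes of open files only


def open_file(inumber):
    """Bring the file's i-node into memory."""
    open_inodes[inumber] = disk_inodes[inumber]
    return inumber


def close_file(inumber):
    """Drop the i-node from memory once the file is closed."""
    del open_inodes[inumber]


def block_for_offset(inumber, offset, block_size=1024):
    """Find the disk block holding a byte offset of an open file."""
    return open_inodes[inumber].blocks[offset // block_size]


open_file(0)
block = block_for_offset(0, 2500)
in_memory_while_open = len(open_inodes)
close_file(0)
in_memory_after_close = len(open_inodes)
```

Unlike the FAT, the memory needed here grows with the number of open files rather than with the size of the disk.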
Implementing Directories
• In a simple design, a directory consists of a list of fixed-size entries, one per file, containing
– a fixed-length file name
– a structure of the file attributes
– one or more disk addresses telling where the disk blocks are
• In file systems that use i-nodes, the directory entry becomes shorter:
– just a file name
– and an i-node number
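A fixed-size directory entry of the shorter, i-node-based kind can be sketched with `struct`. The 16-byte layout used here (a 2-byte i-node number followed by a 14-byte name) is an illustrative assumption, as are the file names and i-node numbers.

```python
import struct

# Illustrative entry layout: 2-byte i-node number + 14-byte fixed-length name.
ENTRY = struct.Struct("<H14s")


def pack_entry(name, inumber):
    """Encode one fixed-size directory entry."""
    raw = name.encode("ascii")
    if len(raw) > 14:
        raise ValueError("name too long for a fixed-length entry")
    return ENTRY.pack(inumber, raw)  # struct pads short names with NUL bytes


def lookup(directory, name):
    """Scan the directory entry by entry; return the i-node number or None."""
    for offset in range(0, len(directory), ENTRY.size):
        inumber, raw = ENTRY.unpack_from(directory, offset)
        if raw.rstrip(b"\x00").decode("ascii") == name:
            return inumber
    return None


directory = pack_entry("notes.txt", 12) + pack_entry("a.out", 31)
found = lookup(directory, "a.out")
missing = lookup(directory, "missing")
```

The fixed-length name field is what makes long file names awkward in this design: a longer name simply does not fit in the entry.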
[Figure: (a) A simple directory with fixed-size entries]
• Two ways of handling long file names in a directory
– (a) In-line
– (b) In a heap
Shared Files
• When more data is appended to a shared file, new blocks are added to the file
• How would both directories know the file has more blocks?
• If directories hold the disk addresses, only the directory of the user doing the append lists the new blocks
• Two solutions:
Shared Files (2)
Disk Space Management (2)