Rochester Institute of Technology
RIT Scholar Works
Theses
Thesis/Dissertation Collections
1986
Addition of Structured Records to the UNIX™ and
MS-DOS™ File Systems
Nancy Wisotzke
Follow this and additional works at:
http://scholarworks.rit.edu/theses
This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion
in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact
ritscholarworks@rit.edu.
Recommended Citation
Rochester Institute of Technology
School of Computer Science and Technology
Addition of Structured Records to the
UNIX(tm) and MS-DOS(tm) File Systems
by
Nancy Wisotzke
A thesis, submitted to
The Faculty of the School of Computer Science and Technology,
in partial fulfillment of the requirements for the degree of
Master of Science in Computer Science
Approved by:
Professor Jeff Lasky
Professor Warren Carithers
TItle of Thesis:
~
t
D
I T(
0
&
'
i
v',
f
\
'
I
I' , _ '1 ' ; ')1
hereby
~/deny)
permission
to
the
Wallace Memorial Library, of
RIT,
to reproduce
m~;sis
in whole or in
part. Any reproduction will not be for commerclal- use or profit.
-t
UNIX is a registered trademark of AT&T Bell Laboratories
MS-DOS is a registered trademark of Microsoft
Corporation
Digital Equipment
Corporation,
my employer, assumes noCONTENTS CHAPTER 1
1.2
1.3
1.4 INTRODUCTIONCOMPARISON OF UNIX AND
MS-DOS NATIVE FILE SYSTEMS
2
IMPLEMENTATION OF THE EXTENSIONS 2
THESIS ORGANIZATION 3
CHAPTER 2 2, 2, 2, 2. 2. 2. 2. 2. 2 2 2 2 2 2 2
THE UNIX FILE SYSTEM
I/O SYSTEM IMPLEMENTATION
13
CREAT 14
OPEN
16
WRITE
17
READ
19
I/O BUFFERS 21
STRUCTURING OF THE FILE SYSTEM 22
Addition Of
Library
Routines 23Fixed Length Records 24
Variable Length Records 26
STRUCTURING AT THE SYSTEM LEVEL 28
Data Structure Changes
28
Fixed Length Records 29
Variable Length Records 32
CONCLUSIONS 33
CHAPTER 3 THE MS-DOS FILE SYSTEM
3.2 I/O SYSTEM CALLS
42
3.2.1 CREATE 4
3
3.2.2 OPEN
43
3.2.3 WRITE 44
3.2.4 READ 4
5
3.3 STRUCTURING OF THE FILE SYSTEM 45
3.3.1 FIXED LENGTH RECORDS
47
3.3.2 VARIABLE LENGTH RECORDS
53
3.3.3 STREAM I/O
6
03.4 CONCLUSIONS
65
CHAPTER 4 4
4
4 44
LESSONS LEARNED2
FUTURE ENHANCEMENTS67
2.1 Indexed File System
67
2.1.1 Data Structures
-UNIX
69
2.1.2 Data Structures
-MS-DOS
69
Page
2
4.2.1.5
Open 714.2.1.6
Close 714.2.1.7
Indexing
Methods 724.2.1.7.1
B Tree 724.2.1.7.2
B+ Tree 754.2.1.8
Write 764.2.1.9
Read 774.2.1.10
Remove 784.2.2 Conclusions 79
4.3
NET. UNIX OPINIONS ON UNIX FILE SYSTEM .... 80APPENDIX A DATA STRUCTURES AND GLOSSARY
A.l MS-DOS DATA STRUCTURES
82
A. 2 GLOSSARY
8 5
CHAPTER 1
INTRODUCTION
A
file
system consists of operating system codesupporting
the I/O operations that open, close, create,read and write to files. Record oriented I/O is not a
feature
of either the UNIX or the MS-DOS native filesystem. The
data
is storedby
the file system as asequence of
bytes.
It is the task of the user to designan application
dependent
structure on thefile
system.In the case of
logically
relatedfixed
or variablelength
data,
a record structureis
desirable.This thesis describes a set of extensions to the
UNIX and the MS-DOS file systems that
supply
a recordstructure to these native file systems. The
descriptions of both the UNIX and the MS-DOS native
file
systems are followed
by
the extensions to eachfile
system. Program code
is
provided todemonstrate
the1.2
COMPARISON
OF UNIX AND MS-DOS NATIVE FILE SYSTEMSThe UNIX and MS-DOS file systems
have
manysimilarities. Each file system provides minimal
functionality;
thatis,
thebasic
abilities to read,write, open, close and create files.
Reading
andwriting
are sequential operations under both filesystems. Both the UNIX
file
system and the MS-DOS filesystem are hierachical in structure. Each contains a
root
directory
with branchesleading
to leaves ofalternate
directories
andfiles.
A fileis
foundby
tracing
a pathleading
from the root directory. Nomajor
differences
were found between the two filesystems.
1.3
IMPLEMENTATION OF THE EXTENSIONSThe extensions to the MS-DOS
file
system areimplemented in
KOALA,
a Digital Equipment Corporationproprietary language. KOALA is a block structured
language similar to
ALGOL,
ADA and PASCAL. KOALA offersuser-defined statements, functions and operators. Data
types consist of
integers,
bytes,
strings, arrays,characters and records. A record can consist of
any
combination of
data
types. KOALA routines consist offunctions and statements, where
functions
return avalue, and both
functions
and statements acceptparameters. When another program needs to call a
-function
or statement the term EXPORTis
placed in frontof
that
function
or statement definition. The sameholds
true for constants, variables and types. The termEXTERNAL placed in
front
of a function or statementdefinition
indicates
that the function or statement isfound
in
a macroassembly
language
module.The extensions to the UNIX file system are
described
from the user point of view. Thedifferent
methods which provide record structure are presented
along
with the advantages and disadvantages of each.1.4 THESIS ORGANIZATION
The
remaining
chapters of this thesis arestructured as
follows:
Chapter two
describes
the native UNIX file systemand its extensions. It details the internal structures
present, the allocation of space on a
disk,
and thenative
file
systemfunctions.
The extensions tothese
functions to provide record structures are then
discussed.
Chapter three
describes
the native MS-DOSfile
allocation and each system I/O function are discussed.
The system
function
descriptions
are followedby
therecord structure extensions. Application programs which
demonstrate
the use of the extended MS-DOSfile
systemare also presented. The code which extends the native
MS-DOS
file
systemis
documented
and accompaniedby
auser guide. The
demonstration
of the new file systemand
its
record structure areimplemented
on a Rainbowpersonal computer
running
MS-DOS version2.11.
Chapter four summarizes what was learned while
structuring
these two file systems. Future enhancementsto the
file
systems are also discussed. The indexedrecord structure
is
describedalong
with the B+treeindexing
method. The integration of the B+tree indexinto
the UNIXfile
system and the MS-DOS file system ispresented.
Appendix A
includes
data
structures,illustrations
of both the UNIX and the MS-DOS
file
systems, and aglossary.
The
bibliography
follows the Appendix. Eachentry
includes a short description of the
book
or article.-CHAPTER 2
THE UNIX FILE SYSTEM
A
file
under UNIX is a named sequence ofbytes.
UNIX
files
are organized in a hierarchialdirectory
structure. The root
directory
is at thetop
of thehierarchy.
Afile
is
foundby
searching through thedirectory
tree. The pathname describes this search.The pathname consists of a series of
directories,
separated
by
slashes(
'/')
, followedby
the name of thefile. If a path name begins with a '/' then the search
begins
at the rootdirectory,
if not, the searchbegins
at the current directory. The current
directory
is
thedirectory
from which you candirectly
access afile.
Tochange the current
directory,
the cd commandis
used.Every
processhas
a current directory. Thereis
a ".."and a "." entry found in each directory. The ".." entry
refers to the parent
directory
and the "." refers to thedirectory
itself. In the case of the rootdirectory
The
file
systemis
accessedusing
block
I/O. InUNIX,
adisk
is
anarray
ofblocks,
divided
into
four orfive
areas(depending
on the UNIX version). (See Figure2-1).
Thefirst
disk
block
is
not usedby
thefile
system. Its use varies from system to system; however
it
often contains the initial UNIX systembootstrap(
start-up) program.The super block plays an
important
role inmaintaining
thefile
system. There are two versions ofthe super
block,
the on-disk and the in-core superblock.
The second disk blockholds
the on-disk superblock.
This blockis
read in at system initializationtime and
holds
information on the size of the disk andthe
layout
of the other areas on the disk. It alsokeeps
track of free block andinode
usuage on thedisk.
The
in-core
super blockis
allocated when a mountcommand
is
issued
and released when an unmount commandoccurs. Mount causes a file on the original
file
systemto refer to the root
directory
of the file on theremovable volume. Once the mount is executed, the files
on the
file
system are accessed as ifthey
are part ofthe original
file
system. The UNIXkernel's
mount tableholds the in-core super
blocks
of all mountedfile
block
from
the mount table.Directories are considered files on UNIX with the
restriction that
they
cannot be written to. Each filehas
anentry
in either the rootdirectory
or asub-directory. Since these sub-directories are
considered
files,
they
have
anentry
in their parentdirectory.
Directory
entries are16
bytes in length andconsist of a 2 byte i-number and a 14 byte file name.
Each
directory
entry maps the file name to itscorresponding
inode.
The UNIX
file
system maintains an ilist(following
the super
block)
on each disk. The ilist contains aninode
for each file ordirectory
on thedisk.
Theoffset of an inode into the ilist is the i-number.
The inode is the central structure for all file
access in UNIX. There are two types of
inode
structures, the on-disk
inode
and thein-core
inode.
Afile that resides on disk and
is
not currentlybeing
accessed
has
aninode
entry in theilist
or on-diskinode table. The information found
in
the on-diskinode
is
read into an in-core inode when thefile
is
accessed.with the
file.
In-coreinodes
are allocated for eachfile
that
is
active,for
currentdirectories,
for eachmounted-on
file,
for each textfile,
and for the root.The root
directory's
inode is at a known i-number andall
files
may
be foundby
tracing
a path from the root.This
information
is
used when converting a path name toan
in-core
inode
table entry. A search for a file'sinode
starts at some known inode (either the root or theinode
of the current directory). Thisdirectory
issearched for the next portion of the pathname. The
directory
entry
found gives an i-number and afilename.
When the
filename
found matches the last component ofthe pathname, the i-number of the
directory
entry mapsto the desired in-core inode.
Any
modifications to theinode
are made at thislevel.
When the fileis
nolonger accessed the in-core inode is copied back to the
ilist
and thein-core
versionis
freed.
The in-coreinode
is
described
next with the fields that are commonto both inodes indicated.
UNIX Disk Structure
+ +
|
BOOT BLOCK|
+ +
|
SUPER BLOCK|
+ +
I
|
|
I-LISTj
I
I
+ +
I
|
|
FREE BLOCKSj
I
I
I
I
+ +
I
I
|
SWAP AREA|
I
I
+ +
Figure 2-1
The
inode
structureholds
the filedescription.
The first four fields are unique to the in-core inode.
The
flag
field
indicates
whether theinode
is
locked,
has
been modified, is mounted on, has a process waitingon the
lock,
or if the access time needs to be updated.The count field holds a reference count of the number of
files which reference the
inode.
If this referencecount
is
zero, the inode is unallocated. Theremay
beseveral file table entries that point to one inode. The
device on which the inode resides is
held
in a fieldin
the inode. The internal file name for the
inode,
calledthe i-number on the UNIX
file
system,is
held
in theThe
information
for the remaining inode fieldsis
read
into
thein-core
inode
from the on-disk inode onthe volume. The
fields
consist of a mode (whichdescribes
file
usage,type,
permissions),links
to thedirectory
entries, the owner, the groupid
of the owner,the
file
size, the addresses of the device thatconstitute the
file,
the time that the file was lastaccessed and
last
modified, and the time that the inodewas
last
modified.The
device
addresses in the inode point to freespace where disk
blocks
for the file can be obtained.(See Figure 2-2). The free block area is the fourth
section of a
disk.
The size of a block differs over thevarious versions of UNIX. A block
is
traditionally
512bytes;
however,
under 4.1BSDits
size is IKbytes,
andunder 4.2BSD a block can be up to 8K
bytes.
Underversion 6 there are 8
device
address pointers in theinode- This number becomes 13 in version 7. Under
version 7
UNIX,
the first ten addresses in this fieldpoint to the first ten blocks of
free
space availablefor a file. The eleventh address points to a block that
contains the addresses of the next 128
blocks
of thefile. The twelfth block points to 128 more
blocks,
eachwhich in turn point to 128 blocks of free space for the file. The thirteenth address
holds
"triple
indirect"-addressing. Each address points to
128
blocks,
eachwhich
in
turn point to128
blocks
that point to128
blocks
offree
space. The maximum file size, in thiscase.-
is
2,113,673
blocks.
Tripleindirect
addressing
is not used. Most
disks
are smaller than the onebillion
byte
file
sizelimit.
File allocation occurs when space is reserved for a
new
inode
in thememory
resident inode table and thedisk inode is copied into this entry.
There
is
a fifth disk area in some versions ofUNIX. This
is
theswap
area. This areaholds
the userInode File Access
Inode Device
Address
+ +
|
0
...9
|-|
Directj
|
Accessj-+ +
I
10
|-+ +
I
HI-+ +
I
12
|-+ +
File
Blocks
+ +
>|
0
1st
Indirect BlockDouble Indirect Block Triple Indirect Block
I
V + +|
128
|-JBlocks
j-+ +
>l
9+
+ +
|
10|
128 |>|
...>|Blocksj
>j
137
+ + +
|
138+ +
|
|
128
|>|
|
|
>|Blocksj
j
V
|
+ +|
+ +
|
...|
|
128|
|
+ +|
|Blocks|
>|
128|
>|
+ +
j
Blocks|
j
+ +
|16521
+ + +
|
128|
>+ +|16522
>|Blocks|
|
128|
j
+ +j
Blocksj
>j
+ +
j
+ + ...
|
|
128|
+ +|
>|Blocksj
>|
128|
j
+ +
j
Blocksj
>j
+ +
|2113673
+
Figure 2-2
-2.2
I/O SYSTEM IMPLEMENTATIONA user program
is
an application which has a needto access UNIX
files.
The user accesses a fileby
executing
a create or open call to the UNIX operatingsystem. The UNIX I/O system contains the code that
manipulates the file system. A call to the UNIX I/O
system to open or create a
file
causes the process'user
descriptor
to point to the open filearray
or table.This table
holds
a pointer to the entry in the inodetable. The inode table points to the actual file. (See
Figure
2-3
)
.UNIX File Access from User Descriptor to File
Process Data Area
Open File Array/Table
System Inode Files Table
I
|
User|-+ +
|
Descriptors|
> +- I-
+-I
|->
-+ + +I
I
I
-+
|->|
|
|-|
+ +-+
-| + +
-+
I
I
I
I
I
-+ + +
I
+-I
-+ Figure 2-3The open file array contains
file
structures whichhold information necessary for all
file
activity
undersuccessful a
file
descriptor
is
returned. The filedescriptor
is
the
file
system's method of accessing thefile.
Thefile
descriptor
is
a field in the filestructure. The
file
structure also contains a pointerto the next character to be read or written, a
flag
field
indicating
whether the file isbeing
read from orwritten
to,
and thelocation
of the I/Obuffer.
Different resources are needed for a file
depending
on
its
status. A file that existsbut
whose count fieldis zero needs an entry in the
directory
hierarchy,
adisk
inode
entry, and zero or more blocks of diskstorage- Once the
file
is
referenced an in-coreinode
entry
is
also required. A file thatis
opened forreading
or writing needs anentry
in the filearray
andan entry in the user program's process
data
area (whichpoints to a file array entry). Thus
far,
the allocationof storage, the file structure and the inode structure
have been described. I/O operations on files are
presented next;
it
is
here that the record structureis
implemented.
2.2.1 CREAT
To create a file that
does
notcurrently
exist, ortruncate an existing file to zero, the creat(path, mode)
system call
is
used. The path parameteris
a valid UNIX-pathname. The mode consists of three octal
digits
whichindicate
thefile
permissions of the owner, the groupand the world. The permissions are specified
by
4 forread, 2
for
write, and 1 for execute and are addedtogether to
specify
thedesired
permissions.Creat checks the pathname to see if a file with
that name
already
exists. In the case where thefile
does
not exist a coreinode
must be allocated for thefile.
The super block keeps track of these free inodes.Only
onehundred
inodes
arekept
in the on-disk superblock at a time and when the hundred are used, a linear
search of the ilist is performed to obtain the next
hundred
inodes.
The core inodefields
(flag,
count,device,
andi-number)
are then assigned. Adirectory
entry
is
made with the pathname and the i-number. Creatthen checks the permissions on the
file,
assigns valuesto the
file
structure, and calls a device open systemcall.
When a file of the same name already exists, access
permissions are checked and any previous contents of the
file
are removed. Thisis
done
by
a procedure whichfrees
all disk blocks that are associated with theinode. The file descriptor and the file structure are
to the
file
structure and thefile
is
opened.If an error occurs
during
the creat, the inodeis
released and -1
is
returned in place of the filedescriptor.
Errors occur when thedirectory
is
notfound,
thedirectory
cannot be writtento,
the fileexists and cannot
be
writtento,
the fileis
adirectory,
or there are toomany
openfiles.
2.2.2
OPENTo open a
file,
the application uses the open(path,mode) system call. The path parameter is a valid UNIX
pathname. The mode parameter indicates the type of file
access desired and is 0 for read, 1 for write, and 2 for
read and write access.
Open checks the pathname to see if a file with that
name already exists. If the file exists, an
entry
is
made into the inode table and an open file table entry
holding
the pointer to the inode table entry is created.Open associates a file descriptor with the file. This
file
descriptor mustbe
used for subsequent I/Ooperations on the file. The I/O pointer
is
positionedat the
beginning
of the file(byte
0)
and thefile
is
opened according to the access permissions requested
by
the user. If an error occurs
during
the open, a -1is
returned
in
place of thefile
descriptor.
Errors occurif the
file
does
not exist, if thedirectory
does
notexist,
if
thedirectory
cannotbe
read, if the filecannot
be
read (if open for read), or written (if openfor write), or if there are too
many
open files.The open and creat system calls set up entries in
three system tables. The
entry
in the open file tableindicates
whether the file is open for reading,writing
or
is
a pipe. Anentry
is made in the inode table andin
the user process table.2.2.3 WRITE
To write to a
file
inUNIX,
a callis
made to thewrite( f
iledesc,
buff,
nobytes) system call. Thefiledesc argument is the file descriptor that is
returned from the creat or open system call. The buff
argument
is
the buffer to write to the filefrom,
andthe nobytes argument is the number of
bytes
tobe
written.
Write converts the user file
descriptor
into
apointer to the file structure. It then checks that the
through
thefile
structure in the open file table.For a character special
file,
the appropriateprocedure
is
called to perform a write to the device. Aloop
is
thenbegun
to write the requested number ofbytes.
The physical block number of the file tobe
referenced
is
determined,
the character offset in thefile
is
set, and the minimum of the number of bytes leftin
the block and the number requested tobe
writtenis
set. If the
file
is
not a special blockfile,
thephysical
block
numberis
determined using the inodepointer and the block number determined previously. The
number of bytes to be written is checked. If it is a
full
block,
a buffer is assigned for the indicatedblock.
If the desired block is located andalready
associated, it
is
returned. The availablelist
holds
buffers
that are currently in usebut
may
be reassignedfor
an alternate use. The first buffer from thislist
is returned. In the case of less than a full block
being
written, the contents currently in the buffer areread. This
is
done
in order to save the contentsin
anarea of the buffer that is not in use.
The number of bytes requested for transfer are
moved from the user buffer area to the kernel
file
area.If the new value of the file offset points
beyond
end offile,
then
the
file
sizehas
increased.
Theinode
modification
flag
is
set and theloop
to write to thefile
continues,
aslong
as no error occurs and the countto
be
transferred
is
not zero. The number ofbytes
written
is
returnedfrom
thefile
system call. An errorhas
occurredif
this valueis
not the same as the numberof
bytes
requested for writing.If an error
has
occurredduring
the write, a -1is
returned
from
the system call. Errors occur when a badfile
descriptor,
buffer
address, or count are passed tothe system call, or when a physical I/O error occurs.
On an error the buffer
is
released to the availablelist.
No write aheadis
present inUNIX;
however,
if
the characters in the buffer
have
not changed the bufferis
marked and released. If the buffer is taken foranother use this mark indicates to the file system that
it
should be written out first.2.2.4
READA read is done with the read( filedesc,
buff,
nobytes) system call. The filedesc parameter
is
thefile
descriptor returned from the creat or open systemcall. The buff parameter
is
thelocation
of thebuffer
that
holds
the data thatis
tobe
read, and nobytesis
Read converts
the
file
descriptor
into a pointer tothe
file
structure. The access modein
theinode
is
checked to
determine
if
thefile
is
openfor
read. Acheck
is
then made toinsure
that the number ofbytes
tobe
readis
not zero. If this countis
zero, theprocedure returns. The
inode
is
accessed through thefile
structure in the open file table. Aloop
is thenset
up
totransfer
or read thedata
from thefile.
The
file
offset of the block to be readis
comparedto the
file
size. The number ofbytes
tobe
readis
setto the minimum of the number requested and the number of
bytes
left
in thefile.
The read-aheadblock
is
set.The
blocks
of thefile
may
be
read sequentially. Inthis case, once a block is read, I/O
begins
on theread-ahead block.
The number of bytes requested for transfer are
moved from the kernel area to the user
buffer
area. Theread
loop
continues until either an error occurs, oruntil all the bytes requested
have
been read. Thenumber of bytes read is returned from the system call.
If this value
is
zero, end of filehas
been reached.The number of bytes returned
may
be
less
than the numberrequested. This
is
not an errorbut
indicates
that
fewer than the requested number of
bytes
remainin
the-file.
If an error
has
occurredduring
the read, a -1is
returned
by
the system call. Errors occur when a badfile
descriptor,
buffer
address, or count are passed tothe system call, or when a physical I/O error occurs.
2.2.5
I/O BUFFERSAt this point, to conclude the description of the
UNIX
file
system, thebuffers
are described in moredetail. The I/O buffers are the method through which
the
kernel
keeps
track of the data from the user programand the
data
on the disk. Buffer headers are identifiedwith a device number and a block number. These
headers
are kept in two
lists,
the b-list and the av-list. Theb-list
links
together buffers that are associated withthe device type. The av-list is the available list
which
holds
buffers that canbe
released from theirpresent use and allocated
for
different tasks. Abusy
flag
is
set when a buffer is allocated from theavailable list. A done
flag
is
set when theinformation
in the buffer is correct. Another
flag
is set when thebuffer contents need to be written out before
being
reassigned. The buffers are not assigned to a
file
until needed. This allows for a small number of
buffers
buffers
untilthey
are needed and ifonly
part of thebuffer
is
needed, the remainder of the informationis
not altered. The
kernel
controls writing of the buffersand a
utility
programforces
buffers
to be written outunconditionally
twice a minute, in order to preservetheir contents.
2.3 STRUCTURING OF THE FILE SYSTEM
This section
describes
the extensions to the nativeUNIX
file
system which provide record structure. Thereare three
different
record formats provided to implementthis structuring. The record format is the way that the
record physically appears on the storage medium. The
three record formats supported are:
o Fixed length
o Variable length counted
o Variable length stream
When all of the records in a file are the same
length,
fixed length record format is used. The size ofthe record is set at compile time to the size of the
user data structure. Fixed length records are
represented
by
fixed length data types such as,integers,
characters, arrays, and structscontaining
fixed length fields.
-When
the
size of the recordsin
a file varies, thevariable
length
recordformat
is
used. Variable lengthrecords are represented as strings. In a variable
length
counted recordfile
thedata
is precededby
a twobyte
countfield,
which contains the size of the record.The variable
length
stream recordformat
delimits
eachrecord with a newline character. Both the two byte
count
field
and the newline character aid in theretrieval of the record from the
file.
There are two methods of adding these record
formats
to the native UNIX file system. One methodmodifies the inode and some of the existing system
calls. The other provides a set of
library
routines tothe user, that implement the record I/O operations.
Each of these structuring methods are now discussed.
2.3.1 Addition Of
Library
RoutinesThe addition of C
library
level functions tostructure the file system has several advantages. This
method makes the record
formatting
invisible to theuser, also this method
is
portable. Thelibrary
routines can be installed on other systems. One
disadvantage to this method is that a header record
is
used to hold information about the file on the storage
not
know
aboutthis
header.
The
header
canbe
attached to thedata
in thefile,
or
located
separately
from
thedata.
When the headeris
attached to the
file,
it
is
notnecessary
tokeep
trackof the address of the
first
data
record. Itimmediately
follows
theheader.
Thedisadvantage
to this methodis
that the
header
takes up space, and there may not beenough space available on the device to write both the
header
and thedata.
Using
the method where theheader
is
detached
from
thedata
makesit
necessary
tokeep
apointer to the
first
data
block of the file.2.3.1.1
Fixed Length RecordsThe record format of the file
is
decided
at compiletime
by
the format of the user data structure that isused throughout I/O operations on the file. The file
header
is
set up in the creat and openlibrary
calls.Space for the header is allocated in the creat, and the
record
format
fieldis
filledin.
The recordlength
is
also written to the header. This length
is
the size ofthe user data structure at compile time. The open
system call checks the file header to see if the user
data
structure's record format matches the recordformat
that was set up in the creat. If the record
formats
do
-not
match,
an error occurs.To write a record to
the
file,
thelibrary
functionrwrite(fd,
buff)
is
used. The fd parameter is thefile
descriptor
that is returned in the creat or open. Buffis
thebuffer
containing
the data tobe
written to thefile.
Thislibrary
function
makes a call to the writesystem call with the file
descriptor
and write bufferfrom
the rwrite call, and the record length from theheader
record. Write returns the number ofbytes
written, or -1 if an error has occurred. When the
number of
bytes
writtenis
returned, itis
checkedagainst the record
length.
If the two are not equal, anerror has occurred and a full record cannot be written.
Rwrite returns 0 when the write completes successfully,
and -1 when an error occurs.
To retrieve a record from a fixed length
file
therread(fd,
buff)
library
function is used. The fdparameter
is
the file descriptor thatis
returned fromthe creat or open call. Buff is the buffer that the
data
is read into. The rread function uses the systemread call to transfer the
data
from thefile.
Thesystem read
is
called with thefile
descriptor and thebuffer from the rread call, and the record
length
from
from
read, or -1is
returned when an error occurs. Thenumber of
bytes
readis
compared to the record lengthand
if
the
two
are not equal, an errorhas
occurred.Rread returns
0
when the read completes successfully,and -1 when an error occurs.
2.3.1.2
Variable Length RecordsThe
library
routines tohandle
variable lengthrecords use the counted record format. When stream
record
format
is
tobe
used, calls are made to theexisting
I/O system calls. These callscurrently
implement
stream I/O.The creat
library
function allocates and sets upthe
file
header. The record format fieldis
set tovariable. The record length field is set to
0,
indicating
that it is not used. This is because therecord length varies with each I/O operation on the
file. The open
library
function checks the recordformat
of the user data type against the recordformat
in
the file header. If the two are not the same, anerror
has
occurred and the open fails.-To write a variable
length
record to thefile
therwrite(fd,
buff)
library
function
is
used. Thefd
parameter
is
thefile
descriptor
thatis
returned froman open or creat. The buff parameter is the buffer to
write
from.
Rwrite uses the strlen function on thedata
in
buff
todetermine
the recordlength.
The lengthbytes
are concatenated onto thedata
string
inside buff.A call
is
made to the write system call to transfer thedata
to thefile.
Write takes thefile
descriptor,
thecount/write
buffer,
and the record length plustwo,
asparameters. Two is added to the record length to
accomodate the two byte count field. The number of
bytes
written is compared to the record length. Anerror occurs if the two are not equal. Rwrite returns
the number of
bytes
written, or -1 if an errorhas
occurred.
To read a variable length record from the
file,
therread(fd,
buff)
library
functionis
called. The fdparameter
is
thefile
descriptor thatis
returned fromthe creat or open. Buff is the buffer to transfer the
data
from the file into. The rread functionfirst
callsthe read system function with the fd and buff parameters
from rread, and two for the record length parameter.
Two bytes are read from the file
into
buff,
which nowexecuted again with
fd,
buff,
and the record length(that
wasjust
returned),
as parameters. Read returnsthe number of
bytes
read, or -1 on an error. The numberof
bytes
readis
compared to the record length and anerror
is
returnedif
they
are not equal. Rread returnsthe number of
bytes
read when the read completessuccessfully,
or -1 when an error occurs.2.4 STRUCTURING AT THE SYSTEM LEVEL
An alternative to the
library
routine method ofstructuring
is
the addition of the record structuringcode at the system level. This method is used when the
applications need to know about record
formats.
Theuser must learn about the record formats available and
choose the one that best suits each application. One
disadvantage
to this method is that changes are made tothe
inode
and some system calls, which adds code andcomplexity
to the native UNIX file system.2.4.1 Data Structure Changes
At this point, a discussion of the data structure
changes
is
presented, followedby
a discussion of thechanges to the system calls. The first change
is
to thein-core inode. A field
is
added to theinode
whichdescribes the record format of the
file
as eitherfixed,
variable counted, or variable stream. A new system
-call, set_rec_format
(
format
)
,is
added to thekernel
code to set
this
field
in
theinode.
The user makes acall to this
function
before
issuing
an open or creatsystem call, to
indicate
the type of variable recordformat
thatis
tobe
used. The creat and open systemfunctions
are changed to set the recordformat
field inthe
inode.
If the set_rec_format callis
notissued,
acheck
is
madeinside
the creat or open, to see if therecord
format
is
fixed
(the
userdata
structureis
fixed
length).
ifit
is
not, the variable recordformat
defaults
to stream. Theinode
record format fieldis
set accordingly. The user
interface
to the creat andopen system calls
does
not change.2.4.2 Fixed Length Records
The write(fd,
buff,
reclen) system call is used towrite records to the file. The user interface to this
call
does
not change. The fd parameter is thefile
descriptor that
is
returned from the open or creatsystem call. The buff parameter
holds
thedata
whichthe user assigned to the record that is to
be
written.The reclen parameter
is
the length of this record, andremains constant throughout other writes to the file.
The number of bytes written
is
returnedby
the writefunction and is checked against reclen. An error
is
Random writes on
fixed
length
records are alsoallowed. The record number
is
passed to the writeby
the user. The random write call uses a new parameter.
This new system
function
is
added to UNIX in a layer ofcode above the
kernel
code. The random write functionprovides a method of
writing
to a specific record in thefile.
numwrit = randwrit
(
fildesc
,buf,
reclen, recno) ;The record
length(
reclen) is multipliedby
the recordnumber
(
recno) and lseek is called to position the filepointer to the
desired
record. The write system callis
then called to write the
data
to the file. The numberof
bytes
writtenis
checked against the record lengthand an error
is
returned if the two are not equal.To read records from the
file,
the read(fd,buff,
reclen) system call is used. The user interface to this
call does not change. The fd
is
the filedescriptor
returned from the creat or open, the buff parameter
is
the buffer that the record is to be read
into.
Forfixed length records, the count to be read is always the
size of the user data record. The reclen parameter
is
set to this size and remains fixed
during
reads to thefile.
A read should always return a record of the samesize. The number of bytes read is returned
by
the read-function.
A checkfor
the number ofbytes
transferredand the number of
bytes
requestedis
made. An erroris
raised
if
they
are not equal. A comparison mustbe
madebetween
the number ofbytes
requested and the numberremaining
in
thefile.
if the number requestedis
greater than the number of
bytes
remaining
in thefile,
end of
file
is
returned.Random access reads add a new parameter to the read
function
call. The record numberis
passed to thefunction
by
the user. This new system call is added toUNIX in a layer of code on
top
of the kernel code. Therandom read
function
provides a method forreading
aspecific record of the
file.
numread = randread( f ildesc
,
buf,
reclen, recno) ;Random reads are executed
by
multiplying
the recordnumber
(
recno)
by
the record size( reclen) andcalling
lseek to position the
file
pointer. The read systemfunction
is
then called to read the record. The numberof
bytes
readis
returned. This countis
checkedagainst the record size and an error is returned
if
thetwo
do
not match. The random read also tests for end offile.
Thisis
doneby
comparing the recordlength
tothe number of bytes remaining in the
file.
If therecord
length,
end offile
is
returned.2.4.3
Variable
Length RecordsA variable
length
record canbe
written to thefile
as a counted record or a stream record. To set the
record
format,
the set_rec_format call is used. Therecord
format
field
in theinode
is
then set in the openor creat system call. When the record
format
is
set tostream, the current read and write system calls are not
changed. These
functions
already
provide stream recordI/O. A
discussion
of the changesnecessary
toimplement
counted records
is
now presented.To write a counted record to the
file,
a new systemcall, rwrite(fd,
buff),
is used. This system call takesthe
file
descriptor,
and thedata
buffer to writefrom
as parameters. Inside this function the length of the
record
is
determined
by
the strlenfunction.
Thislength
is
concatenated to thefront
of thedata
tobe
written. The write system call
is
then used to writethe
data.
The buff parameterholds
the newdata
string
(count
anddata),
and the reclen parameteris
thelength
of the data plus the two bytes for the count. The
number of bytes written is returned and checked against
reclen. If the two are not equal, an error
is
returned.The rwrite call returns the number of data
bytes
that
-have
been
written tothe
file,
or -1if
an error occurs.A new system call,
rread(fd,
buff),
is
used to readvariable
length
counted records. Inside thisfunction,
a call
is
made to the read system call to retrieve thefirst
twobytes
from
the record. Thesebytes
representthe
length
of the record. The read system callis
executed again with this record length as a parameter.
The record's
data
is
then retrieved from the file. Thenumber of
data
bytes
readis
returnedfrom
the rreadcall, or -1
is
returned if an error has occurred.2.5
CONCLUSIONSThe current Unix
file
system and two methods ofstructuring
it
have
just
been presented. The systemadministrator must weigh the advantages and
disadvantages of each in order to
decide
which method touse. The amount of control that the u-ser requires over
CHAPTER 3
THE MS-DOS FILE SYSTEM
A
file
under MS-DOS consists of a series of bytes.Access to the
file
is
through a pathname. This chapterdescribes
the pathnames which access thefile,
thestorage of
directories
and files on thedisk,
and theorganization of a
disk.
Next,
the MS-DOSfile
systemcalls are
discussed
followed
by
the extensions to thenative
file
system which implement a record structure.The
directory
structure on MS-DOS ishierarchial.
The directories are the
branches
and thefiles
are theleaves.
The rootis
at thetop
of thehierarchy
andis
the
directory
that is automatically created when thedisk
is
formatted. A fileis
foundby
searching
thedirectory
tree. The pathname describes this search. Apathname
is
a series ofdirectories,
separatedby
backslashes
(\),
followedby
the file name. If apathname begins with a "\" then the search
begins
atthe
-root
directory.
ifnot, the search begins at the
current
directory.
A path nameis
often precededby
adevice
nameindicating
thedisk
on which thefile
canbe
found.
if nodevice
is
indicated,
thedefault
device
is
used. A
device
nameidentifies
a diskby
a singleletter.
itis
separatedfrom
the pathnameby
a ": " . Adiscussion
on the storage ofdirectories
andfiles
andthe organization of a
disk
follow.
MS-DOS reserves one or more of the first tracks on
a
disk
forstoring
aloader
program. These reservedtracks are called system tracks. All of the remaining
tracks are used for
storing
files.
Datafiles
arewritten wherever there are
empty
locations
and there canbe
many
locationscontaining
portions of a file. Anentry
is
madeinto
adirectory
and the File AllocationTable
(FAT)
for each file on adisk.
This tells MS-DOSwhere the file
is
located on the disk.MS-DOS divides a disk into groups of logical
records called clusters. This provides for better
data
management. The data storage area of each disk
is
addressed in terms of these clusters.
They
are numberedfrom zero on the outside of the disk
sequentially
inward
with
increasing
cluster numbers. The number of clustersuse
by
the
file
systemare stored
in
thedisk's
directory
and FileAllocation
Table. Thefirst
few
clusters on each
disk
are reserved to storeits
directory,
the
numberindicating
thecapacity
of thedisk
and the cluster size.Directories,
other than the rootdirectory,
areconsidered
files
by
MS-DOS. Eachfile
has an entry ineither the root
directory
or in one of these lowerdirectories.
Since thesedirectories
are consideredfiles,
they
have an entry in their parent directory.All
directory
entries are 32bytes
in length and areformatted
asfollows:
0-7
Filename. Eight characters, left aligned andpadded, if necessary, with blanks. The
first
byte of this field indicates the file status
as follows:
00H
Thedirectory
entryhas
never beenused. This is used to limit the
length of
directory
searches, forperformance reasons.
2EH
The entry is for a directory.E5H The file was used,
but
it
has been erased.Any
other characteris
the first character of a-filename
.8-OA
Filename extension.OB
Fileattribute;
the attribute values are asfollows:
OlH
Fileis
marked read-only.02H Hidden
file;
the file is excluded from normaldirectory
searches.04H System
file;
the file is excluded from normaldirectory
searches.08H
Theentry
contains the volume label in thefirst
11bytes.
No other usable informationis present (except the date and time of
creation) and the entry exists only in the
root directory.
10H The entry defines a sub-directory and
is
excluded from normal
directory
searches.20H Archive
bit;
thebit
is set to "on" wheneverthe file has been written to and closed.
OC-15
Reserved.16-17
The time the file was created orlast
updated,The
hour,
minutes, and seconds are mappedinto
18-19
Datethe
file
was created orlast
updated. Theyear, month, and
day
are mappedinto
twobytes.
1A-1B
The cluster number of thefirst
cluster in thefile.
The cluster number is stored with theleast
significantbyte
first.
1C-1F
File size inbytes.
The first word of thisfield
is
thelow-order
part of the size.* See
diagram
of MS-DOS
Directory
entry in Appendix AThe MS-DOS
operating
system maintains a FileAllocation Table
(FAT)
on each disk. The FAT stores thecluster numbers allocated to
files
on the disk. Thereare usually two copies of the FAT to insure against
loss
or corruption. If this table cannot be read, the
files
stored on the disk cannot be
located.
The FATis
anarray
of12-bit
entries for each cluster on thedisk.
The first two FAT entries usually contain a disk
identifier (the size and format of the
disk).
Thesecond and third bytes of the FAT are always FFH. This
insures that the first FAT cluster entry begins on a
byte
boundary. Three bytes store two FAT entries. Thethird FAT entry, which starts at byte
four,
begins
themapping of the data area. Files in the
data
area arenot always written sequentially on the disk. The
data
area is allocated one cluster at a time and the
first
-free
clusterfound
is
the next tobe
allocated.The
first
cluster number of afile
is contained inthe
file's
directory
entry. when a clusterbecomes
filled,
MS-DOS searchesthe
FAT tablestarting
from
thetop
of thetable,
for anentry
of 000Hindicating
anempty
cluster. It places this cluster numberinto
theFAT
location
of the cluster pointed toby
thedirectory
entry. Whenever a new cluster is assigned to a file its
number
is
placedinto
the FAT location of the previouscluster. This
links
afile
from cluster to cluster.The FAT entry for the
last
cluster of a file is set to avalue in the range FF8H to FFFH.
Any
clusters whichhave
diskette
defects should have their FAT entries setto FF7H.
The File Allocation Table is found on the first
section of the disk after the reserved sectors. If
it
is
larger than one sector, the sectorsit
resides on arecontiguous. The FAT is read into one of the MS-DOS
buffers
whenever needed. This can beduring
reads,writes, opens, or other I/O operations.
MS-DOS also keeps a File Control Block
(FCB)
(see
needed
by
theoperating
systemduring
I/O operations. AFCB consists of a
37
or44
byte
areain
memory
locatedwherever
the
programfinds
convenient. The FCB containsfields
whichhold
thedrive
number,filename,
extension,current
block,
record size,file
size, date oflast
write, time of
last
write, the current record and therelative record. The current record field points to one
of the records in the current block and must be set
before
doing
a sequential read or write to the file.The relative record (from block zero) points to the
currently
selected record. It must be set beforedoing
a random read or write on the file. FCBs are considered
either opened or unopened. An unopened FCB must be set
up before an open system function is executed.
-MS-DOS File
Control
BlockByte Field
0
Drive Number1-8
Filename
9-ll
Filename
Extension12-13
Current
Block Number14-15
Logical Record Size16-19
File Size20-21
Date of Creation or Last Update22-23
Time of Creation or Last Update24-31
Reservedby
MS-DOS32 Record within Current Block
33-36
Random Access Record NumberFor Extended File Control Blocks
add the
following
prefix:-7
Flag
Byte
-6--2 Reserved
-1 File Attribute
Figure 3-1
To access a disk using the MS-DOS
file
system, apathname, file name and
file
handle
need tobe
specified. The file handle
is
returned when afile
iscreated or opened. It must
be
passed to all file accesshandle
first
setup
or retrieve the FCBinformation.
MS-DOS
looks
at afile
as an address space. Thefile
handle
is
the
file
system's method ofaccessing
thisspace .
3.2
I/O SYSTEM CALLSMS-DOS system calls are executed from macro level
code. The next
layer
is
a KOALAlayer
which allocatesand sets up I/O
buffers
and makes the calls to the macroroutines. The
top
layer
of code is the level throughwhich the application program accesses the file system.
(
See Figure3-2)
.MS-DOS Module
Hierarchy
User Level KOALA module which the application uses to access the
file
system. + + Interface Level KOALA module-->
|
which sets upthe
file
systemstructures and
buffers from the
user application
Figure 3-2
+ +
System Level
Macro module
>
|
which callsthe MS-DOS
file system
functions
+ +
The file information is held in a record
data
structure. This file record holds the name of the
file,
the file
handle,
several buffer areas, the organizationof the
file,
the access mode of thefile,
a countfield
and the file attribute. (See Appendix
A)
-3.2.1
CREATETo create a
file
thatdoes
notcurrently
exist, theCREATE set NAMED
string
DEFAULT_NAMEstring
commandis
used. The set
in
the callis
a userdefined
structure.The
string
values are valid MS-DOS file names. Thisroutine makes a call to setup_file which allocates space
for
thefile
on the storage media and checks that thename passed
in
by
the useris
valid. It then calls thesys$create
function
with the filedata
structure recordas a parameter. This
function
calls the macro createfunction
to create the file with the file name, fileattribute and the
file
handle
as parameters. The MS-DOSsystem create
is
now executed. The create system callcreates a new
file
or if a filealready
exists with thesame name, create truncates it to zero. The new
file
is
given the attribute and name that were passed into
it.
The
file
handle is returned, and provides a reference tothe new
file
and the file is opened for read/writeaccess.
3.2.2 OPEN
To open a
file,
the application uses the OPEN setNAMED string DEFAULT_NAME string statement. The set
is
a user defined
data
structure that mustbe
previously
defined
in
a VARIABLE declaration. Thestring
valuessetup_file which allocates space
for
thefile
on thestorage media. it checks
that
the name passedin
by
the useris
a valid MS-DOSfile
nameby
a call tocall_parse. A call
is
then made to sys$open with thefile
data
structure record as the parameter. Thisfunction
then calls the macro open with the file name,file
access mode andfile
handle
as parameters. TheMS-DOS system open
is
executed from the macrofunction.
Open associates a
file
handle
with the file. This filehandle
mustbe
used for subsequent I/O to the file. Theread/write pointer is set at the first byte of the file and the record size of the
file
is set to one byte.3.2.3
WRITEThe WRITE set statement
is
used to write to afile.
This statement checks that the file is open. A callis
made to the sys$put function with the file
data
structure as a parameter. This function then calls the macro write system call with the file
handle,
the numberof
bytes
tobe
written, the buffer to write from and thenumber of bytes actually written, as parameters. The MS-DOS system write
is
executed from this macrofunction. Write transfers the number of bytes requested from the buffer into the file
indicated
by
the filehandle. The number of bytes transferred
is
passedback
to the sys$put function and an error occurs
if
thiscount
is
not equal tothe
number ofbytes
requested.3.2.4
READA read
is
done
with the READ set statement. Acheck
is
madethat
the file is open. A bufferis
allocated to
hold
thedata
thatis
to be read. Sys$getis
called with the filedata
structure record as aparameter. This
function
calls the macro read functionwith the
file
handle,
the number ofbytes
to read, thebuffer
to read from and a variable to hold the number ofbytes
actually
read as parameters. The MS-DOS systemread
is
executed from the macro read function. Readtransfers the number of bytes requested from the file
referenced
by
thefile
handle,
into the buffer locationindicated. The number of bytes actually transferred
is
passed back to the sys$get routine and if
it
is notequal to the number of bytes requested an error
has
occurred.
3.3 STRUCTURING OF THE FILE SYSTEM
The MS-DOS system calls are the basis for the
structuring of the file system. The actual code to
structure the file system is done in a high level
language. These programs are linked to the macro code
The macro module
interfaces
directly
with theMS-DOS
system calls. It setsup
thenecessary
registersand calls
MS-DOS
to
performfile
systemfunctions.
Thenext module
level
is
a KOALA module which allocates andsets
up
I/Obuffers
and calls the macro routines.Many
of
the
file
system changes arefound
in
this KOALAmodule. The
top
level
moduleis
also written in KOALA.This module
is
the userinterface
to the file system.Changes are also made
here
in order to implement therecord
structuring
extensions.This section
describes
the extensions to the MS-DOSfile
system which provide record structure. Theseextensions
describe
the actual program code changes.The
first
change is to the file data structure record.(See
Figure3-3).
Severalfields
whichdescribe
thefile
are added to this record. A fielddescribes
therecord structure of the file as either
fixed,
variable,or stream. A stream buffer field
is
used to read andwrite to stream files and a stream index
is
used tohold
the position in the stream file.
!*** MS-DOS File record
EXPORT TYPE file_access IS ACCESS
RECORD
EXPORT name : STRING;
-EXPORT
ifi
:INTEGER;
EXPORT
buffer
:char_access;
EXPORT user_buff_len :
INTEGER;
EXPORT user_buff_area :
char_access;
EXPORT
buff_size
:INTEGER;
EXPORT organization :
(
org_sequential,org_indexed)
;EXPORT access_mode :
dos_access_type
;EXPORT attribute :
dos_attribute
;EXPORT
current_key
:INTEGER;
EXPORT count :
INTEGER;