
Distributed File Systems Part I. Issues in Centralized File Systems


Academic year: 2021


Distributed File Systems

Part I

Daniel A. Menascé

Issues in Centralized File Systems

• File Naming
- c:\courses\cs571\procs.ps (MS-DOS)
- /usr/menasce/courses/cs571/processes.ps (UNIX)
• File Structure
- bitstream or bytestream
- record oriented (record = key + data)
- indexed (e.g., B*-trees (IBM VSAM))


B*-Tree Files

[Figure: B*-tree file. Index nodes hold keys (a, b) and route searches (≤ to one side, > to the other) down to the leaf nodes.]

Issues in Centralized File Systems

• File Types
- text (e.g., ASCII)
- binary (e.g., executables, images, etc.)
• Directory Structures
- flat
- hierarchical (tree)
- graph


Directories

[Figure: the directory menasce drawn twice, once as a hierarchical structure and once as a graph structure. Nodes: menasce, courses, papers, CS571, INFS601; files: intro.ps, procs.ps, ..., grcs571.xls, grinfs601.xls.]

Directories

[Figure: hierarchical structure. CS571 and INFS601, both under menasce/courses, each hold their own intro.ps, procs.ps, ...]

~menasce/courses/CS571/intro.ps
~menasce/courses/INFS601/intro.ps


Directories

[Figure: graph structure. Nodes: menasce, courses, papers, CS571, INFS601; files: intro.ps, procs.ps, ..., grcs571.xls, grinfs601.xls.]

~menasce/courses/CS571/intro.ps
~menasce/courses/INFS601/intro.ps

Issues in Centralized File Systems

• Allocation of File to Disk Blocks
- contiguous
- linked
- indexed


Contiguous Allocation of File to Disk Blocks

[Figure: logical blocks 0, 1, 2, ..., 49 of a file map to consecutive disk blocks 101, 102, 103, ..., 150.]

start address = 101
no. of used blocks = 3
last reserved block = 150

• simple mapping
• bad use of disk space
• hard to expand if maximum allocation is exceeded
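The "simple mapping" can be sketched in a few lines. This is a minimal illustration using the numbers from the slide; the function name and the bounds check are assumptions made for the example.

```python
def contiguous_block(start: int, reserved: int, k: int) -> int:
    """Map logical block k to a disk block under contiguous allocation."""
    if not 0 <= k < reserved:
        # Growing past the reserved extent is the "hard to expand" case.
        raise IndexError("logical block outside the reserved extent")
    return start + k

# Values from the slide: the extent runs from disk block 101 to 150.
START, LAST_RESERVED = 101, 150
RESERVED = LAST_RESERVED - START + 1   # 50 reserved blocks (logical 0..49)

print(contiguous_block(START, RESERVED, 2))
```

With only 3 of the 50 reserved blocks in use, the other 47 are the "bad use of disk space" the slide mentions.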

Linked Allocation of File to Disk Blocks

[Figure: logical blocks 0, 1, 2 stored in disk blocks 154, 35, 237, linked in that order.]

directory info:
first block address = 154
last block address = 237
number of blocks = 3

• good use of disk space
• bad performance for direct access (e.g., reading the k-th block requires reading k blocks)
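The cost of direct access under linked allocation can be sketched as follows. The `disk` dictionary is a hypothetical stand-in for the disk, using the block addresses from the slide; here k counts from 0, so reaching logical block k costs k + 1 block reads.

```python
# Hypothetical disk: block address -> (data, address of next block or None).
disk = {
    154: ("block 0 data", 35),
    35: ("block 1 data", 237),
    237: ("block 2 data", None),
}

def read_kth_block(first: int, k: int):
    """Return (data, blocks_read) for logical block k, counting from 0."""
    addr, reads = first, 0
    for _ in range(k):
        _, addr = disk[addr]      # must read a block to learn the next address
        reads += 1
        if addr is None:
            raise IndexError("file has fewer blocks than requested")
    data, _ = disk[addr]
    return data, reads + 1

print(read_kth_block(154, 2))
```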


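Indexed Allocation of File to Disk Blocks

[Figure: an index with entries 0 to 511, held in main memory. Entries 0, 1, 2 hold disk block addresses 154, 35, 237; unused entries hold -1. The data blocks 154, 35, 237 reside on disk.]

• efficient direct access
• good use of disk space
• inadequate for very large files (very large index)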
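A minimal sketch of the index lookup, using the table contents shown in the slide. The names `INDEX_SIZE`, `UNUSED`, and `indexed_block` are assumptions made for the example.

```python
INDEX_SIZE = 512
UNUSED = -1

# Index table as in the slide: entries 0..2 in use, the rest unused.
index = [154, 35, 237] + [UNUSED] * (INDEX_SIZE - 3)

def indexed_block(index, k):
    """Look up logical block k with a single table access."""
    if not 0 <= k < len(index) or index[k] == UNUSED:
        raise IndexError("logical block not allocated")
    return index[k]

print(indexed_block(index, 1))
```

Every file carries the whole table regardless of its size, and a file larger than `INDEX_SIZE` blocks needs a larger index, which is the limitation noted above.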

UNIX I-node

• item type (e.g., file, directory)
• item size in bytes
• time the file’s inode was last modified
• time the file’s contents were last modified
• time the file was last accessed
• reference count: number of file names
• file’s owner (a UID)
• file’s group (a GID)
• file’s mode bits (r,w,x)


UNIX Directories

[Figure: a directory with entries ..., foo, bar, notes, doc.]

notes and doc are the same file
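Two names for one file can be observed with Python's `os` module on a POSIX file system. This is a small sketch, assuming the temporary directory supports hard links; the file names mirror the slide. The same `os.stat` result also exposes the other i-node fields (`st_mode`, `st_uid`, `st_gid`, `st_atime`, `st_mtime`, `st_ctime`).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    notes = os.path.join(d, "notes")
    doc = os.path.join(d, "doc")
    with open(notes, "w") as f:
        f.write("hello")
    os.link(notes, doc)            # second directory entry for the same i-node

    st_notes, st_doc = os.stat(notes), os.stat(doc)
    same_inode = st_notes.st_ino == st_doc.st_ino   # both names, one i-node
    link_count = st_notes.st_nlink                  # reference count
    size = st_doc.st_size                           # item size in bytes
    print(same_inode, link_count, size)
```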

I-node Allocation of File to Disk Blocks

[Figure: an i-node holding the file attributes, direct block pointers, and the SIP, DIP, and TIP, which lead to indirect blocks of pointers.]

SIP = single indirect pointer
DIP = double indirect pointer
TIP = triple indirect pointer


I-node Allocation of File to Disk Blocks

• Efficient access to data blocks of small (from i-node), medium (from single indirect blocks), large (from double indirect blocks), and huge (from triple indirect block) files.

• Maximum file size (assuming 512-byte blocks and 4 bytes per pointer, so 128 pointers per indirect block):

(120 + 128 + 128^2 + 128^3) × 512 bytes ≈ 1 GByte
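The arithmetic can be checked directly. The sketch below takes the block size, pointer size, and the count of 120 direct pointers from the slide's formula; the `level` helper, which reports the pointer level that reaches a given logical block, is an illustrative addition.

```python
BLOCK_SIZE = 512          # bytes per block (from the slide)
POINTER_SIZE = 4          # bytes per pointer (from the slide)
DIRECT = 120              # direct pointers, as in the slide's formula

ptrs = BLOCK_SIZE // POINTER_SIZE            # 128 pointers per indirect block
max_blocks = DIRECT + ptrs + ptrs**2 + ptrs**3
max_bytes = max_blocks * BLOCK_SIZE

def level(k):
    """Return which pointer level reaches logical block k."""
    if k < DIRECT:
        return "direct"
    k -= DIRECT
    if k < ptrs:
        return "single indirect"
    k -= ptrs
    if k < ptrs**2:
        return "double indirect"
    k -= ptrs**2
    if k < ptrs**3:
        return "triple indirect"
    raise IndexError("beyond maximum file size")

print(max_blocks, max_bytes, max_bytes / 2**30)
```

The result is 2,113,784 blocks, or 1,082,257,408 bytes, which is just over 2^30 bytes.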

Security in Centralized Systems

• What is security?

• Storing protection data.
• UNIX File Protection.
• Authentication methods.


What Is Security?

• Confidentiality: protecting information from being read or copied by unauthorized users.

• Data Integrity: protecting information from being deleted or altered without permission.

• Availability: avoiding denial of service.

• Access Control: controlling who has access to the system.

• Accountability: keeping track of unauthorized accesses on an audit trail.

Storing Protection Data

• Security
- Protection Matrix
- Access Control Lists
- Capabilities

[Figure: protection matrix with users usr1 ... usr n as rows and file 1, file 2, ..., file m as columns; sample entries: rw, r, rwx.]


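Access Control Lists and Capabilities

[Figure: the protection matrix (users usr1 ... usr n by file 1, file 2, ..., file m; entries such as rw, r, rwx, -), annotated with the two definitions below.]

capabilities: list of objects and access rights per user.

access control list: list of users and access rights per object.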
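The two definitions are the row and column views of the same protection matrix. The sketch below uses a hypothetical two-user matrix (the entries are invented for illustration) and derives each view from it.

```python
# Hypothetical protection matrix: matrix[user][file] = rights ("" = no access).
matrix = {
    "usr1": {"file1": "rw", "file2": "r", "file3": "rwx"},
    "usr2": {"file1": "", "file2": "rw", "file3": ""},
}

def capabilities(matrix, user):
    """Row view: objects and rights held by one user."""
    return {f: r for f, r in matrix[user].items() if r}

def access_control_list(matrix, file):
    """Column view: users and their rights on one object."""
    return {u: row[file] for u, row in matrix.items() if row.get(file)}

print(capabilities(matrix, "usr2"))
print(access_control_list(matrix, "file2"))
```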

UNIX Protection Model

[Figure: the protection matrix (users usr1 ... usr n by file 1, file 2, ..., file m), annotated "access control list: list of users".]

• UNIX implements a coarse-grained version of ACLs.

• Users are divided into three groups:
- owner
- group
- world


Protection Bits for Files

drwx--S--- 2 menasce 512 Nov 4 13:49 grades/

-rw-rw-r-- 1 menasce 684 Nov 4 13:48 project_ideas

-rw------- 1 menasce 509 Nov 4 13:48 student_mail

-rw-r--r-- 1 menasce 3063 Nov 4 13:49 syllabus

Fields of the first column, left to right:
- entry type (- file; d directory)
- owner rights
- group rights
- others’ rights
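The layout of the mode column can be made concrete with a small parser. This is an illustrative sketch; `parse_mode` and `may_write` are hypothetical helper names, and the parser only splits the string, it does not query the file system.

```python
def parse_mode(mode: str) -> dict:
    """Split a 10-character ls-style mode string into its fields."""
    if len(mode) != 10:
        raise ValueError("expected 10 characters, e.g. '-rw-rw-r--'")
    kinds = {"-": "file", "d": "directory"}
    return {
        "type": kinds.get(mode[0], "other"),
        "owner": mode[1:4],
        "group": mode[4:7],
        "others": mode[7:10],
    }

def may_write(mode: str, who: str) -> bool:
    """True if class who ('owner', 'group', 'others') has the w bit."""
    return parse_mode(mode)[who][1] == "w"

print(parse_mode("-rw-rw-r--"))
print(may_write("-rw-r--r--", "group"))
```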

Authentication Methods

• Something that you know: password.
• Something that you have: a card key.
• Something that you are: fingerprint.
• Combination:
– card key and password
– card key and weight


Passwords

• Passwords are stored in password files (/etc/passwd in UNIX) in an encrypted form (one-way encryption).

• Users should select hard-to-crack passwords:
– Use combinations of lower and upper case characters, punctuation signs (!$#?;:), and numbers.
– Good password: A$1c;:mE
– Bad password: sunshine
– Easy to remember: base password on a phrase.
– Change passwords regularly.
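The idea of one-way storage can be illustrated with Python's standard `hashlib`. This sketch uses salted PBKDF2, which is an assumption chosen for the example and not the scheme the slides' UNIX systems used; the point is only that the stored value can be recomputed from a candidate password but not inverted.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None, rounds: int = 100_000):
    """Return (salt, digest); the digest cannot be inverted to the password."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, rounds)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes,
                   rounds: int = 100_000) -> bool:
    """Re-hash the candidate and compare in constant time."""
    _, candidate = hash_password(password, salt, rounds)
    return hmac.compare_digest(candidate, digest)

salt, stored = hash_password("A$1c;:mE")
print(check_password("A$1c;:mE", salt, stored))
print(check_password("sunshine", salt, stored))
```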

Users, User IDs and the Superuser

• Every user in UNIX has a username and a user identifier (UID) which is a number.

• Common “users” in UNIX systems:

– root: the superuser; performs accounting and low-level functions.
– daemon: handles network aspects
– agent: handles e-mail
– guest: for visitors
– ftp: for anonymous ftp.


Groups and Group Identifiers

• Every UNIX user belongs to one or more groups.

• Groups have a group name and a group ID (GID).

• Each user’s primary group is stored in the /etc/passwd file.

• All groups are listed in the /etc/group file in UNIX.
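How the two files combine can be sketched by parsing sample entries in the classic colon-separated formats (passwd: name, password, UID, GID, comment, home, shell; group: name, password, GID, members). The entries below are hypothetical; only the group names and GIDs 40 and 104 come from the slides.

```python
# Hypothetical entries in the classic colon-separated formats.
PASSWD = "john:x:1001:40:John:/home/john:/bin/sh"
GROUPS = ["student:x:40:", "users:x:104:john,mary"]

def primary_gid(passwd_line):
    """The fourth field of a passwd entry is the primary GID."""
    return int(passwd_line.split(":")[3])

def groups_of(user, passwd_line, group_lines):
    """Names of all groups the user belongs to (primary plus listed)."""
    gid = primary_gid(passwd_line)
    result = []
    for line in group_lines:
        name, _, g, members = line.split(":")
        if int(g) == gid or user in members.split(","):
            result.append(name)
    return result

print(groups_of("john", PASSWD, GROUPS))
```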

Groups and Group Identifiers

[Figure: users root, john, mary, peter, susan, jill, and ftp shown as members of the groups users (gid 104), ftp (gid 10), admin (gid 0), and student (gid 40).]


The Superuser

• Every UNIX system has a special user with UID = 0, usually called root.

• root is used by the OS to accomplish its basic functions

• root has access to all system resources!

• More than one user can be the superuser (they just need to have UID = 0).

• The superuser is the main security weakness in UNIX.

Distributed File Systems

• File Service Interface:
- upload/download model

[Figure: client and server; the client issues "get file" and "put file" to the server.]

- entire files are retrieved from the server, and accessed at the client.

- once the client is done, the file is stored back at the server.


Distributed File Systems

• File Service Interface:
- remote access model

[Figure: client and server; the client issues "read block" and "write block" to the server.]

- only the needed blocks of files are retrieved from the server.

- once the client is done with a block, it is written back to the server.

- example: NFS
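The difference between the upload/download model and the remote access model can be sketched with a toy in-memory server. Everything here is hypothetical (the class, the 4-byte block size, the `bytes_sent` counter); it only contrasts how much data crosses the interface in each model.

```python
BLOCK = 4  # hypothetical tiny block size, to keep the example readable

class FileServer:
    def __init__(self):
        self.files = {}              # name -> bytearray
        self.bytes_sent = 0          # data shipped to clients so far

    # Upload/download model: whole files move.
    def get_file(self, name):
        data = bytes(self.files[name])
        self.bytes_sent += len(data)
        return data

    def put_file(self, name, data):
        self.files[name] = bytearray(data)

    # Remote access model: single blocks move.
    def read_block(self, name, n):
        data = bytes(self.files[name][n * BLOCK:(n + 1) * BLOCK])
        self.bytes_sent += len(data)
        return data

    def write_block(self, name, n, data):
        self.files[name][n * BLOCK:(n + 1) * BLOCK] = data

server = FileServer()
server.put_file("f", b"aaaabbbbcccc")
```

Reading one block through `read_block` ships 4 bytes, while `get_file` ships all 12 even if the client needs only part of the file.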

Distributed File Systems: directory service interface

[Figure: directory trees over directories A to F held at file server 1 and file server 2, and the resulting trees under root as seen at client 1 and at client 2.]


Distributed File Systems: naming

• Location transparency: the path name does not reveal the file location.
e.g.: /serverA/dir1/dir2/x does not say where the server is located.

• Location independence: files can be moved and all references to them continue to be valid.


Distributed File Systems: two-level naming

• Symbolic names: human readable.
e.g.: /courses/slides/files.ps

• Binary names: machine readable names. Easier to manipulate.
e.g.: UNIX i-node, or server IP address:i-node number

• Symbolic to binary name mapping may be one to many in a distributed system (file replication).
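A one-to-many mapping can be pictured as a table from symbolic names to lists of binary names. The sketch below is purely illustrative: the server addresses and i-node numbers are invented, and `resolve` is a hypothetical helper.

```python
# Hypothetical name table: symbolic name -> list of (server address, i-node) replicas.
name_table = {
    "/courses/slides/files.ps": [("10.0.0.1", 4711), ("10.0.0.2", 982)],
}

def resolve(symbolic_name):
    """Return every binary name (one per replica) for a symbolic name."""
    return list(name_table[symbolic_name])

for server, inode in resolve("/courses/slides/files.ps"):
    print(f"{server}:{inode}")
```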

Semantics of File Sharing

• UNIX semantics: used in centralized systems.

- a read that follows a write sees the value written by the write.

[Figure: timeline. Block a initially holds x. At t1, "write x’ to block a" changes it to x’. At t2, "read block a" gets x’.]


Semantics of File Sharing

• UNIX semantics:

- a read that follows two writes in quick succession sees the result of the last write.

[Figure: timeline. Block a initially holds x. At t1, "write x’ to block a" changes it to x’. At t2, "write x’’ to block a" changes it to x’’. At t3, "read block a" gets x’’.]

Semantics of File Sharing: Issues in Distributed File Systems

• Single File Server
- no client caching
- easy to implement UNIX semantics

• Client File Caching
- improves performance by decreasing demand at the server
- updates to the cached file are not seen by other clients.


Semantics of File Sharing

• Session Semantics: (relaxed semantics)

- changes to an open file are only visible to the process that modified the file.

- when the file is closed, changes are visible to other processes ⇒ closed file is sent back to the server.

Semantics of File Sharing

• Session Semantics:

- what if two or more clients are caching and modifying a file?

• final result depends on who closes last
• use an arbitrary rule to decide who wins.

- file pointer sharing not possible when a process and its children run on different machines
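The "who closes last" outcome can be sketched with a toy model of session semantics. The `Server` and `Session` classes are hypothetical: each session takes a private copy on open and sends it back on close.

```python
class Server:
    def __init__(self, contents):
        self.contents = contents

class Session:
    """A client's open file: a private copy until close()."""
    def __init__(self, server):
        self.server = server
        self.copy = server.contents       # fetched on open

    def write(self, contents):
        self.copy = contents              # visible only to this client

    def close(self):
        self.server.contents = self.copy  # now visible to later opens

server = Server("x")
a, b = Session(server), Session(server)
a.write("from A")
b.write("from B")
seen_by_b_before_close = b.copy           # B never sees A's change
a.close()
b.close()                                 # B closes last, so B's version wins
print(server.contents)
```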


Semantics of File Sharing

• No File Updates Semantics:

- files are never updated.

- allowed file operations: CREATE and READ.

- files are atomically replaced in the directory.

- Problem: what if two clients want to replace a file at the same time?

• take the last one or use any non-deterministic rule.
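A local analogue of "create, then atomically replace in the directory" is to write a complete new file and rename it over the old name. This sketch uses Python's `os.replace` on a local POSIX file system; it is not a distributed implementation, and `replace_file` is a hypothetical helper.

```python
import os
import tempfile

def replace_file(path, data: bytes):
    """Create a complete new file, then atomically swap it into the directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)   # new file, never updated in place
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)    # readers see the old file or the new one, never a mix

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "f")
    replace_file(target, b"v1")
    replace_file(target, b"v2")      # the last replacement wins
    with open(target, "rb") as f:
        final = f.read()
    entries = os.listdir(d)
print(final, entries)
```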

Semantics of File Sharing

• Transaction Semantics:

- all file changes are delimited by a Begin and End transaction.

- all file requests within the transaction are carried out in order.

- the complete transaction is either carried out completely or not at all (atomicity).
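The all-or-nothing property can be sketched in a single process by applying the requests to a working copy and installing it only if every request succeeds. This is an illustrative model, not a crash-safe implementation; `run_transaction` and `TransactionAborted` are hypothetical names.

```python
import copy

class TransactionAborted(Exception):
    pass

def run_transaction(files: dict, operations):
    """Apply operations in order to a working copy; commit all or nothing."""
    working = copy.deepcopy(files)       # Begin transaction
    try:
        for op in operations:
            op(working)                  # requests are carried out in order
    except Exception as exc:
        raise TransactionAborted(str(exc)) from exc   # files left untouched
    files.clear()                        # End transaction: commit
    files.update(working)

files = {"a": 1}
run_transaction(files, [lambda f: f.update(a=2), lambda f: f.update(b=3)])
print(files)
```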


Semantics of File Sharing

UNIX Semantics: every operation is instantly visible to others.

Session Semantics: no changes visible until file is closed.

No Updates Semantics: no file updates are allowed.
