Topics in Computer System Performance and Reliability: Storage Systems

Academic year: 2021

CSC 2233: Topics in Computer System Performance and Reliability: Storage Systems

Note: some of the slides in today’s lecture are borrowed from a course taught by Greg Ganger and Garth Gibson at Carnegie Mellon University

(2)

Who am I?

(3)

What makes storage systems so cool?

1. Combines so many topic areas:

  hardware meets OS meets networking meets distributed systems meets security meets AI meets HCI…

(4)

What makes storage systems so cool?

1. Combines so many topic areas

2. This is where great jobs are!

  Designers and implementers still needed

  not just testing :)

  Continuing growth area for the future

  The Internet is a network, but the web is a storage system

  Strong existing companies: EMC, NetApp, …

  Core competency for Internet services: Google, Microsoft, Amazon, …

  and still support for start-ups

(5)

What makes storage systems so cool?

1. Combines so many topic areas

2. Great careers

3. Still so much room to contribute:

  performance actually matters here

  in fact, it dominates other parts of system performance in many cases

  … and reliability too

  storage management wide open

  and, storage starting to “take over” computation

  Big data …

  Lots is and will be happening

  Solid state drives and other technologies?

(6)

Amdahl’s Law

  Speedup limited to fraction improved

  obvious, but fundamental, observation

[Figure: two stacked bars. Before: 50 + 50. After: 50 + 5.]

A 90% reduction in BLUE yields only a 45% reduction in total
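The bar example can be checked with a few lines of arithmetic. This is a minimal sketch; the function names `remaining_time` and `speedup` are illustrative and do not come from the lecture.

```python
def remaining_time(fraction_improved: float, reduction: float) -> float:
    """Return the new total time as a fraction of the original total.

    fraction_improved: share of the original time spent in the improved part.
    reduction: fraction by which that part shrinks (0.9 means 90% less time).
    """
    unchanged = 1.0 - fraction_improved
    improved = fraction_improved * (1.0 - reduction)
    return unchanged + improved


def speedup(fraction_improved: float, reduction: float) -> float:
    """Amdahl's Law: overall speedup is old time divided by new time."""
    return 1.0 / remaining_time(fraction_improved, reduction)


if __name__ == "__main__":
    # The slide's example: two equal halves (50 + 50), one of them cut by 90%.
    new_total = remaining_time(0.5, 0.9)
    print(f"total reduction: {1.0 - new_total:.0%}")     # 45%
    print(f"overall speedup: {speedup(0.5, 0.9):.2f}x")  # 1.82x
```

Even if the improved half were eliminated entirely, the total could shrink by at most 50%, which is the sense in which speedup is limited by the fraction improved.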

(7)

Technology Trends

[Figure: values normalized relative to 2000 (log scale from 1 to 100) plotted by year, 2000 to 2010, for CPU performance, memory bandwidth, disk bandwidth, network bandwidth, network latency, and disk latency.]

(8)

Consequence: storage performance dominates

[Figure: two bar charts on a 0 to 100 scale, each dividing execution time into CPU time and I/O time.]

(9)

Storage systems: fun quotes

“I/O certainly has been lagging in the last decade”

  Seymour Cray, 1976

“Also, I/O needs a lot of work”

  David Kuck, 1988

“In 3 to 5 years, we will start seeing servers as peripherals to storage”

  SUN Chief Technology Officer, 1998

“Scalable I/O is perhaps the most overlooked area of high-performance computing R&D”

  Suggested R&D topic report for 2005-2009

(10)

Logistics & Administrative Details

  Class time: Thu 10am – 12pm

  Office hours:

  By appointment

  Class web page

  www.cs.toronto.edu/~bianca/csc2233.html

(11)

Grading

  30% class participation

  Participation in class discussions

  (Read all papers prior to class)

  Class presentation of research paper

  70% class project

  No exams, no homework, no paper summaries

(12)

Class project

  Can be done in a team of two or alone

  Start looking for a partner now!

  On a research project you pick

  I will suggest possible projects (see course web page)

  You can propose your own

  Start thinking about it soon, proposal due in ~3 weeks

  Output: workshop-quality research paper (10-12 pages)

  Even better: conference-quality paper

  Use the LaTeX template on the course web page

  All reports will be published as tech reports

(13)

Class project

  Output: workshop-quality research paper (10-12 pages)

  I will help you get there, with multiple milestones:

  Project proposal

  Related work

  Status reports

  Final report

  And meetings with instructor

(14)

Topic of class project

  Project topic must be related to the topic of the class

  Is it OK to have overlap with my research / my course project in another course?

  You cannot get academic credit for the same piece of work twice

(15)

Paper presentation

  Each of you will present one or two papers in class

  Format of the presentation:

  30 min presentation of paper

  5-15 min paper review

  Good points

  Bad points

  10 min class discussion that you lead!

  Prepare questions!

(16)

Paper presentation

  What I do not want:

  A long laundry list of all things the paper did

  What I do want:

  A lecture style presentation of the paper

  Including background material your fellow classmates might need to understand the paper

  A critical discussion of the paper

  Strengths & Weaknesses

  Prepare questions!

(17)

Purpose of presentation

  Wrong answers:

  “To give a verbal version of the paper, cramming all its content into 30 min”

  “To impress people with your technical depth and thoroughness”

  In fact, no one cares about these things

  The goal is to filter out the main points of the paper and present them well

  By the end, everybody in the audience should remember 2-3 take-home messages

(18)

What’s on each slide?

  Each slide should have one basic point

  There should NOT be tons of text

  Use sentence fragments

  Use pictures everywhere you possibly can!

  A picture says more than 1000 words

  Saves text and thus slides

  Much easier to process

(19)

Rest of today: Some review …

(20)

What are storage systems all about?

  Memory/storage hierarchy

(21)

Memory/storage hierarchies

  Balancing performance with cost

  Small memories are fast but expensive

  Large memories are slow but cheap

  Exploit locality to get the best of both worlds

  locality = re-use/nearness of accesses

  allows most accesses to use small, fast memory

[Figure: memory/storage hierarchy, with capacity and performance as the two axes.]
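The effect of locality can be made concrete with an average access time calculation for a two-level hierarchy. This is a minimal sketch: the latencies below (100 ns for the fast level, 10 ms for the slow level) are illustrative assumptions, not values from the lecture, and `average_access_time` is a made-up helper name.

```python
def average_access_time(hit_ratio: float, fast_ns: float, slow_ns: float) -> float:
    """Expected access time when a fraction `hit_ratio` of accesses hit the fast level."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * fast_ns + (1.0 - hit_ratio) * slow_ns


if __name__ == "__main__":
    FAST_NS = 100.0          # assumed latency of the small, fast memory
    SLOW_NS = 10_000_000.0   # assumed latency of the large, slow storage (10 ms)
    for hit_ratio in (0.0, 0.9, 0.99, 0.999):
        avg = average_access_time(hit_ratio, FAST_NS, SLOW_NS)
        print(f"hit ratio {hit_ratio}: {avg:,.0f} ns on average")
```

With a gap this large between the levels, the average is dominated by the misses, so the hit ratio has to be very close to 1 before the hierarchy behaves like the fast level.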

(22)

Example memory hierarchy values

Notice the huge access time gap between DRAM and disk

Where will SSDs go?

(24)

What are storage systems all about?

  Memory/storage hierarchy

  Combining many technologies to balance costs/benefits

  No longer the focal point of storage system design

  Still important though

  Maybe more so with new technologies arriving on the market

  Persistence

  Storing data for lengthy periods of time

  To be useful, it must also be possible to find it again later

  this brings in data organization, consistency, and management issues

  This is where the serious action is

(25)

Why persistence is important

  Some statistics:

  Among companies who lose data in a disaster, 50% never re-open and 90% are out of business within two years

  Even smaller incidents can be costly

  Reproducing some tens of megabytes of accounting data can take several weeks and cost tens of thousands of dollars

  Bad PR!

(26)

What is a storage system: Big Picture

[Figure: an application passes data objects Bob1 through Bob4 to a storage system, which holds them and hands back individual objects such as Bob1 and Bob2.]

Application gives data objects & their IDs to storage

The storage system keeps the data objects and returns one upon request (by ID)
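The big-picture interface amounts to "store an object under an ID" and "return the object for an ID". A minimal sketch of that contract follows; `ToyObjectStore`, `put`, and `get` are hypothetical names, and the store lives only in memory, so it illustrates the interface but not persistence.

```python
class ToyObjectStore:
    """Keeps data objects and returns one on request, keyed by ID."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, object_id: str, data: bytes) -> None:
        # The application hands over a data object together with its ID.
        self._objects[object_id] = data

    def get(self, object_id: str) -> bytes:
        # Raises KeyError when the ID was never stored.
        return self._objects[object_id]


if __name__ == "__main__":
    store = ToyObjectStore()
    for name in ("Bob1", "Bob2", "Bob3", "Bob4"):
        store.put(name, f"contents of {name}".encode())
    print(store.get("Bob2"))
```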

(27)

Storage Systems & Interfaces

  What is a “Storage System”?

  Hardware (devices, controllers, interconnect) and Software (file system, device drivers, firmware) dedicated to providing management of and access to persistent storage.

  One view: defined by collection of interfaces

(28)

Storage Software Interfaces

[Figure: the path from Program to Physical Media passes through File system, Device driver, and I/O controller, going from a high level of abstraction to no abstraction.]

Understands files and

(29)

OS sees storage as linear array of blocks

OS’s view of storage device

  Common disk block size: 512 bytes

  Number of blocks: device capacity / block size

  Common OS-to-storage requests defined by few fields

  R/W, block #, # of blocks, memory source/dest

[Figure: linear array of numbered disk blocks, with blocks 5, 6, 7, 12, and 23 labeled.]
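The request fields listed above map naturally onto a small record type. The sketch below is an in-memory stand-in, not a real driver interface: `BlockRequest`, `ToyBlockDevice`, and `submit` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # bytes, the common disk block size named above


@dataclass
class BlockRequest:
    is_write: bool      # R/W
    block_number: int   # first block of the transfer
    block_count: int    # number of blocks
    buffer: bytearray   # memory source (write) or destination (read)


def number_of_blocks(capacity_bytes: int, block_size: int = SECTOR_SIZE) -> int:
    """Number of blocks = device capacity / block size."""
    return capacity_bytes // block_size


class ToyBlockDevice:
    """In-memory stand-in for a device exposing a linear array of blocks."""

    def __init__(self, capacity_bytes: int) -> None:
        self.blocks = number_of_blocks(capacity_bytes)
        self._data = bytearray(self.blocks * SECTOR_SIZE)

    def submit(self, req: BlockRequest) -> None:
        if req.block_number < 0 or req.block_number + req.block_count > self.blocks:
            raise ValueError("request falls outside the device")
        length = req.block_count * SECTOR_SIZE
        if len(req.buffer) < length:
            raise ValueError("buffer is smaller than the transfer")
        start = req.block_number * SECTOR_SIZE
        if req.is_write:
            self._data[start:start + length] = req.buffer[:length]
        else:
            req.buffer[:length] = self._data[start:start + length]
```

A read of two blocks starting at block 6 would then be `dev.submit(BlockRequest(False, 6, 2, bytearray(1024)))`.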

(30)

OS sees storage as linear array of blocks

OS’s view of storage device

  How does the OS implement the abstraction of files and directories on top of this logical array of disk blocks?

[Figure: linear array of numbered disk blocks, with blocks 5, 6, 7, 12, and 23 labeled.]

(31)

File System Implementation

  File systems define a block size (e.g., 4KB)

  Disk space is allocated in granularity of blocks

[Figure: default usage of LBN space: Superblock, Bitmap, then space to store files and directories.]

Notice the terminology clash here: “block” is used for different things by the file system and the disk interface… and this kind of thing is common in storage systems!!

(32)

File System Implementation

  File systems define a block size (e.g., 4KB)

  Disk space is allocated in granularity of blocks

  A “Master Block” (aka superblock) determines the location of the root directory

  Always at a well-known disk location

  Often replicated across disk for reliability

  A free map records which blocks are free and which are allocated

  Usually a bitmap, one bit per block on the disk

  Also stored on disk, cached in memory for performance

  Remaining disk blocks used to store files (and dirs)

  There are many ways to do this
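The free map described above can be sketched as a bitmap with one bit per block. This is a minimal in-memory illustration; the class name `FreeMap` and the first-fit search in `allocate` are assumptions made for the example, and a real file system would also write the bitmap back to disk.

```python
class FreeMap:
    """One bit per block: 0 means free, 1 means allocated."""

    def __init__(self, total_blocks: int) -> None:
        self.total_blocks = total_blocks
        self._bits = bytearray((total_blocks + 7) // 8)

    def is_allocated(self, block: int) -> bool:
        return bool(self._bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Mark the first free block as allocated and return its number."""
        for block in range(self.total_blocks):
            if not self.is_allocated(block):
                self._bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("no free blocks")

    def free(self, block: int) -> None:
        # Clear the bit; mask to 8 bits so the value stays a valid byte.
        self._bits[block // 8] &= ~(1 << (block % 8)) & 0xFF


if __name__ == "__main__":
    free_map = FreeMap(total_blocks=30)
    first, second = free_map.allocate(), free_map.allocate()
    free_map.free(first)
    print(first, second, free_map.allocate())  # the freed block is handed out again
```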

(33)

Disk Layout Strategies

  Files span multiple blocks

  How do you allocate the blocks for a file?

1. Contiguous allocation

(34)

Contiguous Allocation

Disk (blocks 0-29):

 0  1  2  3  4
 5  6  7  8  9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
25 26 27 28 29

Directory:

File Name   Start Blk   Length
File A      2           3
File B      9           5
File C      18          8
File D      27          2
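With contiguous allocation, the directory entry alone is enough to locate any block of a file. The sketch below uses the start blocks and lengths from the directory table on this slide; the function names are illustrative.

```python
# (start block, length) per file, copied from the directory table above.
DIRECTORY = {
    "File A": (2, 3),
    "File B": (9, 5),
    "File C": (18, 8),
    "File D": (27, 2),
}


def disk_block(name: str, file_block: int) -> int:
    """Map the file_block-th block of a file (counting from 0) to a disk block."""
    start, length = DIRECTORY[name]
    if not 0 <= file_block < length:
        raise IndexError("offset beyond end of file")
    return start + file_block


def blocks_of(name: str) -> list[int]:
    """All disk blocks occupied by a file, in order."""
    start, length = DIRECTORY[name]
    return list(range(start, start + length))


if __name__ == "__main__":
    print(blocks_of("File B"))      # [9, 10, 11, 12, 13]
    print(disk_block("File C", 7))  # 25
```

Lookup is a single addition, which is why this layout is fast; the cost shows up when a file needs to grow past the blocks that follow it.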

(35)

Disk Layout Strategies

  Files span multiple disk blocks

  How do you find all of the blocks for a file?

1. Contiguous allocation

  Like memory

  Fast, simplifies directory access

  Inflexible, causes fragmentation, needs compaction

2. Linked, or chained, structure

(36)

Linked Allocation

Disk (blocks 0-29):

 0  1  2  3  4
 5  6  7  8  9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
25 26 27 28 29

Directory:

File Name   Start Blk   Last Blk
File B      1           22

(37)

Disk Layout Strategies

  Files span multiple disk blocks

  How do you find all of the blocks for a file?

1. Contiguous allocation

  Like memory

  Fast, simplifies directory access

  Inflexible, causes fragmentation, needs compaction

2. Linked, or chained, structure

  Each block points to the next, directory points to the first

  Good for sequential access, bad for all others

3. Indexed structure (indirection, hierarchy)

  An “index block” contains pointers to many other blocks

  Handles random access better, still good for sequential

  May need multiple index blocks (linked together)
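The difference between linked and indexed lookup can be seen by counting pointer reads. In this sketch the five-block file is hypothetical: only the first block (1) and last block (22) match the linked-allocation directory entry shown earlier, and the blocks in between are made up for illustration.

```python
# Hypothetical file layout; intermediate block numbers are invented.
NEXT_BLOCK = {1: 8, 8: 3, 3: 14, 14: 22, 22: None}  # linked: block -> next block
FIRST_BLOCK = 1                                      # stored in the directory
INDEX_BLOCK = [1, 8, 3, 14, 22]                      # indexed: one pointer per file block


def linked_lookup(file_block: int) -> tuple[int, int]:
    """Return (disk block, pointers followed) by walking the chain from the start."""
    current, followed = FIRST_BLOCK, 0
    for _ in range(file_block):
        current = NEXT_BLOCK[current]
        followed += 1
        if current is None:
            raise IndexError("offset beyond end of file")
    return current, followed


def indexed_lookup(file_block: int) -> tuple[int, int]:
    """Return (disk block, pointers followed) using the index block."""
    if not 0 <= file_block < len(INDEX_BLOCK):
        raise IndexError("offset beyond end of file")
    return INDEX_BLOCK[file_block], 1


if __name__ == "__main__":
    print(linked_lookup(4))   # (22, 4): four pointers followed to reach the last block
    print(indexed_lookup(4))  # (22, 1): one pointer read, regardless of position
```

In the linked layout each pointer lives inside a data block, so every step of the walk is a disk read; that is why random access is poor there.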

(38)

Indexed Allocation: Unix Inodes

  Unix inodes implement an indexed structure for files

  Each file is represented by an inode

  Each inode contains 15 block pointers

  First 12 are direct block pointers (e.g., 4 KB data blocks)

  Then single, double, and triple indirect
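The pointer layout determines how large a file can get. The sketch below assumes 4 KB blocks (as in the example above) and 4-byte block pointers; the pointer size is an assumption, not something stated on the slide.

```python
DIRECT_POINTERS = 12


def pointers_per_block(block_size: int, pointer_size: int) -> int:
    """How many block pointers fit in one indirect block."""
    return block_size // pointer_size


def max_file_blocks(block_size: int = 4096, pointer_size: int = 4) -> int:
    """Data blocks reachable via 12 direct plus single, double, and triple indirect pointers."""
    n = pointers_per_block(block_size, pointer_size)
    return DIRECT_POINTERS + n + n**2 + n**3


def indirection_level(file_block: int, block_size: int = 4096, pointer_size: int = 4) -> int:
    """Return 0 for a direct block, or 1 to 3 for single to triple indirect."""
    n = pointers_per_block(block_size, pointer_size)
    for level, span in enumerate([DIRECT_POINTERS, n, n**2, n**3]):
        if file_block < span:
            return level
        file_block -= span
    raise ValueError("file block beyond the triple indirect range")


if __name__ == "__main__":
    blocks = max_file_blocks()
    print(f"{blocks} blocks, about {blocks * 4096 / 2**40:.2f} TiB")
    print(indirection_level(11), indirection_level(12))  # last direct block, first single indirect
```

Small files are served entirely by the direct pointers, so they pay no extra disk reads for indirection.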

(39)

Unix Inodes and Path Search

  Unix Inodes are not directories

  They describe where on the disk the blocks for a file are placed

  Directories are files, so inodes also describe where the blocks for directories are placed on the disk

  Directory entries map file names to inodes

  To open “/one”, use Master Block to find inode for “/” on disk and read inode into memory

  inode allows us to find data block for directory “/”

  Read “/”, look for entry for “one”

  This entry locates the inode for “one”

  Read the inode for “one” into memory

  The inode says where first data block is on disk

  Read that block into memory to access the data in the file
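The steps for opening “/one” can be traced on a toy in-memory “disk”. Everything here is hypothetical: the inode numbers, block numbers, and dictionary layout are invented for illustration, and `ROOT_INODE` stands in for the location a master block would record.

```python
ROOT_INODE = 2  # assumed: what the master block would tell us about "/"

# Toy "disk": inode number -> inode, block number -> block contents.
INODES = {
    2: {"is_dir": True, "blocks": [100]},   # inode for "/"
    7: {"is_dir": False, "blocks": [200]},  # inode for "/one"
}
BLOCKS = {
    100: {"one": 7},          # directory block: file name -> inode number
    200: b"hello from /one",  # file data block
}


def lookup(path: str) -> dict:
    """Walk the path one component at a time and return the final inode."""
    inode = INODES[ROOT_INODE]
    for name in [part for part in path.split("/") if part]:
        if not inode["is_dir"]:
            raise NotADirectoryError(name)
        entries: dict[str, int] = {}
        for block_number in inode["blocks"]:   # read the directory's data blocks
            entries.update(BLOCKS[block_number])
        if name not in entries:
            raise FileNotFoundError(path)
        inode = INODES[entries[name]]          # read the inode for this component
    return inode


def read_first_block(path: str) -> bytes:
    """Use the inode to find the first data block and read it."""
    return BLOCKS[lookup(path)["blocks"][0]]


if __name__ == "__main__":
    print(read_first_block("/one"))
```

Each loop iteration costs at least one inode read and one directory block read, which is why deep paths are expensive without caching.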

(40)

Data and Inode Placement

Original Unix FS had two placement problems:

1. Data blocks allocated randomly in aging file systems

  Blocks for the same file allocated sequentially when FS is new

  As FS “ages” and fills, need to allocate into blocks freed up when other files are deleted

  Problem: Deleted files essentially randomly placed

  So, blocks for new files become scattered across the disk

2. Inodes allocated far from blocks

  All inodes at beginning of disk, far from data

  Traversing file name paths, manipulating files, directories requires going back and forth from inodes to data blocks

Both of these problems generate many long seeks

(41)

Cylinder Groups

  BSD Fast File System (FFS) addressed placement problems using the notion of a cylinder group (aka allocation groups in lots of modern FS’s)

  Disk partitioned into groups of cylinders

  Data blocks in same file allocated in same cylinder group

  Files in same directory allocated in same cylinder group

  Inodes for files allocated in same cylinder group as file data blocks

[Figure: cylinder group organization: a superblock followed by cylinder groups.]

(42)

More FFS solutions

  Small blocks (1K) in the original Unix FS caused 2 problems:

  Low bandwidth utilization

  Small max file size (function of block size)

  => fix using a larger block (4K)

  Problem: Media failures

  Replicate master block (superblock)

  Problem: Device oblivious

  Parameterize according to device characteristics

(43)

File Buffer Cache

  Applications exhibit significant locality for reading and writing files

  Idea: Cache file blocks in memory to capture locality

  This is called the file buffer cache

  Cache is system wide, used and shared by all processes

  Reading from the cache makes a disk perform like memory

  Even a 4 MB cache can be very effective

  Issues

  The file buffer cache competes with VM (tradeoff here)

  Like VM, it has limited size

  Need replacement algorithms
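A file buffer cache with a replacement policy can be sketched in a few lines. The slide does not name a policy; least-recently-used (LRU) is assumed here as one common choice, and `BufferCache` and its `read_from_disk` callback are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class BufferCache:
    """Caches file blocks in memory; evicts the least recently used block when full."""

    def __init__(self, capacity_blocks: int, read_from_disk: Callable[[int], bytes]) -> None:
        self.capacity = capacity_blocks
        self._read_from_disk = read_from_disk
        self._blocks: OrderedDict[int, bytes] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_number: int) -> bytes:
        if block_number in self._blocks:
            self.hits += 1
            self._blocks.move_to_end(block_number)  # mark as most recently used
            return self._blocks[block_number]
        self.misses += 1
        data = self._read_from_disk(block_number)   # slow path: go to the disk
        self._blocks[block_number] = data
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)        # evict the least recently used block
        return data


if __name__ == "__main__":
    cache = BufferCache(2, read_from_disk=lambda n: bytes([n]) * 4)
    for block in (1, 2, 1, 3, 1, 2):
        cache.read(block)
    print(cache.hits, cache.misses)  # 2 4
```

The capacity parameter is where the tradeoff with VM shows up: memory given to this cache is memory the rest of the system cannot use.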

(44)

Read Ahead

  Many file systems implement “read ahead”

  FS predicts that the process will request next block

  FS goes ahead and requests it from the disk

  This can happen while the process is computing on previous block

  Overlap I/O with execution

  When the process requests block, it will be in cache

  Complements the on-disk cache, which also does read ahead

For sequentially accessed files, can be a big win
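Read ahead can be sketched as "if the last two requests were consecutive, fetch the next block too". This is a simplified illustration: `ReadAheadReader` is a hypothetical name, the prefetch is issued synchronously rather than overlapped with computation, and the cache never evicts.

```python
from typing import Callable, Optional


class ReadAheadReader:
    """Prefetches block n+1 whenever blocks n-1 and n were requested in order."""

    def __init__(self, read_from_disk: Callable[[int], bytes]) -> None:
        self._read_from_disk = read_from_disk
        self._cache: dict[int, bytes] = {}
        self._last_block: Optional[int] = None
        self.disk_reads = 0  # every request sent to the disk, including prefetches
        self.cache_hits = 0  # process requests satisfied from memory

    def _fetch(self, block_number: int) -> None:
        if block_number not in self._cache:
            self._cache[block_number] = self._read_from_disk(block_number)
            self.disk_reads += 1

    def read(self, block_number: int) -> bytes:
        if block_number in self._cache:
            self.cache_hits += 1
        else:
            self._fetch(block_number)
        # Sequential pattern detected: request the next block ahead of time.
        # A real file system would issue this asynchronously.
        if self._last_block is not None and block_number == self._last_block + 1:
            self._fetch(block_number + 1)
        self._last_block = block_number
        return self._cache[block_number]


if __name__ == "__main__":
    reader = ReadAheadReader(read_from_disk=lambda n: bytes([n]))
    for block in range(5):
        reader.read(block)
    print(reader.cache_hits, reader.disk_reads)  # 3 6
```

In the sequential run above, every request after the second finds its block already in memory; the one extra disk read is the prefetch of a block that was never requested.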

(45)

Caching Writes

  On a write, some applications assume that data makes it through the buffer cache and onto the disk

  As a result, writes are often slow even with caching

  Several ways to compensate for this

  “Write-behind”

  Maintain a queue of uncommitted blocks

  Periodically flush the queue to disk

  Unreliable

  Battery backed-up RAM (NVRAM)

  As with write-behind, but maintain queue in NVRAM

  Expensive

  Log-structured file system

  Always write contiguously at end of previous write
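The write-behind idea can be sketched as a queue of uncommitted blocks that is flushed later. `WriteBehindCache` and its `write_to_disk` callback are hypothetical names; in a real system the flush would be triggered periodically by a timer or daemon rather than called by hand.

```python
from typing import Callable


class WriteBehindCache:
    """Queues written blocks in memory and commits them to disk on flush."""

    def __init__(self, write_to_disk: Callable[[int, bytes], None]) -> None:
        self._write_to_disk = write_to_disk
        self._dirty: dict[int, bytes] = {}  # uncommitted blocks, in insertion order

    def write(self, block_number: int, data: bytes) -> None:
        """Return immediately; the block reaches the disk at the next flush."""
        self._dirty[block_number] = data    # a rewrite replaces the queued copy

    def flush(self) -> int:
        """Write all queued blocks to disk and return how many were written."""
        flushed = 0
        for block_number, data in list(self._dirty.items()):
            self._write_to_disk(block_number, data)
            del self._dirty[block_number]
            flushed += 1
        return flushed

    def pending(self) -> int:
        return len(self._dirty)


if __name__ == "__main__":
    disk: dict[int, bytes] = {}
    cache = WriteBehindCache(write_to_disk=disk.__setitem__)
    cache.write(5, b"old")
    cache.write(5, b"new")   # absorbed in memory, only one disk write later
    cache.write(6, b"other")
    print(cache.pending(), disk)  # 2 {}  (nothing on disk yet)
    print(cache.flush(), disk)
```

Anything still in the queue when the machine crashes is lost, which is the "unreliable" point above; keeping the same queue in NVRAM removes that window at higher cost.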

(46)

Remainder of the course

  Other optimizations:

  Other file system designs: log-structured, journaling

  Devices: Hard disks & Solid state drives

  Reliability & fault tolerance

  Performance modeling

  Distributed file systems: Google & NetApp

  Parallel file systems: GPFS & PanFS

  Storage for data-intensive computing
