DISCONNECTED OPERATION IN THE CODA FILE SYSTEM

(2)

Background

 We are back in the 1990s.

 The network is slow and unstable

 Terminal → “powerful” client

– 33 MHz CPU, 16 MB RAM, 100 MB hard drive

 Mobile users appeared

(3)
(4)

Disconnected Operation

 Disconnected operation is a mode of operation that enables a client to continue accessing critical data during temporary failures of a shared data repository.

 Key idea: caching data.

– Performance

– Availability

(5)

Design Overview

 Coda is designed for an environment consisting of a large collection of untrusted Unix clients and a much smaller number of trusted Unix file servers.

 Each Coda client has a local disk and can communicate with the servers over a high bandwidth network.

(6)

Design Overview

 The Coda namespace is mapped to individual file servers at the granularity of subtrees called volumes.

(7)

Mechanisms for high availability:

(1) Server replication

VSG: volume storage group, the set of servers holding replicas of a volume

AVSG: accessible VSG, the subset of the VSG that a client can currently reach

(2) Disconnected operation

Takes effect when the AVSG becomes empty.

An example depicts a typical scenario involving transitions between server replication and disconnected operation.
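The switch between the two mechanisms can be sketched in a few lines. This is only an illustration of the rule stated above (disconnected operation starts when the AVSG is empty); the server names and the function names are hypothetical and are not taken from Coda itself.

```python
from enum import Enum


class Mode(Enum):
    SERVER_REPLICATION = "server replication"
    DISCONNECTED = "disconnected operation"


def accessible_vsg(vsg, reachable):
    """AVSG: the members of the volume storage group the client can reach."""
    return vsg & reachable


def mode_for(vsg, reachable):
    """Disconnected operation takes effect only when the AVSG is empty."""
    if accessible_vsg(vsg, reachable):
        return Mode.SERVER_REPLICATION
    return Mode.DISCONNECTED


# Hypothetical volume replicated at three servers.
vsg = {"server_a", "server_b", "server_c"}
print(mode_for(vsg, {"server_a", "server_b"}))  # some replicas reachable
print(mode_for(vsg, set()))                     # no replica reachable
```

As long as at least one replica stays reachable the client keeps relying on server replication; only losing all of them triggers the fallback to the local cache.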

(8)

An Example

(14)

Design Rationale

Scalability

– Callback cache coherence (inherited from AFS)

– Whole-file caching

– Fat clients (functionality stays on servers only where security or integrity requires it)

– Avoid system-wide rapid change

Portable workstations

(Powerful, lightweight and compact laptop computers)
(15)

Design Rationale – Replication

First- vs. Second-Class Replication

Server replication (why?)

Higher quality:

+ Persistent, physically secure

- Expensive

Client replication (i.e., cache copies)

(16)

Design Rationale – Replica Control

 By definition, a network partition exists between a disconnected second-class replica and all its first-class associates.

Pessimistic

– Avoids conflicts by disallowing all partitioned writes, or by restricting reads and writes to a single partition

Optimistic

– Permits reads and writes everywhere

– More sophisticated: conflicts must be detected after the fact

(17)
(18)
(19)

Hoarding

 Hoard useful data for disconnection

 Balance the needs of connected and disconnected operation.

– Cache size is restricted

– Unpredictable disconnections

(20)

Prioritized algorithm

 User-defined hoard priority p: how interesting is the object?

 Recent usage q

 Object priority = f(p, q)

 Evict the object with the lowest priority

+ Fully tunable

Everything can be customized

- Not tunable (?)
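A minimal sketch of the prioritized algorithm, under loud assumptions: the slides do not say what f looks like, so the weighted sum, the weight ALPHA, and the recency formula below are all invented for illustration. Only the overall shape (combine p and q, evict the lowest) comes from the slide.

```python
from dataclasses import dataclass

ALPHA = 0.75  # hypothetical weight of the hoard priority p relative to q


@dataclass
class CachedObject:
    name: str
    hoard_priority: int  # p: set by the user, higher means more important
    last_used_tick: int  # basis for q: when the object was last referenced


def recency(obj, now):
    """q: shrinks the longer the object goes unreferenced (illustrative)."""
    return 100.0 / (1 + now - obj.last_used_tick)


def priority(obj, now):
    """f(p, q) as a weighted sum; the real form of f is an assumption here."""
    return ALPHA * obj.hoard_priority + (1 - ALPHA) * recency(obj, now)


def evict_one(cache, now):
    """Remove and return the cached object with the lowest priority."""
    victim = min(cache, key=lambda obj: priority(obj, now))
    cache.remove(victim)
    return victim


cache = [
    CachedObject("thesis.tex", hoard_priority=90, last_used_tick=5),
    CachedObject("tmp.o", hoard_priority=0, last_used_tick=9),
    CachedObject("mail", hoard_priority=20, last_used_tick=1),
]
print(evict_one(cache, now=10).name)
```

With these made-up numbers the recently used but unhoarded object loses to an older hoarded one, which is the point of mixing p with q rather than using plain LRU.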

(21)

Hoard Walking

 We say that a cache is in equilibrium, signifying that it meets user expectations about availability, when no uncached object has a higher priority than a cached object.

 Equilibrium: priority(uncached object) ≤ priority(cached object)

– Why might it be broken? The cache size is limited.

 Walking: restore equilibrium

– Reload the HDB (it may have been changed by others)

– Reevaluate priorities in the HDB and the cache

– Enhanced callback
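The equilibrium condition and the effect of a walk can be sketched as follows. This is a simplification under stated assumptions: priorities are already computed, every object occupies one cache slot (sizes are ignored), and the function names are hypothetical. A real walk would also fetch the missing objects from the servers.

```python
def in_equilibrium(cached, uncached):
    """True when no uncached object outranks any cached object.

    Both arguments map object name -> priority.
    """
    if not cached or not uncached:
        return True
    return max(uncached.values()) <= min(cached.values())


def hoard_walk(cached, uncached, capacity):
    """Restore equilibrium by keeping the `capacity` highest priorities cached."""
    everything = {**cached, **uncached}
    ranked = sorted(everything, key=everything.get, reverse=True)
    new_cached = {name: everything[name] for name in ranked[:capacity]}
    new_uncached = {name: everything[name] for name in ranked[capacity:]}
    return new_cached, new_uncached


cached = {"a": 10, "b": 50}
uncached = {"c": 80}  # e.g. a newly hoarded object not yet fetched
print(in_equilibrium(cached, uncached))
cached, uncached = hoard_walk(cached, uncached, capacity=2)
print(sorted(cached), sorted(uncached), in_equilibrium(cached, uncached))
```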

(22)

Emulation

 Act like a server

 Record modified objects

 Prepare to replay update activity

– Replay log kept per volume

 Persistence

– Metadata → RVM (recoverable virtual memory)

– Resource exhaustion
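A per-volume replay log could be sketched like this. The record layout, the method names, and the rule that a newer store of a file supersedes an older store of the same file are assumptions made for illustration; persistence (which the slide assigns to RVM) is not modeled at all.

```python
from collections import defaultdict


class ReplayLog:
    """One log per volume; each record describes one mutating operation."""

    def __init__(self):
        # volume name -> list of (operation, path, data) records
        self.records = defaultdict(list)

    def log_store(self, volume, path, data):
        log = self.records[volume]
        # Assumed optimization: drop older store records for the same file,
        # since only the latest contents need to reach the server.
        log[:] = [r for r in log if not (r[0] == "store" and r[1] == path)]
        log.append(("store", path, data))

    def log_mkdir(self, volume, path):
        self.records[volume].append(("mkdir", path, None))


log = ReplayLog()
log.log_store("vol1", "/paper.tex", b"draft 1")
log.log_mkdir("vol1", "/figures")
log.log_store("vol1", "/paper.tex", b"draft 2")
print(log.records["vol1"])
```

Keeping the log short matters because it competes with cached files for the client's limited disk space.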

(23)

Reintegration

 Replay algorithm

– Executed in parallel at all AVSG members

– Transaction based

– Succeeded?

 Yes: free the logs, reset priority
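The transaction-based replay can be illustrated with an all-or-nothing sketch. Assumptions: a single in-memory dictionary stands in for a server, replay at several AVSG members in parallel is not modeled, and the record format and names are hypothetical.

```python
import copy


class ReplayError(Exception):
    """Raised when a log record cannot be applied to the server state."""


def apply_record(state, record):
    op, path, data = record
    if op == "store":
        state[path] = data
    elif op == "remove":
        if path not in state:
            raise ReplayError(f"cannot remove missing object {path}")
        del state[path]
    else:
        raise ReplayError(f"unknown operation {op}")


def reintegrate(server_state, log):
    """Replay the log as one transaction: commit every record or none."""
    scratch = copy.deepcopy(server_state)  # work on a private copy
    try:
        for record in log:
            apply_record(scratch, record)
    except ReplayError:
        return False  # abort: server state untouched, log kept
    server_state.clear()
    server_state.update(scratch)  # commit
    log.clear()  # success: free the log
    return True


server = {"/a": b"old"}
bad_log = [("store", "/a", b"new"), ("remove", "/missing", None)]
print(reintegrate(server, bad_log), server)
good_log = [("store", "/a", b"new")]
print(reintegrate(server, good_log), server)
```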

(24)

Conflict Handling

 Only write/write conflicts matter

 File vs. directory

– File: halt the entire reintegration process

– Directory: investigate further
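For files, write/write conflict detection can be sketched by comparing versions. The integer version stamps and the function names below are hypothetical stand-ins; the sketch covers only the file case, where the first conflict halts reintegration, and leaves out the finer-grained directory checks.

```python
def file_conflict(base_version, server_version):
    """Conflict: the server copy changed after the client cached its copy."""
    return base_version != server_version


def first_conflict(log, server_versions):
    """Return the first conflicting path, or None if the log can be replayed.

    `log` holds (path, version the client's update was based on) pairs.
    """
    for path, base_version in log:
        if file_conflict(base_version, server_versions.get(path)):
            return path  # halt reintegration at the first file conflict
    return None


server_versions = {"/notes.txt": 3, "/todo.txt": 7}
log = [("/notes.txt", 3), ("/todo.txt", 6)]  # todo.txt was updated elsewhere
print(first_conflict(log, server_versions))
```

Reads during disconnection never appear in the log, which is why only write/write conflicts can be detected this way.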

(25)

Coda Evaluation

Hardware

– 386 laptops, IBM RTs, DECstation 3100s

– 350 MB disk

How …?

– How long does reintegration take?

(26)

Answers

Duration of Reintegration

– A few hours (4 to 5) of disconnection → about 1 min of reintegration

Cache size

– A 100 MB client disk is enough for a “typical” workday

Conflicts

– No conflicts at all! Why?

– Over 99% of modifications were made by the same user as the previous modification

(27)

Conclusion

 Disconnected operation is a simple idea

 But each stage is hard to implement

– Why?

 An extended version of a write-back cache?

– A write-back cache that prefetches critical data

(28)

Remember this slide?

 We are back in the 1990s.

 The network is slow and unstable

 Terminal → “powerful” client

– 33 MHz CPU, 16 MB RAM, 100 MB hard drive

 Mobile users appear

(29)

What about now?

 We are in the 2000s now.

 The network is fast and reliable on the LAN

 “powerful” client → very powerful client

– 2.4 GHz CPU, 1 GB RAM, 120 GB hard drive

 Mobile users everywhere

– IBM ThinkPad 10th anniversary

 Do we still need disconnection?

(30)

Do we still need disconnection?

 WAN and wireless links are slow and not very reliable

 PDAs are not very powerful

– 200 MHz StrongARM, 128 MB CF card

– Constrained by battery power

 LBFS (MIT) for WANs; Coda and Odyssey (CMU) for mobile users

(31)

What is the future?

 We are in the 2010s now

 High-bandwidth, reliable wireless everywhere

 Even PDAs are powerful

– 2 GHz CPU, 1 GB RAM/flash

– Unlimited kinetic or solar energy (?)

 What will the research topics in file systems be?
