Resilient Dynamic Programming

(1)

Resilient Dynamic Programming

Irene Finocchi, Saverio Caminiti, and Emanuele Fusco

Dipartimento di Informatica, “Sapienza” Università di Roma via Salaria, 113-00198 Rome, Italy.

{finocchi, caminiti, fusco}@di.uniroma1.it

Kickoff AlgoDEEP — Bertinoro, Italia. April 16-17 2010

(task C.1.1)

(2)

Outline

1 Introduction

2 A resilient framework for dynamic programming

3 Testing and experimental validation

(3)

Memories and faults

Why should we care about memory faults in algorithm design?

Memory faults happen: a large cluster of computers with a few

gigabytes per node can experience one bit error every few minutes [Sah06].

Memory faults are harmful: undetected memory faults cause

data corruption to spread; (potentially safety critical, e.g., avionics).

Hardware solutionsmay be inadequate: fault-tolerant memory

chips does not guarantee complete fault coverage; (expensive – system halt upon detection of uncorrectable errors – interruptions of service) [JNW08].

(4)

From liars to data corruption

Algorithmic research related to memory errors has focused mainly on sorting and searching problems:

late 70’s: Rényi [Rén94] and Ulam [Ula77]: “twenty questions game” against a liar, handling noise in binary search.

Yao and Yao [YY85], and then [AU91, LM99, LMP97]: destructive faults in fault-tolerant sorting networks, comparison gates can destroy one of the input values. . . .

[FI04] sorting in the faulty RAM model.

(5)

Faulty memories: an adversarial model

Memory in a faulty-RAM of word-sizew is divided in three classes:

a large unreliable memory: anadaptiveadversary of unlimited

computational powercan modify up toδ memory words;

O(1) safememory words: the adversary can readbutnot

modify this memory;

O(1) privatememory words: the adversarycannot even read this memory.

(6)

Local dependency dynamic programming – edit distance

Letei,j be the edit distance between the prefix up to the i-th symbol of the input stringX and the prefix up to thej-th symbol of the input stringY.

ei,j :=

(

ei−1,j−1 ifi,j >0 andxj =yi

1+min{ei−1,j,ei,j−1,ei−1,j−1} ifi,j >0 andxj 6=yi

(e0,j =j,ei,0 =i.)

(7)

Correctness requirements

Correctness of sorting and searching required only on uncorrupted values.

In our setting, such a relaxed definition of correctness does not seem to be natural.

We seek algorithms that correctly compute the edit distance between the two input strings, in spite of memory faults.

(8)

Correctness requirements

Correctness of sorting and searching required only on uncorrupted values.

In our setting, such a relaxed definition of correctness does not seem to be natural.

We seek algorithms that correctly compute the edit distance between the two input strings, in spite of memory faults.

(9)

Tools

Majority.

Table decomposition. Fingerprinting.

(10)

Majority

A variable can be made resilient by making 2δ+1 copies.

As at mostδ of them can be altered by the adversary, the

majority value is the correct value.

The majority value can be read in time O(δ)and space

O(1) [BM91].

(11)

Table decomposition

The DP table is split in blocks of size δ×δ. The boundaries of each block are written reliably in the faulty memory. δ2 values result in roughly

5δ2 memory words.

(12)

Fingerprinting

A fingerprint for a column is computed as: ϕk =v1◦v2◦. . .◦vδ modp

wherep is a prime number uniformly chosen at randomly in interval

[nc−1_,_nc_]_(where_c _{is an appropriate constant).}

Using logical shifts and Horner’s rule, each fingerprint can be incrementally computed while generating the valuesvh:

for h=1 toδ do

ϕ= ((ϕ×2w) +vh) modp

end for

(13)

Fingerprinting

A fingerprint for a column is computed as: ϕk =v1◦v2◦. . .◦vδ modp

wherep is a prime number uniformly chosen at randomly in interval

[nc−1_,_nc_]_(where_c _{is an appropriate constant).}

Using logical shifts and Horner’s rule, each fingerprint can be incrementally computed while generating the valuesvh:

for h=1 toδ do

ϕ= ((ϕ×2w) +vh) modp

end for

(14)

Block computation

Bi,j Bi,j−1

Bi−1,j−1 Bi−1,j

ϕ1

The first column of a block is computed reading reliably all values it depends from.

(15)

Block computation

Bi,j Bi,j−1

ϕ1

While computing the first column, fingerprint ϕ1 is

also computed.

(16)

Block computation

Bi,j Bi,j−1

ϕ1

While computing the first column, fingerprint ϕ1 is

also computed.

(17)

Block computation

ϕk Bi−1,j−1 Bi−1,j Bi,j−1 ϕk + 1 Bi, j ϕk

While computing column

k+1, we produce two

fingerprints,ϕk+1 andϕk.

(18)

Block computation

ϕk Bi−1,j−1 Bi−1,j Bi,j−1 ϕk + 1 Bi, j ϕk Fingerprintϕk is then compared withϕ_k (i.e., the fingerprint produced while

computing column k).

(19)

Block computation

ϕk Bi−1,j−1 Bi−1,j Bi,j−1 ϕk + 1 Bi, j ϕk Ifϕk 6=ϕk, the block is recomputed from scratch.

(20)

As a result...

... we have: Theorem

The edit distance between two strings of length n and m, with n≥m, can be correctly computed, with high probability, in:

O(nm+αδ2) time; O(nm) space,

whenδ is polynomial in n.

(21)

Generalizing

Theorem

A d -dimensional local dependency dynamic programming table M of size nd can be correctly computed, with high probability, in:

O(nd+αδd) time; O(nd₊_n_δ₎_space,

when the actual numberα≤δ of memory faults occurring during the computation is polynomial in n.

(Edit distance, longest common subsequence, sequence alignment,. . .)

(22)

faultyLib

We are developing a library to test program behavior in presence of memory faults.

Plugging in the library should be very easy:

existing C/C++ code should require minimal changes to be tested with our library.

Implementation of different (and meaningful) adversaries should be easy.

. . .

(23)

faultyLib: usage

FaultyUInt M[n+1u][m+1u]; // An n+1 X m+1 matrix of

// faulty unsigned int ...

for (unsigned int i = 1; i <= n; i++) { for (unsigned int j = 1; j <= m; j++) {

M[i][j] = min(1 + min(M[i-1][j], M[i][j-1]), M[i-1][j-1] + ((x[i-1]==y[j-1]) ? 0 : 1)); }

} ...

(24)

faultyLib: faulty types implementation

template <typename T>

class Faulty : public FaultyBase { ... private: T _val; T read() const { FaultyMM::getInstance()->faultBeforeRead(&_val, sizeof(T), context); return _val; } void write(T v) { _val = v; FaultyMM::getInstance()->faultAfterWrite(&_val, sizeof(T), context); } } ...

typedef Faulty<unsigned int> FaultyUInt;

(25)

faultyLib: overriding operators

...

//Assignment operator template <typename Targ>

Faulty & operator=(const Targ & v) { write((T)v); return *this; }

... //OR

template <typename Targ>

bool operator||(const Targ & v) const { return (read() || (T)v);

} }

...

(26)

faultyLib: adversaries implementation

class REDAdversary : public Adversary { ...

virtual void faultAfterWrite(void * location, size_t s, Context * cnt) {

if ((cnt != NULL) && (cnt->tag == EDMATRIX_TAG)) { MatrixContext * m = (MatrixContext *)cnt; unsigned int * i = (unsigned int *)location; if (m->getIndex(0) == 3) if (m->getIndex(1) == 7) *i = *i +3; } } ...

(27)

Thanks!

Thank you for your attention!

(28)

References

[AU91] S. Assaf and E. Upfal.

Fault tolerant sorting networks.

SIAM J. Discrete Math., 4(4):472–480, 1991.

[BM91] R. S. Boyer and J. S. Moore.

Mjrty: A fast majority vote algorithm. InAutomated Reasoning: Essays in Honor of Woody Bledsoe, pages 105–118, 1991.

[FI04] Irene Finocchi and Giuseppe F. Italiano.

Sorting and searching in the presence of memory faults (without redundancy). In László Babai, editor,STOC, pages 101–110. ACM, 2004.

[JNW08] B. L. Jacob, S. W. Ng, and D. T. Wang.

Memory Systems: Cache, DRAM, Disk.

Morgan Kaufmann, 2008.

[LM99] F. T. Leighton and Y. Ma.

Tight bounds on the size of fault-tolerant merging and sorting networks with destructive faults.

SIAM J. Comput., 29(1):258–273, 1999.

[LMP97] F. T. Leighton, Y. Ma, and C. G. Plaxton.

Breaking theθ(nlog2n)barrier for

sorting with faults.

J. Comput. Syst. Sci., 54(2):265–304, 1997.

[Rén94] A. Rény.

A diary on information theory. J. Wiley and Sons, 1994. Original publication:Napló az információelméletröl, Gondolat, Budapest, 1976.

[Sah06] G. K. Saha.

Software based fault tolerance: a survey.

Ubiquity, 7(25), 2006.

[Ula77] S. M. Ulam.

Adventures of a mathematician. Charles Scribner’s Sons, New York, 1977.

[YY85] A. C. Yao and F. F. Yao.

On fault-tolerant networks for sorting.

SIAM J. Comput., 14(1):120–128, 1985.