Unit 4: Dynamic Programming

(1)

Y.-W. Chang

Unit 4 1

Unit 4: Dynamic Programming

․

Course contents:

 Assembly-line scheduling  Matrix-chain multiplication  Longest common subsequence  Optimal binary search trees

 Applications: Rod cutting, optimal polygon triangulation,

flip-chip routing, technology mapping for logic synthesis

․

Reading:

 Chapter 15

Dynamic Programming (DP) vs. Divide-and-Conquer

․

Both solve problems by combining the solutions to subproblems.

․

Divide-and-conquer algorithms

 Partition a problem into independent subproblems, solve the

subproblems recursively, and then combine their solutions to solve the original problem.

 Inefficient if they solve the same subproblem more than once.

․

Dynamic programming (DP)

 Applicable when the subproblems are not independent.

(2)

Y.-W. Chang

Unit 4 3

Assembly-line Scheduling

․

An auto chassis enters each assembly line, has parts added at

stations, and a finished auto exits at the end of the line.  S_i,j: the jth station on line i

 a_i,j: the assembly time required at station S_i,j

 t_i,j: transfer time from station S_i,jto the j+1 station of the other line.  e_i(x_i): time to enter (exit) line i

station S2,1S2,2 a1,1 a1,2 a1,3 a1,n-1 a1,n x₁ x₂ t1,1 t_2,1 a2,1 a2,2 a2,3 a2,n-1 a2,n e2 t_1,2 t2,2 t1,n-1 t2,n-1 e1 auto exits chassis enters line 1 line 2 station S1,1S1,2 S1,3 S1,n-1 S1,n S_2,3 S_2,n-1 S2,n

Optimal Substructure

․

Objective: Determine the stations to choose to minimize the total manufacturing time for one auto.

 Brute force: (2n), why?

 The problem is linearly ordered and cannot be rearranged =>

Dynamic programming?

․

Optimal substructure:If the fastest way through station Si,jis

through S1,j-1, then the chassis must have taken a fastest way from

the starting point through S .

S1,1 S1,2 S1,3 S1,n-1 S1,n S_2,1 S_2,2 a_1,1 a_1,2 a_1,3 a1,n-1 a1,n x1 x2 t1,1 t2,1 a2,1 a2,2 a2,3 a2,n-1 a2,n e₂ t1,2 t_2,2 t1,n-1 t_2,n-1 e₁ auto exits chassis enters line 1 line 2 S2,3 S2,n-1 S2,n

(3)

Y.-W. Chang

Unit 4 5

Overlapping Subproblem: Recurrence

․

Overlapping subproblem:The fastest way through station S1,jis

either through S1,j-1 and thenS1,j, or through S2,j-1and then transfer to

line 1 and through S1,j.

․

fi[j]: fastest time from the starting point through Si,j

․

The fastest time all the way through the factory

f* = min(f1[n] + x1, f2[n] + x2) S1,1 S1,2 S1,3 S1,n-1 S1,n S_2,1 S_2,2 a1,1 a1,2 a1,3 a1,n-1 a1,n x1 x₂ t_1,1 t2,1 a2,1 a2,2 a2,3 a2,n-1 a2,n e₂ t1,2 t_2,2 t1,n-1 t_2,n-1 e₁ auto exits chassis enters line 1 line 2 S2,3 S2,n-1 S2,n 1 1, 1 1 1 1, 2 2, 1 1, 1 [ ] min( [ 1] j, [ 1] j j) 2 e a if j f j f j a f j t  a if j      _{ } _{ } _ _ 

Computing the Fastest Time

․

li[j]: The line number whose station j-1is used in a fastest way

through Si,j Fastest-Way(a, t, e, x, n) 1. f1[1]=e1+ a1,1 2. f2[1]=e2+ a2,1 3.forj=2 ton 4. iff1[j-1] + a1,j f2[j-1] + t2,j -1 + a1,j 5. f1[j]=f1[j-1] + a1,j 6. l1[j] =1 7. elsef1[j]=f2[j-1] + t2,j -1 + a1,j 8. l1[j] =2 9. iff2[j-1] + a2,j f1[j-1] + t1,j -1 + a2,j 10. f2[j]=f2[j-1] + a2,j 11. l2[j] =2 12. elsef2[j]=f1[j-1] + t1,j -1 + a2,j 13. l2[j] =1 14. iff1[n] + x1f2[n] + x2 15. f* =f1[n] + x1 16. l* = 1 17. elsef* =f2[n] + x2 18. l* = 2 Running time? Linear time!!

(4)

Y.-W. Chang

Unit 4 7

Constructing the Fastest Way

Print-Station(l, n)

1. i =l*

2. Print “line”i“, station “n

3. forj=n downto2 4. i=l_i[j]

5. Print “line“ i “, station“ j-1

S_1,1 S_1,2 S1,3 S1,5 S1,6 S2,1 S2,2 7 9 3 8 4 3 2 2 2 8 5 6 5 7 4 3 1 4 1 2 auto exits chassis enters line 1 line 2 S2,3 S2,5 S2,6 S1,4 4 4 3 2 S2,4 1 2 line 1, station 6 line 2, station 5 line 2, station 4 line 1, station 3 line 2, station 2 line 1, station 1

An Example

S1,1 S1,2 S1,3 S1,5 S1,6 S_2,1 S_2,2 7 9 3 8 4 3 2 2 2 8 5 6 5 7 4 3 1 4 1 2 auto exits chassis enters line 1 line 2 S2,3 S2,5 S2,6 S1,4 4 4 3 2 S2,4 1 2 9 18 20 24 32 35 12 16 22 25 30 37 1 2 1 1 2 1 2 1 2 2 1 2 3 4 5 6 j f* = 38 l* = 1 f1[j] f₂[j] l1[j] l [j]

(5)

Y.-W. Chang

Unit 4 9

․

Typically apply to optimization problem.

․

Generic approach

 Calculate the solutions to all subproblems.

 Proceed computation from the small subproblems to

larger subproblems.

 Compute a subproblem based on previously computed

results for smaller subproblems.

 Store the solution to a subproblem in a table and never

recompute.

․

Development of a DP

1. Characterize the structure of an optimal solution. 2. Recursively define the value of an optimal solution. 3. Compute the value of an optimal solution bottom-up. 4. Construct an optimal solution from computed information

(omitted if only the optimal value is required).

Dynamic Programming (DP)

When to Use Dynamic Programming (DP)

․

DP computes recurrence efficiently by storing partial results 

efficient only when the number of partial results is small.

․

Hopeless configurations:n! permutations of an n-element set, 2n_{subsets of an}_n_{-element set, etc.}

․

Promising configurations: contiguous substrings of an n-character string, n(n+1)/2 possible subtrees of a binary search tree, etc.

․

DP works best on objects that are linearly ordered

and cannot be rearranged!!

 Linear assembly lines, matrices in a chain, characters in a string, points around the boundary of a polygon, points on a line/circle, the left-to-right order of leaves in a search tree, etc.

 Objects are ordered left-to-right Smell DP?

1 ( 1) / 2

n

i in n

(6)

Y.-W. Chang

Unit 4 11

DP Example: Matrix-Chain Multiplication

․

If

A

is a

p

x

q

matrix and

B

a

q

x

r

matrix, then

C

=

AB

is

a

p

x

r

matrix

time complexity:

O

(

pqr

).

1

[ , ]

[ , ] [ , ]

q k

C i j

A i k B k j





Matrix-Multiply(A, B) 1. ifA.columns≠B.rows

2. error“incompatible dimensions”

3. elselet C be a new A.rows* B.columnsmatrix

4. fori=1 toA.rows 5. forj=1 toB.columns 6. c_ij= 0 7. fork = 1toA.columns 8. c_ij= c_ij+ a_ikb_kj 9. returnC

DP Example: Matrix-Chain Multiplication

․

The matrix-chain multiplication problem



Input: Given a chain <

A

₁

,

A

₂

, …,

A

_n

> of

n

matrices,

matrix

A

_i

has dimension

p

_i_-1

x

p

_i



Objective: Parenthesize the product

A

₁

A

₂

…

A

_n

to

minimize the number of scalar multiplications

․

Exp:

dimensions:

A

₁

: 4

x

2;

A

₂

: 2

x

5;

A

₃

: 5

x

1 (

A

₁

A

₂

)

A

₃

: total multiplications = 4

x

2 x

5 + 4

x

5 x

1 = 60

A

₁

(

A

₂

A

₃

): total multiplications = 2

x

5 x

1 + 4

x

2 x

1 = 18

(7)

Y.-W. Chang

Unit 4 13

Matrix-Chain Multiplication: Brute Force

․

A

=

A

₁

A

₂

…

A

_n

: How to evaluate

A

using the minimum

number of multiplications?

․

Brute force: check all possible orders?



P

(

n

): number of ways to multiply

n

matrices.



,

exponential

in

n

.

․

Any efficient solution?



The matrix chain

is linearly ordered and

cannot be rearranged!!



Smell Dynamic programming?

Matrix-Chain Multiplication

․

m

[

i

,

j

]: minimum number of multiplications to compute

matrix

A

_i..j

=

A

_i

A

_i₊₁

…

A

_j

, 1



i



j



n

.



m

[1,

n

]: the cheapest cost to compute

A

_1.._n

.

․

Applicability of dynamic programming



Optimal substructure:

an optimal solution contains

within its optimal solutions to subproblems.



Overlapping subproblem:

a recursive algorithm

revisits the same problem over and over again; only



(

n

2

_{) subproblems.}

(8)

Y.-W. Chang

Unit 4 15

Bottom-Up DP Matrix-Chain Order

Matrix-Chain-Order(p) 1. n =p.length– 1

2. Let m[1..n, 1..n] and s[1..n-1, 2..n] be new tables 3. fori=1 ton

4. m[i, i] =0

5. for l=2 ton // l is the chain length 6. fori=1 ton –l +1 7. j=i + l -1 8. m[i, j] = 9. fork=i toj -1 10. q=m[i, k] + m[k+1, j] + pi-1pkpj 11. if q < m[i, j] 12. m[i, j] =q 13. s[i, j] =k 14. returnmand s s A_idimension p_i_-1xp_i

Constructing an Optimal Solution

․

s[i, j]: value of ksuch that the optimal parenthesization of

AiAi+1… Ajsplits between Akand Ak+1

․

Optimal matrix A1..nmultiplication: A1..s[1, n]As[1, n] + 1..n

․

Exp:call Print-Optimal-Parens(s, 1, 6): ((A1(A2A3))((A4A5) A6))

Print-Optimal-Parens(s, i, j) 1. ifi == j 2. print “A”i 3. else print “(“ 4. Print-Optimal-Parens(s, i, s[i, j]) 5. Print-Optimal-Parens(s, s[i, j] + 1, j) 6. print “)“ s 0

(9)

Y.-W. Chang

Unit 4 17

Top-Down, Recursive Matrix-Chain Order

․

Time complexity:



(2

n

_{) (}

_T

₍

_n

_{) > (}

1

_T

₍

_k

₎₊

_T

₍

_n

_-

_k

_)+1)).

1 n k  



Recursive-Matrix-Chain(p, i, j) 1. ifi == j 2. return0 3. m[i, j] =



4. fork=itoj -1 5 q=Recursive-Matrix-Chain(p,i, k) + Recursive-Matrix-Chain(p, k+1, j) + pi-1pkpj 6. if q < m[i, j] 7. m[i, j] =q 8. returnm[i, j]

Top-Down DP Matrix-Chain Order (Memoization)

․

Complexity: O(n2_{) space for}_m_{[] matrix and}_O₍_n3_{) time to fill in}

O(n2_{) entries (each takes}_O₍_n_{) time)}

Memoized-Matrix-Chain(p)

1. n =p.length - 1

2. let m[1..n, 1..n] be a new table 2. fori=1 to n 3. forj =iton 4. m[i, j] =



5. returnLookup-Chain(m, p,1,n) Lookup-Chain(m, p, i, j) 1. if m[i, j] <  2. return m[i, j] 3. if i == j 4. m[ i, j] =0 5. else fork=itoj -1 6. q=Lookup-Chain(m, p, i, k) + Lookup-Chain(m, p, k+1, j) + p_i_-1p_kp_j 7. if q < m[i, j] 8. m[i, j] =q 9. return m[i, j]

(10)

Y.-W. Chang

Unit 4 19

Two Approaches to DP

1. Bottom-up iterative approach

 Start with recursive divide-and-conquer algorithm.

 Find the dependencies between the subproblems (whose

solutions are needed for computing a subproblem).

 Solve the subproblems in the correct order.

2. Top-down recursive approach (memoization)

 Start with recursive divide-and-conquer algorithm.  Keep top-down approach of original algorithms.

 Save solutions to subproblems in a table (possibly a lot of

storage).

 Recurse only on a subproblem if the solution is not already available in the table.

․

If all subproblems must be solved at least once, bottom-up DP is

better due to less overhead for recursion and for maintaining tables.

․

If many subproblems need not be solved, top-down DP is better

since it computes only those required.

Longest Common Subsequence

․

Problem:

Given

X

= <

x

₁

, x

₂

, …, x

_m

> and

Y

= <y

₁

, y

₂

, …, y

_n

>, find the

longest common

subsequence (LCS)

of

X

and Y.

․

Exp:

X

= <

a,

b

,

c

,

b

, d,

a

, b

> and

Y

= <

b

,

d

,

c

,

a

,

b

,

a

>

LCS = <

b

,

c

,

b

,

a

> (also, LCS = <

b, d, a, b

>).

․

Exp: DNA sequencing:



S1 = A

C

CG

GTCG

A

GATGCA

G;

S2 = GT

CGT

T

CGGA

A

TGCA

T;

LCS S3 = CGTCGGATGCA

․

Brute-force method:



Enumerate all subsequences of

X

and check if they

appear in

Y

.



Each subsequence of

X

corresponds to a subset of

the indices {1, 2, …,

m

} of the elements of

X

.

(11)

Y.-W. Chang

Unit 4 21

Optimal Substructure for LCS

․

Let

X

= <

x

₁

,

x

₂

, …,

x

_m

> and

Y

= <

y

₁

, y

₂

, …,

y

_n

> be

sequences, and Z = <

z

₁

,

z

₂

, …,

z

_k

> be LCS of X and Y.

1. If xm= yn, then zk= xm= ynand Zk-1is an LCS of Xm-1and Yn-1.

2. If xm≠ yn, then zk≠xmimplies Zis an LCS of Xm-1and Y.

3. If xm≠ yn, then zk≠ynimplies Zis an LCS of Xand Yn-1.

․

c

[

i

,

j

]: length of the LCS of

X

_i

and

Y

_j

․

c

[

m

,

n

]: length of LCS of

X

and

Y

․

Basis:

c

[0,

j

] = 0 and

c

[

i

, 0] = 0

Top-Down DP for LCS

․

c[i, j]: length of the LCS of Xiand Yj, where Xi= <x1, x2, …, xi> and Yj=<y1, y2, …, yj>.

․

c[m, n]: LCS of Xand Y.

․

Basis: c[0, j] = 0 and c[i, 0] = 0.

․

The top-down dynamic programming: initialize c[i, 0] = c[0, j] = 0,

c[i, j] = NIL TD-LCS(i, j) 1.if c[i,j] == NIL 2. ifxi== yj 3. c[i, j] = TD-LCS(i-1, j-1) + 1 4. elsec[i, j] = max(TD-LCS(i, j-1), TD-LCS(i-1, j)) 5. returnc[i, j]

(12)

Y.-W. Chang

Unit 4 23

Bottom-Up DP for LCS

․

Find the right order to

solve the subproblems

․

To compute

c

[

i

,

j

], we

need

c

[

i

-1,

j

-1],

c

[

i

-1,

j

],

and

c

[

i

,

j

-1]

․

b

[

i

,

j

]: points to the table

entry w.r.t. the optimal

subproblem solution

chosen when computing

c

[

i

,

j

]

LCS-Length(X,Y) 1. m=X.length 2. n=Y.length 3. let b[1..m, 1..n] and c[0..m, 0..n] be new tables 4. fori=1 tom 5. c[i, 0] =0 6. forj=0 ton 7. c[0, j] =0 8. fori=1 tom 9. forj=1 ton 10. ifxi== yj 11. c[i, j] =c[i-1, j-1]+1 12. b[i, j] =“↖” 13. elseifc[i-1,j] c[i, j-1] 14. c[i,j] =c[i-1, j] 15. b[i, j] =`` '' 16. elsec[i, j] =c[i, j-1] 17. b[i, j] =“ “ 18. returncandb

Example of LCS

․

LCS time and space complexity: O(mn).

․

X= <A, B, C, B, D, A, B> and Y= <B, D, C, A, B, A> LCS = <B,

(13)

Y.-W. Chang

Unit 4 25

Constructing an LCS

․

Trace back from

b

[

m

,

n

] to

b

[1, 1], following the arrows:

O

(

m

+

n

) time.

Print-LCS(b, X, i, j) 1. ifi== 0 orj == 0 2. return 3. ifb[i, j] == “↖“ 4. Print-LCS(b, X, i-1, j-1) 5. print xi 6. elseifb[i, j] == “ “ 7. Print-LCS(b, X, i-1, j) 8. elsePrint-LCS(b, X, i, j-1)

Optimal Binary Search Tree

․

Given a sequence K = <k₁, k₂, …, k_n> of ndistinct keys in sorted order (k₁< k₂< … < k_n) and a set of probabilities P= <p₁, p₂, …, p_n> for searching the keys in Kand Q = <q₀, q₁,

q₂, …, q_n> for unsuccessful searches (corresponding to D= <d₀, d₁, d₂, …, d_n> of n+1distinct dummy keys with d_i

representing all values between k_iand k_i+1), construct a binary search tree whose expected search cost is smallest.

k2 k1 k3 k4 k5 d₀ d₁ d2 d3 d4 d5 k₂ k₁ k4 k5 k3 d0 d1 d2 d3 d4 d5 p₂ q₂ p1 _p₄ p3 p5 q0 q1 q3 q4 q5

(14)

Y.-W. Chang Unit 4 27

An Example

i

0 1 2 3 4 5 pi 0.15 0.10 0.05 0.10 0.20 q_i 0.05 0.10 0.05 0.05 0.05 0.10 1 0

1

n n i i i i

p

q

 







1 0

[search cost in ] =

n

(

T

( ) 1)

i i n

(

T

( ) 1)

i i i i

E

T

depth k

p

depth d

q

 

  

 



1 0

1

n T

( )

i i n T

( )

i i i i

depth k

p

depth d

q

 

 



 





Cost = 2.75 Optimal!! k2 k₁ k4 k5 k₃ d0 d1 d2 d3 d₄ d₅ Cost = 2.80 k2 k₁ k3 k₄ k5 d0 d1 d2 d3 d4 d5 0.10×1 (depth 0) 0.15×2 0.10×2 0.10×4 0.20×3

Optimal Substructure

․

If an optimal binary search tree

T

has a subtree

T

’

containing keys

k

_i

, …,

k

_j

, then this subtree

T

’ must be

optimal as well for the subproblem with keys

k

_i

, …,

k

_j

and dummy keys

d

_i-1

, …,

d

_j

.

 Given keys k_i, …, k_jwith kr(irj) as the root, the left subtree contains the keys ki, …, kr-1(and dummy keys di-1, …,dr-1) and the right subtree contains the keys kr+1, …, kj(and dummy keys

dr, …,dj).

 For the subtree with keys k_i, …, k_jwith root k_i, the left subtree

contains keys k_i, .., k_i-1(no key)with the dummy key d_i-1. k₂ k₁ k₄ k5 k₃ d₀ d1 d4 d₅

(15)

Y.-W. Chang

Unit 4 29

Overlapping Subproblem: Recurrence

․

e

[

i

,

j

] : expected cost of searching an optimal binary

search tree containing the keys

k

i

, …,

k

j

.

 Want to find e[1, n].

 e[i, i -1] = q_i-1(only the dummy key d_i-1).

․

If

k

_r

(

i



r



j

) is the root of an optimal subtree containing

keys

k

_i

, …,

k

_j

and let

then

e

[

i, j

] =

p

r

+ (

e

[

i

,

r

-1] +

w

(

i

,

r

-1)

) + (

e

[

r

+1,

j

] +

w

(

r

+1,

j

)

=

e

[

i

,

r

-1] +

e

[

r

+1,

j

] +

w

(

i

,

j

)

․

Recurrence:

1 ( , ) j j l l l i l i w i j p q    







1 1 [ , ] _{min{ [ ,}i _1] _[ _{1, ]} _{( , )}} i r j q if j i e i j e i r e r j w i j if i j         _{ } _ _ _  k3 k4 k5 d₂ d₃ d4 d5 0.10×1 0.10×3 0.20×2

Node depths increase by 1 after merging two subtrees, and so do the costs

Computing the Optimal Cost

․Need a table e[1..n+1, 0..n] for e[i, j] (why e[1, 0] and e[n+1, n]?)

․Apply the recurrence to compute w(i, j) (why?)

Optimal-BST(p, q, n)

1. let e[1..n+1, 0..n], w[1..n+1, 0..n], and root[1..n, 1..n] be new tables 2. fori=1 ton + 1 3. e[i, i-1] =qi-1 4. w[i, i-1] =qi-1 5. forl=1 ton 6. fori=1 ton –l + 1 7. j=i + l -1 8. e[i, j] = 9. w[i, j] =w[i, j-1] + pj+ qj 10. forr=i toj 11. t=e[i, r-1] + e[r+1, j] + w[i, j] 12. if t < e[i, j] 13. e[i, j] =t 14. root[i, j] =r

15. returneandroot

1

1 [ , ]

[ ,

1]

i j j

q

if j

i

w i j

p

q

if i

j



 



 

_{  }

_



․root[i, j] : index rfor which

k_ris the root of an optimal search tree containing keys ki, …, kj. 2.75 2.00 0.50 1.30 0.05 0.90 0.10 1.75 1.20 0.05 0.60 0.30 0.90 0.40 1.25 0.70 0.25 0.05 0.45 0.10 0.05 0 3 4 5 1 2 3 4 5 6 2 1 e j i k3 k4 k5 d2 d3 d4 d5

(16)

Y.-W. Chang Unit 4 31

Example

2.75 2.00 0.50 1.30 0.05 0.90 0.10 1.75 1.20 0.05 0.60 0.30 0.90 0.40 1.25 0.70 0.25 0.05 0.45 0.10 0.05 0 3 4 5 1 2 3 4 5 6 2 1 e j i 1.00 0.80 0.35 0.60 0.05 0.50 0.10 0.70 0.50 0.05 0.30 0.20 0.45 0.25 0.55 0.35 0.15 0.05 0.30 0.10 0.05 0 3 4 5 1 2 3 4 5 6 2 1 w j i 2 4 5 5 5 2 2 4 4 1 2 2 2 3 1 3 4 5 ₁ 2 3 4 5 2 1 root j i i 0 1 2 3 4 5 pi 0.15 0.10 0.05 0.10 0.20 qi 0.05 0.10 0.05 0.05 0.05 0.10 k2 k1 k4 k5 k3 d₀ d1 d2 d3 d4 d₅ e[1, 1] = e[1, 0] + e[2, 1] + w(1,1)

= 0.05 + 0.10 + 0.3 = 0.45

e[1, 5] = e[1, 1] + e[3, 5] + w(1,5) = 0.45 + 1.30 + 1.00 = 2.75

(r = 2; r=1, 3?)

Appendix A:

Rod Cutting

․

Cut steel rods into pieces to maximize the revenue

 Assumptions: Each cut is free; rod lengths are integral numbers.

․

Input: A length nand table of prices p_i, for i= 1, 2, …, n.

․

Output: The maximum revenue obtainable for rods whose lengths sum to n, computed as the sum of the prices for the individual rods. length i 1 2 3 4 5 6 7 8 9 10 price pi 1 5 8 9 10 17 17 20 24 30 length i= 4 9 1 8 5 5 8 1 1 1 1 1 1 1 1 5 1 5 1 1 5 max revenue r4= 5+5 =10 5 5

(17)

Y.-W. Chang

Unit 4 33

Optimal Rod Cutting

․

If p_nis large enough, an optimal solution might require no cuts, i.e.,

just leave the rod as nunit long.

․

Solution for the maximum revenue riof length i

r₁= 1 from solution 1 = 1 (no cuts)

r₂= 5 from solution 2 = 2 (no cuts)

r3= 8 from solution 3 = 3 (no cuts)

r₄= 10 from solution 4 = 2 + 2

r5= 13 from solution 5 = 2 + 3

r6= 17 from solution 6 = 6 (no cuts)

r₇= 18 from solution 7 = 1 + 6 or 7 = 2 + 2 + 3 r8= 22 from solution 8 = 2 + 6 length i 1 2 3 4 5 6 7 8 9 10 price pi 1 5 8 9 10 17 17 20 24 30 max revenue ri 1 5 8 10 13 17 18 22 25 30

)

(

max

)

,...,

,

max(

1 1 1 2 2 1 1 i n i n i n n n n n

p

r

p

r

     









Optimal Substructure

․

Optimal substructure: To solve the original problem, solve

subproblems on smaller sizes. The optimal solution to the original problem incorporates optimal solutions to the subproblems.

 We may solve the subproblems independently.

․

After making a cut, we have two subproblems.

 Max revenue r₇: r₇= 18 = r₄+ r₃= (r₂+ r₂) + r₃or r₁+ r₆

․

Decomposition with only one subproblem: Some cut gives a first

piece of length ion the left and a remaining piece of length n - ion

the right. length i 1 2 3 4 5 6 7 8 9 10 price pi 1 5 8 9 10 17 17 20 24 30 max revenue ri 1 5 8 10 13 17 18 22 25 30

)

,...,

,

max(

p

r

1

r

1

r

2

r

2

r

1

r

1

r

n



n



n



n n



)

(

max

1 i n i n i n

p

r

  





(18)

Y.-W. Chang

Unit 4 35

Recursive Top-Down “Solution”

․

Inefficient solution: Cut-Rod calls itself repeatedly, even on subproblems it has already solved!!

)

(

max

1 i n i n i n

p

r

  





Cut-Rod(p, n) 1. ifn == 0 2. return0 3. q= -∞ 4. fori=1 ton 5. q= max(q, p[i] + Cut-Rod(p, n - i)) 6. returnq        

_

  1 0 . 0 if ), ( 1 0 if , 1 ) ( n j n j T n n T Cut-Rod(p, 4) Cut-Rod(p, 3) Cut-Rod(p, 1)

T

(

n

) = 2

n Cut-Rod(p, 0) Overlapping subproblems?

Top-Down DP Cut-Rod with Memoization

․

Complexity: O(n2_{) time}

 Solve each subproblem just once, and solves subproblems for

sizes 0,1, …, n. To solve a subproblem of size n, the forloop iterates ntimes.

Memoized-Cut-Rod(p, n) 1. let r[0..n] be a new array 2.fori=0 to n

3. r[i] =-

4.returnMemoized-Cut-Rod -Aux(p, n, r) Memoized-Cut-Rod-Aux(p, n, r) 1. ifr[n]≥ 0 2. returnr[n] 3. ifn == 0 4. q= 0 5. elseq= -∞ 6. fori=1 ton

7. q= max(q, p[i] + Memoized-Cut-Rod -Aux(p, n-i, r)) 8. r[n]= q

// Each overlapping subproblem // is solved just once!!

(19)

Y.-W. Chang

Unit 4 37

Bottom-Up DP Cut-Rod

․

Complexity: O(n2_{) time}

 Sort the subproblems by size and solve smaller ones first.

 When solving a subproblem, have already solved the

smaller subproblems we need.

Bottom-Up-Cut-Rod(p, n) 1. let r[0..n] be a new array 2. r[0]= 0 3. forj=1 ton 4. q= -∞ 5. fori=1 toj 6. q= max(q, p[i] + r[j–i]) 7. r[j]= q 8. returnr[n]

Bottom-Up DP with Solution Construction

․

Extend the bottom-up approach to record not just optimal values, but optimal choices.

․

Saves the first cut made in

an optimal solution for a problem of size iin s[i].

Extended-Bottom-Up-Cut-Rod(p, n) 1. let r[0..n] and s[0..n] be new arrays 2. r[0]= 0 3. forj=1 ton 4. q= -∞ 5. fori=1 toj 6. ifq< p[i] + r[j–i] 7. q= p[i] + r[j–i] 8. s[j]= i 9. r[j]= q 10. returnr ands Print-Cut-Rod-Solution(p, n) 1. (r, s) = Extended-Bottom-Up-Cut-Rod(p, n) 2. whilen > 0 3. print s[n] 4. n= n–s[n] i 0 1 2 3 4 5 6 7 8 9 10 r[i] 0 1 5 8 10 13 17 18 22 25 30 s[i] 0 1 2 3 2 2 6 1 2 3 10 5 5 5 5 1 17

(20)

Y.-W. Chang

Unit 4 39

Appendix B:

Optimal Polygon Triangulation

․

Terminology: polygon, interior, exterior, boundary, convex polygon,

triangulation?

․

The Optimal Polygon Triangulation Problem: Given a convex polygon P= <v0, v1, …, vn-1> and a weight function wdefined on

triangles, find a triangulation that minimizes _{ }w().

․

One possible weight function on triangle:

w(vivjvk)=|vivj| + |vjvk| + |vkvi|,

where |vivj| is the Euclidean distance from vito vj.

Optimal Polygon Triangulation (cont'd)

․

Correspondence between full parenthesization, full binary tree

(parse tree), and triangulation  full parenthesization full binary tree

 full binary tree (n-1 leaves) triangulation (nsides)

(21)

Y.-W. Chang

Unit 4 41

Pseudocode: Optimal Polygon Triangulation

․

Matrix-Chain-Order is a special case of the optimal polygonal

triangulation problem.

․

Only need to modify Line 9 of Matrix-Chain-Order.

․

Complexity: Runs in (n3_{) time and uses}_₍_n2_{) space.}

Optimal-Polygon-Triangulation(P) 1. n =P.length 2. fori=1 ton 3. t[i,i] =0 4. forl=2 ton 5. fori=1 ton –l + 1 6. j=i+ l - 1 7. t[i, j] =



8. fork=itoj-1 9. q =t[i, k] + t[k+1, j ] + w(



vi-1vkvj) 10. if q < t[i, j] 11. t[i, j] = q 12. s[i,j] =k 13. returntands

Appendix C:

LCS for Flip-Chip Routing

driver pad bump pad ring net assignment

(1) (2)

․

Lee, Lin, and Chang, ICCAD-09

․

Given: A set of driver pads on driver pad rings, a set of

bump pads on bump pad rings, a set of nets/connections

․

Objective: Connect driver pads and bump pads

according to a predefined netlist such that the total

wirelength is minimized

(22)

Y.-W. Chang Unit 4 43 Sd 1 2 1 3 4 3 S_b 3 2 1 4

Minimize # of Detoured Nets by LCS

d₁ d2 d₃ d4 b1 b2 b3 b4 b1 b2 b3 b4 b1 b2 b3 b4 d₁ d2 d₃ d4 d1 d1 d3 d3 d₁ d2 d₃ d4 d1 d1 d3 d3 n1 n₂ n3 n₄ n2 n3 n4 n4 n3 n3 n2 n1 n1 n₁ n2 n3 n₄ n1 Sd=[1,2,1,3,4,3] S_b=[3,2,1,4] (1) (2) (3) (4) 0 0 0 0 0 0 0 0 1 1 2 2 3 3 0 1 1 2 2 2 2 0 0 1 1 1 1 1 0 0 0 0 1 1 1

net sequence

Need detour for only n3

․

Cut the rings into lines/segments (form linear orders)

#Detours Minimization by Dynamic Programming

Longest common subsequence (LCS) computation

d1 d2 d3

b₃ b₁ b₂ d1 d2 d3

b₃ b₁ b₂

seq.1=<1,2,3>

seq.2=<3,1,2> Common subseq = <3>

d1 d2 d3 b₃ b₁ b₂ (c) LCS= <1,2> (a) _(b) d1 d2 d3 b3 b1 b2 d1 d2 d3 b3 b1 b2 d1 d2 d3 b3 b1 b2 b4 b5 b6 b4 b5 b6 b4 b5 b6

(23)

Y.-W. Chang

Unit 7 45

Supowit's Algorithm for Finding MPSC

․

Supowit, “Finding a maximum planar subset of a set of nets

in a channel,” IEEE TCAD, 1987.

․

Problem: Given a set of chords, find a maximum planar subset of chords.

 Label the vertices on the circle 0 to 2n-1.

 Compute MIS(i, j): size of maximum independent set between

vertices iand j, i< j.

 Answer = MIS(0, 2n-1).

Vertices on the circle

Dynamic Programming in Supowit's Algorithm

(24)

Y.-W. Chang 47

Ring-by-Ring Routing

․

Decompose chip into

rings of pads

․

Initialize I/O-pad

sequences

․

Route from inner rings to

router rings

․

Exchange net order on

current ring to be

consistent with I/O pads’

 Keep applying LCSand

MPSC algorithms between

two adjacent rings to

minimize #detours _{current ring}

preceding ring Over 100X speedups over ILP

Global Routing Results

․

The global routing result of circuit fc2624

․

A routing path for each net is guided by a set of segments

(25)

Y.-W. Chang

Unit 4 49

Detailed Routing: Phase 1

․

Segment-by-segment routing in counter-clockwise order

․

As compacted as possible

Detailed Routing: Phase 2

․

Net-by-net re-routing in clockwise order

(26)

Y.-W. Chang

Unit 4 51

Appendix D:

Standard-Cell Based VLSI Design Style

(27)

Y.-W. Chang

Unit 4 53

Technology Mapping

․

Technology Mapping:

The optimization problem of

finding a minimum cost covering of the subject graph by

choosing from the collection of pattern graphs for all

gates in the library.

․

A

cover

is a collection of pattern graphs such that

every node of the subject graph is contained in one (or

more) of the pattern graphs.

․

The cover is further constrained so that each input

required by a pattern graph is actually an output of

some other pattern graph.

Trivial Covering

․

Mapped into 2-input NANDs and 1-input inverters.

․

8 2-input NAND-gates and 7 inverters for an area cost

of 23.

․

Best covering?

(28)

Y.-W. Chang

Unit 4 55

Optimal Tree Covering by Dynamic Programming

․

If the subject directed acyclic graph (DAG) is a tree,

then a polynomial-time algorithm to find the minimum

cover exists.

 Based on dynamic programming: optimal substructure?

overlapping subproblems?

․

Given: subject trees (networks to be mapped), library

cells

․

Consider a node

n

of the subject tree

 Recursive assumption: For all children of n, a best match

which implements the node is known.  Cost of a leaf is 0.

 Consider each pattern tree which matches at n, compute

cost as the cost of implementing each node which the pattern requires as an input plus the cost of the pattern.

 Choose the lowest-cost matching pattern to implement n.

Best Covering

․

A best covering with an area of 15.