• No results found

Analysis of Computer Algorithms. Algorithm. Algorithm, Data Structure, Program

N/A
N/A
Protected

Academic year: 2022

Share "Analysis of Computer Algorithms. Algorithm. Algorithm, Data Structure, Program"

Copied!
25
0
0

Loading.... (view fulltext now)

Full text

(1)

12/13/02 <Algorithm Theory> 1

Analysis of Computer Algorithms

Hiroaki Kobayashi

Algorithm

Input Output

Algorithm, Data Structure, Program

Algorithm

Well-defined, a finitestep-by-step computational procedure that takes some values, or set of values, as inputand produces some value, or set of values, as output.

• Procedure

a sequence of computational steps that transform the input into the output.

Finitenessof the steps of a procedure is not guaranteed.

• Program

algorithm represented in a program language

• Data structure

data structure suited for input, output, and temporal data during computing.

Passive objects

Program = Algorithm + Data Structure

(2)

12/13/02 <Algorithm Theory> 3

What is a Computable Problem

Computable (Solvable) problem…

There is an algorithm with finite

program stepsto solve the problem on a computer

Many different algorithms for a computable problem are available!

Analysis of algorithms is essential to design good algorithms Problems

Computable incomputable

How can we evaluate algorithms?

Measure running time of a program

• Most algorithms transform input objects into output objects.

• The running time of an algorithm typically grows with the input size.

• Average case time is often difficult to determine.

• We focus on the worst case running time.

– Easier to analyze

– Crucial to applications such as games, finance and robotics

0 20 40 60 80 100 120

Running Time

1000 2000 3000 4000 Input Size

best case average case worst case

(3)

12/13/02 <Algorithm Theory> 5

Experimental Studies

• Write a program implementing the algorithm

• Run the program with inputs of varying size and composition

• Use a method like System.currentTimeMillis()to get an accurate measure of the actual running time

• Plot the results

0 1000 2000 3000 4000 5000 6000 7000 8000 9000

0 50 100

Input Size

Time (ms)

Limitations of Experiments

• It is necessary to implement the algorithm, which may be difficult

• Results may not be indicative of the running time on other inputs not included in the experiment.

• In order to compare two algorithms, the same

hardware and software environments must be used

(4)

12/13/02 <Algorithm Theory> 7

Theoretical Analysis

• Uses a high-level description of the algorithm instead of an implementation

• Characterizes running time as a function of the input size, n .

• Takes into account all possible inputs

• Allows us to evaluate the speed of an algorithm independent of the hardware/software

environment

Theoretical Analysis (Cont’d)

Algorithm analysis needs an abstract machineas a measurefor evaluation of algorithms

Abstract Machines:

⇒ Very simple computer models for measurement of program execution steps and memory requirement

Turing Machine

Random Access Machine (RAM) Program correctness

Computation complexity evaluation in terms of speed and memory

(5)

12/13/02 <Algorithm Theory> 9

Abstract Machine Example : Turing Machine

Proposed by Alan M. Turing in 1936 (Pre-Electric-Computer Era)

• Processor

Finite State Machine

• Tape

Memory with unlimited cells

Blank or symbol

• Head

Head for tape reads and writes

Processor (Finite State

Machine) Head

Unlimited Tape (memory)

Cell

Turing Machine can perform any computation that any general–purpose computer can, but has a quite different architecture from conventional computers.

Pseudo-code

• High-level description of an algorithm

• More structured than English prose

• Less detailed than a program

• Preferred notation for describing algorithms

• Hides program design issues

Algorithm arrayMax(A, n) Input array A of n integers Output maximum element of A currentMaxA[0]

for i ← 1 to n − 1 do if A[i] > currentMax then

currentMaxA[i]

return currentMax

Example: Find max

element of an array

(6)

12/13/02 <Algorithm Theory> 11

Pseudocode Details

• Control flow

– ifthen[else…]

– whiledo– repeatuntil– fordo

– Indentation replaces braces

• Method declaration

Algorithm method (arg [, arg…]) Input

Output

• Method call

var.method (arg [, arg…])

• Return value returnexpression

• Expressions

← Assignment (like = in Java)

= Equality testing (like == in Java) n2Superscripts and other

mathematical formatting allowed

Von-Neumann Computer Model:

Base Model for Modern Computer Systems

Control Control

Datapath

(ALU, Reg)

Datapath

(ALU, Reg)

Memory

Input Input

Output Output Processor

Computer System

(7)

12/13/02 <Algorithm Theory> 13

The Random Access Machine (RAM) Model

• A CPU

• An potentially unbounded bank of memorycells, each of which can hold an arbitrary number or character

0 1 2

• Memory cells are numbered and accessing any cell in memory takes unit time.

Primitive Operations

• Basic computations performed by an algorithm

• Identifiable in pseudocode

• Largely independent from the programming language

• Exact definition not important

• Assumed to take a constant amount of time in the RAM model

• Examples:

– Evaluating an expression

– Assigning a value to a variable

– Indexing into an array

– Calling a method – Returning from a

method

(8)

12/13/02 <Algorithm Theory> 15

Analysis of Algorithm Complexity

• Execution Steps

– Number of primitive steps/operations to solve a given problem.

• Memory Requirement

– Total amount of memory required to hold inputs, outputs and temporal results

Example: Search a desired datum among N data – Case 1

• A data set of N elements is randomly stored.

– Case 2

• A data set of N elements is stored in an ascending/descending order.

Search Algorithms

• Case 1 Random stored data

10 8 2 5 6 1 9 3

1 2 3 5 6 8 9 10

• Case 2 Sorted data

Sequential search ・・・

Binary search

2 5 8

1 3 6 9

10 If smaller If larger

< log 8

(9)

12/13/02 <Algorithm Theory> 17

Summary of Search Algorithms

N logN log N

searches, N/2moves log N

log N Binary

search

N searches for N/2

search, N/2moves 1

Best 1, Worst N,

Average searchesN/2 Sequential

search

Initial data arrangement

for N times Delete data

Insert new Search data

N vs. Log N :As a Function of Input Data

N log

N

2 1

4 2

8 3

16 4

64 6

256 8

4096 12

65536 16

1048576 20

Very slow growth

Algorithms with log N steps are very good ones!

(10)

12/13/02 <Algorithm Theory> 19

Metrics for Complexity Analysis

• Many programs are extremely sensitive to their input data Algorithms are evaluated as a function of the number of input data.

• Two Metrics for Algorithm Analysis – Space Complexity

• Total amount of memory to store inputs, outputs, and temporal data – Time Complexity

• Number of (program) steps to solve the problem

So, What’s a Bad Algorithm…, yes, it is computable, but it needs the enormous amount of memory

the enormous number of execution steps

Complexity in Worst Case and Average Case

• Worst Case Complexity

– The worst-case execution complexity of an algorithm gives an upper bound on the execution steps for any input.

Evaluate the algorithm using the worst possible input configuration.

• Guarantee that the algorithm will never take any longer.

• For some algorithms, the worst case rarely occurs. In such a case, the worst case complexity does not reflect the actual program behavior.

• Average Complexity (Expected Complexity)

– Estimate the complexity of an algorithm with considering input

occurrence probability. Evaluate the algorithm using typical input data.

• Reflects the actual behavior of an algorithm, but hard to constitute average inputs for a particular problem.

Assume that all inputs of a given size are equally likely

(11)

12/13/02 <Algorithm Theory> 21

Estimating Running Time

• Assume that an algorithm executes 7n-1 primitive operations in the worst case. Define:

– a = Time taken by the fastest primitive operation – b = Time taken by the slowest primitive operation

• Let T(n) be worst-case time of the algorithm.

Then

– a(7n-1) ≤ T(n) ≤ b(7n-1)

• Hence, the running time T(n) is bounded by two linear functions.

Growth Rate of Running Time

• Changing the hardware/ software environment – Affects T(n) by a constant factor, but

– Does not alter the growth rate of T(n)

The linear growth rate of the running

time T(n) is an intrinsic property of

the algorithm

(12)

12/13/02 <Algorithm Theory> 23

Growth Rates

• Growth rates of functions:

– Linear ≈ n – Quadratic ≈ n2 – Cubic ≈ n3

• In a log-log chart, the slope of the line corresponds to the growth rate of the function

1E+0 1E+2 1E+4 1E+6 1E+8 1E+10 1E+12 1E+14 1E+16 1E+18 1E+20 1E+22 1E+24 1E+26 1E+28 1E+30

1E+0 1E+2 1E+4 1E+6 1E+8 1E+10 n

T(n)

Cubic Quadratic Linear

Constant Factors

• The growth rate is not affected by

– constant factors or – lower-order terms

1E+0 1E+2 1E+4 1E+6 1E+8 1E+10 1E+12 1E+14 1E+16 1E+18 1E+20 1E+22 1E+24 1E+26

1E+0 1E+2 1E+4 1E+6 1E+8 1E+10 n

T(n)

Quadratic Quadratic Linear Linear

• Examples

– 10

2

n + 10

5

is a linear function

– 10

5

n

2

+ 10

8

n is a

quadratic function

(13)

12/13/02 <Algorithm Theory> 25

• Big O-notationgives an upper bound for a function to within a constant factor.

– If there exist positive constant n0and c such that to the right of n0, the value of f(n) always lies on or below cg(n), i.e.,

f(n)≤cg(n) n: number of inputs

The complexity of a given algorithm : O(g(n))

Big O-notation: Quantitative and Approximate Analysis of Algorithm Complexity

Analysis of Algorithm Complexity using Big O-Notation

• Just consider the leading/dominant termof mathematical expression of algorithm complexity

n2, 2.5n2, 100n2, 1000+5n+2n2

O-notation gives approximate evaluation of algorithms…ignore details!!

Example: An algorithm with 2n2execution steps is roughly evaluated with a complexity of O(n2) steps

Free the analysis from considering the details of particular machine characteristics: independent of both inputs and implementation details

O(n

2

) !!

(14)

12/13/02 <Algorithm Theory> 27

Big-O Example

Example: 2n + 10 is O(n)

– 2n + 10 ≤ cn – (c − 2) n ≥ 10 – n ≥ 10/(c − 2)

1 10 100 1,000 10,000

1 10 100 1,000

n

3n 2n+10 n

–Pick c = 3 and n0 = 10

Big-O Example

Example:

function n

2

is not O(n) – n

2

≤ cn

– n ≤ c

1 10 100 1,000 10,000 100,000 1,000,000

1 10 100 1,000

n

n^2 100n 10n n

The above inequality cannot be satisfied since c must be a constant

(15)

12/13/02 <Algorithm Theory> 29

More Big-O Examples

7n-2 is O(n)

need c > 0 and n01 such that 7n-2 ≤ c•n for n ≥ n0 this is true for c = 7 and n0= 1

3n3+ 20n2+ 5 is O(n3)

need c > 0 and n01 such that 3n3+ 20n2+ 5 ≤ c•n3for n ≥ n0 this is true for c = 4 and n0= 21

3 log n + log log n is O(log n)

need c > 0 and n01 such that 3 log n + log log n ≤ c•log n for n ≥ n0 this is true for c = 4 and n0= 2

Big-Oh and Growth Rate

• The big-Oh notation gives an upper bound on the growth rate of a function

The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n)

• We can use the big-Oh notation to rank functions according to their growth rate

Yes No

f(n) grows more

No Yes

g(n) grows more

g(n) is O(f(n)) f(n) is O(g(n))

(16)

12/13/02 <Algorithm Theory> 31

Big-Oh Rules

If is f(n) a polynomial of degree d, then f(n) is O(n

d

), i.e.,

1. Drop lower-order terms 2. Drop constant factors

• Use the smallest possible class of functions – Say “2n is O(n)” instead of “2n is O(n

2

)”

• Use the simplest expression of the class

– Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”

Asymptotic Algorithm Analysis

• The asymptotic analysis of an algorithm determines the running time in big-Oh notation

• To perform the asymptotic analysis

– We find the worst-case number of primitive operations executed as a function of the input size

– We express this function with big-Oh notation

• Example:

– We determine that the algorithm executes at most 7n− 1 primitive operations

– We say that the algorithm “runs in O(n) time”

• Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations

(17)

12/13/02 <Algorithm Theory> 33

Good Algorithm or Bad Algorithm?:

Quantitative Definition

☺ Good Algorithms…

Time/Space Complexity is O(n), O(log n), O(nlog n), O(n2)…

Polynomial Growth of Computation Steps and/or memory requirementas the number of inputs increases

Bad Algorithms…

Time/Space Complexity is O(kn) where k is a constant.

Exponential Growth of Computation Steps and/or Memory Requirement as the number of inputs increases

Polynomial-time solvable problems are called easy problemsor tractable problems.

If algorithms with exponential growth of complexity are only solutions to a problem, the problem is called a hard problem or intractable problem.

Growth of Functions

n 10 20 50 100 200 500 1000 1000

O(1) 1 1 1 1 1 1 1 1

O(logn) 0.5 0.65 0.85 1 1.15 1.35 1.5 2

O(n) 0.1 0.2 0.5 1 2 5 10 100

O(nlogn) 0.05 0.13 0.42 1 2.3 6.75 15 200

O(n2) 0.01 0.04 0.25 1 4 25 100 2.78hours

O(n3) 0.001 0.008 0.125 1 8 125 1000 11.6days

O(2n) 0.001 1.02 34.8years - - - - -

n: number of inputs

Unit: seconds if not specified Using more powerful computers is a good

solution to hard problems?

(18)

12/13/02 <Algorithm Theory> 35

Computer Power Goes Up Steadily, and…

Pentium 4 2.4GHz

Source: Intel 750KHz

2014年

•64B Tr

•4.3GHz 2014年

•64B Tr

•4.3GHz

1+ GHz The Number of Transistors Per Chip

Doubles Every 18 Months!

The Number of Transistors Per Chip Doubles Every 18 Months!

Yes, We Have Powerful Computers at Low Cost, but…

Performance(ratio) 0.15MIPS(1) Memory(ratio 64KB(1)

Price 360M yen

Price/performance 1 1965

IBM System 360/50

2001

Pentium 4 PC

3792MIPS(25,280) 256MB(4,096) 250K yen

More than 36M times!

1977

DEC VAX11/780

1MIPS(6.7) 1MB(16) 72M yen 33 times

(19)

12/13/02 <Algorithm Theory> 37

It’s a mere drop in the bucket for hard problems…

Example

• For 4 algorithms A1,A2,A3,A4whose complexities are O(n), O(n2), O(n100), O(2n), respectively, evaluate increases in input size when an 100 times faster computer is available.

Assumptions

• A new computer M’ is 100 times faster than an old computer M.

• mxis a data size/sec on M with algorithmx

• mxis a data size/sec on M’ with algorithmx

Algorithms Time increase in Input size

Complexity M M'

A1 O(n) m1 100m1

A2 O(n2) m2 10m2

A3 O(n100) m3 1.047m3

A4 O(2n) m4 m4+6.644

Algorithm Analysis Example: MSP Problem

Minimum Spanning Tree Problem

– For given graph G(V, E), find an acyclic subset T⊆E that connects all of the vertices and whose total weight is minimized, where V is the set of vertices and E is the set of edges.

7 9

12 17

16 15 8

7 9

15 8

Minimum Spanning Tree

(20)

12/13/02 <Algorithm Theory> 39

MST Algorithm (Kruskal’s Algorithm)

Step 1

• Sort edges in an ascending order

Step 2

• Prepare v sets, each of which has each vertex as an entry, where v is the number of vertices.

Step 3

• Pick up an edge with the smallest weight.

• If two vertices of the edge belong to the different sets、the edge is the edge of the spanning tree, and merge the two sets to which these two vertices belong.

• Otherwise, discard the edge, and check the next smallest edge.

• This step is repeated until all the edges have gone.

Algorithm Behavior

1stiteration 2nditeration

3rditeration

sets merge

4thiteration 5thiteration Initial Sets

Not changed

Final MST merge

merge

weighted graph

(21)

12/13/02 <Algorithm Theory> 41

Complexity Analysis: MST Problem m=|E|, n=|V| where n < m ≤ n(n-1)/2

• Computational complexity for sorting

– Sort m elements m log m

• Computational complexity for set operations

– One set operation for n elements log n (Find vertices connected by a given edge and merge two sets)

m iterations of the set operation m log n

Total Computation Complexity O(mlog m) + O (mlog n) = O(mlog m) Space Complexity

Memory for graph O(m+n) O(m),

Working memory for set operations O(n) Total O(m) + O(n) O(m)

Complexity Analysis: Optimization Problem

• For n+1 natural number (a1, a2, a3, …, an, b), if there is a 0-1 vector x=(x1, x2, …, xn) that satisfies Σaixi=b, output yes, otherwise no.

– Example: x=(3, 7, 5, 8, 2), b=11

• Algorithm

– Represent x in n+1-bit number, and check throughout all the 0-1 combinations, e.g., from x=(000…0) to (111…1) . If there is a combination that satisfies the condition, output yes, otherwise no.

(22)

12/13/02 <Algorithm Theory> 43

Total computation complexity O(2n)× O(n) O(n 2n)

Space complexity O(n) for storing each of vectors a and x Total Space Complexity O(n)

Complexity Analysis: Optimization Problem

• Number of iterations to check throughout 000…0 to 111…1

• 2n+1 O(2n)

• Operation within one iteration: calculations for weighted sum and condition evaluation

– O(n)

Hard Problem!

Possible Solutions to Hard Problems

• Translate the hard problems to easy problems for which algorithms generating quasi-optimal solutions/output with polynomial complexityare available.

• Develop heuristic approaches that generate quasi-optimal solutions/output

– Neural Network – Genetic algorithm

• Develop parallel algorithms on parallel computers (brute- force!!)

(23)

12/13/02 <Algorithm Theory> 45

Neural Network

Cell Input

Output

Biological Neuron

Inputs

Output weight

Artificial Neuron

Classification Pattern Recognition Associative Memory

Trainingan artificial neural network with pairs of inputs and their correct outputs, instead of programming

Problem class where there is no good deterministic algorithm!!

Parallel Processing

Big Problem

Partitioning

Allocation

Interconnection Network P

Sub- problems

P P

Network Topologies

P: Processor

(24)

12/13/02 <Algorithm Theory> 47

Trend in Supercomputer Performance

2x/year

Single-CPU Performance, or Massive Parallelism ? (As of Nov. 02)

1 10 100 1000 10000 100000

0.0001 0.001 0.01 0.1 1 10

100MFLOPS 1GFLOPS

100GFLOPS 1TFLOPS

10TFLOPS 100TFLOPS 10GFLOPS

MasPar

NCube CM-2

IntelDelta

InteliPSC CM-5

Cray/SGI T3D

CP-PACS

SX-2 CRAY-3

Cray/SGI Y-MP

SX-4

SX-3 S-3800

VPP5000 SX-5 (Osaka) SX-7(Tohoku) ES(SX-6)

SP P2SC ASCI Red ASCI

white ASCI-Q

Linux Cluster IBMpSeries 690 SR8000

(Tokyo) ORIGIN 2000 ASCI Blue

Alpha SP3 SR2201 T3E

ASCI Purple in 05?

(25)

12/13/02 <Algorithm Theory> 49

Hard Problems are also Useful…

Example: Secure Networks with Cryptography

Sender Receiver

Encrypted Data Transmission Plain text

Encrypted text

key

Plain text

Encrypted text

key

Decryption without an encryption key leads to exponential computation complexity.

Secure data exchange

E-Commerce Intrusion by someone else is very hard!

Summary

• Complexity analysis of algorithms gives a clue to judge whether the algorithms are efficient or not.

– Time Complexity – Space Complexity

• Analysis of an algorithm is a cycle process of – Analyzing

– Estimating

– Refining the analysis until an answer to the desired level of detail is reached

• Design efficient algorithms with polynomial-time/space complexity of O(n), O(log n), O(n log n), or O(n2) !

References

Related documents

ROW ROW ROW ROW ROW ROW PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDC PRDCPRDC PRDC PRDC PRDCPRDC PRDC

Norway has not reported the following elements of the supplementary information required under Article 7, paragraph 2, of the Kyoto Protocol: information on what efforts Norway

considered this study especially useful for the desired purpose because Martin and col- leagues used latent indices scaled under the assumption of cross-national measurement

fresh properties of cement mortar using a large proportion of a nitrite-based accelerator and two types of high-range.. water-reducing agents, in a low-temperature

Preekscitacijski sindrom - WPW syndrom se na površinskom EKG-u karakteriše skra- ćenim PR intervalom, proširenim inicijal- nim delom QRS kompleksom (delta talas)

The reduced feed intake on the 0.25 % unripe noni powder and subsequent improvement on the 0.25 % ripe powder was not clear but palatability problems due to possible compos-

Individual effects of variation in seed tuber weight, sprout length and within-row spacing on COV of tuber size, COV of yield per plant and COV of number of tubers per plant at

In conclusion, the results presented in the thesis provide some evidence for a functional dissociation between the IL and the PL cortices. Lesions of the