CODE OPTIMIZATION TECHNIQUES

Academic year: 2020

Factors affecting optimization

The architecture of the target CPU:

Number of CPU registers.

RISC vs CISC.

Pipelines.

Number of functional units.

The architecture of the machine:

Cache size.

Cache/Memory transfer rates.

The machine itself:

General purpose use.

Special-purpose use (embedded systems).

General Purpose Operating System.

Code Optimization Techniques

We will discuss several code optimization techniques, including the following:

1. Dead Code Elimination

2. Constant Folding

3. Copy Propagation

4. Strength Reduction

5. Common Sub-Expression Elimination

6. Code Motion

Dead Code Elimination

Dead code elimination removes code that cannot be reached or whose results are never subsequently used.

For example, consider the following code fragment:

int count;

void foo() {
    int i;
    i = 1;        // dead code: i is not subsequently used
    count = 1;    // dead code: the value is overwritten before it is read
    count = 2;
    return;
    count = 3;    // dead code (unreachable): the function has already returned
}

After applying dead code elimination, we have the code below:

int count;

void foo() {
    count = 2;
    return;
}
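The same idea can be sketched in Java. This is a hand-written illustration, not compiler output, and the class and method names are assumptions made for the example. Java rejects statements placed after a return as a compile-time error, so the unreachable store from the C fragment is left out here.

```java
final class DeadCodeDemo {
    static int count;

    // Original shape: two of the stores below never influence the final state.
    static void fooOriginal() {
        int i;
        i = 1;        // dead: i is never read afterwards
        count = 1;    // dead: overwritten before it is read
        count = 2;
    }

    // After dead code elimination (done by hand): only the live store remains.
    static void fooOptimized() {
        count = 2;
    }

    public static void main(String[] args) {
        fooOriginal();
        int before = count;
        count = 0;
        fooOptimized();
        System.out.println(before + " " + count);
    }
}
```

Both methods leave count in the same state, which is the property the optimization has to preserve.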

Constant Folding

This refers to the technique of evaluating, at compile time, expressions whose operands are known to be constant.

It involves determining that all of the operands in an expression are constant values, evaluating the expression at compile time, and then replacing the expression with its value. For example, the expression

12 + 4 * 3

can be replaced by its result of 24 at compile time and omit the code as if the

Constant Propagation

In constant propagation, if a variable is assigned a constant value, then

subsequent use of that variable can be replaced by a constant as long as no

intervening assignment has changed the value of the variable.

For example, consider the code:

int x = 12;
int y = 7 - x / 2;
return y * (24 / x + 2);

Applying constant propagation to x, we have:

int x = 12;
int y = 7 - 12 / 2;
return y * (24 / 12 + 2);

Applying constant folding, we have:

int x = 12;
int y = 1;
return y * 4;
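As a minimal sketch in Java, the stages of this example can be written as separate methods and checked to return the same value. The class and method names are illustrative assumptions; a real compiler performs these rewrites internally, and the methods here only mirror the before and after forms by hand.

```java
final class ConstantPropagationDemo {
    // Original form: x is a variable holding a known constant.
    static int original() {
        int x = 12;
        int y = 7 - x / 2;
        return y * (24 / x + 2);
    }

    // After constant propagation: each use of x is replaced by 12.
    static int propagated() {
        int y = 7 - 12 / 2;
        return y * (24 / 12 + 2);
    }

    // After constant folding: the constant sub-expressions are evaluated by hand.
    static int folded() {
        int y = 1;       // 7 - 12 / 2
        return y * 4;    // 24 / 12 + 2
    }

    public static void main(String[] args) {
        System.out.println(original() + " " + propagated() + " " + folded());
    }
}
```

Propagating y = 1 and folding once more would reduce the whole body to a single constant return.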

Strength Reduction

Strength reduction, also called operator strength reduction, is the replacement of expensive expressions with cheaper and simpler ones.

For example, an add instruction can be used to replace a multiply instruction.

The code:

T2 = T2 * 2

can be replaced with:

T2 = T2 + T2
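A small Java sketch of the same replacement follows. The method names are assumptions for illustration, and the shift variant is an extra form not taken from the slides. For Java's int type all three forms wrap identically on overflow, so they are interchangeable.

```java
final class StrengthReductionDemo {
    // Original form: a multiply by the constant 2.
    static int byMultiply(int t) {
        return t * 2;
    }

    // Reduced form: an add replaces the multiply.
    static int byAdd(int t) {
        return t + t;
    }

    // Alternative reduced form: a left shift by one bit.
    static int byShift(int t) {
        return t << 1;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -7, 1000, Integer.MAX_VALUE};
        for (int t : samples) {
            System.out.println(byMultiply(t) + " " + byAdd(t) + " " + byShift(t));
        }
    }
}
```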

Common Sub-Expression Elimination

This is a code optimization technique that scans the code to find identical expressions and replaces each redundant occurrence with a single, previously computed value.

For example, consider the following fragment:

a = b * c + g;

d = b * c * e;

Here we see that the sub-expression b * c repeats, so we can transform the code to:

t = b * c;
a = t + g;
d = t * e;
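The transformation can be checked with a short Java sketch. The class, method names, and sample values are assumptions chosen for the example; the two methods mirror the fragment before and after the rewrite.

```java
import java.util.Arrays;

final class CseDemo {
    // Original form: b * c is computed twice.
    static int[] original(int b, int c, int e, int g) {
        int a = b * c + g;
        int d = b * c * e;
        return new int[] {a, d};
    }

    // After common sub-expression elimination: b * c is computed once into t.
    static int[] optimized(int b, int c, int e, int g) {
        int t = b * c;
        int a = t + g;
        int d = t * e;
        return new int[] {a, d};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(original(3, 4, 5, 6)));
        System.out.println(Arrays.toString(optimized(3, 4, 5, 6)));
    }
}
```

The rewrite is only valid because neither b nor c changes between the two uses of b * c.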

Code Motion

Code motion, also called loop-invariant code motion, moves a block of code outside a loop when executing it outside the loop gives the same result as executing it inside the loop.

Consider the example:

for (int i = 0; i < n; i++) {

x = y + z;

a[i] = 6 * i;

}

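In this code fragment, y + z has the same value on every iteration, so the assignment x = y + z can safely be moved outside the loop (provided the loop runs at least once, or x is not needed when it does not).

The resulting code would be:

x = y + z;
for (int i = 0; i < n; i++) {
    a[i] = 6 * i;
}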
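A hedged Java sketch of the hoisting is shown here. The class name, the static field x, and the explicit n > 0 guard are assumptions added for the example; the guard keeps x untouched when the loop body would never have run, which a plain hoist does not guarantee.

```java
final class CodeMotionDemo {
    static int x; // kept as a field so both versions can be compared

    // Original form: x = y + z is recomputed on every iteration.
    static int[] original(int y, int z, int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            x = y + z;
            a[i] = 6 * i;
        }
        return a;
    }

    // After loop-invariant code motion: the assignment runs once, before the loop.
    static int[] hoisted(int y, int z, int n) {
        int[] a = new int[n];
        if (n > 0) {          // guard: leave x unchanged if the loop would not execute
            x = y + z;
        }
        for (int i = 0; i < n; i++) {
            a[i] = 6 * i;
        }
        return a;
    }

    public static void main(String[] args) {
        int[] a = hoisted(2, 3, 4);
        System.out.println(x + " " + a[3]);
    }
}
```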

Inlining

Inlining, also referred to as function inlining or inline expansion, is the technique of replacing a function call with the actual body of the function.

This technique eliminates the overhead associated with the function call by expanding the body of the function inline.

Consider the fragment:

int add(int x, int y) {
    int z = x + y;
    return z;
}

int sub(int x, int y) {
    return add(x, -y);
}

We can expand the second function without calling the add function, so we have:

int sub(int x, int y) {
    return x + -y;
}
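In Java the two forms can be compared directly, as in the sketch below. The names subWithCall and subInlined are assumptions for the example, and the inlined version is written by hand. In practice a JIT compiler will often inline small methods like add on its own, so manual inlining is rarely needed.

```java
final class InliningDemo {
    static int add(int x, int y) {
        int z = x + y;
        return z;
    }

    // Original form: sub is implemented through a call to add.
    static int subWithCall(int x, int y) {
        return add(x, -y);
    }

    // After inlining: the body of add replaces the call.
    static int subInlined(int x, int y) {
        return x + -y;
    }

    public static void main(String[] args) {
        System.out.println(subWithCall(10, 3) + " " + subInlined(10, 3));
    }
}
```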

Is it faster to count down than it is to count up?

for (i = N - 1; i >= 0; i--)
    putchar('*');

is better than:

for (i = 0; i < N; i++)
    putchar('*');

Counting down to 0 is slightly faster than counting up from 0 to N because of how the hardware handles the comparison.

Note the comparison in each loop: i >= 0 versus i < N.

Most processors have a compare-with-zero instruction, so the first loop is translated to machine code as:

1. Load i
2. Compare and jump if less than zero

But the second loop needs to load N from memory each time:

1. Load i
2. Load N
3. Subtract N from i
4. Compare and jump if greater than or equal to zero

So the difference comes from the comparison with zero, not from counting down or up as such.

Examples

t1 = t1 + 1            // remove this (dead code elimination)
L0: t2 = 0
t3 = t1 * 8 + 1
t4 = t3 + t2           // remove this (copy propagation)
t5 = t4 * 4            // (t3 + t2) * 4, then remove the preceding line
t6 = t5
t7 = FP + t3           // remove this (dead code elimination)
*t7 = t2               // replace with *t7 = 0 (constant propagation)
t8 = t1                // remove this (dead code elimination)
if (t8 > 0) goto L1    // t8 is surely < 8 (constant folding)
L1: goto L0
L2: t1 = 1
t10 = 16
t11 = t1 * 2           // change to t1 + t1 (strength reduction)

Code after Optimization

L0:t2 = 0

t3 = t1 * 8 + 1

t5 = (t3+t2)*4

t6 = t5

Java codes

The Optimized code:

for (i = 0; i < 10; i++) {
    System.out.println(i * 10);
}

char x; int y; y = x;

i = 1; count = 1; count = 2;

float[] f = new float[100]; float sum = 0;
for (i = 0; i < 100; i++) {
    sum += f[i];
}

int i, sum = 0;
for (i = 1; i <= N; ++i) {
    sum += i;
}

Rewrite the following loop using a do…while structure:

for(i=0;i<100;i++)

{

Explain what type of code optimization can be applied to the following code:

public void selectionmethod()

{

int a = getValue();

if(a > 0)

{

int b = c + 10;

int d = b * 2;

System.out.println(d);

}

else if( a < 30 )

{

int b = c + 10;

int d = b * 2;

System.out.println(d);

}

Rewrite the following code in an optimized form:

public void emptymethod()

{

final int MIN = 0;

int i = 100;

if (i < MIN)

{

}

Rewrite the following code to remove unnecessary structure:

public void ifmethod()

{

if (true)

{

//Some Code ...

}

if (!true)

{

//Some Code ...

}

Rewrite the following code to remove the unnecessary if statement(s):

public boolean test(String value)

{

if(value.equals("Hello"))

{

return true;

}

else

{

return false;

}

Rewrite the following code to avoid the method call in the loop condition:

public void method()

{

String str = "Good morning";

for (int i = 0; i < str.length(); i++)

{

i++;

}

Data Access Optimizations: Data access

optimizations are code transformations, which

change the order in which iterations in a loop

nest are executed. The goal of these

transformations is mainly to improve

temporal locality. Moreover, they can also

expose parallelism and make loop iterations

vectorizable. Note that the data access

optimizations we present in this section

maintain all data dependencies and do not

change the results of the numerical

Data Layout Optimizations: Data layout

optimizations modify how data structures and

variables are arranged in memory. These

transformations aim at avoiding effects like cache

conflict misses and false sharing. They are further

intended to improve the spatial locality of a code.

Data layout optimizations include changing base

addresses of variables, modifying array sizes,

Java codes

The Optimization Techniques:

The Optimized code:

double sum =0;

double a[]=new double[1024];

double b[]=new double[1024];

for(int i=1; i<=1023;i++)

sum+=a[i]*b[i];

for(int i=1; i<=n;i++)

for(int j=1; j<=n;j++)

a[i][j]= b[i][j];

for(int i=1; i<=n;i++)

b[i]= a[i] +1.0;

Java codes

The Optimization Techniques:

The Optimized code:

double sum =0;

double a[][]=new double[n][n];

for(int i=1; i<=n;i++)

for(int j=1; j<=n;j++)

sum+=a[i][j];

double a[]=new double[1024];

double b[]=new double[1024];

Example: Accessing A Set-Associative Cache

Memory locations of the 4 × 4 array A, in storage order:

A[0][0]  A[0][1]  A[0][2]  A[0][3]
A[1][0]  A[1][1]  A[1][2]  A[1][3]
A[2][0]  A[2][1]  A[2][2]  A[2][3]
A[3][0]  A[3][1]  A[3][2]  A[3][3]

In this example each cache block holds one row of four elements, and the cache holds three blocks.

Column-wise traversal:

double sum = 0;
double a[][] = new double[n][n];
for (int j = 0; j < n; j++)
    for (int i = 0; i < n; i++)
        sum += a[i][j];

Accessing sequence: A[0][0], A[1][0], A[2][0], A[3][0], A[0][1], ...

1. Current access: A[0][0]. Miss! The block A[0][0], A[0][1], A[0][2], A[0][3] is loaded into the cache.

2. Current access: A[1][0]. Miss! The block A[1][0], A[1][1], A[1][2], A[1][3] is loaded into the cache.

3. Current access: A[2][0]. Miss! The block A[2][0], A[2][1], A[2][2], A[2][3] is loaded into the cache.

4. Current access: A[3][0]. Miss! The cache is full. Delete the first block (row 0), then get the data: the block A[3][0], A[3][1], A[3][2], A[3][3] takes its place.

5. Current access: A[0][1]. Miss! The cache is full. Delete the second block (row 1), then get the data: the block A[0][0], A[0][1], A[0][2], A[0][3] takes its place.

Cache contents after step 5:

A[3][0], A[3][1], A[3][2], A[3][3]
A[0][0], A[0][1], A[0][2], A[0][3]
A[2][0], A[2][1], A[2][2], A[2][3]

Oops: every access is a cache miss.

Example: Accessing A Set-Associative Cache

Row-wise traversal of the same array:

double sum = 0;
double a[][] = new double[n][n];
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        sum += a[i][j];

Accessing sequence: A[0][0], A[0][1], A[0][2], ...

1. Current access: A[0][0]. Miss! The block A[0][0], A[0][1], A[0][2], A[0][3] is loaded into the cache.

2. Current access: A[0][1]. Hit!

3. Current access: A[0][2]. Hit!

4. Current access: A[0][3]. Hit!

5. Current access: A[1][0]. Miss! The block A[1][0], A[1][1], A[1][2], A[1][3] is loaded into the cache.

6. Current access: A[1][1]. Hit!

7. Current access: A[1][2]. Hit!

8. Current access: A[1][3]. Hit!

Cache contents after step 8:

A[0][0], A[0][1], A[0][2], A[0][3]
A[1][0], A[1][1], A[1][2], A[1][3]

Wow: after the first miss on each block, every access is a cache hit.
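The two walkthroughs can be reproduced with a small simulation. This is a sketch under stated assumptions, not a model of real hardware: each block holds one row of the 4 × 4 array, the cache holds three blocks, and the oldest block is evicted first (FIFO), as the frames suggest. The class and method names are hypothetical. Note also that a Java double[][] is an array of row arrays, so the simulation follows the picture on the slides rather than actual JVM memory layout.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class CacheWalkthrough {
    static final int N = 4;         // 4 x 4 array, as in the slides
    static final int CAPACITY = 3;  // assumed: the cache holds three blocks

    // Counts cache misses for one full traversal of the array.
    // A block is identified by its row index, since one block holds one row.
    static int countMisses(boolean rowWise) {
        Deque<Integer> cache = new ArrayDeque<>(); // oldest block first (FIFO)
        int misses = 0;
        for (int outer = 0; outer < N; outer++) {
            for (int inner = 0; inner < N; inner++) {
                // Row-wise visits a[outer][inner]; column-wise visits a[inner][outer].
                int row = rowWise ? outer : inner;
                if (!cache.contains(row)) {
                    misses++;
                    if (cache.size() == CAPACITY) {
                        cache.removeFirst();   // evict the oldest block
                    }
                    cache.addLast(row);
                }
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        System.out.println("row-wise misses:    " + countMisses(true));
        System.out.println("column-wise misses: " + countMisses(false));
    }
}
```

Under these assumptions the row-wise loop misses once per row (4 misses in 16 accesses), while the column-wise loop misses on all 16 accesses, which is the contrast the frames illustrate.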
