GATE (CSE)_ADA & DSA.pdf

Academic year: 2021

• The logical and mathematical model of a particular organization of data is called a Data Structure.

• A Data Structure generally specifies the following things:

a) Organization of data

b) Accessing methods

c) Degree of associativity

d) Processing alternatives for information

(3)

Major classification:

• Linear Data Structure

• Non-Linear Data Structure

Linear:

• The values are arranged in a linear fashion, that is, in sequence.

• E.g. arrays, linked lists, stacks, queues, etc.

Non-Linear:

• The opposite of linear: the data values are not arranged in sequence.

• E.g. trees, graphs, tables, sets, etc.

(4)

• Homogenous

• Non-homogenous

• Dynamic

• Static

(5)

The Data Structure is a collection of different data types.

Major classification of data types:

• Primitive Data type

• Non- Primitive Data type

(6)

Primitive:

These are the basic data types defined by the programming language itself. A primitive type generally represents a single-valued datum.

With reference to C/C++, there are 4 primitive data types:

• Integer

• Floating point

• Character

• Void

Non-Primitive:

These are basically derived and User defined data types.

• Ex- Arrays, Structure, Union, Class, Files, Stack, Queue, Graphs, Trees,etc.

(7)

An array is a collection of elements with the following properties:

1. It is a collection of elements of the same data type.

2. It occupies a consecutive (sequential) set of memory locations.

3. It is referred to by a single variable name.

4. Its elements are accessed by index value.

Types of array:

• Single-dimensional, and

• Multidimensional.

(8)

Advantages:

• Retrieval of stored elements is efficient using index value.

• Searching techniques are very simple.

Disadvantages:

• Insertion and deletion at an arbitrary location are complicated.

• Storing the data requires a large contiguous free block of memory.

• Memory fragmentation occurs if elements are removed at random.

(9)

• A structure is a group of items in which each item is identified by its own identifier; each item is known as a member of the structure.

• Thus a structure is a collection of items of various data types under a single name.

• The syntax of a structure in C is as follows:

struct name {
    type1 data1;
    type2 data2;
    ...
    typen datan;
};

• The memory requirement of a structure is at least the sum of the sizes of all its data members; the compiler may add padding for alignment.

(10)

• Unions are very similar to structures, except for the way the member data is stored.

• In a union, the members share a common memory location. Thus, a union is used to save memory.

• The syntax of a union in C is as follows:

union name {
    type1 data1;
    type2 data2;
    ...
    typen datan;
};

• The memory requirement of a union is the size of its largest data member.
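The size rules for structures and unions can be checked with `sizeof`. The sketch below is only an illustration: the types `sample_struct` and `sample_union` and the helper functions are hypothetical names, not part of the slides. Exact sizes depend on the compiler and platform, so the code compares sizes instead of assuming fixed values.

```c
#include <stddef.h>

/* Hypothetical record types, used only for this illustration. */
struct sample_struct {
    char   tag;
    int    count;
    double value;
};

union sample_union {
    char   tag;
    int    count;
    double value;
};

/* Sum of the member sizes, ignoring any padding the compiler adds. */
static size_t member_sum(void)
{
    return sizeof(char) + sizeof(int) + sizeof(double);
}

/* Size of the largest member. */
static size_t largest_member(void)
{
    size_t m = sizeof(char);
    if (sizeof(int) > m)
        m = sizeof(int);
    if (sizeof(double) > m)
        m = sizeof(double);
    return m;
}
```

On a typical platform `sizeof(struct sample_struct)` is larger than `member_sum()` because of alignment padding, while `sizeof(union sample_union)` equals `largest_member()`.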

(11)

DOS.h

union REGS

{

struct WORDREGS x;

struct BYTEREGS h;

};

Union…..

(12)

• A function is a set of instructions that carries out a particular task.

• After its execution it can return a single value.

• We can also pass parameters to a function.

• Classification:

• Standard functions (library / built-in)

• User-defined functions

(13)
(14)

• An algorithm is a computational method for solving a problem.

• It is a sequence of steps that take us from the input to the output.

An algorithm must be

• Correct: It should provide a correct solution according to the specifications.

• Finite: It should terminate.

• General: It should work for every instance of a problem

• Efficient: It should use few resources (such as time or memory).

(15)

• Analysis of algorithms is the study of their efficiency.

• It determines the amount of resources necessary to execute an algorithm.

• Most algorithms are designed to work with inputs of arbitrary length.

• The analysis quantifies the resources required.

(16)

• Measures of resource utilization (efficiency):

– Execution time → time complexity

– Memory space → space complexity

• Observation:

– The larger the input data, the greater the resource requirement.

– Complexities are functions of the amount of input data (input size).

(17)

Space Complexity

• Space complexity is defined as the amount of memory a program needs to run to completion.

• Space complexity = Instruction space + Data space + …

(18)

Time Complexity

• Time complexity is the amount of computer time a program needs to run.

• How do we measure it?

1. Count a particular operation (operation counts)

2. Count the number of steps (step counts)

(19)

Asymptotic Notation

• Describes the behavior of the time or space complexity for large instance characteristics.

• The major notations are:

– Big Oh (O) notation provides an upper bound for the function.

– Omega (Ω) notation provides a lower bound.

– Theta (Θ) notation is used when an algorithm can be bounded both from above and below by the same function.

(20)

Upper Bounds-

Big Oh (O)

• Time complexity T(n) is a function of the problem size n.

• The value of T(n) is the running time of the algorithm in the worst case, i.e., the number of steps it requires at most with an arbitrary input.

• The order is denoted by a complexity class using the Big Oh (O) notation.

• Definition: f(n) = O(g(n)) (read as “f(n) is Big Oh of g(n)”) iff positive constants c and n0 exist such that f(n) ≤ c·g(n) for all n ≥ n0.

• That is, O(g) comprises all functions f for which there exist a constant c and a number n0 such that f(n) is smaller than or equal to c·g(n) for all n ≥ n0.

(21)

Big Oh (O)……….

(22)

Big Oh

Examples

• Bubble Sort: $T(n) = O(n^2)$.

• Linear Search: $T(n) = O(n)$.

• $2n^2 = O(n^2)$.

• $7n^2 + 5n + 1000 = O(n^2)$.
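As a worked check of the last example against the definition of Big Oh, one possible choice of constants (an illustration; many other choices work) is $c = 1012$ and $n_0 = 1$. For $n \ge 1$ we have $n \le n^2$ and $1 \le n^2$, so:

```latex
\begin{aligned}
7n^2 + 5n + 1000 &\le 7n^2 + 5n^2 + 1000n^2 \\
                 &= 1012\,n^2 \qquad \text{for all } n \ge 1 .
\end{aligned}
```

Hence $f(n) \le c \cdot g(n)$ holds with $g(n) = n^2$, $c = 1012$ and $n_0 = 1$.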

(23)

Lower Bounds-

Omega (Ω)

• For a problem, the lower bound is a certain number of steps that every algorithm has to execute at least in order to solve the problem.

• Definition: f(n) = Ω(g(n)) (read as “f(n) is omega of g(n)”) iff positive constants c and n0 exist such that f(n) ≥ c·g(n) for all n ≥ n0.

• That is, Ω(g) comprises all functions f for which there exist a constant c and a number n0 such that f(n) is greater than or equal to c·g(n) for all n ≥ n0.

(24)

Omega (Ω)……..

(25)

Omega (Ω) Examples

Linear Search : T(n) = Ω(1).

Bubble Sort :T(n) = Ω(n).

(26)

Tight Bounds-

Theta (Θ)

• Used when the function f can be bounded both from above and below by the same function g.

• Definition: f(n) = Θ(g(n)) (read as “f(n) is theta of g(n)”) iff positive constants c1, c2 and n0 exist such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.

• That is, f lies between c1 times the function g and c2 times the function g, except possibly when n is smaller than n0.

• Theta (Θ) examples:

– Find Max / Min: T(n) = Θ(n)

(27)

Theta (Θ)………

(28)

Relations between Ω, Θ, O

Theorem:

For any two functions g(n) and f(n),

f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)),

i.e., Θ(g(n)) = O(g(n)) ∩ Ω(g(n))

• In practice, asymptotically tight bounds are obtained from asymptotic upper and lower bounds.

(30)

Common Growth Rate Functions

• 1 (constant): growth is independent of the problem size n.

• $\log_2 N$ (logarithmic): grows slowly compared to the problem size (binary search).

• $N$ (linear): directly proportional to the size of the problem.

• $N \log_2 N$ ("n log n"): typical of some divide-and-conquer approaches (merge sort).

• $N^2$ (quadratic): typical of nested loops.

• $N^3$ (cubic): more nested loops.

(31)

Practical Complexities

log n   n     n log n   n^2     n^3       2^n
0       1     0         1       1         2
1       2     2         4       8         4
2       4     8         16      64        16
3       8     24        64      512       256
4       16    64        256     4096      65536
5       32    160       1024    32768     4294967296
7       100   700       10000   1000000   1267650600228229401496703205376

(32)
(33)
(34)

• A method of programming whereby a function directly or indirectly calls itself.

• Recursion is often presented as an alternative to iteration.

• A recursive function performs a task by calling itself repeatedly.

• A recursive function is appropriate where a task can be performed by applying the same set of statements repeatedly.

• The data structure used by recursion is the stack.

(35)

Example of Recursion

Recursion for finding the factorial of a given number:

int fact(int x)
{
    if (x == 0)
        return 1;
    else
        return x * fact(x - 1);
}

(36)

Recursion –

how it works?

• To see how the recursion works, let's break down the factorial function to solve factorial(3).
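Written out as a chain of calls, using the recursive definition of factorial given earlier (base case factorial(0) = 1), the evaluation unfolds as follows:

```latex
\begin{aligned}
\mathrm{factorial}(3) &= 3 \cdot \mathrm{factorial}(2) \\
                      &= 3 \cdot \bigl(2 \cdot \mathrm{factorial}(1)\bigr) \\
                      &= 3 \cdot \bigl(2 \cdot (1 \cdot \mathrm{factorial}(0))\bigr) \\
                      &= 3 \cdot 2 \cdot 1 \cdot 1 = 6 .
\end{aligned}
```

Each line corresponds to one more stack frame; the multiplications are performed only as the calls return.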

(37)

Recursion Tree

• A tree representation of recursive calls.

• A method to analyze the complexity of an algorithm by diagramming the recursive function calls.

(38)

How to Build a Recursion Tree ?

• Root = the initial call

• Each node = a particular call

• Each new call becomes a child of the node that called it

• A tree branch (solid line) = a call-return path between any 2 call instances

(39)

Example - Fibonacci Numbers

Fibonacci (n)

IF (n <= 1)

RETURN n

ELSE

RETURN Fibonacci (n-1) + Fibonacci (n-2)
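The pseudocode above translates directly to C. In this sketch the global counter `fib_calls` is a hypothetical addition, not part of the slides; it counts the nodes of the recursion tree, which makes the rapid growth in the number of calls visible.

```c
/* Hypothetical counter: number of calls made, i.e. nodes in the recursion tree. */
static unsigned long fib_calls = 0;

/* Direct translation of the Fibonacci pseudocode. */
unsigned long fibonacci(unsigned int n)
{
    fib_calls++;
    if (n <= 1)
        return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}
```

Resetting `fib_calls` to 0 and calling `fibonacci(5)` makes 15 calls in total, one per node of the recursion tree for n = 5.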

(40)
(41)

Types of Recursive Functions

• Direct Recursion.

• Indirect Recursion.

• Linear Recursion

• Tail Recursion

• Binary Recursion

• Exponential Recursion

• Nested Recursion

• Mutual Recursion

(42)

Direct Recursion

int factorial(int x)
{
    if (x == 0)
        return 1;
    else
        return x * factorial(x - 1);
}

(43)

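Indirect Recursion

#include <stdio.h>

void fun2(void);   /* forward declaration: fun1 calls fun2 before its definition */

void fun1(void)
{
    static int i = 0;
    if (i++ < 5)   /* count the calls so that the recursion terminates */
        fun2();
}

void fun2(void)
{
    printf("Recursion from fun2 to fun1 which is indirect recursion\n");
    fun1();
}

int main(void)
{
    fun1();
    return 0;
}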

(44)

• A linear recursive function is a function that only makes a single call to itself each time the function runs

– (as opposed to one that would call itself multiple times during its execution).

• Thus, if we were to draw out the recursive calls, we would see a straight, or linear, path.

• The factorial function is a good example of linear recursion.

(45)

//C++

int factorial(int n)
{
    if (n == 0)
        return 1;
    return n * factorial(n - 1);   // or factorial(n-1) * n
}

(46)
(47)

• A recursive procedure where the recursive call is the last action to be taken by the function.

• Tail recursive functions are generally easy to transform into iterative functions.

• Tail recursion is a form of linear recursion.

• Often, the value of the recursive call is returned.

• A tail recursive function is one where every recursive call is the last thing done by the function before returning and thus produces the function’s value.

(48)

• As such, tail recursive functions can often be easily implemented in an iterative manner;

• by taking out the recursive call and replacing it with a loop, the same effect can generally be achieved.

• A good compiler can recognize tail recursion and

convert it to iteration in order to optimize the performance of the code.

(49)

• To compute the GCD, or Greatest Common Divisor, of two numbers:

//C++

int gcd(int m, int n)
{
    int r;
    if (m < n)
        return gcd(n, m);
    r = m % n;
    if (r == 0)
        return n;
    else
        return gcd(n, r);
}
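Because every recursive call in `gcd` is the last action of the function, the recursion can be replaced by a loop. The following is a minimal sketch of that transformation; the name `gcd_iter` is hypothetical, and both arguments are assumed to be positive.

```c
/* Iterative form of the tail-recursive gcd; assumes m > 0 and n > 0. */
int gcd_iter(int m, int n)
{
    int r;
    if (m < n) {               /* same effect as the call gcd(n, m) */
        int t = m;
        m = n;
        n = t;
    }
    while ((r = m % n) != 0) { /* same effect as the call gcd(n, r) */
        m = n;
        n = r;
    }
    return n;
}
```

Each loop iteration plays the role of one recursive call, but no new stack frame is created.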

Tail Recursion….

(50)

• Is the factorial method a tail recursive method?

int fact(int x)
{
    if (x == 0)
        return 1;
    else
        return x * fact(x - 1);
}

• When returning from a recursive call, there is still one pending operation: the multiplication.

• Therefore, factorial is a non-tail-recursive method.

(51)

• Is this method tail recursive?

void fun1(int i)
{
    if (i > 0) {
        printf("%d ", i);
        fun1(i - 1);
    }
}

• Yes, it is tail recursive: the recursive call is the last action performed.

Tail Recursion….

(52)

• Is the following program tail recursive?

void prog(int i)
{
    if (i > 0) {
        prog(i - 1);
        printf("%d ", i);
        prog(i - 1);
    }
}

• No, because there is an earlier recursive call in addition to the last one.

• In tail recursion, the recursive call should be the last statement, and there should be no earlier recursive calls, whether direct or indirect.

(53)

Advantage of Tail Recursive Method

• Tail Recursive methods are easy to convert to iterative.

• Smart compilers can detect tail recursion and convert it to iterative to optimize code

• Used to implement loops in languages that do not support loop structures explicitly (e.g. prolog)

void tail(int i)
{
    if (i > 0) {
        printf("%d ", i);
        tail(i - 1);
    }
}

void iterative(int i)
{
    for (; i > 0; i--)
        printf("%d ", i);
}

(54)

• A binary recursive function is one that calls itself twice during the course of its execution.

• The mathematical combinations operation is a good example of a function that can quickly be implemented as a binary recursive function. The number of combinations is often written nCk, where we are choosing k elements out of a set of n elements.

(55)

//C++

int choose (int n, int k)

{

if (k == 0 || n == k)

return(1);

else

return(choose(n-1,k) + choose(n-1,k-1));

}

(56)

• Recursion where more than one call is made to the function from within itself. This leads to exponential growth in the number of recursive calls.

• An exponential recursive function is one that, if you were to draw out a representation of all the function calls, would have an exponential number of calls in relation to the size of the data set

– (exponential meaning that if there were n elements, there would be $O(a^n)$ function calls, where a is a constant greater than 1).

• A good example of an exponentially recursive function is a function to compute all the permutations of a data set.

(57)

Exponential Recursion… - Example

void print_array(int arr[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

void print_permutations(int arr[], int n, int i)
{
    int j, swap;
    if (i >= n - 1) {           /* every position is fixed: one complete permutation */
        print_array(arr, n);
        return;
    }
    for (j = i; j < n; j++) {   /* j == i keeps arr[i] in place */
        swap = arr[i]; arr[i] = arr[j]; arr[j] = swap;
        print_permutations(arr, n, i + 1);
        swap = arr[i]; arr[i] = arr[j]; arr[j] = swap;   /* undo the swap */
    }
}

(58)

• In nested recursion, one of the arguments to the recursive function is itself a call to the recursive function.

• These functions tend to grow extremely fast.

• A good example is the classic mathematical function, Ackermann's function.

• It grows very quickly (even for small values of x and y, Ackermann(x,y) is extremely large) and it cannot be computed with only definite iteration (a completely defined for() loop for example); it requires indefinite iteration (recursion, for example).

(59)

//C++

int ackerman(int m, int n)
{
    if (m == 0)
        return n + 1;
    else if (n == 0)
        return ackerman(m - 1, 1);
    else
        return ackerman(m - 1, ackerman(m, n - 1));
}

(60)

• A recursive function doesn't necessarily need to call itself.

• Some recursive functions work in pairs or even larger groups.

• For example, function A calls function B which calls function C which in turn calls function A.

• A simple example of mutual recursion is a set of function to determine whether an integer is even or odd. How do we know if a number is even? Well, we know 0 is even. And we also know that if a number n is even, then n - 1 must be odd. How do we know if a number is odd? It's not even!

(61)

//C++

int is_odd(unsigned int n);   // declared first because is_even calls it

int is_even(unsigned int n)
{
    if (n == 0)
        return 1;
    else
        return is_odd(n - 1);
}

int is_odd(unsigned int n)
{
    return !is_even(n);
}

(62)

Recursions…….

• Recursion is a powerful problem-solving technique that often produces very clean solutions to even the most complex problems.

• Recursive solutions can be easier to understand and to describe than iterative solutions.

• Recursion works the best when the algorithm and/or data structure that is used naturally supports recursion.

• One such data structure is the tree; one such algorithm is the binary search algorithm.

(63)

• Recursive solutions may involve extensive overhead because they use calls.

• When a call is made, it takes time to build a stackframe and push it onto the system stack.

• Conversely, when a return is executed, the stackframe must be popped from the stack and the local variables reset to their previous values – this also takes time.

• In general, recursive algorithms run slower than their iterative counterparts.

• Also, every time we make a call, we must use some of the memory resources to make room for the stackframe.

(64)

• Therefore, if the recursion is deep, say, factorial(1000), we may run out of memory.

• Because of this, it is usually best to develop iterative algorithms when we are working with large numbers.

(65)

PROS

– Clearer logic

– Often more compact code – Often easier to modify

– Allows for complete analysis of runtime performance

CONS

– Overhead costs – Time Consuming

– Additional Memory Requirement.

(66)

Sample Question-1

Consider following recursive function in C/ C++

int func (int n)

{

static int i=1;

if (n>=5) return n;

n=n + 1;

i++;

int x = func (n);

Stmt : return n;

}

(67)

1. The value returned by func(1) is -

A. 5 B. 6 C. 7 D. 2

2. How many times will the statement labelled Stmt execute in the above code?

A. 1 B. 5 C. 4 D. 0

3. What will be the final value of i?

A. 1 B. 5 C. 4 D. 0

(68)

4. What would be the final value of i if we removed the keyword static from the code?

A. 1 B. 2 C. 3 D. 4 E. 5

5. The value returned by func(5) is -

A. 5 B. 6 C. 7 D. 2

Answers: 1. 2 (D) 2. 4 (C) 3. 5 (B) 4. 2 (B) 5. 5 (A)

(70)

Sample Question-2

Consider the following recursive function in C/C++:

int func(int a, int b)
{
    if (b == 0)
        return a;
    else
        return 1 + func(a, b - 1);
}

1. The value returned by func(2,3) is -

A. 5 B. 1 C. 3 D. 2 E. None

2. The value returned by func(1,2) is -

A. 5 B. 1 C. 3 D. 2 E. None

Answers: 1. 5 (A) 2. 3 (C)

(72)

Sample Question-3

Find the output of the following C/C++ code:

void f(void)
{
    int s = 0;
    s++;
    if (s == 10)
        return;
    printf("%d ", s);
    f();
}

int main(void)
{
    f();
    return 0;
}

Output: 1 1 1 … (infinite: s is re-initialized to 0 on every call, so the base case is never reached)

(73)

Sample Question-4

Find the output of the following C/C++ code:

void f(void)
{
    static int s = 0;
    s++;
    if (s == 10)
        return;
    printf("%d ", s);
    f();
}

int main(void)
{
    f();
    return 0;
}

Output: 1 2 3 4 5 6 7 8 9

(74)

Sample Question-5

Find the output of the following C/C++ code:

void f(int i)
{
    if (i < 10) {
        f(i + 1);
        printf("%d ", i);
    }
}

int main(void)
{
    f(0);
    return 0;
}

Output: 9 8 7 6 5 4 3 2 1 0

(75)

Sample Question-6

Find the output of the following C++ code:

#include <iostream>
using namespace std;

void func6()
{
    char ch;
    cout << "Enter a character ('.' to end program): ";
    cin >> ch;
    if (ch != '.') {
        func6();
        cout << ch;
    }
}

int main()
{
    func6();
    cout << "\n";
    return 0;
}

For input chars: a b c d e f .
Output chars: fedcba

(76)

Sample Question-7

In the Tower of Hanoi problem, how many disk moves are required to shift 5 disks?

A. 27 B. 32 C. 31 D. 28

Answer: 31 (C)

(77)

Legend has it that there were three diamond needles set into the floor of the temple of Brahma in Hanoi.

Stacked upon the leftmost needle were 64 golden disks, each a different size, stacked in concentric order.

The priests were to transfer the disks from the first needle to the second needle, using the third as necessary.

• But they could only move one disk at a time, and could never put a larger disk on top of a smaller one.

• When they completed this task, the world would end!

(79)

Basis: What is an instance of the problem that is trivial?

For n == 1

Since this base case could occur when the disk is on any needle, we simply output the instruction to move the top disk from src to dest.


(81)

Induction Step: n > 1

→ How can recursion help us out?

a. Recursively move n-1 disks from src to aux.

b. Move the one remaining disk from src to dest.

c. Recursively move n-1 disks from aux to dest.

d. We're done!

(85)

Tower of Hanoi – Algorithm

We can combine these steps into the following algorithm:

0. Receive n, src, dest, aux.
1. If n > 1:
   a. Move(n-1, src, aux, dest);
   b. Move(1, src, dest, aux);
   c. Move(n-1, aux, dest, src);
   Else
   Display "Move the top disk from ", src, " to ", dest.
   End if.
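The algorithm above can be sketched in C as follows. The function name `hanoi`, the `char` needle labels, the `print` flag and the returned move count are assumptions made for this illustration; like the algorithm, the sketch assumes n >= 1.

```c
#include <stdio.h>

/* Move n disks (n >= 1) from src to dest, using aux as the spare needle.
   Returns the number of single-disk moves performed.
   When print is non-zero, each move is also displayed. */
unsigned long long hanoi(int n, char src, char dest, char aux, int print)
{
    unsigned long long moves = 0;
    if (n > 1) {
        moves += hanoi(n - 1, src, aux, dest, print);  /* step a */
        moves += hanoi(1, src, dest, aux, print);      /* step b */
        moves += hanoi(n - 1, aux, dest, src, print);  /* step c */
    } else {
        if (print)
            printf("Move the top disk from %c to %c\n", src, dest);
        moves = 1;
    }
    return moves;
}
```

For example, `hanoi(5, 'A', 'B', 'C', 0)` returns 31, matching the answer to Sample Question-7.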

(86)

Let's see how many "moves" it takes to solve this problem, as a function of n, the number of disks to be moved.

n     Number of disk-moves required
1     1
2     3
3     7
4     15
5     31
...
i     2^i - 1
64    2^64 - 1 (a big number)

(87)

Tower of Hanoi – Analysis…..

How big?

Suppose that our computer and "super-printer" can generate and print 1,048,576 ($2^{20}$) instructions/second.

There are about $2^{64}$ instructions to print.

– Then it will take $2^{64} / 2^{20} = 2^{44}$ seconds to print them.

– Then it will take ≈ $2^{44} / 2^{6} = 2^{38}$ minutes to print them.

– Then it will take ≈ $2^{38} / 2^{6} = 2^{32}$ hours to print them.

– Then it will take ≈ $2^{32} / 2^{5} = 2^{27}$ days to print them.

– Then it will take ≈ $2^{27} / 2^{9} = 2^{18}$ years to print them.

• 1 century == 100 years.

– Then it will take ≈ $2^{18} / 2^{7} = 2^{11}$ centuries to print them.

• 1 millennium == 10 centuries.

– Then it will take ≈ $2^{11} / 2^{4} = 2^{7} = 128$ millennia.

(88)
(89)

Linear Search

• Searching is the process of determining whether or not a given value exists in a data structure or a storage medium.

• We discuss two searching methods on one-dimensional arrays: linear search and binary search.

• The linear (or sequential) search algorithm on an array is:

– Sequentially scan the array, comparing each array item with the searched value.

– If a match is found, return the index of the matched element; otherwise return –1.

• Note: linear search can be applied to both sorted and unsorted arrays.

(90)

Linear Search…..

//Function

public static int linearSearch(Object[] array, Object key)
{
    for (int k = 0; k < array.length; k++)
        if (array[k].equals(key))
            return k;
    return -1;
}

(91)

Linear Search……

/* Linear search */ for ( i=0; i < N ; i++) {

if( keynum == array[i] ) { found = 1; break; } } if ( found == 1) printf("SUCCESSFUL SEARCH\n"); else

(92)

Linear Search……

• Data structure - Array

• Worst case performance O(n)

• Best case performance O(1)
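A self-contained C version of linear search might look like the sketch below. The function name `linear_search` and its signature are assumptions for illustration; it returns the index of the first match, or -1 when the key is absent.

```c
/* Returns the index of the first element equal to keynum, or -1 if none. */
int linear_search(const int array[], int n, int keynum)
{
    for (int i = 0; i < n; i++) {
        if (array[i] == keynum)
            return i;        /* best case: found at index 0 after one comparison */
    }
    return -1;               /* worst case: all n elements were compared */
}
```

The best case O(1) occurs when the key is the first element; the worst case O(n) occurs when the key is last or missing.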

(93)

Bubble Sort

• Sorting takes an unordered collection and makes it an ordered one.

• Bubble sort algorithm:

– Compare adjacent elements. If the first is greater than the second, swap them.

– Do this for each pair of adjacent elements, starting with the first two and ending with the last two. At this point the last element should be the

greatest.

– Repeat the steps for all elements except the last one.

– Keep repeating for one fewer element each time, until you have no more pairs to compare

(94)

Bubble Sort…..

for i = 1:n,
    swapped = false
    for j = n:i+1,
        if a[j] < a[j-1],
            swap a[j,j-1]
            swapped = true
    break if not swapped
end

(95)

Bubble Sort……

• Data structure - Array

• Worst case performance $O(n^2)$

• Best case performance $O(n)$

• Average case performance $O(n^2)$
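The O(n) best case depends on stopping as soon as a pass makes no swaps. The sketch below is one way to write that in C; the name `bubble_sort` and the returned pass count are assumptions added to make the early exit observable. Unlike the pseudocode above, it bubbles the largest element toward the end of the array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sorts a[0..n-1] in ascending order; returns the number of passes made. */
size_t bubble_sort(int a[], size_t n)
{
    size_t passes = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        bool swapped = false;
        passes++;
        /* After pass i, the last i+1 elements are in their final positions. */
        for (size_t j = 0; j + 1 < n - i; j++) {
            if (a[j] > a[j + 1]) {
                int temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
                swapped = true;
            }
        }
        if (!swapped)   /* no swaps: the array is already sorted */
            break;
    }
    return passes;
}
```

On an already sorted array the function makes a single pass of n-1 comparisons and stops.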

(96)

Bubble Sort……

#include <stdio.h>

#define MAXSIZE 10

int main(void)
{
    int array[MAXSIZE];
    int i, j, N, temp;

    printf("Enter the value of N\n");
    if (scanf("%d", &N) != 1 || N < 1 || N > MAXSIZE)
        return 1;   /* N must fit in the array */

    printf("Enter the elements one by one\n");
    for (i = 0; i < N; i++)
        scanf("%d", &array[i]);

    printf("Input array is\n");
    for (i = 0; i < N; i++)
        printf("%d\n", array[i]);

    /* Bubble sorting begins */
    for (i = 0; i < N; i++) {
        for (j = 0; j < (N - i - 1); j++) {
            if (array[j] > array[j + 1]) {
                temp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = temp;
            }
        }
    }

    printf("Sorted array is...\n");
    for (i = 0; i < N; i++)
        printf("%d\n", array[i]);

    return 0;
}

(97)

Selection Sort

• Algorithm:

– Pass through the elements sequentially;

– In the ith pass, select the element with the lowest value in A[i] through A[n], then swap it with A[i].

(98)

Selection Sort…..

for i = 1:n,
    k = i
    for j = i+1:n,
        if a[j] < a[k], k = j
    → invariant: a[k] smallest of a[i..n]
    swap a[i,k]
    → invariant: a[1..i] in final position
end

(99)

Selection Sort…..

• Not stable

• O(1) extra space

• $\Theta(n^2)$ comparisons

• $\Theta(n)$ swaps

• Not adaptive
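The selection sort pseudocode can be sketched in C with 0-based indexing as follows. The name `selection_sort` is an assumption for illustration.

```c
#include <stddef.h>

/* Sorts a[0..n-1] in ascending order by repeatedly selecting the minimum. */
void selection_sort(int a[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        size_t k = i;                     /* index of the smallest value seen so far */
        for (size_t j = i + 1; j < n; j++) {
            if (a[j] < a[k])
                k = j;
        }
        /* invariant: a[k] is the smallest of a[i..n-1] */
        if (k != i) {
            int temp = a[i];
            a[i] = a[k];
            a[k] = temp;
        }
        /* invariant: a[0..i] are in their final positions */
    }
}
```

The inner loop always runs to the end of the array, which is why the number of comparisons is quadratic regardless of the input order, while at most one swap is made per pass.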

(100)

Insertion Sort

• Algorithm:

– Start with the result being the first element of the input;

– Loop over the input array until it is empty, "removing" the first remaining (leftmost) element;

– Compare the removed element against the current result, starting from the highest (rightmost) element, and working left towards the lowest element;

– If the removed input element is lower than the current result element, copy that value into the following element to make room for the new element below, and repeat with the next lowest result element;

– Otherwise, the new element is in the correct location; save it in the cell left by copying the last examined result up, and start again from step 2

(101)

Insertion Sort

for i = 2:n,
    for (k = i; k > 1 and a[k] < a[k-1]; k--)
        swap a[k,k-1]
    → invariant: a[1..i] is sorted
end

(102)

Insertion Sort….

• Stable

• O(1) extra space

• $O(n^2)$ comparisons and swaps

• Adaptive: O(n) time when nearly sorted

• Very low overhead
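A C sketch of insertion sort, following the copy-up description given in the algorithm slide (shifting larger elements right instead of swapping pairwise), is shown below. The name `insertion_sort` is an assumption for illustration.

```c
#include <stddef.h>

/* Sorts a[0..n-1] in ascending order. */
void insertion_sort(int a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];          /* the element "removed" from the input */
        size_t k = i;
        /* Shift larger elements of the sorted prefix one cell to the right. */
        while (k > 0 && a[k - 1] > key) {
            a[k] = a[k - 1];
            k--;
        }
        a[k] = key;              /* save the element in the cell left free */
        /* invariant: a[0..i] is sorted */
    }
}
```

The strict comparison `a[k - 1] > key` keeps equal elements in their original order, which is what makes the sort stable. On nearly sorted input the inner loop exits almost immediately, giving the O(n) adaptive behaviour.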

(103)
(104)

Divide-and-Conquer

The divide-and-conquer strategy solves a problem by:

1. Breaking it into sub-problems that are themselves smaller instances of the same type of problem.

2. Recursively solving these sub-problems.

3. Appropriately combining their answers.

The name "divide and conquer" is sometimes also applied to algorithms that reduce each problem to only one sub-problem, such as the binary search algorithm.

(105)

Divide-and-Conquer- Advantages

• Solving difficult problems

• Algorithm efficiency

• Parallelism

• Memory access

• Roundoff control

(106)

Divide-and-Conquer- Implementation Issues

• Recursion

• Explicit stack

• Stack size

• Choosing the base cases

(107)

Divide-and-Conquer- Examples

Problems Solved by Divide & Conquer :

– Binary Search

– Merge Sort

– Quick Sort

– Max-Min Problem

– Matrix Multiplication

– …..etc….

(108)

Divide-and-Conquer-

Recurrence Relation

• Also called a difference equation.

• It is a numerical way to represent the computation time and other parameters of the analysis.

• Recurrence relation for divide and conquer:

$$T(n) = \begin{cases} T(1), & n = 1 \\ a\,T(n/b) + f(n), & n > 1 \end{cases}$$

– where a and b are known constants,

– T(1) is known,

(109)
(110)
(111)

• Performs two recursive calls on partitions of roughly half of the total size of the list, and

• then makes two further comparisons to sort out the max/min for the entire list.

• Recurrence Relation :

T(n) = 2T(n/2) + 2

Where T(1) = 0
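The max-min recurrence corresponds to code of the following shape. This is a sketch under assumptions: the name `max_min`, the index-range interface and the output pointers are hypothetical, and the range a[lo..hi] is assumed to be non-empty.

```c
/* Finds the minimum and maximum of a[lo..hi] (lo <= hi) by divide and conquer. */
void max_min(const int a[], int lo, int hi, int *min, int *max)
{
    if (lo == hi) {                 /* one element: no comparison needed, T(1) = 0 */
        *min = *max = a[lo];
        return;
    }
    int mid = lo + (hi - lo) / 2;
    int lmin, lmax, rmin, rmax;
    max_min(a, lo, mid, &lmin, &lmax);        /* first recursive call  */
    max_min(a, mid + 1, hi, &rmin, &rmax);    /* second recursive call */
    *min = (lmin < rmin) ? lmin : rmin;       /* first combining comparison  */
    *max = (lmax > rmax) ? lmax : rmax;       /* second combining comparison */
}
```

The two recursive calls and the two combining comparisons match the terms 2T(n/2) and + 2 of the recurrence.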

(112)

Primary Requirement :

“Elements should be in SORTED order.”

• much more Efficient than Linear Search.

• Recurrence Relation :

T(n) = T(n/2) + O(1)

(113)
(114)

Examples – 2 : Binary Search

Algorithm….

Step 1: get sorted data as input of size N.

Step 2: initialize low:=1; high:=N;

Step 3: if (low > high) print ("Value Not Found"); goto Step 8

Step 4: find mid:=(low + high) div 2

Step 5: if ( v==a[mid] ) then print ("Value Found"); goto Step 8

Step 6: if ( v<a[mid] ) then high:=mid-1 else low:=mid+1

Step 7: goto Step 3

Step 8: End
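The steps above can be sketched in C with 0-based indexing and a loop in place of the goto statements. The name `binary_search` and the convention of returning the index (or -1 when the value is not found) are assumptions for illustration.

```c
/* Searches the sorted array a[0..n-1] for v.
   Returns the index of v, or -1 if v is not present. */
int binary_search(const int a[], int n, int v)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* avoids overflow of low + high */
        if (v == a[mid])
            return mid;                     /* value found */
        if (v < a[mid])
            high = mid - 1;                 /* continue in the left half */
        else
            low = mid + 1;                  /* continue in the right half */
    }
    return -1;                              /* value not found */
}
```

Each iteration halves the remaining range, which is the source of the T(n) = T(n/2) + O(1) recurrence.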

(115)

Examples – 2 : Binary Search - Analysis

Following tree structure describe the way of division………...

(116)

Examples – 2 : Binary Search - Analysis

Therefore, in the worst case,

Complexity = Number of comparisons

= Number of levels of the binary tree + 1 = k + 1

We know that $n = 2^k$, so $\log n = k \log 2$ and $k = \log n / \log 2$, i.e. $k = \log_2 n$.

Number of comparisons $= \log_2 n + 1 \approx \log_2 n$

(117)
(118)

Examples – 3 : Merge Sort

• The merge sort algorithm closely follows the divide-and-conquer paradigm.

• It is based on two-way merging.

• It is especially suited to external sorting.

• Intuitively, it operates as follows.

– Divide: Divide the n-element sequence to be sorted into two subsequences of n/2 elements each.

– Conquer: Sort the two subsequences recursively using merge sort.

– Combine: Merge the two sorted subsequences to produce the sorted answer.

(119)

Examples

– 3 : Merge Sort : Working by example

Divide

(120)

Examples – 3 : Merge Sort : Analysis

When we have n > 1 elements, we break down the running time as follows:

– Divide: The divide step just computes the middle of the subarray, which takes constant time. Thus, D(n) = Θ(1).

– Conquer: We recursively solve two subproblems, each of size n/2, which contributes Q(n) = 2T(n/2) to the running time.

– Combine: We use the merge procedure on an n-element subarray, which takes time Θ(n), so C(n) = Θ(n).

(121)

Examples – 3 : Merge Sort : analysis

Therefore, $T(n) = D(n) + C(n) + Q(n) = \Theta(1) + \Theta(n) + 2T(n/2) = \Theta(n) + 2T(n/2)$,

which gives the recurrence for the worst-case running time T(n) of merge sort:

$$T(n) = \begin{cases} \Theta(1), & n = 1 \\ 2T(n/2) + \Theta(n), & n > 1 \end{cases}$$
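The three steps of the analysis map onto code as in the sketch below. The function names and the use of one temporary buffer allocated up front are assumptions for illustration; `merge_sort` returns 0 on success and -1 if the buffer cannot be allocated.

```c
#include <stdlib.h>
#include <string.h>

/* Combine: merge the sorted runs a[lo..mid] and a[mid+1..hi]; linear time. */
static void merge(int a[], int tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)
        tmp[k++] = a[i++];
    while (j <= hi)
        tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo + 1) * sizeof(int));
}

static void merge_sort_rec(int a[], int tmp[], int lo, int hi)
{
    if (lo >= hi)
        return;                           /* one element: already sorted */
    int mid = lo + (hi - lo) / 2;         /* Divide: constant time */
    merge_sort_rec(a, tmp, lo, mid);      /* Conquer: two subproblems of size n/2 */
    merge_sort_rec(a, tmp, mid + 1, hi);
    merge(a, tmp, lo, mid, hi);           /* Combine */
}

/* Sorts a[0..n-1] in ascending order. Returns 0 on success, -1 on allocation failure. */
int merge_sort(int a[], int n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    merge_sort_rec(a, tmp, 0, n - 1);
    free(tmp);
    return 0;
}
```

Taking from the left run on ties (`a[i] <= a[j]`) keeps the sort stable.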

(122)
(123)

Examples – 4 : Quick Sort

• One of the most efficient internal sorting methods in practice.

• It works as follows:

1. Select a value as the pivot element from the given array A[1], . . . , A[n].

2. Divide the list into two parts based on the pivot value such that

(elements of first part) < pivot ≤ (elements of second part)

3. Repeat from step 1 for both parts while they can still be divided in two.

• We hope the pivot is near the median key value of the array, so that it produces two parts of nearly equal size.

(124)

Quick Sort……..

Simple Steps -

1. If left < right:

1.1. Partition a[left...right] such that all a[left...p-1] are less than a[p], and all a[p+1...right] are >= a[p]

1.2. Quicksort a[left...p-1]

1.3. Quicksort a[p+1...right]

2. Terminate

(125)

Quick Sort-

Partitioning

• A key step in the Quicksort algorithm is partitioning the array

– We choose some (any) number p in the array to use as a pivot

– We partition the array into three parts: the numbers less than p, then p itself, then the numbers greater than or equal to p

(126)

Quick Sort- Algorithm

• Choose an array value (say, the first) to use as the pivot

• Starting from the left end, find the first element that is greater than or equal to the pivot

• Searching backward from the right end, find the first element that is less than the pivot

• Interchange (swap) these two elements
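One way to complete these partitioning steps into a working sort is sketched below. The slides do not show the final step; placing the pivot between the two parts once the searches cross is an assumption of this sketch, as are the names `partition` and `quicksort`.

```c
/* Partitions a[left..right] (left < right) around the pivot a[left].
   Returns the final index p of the pivot, with a[left..p-1] < a[p] <= a[p+1..right]. */
static int partition(int a[], int left, int right)
{
    int pivot = a[left];
    int i = left + 1, j = right;
    for (;;) {
        while (i <= j && a[i] < pivot)    /* from the left: first element >= pivot */
            i++;
        while (i <= j && a[j] >= pivot)   /* from the right: first element < pivot */
            j--;
        if (i > j)                        /* the searches have crossed */
            break;
        int t = a[i]; a[i] = a[j]; a[j] = t;   /* interchange the two elements */
        i++;
        j--;
    }
    /* Assumed final step: move the pivot between the two parts. */
    a[left] = a[j];
    a[j] = pivot;
    return j;
}

/* Sorts a[left..right] in ascending order. */
void quicksort(int a[], int left, int right)
{
    if (left < right) {
        int p = partition(a, left, right);
        quicksort(a, left, p - 1);
        quicksort(a, p + 1, right);
    }
}
```

Calling `quicksort(a, 0, n - 1)` sorts an n-element array; with the first element as pivot, an already sorted input triggers the quadratic worst case.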

(127)

Quick Sort- Example

(Brackets mark the two elements found by the left and right searches.)

• choose pivot: 4 3 6 9 2 4 3 1 2 1 8 9 3 5 6

• search: 4 3 [6] 9 2 4 3 1 2 1 8 9 [3] 5 6

• swap: 4 3 3 9 2 4 3 1 2 1 8 9 6 5 6

• search: 4 3 3 [9] 2 4 3 1 2 [1] 8 9 6 5 6

• swap: 4 3 3 1 2 4 3 1 2 9 8 9 6 5 6

• search: 4 3 3 1 2 [4] 3 1 [2] 9 8 9 6 5 6

• swap: 4 3 3 1 2 2 3 1 4 9 8 9 6 5 6

• search: 4 3 3 1 2 2 3 [1] [4] 9 8 9 6 5 6 (left > right)

(128)

Quick Sort- Analysis (Best case)

• Suppose each partition operation divides the array almost exactly in half.

• Then the depth of the recursion is $\log_2 n$

– Because that's how many times we can halve n

• However, there are many recursions!

– How can we figure this out?

– We note that

• Each partition is linear over its subarray

(129)

Quick Sort- Analysis (Best case)

[Figure: way of partitioning. Recursion tree with n at the root, two subarrays of size n/2 below it, then subarrays of size n/4, n/8, and so on.]

Gate- ADA & DSA by Nitesh Dubey

(130)

• We cut the array size in half each time

• So the depth of the recursion is $\log_2 n$

• At each level of the recursion, all the partitions at that level do work that is linear in n

• $O(\log_2 n) \cdot O(n) = O(n \log_2 n)$

• Hence in the best case, and also in the average case, quick sort has time complexity $O(n \log_2 n)$

(131)

• In the worst case, partitioning always divides the size n array into these three parts:

– A length one part, containing the pivot itself

– A length zero part, and

– A length n-1 part, containing everything else

• We don't recur on the zero-length part

• Recurring on the length n-1 part requires (in the worst case) recurring to depth n-1

(132)

Quick Sort- Analysis (Worst case)…..

[Figure: way of partitioning. Degenerate recursion tree in which each level has a single subarray, of sizes n-1, n-2, n-3, and so on.]

(133)

• In the worst case, recursion may be n levels deep (for an array of size n).

• But the partitioning work done at each level is still n.

• $O(n) \cdot O(n) = O(n^2)$

• So the worst case for Quicksort is $O(n^2)$

• When does this happen?

– When the array is sorted to begin with (and the first element is used as the pivot)!

(134)

• If the array is sorted to begin with, Quicksort is terrible: $O(n^2)$

• However, Quicksort is usually $O(n \log_2 n)$

• The constant factors are small enough that Quicksort is, in practice, often among the fastest general-purpose comparison sorts.

• Much real-world sorting is done by Quicksort or its variants.

(135)

• Almost anything you can try to “improve” Quicksort will actually slow it down.

• One good tweak is to switch to a different sorting method when the subarrays get small (say, 10 or 12 elements).

– Quicksort has too much overhead for small array sizes

• For large arrays, it might be a good idea to check beforehand if the array is already sorted.

• Another good tweak is picking a better pivot.

(136)

• For every partition step, pick the pivot from a random position in the array.

• We could do an optimal quicksort (guaranteed O(n log n)) if we always picked a pivot value that exactly cuts the array in half

– Such a value is called a median: pick it as the pivot value.

• Take the median (middle value) of three randomly selected values and use it as the pivot.
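The median-of-three idea can be sketched as follows. This is one possible reading of the slide, not a definitive implementation: the three positions are drawn at random with repetition allowed, the partition scheme (Lomuto) is an assumption, and the function names are hypothetical.

```python
import random

def median_of_three_pivot(a, lo, hi):
    """Return the index of the median of three randomly chosen positions in a[lo..hi]."""
    candidates = [random.randint(lo, hi) for _ in range(3)]
    candidates.sort(key=lambda i: a[i])
    return candidates[1]

def quick_sort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = median_of_three_pivot(a, lo, hi)
    a[p], a[hi] = a[hi], a[p]            # park the pivot at the end
    pivot, store = a[hi], lo
    for i in range(lo, hi):              # Lomuto partition: smaller elements go left
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]    # pivot lands in its final position
    quick_sort(a, lo, store - 1)
    quick_sort(a, store + 1, hi)

data = [5, 3, 8, 1, 9, 2, 7, 3]
quick_sort(data)
print(data)
```

With a randomly chosen pivot, an already sorted input is no longer a systematic worst case.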

(137)

Example 5 : Matrix Multiplication

Suppose we want to multiply two matrices of size N x N: for example A x B = C.

Therefore, for N = 2,

C11 = a11 b11 + a12 b21

C12 = a11 b12 + a12 b22

C21 = a21 b11 + a22 b21

C22 = a21 b12 + a22 b22

2x2 matrix multiplication can be accomplished in 8 multiplications. (2^(log2 8) = 2^3)

(138)
(139)

Basic Matrix Multiplication………..

Algorithm

void matrix_mult () {
    for (i = 1; i <= N; i++) {
        for (j = 1; j <= N; j++) {
            compute Ci,j;
        }
    }
}

Time analysis

$C_{i,j} = \sum_{k=1}^{N} a_{i,k}\, b_{k,j}$

Thus $T(N) = \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{k=1}^{N} c = cN^3 = O(N^3)$
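The "compute Ci,j" step is itself a loop over k, which is where the third factor of N comes from. A minimal Python sketch, assuming square matrices stored as lists of lists (the multiplication counter `mults` is added here only to illustrate the N^3 count):

```python
def matrix_mult(A, B):
    """Multiply two N x N matrices; also return the number of scalar multiplications."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):               # computes C[i][j] = sum over k of A[i][k] * B[k][j]
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults

C, mults = matrix_mult([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C, mults)   # [[19, 22], [43, 50]] 8
```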

(140)

Matrix Multiplication using Divide and Conquer

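A x B = R

$$
\begin{pmatrix} A_0 & A_1 \\ A_2 & A_3 \end{pmatrix}
\times
\begin{pmatrix} B_0 & B_1 \\ B_2 & B_3 \end{pmatrix}
=
\begin{pmatrix} A_0 B_0 + A_1 B_2 & A_0 B_1 + A_1 B_3 \\ A_2 B_0 + A_3 B_2 & A_2 B_1 + A_3 B_3 \end{pmatrix}
$$

• Divide matrices into sub-matrices: A0, A1, A2, etc.

• Divide the matrices until their size is reduced to 2x2

• Use blocked matrix multiply equations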

(141)

Basic Matrix Multiplication using Divide & Conquer

• Min size of matrices is 2x2.

• Each of which requires 8 multiplications.

• n^2 addition operations will be required (for n x n).

• Therefore, the recurrence relation will be:

T(n) = b , if n = 2

T(n) = 8T(n/2) + cn^2 , if n > 2
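Unrolling this recurrence shows that plain divide and conquer gains nothing over the triple loop. A sketch, assuming n is a power of 2 and stopping at the base case n = 2 (so 2^k = n/2):

```latex
\begin{aligned}
T(n) &= 8\,T(n/2) + cn^2 \\
     &= 8^{k}\,T(n/2^{k}) + cn^2 \sum_{i=0}^{k-1} 2^{i} \\
     &= 8^{k}\,b + cn^2\,(2^{k} - 1) \\
     &= \tfrac{b}{8}\,n^3 + cn^2\left(\tfrac{n}{2} - 1\right) \qquad (2^{k} = n/2) \\
     &= O(n^3)
\end{aligned}
```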

(142)

Strassen's Matrix Multiplication

• Volker Strassen published the Strassen algorithm in 1969.

• Strassen showed that 2x2 matrix multiplication can be accomplished in 7 multiplications and 18 additions or subtractions. (2^(log2 7) = 2^2.807)

• This reduction can be applied recursively using the Divide and Conquer approach.

(143)

Strassen's Matrix Multiplication

C11 = P1 + P4 - P5 + P7

C12 = P3 + P5

C21 = P2 + P4

C22 = P1 + P3 - P2 + P6

Where,

P1 = (A11 + A22) * (B11 + B22)

P2 = (A21 + A22) * B11

P3 = A11 * (B12 - B22)

P4 = A22 * (B21 - B11)

P5 = (A11 + A12) * B22

P6 = (A21 - A11) * (B11 + B12)

P7 = (A12 - A22) * (B21 + B22)

(144)

C11 = P1 + P4 - P5 + P7

= (A11 + A22)(B11 + B22) + A22 * (B21 - B11) - (A11 + A12) * B22 + (A12 - A22) * (B21 + B22)

= A11 B11 + A11 B22 + A22 B11 + A22 B22 + A22 B21 – A22 B11 – A11 B22 – A12 B22 + A12 B21 + A12 B22 – A22 B21 – A22 B22

= A11 B11 + A12 B21
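The same check can be done numerically for all four entries. The sketch below applies the seven products to a 2x2 matrix of plain numbers; P7 = (A12 - A22) * (B21 + B22) is taken from the C11 expansion above, and the function name `strassen_2x2` is hypothetical.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 multiplications instead of 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    p1 = (a11 + a22) * (b11 + b22)
    p2 = (a21 + a22) * b11
    p3 = a11 * (b12 - b22)
    p4 = a22 * (b21 - b11)
    p5 = (a11 + a12) * b22
    p6 = (a21 - a11) * (b11 + b12)
    p7 = (a12 - a22) * (b21 + b22)
    return [[p1 + p4 - p5 + p7, p3 + p5],
            [p2 + p4, p1 + p3 - p2 + p6]]

product = strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)   # [[19, 22], [43, 50]], same as the ordinary product
```

In the full algorithm the entries are n/2 x n/2 sub-matrices and each of the 7 products is computed recursively.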

(145)

Strassen's Matrix Multiplication- Time Analysis

• Min size of matrices is 2x2.

• Each of which requires 7 multiplications.

• n^2 addition operations will be required (for n x n).

• Therefore, the recurrence relation will be:

T(n) = b , if n = 2

T(n) = 7T(n/2) + cn^2 , if n > 2
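Solving this recurrence gives the familiar exponent. A sketch, assuming n is a power of 2 and unrolling down to the base case n = 2 (so 2^k = n/2):

```latex
\begin{aligned}
T(n) &= 7\,T(n/2) + cn^2 \\
     &= 7^{k}\,T(n/2^{k}) + cn^2 \sum_{i=0}^{k-1} \left(\tfrac{7}{4}\right)^{i} \\
     &\le 7^{k}\,b + \tfrac{4}{3}\,cn^2 \left(\tfrac{7}{4}\right)^{k} \\
     &= O\!\left(7^{\log_2 n}\right) = O\!\left(n^{\log_2 7}\right) \approx O\!\left(n^{2.807}\right)
\end{aligned}
```

The geometric sum is dominated by its last term, so the 7 recursive multiplications, not the n^2 additions, determine the growth rate.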

(146)
(147)
(148)

Tower of Hanoi – Algorithm

We can combine these steps into the following algorithm:

0. Receive n, src, dest, aux.

1. If n > 1:

a. Move(n-1, src, aux, dest);

b. Move(1, src, dest, aux);

c. Move(n-1, aux, dest, src);

Else

Display "Move the top disk from ", src, " to ", dest.
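The algorithm translates almost line for line into Python. In this sketch the moves are collected in a list rather than displayed, which is a choice made here for illustration; the peg names "A", "B", "C" are arbitrary and n is assumed to be at least 1.

```python
def move(n, src, dest, aux, moves):
    """Move n disks from src to dest using aux, appending each single move to moves."""
    if n > 1:
        move(n - 1, src, aux, dest, moves)   # clear the top n-1 disks onto aux
        move(1, src, dest, aux, moves)       # move the largest disk
        move(n - 1, aux, dest, src, moves)   # bring the n-1 disks back on top
    else:
        moves.append((src, dest))            # "Move the top disk from src to dest"

moves = []
move(3, "A", "C", "B", moves)
print(len(moves))   # 7
```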

(149)

Tower of Hanoi – Algorithm

• Let T[N] be the minimum number of moves needed to solve the puzzle with N disks.

• The recursive solution above moves N-1 disks from one peg to another twice and makes one additional move in between.

• It then follows that (for N > 0)

T[N] = 2T[N-1] + 1

T[1] = 1 and T[0] = 0

i.e., T[N] = 2^N - 1 = O(2^N)
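The closed form follows by adding 1 to both sides of the recurrence, which turns it into a simple doubling:

```latex
\begin{aligned}
T[N] + 1 &= 2\,(T[N-1] + 1) \\
         &= 2^{2}\,(T[N-2] + 1) \\
         &= \dots \\
         &= 2^{N}\,(T[0] + 1) = 2^{N} \\
\Rightarrow\quad T[N] &= 2^{N} - 1 = O(2^{N})
\end{aligned}
```

For example, T[3] = 7 and T[4] = 15.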

(150)
(151)

• Hashing uses a function that maps each key to a location in memory.

• A key's location does not depend on other elements, and does not change after insertion.

– unlike in a sorted list

• A good hash function should be easy to compute.

• With such a hash function, the dictionary operations can be implemented in O(1) expected time.

(152)

Hash Tables

• A hash table is an array of size Tsize

– has index positions 0 .. Tsize-1

• Two types of hash tables

– open hash table

• array element type is a <key, value> pair

• all items stored in the array

– chained hash table

• element type is a pointer to a linked list of nodes containing <key, value> pairs

• items are stored in the linked list nodes

• keys are used to generate an array index

(153)

Hash Function

• A hash function is used to put data in the hash table, and so to implement the table.

• The integer returned by the hash function is called the hash key.

• Each position of the hash table is called a bucket.

• The home bucket of a value is the bucket that the hash function maps its key to.

• By applying the hash function to the key we perform

– insert,

– retrieve,

– update,

(154)

Some Hash functions

• Division Method

– H(key)= key % Tsize

• Mid Square Method

• Multiplicative Hash Method

– H(key) = floor(p * ((key * A) % 1))

– where p is an integer constant and A is a constant (0 < A < 1).

– Donald Knuth suggested A ≈ 0.6180339887, i.e. (√5 – 1)/2

• Digit Folding

– Shift folding

– Boundary folding (reverse the ith part)

• Digit Analysis
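The division and multiplicative methods from the list above can be sketched in a few lines. This is illustrative only: the function names are hypothetical, and the default value of A is the constant (√5 - 1)/2 attributed to Knuth.

```python
import math

def division_hash(key, tsize):
    """Division method: H(key) = key % Tsize."""
    return key % tsize

def multiplicative_hash(key, p, A=(math.sqrt(5) - 1) / 2):
    """Multiplicative method: scale the fractional part of key * A by p."""
    fraction = (key * A) % 1.0       # fractional part, in the range [0, 1)
    return math.floor(p * fraction)  # index in the range 0 .. p-1

print(division_hash(53, 11))         # 9
print(multiplicative_hash(1, 10))    # 6, since 1 * 0.618... scaled by 10 is 6.18...
```

In the multiplicative method the table size p does not need to be prime, because the spreading comes from the fractional part rather than from the modulus.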

(155)

Some hash functions….

How to apply a hash function:

• if the key (data) type is an integer: key % Tsize

• if the key (data) type is a string: convert it into an integer and then take % Tsize

• Goals for a hash function:

– Fast to compute

– Even distribution

• cannot guarantee no collisions unless all key values are known in advance
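For string keys, one common way to "convert it into an integer" is to treat the characters as digits of a number in some base. The slide does not say which conversion is meant, so the polynomial scheme and the base 31 below are assumptions, and the function names are hypothetical.

```python
def string_to_int(s, base=31):
    """Fold the character codes of s into one integer (base is an assumed constant)."""
    value = 0
    for ch in s:
        value = value * base + ord(ch)
    return value

def hash_string(s, tsize):
    """Hash a string key: convert to an integer, then take % Tsize."""
    return string_to_int(s) % tsize

print(string_to_int("ab"))     # 97 * 31 + 98 = 3105
print(hash_string("ab", 11))   # 3105 % 11 = 3
```

Unlike simply summing the character codes, this conversion gives different integers to anagrams such as "ab" and "ba", which helps with the even-distribution goal.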

(156)

An Open Hash Table

Hash (key) produces an index in the range 0 to 6. That index is the "home address".

Some insertions:

K1 --> 3

K2 --> 5

K3 --> 2

index   key   value
0
1
2       K3    K3info
3       K1    K1info
4
5       K2    K2info
6

(157)

Hash Collisions

Collisions occur when different elements are mapped to the same cell.

Some more insertions:

K4 --> 3

K5 --> 2

K6 --> 4

Table after these insertions, using the linear probing collision resolution strategy:

index   key   value
0       K6    K6info
1
2       K3    K3info
3       K1    K1info
4       K4    K4info
5       K2    K2info
6       K5    K5info

(158)

Collision Resolution Techniques

• Chaining – using linked list

• Linear probing

• Linear probing with Chaining – using array

• Linear probing with Chaining (with replacement)

• Quadratic probing

• Double Hashing

(159)

Linear Probing (insert 12)

12 = 1 x 11 + 1

12 mod 11 = 1

(160)

Search with linear probing (Search 15)

15 = 1 x 11 + 4

15 mod 11 = 4

(161)

Deletion with linear probing: LAZY (Delete 9)

9 = 0 x 11 + 9

9 mod 11 = 9

(162)

Search Performance

Average number of probes needed to retrieve the value with key K?

index   key   value
0       K6    K6info
1
2       K3    K3info
3       K1    K1info
4       K4    K4info
5       K2    K2info
6       K5    K5info

K     hash(K)   #probes
K1    3         1
K2    5         1
K3    2         1
K4    3         2
K5    2         5
K6    4         4

Average: 14/6 = 2.33 (successful)
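The probe counts can be reproduced with a short simulation. This sketch assumes a table of size 7, the home addresses given on the slides, and no deletions, so the number of probes needed to retrieve a key equals the number used when it was inserted; the function name is hypothetical.

```python
def linear_probe_insert(table, key, home):
    """Insert key starting at its home slot; return the number of probes used."""
    tsize = len(table)
    for probes in range(1, tsize + 1):
        slot = (home + probes - 1) % tsize   # home, home+1, home+2, ... with wrap-around
        if table[slot] is None:
            table[slot] = key
            return probes
    raise RuntimeError("hash table is full")

homes = {"K1": 3, "K2": 5, "K3": 2, "K4": 3, "K5": 2, "K6": 4}
table = [None] * 7
probes = {key: linear_probe_insert(table, key, home) for key, home in homes.items()}
average = sum(probes.values()) / len(probes)
print(table)     # ['K6', None, 'K3', 'K1', 'K4', 'K2', 'K5']
print(probes)    # K4 needs 2 probes, K5 needs 5, K6 needs 4
print(round(average, 2))   # 2.33
```

K5 and K6 are expensive because the occupied slots 2 to 5 form one cluster that every later key hashing into it must walk through.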

(163)

Chaining

[Figure: chained hash table with buckets 0 to 4, each pointing to a linked list.]

(164)

A Chained Hash Table

insert keys:

K1 --> 3

K2 --> 5

K3 --> 2

K4 --> 3

K5 --> 2

K6 --> 4

Linked lists of synonyms:

0:
1:
2: (K3, K3info) -> (K5, K5info)
3: (K1, K1info) -> (K4, K4info)
4: (K6, K6info)
5: (K2, K2info)
6:

(165)

Insertion: insert 53

53 = 4 x 11 + 9

53 mod 11 = 9

[Figure: hash table with buckets 0 to 10 holding the keys 14, 42, 29, 20, 1, 36, 56, 23, 16, 24, 31, 17, 7, shown before and after 53 is inserted at bucket 9.]

(166)

Search Performance

Average number of probes needed to retrieve the value with key K?

0:
1:
2: (K3, K3info) -> (K5, K5info)
3: (K1, K1info) -> (K4, K4info)
4: (K6, K6info)
5: (K2, K2info)
6:

K     hash(K)   #probes
K1    3         1
K2    5         1
K3    2         1
K4    3         2
K5    2         2
K6    4         1

Average: 8/6 = 1.33 (successful)
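The same keys can be run through a chained table to confirm the lower average. In this sketch Python lists stand in for the linked lists, new keys are appended at the end of their chain (an assumption that matches the probe counts above), and the function name is hypothetical.

```python
def chained_insert(table, key, home):
    """Append key to the chain at its home bucket; return its position in the chain."""
    table[home].append(key)
    return len(table[home])    # probes needed later to find this key

homes = {"K1": 3, "K2": 5, "K3": 2, "K4": 3, "K5": 2, "K6": 4}
table = [[] for _ in range(7)]
probes = {key: chained_insert(table, key, home) for key, home in homes.items()}
average = sum(probes.values()) / len(probes)
print(table)               # [[], [], ['K3', 'K5'], ['K1', 'K4'], ['K6'], ['K2'], []]
print(round(average, 2))   # 1.33
```

A key only ever competes with its own synonyms, so K6 is found in one probe here, compared with four under linear probing.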

(167)

Quadratic Probing

• Reduces the clustering problem of Linear Probing

– Check H(x)

– If collision occurs check H(x) + 1

– If collision occurs check H(x) + 4

– If collision occurs check H(x) + 9

– If collision occurs check H(x) + 16

– ...

– Each probe position is taken mod the table size

(168)

Quadratic Probing (insert 12)

12 = 1 x 11 + 1

12 mod 11 = 1

(12+1) mod 11 = 2

(12+4) mod 11 = 5

(12+9) mod 11 = 10

(12+16) mod 11 = 6

(12+25) mod 11 = 4
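The probe positions listed above can be generated with a one-line formula. A minimal sketch, assuming the division method for the home address; the function name is hypothetical.

```python
def quadratic_probe_sequence(key, tsize, count):
    """Return the first count slots examined for key: (home + i*i) mod tsize."""
    home = key % tsize
    return [(home + i * i) % tsize for i in range(count)]

print(quadratic_probe_sequence(12, 11, 6))   # [1, 2, 5, 10, 6, 4]
```

An insertion would walk this sequence and stop at the first empty slot.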

(169)

Double Hashing

• When collision occurs use a second hash function

– Hash2 (x) = R – (x mod R)

– R: greatest prime number smaller than table-size

• Inserting 12

H2(x) = 7 – (x mod 7) = 7 – (12 mod 7) = 2

– Check H(x)

– If collision occurs check H(x) + 2

– If collision occurs check H(x) + 4

– If collision occurs check H(x) + 6

– If collision occurs check H(x) + 8

(170)

Double Hashing (insert 12)

12 = 1 x 11 + 1

12 mod 11 = 1

7 – (12 mod 7) = 2
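For key 12 in a table of size 11 with R = 7, the slots examined can be listed with a short sketch. The function name and parameter names are hypothetical; the probe positions are taken mod the table size.

```python
def double_hash_sequence(key, tsize, r, count):
    """Return the first count slots examined: home, home + step, home + 2*step, ..."""
    home = key % tsize             # first hash function
    step = r - (key % r)           # second hash function, always between 1 and r
    return [(home + i * step) % tsize for i in range(count)]

print(double_hash_sequence(12, 11, 7, 5))   # [1, 3, 5, 7, 9]
```

Because the step depends on the key, two keys with the same home address usually follow different probe sequences, which is what distinguishes double hashing from linear and quadratic probing.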

(171)

Factors affecting Search Performance

• quality of hash function

– how uniform?

– depends on actual data

• collision resolution strategy used

• load factor of the HashTable

– N/Tsize

– the lower the load factor the better the search performance

(172)

Traversal

• Visit each item in the hash table

• Open hash table

– O(Tsize) to visit all n items

– Tsize is larger than n

• Chained hash table

– O(Tsize + n) to visit all n items

(173)

Rehashing

• If table gets too full, operations will take too long.

• Build another table, about twice as big (and with a prime size).

– Eg.- Next prime number after 11 x 2 is 23

• Insert every element again into this table

• Rehash after a chosen percentage of the table becomes full
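The rehashing steps can be sketched as follows. This is an illustration under stated assumptions: integer keys, the division method, linear probing in the new table, and hypothetical function names.

```python
def is_prime(n):
    """Trial division primality test, adequate for table sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def next_prime(n):
    """Return the smallest prime greater than n."""
    n += 1
    while not is_prime(n):
        n += 1
    return n

def rehash(table):
    """Build a table of the next prime size after doubling and reinsert every key."""
    new_size = next_prime(2 * len(table))
    new_table = [None] * new_size
    for key in table:
        if key is not None:
            slot = key % new_size               # home address changes with the table size
            while new_table[slot] is not None:  # linear probing in the new table
                slot = (slot + 1) % new_size
            new_table[slot] = key
    return new_table

old = [None] * 11
old[1], old[2], old[3] = 12, 23, 34   # all three hash to 1 in the old table
new = rehash(old)
print(len(new), new.index(12), new.index(23), new.index(34))   # 23 12 0 11
```

Keys that collided in the old table (12, 23 and 34 all map to 1 mod 11) land in separate slots once the modulus changes to 23.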

(174)

Question-1

An advantage of a chained hash table (external hashing) over the open addressing scheme is -

A. Worst case complexity of search operations is less

B. Space used is less

C. Deletion is easier

D. None of the above

(175)

Question-2

A chained hash table has an array size of 512. What is the maximum number of entries that can be placed in the table?

A. 256

B. 511

C. 512

D. 1024

E. There is no maximum.

Ans:- (E)

(176)

Question-3

The hash function hash := key mod size, and linear probing are used to insert the keys

37, 38, 72, 48, 98, 11, 56

into the hash table with indices 0 ... 6. The order of the keys in the array is given by -

A. 72, 11, 37, 38, 56, 98, 48

B. 11, 48, 37, 38, 72, 98, 56

C. 98, 11, 37, 38, 72, 56, 48

D. 98, 56, 37, 38, 72, 11, 48

E. 11, 37, 48, 38, 72, 98, 56

(177)

Question-4

An internal hash table has 5 buckets, numbered 0, 1, 2, 3, 4. Keys are integers, and the hash function

h(i) = i mod 5

is used, with linear resolution of collisions. If elements with keys 13, 8, 24, 10, and 3 are inserted, in that order, into an initially blank hash table, then the content of the bucket numbered 2 is -

A. 3

B. 8

C. 10

D. 13

E. 24

(178)

Question-5

Suppose there is an open (external) hash table with four buckets, numbered 0, 1, 2, 3, and integers are hashed into these buckets using hash function h(x) = x mod 4. If the sequence of perfect squares 1, 4, 9, ..., i^2, ... is hashed into the table, then, as the total number of entries in the table grows, what will happen?

A. Two of the buckets will each get approximately half the entries, and the other two will remain empty.

B. All buckets will receive approximately the same number of entries.

C. All entries will go into one particular bucket.

D. All buckets will receive entries, but the difference between the buckets with smallest and largest number of entries will grow.

E. Three of the buckets will each get approximately one-third of the entries, and the fourth bucket will remain empty.

(179)

Question-6

Insert the characters of string K R P C S N Y T J M into a hash table of size 10. Use the hash function

h(x) = (ord(x) – ord('a') + 1) mod 10.

Use linear probing to resolve collisions. Answer the following -

A. Which insertions cause collisions?

B. Display the final hash table.

Ans:- Collisions at J and M

(180)

Question-7

A hash table implementation uses the hash function key % 7 and linear probing to resolve collisions. What is the ratio of the numbers in the following series inserted without collision to those inserted with collision, if 7 buckets are used:

32, 56, 87, 23, 65, 26, 93

A. 2/5

B. 3/4

C. 4/3

D. 5/2

Ans:- (C)
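The answer can be checked by simulating the insertions. A short sketch, with a hypothetical function name, that counts how many keys find their home bucket free and how many collide:

```python
def count_collisions(keys, tsize):
    """Insert keys with key % tsize and linear probing; count clean and colliding inserts."""
    table = [None] * tsize
    clean, collided = 0, 0
    for key in keys:
        slot = key % tsize
        if table[slot] is None:
            clean += 1
        else:
            collided += 1
            while table[slot] is not None:   # probe forward to the next free bucket
                slot = (slot + 1) % tsize
        table[slot] = key
    return clean, collided, table

clean, collided, table = count_collisions([32, 56, 87, 23, 65, 26, 93], 7)
print(clean, collided)   # 4 3
print(table)             # [56, 93, 23, 87, 32, 65, 26]
```

The keys 32, 56, 87 and 23 go straight to their home buckets, while 65, 26 and 93 collide, giving the ratio 4/3.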

(181)
(182)

Asymptotic Notation

• little Oh (o) : o-notation is used to denote an upper bound that is not asymptotically tight.

f(n) = o(g(n)) ,

if for any positive constant c > 0 there exists a constant n0 > 0 such that

0 ≤ f(n) < c.g(n) for all n, n ≥ n0.

The function f(n) becomes insignificant relative to g(n) as n approaches infinity, i.e. lim (n → ∞) f(n)/g(n) = 0.

• For Example : 2n = o(n^2)

(183)

Asymptotic Notation

• little Omega (ω) : ω-notation is used to denote a lower bound that is not asymptotically tight.

f(n) = ω(g(n)) ,

if for any positive constant c > 0 there exists a constant n0 > 0 such that

0 ≤ c.g(n) < f(n) for all n, n ≥ n0.

The function f(n) becomes arbitrarily large relative to g(n) as n approaches infinity, i.e. lim (n → ∞) f(n)/g(n) = ∞.

• For Example : 2n^2 = ω(n)

(184)

Asymptotic Notations Relationship

f(n) = O(g(n)) means f(n) grows no faster than g(n).

f(n) = o(g(n)) means f(n) grows strictly slower than g(n).

f(n) = Ω(g(n)) means f(n) grows no slower than g(n).

f(n) = ω(g(n)) means f(n) grows strictly faster than g(n).
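For the two strict notations, the limit of the ratio f(n)/g(n) gives a quick test. A sketch, assuming g(n) > 0 for all sufficiently large n and that the limit exists:

```latex
\begin{aligned}
f(n) = o(g(n)) &\iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0,
  &\text{e.g. } \frac{2n}{n^2} = \frac{2}{n} \to 0 \\
f(n) = \omega(g(n)) &\iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty,
  &\text{e.g. } \frac{2n^2}{n} = 2n \to \infty
\end{aligned}
```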

(185)
(186)
