Computer code = data structure + algorithm

We need to know basic algorithms and data structures so that we can produce good-quality computer code.

One cannot become a computer professional without good knowledge of these algorithms and data structures.

A data structure organizes information.

An algorithm manipulates information.

An algorithm is a set of well-defined rules for solving a problem.

It must be finite and deterministic, and each step, as well as the order of the steps, must be precisely defined.

Data types in C++:

simple: int, float, bool

aggregate: array, struct

An aggregate data type is an example of a data structure.

It consists of:

• simple members,

• relationship among the members,

• operations on the data structure that allow the manipulation of its members.

In programming languages, some data structures are built in (arrays, structures).

Many other data structures are often needed, and they can be implemented using the built-in data structures.

These are called user-defined data structures.

We study in this course the data structures and algorithms that are needed

• very often,

• in many applications.

For each data structure we discuss several ways of implementing it.

Abstract Data Type or ADT

A specification of a type that

displays the important features,

suppresses implementation-specific details.

It defines a data type by its input-output relationship.

Example: A dictionary D:

//A collection of words and their meaning

Operations:

search(D,x,y);

// search D for word x; if found, y is the definition of x

insert(D,x,y);

// insert a new word x with meaning y into D

delete(D,x);

// delete the word x and its definition from D

A data structure is a specific implementation of an ADT.

Each implementation has some advantages and some disadvantages.

Main goals of the course:

1. Learn the commonly used ADTs and their implementations.

2. Learn the costs and benefits of basic data structures.

3. Know how to measure the costs and benefits.

4. Learn basic algorithms and basic methods used in the design of algorithms.

5. Learn how to select algorithms and data structures.

Algorithm Analysis

It is the process of finding the run-time and memory space needed by a given algorithm.

Example: Linear Search

int Find_el(int K, int * array, int n)
{ // find whether K is in the first n locations
  // of the array
  int i = 0;
  while ((i < n) && (array[i] != K)) i++;
  return i;
  // if i == n then K is not in the array
}

We calculate the run-time T(n) and memory space S(n) as functions of the input size n.

Run-time: the approximate number of operations to be performed when searching among n elements.

It is a computer-dependent measure.

The precise count depends on the computer architecture:

- some parallelism can exist on a computer,

- not all operations take the same time,

- basic operations vary among computers,

- different compilers produce different code.

However, for large inputs n, these differences are not as important as the differences among algorithms.

   n   100 log₂n   20n + 5   3n² + 7   2ⁿ
   2         100        45        19   4
   5         232       105        82   32
  10         332       205       307   1024
  20         432       405      1207   1048576
 100         664      2005     30007   1.27 ∗ 10³⁰

So if typical values of n are > 100, then the additive and multiplicative constants are not that important.

For n > 100,

100 log₂n < 20n + 5 < 3n² + 7 < 2ⁿ

So for sufficiently large values of n (with positive constants and c₄ > 1),

c₁ log₂n < c₂n < c₃n² < c₄ⁿ

If we do 1000 million operations per second, 1.27 ∗ 10³⁰ operations take > 10²⁰ seconds > 10¹² years >> age of the universe.

To remove unimportant constants from run-time and space, we introduce order notation.

Definition

T(n) is in O(f(n)) (in the order of f(n)) iff there are positive constants c and k such that

T(n) ≤ c · f(n) for all values of n greater than k.

T(n) is in Ω(f(n)) iff there are positive constants c and k such that

T(n) ≥ c · f(n) for all values of n greater than k.

T(n) is in Θ(f(n)) iff T(n) is in Ω(f(n)) and also in O(f(n)).
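As a worked instance of the O definition, take T(n) = 20n + 5 and f(n) = n. The sketch below checks numerically that the constants c = 21 and k = 5 satisfy the inequality. These constants are one possible choice made for illustration, and the helper name boundHolds is hypothetical. A finite check cannot prove the bound; it only illustrates what the definition asks for.

```cpp
#include <cassert>

// T(n) = 20n + 5, the run-time function being bounded.
long long T(long long n) { return 20 * n + 5; }

// f(n) = n, the candidate bound.
long long f(long long n) { return n; }

// Checks T(n) <= c * f(n) for every n with k < n <= limit.
// Hypothetical helper: it tests finitely many n, so it illustrates
// the definition rather than proving the bound.
bool boundHolds(long long c, long long k, long long limit) {
    assert(limit > k);
    for (long long n = k + 1; n <= limit; ++n)
        if (T(n) > c * f(n)) return false;
    return true;
}
```

With c = 21 the inequality 20n + 5 ≤ 21n holds exactly when n ≥ 5, so any k ≥ 4 works. With c = 20 no k works, because 20n + 5 > 20n for every n.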


For the linear search among n elements we state that:

worst case is O(n),

average case is O(n),

best case is O(1), i.e. constant time.

We call this the asymptotic analysis of the algorithm.

We could state the same for the lower bounds, and so for the linear search among n elements:

worst case is Θ(n),

average case is Θ(n),

best case is Θ(1), i.e. constant time.
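The best and worst cases can be observed directly by counting how many array elements the search examines. The following is a minimal sketch; the function name findCounting and the extra parameter examined are assumptions added for illustration and are not part of the course code.

```cpp
#include <cassert>

// Linear search that also reports how many elements it examined.
// Returns the index of K, or n if K is not among the first n elements.
// (findCounting and 'examined' are hypothetical names for this sketch.)
int findCounting(int K, const int* array, int n, int& examined) {
    assert(n >= 0);
    examined = 0;
    int i = 0;
    while (i < n) {
        ++examined;                 // one comparison of array[i] with K
        if (array[i] == K) break;
        ++i;
    }
    return i;
}
```

Searching for the first element examines one element (best case); searching for the last element, or for a value that is absent, examines all n elements (worst case).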


Rules for Asymptotic Analysis

algorithm A()
{ Step 1;
  Step 2;
  Step 3;
  .          // there is no loop
  .
  .
  Step i;
}

Sum rule:

T_A(n) = T_{Step 1}(n) + T_{Step 2}(n) + · · · + T_{Step i}(n)

For asymptotic analysis, concentrate on loops and function calls.

Nested statements:

algorithm B()
{ while (condition(n)) {
    C;
  }
}

Product rule:

If the while loop is repeated f(n) times, then

T_B(n) = f(n) ∗ T_C(n)

Assume the input size is n and k is a fixed constant (k > 1 for the second loop).

for (int i = 1; i <= n; i += k) {
  statement;
}

The above loop is repeated about n/k times.

for (int i = 1; i <= n; i *= k) {
  statement;
}

The above loop is repeated about log_k(n) times.

int j = 0;
for (int i = 1; i <= n; i += j) {
  statement;
  j++;     // value of j is being increased
}

The above loop is repeated on the order of the square root of n times.
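The three repetition counts above can be checked by running each loop shape with a counter in place of the statement. This is a small sketch; the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>

// Each function runs one of the three loop shapes and counts its iterations.

int countAdditive(int n, int k) {          // i grows by a constant k
    assert(k >= 1);
    int count = 0;
    for (int i = 1; i <= n; i += k) ++count;
    return count;
}

int countMultiplicative(int n, int k) {    // i is multiplied by k
    assert(k >= 2);                        // k = 1 would never terminate
    int count = 0;
    for (long long i = 1; i <= n; i *= k) ++count;
    return count;
}

int countGrowingStep(int n) {              // the step j grows by 1 each time
    int count = 0;
    int j = 0;
    for (int i = 1; i <= n; i += j) {
        ++count;
        ++j;
    }
    return count;
}
```

For n = 100 and k = 2 the counts are 50, 7, and 14. In the third loop, i takes the values 1, 2, 4, 7, 11, ..., that is 1 + m(m+1)/2 after m iterations, so the count grows like √(2n).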


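Example

Input is an array of size n.

Each statement st_i contains only basic operations.

int i = 0;
while ((i < n) && condition(array[i])) {   // outer loop
  st_1;
  st_2;
  int j = 1;
  while (j <= n) {                         // inner loop
    st_3;
    st_4;
    j = 2*j;
  }
  i = i+2;
}

The loops are nested, so use the product rule: assuming condition() takes constant time, the outer loop is repeated at most n/2 times and the inner loop about log₂n times, so T(n) is O(n log₂n).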

Binary search

int binary(int K, int * array, int n)
{ // find whether K is in the first n locations
  // of the sorted array
  int l = -1;     // 1 less than where to search
  int r = n;      // 1 more than where to search
  while (l+1 != r) {
    int mid = (l+r)/2;
    if (K == array[mid]) return mid;
    if (K < array[mid]) r = mid;
    else l = mid;
  }
  return n;       // K is not in the array
}

Assume we want to count comparisons:

Loop repetitions:

best case: once

worst case: log₂n times

average case: log₂n − 1 times

Number of comparisons each time in the loop:

best case: 1 (when we find the element)

worst case: 2

average case: 2

Use the product rule to find T(n).

For the binary search, in the worst case:

T(n) = 2 log₂n = O(log₂n)

int binary(int K, int * array, int n)
{ // find whether K is in the first n locations
  // of the sorted array
  int l = -1;     // 1 less than where to search
  int r = n;      // 1 more than where to search
  while (l+1 != r) {
    int mid = (l+r)/2;
    if (K < array[mid]) r = mid;
    else {
      if (K == array[mid]) return mid;
      l = mid;
    }
  }
  return n;       // K is not in the array
}

On average, we do only 1.5 comparisons in the loop.

In the previous version we always do 2 comparisons (except when we find K).

So this version is better than the previous one.

It is an example of fine-tuning an algorithm. However, it remains in O(log₂n).
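The effect of reordering the two tests can be measured by instrumenting both versions with a comparison counter. The sketch below does this; the names binaryEqFirst and binaryLessFirst and the comparisons parameter are assumptions introduced for illustration.

```cpp
#include <cassert>

// Version 1: test for equality first, then for "less than".
int binaryEqFirst(int K, const int* array, int n, int& comparisons) {
    assert(n >= 0);
    comparisons = 0;
    int l = -1, r = n;
    while (l + 1 != r) {
        int mid = (l + r) / 2;
        ++comparisons;
        if (K == array[mid]) return mid;
        ++comparisons;
        if (K < array[mid]) r = mid;
        else l = mid;
    }
    return n;                       // K is not in the array
}

// Version 2: test for "less than" first; equality only when that fails.
int binaryLessFirst(int K, const int* array, int n, int& comparisons) {
    assert(n >= 0);
    comparisons = 0;
    int l = -1, r = n;
    while (l + 1 != r) {
        int mid = (l + r) / 2;
        ++comparisons;
        if (K < array[mid]) r = mid;
        else {
            ++comparisons;
            if (K == array[mid]) return mid;
            l = mid;
        }
    }
    return n;                       // K is not in the array
}
```

On the sorted array 1, 2, ..., 7, searching for 1 costs 5 comparisons in the first version and 4 in the second. The second version is not cheaper for every individual key (a key found on the first probe costs 2 instead of 1), which is why the saving is stated as an average.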

Space bounds for algorithms are also calculated asymptotically, as for time. Usually, this is much simpler.

Example: Both linear and binary search need Θ(n) space for the array itself, and only Θ(1) additional space.

Space/Time Tradeoff

In many applications we can find faster algorithms, but they need more space. Or we can save some space by using a slower algorithm.

This possibility of trading time for space may be important in some applications.

Simplifications of expressions in asymptotic analysis

If c is a positive constant:

O(f(n) + c) = O(f(n))

O(c f(n)) = O(f(n))

If, asymptotically, f₁ grows faster than f₂:

O(f₁(n) + f₂(n)) = O(f₁(n))

If f₁ is O(g₁) and f₂ is O(g₂), then

f₁(n) ∗ f₂(n) is O(g₁(n) ∗ g₂(n))

If lim_{n→∞} g(n)/f(n) = 0, then f(n) grows faster than g(n):

g(n) is in O(f(n)) and f(n) is in Ω(g(n)).

If lim_{n→∞} g(n)/f(n) = ∞, then g(n) grows faster than f(n).

If lim_{n→∞} g(n)/f(n) = c, where c is a nonzero constant, then f(n) and g(n) grow at the same rate:

f(n) is in O(g(n)) and g(n) is in O(f(n)).

[Figure: loop structure of an algorithm, with each loop labeled by its number of repetitions: c₀n, c₁n, c₂ log n, c₃n, c₄ log(c₉n), c₅ log n, c₆√n, c₇ log n, c₈n]

T(n) = (c₀n) ∗ (c₁n) + c₂ log₂n + (c₃n) ∗ (c₄ log₂(c₉n)) + (c₅ log₂n) ∗ (c₆√n) ∗ (c₇ log₂n) + c₈n

= d₁n² + c₂ log₂n + d₂n log₂n + d₃√n (log₂n)² + c₈n

= O(n²)

Fundamental Data Structures

List:

A finite, ordered sequence of items.

L = (a₀, a₁, a₂, . . . , a_{n−1})

Each element has a position in the list.

a₀: first item, at position 0

a₁: second item, at position 1

...

a_{n−1}: last item, at position n − 1.

All items are of the same type (which could be an aggregate type).

The empty list contains no items: L = ()

Length of a list: the number of items.

If L = (a₀, a₁, a₂, . . . , a_{n−1}), the length of L is n.

head: the beginning of the list

tail: the end of the list

ordered list: elements are positioned according to some total order of elements, ascending or descending

unordered list: no apparent order.

ADT List

class list {                        // list ADT
public:
  list(const int = LIST_SIZE);      // constructor
  ~list();                          // destructor
  void clear();                     // remove all items
  void insert(const ELEM&);         // insert ELEM at curr pos
  void append(const ELEM&);         // insert ELEM at tail
  ELEM remove();                    // remove and return curr elem
  void setFirst();                  // set curr to first pos
  void prev();                      // set curr to prev pos
  void next();                      // set curr to next pos
  int  length() const;              // return the actual length
  void setPos(const int);           // set curr to specified pos
  void setValue(const ELEM&);       // set value at curr
  ELEM currVal() const;             // return value at curr
  bool isEmpty() const;             // true iff list is empty
  bool isInList() const;            // true iff curr is within list
  bool find(const ELEM&);           // true iff ELEM is in list
                                    // from curr on
};

The ADT specifies the set of operations that can be performed on the list, but not how it should be implemented.

We will now consider two very common implementations of a list.

The first implementation uses an array to store the items;

an index curr indicates the current item.

// Array based implementation
class list {                        // array based list
private:
  int msize;                        // maximum list size
  int numinlist;                    // actual number of items
  int curr;                         // position of the current item
  ELEM* listarray;                  // array holding items
public:
  list(const int = LIST_SIZE);      // constructor
  ~list();                          // destructor
  void clear();                     // remove all items
  void insert(const ELEM&);         // insert ELEM at curr pos
  void append(const ELEM&);         // insert ELEM at tail
  ELEM remove();                    // remove and return curr elem
  void setFirst();                  // set curr to first pos
  void prev();                      // set curr to prev pos
  void next();                      // set curr to next pos
  int  length() const;              // return the actual length
  void setPos(const int);           // set curr to specified pos
  void setValue(const ELEM&);       // set value at curr
  ELEM currVal() const;             // return value at curr
  bool isEmpty() const;             // true iff list is empty
  bool isInList() const;            // true iff curr is within list
  bool find(const ELEM&);           // true iff ELEM is in list
                                    // from curr on
};

list::list(const int sz)              // constructor
{ msize = sz; numinlist = curr = 0;
  listarray = new ELEM[sz];
}

list::~list()                         // destructor
{ delete [] listarray; }

void list::prev() { curr--; }

void list::setPos(const int pos)      // set curr to pos
{ curr = pos; }

void list::append(const ELEM& item)   // insert at the tail
{ assert(numinlist < msize);
  listarray[numinlist++] = item;
}

Inserting an item into the list requires shifting the elements of the list.

void list::insert(const ELEM& item)
{ // insert item at the current position
  assert((numinlist < msize) && (curr >= 0)
         && (curr <= numinlist));
  for (int i = numinlist; i > curr; i--)
    // shift elements one position up
    listarray[i] = listarray[i-1];
  listarray[curr] = item;
  numinlist++;
}

Similarly, deleting an item that is not at the tail of the list requires shifting the elements of the list.
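The shifting done by a deletion mirrors the shifting done by insert. The following self-contained sketch shows both in a cut-down array list. It assumes int elements and a fixed capacity of 8, and the struct name ArrayList is hypothetical; the remove() shown here is one plausible implementation, since the slides do not list it.

```cpp
#include <cassert>

typedef int ELEM;                // assumption: int elements for this sketch

struct ArrayList {               // hypothetical cut-down array-based list
    static const int MSIZE = 8;  // assumed fixed capacity
    ELEM listarray[MSIZE];
    int numinlist = 0;           // actual number of items
    int curr = 0;                // position of the current item

    void insert(const ELEM& item) {
        assert(numinlist < MSIZE && curr >= 0 && curr <= numinlist);
        for (int i = numinlist; i > curr; i--)
            listarray[i] = listarray[i - 1];   // shift one position up
        listarray[curr] = item;
        numinlist++;
    }

    // Remove and return the current element.
    ELEM remove() {
        assert(curr >= 0 && curr < numinlist);
        ELEM temp = listarray[curr];
        for (int i = curr; i < numinlist - 1; i++)
            listarray[i] = listarray[i + 1];   // shift one position down
        numinlist--;
        return temp;
    }
};
```

Both loops move up to n elements, which is where the O(n) worst case for insert and remove in the array implementation comes from.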

The main disadvantages of the array-based implementation:

- we must state the maximum length of the list when the list is created,

- inserting or deleting an item in the middle of a list is slow.

Instead, we can use a linked list consisting of nodes.

Each node contains an item and a pointer to the next node.

Nodes are created when additional items are needed.

List node:

[Figure: a list node with a data field (element) and a link field (next)]

class link {                // singly linked list node
public:
  ELEM element;             // holds an item
  link * next;              // pointer to the next node
  link(const ELEM& elemval, link* nextval = NULL)   // constructor 1
  { element = elemval; next = nextval; }
  link(link* nextval = NULL) { next = nextval; }    // constructor 2
  ~link() {}                // destructor
};

// Linked-list List implementation
class list {                        // linked-list based list
private:
  link* head;                       // pointer to list header
  link* tail;                       // pointer to the last item
  link* curr;                       // position of the current item
public:
  list(const int = LIST_SIZE);      // constructor
  ~list();                          // destructor
  void clear();                     // remove all items
  void insert(const ELEM&);         // insert ELEM at curr pos
  void append(const ELEM&);         // insert ELEM at tail
  ELEM remove();                    // remove and return curr elem
  void setFirst();                  // set curr to first pos
  void prev();                      // set curr to prev pos
  void next();                      // set curr to next pos
  int  length() const;              // return the actual length
  void setPos(const int);           // set curr to specified pos
  void setValue(const ELEM&);       // set value at curr
  ELEM currVal() const;             // return value at curr
  bool isEmpty() const;             // true iff list is empty
  bool isInList() const;            // true iff curr is within list
  bool find(const ELEM&);           // true iff ELEM is in list
                                    // from curr on
};

To simplify insertion at the current position, we make curr point to the node preceding the actual current item.

To make this consistent for all items, we keep a header node in the list that precedes the items.

L = (c, m, a, b)

with a being the current element:

[Figure: head → header node → c → m → a → b; curr points to the node holding m, the node preceding the current element a; tail points to the node holding b]

The header node also simplifies some other operations.

list::list(const int size)           // constructor; size is not used
{ tail = head = curr = new link; }

list::~list()                        // destructor
{ while (head != NULL) {
    curr = head;
    head = head->next;
    delete curr;
  }
}

void list::next()
{ if (curr != NULL) curr = curr->next; }

void list::setFirst()                // set curr to first pos
{ curr = head; }

void list::append(const ELEM& item)  // insert at the tail
{ tail->next = new link(item, NULL);
  tail = tail->next;
}
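With curr pointing to the node before the current element and a header node at the front, insertion and removal at the current position need no special case for the first item. The sketch below illustrates this under stated assumptions: int elements, and the hypothetical names Link and LinkedList; the insert() and remove() bodies are one plausible implementation, not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>

typedef int ELEM;                 // assumption: int elements

struct Link {                     // hypothetical singly linked node
    ELEM element;
    Link* next;
    Link(const ELEM& e = ELEM(), Link* n = NULL) : element(e), next(n) {}
};

struct LinkedList {               // hypothetical cut-down linked list
    Link* head;                   // header node, holds no item
    Link* tail;                   // last node
    Link* curr;                   // node preceding the current element

    LinkedList() { head = tail = curr = new Link(); }
    ~LinkedList() {
        while (head != NULL) { Link* t = head; head = head->next; delete t; }
    }

    // Insert item in front of the current element: O(1), no shifting.
    void insert(const ELEM& item) {
        curr->next = new Link(item, curr->next);
        if (tail == curr) tail = curr->next;   // inserted at the end
    }

    // Remove and return the current element: O(1).
    ELEM remove() {
        assert(curr->next != NULL);            // a current element must exist
        Link* t = curr->next;
        ELEM value = t->element;
        curr->next = t->next;
        if (tail == t) tail = curr;            // removed the last node
        delete t;
        return value;
    }
};
```

Because curr never has to be the "node before the header", the same two pointer updates work at the front, in the middle, and at the end of the list.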

A Comparison of Array and Linked-List Implementations

Worst-case time

operation   array              linked list
clear       O(1)               O(n)
insert      O(n)               O(1)
remove      O(n)               O(1)
append      O(1)               O(1)
setFirst    O(1)               O(1)
prev        O(1)               O(n)
next        O(1)               O(1)
length      O(1)               O(n)
setPos      O(1)               O(n)
setValue    O(1)               O(1)
isEmpty     O(1)               O(1)
isInList    O(1)               O(1)
find        O(n) unsorted      O(n)
            O(log₂n) sorted

Thus, the choice depends on which operations are done most of the time.

Space for operations:

None requires more than O(1) additional space.

Space for the data structure:

Array:

The list is static; its size cannot grow past the size specified when it was created. Space is wasted if the list is not near its maximal size.

Good when the list size can be predicted.

Linked list:

The list is dynamic; its size is always exactly what is needed.

We waste space on link fields.

Good when the list size cannot be predicted and varies a lot.

The main disadvantage of the linked-list implementation is the slow operations prev and setPos.

This can be improved by using two pointers in each node:

- one pointer points to the next item,

- one pointer points to the previous item.

[Figure: a doubly linked list node with fields element, prev, and next]

Doubly linked lists

class link {                 // a doubly linked list node
public:
  ELEM element;              // value in the node
  link* next;                // points to the next node
  link* prev;                // points to the previous node
  link(const ELEM& elemval,
       link* nextp = NULL, link* prevp = NULL)      // constructor 1
  { element = elemval;
    next = nextp; prev = prevp; }
  link(link* nextp = NULL, link* prevp = NULL)      // constructor 2
  { next = nextp; prev = prevp; }
  ~link() { }                // destructor
};
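With a prev pointer in every node, inserting after a node and removing a node each take a constant number of pointer updates, and stepping backwards is a single assignment. The sketch below shows the pointer updates; the names DLink, insertAfter, and removeNode are hypothetical, int elements are assumed, and a header node is assumed at the front so that every item has a predecessor.

```cpp
#include <cassert>
#include <cstddef>

typedef int ELEM;                 // assumption: int elements

struct DLink {                    // hypothetical doubly linked node
    ELEM element;
    DLink* next;
    DLink* prev;
    DLink(const ELEM& e = ELEM(), DLink* n = NULL, DLink* p = NULL)
        : element(e), next(n), prev(p) {}
};

// Insert a new node holding item right after node p; returns the new node.
DLink* insertAfter(DLink* p, const ELEM& item) {
    assert(p != NULL);
    DLink* node = new DLink(item, p->next, p);
    if (p->next != NULL) p->next->prev = node;
    p->next = node;
    return node;
}

// Unlink and delete node p (which must not be the header); returns its element.
ELEM removeNode(DLink* p) {
    assert(p != NULL && p->prev != NULL);
    ELEM value = p->element;
    p->prev->next = p->next;
    if (p->next != NULL) p->next->prev = p->prev;
    delete p;
    return value;
}
```

The price is one extra pointer per node and two extra pointer updates per insertion or removal.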

Restricted variants of lists

These are lists that do not need all of the list operations.

Stack:

A list in which all insertions and deletions are done at one end of the list.

top is the end where insertions and deletions take place.

[Figure: a stack drawn as a column of elements, with the top element marked]

The operations on the stack are usually called:

push for inserting an element at the top,

pop for removing the top element.

Both array and linked-list implementations of stacks are used in practice.

In the linked-list implementation, the head of the list is used for the top of the stack.

The array implementation is used when we know the limit on the stack size.

The linked-list implementation is used when the limit on the stack size is not known.

Other name for a stack: LIFO (last in, first out)

class Stack {                 // array based stack
private:
  int top;                    // index of the first free slot
  int size;
  ELEM *listarray;            // array holding stack items
public:
  Stack(const int sz = LIST_SIZE)       // constructor
  { size = sz; top = 0; listarray = new ELEM[sz]; }
  ~Stack()                              // destructor
  { delete [] listarray; }
  void clear()                          // remove all elements
  { top = 0; }
  void push(const ELEM & item)          // push ELEM on top
  { assert(top < size);
    listarray[top++] = item; }
  ELEM pop()                            // remove the top elem
  { assert(!isEmpty());
    return listarray[--top]; }
  ELEM topValue() const                 // get the top value
  { assert(!isEmpty());
    return listarray[top-1]; }
  bool isEmpty() const                  // true iff stack empty
  { return (top == 0); }
};

class Stack {                 // linked-list stack
private:
  link *top;                  // points to the top node
public:
  Stack(const int sz = LIST_SIZE)       // constructor; sz is not used
  { top = NULL; }
  ~Stack()                              // destructor
  { clear(); }
  void clear()                          // remove all elements
  { while (top != NULL)
    { link *ltemp = top; top = top->next; delete ltemp; } }
  void push(const ELEM & item)          // push ELEM on top
  { top = new link(item, top); }
  ELEM pop()                            // remove the top elem
  { assert(!isEmpty());
    link *ltemp = top->next;
    ELEM temp = top->element;
    delete top; top = ltemp;
    return temp; }
  ELEM topValue() const                 // get the top value
  { assert(!isEmpty());
    return top->element; }
  bool isEmpty() const                  // true iff stack empty
  { return (top == NULL); }
};

Queue:

A list in which:

all insertions are done at one end of the list, called the back of the queue,

all deletions are done at the other end of the list, called the front of the queue.

Operations are often called

enqueue for inserting an element,

dequeue for removing an element in a queue.

Both array and linked-list implementations of queues are used in practice.

Other name for a queue: FIFO (first in, first out)

Picture of an array-based implementation:

[Figure: circular array with 13 locations a[0], ..., a[12]; the queue contains the elements 11, 25, 13, 40, 50; the front element is 11 and the rear element is 50; Front = 1, Rear = 6]

Queue empty: front == rear

Queue full: (rear+1) mod size == front

The above queue can contain at most 12 elements, although the array contains 13 locations.

class Queue {                 // array based implementation
private:
  int size;                   // max size of queue + 1
  int front;                  // index prior to front item
  int rear;                   // index of the rear item
  ELEM * listarray;           // array holding items
public:
  Queue(const int sz = LIST_SIZE)       // constructor
  { size = sz+1; front = rear = 0;      // empty queue
    listarray = new ELEM[size]; }
  ~Queue() { delete [] listarray; }     // destructor
  void clear() { front = rear = 0; }    // clear queue
  void enqueue(const ELEM&);            // put one item at rear
  ELEM dequeue();                       // remove item in front
  ELEM firstValue() const               // return the value of front element
  { assert(!isEmpty());
    return listarray[(front+1) % size]; }
  bool isEmpty() const                  // true if empty queue
  { return (front == rear); }
};

void Queue::enqueue(const ELEM& item)
{ assert(((rear+1) % size) != front);   // queue must not be full
  rear = (rear+1) % size;
  listarray[rear] = item;
}

ELEM Queue::dequeue()
{ assert(!isEmpty());
  front = (front+1) % size;
  return listarray[front];
}
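The wrap-around behaviour of the array-based queue, and the reason one array slot is left unused, can be seen in a small self-contained version. This sketch assumes int elements; the class name CircularQueue and the isFull() helper are hypothetical additions for illustration.

```cpp
#include <cassert>

typedef int ELEM;                 // assumption: int elements

class CircularQueue {             // hypothetical cut-down array queue
    int size;                     // number of array slots = capacity + 1
    int front;                    // index just before the front item
    int rear;                     // index of the rear item
    ELEM* listarray;
public:
    explicit CircularQueue(int capacity)
        : size(capacity + 1), front(0), rear(0),
          listarray(new ELEM[capacity + 1]) {}
    ~CircularQueue() { delete[] listarray; }
    bool isEmpty() const { return front == rear; }
    bool isFull() const { return (rear + 1) % size == front; }
    void enqueue(const ELEM& item) {
        assert(!isFull());
        rear = (rear + 1) % size;     // wraps from the last slot to slot 0
        listarray[rear] = item;
    }
    ELEM dequeue() {
        assert(!isEmpty());
        front = (front + 1) % size;
        return listarray[front];
    }
};
```

A queue created with capacity 12 allocates 13 slots. If all 13 slots could be used, a full queue and an empty queue would both have front == rear, so the two states could not be told apart.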

[Figure: linked queue, with front pointing to the first node and rear pointing to the last node]

Queue is empty: front == rear == NULL

class Queue {                 // linked queue implementation
private:
  link* front;                // points to the front el.
  link* rear;                 // points to the rear el.
public:
  Queue(const int sz = LIST_SIZE)       // constructor; sz is not used
  { front = rear = NULL; }              // empty queue
  ~Queue() { clear(); }                 // destructor
  void clear();                         // clear queue
  void enqueue(const ELEM&);            // put one item at rear
  ELEM dequeue();                       // remove item in front
  ELEM firstValue() const               // return the value of front element
  { assert(!isEmpty());
    return front->element; }
  bool isEmpty() const                  // true if empty queue
  { return (front == NULL); }
};

void Queue::enqueue(const ELEM& item)
{ if (rear == NULL)
    front = rear = new link(item, NULL);
  else {
    rear->next = new link(item, NULL);
    rear = rear->next;
  }
}

ELEM Queue::dequeue()
{ assert(!isEmpty());
  ELEM temp = front->element;
  link * ltemp = front;
  front = front->next;
  delete ltemp;
  if (front == NULL) rear = NULL;
  return temp;
}

void Queue::clear()
{ link * ltemp;
  while (front != NULL)
  { ltemp = front; front = front->next; delete ltemp; }
  rear = NULL;
}

Examples of applications of stacks and queues

Stacks

Analysis of texts

Compilers

Implementation of recursive functions

Depth-first search

...

Queues

Queues of file names for printing of files

Queues of file names for writing files to a disc

Queues of incoming mail messages

Queues of outgoing mail messages

Queues of processes for batch processing

...

The complexity of stack and queue operations

Assume the stack or queue is of length n.

push, pop, topValue,

enqueue, dequeue, firstValue

are O(1) in both array and linked-list implementations.

clear and the destructor are:

O(n) for the linked implementation,

O(1) for the array implementation

(and they are not done frequently).

Arrays are slightly faster;

linked lists don't have any upper bound on length.

Examples of applications of stacks and queues

Queues

On a computer system we have:

A queue of file names for printing of files.

A queue of file names for writing files to a disc.

A queue of incoming mail messages.

A queue of outgoing mail messages.

Queues of processes for batch processing, etc.

(Usually arrays are used.)

Queues are used a lot in programs simulating the behavior of complex systems, e.g.:

A simulation of operations at an airport is used to determine how many check-in counters and security checks there should be for smooth operation.

(Usually linked lists are used.)

Stacks

Analysis of texts and expressions,

Compilers

(usually arrays are used).

Implementation of function calls that allows recursion,

Depth-first search,

...

(usually linked lists are used).
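As one small instance of "analysis of texts and expressions", a stack can check whether the brackets in a text are properly nested. The sketch below uses std::stack from the C++ standard library instead of the Stack class defined earlier, so that it is self-contained; the function name bracketsBalanced is hypothetical.

```cpp
#include <stack>
#include <string>

// Returns true iff every bracket in text is closed by a matching bracket
// in the correct order. Non-bracket characters are ignored.
// (Hypothetical helper; uses std::stack rather than the course's Stack.)
bool bracketsBalanced(const std::string& text) {
    std::stack<char> open;                     // unmatched opening brackets
    for (char ch : text) {
        if (ch == '(' || ch == '[' || ch == '{') {
            open.push(ch);
        } else if (ch == ')' || ch == ']' || ch == '}') {
            if (open.empty()) return false;    // nothing to close
            char top = open.top();
            open.pop();
            if ((ch == ')' && top != '(') ||
                (ch == ']' && top != '[') ||
                (ch == '}' && top != '{'))
                return false;                  // wrong kind of bracket
        }
    }
    return open.empty();                       // no bracket left open
}
```

The most recently opened bracket must be the first one closed, which is exactly the LIFO discipline of a stack.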

For each function call a new data area is allocated.

The area contains space for:

1. variables declared inside the function

2. values of value parameters

3. addresses of reference parameters

4. return address

We call this area an activation record of the function. It is of fixed length, known at compile time.

Activation records are placed on the run-time stack.

A call of function f ⇔ push the activation record of f onto the run-time stack.

A return from function f ⇔ pop the activation record of f from the run-time stack.

[Figure: memory layout, showing the space for the operating system, the code of the program, the run-time stack for function calls, and the heap (space for variables created by new)]

Run-time stack

int f(int x)
{ . . }

void g(int y)
{ int temp;
  f(temp);
}

int main()
{ int temp;
  float a;
  g(temp);
}

int count(const ELEM val, link *l)
{ // count the number of occurrences of val in list l
  link * ltemp = l;
  int Sum = 0;
  while (ltemp != NULL)
  { if (ltemp->element == val) Sum++;
    ltemp = ltemp->next;
  }
  return Sum;
}

int rcount(const ELEM val, link *l)
{ // count the number of occurrences of val in list l
  if (l == NULL) return 0;
  if (l->element == val)
    return 1 + rcount(val, l->next);
  return rcount(val, l->next);
}

Binary Trees

A binary tree is made of a set of nodes that is either empty, or

consists of a node called the root and two disjoint binary trees, called the left and right subtrees, into which the remaining nodes are partitioned.

The root of each nonempty subtree is connected to the root of the tree by an edge.

[Figure: a binary tree with nodes a to h, with the root, the left subtree, and the right subtree marked]

Terminology

node, edge, root

(left, right) child, parent

path from node n₁ to n_k: a sequence of nodes n₁, n₂, n₃, . . . , n_k such that n_i is the parent of n_{i+1}, 1 ≤ i ≤ k − 1; the length of the path is k − 1

ancestor, descendant

depth of a node x: the length of the path from the root to x

height of a tree T: one more than the maximal depth of a node

level d: all nodes of depth d

leaf: a node with no child

internal node: a node with at least one child

Basic facts:

There are at most 2ⁱ nodes at level i.

There are at most 2ⁱ − 1 nodes in a tree of height i.

In a nonempty full binary tree T (every node is either a leaf or has two children),

the number of leaves is one more than the number of internal nodes.

In a nonempty binary tree T,

the number of empty subtrees in T is one more than the number of nodes in T.
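The last two facts can be checked on small trees with three recursive counting functions. This is a sketch for illustration only; the struct Node and the function names are hypothetical, and the nodes carry no values because only the shape of the tree matters here.

```cpp
#include <cstddef>

struct Node {                     // hypothetical value-less tree node
    Node* left;
    Node* right;
    Node(Node* l = NULL, Node* r = NULL) : left(l), right(r) {}
};

int countNodes(const Node* rt) {
    if (rt == NULL) return 0;
    return 1 + countNodes(rt->left) + countNodes(rt->right);
}

// Every NULL child pointer stands for one empty subtree.
int countEmptySubtrees(const Node* rt) {
    if (rt == NULL) return 1;
    return countEmptySubtrees(rt->left) + countEmptySubtrees(rt->right);
}

int countLeaves(const Node* rt) {
    if (rt == NULL) return 0;
    if (rt->left == NULL && rt->right == NULL) return 1;
    return countLeaves(rt->left) + countLeaves(rt->right);
}
```

For a full tree with 5 nodes (a root, one internal child with two leaves, and one leaf child) the functions give 3 leaves, 2 internal nodes, and 6 empty subtrees.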

Linked (or pointer-based) representation of binary trees

In many applications we only need to be able to reach the children of every node.

Structure of a node:

[Figure: a node holding the stored value, a pointer to the left child, and a pointer to the right child]

class BinNode {               // binary tree node, linked implementation
public:
  BELEM element;              // the value stored in the node
  BinNode * left;             // ptr. to left child
  BinNode * right;            // ptr. to right child

  BinNode() { left = right = NULL; }    // constr., no node value
  BinNode(BELEM el, BinNode * lptr = NULL,
          BinNode * rptr = NULL)        // constr. with initial values
  { element = el; left = lptr; right = rptr; }
  ~BinNode() {}                         // destructor
  BinNode * leftchild() const { return left; }
  BinNode * rightchild() const { return right; }
  BELEM value() const { return element; }
  void setValue(BELEM val) { element = val; }
  bool isLeaf() const                   // true if node is a leaf
  { return (left == NULL) && (right == NULL); }
};

Since the tree itself has a recursive structure, it is often very simple to state algorithms recursively.

Inorder printout of nodes

void Inorder_print(BinNode * rt)
{ // recursive inorder traversal
  // rt is a pointer to the root of a bin. tree
  if (rt == NULL) return;
  Inorder_print(rt->leftchild());
  cout << rt->value();
  Inorder_print(rt->rightchild());
}

Any other traversal is a permutation of the three statements in the function.
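The two other commonly used permutations are preorder (node, left, right) and postorder (left, right, node). The sketch below shows all three side by side. To keep it self-contained and easy to check, it uses a hypothetical Node struct with char values and appends to a string instead of printing with cout.

```cpp
#include <cstddef>
#include <string>

struct Node {                     // hypothetical node with a char value
    char value;
    Node* left;
    Node* right;
    Node(char v, Node* l = NULL, Node* r = NULL)
        : value(v), left(l), right(r) {}
};

void preorder(const Node* rt, std::string& out) {
    if (rt == NULL) return;
    out += rt->value;             // node first
    preorder(rt->left, out);
    preorder(rt->right, out);
}

void inorder(const Node* rt, std::string& out) {
    if (rt == NULL) return;
    inorder(rt->left, out);
    out += rt->value;             // node between the subtrees
    inorder(rt->right, out);
}

void postorder(const Node* rt, std::string& out) {
    if (rt == NULL) return;
    postorder(rt->left, out);
    postorder(rt->right, out);
    out += rt->value;             // node last
}
```

For a root b with left child a and right child c, the three traversals produce "bac", "abc", and "acb".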

Any recursive algorithm must have at least one base case where no recursion is done.

A test for a base case must precede any recursive call.

Any recursive call must bring the algorithm closer to a base case.

Execution of a recursive algorithm can be represented using a call tree in which we show the steps to be taken. Execution proceeds from left to right, and a left subtree must finish before any other node at the same level is started.

Recursive calls are handled by the run-time system using a stack containing an “activation record” for each unfinished function call.

A nonrecursive implementation of a nontrivial recursive algorithm requires the use of a user-defined stack that takes care of the unfinished possibilities.

Non-recursive preorder printout of nodes

void NR_preorder_print(BinNode * rt)
{ // non-recursive preorder traversal
  // rt is a pointer to the root of a bin. tree
  if (rt == NULL) return;
  Stack S;                    // stack of node pointers (ELEM is BinNode *)
  S.push(rt);
  BinNode * ptr;
  while (!S.isEmpty()) {
    ptr = S.pop();
    while (ptr != NULL) {
      cout << ptr->value() << ' ';
      if (ptr->rightchild() != NULL)
        S.push(ptr->rightchild());
      ptr = ptr->leftchild();
    }
  }
}

In some applications the information in internal nodes is different from the information in the leaf nodes.

Leaf nodes do not need pointers to children.

(There can be very many leaf nodes.)

We can have internal nodes and leaf nodes as two subclasses of the class binary node.

We will use a function isLeaf()

to distinguish between leaf and internal nodes.

It will be a virtual function, i.e. each derived subclass defines it.

class VarBinNode {            // node base class
public:
  virtual bool isLeaf() = 0;  // each derived class must define it
};

class LeafNode : public VarBinNode {    // leaf node subclass
private:
  Operand var;
public:
  LeafNode(const Operand opd) { var = opd; }    // constructor
  bool isLeaf() { return true; }
  Operand & value() { return var; }             // node value
};

class IntNode : public VarBinNode {     // internal node subclass
private:
  Operator opr;
  VarBinNode * left;
  VarBinNode * right;
public:
  IntNode(const Operator & op, VarBinNode * lptr,
          VarBinNode * rptr)                    // constructor
  { opr = op; left = lptr; right = rptr; }
  bool isLeaf() { return false; }
  VarBinNode * leftchild() { return left; }
  VarBinNode * rightchild() { return right; }
  Operator & value() { return opr; }
};

void Inorder_print(VarBinNode * rt)
{ // recursive inorder traversal
  // rt is a pointer to the root of a bin. tree
  if (rt == NULL) return;
  if (rt->isLeaf())
    cout << ((LeafNode *) rt)->value();
  else {
    Inorder_print(((IntNode *) rt)->leftchild());
    cout << ((IntNode *) rt)->value();
    Inorder_print(((IntNode *) rt)->rightchild());
  }
}

Example of binary tree application

Huffman Coding Trees

Frequency of characters in English:

letter  freq.    letter  freq.
A         77     N         67
B         17     O         67
C         32     P         20
D         42     Q          5
E        120     R         59
F         24     S         67
G         17     T         85
H         50     U         37
I         76     V         12
J          4     W         22
K          7     X          4
L         42     Y         22
M         24     Z          2

Arrange in order of frequencies:

Z,2  J,4  X,4  Q,5  K,7  V,12  B,17  G,17  W,22  Y,22  ...  E,120

Build a coding tree by:

1. joining the first two elements of the list into a tree; the weight of this tree = the sum of the weights of the two elements,

2. putting it back in the list in order of weights.

Repeat the above until only one element remains in the list.
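The construction above can be sketched in code as follows. Instead of keeping a sorted list and reinserting each new tree in order, this sketch uses std::priority_queue from the standard library to fetch the two lightest trees, which gives the same sequence of joins. It returns only the length of each letter's code (the depth of its leaf), because the exact 0/1 labels depend on how ties between equal weights are broken. All names (HuffNode, HeavierFirst, huffmanCodeLengths, and so on) are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <utility>
#include <vector>

struct HuffNode {                 // hypothetical coding-tree node
    int weight;
    char letter;                  // meaningful only in leaves
    HuffNode* left;
    HuffNode* right;
    HuffNode(int w, char c, HuffNode* l = NULL, HuffNode* r = NULL)
        : weight(w), letter(c), left(l), right(r) {}
};

struct HeavierFirst {             // makes the priority queue return the lightest tree
    bool operator()(const HuffNode* a, const HuffNode* b) const {
        return a->weight > b->weight;
    }
};

// Records the depth of every leaf, which is the length of its code.
void collectLengths(const HuffNode* rt, int depth, std::map<char, int>& lengths) {
    if (rt->left == NULL && rt->right == NULL) {
        lengths[rt->letter] = depth;
        return;
    }
    collectLengths(rt->left, depth + 1, lengths);
    collectLengths(rt->right, depth + 1, lengths);
}

void destroy(HuffNode* rt) {
    if (rt == NULL) return;
    destroy(rt->left);
    destroy(rt->right);
    delete rt;
}

// Builds the coding tree and returns the code length of each letter.
std::map<char, int> huffmanCodeLengths(
        const std::vector<std::pair<char, int> >& freq) {
    std::map<char, int> lengths;
    if (freq.empty()) return lengths;
    std::priority_queue<HuffNode*, std::vector<HuffNode*>, HeavierFirst> trees;
    for (std::size_t i = 0; i < freq.size(); ++i)
        trees.push(new HuffNode(freq[i].second, freq[i].first));
    while (trees.size() > 1) {
        HuffNode* a = trees.top(); trees.pop();   // the two lightest trees
        HuffNode* b = trees.top(); trees.pop();
        trees.push(new HuffNode(a->weight + b->weight, '\0', a, b));
    }
    HuffNode* root = trees.top();
    collectLengths(root, 0, lengths);
    destroy(root);
    return lengths;
}
```

For the eight letters Z 2, K 7, F 24, C 32, U 37, D 42, L 42, E 120, the joins produce trees of weight 9, 33, 65, 79, 107, 186, and 306, and the code lengths are 1 for E, 3 for U, D, and L, 4 for C, 5 for F, and 6 for Z and K.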

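[Figure: Huffman tree for the letters Z (2), K (7), F (24), C (32), U (37), D (42), L (42), E (120); the internal nodes have weights 9, 33, 65, 79, 107, 186, and 306 (root); each edge is labeled 0 or 1]

Codes: E=0, U=100, D=101, L=110, ... F=11111

1001110101 = UCD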

Binary Search Trees BST

A data structure that allows fast search and fast insertion/deletion of elements.

A frequently used data structure.

Definition

A binary tree T is a Binary Search Tree if for any node n in the tree, the value in n is larger than any value in the left subtree of n and smaller than any value in the right subtree of n.

class BST {
private:
  BinNode* root;
  void clearhelp(BinNode*);
  void inserthelp(BinNode*&, const BELEM);
  void removehelp(BinNode*&, const BELEM);
  void printhelp(const BinNode*, const int) const;
public:
  BST() { root = NULL; }                          // constructor
  ~BST() { clearhelp(root); root = NULL; }        // destructor
  void clear() { clearhelp(root); root = NULL; }
  void insert(const BELEM & val)
  { inserthelp(root, val); }
  void remove(const BELEM & val)
  { removehelp(root, val); }
  BinNode * deletemin(BinNode*&);
  BinNode * find(const BELEM& val) const;
  bool isEmpty() const { return root == NULL; }
  void print() const {
    if (root == NULL) cout << "empty";
    else printhelp(root, 0); }
};

void BST::clearhelp(BinNode * rt)
{ if (rt == NULL) return;
  clearhelp(rt->leftchild());
  clearhelp(rt->rightchild());
  delete rt;
}

void BST::printhelp(const BinNode * rt, const int level) const
{ // print tree in inorder, indentation indicates level
  if (rt == NULL) return;
  printhelp(rt->leftchild(), level+1);    // print left subtree
  for (int i = 0; i < level; i++)         // indent by level
    cout << " ";
  cout << rt->value() << "\n";
  printhelp(rt->rightchild(), level+1);   // print right subtree
}

Nonrecursive find function:

BinNode * BST::find(const BELEM& val) const
// return a pointer to the node containing val
// or NULL if val is not in the tree
{ BinNode * ptr = root;
  BinNode * res = NULL;
  while ((ptr != NULL) && (res == NULL)) {
    if (val < ptr->value()) ptr = ptr->leftchild();
    else if (val == ptr->value()) res = ptr;
    else ptr = ptr->rightchild();
  }
  return res;
}

In the worst case, the loop is repeated as many times as the height of the tree.

If the tree has n nodes and is not very "out of balance", find is O(log₂n).
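How far "out of balance" a tree gets depends on the order in which the keys arrive. The sketch below builds two trees over the keys 1 to 100 with the plain insertion rule (smaller keys go left, others go right) and compares their heights. The names Node, insert, height, insertBalanced, and clearTree are hypothetical, and int keys are assumed.

```cpp
#include <algorithm>
#include <cstddef>

struct Node {                     // hypothetical BST node with an int key
    int value;
    Node* left;
    Node* right;
    explicit Node(int v) : value(v), left(NULL), right(NULL) {}
};

// Plain BST insertion: the new node always becomes a leaf.
void insert(Node*& rt, int val) {
    if (rt == NULL) rt = new Node(val);
    else if (val < rt->value) insert(rt->left, val);
    else insert(rt->right, val);
}

// Height as defined earlier: one more than the maximal depth; 0 if empty.
int height(const Node* rt) {
    if (rt == NULL) return 0;
    return 1 + std::max(height(rt->left), height(rt->right));
}

// Inserts the keys lo..hi middle key first, which yields a balanced tree.
void insertBalanced(Node*& rt, int lo, int hi) {
    if (lo > hi) return;
    int mid = lo + (hi - lo) / 2;
    insert(rt, mid);
    insertBalanced(rt, lo, mid - 1);
    insertBalanced(rt, mid + 1, hi);
}

void clearTree(Node* rt) {
    if (rt == NULL) return;
    clearTree(rt->left);
    clearTree(rt->right);
    delete rt;
}
```

Inserting 1, 2, ..., 100 in increasing order produces a chain of height 100, so find degrades to a linear search. Inserting the same keys middle-first produces a tree of height 7.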

Insertion of a new node

A new node is always inserted as a leaf.

void BST::inserthelp(BinNode *& rt, const BELEM val)
{ if (rt == NULL) rt = new BinNode(val, NULL, NULL);
  else if (val < rt->value()) inserthelp(rt->left, val);
  else inserthelp(rt->right, val);
}

Delete the smallest element from a BST.

It does not have a left child!

BinNode * BST::deletemin(BinNode *& rt)
{ // unlink the smallest el. and return a pointer to it
  assert(rt != NULL);
  if (rt->left != NULL) return deletemin(rt->left);
  else                        // rt points to the minimal element
  { BinNode * ptr = rt;
    rt = rt->right;
    return ptr;
  }
}

Remove a node from BST

1. Find the node X to remove.

2. If X does not have any child, remove X.

3. If X has only one child, remove X and make the child of X the child of the parent of X.

4. If X has both children, then

(a) remove the minimal element M in the right subtree of X,

(b) replace X by M.

void BST::removehelp(BinNode *& rt, const BELEM val)
{ // remove the node containing val
  if (rt == NULL) cout << val << " not in BST";
  else if (val < rt->value())
    removehelp(rt->left, val);
  else if (val > rt->value())
    removehelp(rt->right, val);
  else {                      // we have the node to remove
    BinNode * ptr = rt;
    if (rt->left == NULL) rt = rt->right;
    else if (rt->right == NULL)
      rt = rt->left;
    else {                    // both subtrees nonempty
      ptr = deletemin(rt->right);
      rt->setValue(ptr->value());
    }
    delete ptr;
  }
}

[Figure: a binary tree with nodes a to l]

Insertion and removal of nodes in a BST can be modified so that the BST remains “balanced”,

and then find, insertion, and deletion are all

O(log₂n) in the worst case.

See AVL trees, or balanced BSTs, in any book on algorithms.