Computer code = data structure + algorithm
We need to know basic algorithms and data structures so that we can produce good-quality computer code.
One cannot become a computer professional without good knowledge of these algorithms and data structures.
A data structure organizes information.
An algorithm manipulates information.
An algorithm is a set of well-defined rules for solving a problem.
It must be finite and deterministic: each step is precisely defined,
and the order of the steps is precisely defined.
Data types in C++:
simple: int, float, bool
aggregate: array, struct
An aggregate data type is an example of a data structure.
It consists of:
• simple members,
• relationship among the members,
• operations on the data structure that allow the manipulation of its members.
In programming languages, some data structures are built in (arrays, structures).
Many other data structures are often needed and they can be implemented using the built-in data structures.
These are called user-defined data structures.
We study in this course the data structures and algorithms that are needed
• very often,
• in many applications.
For each data structure we discuss several ways of implementing it.
Abstract Data Type or ADT
A specification of a type that
displays the important features,
suppresses implementation specific details.
It defines a data type by its input-output relationship.
Example: A dictionary D:
//A collection of words and their meaning
Operations:
search(D,x,y);
// search D for word x and if found, y is the definition of x
insert(D,x,y);
//insert a new word x with meaning y into D
delete(D,x);
//delete the word x and its definition from D
A data structure is a specific implementation of an ADT.
Each implementation has some advantages and some disadvantages.
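The distinction between an ADT and a data structure can be sketched in C++ as follows. This is only an illustration: the class names Dictionary and MapDictionary, the bool return value of search, and the choice of std::map are assumptions made for this example, and remove stands in for delete, which is a reserved word in C++.

```cpp
#include <cassert>
#include <map>
#include <string>

// The ADT: only the operations are specified, not how they are carried out.
class Dictionary {
public:
    virtual ~Dictionary() {}
    // search for word x; if found, store its definition in y and return true
    virtual bool search(const std::string& x, std::string& y) const = 0;
    // insert a new word x with meaning y
    virtual void insert(const std::string& x, const std::string& y) = 0;
    // delete the word x and its definition
    virtual void remove(const std::string& x) = 0;
};

// One possible data structure implementing the ADT (assumption: std::map).
class MapDictionary : public Dictionary {
    std::map<std::string, std::string> words;
public:
    bool search(const std::string& x, std::string& y) const override {
        std::map<std::string, std::string>::const_iterator it = words.find(x);
        if (it == words.end()) return false;
        y = it->second;
        return true;
    }
    void insert(const std::string& x, const std::string& y) override {
        words[x] = y;
    }
    void remove(const std::string& x) override {
        words.erase(x);
    }
};
```

Code written against Dictionary keeps working if MapDictionary is later replaced by another implementation with different costs.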
Main goals of the course:
1. Learn the commonly used ADTs and their implementations.
2. Learn the costs and benefits of basic data structures.
3. Know how to measure the costs and benefits.
4. Learn basic algorithms and basic methods used in design of algorithms.
5. Learn how to select algorithms and data structures.
Algorithm Analysis
It is the process of finding the run-time and memory space needed by a given algorithm.
Example: Linear Search
int Find_el(int K, int * array, int n)
{ // find whether K is in the first n locations
  // of the array
  int i = 0;
  while ((i < n) && (array[i] != K)) i++;
  return i;
  // if i == n then K is not in the array
}
We calculate the run-time T(n) and the memory space S(n) as functions of the input size n.
Run-time: the approximate number of operations performed when searching among n elements.
It is a computer dependent measure.
The precise count depends on the computer architecture:
- some parallelism can exist on a computer.
- not all operations take the same time.
- basic operations vary among computers.
- different compilers produce different code.
However, for large inputs n, these differences are not as important as the differences among algorithms.
n      100 log2(n)   20n + 5   3n^2 + 7   2^n
2      100           45        19         4
5      232           105       82         32
10     332           205       307        1024
20     432           405       1207       1048576
100    664           2005      30007      1.27 * 10^30
So if typical values of n are > 100, then the additive and multiplicative constants are not that important.
For n > 100,
$100 \log_2 n < 20n + 5 < 3n^2 + 7 < 2^n$
So for sufficiently large values of n (with positive constants $c_1, c_2, c_3$ and $c_4 > 1$),
$c_1 \log_2 n < c_2 n < c_3 n^2 < c_4^n$
If we do 1000 million operations per second, $1.27 \cdot 10^{30}$ operations take more than $10^{20}$ seconds,
which is more than $10^{12}$ years, far longer than the age of the universe.
To remove unimportant constants from run-time and space, we introduce order notation.
Definition
$T(n)$ is in $O(f(n))$ (in the order of $f(n)$) iff there are positive constants $c$ and $k$ such that
$T(n) \le c \cdot f(n)$ for all values of $n$ greater than $k$.
$T(n)$ is in $\Omega(f(n))$ iff there are positive constants $c$ and $k$ such that
$T(n) \ge c \cdot f(n)$ for all values of $n$ greater than $k$.
$T(n)$ is in $\Theta(f(n))$ iff it is in $\Omega(f(n))$ and also in $O(f(n))$.
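A short worked example may help here. It takes the function 20n + 5 from the table of growth rates; the constants c = 21 and k = 5 are just one convenient choice, many others work equally well.

```latex
\[
20n + 5 \le 20n + n = 21n \quad \text{for all } n \ge 5,
\qquad \text{so } 20n + 5 \text{ is in } O(n) \text{ with } c = 21,\ k = 5.
\]
\[
20n + 5 \ge 20n \quad \text{for all } n \ge 1,
\qquad \text{so } 20n + 5 \text{ is in } \Omega(n) \text{ with } c = 20,\ k = 1.
\]
\[
\text{Both bounds hold, hence } 20n + 5 \text{ is in } \Theta(n).
\]
```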
For the linear search among n elements we state that:
worst case is O(n),
average case is O(n),
best case is O(1) .... constant time.
We call this the asymptotic analysis of the algorithm.
We could state the same for the lower bounds, and so for the linear search among n elements:
worst case is Θ(n),
average case is Θ(n),
best case is Θ(1) .... constant time.
Rules for Asymptotic Analysis
algorithm A()
{ Step 1;
  Step 2;
  Step 3;
  .        // there is no loop
  .
  .
  Step i;
}
Sum rule:
$T_A(n) = T_{\text{Step 1}}(n) + T_{\text{Step 2}}(n) + \cdots + T_{\text{Step i}}(n)$
For asymptotic analysis, concentrate on loops and function calls.
Nested statements:
algorithm B()
{ while (condition(n))
  { C;
  }
}
Product rule:
If the while loop is repeated $f(n)$ times, then
$T_B(n) = f(n) \cdot T_C(n)$
Assume the input size is n and k is a fixed constant.
for (int i = 1; i <= n; i += k)
{ statement;
}
The above loop is repeated about n/k times.
for (int i = 1; i <= n; i *= k)   // requires k > 1
{ statement;
}
The above loop is repeated about $\log_k n$ times.
int j = 0;
for (int i = 1; i <= n; i += j)
{ statement;
  j++;   // the value of j is being increased
}
The above loop is repeated about $\sqrt{2n}$ times, i.e. on the order of the square root of n.
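The three repetition counts can be checked by simply counting iterations. The sketch below does that; the function names are invented for this illustration and the loop bodies are reduced to a counter.

```cpp
#include <cassert>

// Each function runs one of the three loops and returns how often
// the loop body was executed.
long additiveSteps(long n, long k) {           // i += k
    assert(k > 0);
    long count = 0;
    for (long i = 1; i <= n; i += k) count++;
    return count;
}

long multiplicativeSteps(long n, long k) {     // i *= k
    assert(k > 1);                             // otherwise i never grows
    long count = 0;
    for (long i = 1; i <= n; i *= k) count++;
    return count;
}

long growingSteps(long n) {                    // the step j grows by 1 each time
    long count = 0;
    long j = 0;
    for (long i = 1; i <= n; i += j) {
        count++;
        j++;
    }
    return count;
}
```

For n = 100 the third loop runs 14 times, which is close to the square root of 2n (about 14.1); the factor under the root does not matter asymptotically.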
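Example
Input is an array of size n.
Each statement st_i consists of a constant number of basic operations.
int i = 0;
while ((i < n) && condition(array[i]))
{ // outer loop
  st_1;
  st_2;
  int j = 1;
  while (j <= n)
  { // inner loop
    st_3;
    st_4;
    j = 2*j;
  }
  i = i+2;
}
Loops are nested, use the product rule.
The outer loop is repeated at most about n/2 times and the inner loop about $\log_2 n$ times, so $T(n)$ is in $O(n \log_2 n)$.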
Binary search
int binary(int K, int * array, int n)
{ // find whether K is in the first n locations
  // of the sorted array
  int l = -1; // 1 less than where to search
  int r = n;  // 1 more than where to search
  while (l+1 != r)
  { int mid = (l+r)/2;
    if (K == array[mid]) return mid;
    if (K < array[mid]) r = mid;
    else l = mid;
  }
  return n; // K is not in the array
}
Assume we want to count comparisons:
Loop repetitions:
best case: once
worst case: $\log_2 n$ times
average case: $\log_2 n - 1$ times
Number of comparisons each time in the loop:
best case: 1 (when we find the element)
worst case: 2
average case: 2
Use the product rule to find $T(n)$.
For the binary search, in the worst case:
$T(n) = 2 \log_2 n = O(\log_2 n)$
int binary(int K, int * array, int n)
{ // find whether K is in the first n locations
  // of the sorted array
  int l = -1; // 1 less than where to search
  int r = n;  // 1 more than where to search
  while (l+1 != r)
  { int mid = (l+r)/2;
    if (K < array[mid]) r = mid;
    else
    { if (K == array[mid]) return mid;
      l = mid;
    }
  }
  return n; // K is not in the array
}
On average, we do only 1.5 comparisons in the loop.
In the previous version we always do 2 comparisons (except when we find K).
It is better than the previous version.
It is an example of fine-tuning an algorithm. However, it remains in $O(\log_2 n)$.
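The effect of the fine-tuning can be observed by counting key comparisons in both versions. The following is a sketch under assumptions: the names binaryV1, binaryV2 and totalComparisons, the extra counter parameter, and the use of std::vector are all introduced for this illustration only.

```cpp
#include <cassert>
#include <vector>

// Version 1: test for equality first. Adds the key comparisons made to cmp.
int binaryV1(int K, const std::vector<int>& a, long& cmp) {
    int l = -1, r = static_cast<int>(a.size());
    while (l + 1 != r) {
        int mid = (l + r) / 2;
        cmp++;
        if (K == a[mid]) return mid;
        cmp++;
        if (K < a[mid]) r = mid;
        else l = mid;
    }
    return static_cast<int>(a.size());
}

// Version 2: test "less than" first, equality only when needed.
int binaryV2(int K, const std::vector<int>& a, long& cmp) {
    int l = -1, r = static_cast<int>(a.size());
    while (l + 1 != r) {
        int mid = (l + r) / 2;
        cmp++;
        if (K < a[mid]) r = mid;
        else {
            cmp++;
            if (K == a[mid]) return mid;
            l = mid;
        }
    }
    return static_cast<int>(a.size());
}

// Total comparisons used by successful searches for every key 0..n-1
// in the sorted array (0, 1, ..., n-1).
long totalComparisons(int n, bool secondVersion) {
    std::vector<int> a(n);
    for (int i = 0; i < n; i++) a[i] = i;
    long cmp = 0;
    for (int K = 0; K < n; K++) {
        int pos = secondVersion ? binaryV2(K, a, cmp) : binaryV1(K, a, cmp);
        assert(pos == K);
    }
    return cmp;
}
```

For an array of 1023 keys the second version needs fewer comparisons in total, although a key found near the top of the search costs slightly more than in the first version.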
Space bounds for algorithms are also calculated asymptotically, as for time. Usually, this is much simpler.
Example: Both linear and binary search need Θ(n) space for the array that holds the input, and only Θ(1) additional space.
Space/Time Tradeoff
In many applications we can find faster algorithms, but we need to use more space. Or we can save some space by using a slower algorithm.
This possibility of trading time for space may be important in some applications.
Simplifications of expressions in asymptotic analysis
c is a constant:
$O(f(n) + c) = O(f(n))$
$O(c \cdot f(n)) = O(f(n))$
If, asymptotically, $f_1$ grows faster than $f_2$:
$O(f_1(n) + f_2(n)) = O(f_1(n))$
If $f_1$ is in $O(g_1)$ and $f_2$ is in $O(g_2)$, then $f_1(n) \cdot f_2(n)$ is in $O(g_1(n) \cdot g_2(n))$.
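If $\lim_{n\to\infty} \frac{g(n)}{f(n)} = 0$, then $f(n)$ grows faster than $g(n)$:
$g(n)$ is in $O(f(n))$ and $f(n)$ is in $\Omega(g(n))$.
If $\lim_{n\to\infty} \frac{g(n)}{f(n)} = \infty$, then $g(n)$ grows faster than $f(n)$:
$f(n)$ is in $O(g(n))$ and $g(n)$ is in $\Omega(f(n))$.
If $\lim_{n\to\infty} \frac{g(n)}{f(n)} = c$, where $c$ is a nonzero constant,
then $f(n)$ and $g(n)$ grow at the same rate:
$f(n)$ is in $O(g(n))$ and $g(n)$ is in $O(f(n))$.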
Loop structure of an algorithm:
[Figure: loop structure of an algorithm; each loop is annotated with its number of repetitions: c0 n, c1 n, c2 log2 n, c3 n, c4 log2(c9 n), c5 log2 n, c6 √n, c7 log2 n, c8 n]
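$T(n) = (c_0 n)(c_1 n) + c_2 \log_2 n + (c_3 n)(c_4 \log_2(c_9 n)) + (c_5 \log_2 n)(c_6 \sqrt{n})(c_7 \log_2 n) + c_8 n$
$= d_1 n^2 + c_2 \log_2 n + d_2 n \log_2 n + d_3 \sqrt{n}\,(\log_2 n)^2 + d_4 n$
$= O(n^2)$
where $d_1 = c_0 c_1$, $d_2 = c_3 c_4$, $d_3 = c_5 c_6 c_7$ and $d_4 = c_8 + c_3 c_4 \log_2 c_9$.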
Fundamental Data Structures
List:
A finite, ordered sequence of items.
$L = (a_0, a_1, a_2, \ldots, a_{n-1})$
Each element has a position in the list.
$a_0$: first item, at position 0
$a_1$: second item, at position 1
...
$a_{n-1}$: last item, at position $n-1$.
All items are of the same type (could be an aggregate type).
Empty list contains no items: $L = ()$
Length of a list: number of items.
If $L = (a_0, a_1, a_2, \ldots, a_{n-1})$, the length of $L$ is $n$.
head: the beginning of the list
tail: the end of the list
ordered list: elements positioned according to some total order of elements, ascending or descending
unordered list: no apparent order.
ADT List
class list { // list ADT
public:
  list(const int = LIST_SIZE);   // constructor
  ~list();                       // destructor
  void clear();                  // remove all items
  void insert(const ELEM&);      // insert ELEM at curr pos
  void append(const ELEM&);      // insert ELEM at tail
  ELEM remove();                 // remove and return curr elem
  void setFirst();               // set curr to first pos
  void prev();                   // set curr to prev pos
  void next();                   // set curr to next pos
  int length() const;            // return the actual length
  void setPos(const int);        // set curr to specif. pos
  void setValue(const ELEM&);    // set value at curr
  ELEM currVal() const;          // return value at curr
  bool isEmpty() const;          // true iff list is empty
  bool isInList() const;         // true iff curr within list
  bool find(const ELEM&);        // true iff ELEM in list
                                 // from curr on
};
The ADT specifies the set of operations that can be performed on the list, but not how it should be implemented.
We will now consider two very common implementations of a list.
The first implementation uses an array to store the items;
an index curr indicates the current item.
// Array based implementation
class list { // array based list
private:
  int msize;        // maximum list size
  int numinlist;    // actual number of items
  int curr;         // position of the current item
  ELEM* listarray;  // array holding items
public:
  list(const int = LIST_SIZE);   // constructor
  ~list();                       // destructor
  void clear();                  // remove all items
  void insert(const ELEM&);      // insert ELEM at curr pos
  void append(const ELEM&);      // insert ELEM at tail
  ELEM remove();                 // remove and return curr elem
  void setFirst();               // set curr to first pos
  void prev();                   // set curr to prev pos
  void next();                   // set curr to next pos
  int length() const;            // return the actual length
  void setPos(const int);        // set curr to specif. pos
  void setValue(const ELEM&);    // set value at curr
  ELEM currVal() const;          // return value at curr
  bool isEmpty() const;          // true iff list is empty
  bool isInList() const;         // true iff curr within list
  bool find(const ELEM&);        // true iff ELEM in list
                                 // from curr on
};
list::list(const int sz) // constructor
{ msize = sz; numinlist = curr = 0;
  listarray = new ELEM[sz];
}
list::~list() // destructor
{ delete [] listarray; }
void list::prev() { curr--; }
void list::setPos(const int pos) // set curr to pos
{ curr = pos; }
void list::append(const ELEM& item) // insert at the tail
{ assert(numinlist < msize);
  listarray[numinlist++] = item;
}
Inserting an item in the list requires shifting the elements in the list.
void list::insert(const ELEM& item)
{ // insert item at the current position
  assert((numinlist < msize) && (curr >= 0)
         && (curr <= numinlist));
  for (int i = numinlist; i > curr; i--)
    // shift elements one position up
    listarray[i] = listarray[i-1];
  listarray[curr] = item;
  numinlist++;
}
Similarly the deletion of an item not at the tail of the list requires a shifting of the elements in the list.
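The shifting needed for a deletion can be sketched in the same style as insert. The fragment below is a free-standing illustration, not the course's remove(): the function name removeAt, the explicit parameters, and the choice of int for ELEM are assumptions made to keep it self-contained.

```cpp
#include <cassert>

typedef int ELEM;   // assumption: the item type is int in this sketch

// Remove and return the item at position curr of an array that holds
// numinlist items; the items after curr are shifted one position down.
ELEM removeAt(ELEM* listarray, int& numinlist, int curr) {
    assert((curr >= 0) && (curr < numinlist));
    ELEM temp = listarray[curr];
    for (int i = curr; i < numinlist - 1; i++)
        listarray[i] = listarray[i + 1];   // shift elements one position down
    numinlist--;
    return temp;
}
```

Removing the item at the tail executes the loop zero times, while removing the first item shifts all remaining n-1 items, which is the O(n) worst case.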
The main disadvantages of the array-based implementation:
- the maximum length of the list must be stated when the list is created,
- inserting or deleting an item in the middle of a list is slow.
We can use a linked list consisting of nodes.
Each node contains an item and a pointer to the next node.
Nodes are created when an additional item is needed.
List node:
[Figure: a list node with a data field (element) and a link field (next)]
class link { // singly linked list node
public:
  ELEM element; // holds an item
  link * next;  // pointer to the next node
  link(const ELEM& elemval, link* nextval = NULL)
  // constructor 1
  { element = elemval; next = nextval; }
  link(link* nextval = NULL) { next = nextval; }
  // constructor 2
  ~link() {} // destructor
};
// Linked-list List implementation
class list { // linked-list based list
private:
  link* head; // pointer to list header
  link* tail; // pointer to the last item
  link* curr; // position of the current item
public:
  list(const int = LIST_SIZE);   // constructor
  ~list();                       // destructor
  void clear();                  // remove all items
  void insert(const ELEM&);      // insert ELEM at curr pos
  void append(const ELEM&);      // insert ELEM at tail
  ELEM remove();                 // remove and return curr elem
  void setFirst();               // set curr to first pos
  void prev();                   // set curr to prev pos
  void next();                   // set curr to next pos
  int length() const;            // return the actual length
  void setPos(const int);        // set curr to specif. pos
  void setValue(const ELEM&);    // set value at curr
  ELEM currVal() const;          // return value at curr
  bool isEmpty() const;          // true iff list is empty
  bool isInList() const;         // true iff curr within list
  bool find(const ELEM&);        // true iff ELEM in list
                                 // from curr on
};
To simplify insertion at the current position, we make curr point to the node preceding the actual current item.
To make this consistent for all items, we keep a header node in the list that precedes the items.
L = (c, m, a, b)
with a being the current element:
[Figure: head -> header node -> c -> m -> a -> b; curr points to the node holding m, tail points to the node holding b]
The header node also simplifies some other operations.
list::list(const int size) // constructor
{ tail = head = curr = new link; }
list::~list() // destructor
{ while (head != NULL)
  { curr = head;
    head = head->next;
    delete curr;
  }
}
void list::next()
{ if (curr != NULL) curr = curr->next; }
void list::setFirst() // set curr to first pos
{ curr = head; }
void list::append(const ELEM& item) // insert at the tail
{ tail->next = new link(item, NULL);
  tail = tail->next;
}
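With curr pointing to the node that precedes the current item, an insertion needs no shifting and no search. The sketch below illustrates this; the struct LinkedList with public members, its simplified next(), and int as the item type are assumptions chosen to keep the example short and testable, not the course's full list class.

```cpp
#include <cassert>
#include <cstddef>

typedef int ELEM;   // assumption: the item type is int in this sketch

struct link {       // singly linked list node
    ELEM element;
    link* next;
    link(const ELEM& elemval, link* nextval) : element(elemval), next(nextval) {}
    link() : element(), next(NULL) {}          // used for the header node
};

struct LinkedList {
    link* head;     // header node, holds no item
    link* tail;     // last node of the list
    link* curr;     // node PRECEDING the current item

    LinkedList() { head = tail = curr = new link; }
    ~LinkedList() {
        while (head != NULL) { link* t = head; head = head->next; delete t; }
    }
    // Insert item at the current position in O(1).
    void insert(const ELEM& item) {
        curr->next = new link(item, curr->next);
        if (tail == curr) tail = curr->next;   // the item became the last one
    }
    // Move the current position one item forward, if there is one.
    void next() { if (curr->next != NULL) curr = curr->next; }
};
```

Because of the header node, inserting into an empty list and inserting in front of the first item need no special case.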
A Comparison of
Array and Linked list implementations
Worst-case time
operation   array              linked list
clear       O(1)               O(n)
insert      O(n)               O(1)
remove      O(n)               O(1)
append      O(1)               O(1)
setFirst    O(1)               O(1)
prev        O(1)               O(n)
next        O(1)               O(1)
length      O(1)               O(n)
setPos      O(1)               O(n)
setValue    O(1)               O(1)
isEmpty     O(1)               O(1)
isInList    O(1)               O(1)
find        O(n) unsorted      O(n)
            O(log2 n) sorted
Thus, the choice depends on which operations are done most of the time.
Space for operations:
None requires more than O(1) additional space.
Space for data structure:
Array:
the list is static; its size cannot grow past the size specified when it was created. Space is wasted if the list is not near its maximal size.
Good when the list size can be predicted.
Linked list:
the list is dynamic; its size is always exactly what is needed.
We waste space in link fields.
Good when the list size cannot be predicted and varies a lot.
The main disadvantage of the linked-list implementation is the slow operations prev and setPos.
This can be improved by using two pointers in each node:
- one pointer points to the next item,
- one pointer points to the previous item.
[Figure: a node with an element field and two link fields, prev and next]
Doubly linked lists
class link { // a doubly linked list node
public:
  ELEM element; // value in the node
  link* next;   // points to the next node
  link* prev;   // points to the previous node
  link(const ELEM& elemval,
       link* nextp = NULL, link* prevp = NULL) // constructor 1
  { element = elemval;
    next = nextp; prev = prevp; }
  link(link* nextp = NULL, link* prevp = NULL) // constructor 2
  { next = nextp; prev = prevp; }
  ~link() {} // destructor
};
Restricted variants of lists.
Lists that do not need to use all list operations.
Stack:
A list in which all insertions and deletions are done at one end of the list.
top is the end where insertions/deletions take place.
[Figure: a stack drawn as a column of elements, with top marking the uppermost element]
The operations on the stack are usually called:
push for inserting an element at the top,
pop for removing the top element.
Both, array and linked-list implementations of stacks are used in practice.
In the linked-list implementation, the head of the list is used for the top of the stack.
Array implementation is used when we know the limit on the stack size.
Linked-list is used when the limit on the stack size is not known.
Other name for a stack: LIFO
class Stack { // array based stack
private:
  int top;         // index of the first free position
  int size;        // maximum stack size
  ELEM *listarray; // array holding stack items
public:
  Stack(const int sz = LIST_SIZE) // constructor
  { size = sz; top = 0; listarray = new ELEM[sz]; }
  ~Stack() // destructor
  { delete [] listarray; }
  void clear() // remove all elements
  { top = 0; }
  void push(const ELEM & item) // push ELEM on top
  { assert(top < size);
    listarray[top++] = item; }
  ELEM pop() // remove the top elem
  { assert(!isEmpty());
    return listarray[--top]; }
  ELEM topValue() const // get the top value
  { assert(!isEmpty());
    return listarray[top-1]; }
  bool isEmpty() const // true iff stack empty
  { return (top == 0); }
};
class Stack { // linked-list stack
private:
  link *top; // pointer to the top node
public:
  Stack(const int sz = LIST_SIZE) // constructor
  { top = NULL; }
  ~Stack() // destructor
  { clear(); }
  void clear() // remove all elements
  { while (top != NULL)
    { link *ltemp = top; top = top->next; delete ltemp; } }
  void push(const ELEM & item) // push ELEM on top
  { top = new link(item, top); }
  ELEM pop() // remove the top elem
  { assert(!isEmpty());
    link *ltemp = top->next;
    ELEM temp = top->element;
    delete top; top = ltemp;
    return temp; }
  ELEM topValue() const // get the top value
  { assert(!isEmpty());
    return top->element; }
  bool isEmpty() const // true iff stack empty
  { return (top == NULL); }
};
Queue:
A list in which:
all insertions are done at one end of the list, called the back of the queue,
all deletions are done at the other end of the list, called the front of the queue.
Operations are often called
enqueue for inserting an element,
dequeue for removing an element in a queue.
Both, array and linked-list implementations of queues are used in practice.
Other name for a queue: FIFO
Picture of an array-based implementation:
[Figure: a circular array with locations a[0] to a[12]; the elements 11, 25, 13, 40, 50 are stored in a[2] to a[6]; front = 1, rear = 6]
Queue contains elements 11, 25, 13, 40, 50
front element is 11
rear element is 50
Queue empty: front == rear
Queue full: (rear+1) mod size == front
The above queue can contain at most 12 elements, although the array contains 13 locations.
class Queue { // Array based implementation
private:
  int size;         // array size (max size of queue + 1)
  int front;        // index prior to front item
  int rear;         // index of the rear item
  ELEM * listarray; // array holding items
public:
  Queue(const int sz = LIST_SIZE) // constructor
  { size = sz+1; front = rear = 0; // empty queue
    listarray = new ELEM[size]; }
  ~Queue() { delete [] listarray; }  // destructor
  void clear() { front = rear = 0; } // clear queue
  void enqueue(const ELEM&);   // put one item at rear
  ELEM dequeue();              // remove item in front
  ELEM firstValue() const;     // return the value of front element
  bool isEmpty() const         // true if empty queue
  { return (front == rear); }
};
void Queue::enqueue(const ELEM& item) { assert(((rear+1)%size)!=front);
rear = (rear+1) % size;
listarray[rear]=item;
}
ELEM Queue:: dequeue() { assert(!isEmpty());
front = (front+1) % size;
return listarray[front];
}
ELEM Queue::firstValue() const
{ assert(!isEmpty());
  return listarray[(front+1) % size];
}
[Figure: a linked queue; front points to the first node and rear to the last node]
Queue is empty: front == rear == NULL
class Queue { // linked queue implementation
private:
  link* front; // points to the front el.
  link* rear;  // points to the rear el.
public:
  Queue(const int sz = LIST_SIZE) // constructor
  { front = rear = NULL; }        // empty queue
  ~Queue() { clear(); }           // destructor
  void clear();                   // clear queue
  void enqueue(const ELEM&);      // put one item at rear
  ELEM dequeue();                 // remove item in front
  ELEM firstValue() const         // return the value of front element
  { assert(!isEmpty());
    return front->element; }
  bool isEmpty() const            // true if empty queue
  { return (front == NULL); }
};
void Queue:: enqueue(const ELEM& item) {if(rear == NULL)
front = rear = new link(item,NULL);
else{ rear->next = new link(item,NULL);
rear = rear->next;
} }
ELEM Queue:: dequeue() {assert(!isEmpty());
ELEM temp = front->element;
link * ltemp = front;
front = front->next;
delete ltemp;
if (front == NULL) rear = NULL;
return temp;
}
void Queue::clear() {link * ltemp;
while (front!=NULL)
{ltemp=front; front=front->next; delete ltemp;}
rear = NULL;
}
Examples of applications of stacks and queues
Stacks
Analysis of texts
Compilers
Implementation of recursive functions
Depth-first search
...
Queues
Queues of file names for printing of files
Queues of file names for writing files to a disc
Queues of incoming mail messages
Queues of outgoing mail messages
Queues of processes for batch processing
...
The complexity of
stack and queue operations:
Assume the stack or queue is of length n.
push, pop, topValue,
enqueue, dequeue, firstValue:
O(1) in array and linked-list implementations.
clear and the destructor are:
O(n) for linked implementations,
O(1) for array implementations
(not done frequently).
Arrays are slightly faster;
linked lists don't have any upper bound on length.
Examples of applications of stacks and queues
Queues
On a computer system we have:
A queue of file names for printing of files.
A queue of file names for writing files to a disc.
A queue of incoming mail messages.
A queue of outgoing mail messages.
Queues of processes for batch processing, etc.
(usually arrays are used).
Used a lot in programs simulating the behavior of complex systems, e.g.:
A simulation of operations at an airport is used to determine how many check-in counters and security checks there should be for smooth operation.
(usually linked lists are used.)
Stacks
Analysis of texts and expressions, Compilers,
(usually arrays are used).
Implementation of function calls that allows recursion,
Depth-first search.
...
(usually linked lists are used).
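One of the text and expression analysis tasks mentioned above is checking that brackets are properly nested. The sketch below is only an illustration: the function name bracketsBalanced is made up for this example, and a std::vector<char> stands in for the Stack class so that the fragment is self-contained.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true iff every bracket in text is closed by a bracket of the same
// kind, in the correct order. A std::vector<char> plays the role of the stack.
bool bracketsBalanced(const std::string& text) {
    std::vector<char> stack;
    for (std::string::size_type i = 0; i < text.size(); i++) {
        char ch = text[i];
        if (ch == '(' || ch == '[' || ch == '{') {
            stack.push_back(ch);                      // push the opener
        } else if (ch == ')' || ch == ']' || ch == '}') {
            if (stack.empty()) return false;          // nothing to match
            char open = stack.back();
            stack.pop_back();                         // pop the latest opener
            if ((ch == ')' && open != '(') ||
                (ch == ']' && open != '[') ||
                (ch == '}' && open != '{')) return false;
        }
    }
    return stack.empty();                             // no opener left unmatched
}
```

The most recently opened bracket must be closed first, which is exactly the LIFO behavior of a stack.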
For each function call a new data area is allocated.
The area contains space for:
1. variables declared inside the function
2. values of value parameters
3. addresses of reference parameters
4. return address
We call this area an activation record of the function. It is of fixed length, known at compile time.
Activation records are placed on the run-time stack.
A call of function f ⇔ push
the activation record of f on the run-time stack.
A return from function f ⇔ pop
the activation record of f from the run-time stack.
[Figure: memory layout with space for the operating system, the code of the program, the run-time stack for function calls, and the heap (space for variables created by new)]
Run-time stack
int f(int x)
{ . . }
void g(int y)
{ int temp;
  f(temp);
}
int main()
{ int temp;
  float a;
  g(temp);
}
int count(const ELEM val, link *l)
{ // count the number of occurrences of val in list l
  link * ltemp = l;
  int Sum = 0;
  while (ltemp != NULL)
  { if (ltemp->element == val) Sum++;
    ltemp = ltemp->next;
  }
  return Sum;
}
int rcount(const ELEM val, link *l)
{ // count the number of occurrences of val in list l
  if (l == NULL) return 0;
  if (l->element == val)
    return 1 + rcount(val, l->next);
  return rcount(val, l->next);
}
Binary Trees
A binary tree is a set of nodes that is either empty, or
consists of a node called the root together with the remaining nodes, which are partitioned into two binary trees called the left and the right subtree of the root.
The root of each nonempty subtree is connected to the root by an edge.
[Figure: a binary tree with nodes a to h, marking the root, the left subtree and the right subtree]
Terminology
node
edge
root
(left, right) child
parent
path from node $n_1$ to node $n_k$: a sequence of nodes $n_1, n_2, n_3, \ldots, n_k$ such that
$n_i$ is a parent of $n_{i+1}$, $1 \le i \le k-1$
ancestor, descendant
depth of a node x: length of the path from the root to x.
height of a tree T: one more than the maximal depth of a node.
level d: all nodes of depth d
leaf: has no child
internal node: has at least one child.
Basic facts:
There are at most $2^i$ nodes at level $i$.
There are at most $2^i - 1$ nodes in a tree of height $i$.
In a nonempty full binary tree T (every internal node has two children),
the number of leaves is one more than the number of internal nodes.
In a nonempty binary tree T,
the number of empty subtrees in T is one more than the number of nodes in T.
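The last two facts can be checked on small trees by counting. The following sketch is an illustration only; the struct Node (without a value field) and the three counting functions are names introduced here.

```cpp
#include <cassert>
#include <cstddef>

struct Node {          // values are irrelevant for counting
    Node* left;
    Node* right;
};

// Number of nodes in the tree rooted at rt.
int countNodes(const Node* rt) {
    if (rt == NULL) return 0;
    return 1 + countNodes(rt->left) + countNodes(rt->right);
}

// Number of empty subtrees (NULL child pointers, or the empty tree itself).
int countEmptySubtrees(const Node* rt) {
    if (rt == NULL) return 1;
    return countEmptySubtrees(rt->left) + countEmptySubtrees(rt->right);
}

// Number of nodes without any child.
int countLeaves(const Node* rt) {
    if (rt == NULL) return 0;
    if (rt->left == NULL && rt->right == NULL) return 1;
    return countLeaves(rt->left) + countLeaves(rt->right);
}
```

For a full tree with 5 nodes the functions report 3 leaves, 2 internal nodes and 6 empty subtrees, in agreement with both statements.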
Linked (or pointer-based) representation of binary trees
In many applications we only need to be able to reach the children of every node.
Structure of a node:
[Figure: a node holding the value stored in the node, a pointer to the left child and a pointer to the right child]
class BinNode { // binary tree node
  // linked implementation
public:
  BELEM element;   // the value stored in the node
  BinNode * left;  // ptr. to left child
  BinNode * right; // ptr. to right child
  BinNode() { left = right = NULL; } // constr., no node value
  BinNode(BELEM el, BinNode * lptr = NULL,
          BinNode * rptr = NULL)
  { element = el; left = lptr; right = rptr; }
  // constr. with initial values
  ~BinNode() {} // destructor
  BinNode * leftchild() const { return left; }
  BinNode * rightchild() const { return right; }
  BELEM value() const { return element; }
  void setValue(BELEM val) { element = val; }
  bool isLeaf() const // true if node is a leaf
  { return (left == NULL) && (right == NULL); }
};
Since the tree itself has a recursive structure, it is often very simple to state algorithms recursively.
Inorder printout of nodes
void Inorder_print(BinNode * rt)
{ // recursive inorder traversal
  // rt is a pointer to the root of a bin. tree
  if (rt == NULL) return;
  Inorder_print(rt->leftchild());
  cout << rt->value();
  Inorder_print(rt->rightchild());
}
Any other traversal is a permutation of the three statements in the function.
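The remark about permuting the three statements can be made concrete with the two other common orders. In this sketch the struct Node is a minimal stand-in for BinNode, and the traversals append to a string instead of printing so that the result can be inspected; both choices are assumptions of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Node {                 // assumption: minimal stand-in for BinNode
    char value;
    Node* left;
    Node* right;
};

void preorder(const Node* rt, std::string& out) {
    if (rt == NULL) return;
    out += rt->value;               // visit the node first
    preorder(rt->left, out);
    preorder(rt->right, out);
}

void inorder(const Node* rt, std::string& out) {
    if (rt == NULL) return;
    inorder(rt->left, out);
    out += rt->value;               // visit the node between the subtrees
    inorder(rt->right, out);
}

void postorder(const Node* rt, std::string& out) {
    if (rt == NULL) return;
    postorder(rt->left, out);
    postorder(rt->right, out);
    out += rt->value;               // visit the node last
}
```

Only the position of the visit statement differs between the three functions; the base case and the two recursive calls stay the same.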
Any recursive algorithm must have at least one base case where no recursion is done.
One of the base cases must precede any recursive call.
Any recursive call must bring the algorithm closer to a base case.
Execution of a recursive algorithm can be represented using a call tree in which we show the steps to be taken. Execution proceeds from left to right, and the left part must finish before any other node at the same level is started.
Recursive calls are handled by the run-time system using a stack containing an "activation record" for each unfinished function call.
A nonrecursive implementation of a nontrivial recursive algorithm requires the use of a user-defined stack that keeps track of the unfinished possibilities.
Non-recursive preorder printout of nodes
void NR_preorder_print(BinNode * rt)
{ // non-recursive preorder traversal
  // rt is a pointer to the root of a bin. tree
  if (rt == NULL) return;
  Stack S; // stack of pointers to nodes (ELEM is BinNode *)
  S.push(rt);
  BinNode * ptr;
  while (!S.isEmpty())
  { ptr = S.pop();
    while (ptr != NULL)
    { cout << ptr->value() << ' ';
      if (ptr->rightchild() != NULL) S.push(ptr->rightchild());
      ptr = ptr->leftchild();
    }
  }
}
In some applications information in internal nodes is different from information in the leaf nodes.
Leaf nodes do not need pointers to children.
(There can be very many leaf nodes)
We can have internal nodes and leaf nodes as two sub-classes of the class binary node.
We will use a function isLeaf()
to distinguish between leaf and internal nodes.
It will be a virtual function, i.e. each derived subclass defines it.
class VarBinNode { // Node base class
public:
  virtual bool isLeaf() = 0;
  // each derived class must define it.
};
class LeafNode : public VarBinNode { // leaf node subclass
private:
  Operand var;
public:
  LeafNode(const Operand opd) { var = opd; } // constructor
  bool isLeaf() { return true; }
  Operand & value() { return var; } // node value
};
class IntNode : public VarBinNode { // internal node
private:
  Operator opr;
  VarBinNode * left;
  VarBinNode * right;
public:
  IntNode(const Operator & op, VarBinNode * lptr,
          VarBinNode * rptr) // constructor
  { opr = op; left = lptr; right = rptr; }
  bool isLeaf() { return false; }
  VarBinNode * leftchild() { return left; }
  VarBinNode * rightchild() { return right; }
  Operator & value() { return opr; }
};
void Inorder_print(VarBinNode * rt)
{ // recursive inorder traversal
  // rt is a pointer to the root of a bin. tree
  if (rt == NULL) return;
  if (rt->isLeaf())
    cout << ((LeafNode *) rt)->value();
  else
  { Inorder_print(((IntNode *) rt)->leftchild());
    cout << ((IntNode *) rt)->value();
    Inorder_print(((IntNode *) rt)->rightchild());
  }
}
Example of binary tree application
Huffman Coding Trees
Frequency of characters in English:
letter freq. letter freq.
A 77 N 67
B 17 O 67
C 32 P 20
D 42 Q 5
E 120 R 59
F 24 S 67
G 17 T 85
H 50 U 37
I 76 V 12
J 4 W 22
K 7 X 4
L 42 Y 22
M 24 Z 2
Arrange in order of frequencies:
Z,2 J,4 X,4 Q,5 K,7 V,12 B,17 G,17
W,22 Y,22 ... E,120
Build a coding tree by:
1. joining the first two elements of the list (those with the smallest weights) into a tree.
The weight of this tree = the sum of the weights of the two elements,
2. putting it back in the list in order of weights.
Repeat the above until only one element remains in the list.
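The two steps above can be sketched with a priority queue playing the role of the list kept in order of weights. This is a hedged illustration, not the course's implementation: HuffNode, LighterFirst, buildHuffman, codeLength and freeTree are names invented here, and std::priority_queue is an assumed substitute for the ordered list.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

struct HuffNode {
    long weight;
    char letter;        // meaningful only in leaves
    HuffNode* left;
    HuffNode* right;
    HuffNode(long w, char c, HuffNode* l, HuffNode* r)
        : weight(w), letter(c), left(l), right(r) {}
};

// Orders the priority queue so that the tree of smallest weight is on top.
struct LighterFirst {
    bool operator()(const HuffNode* a, const HuffNode* b) const {
        return a->weight > b->weight;
    }
};

// Builds the coding tree from (letter, frequency) pairs and returns its root.
HuffNode* buildHuffman(const std::vector<std::pair<char, long> >& freq) {
    std::priority_queue<HuffNode*, std::vector<HuffNode*>, LighterFirst> list;
    for (std::size_t i = 0; i < freq.size(); i++)
        list.push(new HuffNode(freq[i].second, freq[i].first, NULL, NULL));
    while (list.size() > 1) {
        HuffNode* a = list.top(); list.pop();   // the two trees of
        HuffNode* b = list.top(); list.pop();   // smallest weight
        list.push(new HuffNode(a->weight + b->weight, '\0', a, b));
    }
    return list.empty() ? NULL : list.top();
}

// Length of the code of letter = depth of its leaf; -1 if letter is absent.
int codeLength(const HuffNode* rt, char letter, int depth = 0) {
    if (rt == NULL) return -1;
    if (rt->left == NULL && rt->right == NULL)
        return rt->letter == letter ? depth : -1;
    int d = codeLength(rt->left, letter, depth + 1);
    return d >= 0 ? d : codeLength(rt->right, letter, depth + 1);
}

void freeTree(HuffNode* rt) {
    if (rt == NULL) return;
    freeTree(rt->left);
    freeTree(rt->right);
    delete rt;
}
```

For the eight letters Z, K, F, C, U, D, L, E with the frequencies from the table, the root gets weight 306, the frequent letter E gets a 1-bit code and the rare letters Z and K get 6-bit codes. Which child is labeled 0 or 1 is a convention and is not fixed by this sketch.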
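[Figure: Huffman coding tree for the letters E (120), U (37), D (42), L (42), C (32), F (24), K (7), Z (2); the internal nodes have weights 9, 33, 65, 79, 107, 186 and 306 (root); each edge is labeled 0 or 1]
Codes: E=0, U=100, D=101, L=110, ... F=11111
1001110101 = UCD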
Binary Search Trees BST
A data structure that allows fast search and fast insertion/deletion of elements.
A frequently used data structure.
Definition
A binary tree T is a Binary Search Tree if for any node n in the tree, the value in n is larger than any value in the left subtree of n and smaller than any value in the right subtree of n.
class BST {
private:
  BinNode* root;
  void clearhelp(BinNode*);
  void inserthelp(BinNode*&, const BELEM);
  void removehelp(BinNode*&, const BELEM);
  void printhelp(const BinNode*, const int) const;
public:
  BST() { root = NULL; } // constructor
  ~BST() { clearhelp(root); root = NULL; } // destructor
  void clear() { clearhelp(root); root = NULL; }
  void insert(const BELEM & val) { inserthelp(root, val); }
  void remove(const BELEM & val) { removehelp(root, val); }
  BinNode * deletemin(BinNode*&);
  BinNode * find(const BELEM& val) const;
  bool isEmpty() const { return root == NULL; }
  void print() const
  { if (root == NULL) cout << "empty";
    else printhelp(root, 0); }
};
void BST::clearhelp(BinNode * rt)
{ if (rt == NULL) return;
  clearhelp(rt->leftchild());
  clearhelp(rt->rightchild());
  delete rt;
}
void BST::printhelp(const BinNode * rt, const int level) const
{ // print tree in inorder, indentation indicates level
  if (rt == NULL) return;
  printhelp(rt->leftchild(), level+1);  // print l.s.t.
  for (int i = 0; i < level; i++)       // indent
    cout << " ";
  cout << rt->value() << "\n";
  printhelp(rt->rightchild(), level+1); // print r.s.t.
}
Nonrecursive find function:
BinNode * BST::find(const BELEM& val) const
// return a pointer to the node containing val
// or NULL if val is not in the tree
{ BinNode * ptr = root;
  BinNode * res = NULL;
  while ((ptr != NULL) && (res == NULL))
  { if (val < ptr->value()) ptr = ptr->leftchild();
    else if (val == ptr->value()) res = ptr;
    else ptr = ptr->rightchild();
  }
  return res;
}
In the worst case, the loop is repeated as many times as the height of the tree.
If the tree has n nodes and is not very "out of balance", find is in $O(\log_2 n)$.
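How much the height, and with it the cost of find, depends on the shape of the tree can be seen by inserting the same keys in two different orders. The sketch below is an illustration under assumptions: Node is a minimal stand-in for BinNode, new keys are always added as leaves, and the function names are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {                 // assumption: minimal stand-in for BinNode
    int value;
    Node* left;
    Node* right;
    Node(int v) : value(v), left(NULL), right(NULL) {}
};

void insert(Node*& rt, int val) {        // a new key always becomes a leaf
    if (rt == NULL) rt = new Node(val);
    else if (val < rt->value) insert(rt->left, val);
    else insert(rt->right, val);
}

int height(const Node* rt) {             // one more than the maximal depth
    if (rt == NULL) return 0;
    int hl = height(rt->left);
    int hr = height(rt->right);
    return 1 + (hl > hr ? hl : hr);
}

void clear(Node* rt) {
    if (rt == NULL) return;
    clear(rt->left);
    clear(rt->right);
    delete rt;
}

// Height of the BST obtained by inserting the keys in the given order.
int heightAfterInserting(const std::vector<int>& keys) {
    Node* root = NULL;
    for (std::size_t i = 0; i < keys.size(); i++) insert(root, keys[i]);
    int h = height(root);
    clear(root);
    return h;
}
```

Inserting 1, 2, ..., 7 in sorted order yields a tree of height 7 (a chain), so find degrades to linear search, while the order 4, 2, 6, 1, 3, 5, 7 yields height 3.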
Insertion of a new node
Always as a leaf.
void BST::inserthelp(BinNode *& rt, const BELEM val)
{ if (rt == NULL) rt = new BinNode(val, NULL, NULL);
  else if (val < rt->value()) inserthelp(rt->left, val);
  else inserthelp(rt->right, val);
}
Delete the smallest element from a BST.
It does not have a left child!
BinNode * BST::deletemin(BinNode *& rt)
{ // unlink the smallest el. and return a pointer to it
  assert(rt != NULL);
  if (rt->left != NULL) return deletemin(rt->left);
  else // rt points to the minimal element
  { BinNode * ptr = rt;
    rt = rt->right;
    return ptr;
  }
}
Remove a node from BST
1. Find the node X to remove.
2. If X does not have any child, remove X.
3. If X has only one child, remove X and make the child of X the child of the parent of X.
4. If X has both children, then
(a) remove the minimal element M in the right subtree of X,
(b) replace X by M.
void BST::removehelp(BinNode *& rt, const BELEM val)
{ // remove the node containing val
  if (rt == NULL) cout << val << " not in BST";
  else if (val < rt->value())
    removehelp(rt->left, val);
  else if (val > rt->value())
    removehelp(rt->right, val);
  else // we have the node to remove
  { BinNode * ptr = rt;
    if (rt->left == NULL) rt = rt->right;
    else if (rt->right == NULL)
      rt = rt->left;
    else // both subtrees nonempty
    { ptr = deletemin(rt->right);
      rt->setValue(ptr->value());
    }
    delete ptr;
  }
}
[Figure: an example binary search tree with nodes a to l]
Insertion and removal of nodes in a BST can be modified so that the BST remains "balanced",
and then find, insertion and deletion are all
in $O(\log_2 n)$ in the worst case.
See AVL trees or balanced BSTs in any book on algorithms.