• No results found

Trees are data structures which organise elements in a hierarchy. A tree consists of nodes that may have links to child nodes and to a single parent node. A node without children is called aleaf. The topmost node, which has no parent, is called theroot. The children of a tree can be ordered or unordered. The number of links to follow from the root to a certain node determines thedepth of that node. The depth of the whole tree is defined as the maximum depth of all leaves. Trees in which the depth of all leaves differs by at most one are calledbalanced trees.

Various different tree structures exist with different benefits in different scenarios. A common tree structure is the binary search tree (BST) [26]. Binary trees in general are trees with at most two child nodes (left and right). Binary search trees are binary trees in which all nodes of the left subtree of any node have a value less14than the value of the node and the nodes in the right subtree have a value greater than or equal to the value of the node. Red-Black trees and AVL-trees are variants of binary search trees that keep the tree balanced to reduce the complexity of searching values in the tree.

Asymptotic time complexity of basic operations on binary search trees

We only consider binary search trees (balanced and unbalanced) in this analysis. In general, the complexity of many operations on binary search trees directly depends

on the depth of the tree. This explains why the operations on balanced trees typically have a lower worst case complexity than the same operations on unbalanced trees.

Exact query

The number of comparisons required to find a node with a specific key

has an upper bound determined by the depth of the tree. Because an unbalanced BST withn nodes may have a depth ofn (in case all nodes have only one child), the worst case complexity of finding a node in such an unbalanced BST is linear with respect to the number of nodes. In fact, such a tree is basically a linked list.

Contrarily, in a balanced tree, the complexity is never worse than

O(log n)

: at each node, only either the left or the right link is followed during traversal which causes half of all remaining nodes to be removed from the set of possible candidate nodes at each level of the tree.

Range query

Whether range queries can be executed efficiently in a BST depends on the kind of those queries. If the range query asks for nodes with a key less than a given value or greater than a given value, finding those nodes can be done efficiently, because this is exactly the criterion that defines whether nodes are inserted as the left or right child of their parent. For example, to find all nodes with a key less thank,one needs to first find the smallest node with a key greater thank and then traverse all child nodes to the left of that node. Finding such a node can be done with logarithmic time complexity.

Other range queries require the full traversal of the tree to find all nodes that match the query. This has linear complexity. As a consequence, finding a sort order of the node keys that supports the desired range query is necessary to allow efficient range queries.

Insertion

Adding a node into an unbalanced BST has a worst case complexity of

O(n)

. For example, if nodes are inserted in increasing order, each node will only have one child and each insertion requires traversal over all previously inserted nodes.

Contrarily, inserting nodes into a balanced tree can be achieved with logarithmic (

O(log n)

) worst case complexity, even with the additional cost of keeping the tree in its balanced state.

Removal

Removing a value from the tree first requires an exact query to be executed to find the respective node. Removing the actual node can take linear complexity for unbalanced BSTs. For balanced trees, this can be achieved with logarithmic worst-case complexity.

Listing 13: A binary search-tree node for a tree that stores associations between aspect instances and key tuples.

1 class BstNode<Aspect> { 2 public Tuple keyTuple;

3 public Aspect aspectInstance; 4 public BstNode<Aspect> left;

5 public BstNode<Aspect> right; 6 }

Space complexity

A binary search tree is comparable to a linked list, only that its nodes do not refer to a “next” element, but instead provide two links, one to the “left” subtree and one to the “right” subtree. The size of a node is therefore:

sizeof(BstN ode

M

) =sizeof(Object) + 3∗sizeof(Ref) +sizeof(Ref[M])

As with linked lists, no overallocation of nodes takes place when using a binary search tree. As a bare minimum, a data storage using a binary tree as a data structure must maintain a reference to the root node of the tree. The total size of the storage can therefore be estimated with:

sizeof(BstStorage

N,M

) =sizeof(Ref) +N∗sizeof(BstN ode

M

)

= (3N

+ 1)sizeof(Ref) +N∗sizeof(Object) +N∗sizeof(Ref[M]))

Note that the fact whether the tree is balanced or not does not affect the memory footprint, as the number of nodes stays the same.

Example

Listing 13 shows a possible node of a binary search tree that stores associations for the

Equality

aspect. Thekey of the node is determined by

keyTuple

.

To determine the relative order of two keys in the BST,

Tuples

need to be comparable.

Tuple

may implement the

Comparable

interface as shown in Listing 14. To make this implementation work, the

compare

method needs to be able to determine the relative

order between two arbitrary objects. This order can be artificial, for example by assigning a unique identifier to each object and comparing those identifier .

Listing 14: A sortable

Tuple

implementation for use in a BST

1 class Tuple implements Comparable<Tuple> {

2 public Object[] elements; 3

4 public int compareTo(Tuple other) {

5 assert elements.length == other.elements.length;

6

7 for(int i = 0; i < elements.length; ++i){ 8 // compare i-th element

9 int comp = compare(this.elements[i], other.elements[i]);

10 if(comp != 0){ 11 return comp;

12 }

13 }

14

15 // both tuples are equal 16 return 0; 17 } 18 19 // Returns 20 // - a negative value if a < b 21 // - a positive value if a > b 22 // - 0 if a = b

23 private static int compare(Object a, Object b){ 24 // determine relative order between elements a and b

25 } 26 }