• No results found

Fundamental Data Structures

N/A
N/A
Protected

Academic year: 2021

Share "Fundamental Data Structures"

Copied!
42
0
0

Loading.... (view fulltext now)

Full text

(1)

2

Fundamental

Data Structures

17 February 2021

Sebastian Wild

(2)

Outline

2

Fundamental Data Structures

2.1 Stacks & Queues 2.2 Resizable Arrays 2.3 Priority Queues 2.4 Binary Search Trees 2.5 Ordered Symbol Tables 2.6 Balanced BSTs

(3)
(4)

Abstract Data Types

abstract data type (ADT)

I list of supported operations

I whatshould happen

I not: how to do it

I not: how to store data

β‰ˆ Javainterface

(with Javadoc comments)

vs.

data structures

I specify exactly

how data is represented

I algorithms for operations

I has concrete costs

(space and running time)

β‰ˆ Javaclass

(non abstract)

Why separate?

I Can swap out implementations β€œdrop-in replacements”)

reusable code!

I (Often) better abstractions

(5)

Stacks

Stack ADT

I top()

Return the topmost item on the stack Does not modify the stack.

I push(π‘₯)

Addπ‘₯onto the top of the stack.

I pop()

Remove the topmost item from the stack (and return it).

I isEmpty()

Returnstrueiff stack is empty.

I create()

(6)

Linked-list implementation for Stack

Invariants:

I maintaintoppointer to topmost element

I each element points to the element below it

(ornullif bottommost)

Linked stacks:

I requireΘ(𝑛)space when𝑛elements on stack

(7)

Array-based implementation for Stack

Can we avoid extra space for pointers? array-based implementation

Invariants:

I maintain arraySof elements, from bottommost to topmost

I maintain indextopof position of topmost element inS.

What to do if stack is full uponpop?

Array stacks:

I requirefixed capacity𝐢(known at creation time)!

I requireΘ(𝐢)space for a capacity of𝐢elements

(8)
(9)

Digression – Arrays as ADT

Arrays can also be seen as an ADT! . . . but are commonly seen as specific data structure

Array operations:

I create(𝑛) Java:A = new int[𝑛];

Create a new array with𝑛cells, with positions0, 1, . . . , 𝑛 βˆ’ 1

I get(𝑖) Java:A[𝑖]

Return the content of cell𝑖

I set(𝑖,π‘₯) Java:A[𝑖] = π‘₯;

Set the content of cell𝑖toπ‘₯.

Arrays have fixed size (supplied at creation).

Usually directly implemented by compiler + operating system / virtual machine.

Difference to others ADTs: Implementation usually fixed to β€œa contiguous chunk of memory”.

(10)

Doubling trick

Can we have unbounded stacks based on arrays? Yes!

Invariants:

I maintain arraySof elements, from bottommost to topmost

I maintain indextopof position of topmost element inS

I maintain capacity𝐢 =S.lengthso that14𝐢 ≀ 𝑛 ≀ 𝐢

can always push more elements!

How to maintain the last invariant?

I beforepush

If𝑛 = 𝐢, allocate new array of size2𝑛, copy all elements.

I afterpop

If𝑛 < 14𝐢, allocate new array of size2𝑛, copy all elements.

β€œResizing Arrays

an implementation technique, not an ADT!

(11)

Amortized Analysis

I Any individual operationpush/popcan be expensive!

Θ(𝑛)time to copy all elements to new array.

I But: An one expensive operation of cost𝑇meansΞ©(𝑇)next operations are cheap!

Formally:consider β€œcredits/potential”

distance to boundary

Ξ¦ =min{𝑛 βˆ’ 14𝐢, 𝐢 βˆ’ 𝑛} ∈ [0, 0.6𝑛]

I amortized cost of an operation = actual cost (array accesses) βˆ’ 4Β·change inΞ¦

I cheappush/pop: actual cost1array access, consumes≀ 1credits amortized cost≀ 5

I copyingpush: actual cost2𝑛 + 1array accesses, creates12𝑛 + 1credits amortized cost≀ 5

I copyingpop: actual cost2𝑛 + 1array accesses, creates12𝑛 βˆ’ 1credits amortized cost5

sequenceofπ‘šoperations: total actual cost≀total amortized cost+final credits

(12)

Queues

Operations:

I enqueue(π‘₯)

Addπ‘₯at the end of the queue.

I dequeue()

Remove item at the front of the queue and return it.

(13)

Bags

What do Stack and Queue have in common?

They are special cases of aBag!

Operations:

I insert(π‘₯)

Addπ‘₯to the items in the bag.

I delAny()

Remove any one item from the bag and return it. (Not specified which; any choice is fine.)

I roughly similar to Java’sCollection

Sometimes it is useful to state that order is irrelevant Bag

(14)
(15)

Priority Queue ADT – min-oriented version

Now: elements in the bag have differentpriorities.

(Max-oriented) Priority Queue (MaxPQ):

I construct(𝐴)

Construct from from elements in array𝐴.

I insert(π‘₯,𝑝)

Insert itemπ‘₯with priority𝑝into PQ.

I max()

Return item with largest priority.(Does not modify the PQ.)

I delMax()

Remove the item with largest priority and return it.

I changeKey(π‘₯,𝑝0)

Updateπ‘₯’s priority to𝑝0.

Sometimes restricted to increasing priority.

I isEmpty()

(16)

PQ implementations

Elementary implementations

I unordered list Θ(1)insert, butΘ(𝑛)delMax

I sorted list Θ(1)delMax, butΘ(𝑛)insert

Can we get something between these extremes? Like a β€œslightly sorted” list? Yes! Binary heaps.

Array view

Heap = array𝐴with

βˆ€π‘– ∈ [𝑛] : 𝐴[b𝑖/2c] β‰₯ 𝐴[𝑖]

≑

store nodes in level order

in𝐴[1..𝑛]

Tree view

Heap = tree that is

(i)a complete binary tree

all but last level full last level flush left

(ii)heap ordered

(17)
(18)

Why heap-shaped trees?

Why complete binary tree shape?

I only one possible tree shape keep it simple!

I complete binary trees have minimal height among all binary trees

I simple formulas for moving from a node to parent or children:

For a node at indexπ‘˜in𝐴

I parent atbπ‘˜/2c

I left child at2π‘˜

I right child at2π‘˜ + 1

Why heap ordered?

I Maximum must be at root! max()is trivial!

I But: Sorted only along paths of the tree; leaves lots of leeway for fast

how? . . . stay tuned

(19)
(20)
(21)
(22)

Analysis

Height of binary heaps:

I heightof a tree: # edges on longest root-to-leaf path

I depth/levelof a node: # edges from root root has depth0

I How many nodes on firstπ‘˜fulllevels?

π‘˜ Γ•

β„“ =0

2β„“ =2π‘˜+1βˆ’ 1

Height of binary heap: β„Ž = minπ‘˜s.t.2π‘˜+1βˆ’ 1 β‰₯ 𝑛 = blg(𝑛)c

Analysis:

I insert: new element β€œswims” up ≀ β„Žsteps (β„Žcmps)

I delMax: last element β€œsinks” down ≀ β„Žsteps (2β„Žcmps)

I constructfrom𝑛elements:

cost = cost of letting each node in heap sink!

≀ 1 Β· β„Ž + 2 Β· (β„Ž βˆ’ 1) + 4 Β· (β„Ž βˆ’ 2) + Β· Β· Β· + 2β„“ Β· (β„Ž βˆ’ β„“ ) + Β· Β· Β· + 2β„Žβˆ’1Β· 1 + 2β„ŽΒ· 0 = β„Ž Γ• β„“ =0 2β„“(β„Ž βˆ’ β„“ ) = β„Ž Γ• 𝑖=0 2β„Ž 2𝑖𝑖 = 2 β„Ž β„Ž Γ• 𝑖=0 𝑖 2𝑖 ≀ 2 Β· 2 β„Ž ≀ 4𝑛

(23)

Binary heap summary

Operation Running Time

construct(𝐴[1..𝑛]) 𝑂(𝑛) max() 𝑂(1) insert(π‘₯,𝑝) 𝑂(log 𝑛) delMax() 𝑂(log 𝑛) changeKey(π‘₯,𝑝0) 𝑂(log 𝑛) isEmpty() 𝑂(1) size() 𝑂(1)

(24)
(25)

Symbol table ADT

Symbol table / Dictionary / Map

Java:java.util.Map<K,V>

/ Associative array / key-value store:

I put(π‘˜,𝑣) Pythondict:d[π‘˜] = 𝑣

Put key-value pair(π‘˜, 𝑣)into table

I get(π‘˜) Pythondict:d[π‘˜]

Return value associated with keyπ‘˜

I delete(π‘˜)

Remove keyπ‘˜(any associated value) form table

I contains(π‘˜)

Returns whether the table has a value for keyπ‘˜

I isEmpty(),size()

I create()

Most fundamental building block in computer science.

(26)

Symbol tables vs mathematical functions

I similar interface

I but: mathematical functions are static (never change their mapping)

(Different mapping is a different function)

I symbol table = dynamicmapping

(27)

Elementary implementations

Unordered (linked) list:

Fastput

Θ(𝑛)time forget

Too slow to be useful

Sortedlinked list:

Θ(𝑛)time forput

Θ(𝑛)time forget

Too slow to be useful

(28)

Binary search

It does help . . . if we have a sorted array! Example:search for69

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 11 12 17 28 35 55 57 63 69 77 79 80 82 85 88 97 β„“ π‘š π‘Ÿ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 11 12 17 28 35 55 57 63 69 77 79 80 82 85 88 97 β„“ π‘š π‘Ÿ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 11 12 17 28 35 55 57 63 69 77 79 80 82 85 88 97 β„“ π‘š π‘Ÿ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 11 12 17 28 35 55 57 63 69 77 79 80 82 85 88 97 β„“ Binary search: I halve Β±1 remaining list in each step

≀ blg 𝑛c + 1cmps in the worst case

(29)

Binary search trees

Binary search trees (BSTs) β‰ˆ dynamic sorted array

I binary tree

I Each node has left and right child

I Either can be empty (null)

I Keys satisfy search-tree property

(30)

BST example & find

11 12 17 28 35 55 57 63 69 77 79 80 82 85 97 11 12 17 28 35 55 57 63 69 77 79 80 82 85 97

(31)

BST insert

Example:Insert88 11 12 17 28 35 55 57 63 69 77 79 80 82 85 97 97 88 88 11 12 17 28 35 55 57 63 69 77 79 80 82 85

(32)

BST delete

I Easy case: remove leaf, e. g.,11 replace bynull

I Medium case: remove unary, e. g.,69 replace by unique child

I Hard case: remove binary, e. g.,85 swap with predecessor, recurse

11 12 17 28 35 55 57 63 69 77 79 80 82 85 97 88 11 12 17 28 35 55 57 63 69 77 79 80 72 85 88 97

(33)
(34)

BST summary

Operation Running Time

construct(𝐴[1..𝑛]) 𝑂(𝑛 β„Ž) put(π‘˜,𝑣) 𝑂(β„Ž) get(π‘˜) 𝑂(β„Ž) delete(π‘˜) 𝑂(β„Ž) contains(π‘˜) 𝑂(β„Ž) isEmpty() 𝑂(1) size() 𝑂(1)

(35)
(36)

Ordered symbol tables

I min(),max()

Return the smallest resp. largest key in the ST

I floor(π‘₯), bπ‘₯c =β„€.floor(π‘₯)

Return largest keyπ‘˜in ST withπ‘˜ ≀ π‘₯.

I ceiling(π‘₯)

Return smallest keyπ‘˜in ST withπ‘˜ β‰₯ π‘₯.

I rank(π‘₯)

Return the number of keysπ‘˜in STπ‘˜ < π‘₯.

I select(𝑖)

Return the𝑖th smallest key in ST (zero-based, i. e.,𝑖 ∈ [0..𝑛))

With select, we can simulate access as in a truly dynamic array!.

(37)

Augmented BSTs

11 12 17 28 35 55 57 63 69 77 79 80 82 85 97 88 1 9 1 4 2 1 7 1 2 16 1 3 1 6 1 2

(38)

Rank

11 12 17 28 35 55 57 63 69 77 79 80 82 85 97 88 1 9 1 4 2 1 7 1 2 16 1 3 1 6 1 2 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 11 12 17 28 35 55 57 63 69 77 79 80 82 85 88 97

(39)

Select

11 12 17 28 35 55 57 63 69 77 79 80 82 85 97 88 1 9 1 4 2 1 7 1 2 16 1 3 1 6 1 2 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 11 12 17 28 35 55 57 63 69 77 79 80 82 85 88 97

(40)
(41)

Balanced BSTs

Balanced binary search trees:

I imposes shape invariant that guarantees𝑂(log 𝑛)height

I adds rules to restore invariant after updates

I many examples known

I AVL trees(height-balanced trees)

I red-black trees

I weight-balanced trees(BB[𝛼]trees)

I . . .

Other options:

I amortization: splay trees, scapegoat trees

I’d love to talk more about all of these . . . (Maybe another time)

(42)

BSTs vs. Heaps

Balanced binary search tree

Operation Running Time

construct(𝐴[1..𝑛]) 𝑂(𝑛 log 𝑛) put(π‘˜,𝑣) 𝑂(log 𝑛) get(π‘˜) 𝑂(log 𝑛) delete(π‘˜) 𝑂(log 𝑛) contains(π‘˜) 𝑂(log 𝑛) isEmpty() 𝑂(1) size() 𝑂(1)

min()/max() 𝑂(log 𝑛) 𝑂(1)

floor(π‘₯) 𝑂(log 𝑛)

ceiling(π‘₯) 𝑂(log 𝑛)

rank(π‘₯) 𝑂(log 𝑛)

select(𝑖) 𝑂(log 𝑛)

Binary heaps Strict Fibonacci heaps

Operation Running Time

construct(𝐴[1..𝑛]) 𝑂(𝑛) insert(π‘₯,𝑝) 𝑂(log 𝑛) 𝑂(1) delMax() 𝑂(log 𝑛) changeKey(π‘₯,𝑝0) 𝑂(log 𝑛) 𝑂(1) max() 𝑂(1) isEmpty() 𝑂(1) size() 𝑂(1)

I apart from fasterconstruct,

BSTs always as good as binary heaps

I MaxPQ abstraction still helpful

References

Related documents

Although total labor earnings increase with the unskilled unions’ bargaining power, we can say nothing when the increase in production is due to stronger skilled unions, since

18 th Sunday in Ordinary Time Saint Rose of Lima Parish Parroquia Santa Rosa de Lima.. August

We nd that if individuals dier in initial wealth and if commodity taxes can be evaded at a uniform cost, preferences have to be weakly separable between consumption and labor

It is the (education that will empower biology graduates for the application of biology knowledge and skills acquired in solving the problem of unemployment for oneself and others

β€’ Storage node - node that runs Account, Container, and Object services β€’ ring - a set of mappings of OpenStack Object Storage data to physical devices To increase reliability, you

β€’ Our goal is to make Pittsburgh Public Schools First Choice by offering a portfolio of quality school options that promote high student achievement in the most equitable and

investment advice (for the relevant information requirement, see Article 24(3) of the MiFID II draft). Only then is it actually possible for banks to offer this service without

In a surprise move, the Central Bank of Peru (BCRP) reduced its benchmark interest rate by 25 basis points (bps) to 3.25% in mid-January following disappointing economic growth data