
[Figure 8.1: An open hash table. The figure shows an array with indices 0 through 22, the entries John and Joan (Joan with the value 374-5555), and the hashes hash("John") = 1 and hash("Joan") = 14.]

# let s = "Jason";;
val s : string = "Jason"
# s.[2];;
- : char = 's'
# s.[3] <- 'y';;
- : unit = ()
# s;;
- : string = "Jasyn"

The String module defines additional functions, including the String.length and String.blit functions that parallel the corresponding Array operations. The String.create function does not require an initializer. It creates a string with arbitrary contents.

# String.create 10;;
- : string = "\000\011\000\000,\200\027x\000\000"
# String.create 10;;
- : string = "\196\181\027x\001\000\000\000\000\000"
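In recent OCaml releases string values are immutable, and in-place updates go through the Bytes module instead. The following is a minimal sketch of the same operations under that assumption; the names s, result, dst, and blitted are illustrative only.

```ocaml
(* Assumes OCaml 4.06 or later, where strings are immutable and
   mutable byte sequences are provided by the Bytes module. *)
let s = Bytes.of_string "Jason"

(* Counterpart of  s.[3] <- 'y'  in the session above. *)
let () = Bytes.set s 3 'y'

let result = Bytes.to_string s

(* Bytes.blit src src_off dst dst_off len parallels Array.blit:
   copy the 3 bytes "Jas" into dst, starting at offset 1. *)
let dst = Bytes.make 5 '-'
let () = Bytes.blit s 0 dst 1 3

let blitted = Bytes.to_string dst
```

Here result is "Jasyn" and blitted is "-Jas-". Bytes.make takes an explicit fill character, so the contents are never arbitrary.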

8.4 Hash tables

To illustrate these types in action, let's implement (yet another) dictionary, this time in the form of a hash table. A hash table provides the usual map from keys to values, but this time the expected running time for lookup and insertion is constant. The hash table works by computing an integer hash of a key that serves as an index into an array of dictionary entries. Insertion is performed by adding a new entry to the table at the hash index of the key; lookup is performed by searching for an entry with a matching key at the key's hash index. An example of a hash table is shown in Figure 8.1.

8.4. HASH TABLES CHAPTER 8. RECORDS, ARRAYS, AND STRING

Collisions may occur when two keys hash to the same index. Hash collisions can have a significant impact on performance. The hash table in the figure shows a so-called "chained" implementation, where entries with the same hash are stored in a list associated with that index.

For our example, we'll implement a simple hash table where the keys are strings, and the table is polymorphic over the type of values. One approach to producing a fast, fairly good hash is called an s-box (for substitution box), which uses a table of randomly-generated numbers.

let random_numbers =
  [|0x04a018c6; 0x5ba7b0f2; 0x04dcf08b; 0x1e5a22cc; 0x2523b9ea; · · ·|]
let random_length = Array.length random_numbers

type hash_info = { mutable hash_index : int; mutable hash_value : int }

let hash_char info c =
  let i = Char.code c in
  let index = (info.hash_index + i + 1) mod random_length in
  info.hash_value <- (info.hash_value * 3) lxor random_numbers.(index);
  info.hash_index <- index

The record type hash_info has two fields: hash_index is an index into the random number array, and hash_value is the partially computed hash. The function hash_char uses the character to update hash_index, and updates hash_value by multiplying it by 3 and taking the exclusive-or with the random integer at the new index. The hash of a string is computed one character at a time.

let hash s =
  let info = { hash_index = 0; hash_value = 0 } in
  for i = 0 to String.length s - 1 do
    hash_char info s.[i]
  done;
  info.hash_value

Note that the bounds in the for loop are inclusive; the index of the first character in the string is 0, and the final character has index String.length s - 1.
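The same computation can also be written without mutable state, which makes each step easy to check by hand. This is only a sketch: it assumes the table holds just the five entries printed above (the full table is elided in the text), and the names hash_step and hash_fold are hypothetical.

```ocaml
(* Assumption: only the five table entries shown in the text are used;
   the real table is longer. *)
let random_numbers =
  [| 0x04a018c6; 0x5ba7b0f2; 0x04dcf08b; 0x1e5a22cc; 0x2523b9ea |]
let random_length = Array.length random_numbers

(* One hashing step. The pair (index, value) plays the role of the
   hash_info record: advance the index by the character code plus one,
   then mix the table entry at the new index into the running value. *)
let hash_step (index, value) c =
  let index' = (index + Char.code c + 1) mod random_length in
  (index', (value * 3) lxor random_numbers.(index'))

(* Apply hash_step to characters 0 .. String.length s - 1, in order. *)
let hash_fold s =
  let rec loop i state =
    if i = String.length s then snd state
    else loop (i + 1) (hash_step state s.[i])
  in
  loop 0 (0, 0)
```

With this five-entry table, the character 'a' (code 97) moves the index to (0 + 97 + 1) mod 5 = 3, so hash_fold "a" is random_numbers.(3), and the empty string hashes to 0.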

The hash table itself is an array of key/value pair lists (called buckets), as shown in the following code.

type 'a hash_entry = { key : string; value : 'a }
type 'a hash_table = 'a hash_entry list array

(* create : unit -> 'a hash_table *)
let create () =
  Array.create 101 []

(* add : 'a hash_table -> string -> 'a -> unit *)
let add table key value =
  let index = (hash key) mod (Array.length table) in
  table.(index) <- { key = key; value = value } :: table.(index)

(* find : 'a hash_table -> string -> 'a *)
let rec find_entry key = function
  | { key = key'; value = value } :: _ when key' = key -> value
  | _ :: entries -> find_entry key entries
  | [] -> raise Not_found

let find table key =
  let index = (hash key) mod (Array.length table) in
  find_entry key table.(index)

The function add : 'a hash_table -> string -> 'a -> unit adds a new entry to the table by adding the key/value pair to the front of the bucket at the hash index for the key. The function find : 'a hash_table -> string -> 'a searches that bucket for the entry containing the key, and raises Not_found if there is none.
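A self-contained sketch of the table in use may help here. It makes two substitutions that are assumptions rather than part of the text: the standard library's Hashtbl.hash stands in for the s-box hash, and Array.make is used to allocate the buckets. The helper index_of, the table phone_book, and John's number are hypothetical.

```ocaml
(* Assumptions: Hashtbl.hash replaces the s-box hash of the text, and
   Array.make allocates the bucket array. *)
type 'a hash_entry = { key : string; value : 'a }
type 'a hash_table = 'a hash_entry list array

let create () : 'a hash_table = Array.make 101 []

(* Hashtbl.hash returns a non-negative integer, so the remainder is
   always a valid bucket index. *)
let index_of table key = Hashtbl.hash key mod Array.length table

(* Chaining: push the new entry onto the front of its bucket. *)
let add table key value =
  let i = index_of table key in
  table.(i) <- { key; value } :: table.(i)

(* Walk the bucket until an entry with a matching key turns up. *)
let rec find_entry key = function
  | entry :: _ when entry.key = key -> entry.value
  | _ :: entries -> find_entry key entries
  | [] -> raise Not_found

let find table key = find_entry key table.(index_of table key)

let phone_book : string hash_table = create ()

let () =
  add phone_book "John" "555-0100";   (* made-up number *)
  add phone_book "Joan" "374-5555"
```

Because add pushes onto the front of the bucket, adding a key a second time shadows the earlier entry rather than replacing it, and find returns the most recent value.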


8.5 Exercises

Exercise 8.1 Reference cells are a special case of records, with the following type definition.

type 'a ref = { mutable contents : 'a }

Implement the operations on reference cells.

val ref : 'a -> 'a ref
val (!) : 'a ref -> 'a
val (:=) : 'a ref -> 'a -> unit

Exercise 8.2 Consider the following record type definition.

type ('a, 'b) mpair = { mutable fst : 'a; snd : 'b }

What are the types of the following expressions?

1. [|[]|]

2. { fst = []; snd = [] }

3. { { fst = (); snd = 2 } with fst = 1 }

Exercise 8.3 Records can be used to implement abstract data structures, where the data structure is viewed as a record of functions, and the data representation is hidden. For example, a type definition for a functional dictionary is as follows.

type ('key, 'value) dictionary =
  { insert : 'key -> 'value -> ('key, 'value) dictionary;
    find : 'key -> 'value
  }

val empty : ('key, 'value) dictionary

Implement the empty dictionary empty. Your implementation should be pure, without side-effects. You are free to use any internal representation of the dictionary.

Exercise 8.4 Records can also be used to implement a simple form of object-oriented programming. Suppose we are implementing a collection of geometric objects (blobs), where each blob has a position, a function (called a method) to compute the area covered by the blob, and methods to set the position and move the blob. The following record defines the methods for a generic object.

type blob =
  { get : unit -> float * float;
    area : unit -> float;
    set : float * float -> unit;
    move : float * float -> unit
  }

let new_rectangle x y w h =
  let pos = ref (x, y) in
  let rec r =
    { get = (fun () -> !pos);
      area = (fun () -> w *. h);
      set = (fun loc -> pos := loc);
      move = (fun (dx, dy) ->
        let (x, y) = r.get () in
        r.set (x +. dx, y +. dy))
    }
  in
  r

The rectangle record is defined recursively so that the method move can be defined in terms of get and set.

Suppose we have created a new rectangle rect1, manipulated it, and now we want to fix it in position. We might try to do this by redefining the method set.

let rect1 = new_rectangle 0.0 0.0 1.0 1.0 in
rect1.move (1.2, 3.4);
· · ·
let rect2 = { rect1 with set = (fun _ -> ()) }

1. What happens to rect2 when rect2.move is called? How can you prevent it from moving?

2. What happens to rect2 when rect1.set is called?

Exercise 8.5 Write a function string_reverse : string -> unit to reverse a string in-place.

Exercise 8.6 What problem might arise with the following implementation of an array blit function? How can it be fixed?

let blit src src_off dst dst_off len =
  for i = 0 to len - 1 do
    dst.(dst_off + i) <- src.(src_off + i)
  done

Exercise 8.7 Insertion sort is a sorting algorithm that works by inserting elements one-by-one into an array of sorted elements. Although the algorithm takes O(n^2) time to sort an array of n elements, it is simple, and it is also efficient when the array to be sorted is small. The pseudo-code is as follows.

insert(array a, int i)
   x <- a[i]
   j <- i - 1
   while j >= 0 and a[j] > x
      a[j + 1] <- a[j]
      j <- j - 1
   a[j + 1] <- x

insertion_sort(array a)
   i <- 1
   while i < length(a)
      insert(a, i)
      i <- i + 1

Write this program in OCaml.

Chapter 9

Exceptions

Exceptions are used in OCaml as a control mechanism, either to signal errors, or to control the flow of execution in some other way. In their simplest form, exceptions are used to signal that the current computation cannot proceed because of a run-time error. For example, if we try to evaluate the quotient 1 / 0 in the toploop, the runtime signals a Division_by_zero error, the computation is aborted, and the toploop prints an error message.

# 1 / 0;;
Exception: Division_by_zero.

Exceptions can also be defined and used explicitly by the programmer. For example, suppose we define a function head that returns the first element in a list. If the list is empty, we would like to signal an error.

# exception Fail of string;;
exception Fail of string
# let head = function
    h :: _ -> h
  | [] -> raise (Fail "head: the list is empty");;
val head : 'a list -> 'a = <fun>
# head [3; 5; 7];;
- : int = 3
# head [];;
Exception: Fail "head: the list is empty".

The first line of this program defines a new exception, declaring Fail as an exception with a string argument. The head function uses pattern matching: the result is h if the list has first element h; otherwise, there is no first element, and the head function raises a Fail exception. The expression (Fail "head: the list is empty") is a value of type exn; the raise function is responsible for aborting the current computation.

# Fail "message";;
- : exn = Fail "message"
# raise;;
- : exn -> 'a = <fun>
# raise (Fail "message");;
Exception: Fail "message".

The type exn -> 'a for the raise function may seem striking at first; it appears to say that the raise function can produce a value having any type. In fact, what it really means is that the raise function never returns, and so the type of the result doesn't matter. When a raise expression occurs in a larger computation, the entire computation is aborted.

# 1 + raise(Fail "abort") * 21;;

Exception: Fail "abort".
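Both points can be seen in a short sketch: because raise has result type 'a, the same call fits a branch of any type, and nothing that follows a raise in the same computation is evaluated. The names checked_div, non_empty, trace, and aborted are hypothetical and appear only in this illustration.

```ocaml
exception Fail of string

(* The raise branch is given type int here... *)
let checked_div x y =
  if y = 0 then raise (Fail "checked_div: zero divisor") else x / y

(* ...and type string here. *)
let non_empty s =
  if s = "" then raise (Fail "non_empty: empty string") else s

(* trace records which steps actually ran. *)
let trace = ref []

let aborted () =
  trace := "before" :: !trace;
  ignore (checked_div 1 0);        (* raises Fail *)
  trace := "after" :: !trace       (* never reached *)
```

Calling aborted () raises Fail, and afterwards trace holds only "before".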

When an exception is raised, the current computation is aborted, and control is passed directly to the currently active exception handler, which in this case is the toploop itself. It is also possible to define explicit exception handlers. Exception handlers have the same form as a match expression, but use the try keyword instead. The syntax is as follows.

try expression_t with
 | pattern_1 -> expression_1
 | pattern_2 -> expression_2
 ...
 | pattern_n -> expression_n

First, expression_t is evaluated. If it does not raise an exception, its value is returned as the result of the try expression. Otherwise, if an exception is raised during evaluation of expression_t, the exception is matched against the patterns pattern_1, ..., pattern_n. If the first pattern to match the exception is pattern_i, the expression expression_i is evaluated and returned as the result of the entire try expression. Unlike a match expression, there is no requirement that the pattern matching be complete. If no pattern matches, the exception is not caught, and it is propagated to the next exception handler (which may be the toploop).
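These rules can be illustrated with a small sketch: a handler with more than one pattern, and an inner handler whose patterns do not match, so that the exception continues to the enclosing one. The functions describe and propagates are hypothetical names; head and Fail are redefined here only to keep the example self-contained.

```ocaml
exception Fail of string

let head = function
  | h :: _ -> h
  | [] -> raise (Fail "head: the list is empty")

(* A handler with two patterns; the first one that matches is used. *)
let describe l =
  try string_of_int (head l) with
  | Fail msg -> "failed: " ^ msg
  | Not_found -> "not found"

(* The inner handler only knows about Fail, so Division_by_zero is not
   caught there and propagates to the outer handler. *)
let propagates () =
  try
    (try raise Division_by_zero with Fail _ -> 0)
  with Division_by_zero -> -1
```

When no exception is raised, as in describe [3; 5; 7], the handlers are ignored and the value of the body is the result.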

For example, suppose we wish to define a function head_default, similar to head, but returning a default value if the list is empty. One way would be to write a new function from scratch, but we can also choose to handle the exception from head.

# let head_default l default =
    try head l with
      Fail _ -> default;;
val head_default : 'a list -> 'a -> 'a = <fun>

# head_default [3; 5; 7] 0;;

- : int = 3

# head_default [] 0;;

- : int = 0

In this case, if evaluation of head l raises an exception Fail, the value default is

CHAPTER 9. EXCEPTIONS 9.1. NESTED EXCEPTION HANDLERS