
Generic Functions and Datatypes

Consider these two functions that compute the lengths of either an int list (length1) or a string list (length2):

let rec length1 (l:int list) : int =
  begin match l with
  | [] -> 0
  | _::tl -> 1 + (length1 tl)
  end

let rec length2 (l:string list) : int =
  begin match l with
  | [] -> 0
  | _::tl -> 1 + (length2 tl)
  end

Apart from their names and the type annotation on the argument l, both functions are identical: they follow exactly the same algorithm, independently of the kind of elements stored in the lists.

Computing a list length is an example of a generic function. In this case, the function is generic with respect to the type of list elements. Modern programming languages such as OCaml, Java, and C# provide support for writing such generic functions so that the same algorithm can be applied to many different input types.

For example, we can write a single length function that works for any list:

(* a generic version of length *)
let rec length (l:'a list) : int =
  begin match l with
  | [] -> 0
  | _::tl -> 1 + (length tl)
  end

The only difference between this generic version and the two above is that the type of the argument l is 'a list. Here 'a is a type variable: a placeholder for types. The type of length says that it works for an input of type 'a list, where 'a can be instantiated to any type.

For example, given the definition above, we can pass length a list of integers or a list of strings:

length [1;2;3;4]              (* 'a instantiated to int *)
length ["uno"; "dos"; "tres"] (* 'a instantiated to string *)

OCaml uses the type of the list passed to length to figure out what 'a should be. In the first case, 'a is instantiated to int; in the second, 'a is instantiated to string.
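As a small sketch of how instantiation plays out, the snippet below repeats the generic length so that it stands alone and applies it at three element types. The names n_ints, n_strs, and n_nested are illustrative and do not come from the text.

```ocaml
(* the generic length from above, repeated so the snippet is self-contained *)
let rec length (l : 'a list) : int =
  begin match l with
  | [] -> 0
  | _ :: tl -> 1 + length tl
  end

(* illustrative names; each call instantiates 'a differently *)
let n_ints : int = length [1; 2; 3; 4]             (* 'a = int *)
let n_strs : int = length ["uno"; "dos"; "tres"]   (* 'a = string *)
let n_nested : int = length [[1; 2]; []; [3]]      (* 'a = int list *)
```

In the last call the elements are themselves lists, so 'a is instantiated to int list and length counts the three inner lists without looking inside them.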

The length function doesn’t need to do anything with the elements of the list, but there are generic functions that can manipulate the list elements. For example, here is how we can write a generic append function that will take two lists of the same element type and compute the result of appending them:

(* generic append *)
let rec append (l1:'a list) (l2:'a list) : 'a list =
  begin match l1 with
  | [] -> l2
  | h::tl -> h::(append tl l2)
  end

There are a few observations to make here. First, the type variable 'a appears in the types of two different inputs (l1 and l2); this means that whenever OCaml figures out what type 'a stands for, it must agree with both list arguments. It is not possible to call append with an int list as the first argument and a string list as the second argument. Second, the result type of the function also mentions 'a, which means that the element type of the resulting list is the same as the element types of both the input lists. Finally, note that we can still use pattern matching to manipulate generic data: since l1 has type 'a list, we know that inside the case for cons, h must be of type 'a and tl itself has type 'a list.
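The first two observations can be checked with a short sketch. It repeats append so that it stands alone; the names ints and strs are illustrative, and the ill-typed call is left in a comment because the type checker rejects it.

```ocaml
(* generic append, repeated so the snippet is self-contained *)
let rec append (l1 : 'a list) (l2 : 'a list) : 'a list =
  begin match l1 with
  | [] -> l2
  | h :: tl -> h :: append tl l2
  end

(* both arguments agree on 'a, and so does the result *)
let ints : int list = append [1; 2] [3; 4]           (* 'a = int *)
let strs : string list = append ["a"] ["b"; "c"]     (* 'a = string *)

(* Rejected by the type checker, since 'a cannot be both int and string:
   let bad = append [1; 2] ["three"]
*)
```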

Functions may be generic with respect to more than one type of value. For example, below is a generic version of the zip function that we saw in §4.5 (the version there worked only with inputs of type int list and string list):

let rec zip (l1:'a list) (l2:'b list) : ('a*'b) list =
  begin match (l1,l2) with
  | ([], []) -> []
  | (h1::tl1, h2::tl2) -> (h1,h2)::(zip tl1 tl2)
  | _ -> failwith "zip called on unequal length lists"
  end

Some examples of using zip show how it behaves “the same” no matter which types the 'a and 'b variables are instantiated to:

'a = int and 'b = string:

zip [1;2;3] ["uno"; "dos"; "tres"] ⟹ [(1,"uno"); (2,"dos"); (3,"tres")]

'a = int and 'b = int:

zip [1;2;3] [4;5;6] ⟹ [(1,4); (2,5); (3,6)]

'a = bool and 'b = int:

zip [true;false] [1;2] ⟹ [(true,1); (false,2)]

8.1 User-defined generic datatypes

We saw in §5 how programmers can define their own datatypes in OCaml, but we haven’t yet seen how to define a generic datatype like OCaml’s built-in list. The idea is straightforward: we create a generic datatype by parameterizing the type by type variables ('a, 'b, etc.) just like the ones used to write down the types in a generic function.

Recall the definition of int-labeled binary trees that we worked with in §6:

(* non-generic binary trees with int labels *)
type tree =
  | Empty
  | Node of tree * int * tree

We can make this into a generic binary tree type by adding a type parameter like so:

(* generic binary trees labeled by 'a values *)
type 'a tree =
  | Empty
  | Node of ('a tree) * 'a * ('a tree)

Note the differences: we now have a generic type 'a tree that represents binary trees all of whose nodes contain values of type 'a. Different concrete instances of such trees may instantiate the 'a variable differently. The type variable 'a is a type, so it can be used as part of a tuple, as shown in the case for the Node constructor. The recursive occurrences of tree must also be parameterized by the same 'a; this ensures that all of the subtrees of an 'a tree have nodes consistently labeled by 'a values.

Here are some examples:

Node(Empty, 3, Empty) : int tree
Node(Empty, "abc", Empty) : string tree
Node(Node(Empty, (true, 3), Empty), (false, 4), Empty) : (bool * int) tree
Node(Node(Empty, 3, Empty), "abc", Empty)   Error! ill-typed

We can compute with such generic datatypes by pattern matching, just as we saw earlier. In particular, the constructors of the datatype form the patterns, and those patterns bind identifiers of the appropriate types within the branches. For example, we can write a generic function that “mirrors” a tree (i.e., recursively swaps its left and right subtrees) like this:

let rec mirror (t:'a tree) : 'a tree =
  begin match t with
  | Empty -> Empty
  | Node(lt, x, rt) -> Node(mirror rt, x, mirror lt)
  end

In the branch for the Node constructor, the identifiers lt and rt have type 'a tree and the identifier x has type 'a. Since this function doesn’t depend on any particular properties of 'a, it is truly generic.
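To see mirror at work on two different instantiations of 'a, here is a minimal sketch. The helper inorder and the values t1, t2, labels1, and labels2 are hypothetical names introduced only for illustration; inorder uses OCaml's built-in list append operator @ to collect the labels from left to right.

```ocaml
(* generic tree type and mirror, repeated so the snippet is self-contained *)
type 'a tree =
  | Empty
  | Node of 'a tree * 'a * 'a tree

let rec mirror (t : 'a tree) : 'a tree =
  begin match t with
  | Empty -> Empty
  | Node (lt, x, rt) -> Node (mirror rt, x, mirror lt)
  end

(* hypothetical helper: the labels of a tree, listed left to right *)
let rec inorder (t : 'a tree) : 'a list =
  begin match t with
  | Empty -> []
  | Node (lt, x, rt) -> inorder lt @ (x :: inorder rt)
  end

let t1 : int tree =
  Node (Node (Empty, 1, Empty), 2, Node (Empty, 3, Empty))
let t2 : string tree =
  Node (Empty, "abc", Node (Empty, "def", Empty))

let labels1 : int list = inorder (mirror t1)      (* [3; 2; 1] *)
let labels2 : string list = inorder (mirror t2)   (* ["def"; "abc"] *)
```

Like mirror, inorder never inspects the labels themselves, so the same two definitions serve both the int tree and the string tree.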

8.2 Why use generics?

Why are generic functions and datatypes valuable? They allow programmers to re-use algorithms in many contexts. For example, we can define lots of different list functions generically and then re-use them for any particular kind of list we happen to need. In particular, the designers of the generic list functions don’t have to be aware of what particular kind of list elements some future program might happen to use. A programmer may find herself needing a widget list in a graphics program, but if she needs to know its length, then the generic list length will do the trick. Importantly, generic functions work even for types not yet defined when the generic function or datatype was created.
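The widget scenario might look like the following sketch. The widget type and its constructors are invented here purely for illustration; the point is only that widget is declared after length, yet length applies to a widget list unchanged.

```ocaml
(* the generic length, written before any widget type exists *)
let rec length (l : 'a list) : int =
  begin match l with
  | [] -> 0
  | _ :: tl -> 1 + length tl
  end

(* hypothetical datatype, defined later by a graphics programmer *)
type widget =
  | Button of string
  | Slider of int * int

let widgets : widget list =
  [Button "ok"; Slider (0, 100); Button "cancel"]

(* 'a is instantiated to widget; length itself did not change *)
let n_widgets : int = length widgets
```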

This flexible re-use of code has another benefit: it means less work debugging lots of specialized versions of the same thing. If we had to write a list length function for every type of list element, then we would have to have many copies of essentially the same program. Such code duplication becomes a nightmare to maintain in larger-scale software systems.

Imagine needing to keep twenty “almost identical but not quite” versions of the same function in sync—if you find a bug in one instance of the code, you have to patch it the same way in all nineteen other instances.