Preliminaries and Assumptions - Pattern discovery for parallelism in functional languages

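We illustrate our approach over the simple expression language, $E$.

\[
\begin{array}{rcl}
e \in E & ::= & \mathsf{true} \mid \mathsf{false} \mid z \mid \mathit{var} \mid \mathsf{nil}_{\tau} \mid \mathsf{cons}_{\tau}\ e\ e \\
 & \mid & \mathsf{case}\ \mathit{var}\ \mathsf{of}\ \mathsf{nil}_{\tau} \to e,\ \mathsf{cons}_{\tau}\ \mathit{var}\ \mathit{var} \to e \mid e\ \vec{e} \\
 & \mid & \lambda \vec{\mathit{var}} \to e \mid \mathsf{fix}\ e
\end{array}
\]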

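$E$ is a simple, strict, functional language. Its terms form a common subset of functional languages: boolean constants, $\mathsf{true}$ and $\mathsf{false}$; integer constants, $z \in \mathbb{Z}$; variables, $\mathit{var}$; list constructors, $\mathsf{nil}_{\tau}$ and $\mathsf{cons}_{\tau}\ e\ e$; case discrimination on lists, $\mathsf{case}\ \mathit{var}\ \mathsf{of}\ \mathsf{nil}_{\tau} \to e,\ \mathsf{cons}_{\tau}\ \mathit{var}\ \mathit{var} \to e$; function application, $e\ \vec{e}$; lambda expressions, $\lambda \vec{\mathit{var}} \to e$; and fixpoints, $\mathsf{fix}\ e$. Constructors are restricted to cons-lists for simplicity and clarity of presentation, but the approach is extensible to other types, given a definition of variable update (Definition 4.3.1) for that type. Similarly, the approach can be extended to arbitrary types, given that a definition of variable update can be derived for arbitrary constructors.

\[
\textsc{bool}_1\ \frac{}{\Gamma \vdash \mathsf{true} : \mathsf{bool}}
\qquad
\textsc{bool}_2\ \frac{}{\Gamma \vdash \mathsf{false} : \mathsf{bool}}
\qquad
\textsc{int}\ \frac{}{\Gamma \vdash z : \mathsf{int}}
\qquad
\textsc{var}\ \frac{}{\Gamma \cup \{x : \tau\} \vdash x : \tau}
\]
\[
\textsc{list}_1\ \frac{}{\Gamma \vdash \mathsf{nil}_{\tau} : \mathsf{list}\ \tau}
\qquad
\textsc{list}_2\ \frac{\Gamma \vdash e_1 : \tau \quad \Gamma \vdash e_2 : \mathsf{list}\ \tau}{\Gamma \vdash \mathsf{cons}_{\tau}\ e_1\ e_2 : \mathsf{list}\ \tau}
\]
\[
\textsc{case}\ \frac{\Gamma \vdash xs : \mathsf{list}\ \tau_1 \quad \Gamma \vdash y : \tau_1 \quad \Gamma \vdash ys : \mathsf{list}\ \tau_1 \quad \Gamma \vdash e_1 : \tau_2 \quad \Gamma \cup \{y : \tau_1,\ ys : \mathsf{list}\ \tau_1\} \vdash e_2 : \tau_2}{\Gamma \vdash \mathsf{case}\ xs\ \mathsf{of}\ \mathsf{nil}_{\tau_1} \to e_1,\ \mathsf{cons}_{\tau_1}(y, ys) \to e_2 : \tau_2}
\]
\[
\textsc{app}\ \frac{\Gamma \vdash e' : \vec{\tau} \to \tau_m \quad \Gamma \vdash \vec{e} : \vec{\tau}}{\Gamma \vdash e'\ \vec{e} : \tau_m}
\qquad
\textsc{fun}\ \frac{\Gamma \cup \{\vec{x} : \vec{\tau}\} \vdash e : \tau_m}{\Gamma \vdash \lambda \vec{x} \to e : \vec{\tau} \to \tau_m}
\]
\[
\textsc{fix}\ \frac{\Gamma \vdash e : (((\tau_2, \ldots, \tau_n) \to \tau_m), \tau_2, \ldots, \tau_n) \to \tau_m}{\Gamma \vdash \mathsf{fix}\ e : (\tau_2, \ldots, \tau_n) \to \tau_m}
\]

Figure 4.3: Typing judgements for $E$, determining simple types in $T$.

Vector notation, e.g. $\vec{e}$, refers to a non-empty tuple: $\vec{e} \equiv (e_1, \ldots, e_n)$, $n \geq 1$. In order to simplify our presentation, list constructors and case discriminators are annotated with the (monomorphic) type of the list elements, $\tau$. The corresponding type language, $T$, is shown below.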

\[
\tau \in T ::= \mathsf{bool} \mid \mathsf{int} \mid \mathsf{list}\ \tau \mid \vec{\tau} \to \tau
\]
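To make the two grammars and the typing rules of Figure 4.3 concrete, the following is a minimal Haskell sketch of $E$, $T$, and a checker that follows the rules one constructor at a time. All constructor and function names (`Expr`, `Type`, `typeOf`, `sumList`, `builtins`) are hypothetical. The sketch also assumes that lambda binders carry type annotations, which $E$ itself does not require, so that checking needs no inference, and it binds the head and tail variables only when checking the cons branch of a case.

```haskell
import qualified Data.Map as Map

type Name = String

-- Types in T: bool | int | list t | (t1, ..., tn) -> t
data Type = TBool | TInt | TList Type | TFun [Type] Type
  deriving (Eq, Show)

-- Terms of E. Assumption of this sketch: lambda binders are annotated.
data Expr
  = BoolLit Bool
  | IntLit Integer
  | Var Name
  | Nil Type
  | Cons Type Expr Expr
  | Case Name Type Expr (Name, Name) Expr -- case xs of nil_t -> e1, cons_t (y, ys) -> e2
  | App Expr [Expr]
  | Lam [(Name, Type)] Expr
  | Fix Expr
  deriving (Eq, Show)

type Env = Map.Map Name Type

-- Returns the type of an expression, or Nothing if no rule applies.
typeOf :: Env -> Expr -> Maybe Type
typeOf _ (BoolLit _) = Just TBool
typeOf _ (IntLit _) = Just TInt
typeOf env (Var x) = Map.lookup x env
typeOf _ (Nil t) = Just (TList t)
typeOf env (Cons t e1 e2) = do
  t1 <- typeOf env e1
  t2 <- typeOf env e2
  if t1 == t && t2 == TList t then Just (TList t) else Nothing
typeOf env (Case xs t e1 (y, ys) e2) = do
  txs <- Map.lookup xs env
  r1 <- typeOf env e1
  r2 <- typeOf (Map.insert y t (Map.insert ys (TList t) env)) e2
  if txs == TList t && r1 == r2 then Just r1 else Nothing
typeOf env (App f args) = do
  tf <- typeOf env f
  targs <- mapM (typeOf env) args
  case tf of
    TFun params res | params == targs -> Just res
    _ -> Nothing
typeOf env (Lam params body) = do
  -- Map.union is left-biased, so the parameters shadow outer bindings.
  res <- typeOf (Map.union (Map.fromList params) env) body
  Just (TFun (map snd params) res)
typeOf env (Fix e) = do
  te <- typeOf env e
  case te of
    -- The first parameter must have the type of the function being defined.
    TFun (TFun ps r : ps') r' | ps == ps' && r == r' -> Just (TFun ps r)
    _ -> Nothing

-- A fixpoint that sums a list of integers, using a built-in plus.
sumList :: Expr
sumList =
  Fix (Lam [("f", TFun [TList TInt] TInt), ("xs", TList TInt)]
        (Case "xs" TInt (IntLit 0) ("y", "ys")
          (App (Var "plus") [Var "y", App (Var "f") [Var "ys"]])))

-- Hypothetical environment holding the one built-in used above.
builtins :: Env
builtins = Map.fromList [("plus", TFun [TInt, TInt] TInt)]
```

Evaluating `typeOf builtins sumList` in this sketch exercises the case, app, fun, and fix rules together.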

The typing judgements in Fig. 4.3 then determine the well-formedness of expressions in $E$ with regard to their monomorphic types in $T$. A statement, $s \in S$, is an assignment.

\[
s \in S ::= \mathsf{def}\ \mathit{var} = \lambda \vec{\mathit{var}} \to e
\]

Statements appear only at the top level of a program. A variable may be bound either to a lambda expression or to a fixpoint expression. Those bindings are then in scope for the duration of the program. A program, $p \in P$, is a series of statements.

\[
p \in P ::= s \mid s;\ p
\]

P can be thought of as an intermediate representation to which, e.g., Haskell or Erlang are compiled, similar to Core Haskell. For example, the Haskell definition of sudoku,

    sudoku [] = []
    sudoku (p:ps) = solve p : sudoku ps

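can be translated into a term in $P$:

\[
\begin{array}{l}
\mathsf{def}\ \mathit{sudoku} = \mathsf{fix}\ \lambda(f, x) \to \\
\quad \mathsf{case}\ x\ \mathsf{of}\ \mathsf{nil}_{p} \to \mathsf{nil}_{p}, \\
\quad\quad \mathsf{cons}_{p}(y, ys) \to \mathsf{cons}_{p}(\mathit{solve}(y), f(ys))
\end{array}
\]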
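As an illustration only, the translated term can also be written down as a value of a Haskell datatype for $P$. The datatype and all constructor names below are hypothetical, and the element type $p$ that annotates nil and cons is assumed to be a list of integers.

```haskell
type Name = String

data Type = TBool | TInt | TList Type | TFun [Type] Type
  deriving (Eq, Show)

data Expr
  = BoolLit Bool
  | IntLit Integer
  | Var Name
  | Nil Type
  | Cons Type Expr Expr
  | Case Name Type Expr (Name, Name) Expr
  | App Expr [Expr]
  | Lam [Name] Expr
  | Fix Expr
  deriving (Eq, Show)

-- A top-level statement: def var = e
data Stmt = Def Name Expr
  deriving (Eq, Show)

-- p ::= s | s; p
data Program = Single Stmt | Seq Stmt Program
  deriving (Eq, Show)

-- The annotation p on nil and cons; assumed here to be list int.
puzzleTy :: Type
puzzleTy = TList TInt

-- def sudoku = fix \(f, x) -> case x of nil -> nil, cons (y, ys) -> cons (solve (y), f (ys))
sudokuDef :: Stmt
sudokuDef =
  Def "sudoku"
    (Fix (Lam ["f", "x"]
      (Case "x" puzzleTy
        (Nil puzzleTy)
        ("y", "ys")
        (Cons puzzleTy (App (Var "solve") [Var "y"]) (App (Var "f") [Var "ys"])))))

sudokuProgram :: Program
sudokuProgram = Single sudokuDef
```

The recursive call `sudoku ps` of the Haskell source becomes an application of the fixpoint's first parameter, `f`, to the tail variable `ys`.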

This chapter will use Haskell syntax for examples in order to improve readability. All examples given can be translated into $P$ following a similar principle to the above example. Our approach will inspect only the code provided and does not presume to predict possible compiler optimisations, e.g. fusions [48] or worker-wrapper [49] transformations. Since $P$ is an intermediate representation, these techniques can be applied either before or after the application of our approach.

Where pattern matching is used, we will use as-patterns to indicate the implicit list variables; e.g.

    sudoku ps0@[] = []
    sudoku ps0@(p:ps) = solve p : sudoku ps

For clarity, all the variables in our examples will be consistent across function clauses. All variables are assumed to be unique under $\alpha$-conversion, at both the statement and expression levels. Type environments, $\Gamma$, are defined to be a set of bindings of variables to types:

\[
\Gamma \in \{\mathit{var} : T\}
\]

As usual, all values in the domain of $\Gamma$ are assumed unique, and $\Gamma(x)$ denotes the type $\tau$ of $x$ in $\Gamma$, such that $\exists \tau \in T, (x : \tau) \in \Gamma$. For a given program $p$, the program environment, $\Gamma_p$, contains all the variables in $p$.

Definition 4.2.1 (Program Environment, $\Gamma_p$). Given some program $p$ and the set of variables $X \subseteq \mathit{var}$ that occur (either free or bound) in $p$, we define an environment $\Gamma_p$ to be a set such that $\forall x \in X, \exists \tau \in T, \Gamma_p = \{x : \tau\} \cup \Gamma_p$.

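The above sudoku definition, for example, has the $\Gamma_p$:

\[
\begin{array}{rl}
\Gamma_p = \{ & x : \mathsf{list}\ (\mathsf{list}\ \mathsf{int}),\ y : \mathsf{list}\ \mathsf{int},\ ys : \mathsf{list}\ (\mathsf{list}\ \mathsf{int}), \\
 & f : (\mathsf{list}\ (\mathsf{list}\ \mathsf{int})) \to (\mathsf{list}\ (\mathsf{list}\ \mathsf{int})),\ \mathit{solve} : (\mathsf{list}\ \mathsf{int}) \to (\mathsf{list}\ \mathsf{int}),\ \ldots \}
\end{array}
\]

We omit the variables and types of solve for clarity. For the rest of this chapter, we will assume that $x \in \Gamma_p$ for all given variables $x$.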
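A program environment can be pictured as a finite map from variable names to types. The sketch below uses `Data.Map` from the containers library and encodes only the five bindings listed for sudoku; the elided entries are left out. The names `gammaP`, `typeOfVar`, and `covers` are hypothetical.

```haskell
import qualified Data.Map as Map

type Name = String

data Type = TBool | TInt | TList Type | TFun [Type] Type
  deriving (Eq, Show)

type Env = Map.Map Name Type

-- The bindings listed for the sudoku example; elided entries are omitted.
gammaP :: Env
gammaP = Map.fromList
  [ ("x", TList (TList TInt))
  , ("y", TList TInt)
  , ("ys", TList (TList TInt))
  , ("f", TFun [TList (TList TInt)] (TList (TList TInt)))
  , ("solve", TFun [TList TInt] (TList TInt))
  ]

-- Gamma(x): the type bound to x, if x is in the environment.
typeOfVar :: Env -> Name -> Maybe Type
typeOfVar env x = Map.lookup x env

-- Checks that every variable in the given list has a binding.
covers :: Env -> [Name] -> Bool
covers env = all (`Map.member` env)
```

Because a map holds at most one binding per key, the uniqueness assumption on the domain of an environment holds by construction in this representation.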

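It is useful to define the notion of subexpressions in $E$, since subexpressions are a key element in both slicing and classification definitions.

Definition 4.2.2 (Subexpression). Given two expressions $e, e'$, say that $e'$ is a subexpression of $e$ (denoted $e' \sqsubseteq e$) when

\[
\frac{e' = e}{e' \sqsubseteq e}
\qquad
\frac{e' \sqsubseteq e_1 \lor e' \sqsubseteq e_2}{e' \sqsubseteq \mathsf{cons}_{\tau}\ e_1\ e_2}
\qquad
\frac{e' \sqsubseteq e_1 \lor e' \sqsubseteq e_2}{e' \sqsubseteq \mathsf{case}\ x\ \mathsf{of}\ \mathsf{nil}_{\tau} \to e_1,\ \mathsf{cons}_{\tau}\ x'\ x'' \to e_2}
\]
\[
\frac{\exists i \in [0, n],\ e' \sqsubseteq e_i}{e' \sqsubseteq e_0\ \vec{e}}
\qquad
\frac{e' \sqsubseteq e}{e' \sqsubseteq \lambda \vec{x} \to e}
\qquad
\frac{e' \sqsubseteq e}{e' \sqsubseteq \mathsf{fix}\ e}
\]

Subexpressions form a partial order relation.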
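The subexpression rules can be read as a recursive traversal. The following is a minimal sketch, with hypothetical names (`children`, `subexprs`, `isSubexprOf`), that enumerates every subexpression of a term and tests membership by structural equality. As in the definition, the variables named by a case-expression or bound by a lambda are not themselves counted as subexpressions.

```haskell
type Name = String

data Type = TBool | TInt | TList Type | TFun [Type] Type
  deriving (Eq, Show)

data Expr
  = BoolLit Bool
  | IntLit Integer
  | Var Name
  | Nil Type
  | Cons Type Expr Expr
  | Case Name Type Expr (Name, Name) Expr
  | App Expr [Expr]
  | Lam [Name] Expr
  | Fix Expr
  deriving (Eq, Show)

-- Immediate subterms, one clause per inductive rule of the definition.
children :: Expr -> [Expr]
children (Cons _ e1 e2) = [e1, e2]
children (Case _ _ e1 _ e2) = [e1, e2]
children (App f args) = f : args
children (Lam _ body) = [body]
children (Fix body) = [body]
children _ = []

-- Every subexpression of e, including e itself (the reflexive rule).
subexprs :: Expr -> [Expr]
subexprs e = e : concatMap subexprs (children e)

-- isSubexprOf e' e holds when e' is a subexpression of e.
isSubexprOf :: Expr -> Expr -> Bool
isSubexprOf e' e = e' `elem` subexprs e

-- A function that adds 1 to its argument, using a built-in add.
addOne :: Expr
addOne = Lam ["x"] (App (Var "add") [Var "x", IntLit 1])
```

Under this sketch, both the application `add(x, 1)` and `addOne` itself are subexpressions of `addOne`.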

For example, for a function that adds 1 to its argument, $\lambda(x) \to \mathit{add}(x, 1)$, we have $\mathit{add}(x, 1) \sqsubseteq \lambda(x) \to \mathit{add}(x, 1)$. We will refer to any application that is a subexpression of a fixpoint and that is not a recursive call as an operation. For example, the fixpoint expression

\[
\mathsf{fix}\ \lambda(f, xs) \to \mathsf{case}\ xs\ \mathsf{of}\ \mathsf{nil}_{p} \to 0,\ \mathsf{cons}_{p}(y, ys) \to \mathit{plus}(y, f(ys))
\]

which sums the elements of a list of integers, has one operation: $\mathit{plus}$. $f(ys)$ is not an operation because it is a recursive call.
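The notion of an operation can be sketched as a filter over the subexpressions of a fixpoint. The code below assumes that a recursive call is an application whose head is the first parameter of the lambda under `fix`, as in the fix typing rule. All names (`operations`, `operationNames`, `sumList`) are hypothetical.

```haskell
type Name = String

data Type = TBool | TInt | TList Type | TFun [Type] Type
  deriving (Eq, Show)

data Expr
  = BoolLit Bool
  | IntLit Integer
  | Var Name
  | Nil Type
  | Cons Type Expr Expr
  | Case Name Type Expr (Name, Name) Expr
  | App Expr [Expr]
  | Lam [Name] Expr
  | Fix Expr
  deriving (Eq, Show)

children :: Expr -> [Expr]
children (Cons _ e1 e2) = [e1, e2]
children (Case _ _ e1 _ e2) = [e1, e2]
children (App f args) = f : args
children (Lam _ body) = [body]
children (Fix body) = [body]
children _ = []

subexprs :: Expr -> [Expr]
subexprs e = e : concatMap subexprs (children e)

-- Applications inside a fixpoint whose head is not the recursive variable.
-- Assumption: the recursive variable is the lambda's first parameter.
operations :: Expr -> [Expr]
operations (Fix (Lam (f : _) body)) =
  [app | app@(App hd _) <- subexprs body, hd /= Var f]
operations _ = []

-- Names of the operations that are applications of a plain variable.
operationNames :: Expr -> [Name]
operationNames e = [n | App (Var n) _ <- operations e]

-- fix \(f, xs) -> case xs of nil -> 0, cons (y, ys) -> plus (y, f (ys))
sumList :: Expr
sumList =
  Fix (Lam ["f", "xs"]
        (Case "xs" TInt (IntLit 0) ("y", "ys")
          (App (Var "plus") [Var "y", App (Var "f") [Var "ys"]])))
```

For `sumList`, the only application that survives the filter is the call to `plus`; the call `f(ys)` is dropped because its head is the recursive variable.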

Functions are introduced using a $\lambda$-expression. They are always pure, are uncurried, and are never partially applied. They may, however, be higher-order. Recursive (function) definitions are always introduced using an explicit fixpoint expression, as in e.g.:

\[
\mathsf{fix}\ (\lambda(f, xs) \to f(xs)).
\]

The form of recursion is not otherwise restricted; general recursive forms are allowed, for example.

Lists are defined to be an ordered collection of elements, where those elements are accessed via case-expressions. As shown by the typing rules of Figure 4.3, case discrimination is restricted to lists of some type $\tau$. We assume the existence of built-in functions (e.g. if, eq, plus) for discrimination and operations on integers and booleans. Case-expressions can be extended to other types, given an additional check on the type of the discriminated variable in the relevant definitions. We limit case-expressions here to simplify our presentation. In the non-nil branch, new variables are bound respectively to the first element in the list (i.e. the head) and to the remainder of the list (i.e. the tail). A corresponding Reachability Relation is defined for each case-expression.

Definition 4.2.3 (Reachability Relation). Given a program $p$, a program environment $\Gamma_p$, and a case-expression in $p$, $e = \mathsf{case}\ xs_0\ \mathsf{of}\ \mathsf{nil}_{\tau} \to e_1,\ \mathsf{cons}_{\tau}\ x\ xs \to e_2$, we say that $x \lhd_p xs_0$ and $xs \lhd_p xs_0$. The transitive closure of the reachability
