
Annotated Complex Values

We start from the (positive) Nested Relational Calculus [19]. The types of NRC are:

$$t ::= \mathsf{label} \mid t \times t \mid \{t\}$$

Complex values are built with the following constructors:

$$v ::= l \mid (v, v) \mid \{v\} \mid v \cup v \mid \{\}$$

We abbreviate $\{v_1\} \cup \cdots \cup \{v_n\}$ as $\{v_1, \ldots, v_n\}$; e.g., $(l_1, \{l_2, l_3\})$ is a complex value of type $\mathsf{label} \times \{\mathsf{label}\}$.

⁹ When the nesting depth of the XML documents is bounded, the structural recursion operator (and the recursive tree type) are not needed; see [37].

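$$
\begin{array}{rcll}
e & ::= & l & \text{(atom)} \\
& \mid & x & \text{(variable)} \\
& \mid & \{\} & \text{(empty set)} \\
& \mid & \{e\} & \text{(singleton)} \\
& \mid & e \cup e & \text{(union)} \\
& \mid & \bigcup(x \in e)\, e & \text{(big-union)} \\
& \mid & \mathsf{let}\ x = e\ \mathsf{in}\ e & \text{(let)} \\
& \mid & \mathsf{if}\ e = e\ \mathsf{then}\ e\ \mathsf{else}\ e & \text{(if)} \\
& \mid & (e, e) & \text{(pair)} \\
& \mid & \pi_1(e) & \text{(first projection)} \\
& \mid & \pi_2(e) & \text{(second projection)}
\end{array}
$$

$$
\frac{}{\Gamma \vdash l : \mathsf{label}}
\qquad
\frac{\Gamma(x) = t}{\Gamma \vdash x : t}
\qquad
\frac{}{\Gamma \vdash \{\} : \{t\}}
\qquad
\frac{\Gamma \vdash e : t}{\Gamma \vdash \{e\} : \{t\}}
$$

$$
\frac{\Gamma \vdash e_1 : \{t\} \quad \Gamma \vdash e_2 : \{t\}}{\Gamma \vdash e_1 \cup e_2 : \{t\}}
\qquad
\frac{\Gamma \vdash e_1 : \{t_1\} \quad \Gamma, x : t_1 \vdash e_2 : \{t_2\}}{\Gamma \vdash \bigcup(x \in e_1)\, e_2 : \{t_2\}}
$$

$$
\frac{\Gamma \vdash e_1 : \mathsf{label} \quad \Gamma \vdash e_2 : \mathsf{label} \quad \Gamma \vdash e_3 : t \quad \Gamma \vdash e_4 : t}{\Gamma \vdash \mathsf{if}\ e_1 = e_2\ \mathsf{then}\ e_3\ \mathsf{else}\ e_4 : t}
$$

$$
\frac{\Gamma \vdash e_1 : t_1 \quad \Gamma \vdash e_2 : t_2}{\Gamma \vdash (e_1, e_2) : t_1 \times t_2}
\qquad
\frac{\Gamma \vdash e : t_1 \times t_2 \quad i \in \{1, 2\}}{\Gamma \vdash \pi_i(e) : t_i}
$$

Figure 6.9: Syntax and typing rules for NRC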

The syntax and typing rules for the fragment of NRC we consider are given in Figure 6.9.

The restriction to positive expressions is embodied in the typing rule for conditionals: we only compare label values. It is shown in [19] that equality tests for arbitrary sets can be used to define non-monotonic operations (i.e., difference, intersection, membership, and nesting). This restriction is essential for the semantics of NRC on annotated complex values because semirings do not contain features for representing negation.

The crucial NRC operation is the big-union operator $\bigcup(x \in e_1)\, e_2$. It computes the union of the family of sets defined by $e_2$ indexed by $x$, where $x$ takes each value in the set $e_1$. For example, the first relational projection is expressed as follows:

$$\mathrm{project}_1\, R \stackrel{\mathrm{def}}{=} \bigcup(x \in R)\, \{\pi_1(x)\}.$$
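The ordinary (unannotated) reading of big-union can be sketched in a few lines of Python. This is only an illustration: the names `big_union` and `project1`, the use of `frozenset` for finite sets, and the sample relation `R` are our assumptions, not part of the calculus.

```python
from typing import Callable, FrozenSet, Hashable


def big_union(e1: FrozenSet[Hashable],
              body: Callable[[Hashable], FrozenSet[Hashable]]) -> FrozenSet[Hashable]:
    """Union of the family of sets body(x), indexed by x ranging over e1."""
    result = set()
    for x in e1:
        result |= body(x)
    return frozenset(result)


def project1(R: FrozenSet[tuple]) -> FrozenSet[Hashable]:
    """First relational projection: big-union of the singletons {pi_1(x)}."""
    return big_union(R, lambda x: frozenset({x[0]}))


# A hypothetical binary relation over labels.
R = frozenset({("l1", "l2"), ("l1", "l3"), ("l4", "l5")})
print(sorted(project1(R)))  # ['l1', 'l4']
```

Note that the two tuples starting with `l1` collapse into a single element of the result; the annotated semantics developed next keeps track of such multiple derivations.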

Next we show how to decorate complex values with semiring annotations, and give a semantics that works on annotated values. Again we fix a commutative semiring $(K, +, \cdot, 0, 1)$. Dealing with complex values annotated with elements from $K$ requires a different semantics for the type $\{t\}$. The usual semantics is the set of finite subsets of $[\![t]\!]$. Instead, $[\![\{t\}]\!]_K$ is defined as the set of functions $f : [\![t]\!]_K \to K$ with finite support, i.e., such that $\mathrm{supp}(f) := \{a \in [\![t]\!]_K \mid f(a) \neq 0\}$ is finite. We call elements of $[\![\{t\}]\!]_K$ $K$-collections. With $K = \mathbb{B}$ we

$K$-complex values are obtained by arbitrarily nesting pairing and $K$-collections. We define new semantics for the NRC constructors: the singleton constructor $[\![\{v\}]\!]_K$ is the function that maps $[\![v]\!]_K$ to 1 and everything else to 0; $[\![\{\}]\!]_K$ is the constant function that maps everything to 0; and $[\![v_1 \cup v_2]\!]_K$ is the pointwise $K$-addition of $[\![v_1]\!]_K$ and $[\![v_2]\!]_K$. In order to express all $K$-collections in the calculus, we extend NRC with an operation for multiplying the annotations on the elements of $K$-collections by the “scalar” $k$ in $K$. It is written $k\,e$ and has the following typing rule:

$$\frac{\Gamma \vdash k \in K \quad \Gamma \vdash e : \{t\}}{\Gamma \vdash k\,e : \{t\}}$$
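The constructors just described can be sketched as follows, representing a $K$-collection as a Python dictionary that stores only its support. The `Semiring` record, the helper names, and the choice of the natural-number and Boolean semirings as sample instances are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, +, ., 0, 1), given by its operations."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Two sample instances (assumed for illustration).
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)

KColl = Dict[Hashable, Any]  # finite-support function, zero entries omitted


def normalize(K: Semiring, f: KColl) -> KColl:
    """Drop elements annotated with 0 so the dict is exactly the support."""
    return {v: k for v, k in f.items() if k != K.zero}


def empty(K: Semiring) -> KColl:
    """{} : maps everything to 0."""
    return {}


def singleton(K: Semiring, v: Hashable) -> KColl:
    """{v} : maps v to 1 and everything else to 0."""
    return normalize(K, {v: K.one})


def union(K: Semiring, f: KColl, g: KColl) -> KColl:
    """Pointwise K-addition of two K-collections."""
    out = dict(f)
    for v, k in g.items():
        out[v] = K.add(out.get(v, K.zero), k)
    return normalize(K, out)


def scale(K: Semiring, k: Any, f: KColl) -> KColl:
    """Scalar multiplication k e: multiply every annotation by k."""
    return normalize(K, {v: K.mul(k, a) for v, a in f.items()})


a2_b3 = union(NAT, scale(NAT, 2, singleton(NAT, "a")),
              scale(NAT, 3, singleton(NAT, "b")))
print(a2_b3)                                                    # {'a': 2, 'b': 3}
print(union(NAT, singleton(NAT, "a"), singleton(NAT, "a")))     # {'a': 2}
print(union(BOOL, singleton(BOOL, "a"), singleton(BOOL, "a")))  # {'a': True}
print(scale(NAT, 0, a2_b3))                                     # {}
```

In this sketch, scaling by the semiring zero empties a collection, and the union of two identical singletons adds their annotations rather than discarding the duplicate.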

We call the calculus extended with this operator $\mathrm{NRC}_K$. The $K$-complex values are constructed using:

$$v ::= l \mid (v, v) \mid k\{v\} \mid v \cup v \mid \{\}$$

and, as above, we abbreviate $K$-collections using the following notation: $\{v_1^{k_1}, \ldots, v_n^{k_n}\} \stackrel{\mathrm{def}}{=} k_1\{v_1\} \cup \cdots \cup k_n\{v_n\}$. Determining the right semantics for the $\bigcup(x \in e_1)\, e_2$ operation is more challenging.

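Let $e_1$ have type $\{t_1\}$ and $e_2$ have type $\{t_2\}$ (whenever $x$ has type $t_1$). Let $X = [\![t_1]\!]_K$ and $Y = [\![t_2]\!]_K$. Then $[\![e_1]\!]_K$ is a function $f : X \to K$ with finite support $\mathrm{supp}(f) = \{x_1, \ldots, x_n\}$. In general $e_2$ depends on $x$, so for each $x_i$ we have a corresponding semantics for $e_2$, i.e., a function $g_i : Y \to K$. Using these functions we define, for each $y \in Y$,

$$[\![\textstyle\bigcup(x \in e_1)\, e_2]\!]_K(y) \stackrel{\mathrm{def}}{=} \sum_{i=1}^{n} f(x_i) \cdot g_i(y)$$

Since each $g_i$ has finite support, so does $[\![\bigcup(x \in e_1)\, e_2]\!]_K$.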
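The defining sum translates directly into code. The sketch below again stores a $K$-collection as a dictionary over its support; the `Semiring` record, the function name `big_union`, and the sample data are illustrative assumptions. The argument `body` plays the role of the map $x_i \mapsto g_i$.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, +, ., 0, 1), given by its operations."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

KColl = Dict[Hashable, Any]  # finite-support function, zero entries omitted


def big_union(K: Semiring, f: KColl, body: Callable[[Hashable], KColl]) -> KColl:
    """Annotated big-union: result(y) = sum over x_i in supp(f) of f(x_i) * g_i(y),
    where g_i = body(x_i)."""
    out: KColl = {}
    for x, fx in f.items():              # x ranges over supp(f)
        for y, gy in body(x).items():    # y ranges over supp(g_i)
            out[y] = K.add(out.get(y, K.zero), K.mul(fx, gy))
    # Keep only the support of the result.
    return {y: k for y, k in out.items() if k != K.zero}


# f = {x1^2, x2^3}; g_1 = {y^5}; g_2 = {y^1, z^4}  (hypothetical data, K = N)
f = {"x1": 2, "x2": 3}
g = {"x1": {"y": 5}, "x2": {"y": 1, "z": 4}}
result = big_union(NAT, f, lambda x: g[x])
print(result)  # {'y': 13, 'z': 12}
```

Here `y` receives $2 \cdot 5 + 3 \cdot 1 = 13$ and `z` receives $3 \cdot 4 = 12$. Because only the finitely many elements in the supports of `f` and of each `body(x)` are visited, the result has finite support, mirroring the remark above.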

The semantics of the other operations inherited from positive NRC is straightforward (it is essential that the equality test does not involve $K$-collections and therefore additional annotations). For example,

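$$\mathrm{flatten}\, \{\{a^p, b^r\}^u, \{b^s\}^v\} = \{a^{u \cdot p}, b^{u \cdot r + v \cdot s}\}$$

$$\{a^p, b^r\} \times \{c^u\} = \{(a, c)^{p \cdot u}, (b, c)^{r \cdot u}\}$$

where $R \times S \stackrel{\mathrm{def}}{=} \bigcup(x \in R) \bigcup(y \in S)\, \{(x, y)\}$.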
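Both identities can be checked on concrete annotations. The sketch below instantiates $K$ with the natural numbers and picks arbitrary values for $p, r, s, u, v$. These values, the `freeze` helper that makes an inner collection hashable, and the reading of flatten as $\bigcup(w \in W)\, w$ are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable


@dataclass(frozen=True)
class Semiring:
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
KColl = Dict[Hashable, Any]


def big_union(K: Semiring, f: KColl, body: Callable[[Hashable], KColl]) -> KColl:
    """result(y) = sum over x in supp(f) of f(x) * body(x)(y)."""
    out: KColl = {}
    for x, fx in f.items():
        for y, gy in body(x).items():
            out[y] = K.add(out.get(y, K.zero), K.mul(fx, gy))
    return {y: k for y, k in out.items() if k != K.zero}


def freeze(f: KColl) -> frozenset:
    """Hashable form of a K-collection, so it can be nested in another one."""
    return frozenset(f.items())


def flatten(K: Semiring, W: KColl) -> KColl:
    """Assumed reading of flatten: big-union over W of its (thawed) elements."""
    return big_union(K, W, lambda w: dict(w))


def product(K: Semiring, R: KColl, S: KColl) -> KColl:
    """R x S as two nested big-unions over the singleton {(x, y)}."""
    return big_union(K, R, lambda x: big_union(K, S, lambda y: {(x, y): K.one}))


p, r, s, u, v = 2, 3, 11, 5, 7  # arbitrary sample annotations

nested = {freeze({"a": p, "b": r}): u, freeze({"b": s}): v}
flat = flatten(NAT, nested)
print(sorted(flat.items()))   # [('a', 10), ('b', 92)]

prod = product(NAT, {"a": p, "b": r}, {"c": u})
print(sorted(prod.items()))   # [(('a', 'c'), 10), (('b', 'c'), 15)]
```

The printed values agree with the closed forms: $u \cdot p = 10$, $u \cdot r + v \cdot s = 15 + 77 = 92$, $p \cdot u = 10$, and $r \cdot u = 15$.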

We take the fact that the semantics of $\mathrm{NRC}_K$ is an instance of the general approach to collection languages promoted in, e.g., [19, 99, 12] as evidence for the robustness of our semantics. Appendix 6.7 gives a set of equational axioms for $\mathrm{NRC}_K$ that follow from the general approach just mentioned. These axioms also form a foundation for query optimization for $\mathrm{NRC}_K$ and K-UXQuery (e.g., see [120]).

As positive NRC strictly extends the positive relational algebra (RA+), the following sanity check is also in order.

Proposition 6.4. Let NRC(RA+) be the usual encoding of projection, selection, cartesian product and union in (positive) NRC. The semantics of NRC(RA+) on K-complex values representing K-relations coincides with the semantics of RA+ on K-relations given in Chapter 2 (Definition 2.2).
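A single instance of this coincidence can be tested for projection. Definition 2.2 is not reproduced in this section, so the sketch assumes the usual $K$-relation semantics in which projection sums the annotations of all tuples that project to the same value. The function names and sample relation are hypothetical, and the check illustrates the proposition without proving it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable


@dataclass(frozen=True)
class Semiring:
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
KColl = Dict[Hashable, Any]


def big_union(K: Semiring, f: KColl, body: Callable[[Hashable], KColl]) -> KColl:
    """result(y) = sum over x in supp(f) of f(x) * body(x)(y)."""
    out: KColl = {}
    for x, fx in f.items():
        for y, gy in body(x).items():
            out[y] = K.add(out.get(y, K.zero), K.mul(fx, gy))
    return {y: k for y, k in out.items() if k != K.zero}


def nrc_project1(K: Semiring, R: KColl) -> KColl:
    """NRC encoding: big-union over R of the singleton {pi_1(x)}."""
    return big_union(K, R, lambda x: {x[0]: K.one})


def ra_project1(K: Semiring, R: KColl) -> KColl:
    """Assumed RA+ semantics: sum the annotations of tuples with equal projection."""
    out: KColl = {}
    for t, k in R.items():
        out[t[0]] = K.add(out.get(t[0], K.zero), k)
    return {a: k for a, k in out.items() if k != K.zero}


# A hypothetical N-relation (bag) and its Boolean counterpart (set).
R_nat = {("a", "b"): 2, ("a", "c"): 3, ("d", "e"): 1}
R_bool = {t: True for t in R_nat}

print(nrc_project1(NAT, R_nat))    # {'a': 5, 'd': 1}
print(ra_project1(NAT, R_nat))     # {'a': 5, 'd': 1}
print(nrc_project1(BOOL, R_bool))  # {'a': True, 'd': True}
```

The factor `K.one` contributed by the singleton leaves each annotation unchanged under multiplication, so the NRC encoding reduces to the sum of annotations computed by the direct version.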

As another sanity check, observe that $\mathrm{NRC}_{\mathbb{N}}$ corresponds to the positive fragment of the Nested Bag Calculus [105].