
CAN XML DOCUMENTS BE TREATED AS COMPONENTS?

4 REFINING THE SOLUTION

Provided and required interfaces are specified within the schema. From the viewpoint of the schema writer, a provided interface is a set of global output variables whose values are computed during the processing of an XML-file. For each output variable, there can be any number of input variables, whose values are set by the consumers of the XML-file and used in the computation of the output variable. An XML schema can have several provided interfaces.

Similarly, a required interface is a set of global input variables whose values are functions. These values are set by the user of the XML-file. For each input variable, there can be any number of output variables, whose values are computed during the processing of the XML-file. These output variables are used as parameters of the required functions. An XML schema can have several required interfaces. The provided and required interfaces are depicted in Figure 3.

[Figure 3 labels: Proxy; f(x, y, z); g(u, v, w); XML-file tree representation; Consumer of the XML-file; required interface; provided interface; input; output]

Fig. 3. Provided and required interfaces of an XML-document. Input and output variables are shown with small boxes inside the interfaces.

The rationale behind this kind of interface concept is that the variable-based computation model becomes much simpler than the specification of a function in the context of XML. Since the rules contributing to the computation of a function can be scattered throughout the XML schema, it becomes unnatural to view this kind of computation strictly as a function. Nevertheless, in an abstract sense the output variables of a provided interface correspond to a function providing a value for the users of the XML-file, and the input variables correspond to the parameters of that function. In the case of a required interface the need for a variable-based interpretation is less obvious, but the additional flexibility it brings in the computation of the parameter values for required functions can sometimes be welcome. Symmetry reasons favor this choice, too.

The correspondence between an output variable in a provided interface and a function becomes very concrete in the implementation: a provided interface is eventually mapped to a Java interface which has a function for each output variable. The input variables are in turn mapped to the parameters of that function. This is the reason we group the input variables under a particular output variable. The same applies to required interfaces, the roles of output and input variables being exchanged.
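The mapping described above can be sketched as a small generator that turns a description of a provided interface into Java interface source text. This is only an illustration under stated assumptions: the classes Variable, OutputVariable and InterfaceMapper are hypothetical names, and the way type names are carried as strings is an assumption, not the form used by the actual tool.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical description of a variable declared in a schema interface.
final class Variable {
    final String name;
    final String javaType;

    Variable(String name, String javaType) {
        this.name = name;
        this.javaType = javaType;
    }
}

// An output variable together with the input variables grouped under it.
final class OutputVariable {
    final Variable self;
    final List<Variable> inputs;

    OutputVariable(String name, String javaType, List<Variable> inputs) {
        this.self = new Variable(name, javaType);
        this.inputs = inputs;
    }
}

final class InterfaceMapper {
    // Renders a provided interface as Java source: one method per output
    // variable, one parameter per input variable grouped under it.
    static String toJavaSource(String interfaceName, List<OutputVariable> outputs) {
        StringBuilder sb = new StringBuilder();
        sb.append("public interface ").append(interfaceName).append(" {\n");
        for (OutputVariable out : outputs) {
            String params = out.inputs.stream()
                    .map(in -> in.javaType + " " + in.name)
                    .collect(Collectors.joining(", "));
            sb.append("    ").append(out.self.javaType).append(' ')
              .append(out.self.name).append('(').append(params).append(");\n");
        }
        return sb.append("}\n").toString();
    }
}
```

Grouping the input variables under their output variable is what makes the parameter list of each generated method well defined. A required interface could be handled by the same routine with the roles of input and output variables exchanged.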

Let us illustrate the implementation of the proxy object with an example. In the case of the purchase order example, the proxy could look as follows:

public class PurchaseOrderProxy implements PurchaseOrderServices {

    PurchaseOrderSupport support;  // client implementing the required interface
    XMLrepresentation doc;         // internal representation of the XML-document

    public PurchaseOrderProxy() {...}

    public void register(PurchaseOrderSupport client) {
        support = client;
    }

    public void readXMLfile(File f) { ... }

    public void processOrders() {
        ...
        support.handleOrder(doc.getOutput("price"), ...);
        ...
    }

    public Integer totalValueForArea(Positive areaCode) {
        doc.setInput("areaCode", areaCode);
        ...
        return doc.getOutput("totalValueForArea");
    }
}

In this case the schema has defined the output variables processOrders and totalValueForArea. For the latter, there is an input variable areaCode, which becomes a parameter of the function. Initially, the client component (support) is registered with the proxy, and the XML-document is parsed into an internal representation (doc) using the appropriate functions of the proxy. In the body of the function totalValueForArea, the input variables are given initial values for the processing of the XML-document. Then the internal representation is traversed, and the computation rules are executed. These rules compute the value of the output variable totalValueForArea, calling the operations of support when the rules so determine. In the example, the provided operation processOrders calls one of the operations of the required interface, handleOrder, using the output variables of the required interface as parameters. Finally, the function totalValueForArea returns the final value of the output variable of the provided interface.
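The sequence described above (register the client, set the input variables, traverse, return the output variable) can be tried out with a self-contained toy version. Everything in it is an assumption made for illustration: OrderSupport, ToyDocument and ToyProxy are hypothetical names, the document is a flat list of orders rather than a tree, handleOrder takes a single price parameter, and the computation rules are hard-coded in Java instead of being read from a schema.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Required interface (hypothetical, single parameter): implemented by the client.
interface OrderSupport {
    void handleOrder(int price);
}

// Toy stand-in for the internal representation: a flat list of orders
// instead of a tree, with the computation rules hard-coded in Java.
final class ToyDocument {
    private final List<int[]> orders = new ArrayList<>(); // {areaCode, price}
    private final Map<String, Integer> inputs = new HashMap<>();
    private final Map<String, Integer> outputs = new HashMap<>();

    void addOrder(int areaCode, int price) {
        orders.add(new int[] {areaCode, price});
    }

    void setInput(String name, int value) {
        inputs.put(name, value);
    }

    int getOutput(String name) {
        return outputs.get(name);
    }

    // Visits the orders left to right. For each order, the "rule" sets the
    // output variable price, calls the required operation if a client is
    // given, and accumulates the total for the area in the input variable.
    void traverse(OrderSupport support) {
        int total = 0;
        for (int[] order : orders) {
            outputs.put("price", order[1]);
            if (support != null) {
                support.handleOrder(getOutput("price"));
            }
            if (inputs.containsKey("areaCode") && inputs.get("areaCode") == order[0]) {
                total += order[1];
            }
        }
        outputs.put("totalValueForArea", total);
    }
}

final class ToyProxy {
    private OrderSupport support;
    private final ToyDocument doc;

    ToyProxy(ToyDocument doc) {
        this.doc = doc;
    }

    void register(OrderSupport client) {
        support = client;
    }

    // Provided operation that calls the required operation during traversal.
    void processOrders() {
        doc.traverse(support);
    }

    // Provided operation: set the input variable, run the rules,
    // and return the final value of the output variable.
    int totalValueForArea(int areaCode) {
        doc.setInput("areaCode", areaCode);
        doc.traverse(null);
        return doc.getOutput("totalValueForArea");
    }
}
```

A client would register itself with `proxy.register(price -> ...)`, after which `processOrders()` invokes the client once per order, and `totalValueForArea(code)` returns the sum of the prices of the orders in that area.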

Let us next study how the computation rules are given in a schema in more detail. We will not discuss their concrete XML form here, but instead discuss the main principles they follow. A possible concrete form of the computation rules is presented in [Kos03]. This part requires some knowledge of XML terminology.

A computation rule is always given in a context. A context is a complex type (that is, a structural type) definition in an XML-schema; a computation rule is given as a subelement of the complex type that serves as its context. The left context of a computation rule consists of the attributes of the subelements preceding the computation rule in the complex type definition; the right context consists of the attributes of the subelements following the computation rule. In addition, the attributes of the complex type itself belong both to the left and to the right context.
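The definition of the left and right context can be made concrete with a short sketch. The classes Child and ComplexType are hypothetical, as is the convention of writing a subelement attribute as `element.attribute`; the sketch also assumes that rule elements themselves contribute no attributes to either context.

```java
import java.util.ArrayList;
import java.util.List;

// A child of a complex type definition: either a subelement declaration
// with attributes, or a computation rule (which contributes no attributes).
final class Child {
    final String name;
    final boolean isRule;
    final List<String> attributes;

    private Child(String name, boolean isRule, List<String> attributes) {
        this.name = name;
        this.isRule = isRule;
        this.attributes = attributes;
    }

    static Child element(String name, String... attributes) {
        return new Child(name, false, List.of(attributes));
    }

    static Child rule(String name) {
        return new Child(name, true, List.of());
    }
}

final class ComplexType {
    final List<String> attributes;
    final List<Child> children;

    ComplexType(List<String> attributes, List<Child> children) {
        this.attributes = attributes;
        this.children = children;
    }

    // Attributes of the type itself plus those of the subelements
    // preceding the rule at position ruleIndex.
    List<String> leftContext(int ruleIndex) {
        return context(0, ruleIndex);
    }

    // Attributes of the type itself plus those of the subelements
    // following the rule at position ruleIndex.
    List<String> rightContext(int ruleIndex) {
        return context(ruleIndex + 1, children.size());
    }

    private List<String> context(int from, int to) {
        List<String> result = new ArrayList<>(attributes);
        for (Child child : children.subList(from, to)) {
            if (!child.isRule) {
                for (String attribute : child.attributes) {
                    result.add(child.name + "." + attribute);
                }
            }
        }
        return result;
    }
}
```

For a type with the attribute `currency` and the children `shipTo(areaCode)`, a rule, and `item(price, quantity)`, the rule at index 1 has the left context `currency, shipTo.areaCode` and the right context `currency, item.price, item.quantity`.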

A computation rule takes the form of an assignment, given as the value of a particular attribute of a rule element. The left-hand side of a rule is an attribute belonging to the right context of the rule, or an output variable. The right-hand side is an expression consisting of attributes belonging to the left context of the rule, or input variables. As is customary in attribute grammars, we allow simple arithmetic operations on the right-hand side. If an input variable denotes a function, the conventional parameterized notation can be used as well; in that case the actual parameters are assigned to the corresponding output variables before the function is executed. A computation rule can also be conditional, executed only if a given boolean expression is true. The left-hand side of a computation rule can be omitted.
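The shape of a computation rule (optional left-hand side, optional condition, right-hand side expression) might be modelled as follows. This is a minimal sketch, not the concrete form of [Kos03]: the class Rule is hypothetical, all values are integers, and a single map stands in for both contexts, so the sketch does not check that the target belongs to the right context or that the expression reads only the left context.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

final class Rule {
    final String target;                                   // left-hand side, null if omitted
    final Predicate<Map<String, Integer>> condition;       // null if unconditional
    final ToIntFunction<Map<String, Integer>> expression;  // right-hand side

    Rule(String target,
         Predicate<Map<String, Integer>> condition,
         ToIntFunction<Map<String, Integer>> expression) {
        this.target = target;
        this.condition = condition;
        this.expression = expression;
    }

    // Executes the rule against the current attribute and variable values.
    void execute(Map<String, Integer> env) {
        if (condition != null && !condition.test(env)) {
            return; // conditional rule whose condition is false: nothing happens
        }
        int value = expression.applyAsInt(env);
        if (target != null) {
            env.put(target, value); // assignment to the left-hand side
        }
        // With an omitted left-hand side the expression is evaluated only
        // for its effect, for example a call to a required function.
    }
}
```

A rule such as `total = price * quantity`, executed only if `quantity > 0`, would then be written as `new Rule("total", e -> e.get("quantity") > 0, e -> e.get("price") * e.get("quantity"))`.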

Note that here we deviate from the classical L-attributed grammar by treating attributes simply as variables, instead of dividing them into inherited and synthesized single-valued data containers. However, we do retain the left-to-right direction of data flow characteristic of L-attributed grammars. In principle, we could give up this restriction and allow arbitrary data flow between the attributes in the context of a rule: we could simply state that the rules are executed in the left-to-right, top-down order, and leave it to the schema writer to ascertain that the sequence of assignments makes sense. However, the left-to-right data flow makes the computation safer in the sense that the attributes of an element are not used before the subtree rooted at that element is processed. Thus, the schema writer can imagine that some of the attributes in the root represent the "result" of processing the subtree. Note that it is still possible that some attribute does not always get a value, or that some attribute is assigned many times. Tool support should be provided to statically check the rules and warn about these cases.
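A static check of the kind suggested above could start from a simple count of assignments per attribute. The following is a rough sketch under assumptions: RuleTarget and AssignmentCheck are hypothetical names, the caller supplies the attributes that are expected to be computed by rules, and the analysis is conservative in that it ignores the conditions themselves and only distinguishes conditional from unconditional rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// The left-hand side of one rule, and whether the rule is conditional.
final class RuleTarget {
    final String attribute;
    final boolean conditional;

    RuleTarget(String attribute, boolean conditional) {
        this.attribute = attribute;
        this.conditional = conditional;
    }
}

final class AssignmentCheck {
    // Returns warnings, sorted by attribute name, for attributes that no
    // unconditional rule assigns or that more than one rule assigns.
    static List<String> warnings(List<String> computedAttributes, List<RuleTarget> rules) {
        // Per attribute: {number of all assignments, number of unconditional ones}.
        Map<String, int[]> counts = new TreeMap<>();
        for (String attribute : computedAttributes) {
            counts.put(attribute, new int[2]);
        }
        for (RuleTarget rule : rules) {
            int[] count = counts.get(rule.attribute);
            if (count == null) {
                continue; // target is outside the set being checked
            }
            count[0]++;
            if (!rule.conditional) {
                count[1]++;
            }
        }
        List<String> warnings = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : counts.entrySet()) {
            if (entry.getValue()[1] == 0) {
                warnings.add(entry.getKey() + ": may not always get a value");
            }
            if (entry.getValue()[0] > 1) {
                warnings.add(entry.getKey() + ": may be assigned more than once");
            }
        }
        return warnings;
    }
}
```

Because the conditions are not analysed, the check can report an attribute that is in fact always assigned, for instance by two rules with complementary conditions. It warns rather than rejects, which matches the advisory role described in the text.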