
The presented typing rules do not cover all of Xcerpt’s current features. As Xcerpt is under active development, features currently come and go. The features typed in the preceding sections are either generally useful when typing query languages for the Web or are fundamentally new compared to the way typing is done in, e.g., programming languages. Features of Xcerpt that are currently available or under development but have not been covered here are:

1. conjunctive and disjunctive queries
2. resource specifiers
3. conditions (a.k.a. where-clauses)
4. optional variable binding modifiers
5. query negation modifiers
6. function applications
7. rule chaining

Handling of conjunction and disjunction in queries (1) can mostly be seen as separate typing of the individual components, sharing a common environment for the variables: the types of a variable occurring in different conjuncts must have a non-empty intersection, and a variable occurring in different components of a disjunction is typed with the union of its types in the disjunctive components.

Resource specifications (2) occur at the root level of a query. As such, they have no logical impact on the contained query or construct terms. However, they provide a prominent place to indicate information about the type of the data intended to be queried, e.g. a resource indicating where to find a schema or type declaration.

From a prescriptive view, the modifiers optional (4) and negation (5) can be treated as regular terms, as their occurrence makes no sense if they are not applicable at the given position. At the same time, the content they modify should be treated as optional content in the given data model, as it makes no sense to negate obligatory content or to treat it as optional. In the ordered automaton model, optionality can be achieved by optionally skipping a (non-ε) horizontal transition step; for unordered content models, a corresponding optionality has to be introduced in the expression constructed from the Xcerpt term sequence.

Conditions (3) do not use the Xcerpt term syntax; they are more in the spirit of equations and inequations over variables and function applications. As such, they can be handled by traditional typing (e.g. in the spirit of typing for functional programming languages) adapted to R2G2 type declarations. The necessary adaptation is to use type intersection instead of type equality when checking variable type conformance in the environment.

Rule chaining (7) has not been considered in this thesis. Among current Web query languages it is a very special feature, applicable to Xcerpt but not to most of the common Web query languages. An important property to consider for a type system treating chaining in Xcerpt is subject reduction: does a program with chaining remain well typed, or maybe even typeable, at every stage of evaluation of the chaining? In functional and imperative programming languages, this usually means after application of β-reduction. For logic languages or deductive rule languages like Xcerpt, this means after the application of substitutions.
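The treatment of conjunction and disjunction described above can be illustrated with a minimal sketch. It assumes a strong simplification that is not part of the thesis: a type is modelled as a finite set of element labels, so that type intersection and union become plain set operations. The function names `type_conjunction` and `type_disjunction` are hypothetical, and the handling of a variable that occurs in only one disjunct (it keeps that disjunct's type) is likewise an assumption.

```python
# Simplified model (an assumption for illustration): a type is a frozenset of
# element labels, and a typing environment maps variable names to such types.

def type_conjunction(env_a, env_b):
    """Merge the environments of two conjuncts.

    A variable shared by both conjuncts is typed with the intersection of
    its two types; an empty intersection is reported as a type error.
    """
    merged = dict(env_a)
    for var, ty in env_b.items():
        if var in merged:
            common = merged[var] & ty
            if not common:
                raise TypeError(f"variable {var}: empty type intersection")
            merged[var] = common
        else:
            merged[var] = ty
    return merged


def type_disjunction(env_a, env_b):
    """Merge the environments of two disjuncts.

    A variable is typed with the union of its types in both disjuncts.
    A variable occurring in only one disjunct keeps that type (assumption).
    """
    merged = dict(env_a)
    for var, ty in env_b.items():
        merged[var] = merged.get(var, frozenset()) | ty
    return merged


left = {"X": frozenset({"table", "tr"}), "Y": frozenset({"td"})}
right = {"X": frozenset({"table", "div"})}

conj = type_conjunction(left, right)   # X is narrowed to {"table"}
disj = type_disjunction(left, right)   # X is widened to {"table", "tr", "div"}
```

In a real R2G2 setting, types are regular tree languages rather than label sets, so intersection, union, and the emptiness test would be computed on the corresponding automata; only the shape of the environment merge carries over.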

9 Outlook

Many promising continuations of the practical work in this thesis have emerged. Some of them are summarized below.

9.1 Type Based Querying: an Extension to Simulation Unification

The traditional use of types in programming and query languages is to enhance security, performance, documentation, or the verbosity of error messages. For many languages it holds that a well-typed program is equivalent to the program obtained by removing all type information and running it in a sibling language without type support. In traditional settings, types arguably do not alter the semantics of (well-typed) programs; they merely help to find ill-typed programs and can be considered passive at run time.

For Web querying in general, and for Xcerpt in particular, types on the side of selection constructs can be used to ensure that a query is reasonable for data of the given type, where an unreasonable query is one that never matches any data of that type. A type hence represents a set that must have a non-empty intersection with the set of data the given selection construct matches. As an example, the rather vague Xcerpt query term in figure 55 is expected to query HTML table elements in its given context. As the query may query arbitrary data, it is obviously well typed. However, as nothing restricts the variable TABLE from matching arbitrary content, the query will most likely not fulfill the author’s intent. The type information is very restrictive about what the query actually queries (and it can be assumed that it is not reasonable for the given query under traditional passive type semantics). It is possible that, in the given situation, the programmer really expected to query HTML tables and assumed that the given query would more or less fulfill this requirement.

CONSTRUCT
  result{ all var TABLE }
FROM
  in document("http://example.com")
    desc var TABLE^^Table
END

Code Example 55: An Xcerpt query querying arbitrary data in an HTML document. The type annotation, however, indicates that the author had something else in mind.

A proposed extension to query semantics in general, and to simulation unification in the case of Xcerpt, is to use the type information of well-typed queries to restrict the query, i.e. to make it an active member of the querying process. As type information is usually given and mostly designed in an unambiguous way, sometimes involving complex rules with many details, it can be a powerful tool for the query author to specify exactly the elements he is interested in. In the example in figure 55, the type would hence restrict the rather generic query pattern to match only HTML tables, even if the untyped query would additionally match other elements (in the given case very likely, e.g. the content of the HTML table).
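The difference between passive and active type semantics can be sketched with a small model. This is not Xcerpt's simulation unification. It makes three assumptions for illustration: a document is a tree of labelled terms, `desc var TABLE` binds the variable to the term itself and to every subterm, and conformance to the type `Table` is approximated by the hypothetical predicate `is_table`, which only inspects the root label.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A labelled tree node standing in for a data term (simplified)."""
    label: str
    children: list = field(default_factory=list)


def descendants(term):
    """Yield the term and all of its subterms, depth-first (models `desc`)."""
    yield term
    for child in term.children:
        yield from descendants(child)


def is_table(term):
    """Hypothetical stand-in for 'term conforms to the type Table'."""
    return term.label == "table"


def query_passive(doc):
    """Passive types: the annotation is ignored at run time, so the
    variable is bound to every subterm of the document."""
    return list(descendants(doc))


def query_active(doc, conforms):
    """Active types: the annotation filters the bindings, keeping only
    subterms that conform to the annotated type."""
    return [t for t in descendants(doc) if conforms(t)]


doc = Term("html", [
    Term("body", [
        Term("p"),
        Term("table", [Term("tr", [Term("td")])]),
    ]),
])

passive_labels = [t.label for t in query_passive(doc)]
active_labels = [t.label for t in query_active(doc, is_table)]
```

Under these assumptions, the passive query binds the variable six times, including to the rows and cells inside the table, whereas the active query yields only the table element. A faithful implementation would check conformance against the full type declaration rather than the root label alone.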