
2.6 Encoding Standards for Mathematical Content

2.6.1 OpenMath

One of the most used standards to describe mathematical formulae is OpenMath [184].

Its main aim is to provide the mechanisms needed to describe mathematical formulae and to capture, within that description, the semantic meaning of the mathematical terms. This semantic information ensures that a document containing mathematical formulae can be correctly evaluated and understood independently of the software package that produced it.

It is often the case that a mathematical document produced with one software package has to be loaded and evaluated by another package, either of the same type or of a different one. For mathematical formulae, semantic mechanisms must allow different software packages to determine the meaning of the content they parse. For instance, in the formula that expresses the area of a circle, A = PI ∗ R^2, the meaning of the terms should be self-explanatory to a trained human eye. A software system, however, cannot assume the meaning of particular terms. It needs additional information to determine that “R” represents a variable while “PI” stands for the well-known mathematical constant π.

The description model that OpenMath proposes is not tied to a particular encoding format. The two encodings that OpenMath directly supports are an XML-based language and a binary format. Either of the two may be chosen, depending on the scenario in which it is to be used. The binary format is more compact and potentially more suitable for machine-to-machine communication over channels that carry raw binary data, while the XML encoding may be more suitable for Web service related technologies.

The fundamental concepts that OpenMath is based on are Content Dictionaries (CDs), OpenMath symbols and the concept of an OpenMath object. Much as an everyday dictionary states the meaning of a word, OpenMath CDs are collections of OpenMath symbols that have a particular meaning in a specific context. A software package understands a mathematical formula expressed using OpenMath only if it implements the corresponding Phrasebooks, which allow the system to transform the formula into the representation that it uses internally.

In this case, the software package supports or implements the corresponding CDs.

Mathematical formulae can be encoded as compound OpenMath objects by combining basic OpenMath constructor objects and OpenMath symbols defined in OpenMath CDs.

It is common practice to group related OpenMath symbols in CDs that cover a particular mathematical area. Grouping multiple symbols in CDs is a convenient way to organize OpenMath symbols. An OpenMath symbol identifies a certain concept in that particular area of mathematics and ensures that every interpreter assigns it the same semantic meaning, namely the one given by the associated OpenMath symbol definition.

There are two main types of objects in OpenMath. The first category comprises the basic OpenMath objects:

• Integer - any element that is part of the mathematical set of integers

• IEEE - any floating point number expressed using double precision format

• Character String - any character string

• ByteArray - any sequence of bytes

• Symbol - any symbol element that is part of a CD

• Variable - represents a placeholder; it has to have a unique name

For an OpenMath symbol to be correctly specified, two mandatory attributes of the symbol have to be set: the cd attribute, which specifies the OpenMath CD to which the symbol belongs, and the name attribute, a meaningful name that is unique within that CD.

Compound OpenMath objects can be constructed by combining existing OpenMath objects. The constructive approach has to comply with the following composition mechanisms [184]:

• foreign(A) - is an OpenMath object if A is not an OpenMath object. This constructor function allows creating OpenMath objects from non-OpenMath objects, which may be useful if arbitrary data has to be encapsulated in an OpenMath compound object;

• application(A1, . . . ,An) - where A1,...,An represent OpenMath objects; it specifies an application, much like applying a regular mathematical function to several arguments. The first object, A1, is referred to as the head. To encode the application of a mathematical function, the head is an OpenMath object, such as an OpenMath symbol, that specifies the function, and the remaining objects represent the arguments to which it is applied;

• attribution(A,(S1,A1), . . . ,(Sn, An)) - where A,A1,...,An represent OpenMath objects and S1,...,Sn represent OpenMath symbols; this construction may be used to add attributes or characteristics that are part of the definition of the object A;

• binding(B,v1, . . . ,vn,C) - where B and C represent OpenMath objects and v1,...vn represent OpenMath variables; it may be used to express functions or logical statements;

• error(S,A1, . . . ,An) - where S represents an OpenMath symbol and A1,...,An represent OpenMath objects; it describes an OpenMath error object
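Because the object model is independent of any encoding, part of it can be sketched directly as data structures. The Python sketch below is only an illustration under stated assumptions: the class names (OMInteger, OMVariable, OMSymbol, OMApplication, OMBinding) and the show helper are hypothetical and are not defined by the OpenMath standard, and only a subset of the constructors listed above is modelled. The symbols arith1.plus and fns1.lambda are taken from the standard OpenMath CDs and were chosen for this example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class OMInteger:
    value: int


@dataclass(frozen=True)
class OMVariable:
    name: str


@dataclass(frozen=True)
class OMSymbol:
    cd: str    # content dictionary the symbol belongs to
    name: str  # name, unique within that CD


@dataclass(frozen=True)
class OMApplication:
    head: "OMObject"               # the object being applied
    args: Tuple["OMObject", ...]   # the objects it is applied to


@dataclass(frozen=True)
class OMBinding:
    binder: "OMObject"
    variables: Tuple[OMVariable, ...]
    body: "OMObject"


OMObject = Union[OMInteger, OMVariable, OMSymbol, OMApplication, OMBinding]


def show(obj: OMObject) -> str:
    """Render an object in the abstract notation used in the list above."""
    if isinstance(obj, OMInteger):
        return str(obj.value)
    if isinstance(obj, OMVariable):
        return obj.name
    if isinstance(obj, OMSymbol):
        return f"{obj.cd}.{obj.name}"
    if isinstance(obj, OMApplication):
        parts = [show(obj.head)] + [show(a) for a in obj.args]
        return "application(" + ", ".join(parts) + ")"
    if isinstance(obj, OMBinding):
        parts = [show(obj.binder)] + [show(v) for v in obj.variables] + [show(obj.body)]
        return "binding(" + ", ".join(parts) + ")"
    raise TypeError(f"not an OpenMath object: {obj!r}")


# The function x -> x + 1, expressed as a binding over an application.
x = OMVariable("x")
plus = OMSymbol("arith1", "plus")
increment = OMBinding(
    OMSymbol("fns1", "lambda"), (x,), OMApplication(plus, (x, OMInteger(1)))
)
print(show(increment))
```

The binding wraps an application, which shows how the constructors nest: compound objects are themselves OpenMath objects and can appear wherever an object is expected.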

Based on the mechanisms described above, computer algebra application specialists have created a strong foundation that can be used to describe complicated mathematical formulae covering the most common mathematical areas. Due to the popularity of XML languages and the available support for parsing XML documents, most OpenMath encodings are done in XML format. The OpenMath standard endorses the XML format and describes a set of language elements that correspond to the encoding models described above. To encode basic OpenMath objects, the following XML tags have to be used:

• Integer: <OMI>...</OMI>

• IEEE: <OMF>...</OMF>

• Character String: <OMSTR>...</OMSTR>

• ByteArray: <OMB>...</OMB>

• Symbol: <OMS cd="cd name" name="oms name"></OMS>

• Variable: <OMV name="variable name"></OMV>

The corresponding XML tags that should be used to construct compound objects are:

• foreign: <OMFOREIGN>...</OMFOREIGN>

• application: <OMA>...</OMA>

• attribution: <OMATTR>...</OMATTR>

• binding: <OMBIND>... <OMBVAR>...</OMBVAR>...</OMBIND>

• error: <OME>...</OME>
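As a sketch of how these tags combine, the circle-area formula A = PI ∗ R^2 discussed earlier in this section could be assembled programmatically. The example uses Python's standard xml.etree.ElementTree module. The helper functions (symbol, variable, integer, application) are hypothetical names introduced for illustration. The choice of the symbols relation1.eq, arith1.times, arith1.power and nums1.pi from the standard OpenMath CDs is an assumption of this example and does not come from the text above.

```python
import xml.etree.ElementTree as ET


def symbol(cd: str, name: str) -> ET.Element:
    """<OMS>: a symbol identified by its CD and its name within that CD."""
    return ET.Element("OMS", {"cd": cd, "name": name})


def variable(name: str) -> ET.Element:
    """<OMV>: a named variable."""
    return ET.Element("OMV", {"name": name})


def integer(value: int) -> ET.Element:
    """<OMI>: an integer literal."""
    elem = ET.Element("OMI")
    elem.text = str(value)
    return elem


def application(head: ET.Element, *args: ET.Element) -> ET.Element:
    """<OMA>: the head followed by the objects it is applied to."""
    oma = ET.Element("OMA")
    oma.append(head)
    oma.extend(args)
    return oma


# A = pi * R^2, built from the inside out.
r_squared = application(symbol("arith1", "power"), variable("R"), integer(2))
area = application(symbol("arith1", "times"), symbol("nums1", "pi"), r_squared)
equation = application(symbol("relation1", "eq"), variable("A"), area)

# The whole object is wrapped in a single <OMOBJ> element.
root = ET.Element("OMOBJ")
root.append(equation)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In the resulting encoding, “R” appears as an OMV element and the constant π as an OMS element that points into a CD, which is exactly the distinction a software system cannot infer from the plain formula.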

As an example, to encode in OpenMath the formula sin(0), where “sin” represents the trigonometric sine function, the following XML can be used:

<OMOBJ>

<OMA>

<OMS cd="transc1" name="sin"/>

<OMI>0</OMI>

</OMA>

</OMOBJ>
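To make the role of a Phrasebook more concrete, the sketch below parses the sin(0) object and evaluates it with native Python functions. It is a minimal illustration, not an actual Phrasebook implementation: the PHRASEBOOK table and the evaluate function are hypothetical, and only the OMOBJ, OMA, OMS and OMI elements are handled. A symbol that is missing from the table is rejected, which mirrors the statement that a package understands a formula only if it supports the corresponding CDs.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical phrasebook: maps (cd, name) pairs to native Python callables.
PHRASEBOOK = {
    ("transc1", "sin"): math.sin,
    ("transc1", "cos"): math.cos,
    ("arith1", "plus"): lambda *args: sum(args),
}


def evaluate(elem: ET.Element):
    """Translate an OpenMath XML element into a native Python value."""
    if elem.tag == "OMOBJ":
        return evaluate(elem[0])
    if elem.tag == "OMI":
        return int(elem.text.strip())
    if elem.tag == "OMS":
        key = (elem.get("cd"), elem.get("name"))
        if key not in PHRASEBOOK:
            raise ValueError(f"unsupported symbol {key[0]}.{key[1]}")
        return PHRASEBOOK[key]
    if elem.tag == "OMA":
        # The first child is the head; the rest are its arguments.
        head, *args = [evaluate(child) for child in elem]
        return head(*args)
    raise ValueError(f"unsupported element <{elem.tag}>")


SIN_ZERO = """
<OMOBJ>
  <OMA>
    <OMS cd="transc1" name="sin"/>
    <OMI>0</OMI>
  </OMA>
</OMOBJ>
"""

result = evaluate(ET.fromstring(SIN_ZERO))
print(result)
```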

Remark. OpenMath objects are stored in separate XML documents. The corresponding XML document may contain a basic object or a compound object constructed using the mechanisms described above. For the document to be a well-formed description of an object, its content has to be enclosed between the start and end tags <OMOBJ> and </OMOBJ>, respectively. These tags can only appear once in the same file.

OpenMath References

XML representations of complex OpenMath objects can become large. It is also possible that some of the sub-objects of which an OpenMath object is composed appear more than once in the object’s description. To shorten and simplify the representation of an OpenMath object, the OpenMath standard provides a reference mechanism that allows a sub-object to be replaced with a reference to its definition. In practice, a reference replaces an in-line definition of the object and makes the encoding more compact and easier to read.

We illustrate this concept with an excerpt taken from the OpenMath standard definition [184]. The following two encodings are semantically equivalent even though they are written differently. The first representation describes the mathematical formula “1 + 1” by combining two OpenMath integers in an OpenMath application object.

<OMOBJ version="2.0">

<OMA>

<OMS cd="arith1" name="plus"/>

<OMI>1</OMI>

<OMI>1</OMI>

</OMA>

</OMOBJ>

The second encoding uses an OpenMath reference to replace the second occurrence of the integer value “1” with a reference that points to the existing definition of that object. Within the same document, the value of each id attribute must be distinct from all other identifier values. This unique value can then be used as the target of a reference, encoded as the <OMR> XML element shown below.

<OMOBJ version="2.0">

<OMA>

<OMS cd="arith1" name="plus"/>

<OMI id="bar">1</OMI>

<OMR href="#bar"/>

</OMA>

</OMOBJ>

The reference mechanism is similar to the anchor mechanism provided by HTML. Valid references may point to objects described within the same document, to objects described in a document stored at a location relative to the file in which the reference appears, or to objects at an absolute location.

The OpenMath standard only requires that the value of the href attribute is a valid URI.

In the context of XML-encoded OpenMath objects, reference resolution is the process of identifying the OpenMath objects that are referenced by a compound OpenMath object’s definition and, if a referenced object is not hosted on the same machine, retrieving it to make it available locally.
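The same-document case of reference resolution can be sketched in a few lines of Python. The function name resolve_local_references is hypothetical, and the sketch makes simplifying assumptions: only references of the form href="#id" are expanded, references to other documents are left untouched rather than retrieved, and references nested inside a copied target are not expanded again.

```python
import copy
import xml.etree.ElementTree as ET


def resolve_local_references(root: ET.Element) -> ET.Element:
    """Return a copy of root in which each same-document <OMR> is expanded."""
    root = copy.deepcopy(root)
    # Index every element that carries an id attribute.
    by_id = {e.get("id"): e for e in root.iter() if e.get("id") is not None}

    def expand(parent: ET.Element) -> None:
        for index, child in enumerate(list(parent)):
            href = child.get("href", "")
            if child.tag == "OMR" and href.startswith("#"):
                target = by_id.get(href[1:])
                if target is None:
                    raise KeyError(f"unresolved reference {href}")
                replacement = copy.deepcopy(target)
                replacement.attrib.pop("id", None)  # keep identifiers unique
                replacement.tail = child.tail
                parent[index] = replacement
            else:
                expand(child)

    expand(root)
    return root


WITH_REFERENCE = """
<OMOBJ version="2.0">
  <OMA>
    <OMS cd="arith1" name="plus"/>
    <OMI id="bar">1</OMI>
    <OMR href="#bar"/>
  </OMA>
</OMOBJ>
"""

resolved = resolve_local_references(ET.fromstring(WITH_REFERENCE))
print(ET.tostring(resolved, encoding="unicode"))
```

Applied to the second encoding of “1 + 1”, the sketch yields an object with two in-line OMI elements, that is, the structure of the first encoding. Resolving relative or absolute URIs would additionally require fetching and parsing the referenced document, which is outside the scope of this sketch.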

OMDoc

The foundation of OpenMath semantic annotation is the concept of the content dictionary.

Kohlhase [118] has investigated the suitability of using OpenMath for automatic proving.

He concluded that the lack of semantics associated with OpenMath CDs makes the CDs machine readable but not machine understandable. OpenMath CDs were not conceived in a way that makes them suitable for computer to computer communication.

As a result, Kohlhase [119] has developed an extension of the OpenMath CD mechanism that allows semantic context to be clarified and added to CDs. The extensions define XML tags that can accommodate several types of information, such as the semantic meaning of terms used in explanatory text elements, a theory-based classification of the symbols defined in CDs, and relation operators between theories. While these additions may improve the manipulation of mathematical formulae and automatic reasoning, they were not widely adopted by computer algebra software packages.