3.1.1 Dealing with embedded media
The ReqIF file format allows external media to be embedded into rich-text content (see Section 2.1.3).
In order to maximize compatibility across different RM-tools, ReqIF contains different layers
of content for each media-artifact. On the lowest level each such artifact is represented by an
XHTML-formatted String which is expected to be digestible by all conceivable RM-tools (the
embedded pictures in Figure 10 on page 47 are displayed this way). The next level is always a
PNG-image and the last, optional level is a file of arbitrary format. While rendering, RM-tools are
required to start with the highest available layer, but may fall back onto the preceding one if they
fail to handle it. The entire process is described in more detail in [Obj13, clause 10.8.20, point 2].
²³ In the computer-science denotation of the term, i.e. some action is being performed on each node of the tree.
²⁴ Read: “write in correct order”
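The fallback rule described above can be illustrated with a minimal sketch. It is not taken from the tool or from the ReqIF specification: the `Layer` class, the labels "xhtml", "png" and "file", and the `can_render` callback are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Layer:
    """One representation of an embedded media artifact (hypothetical model)."""
    kind: str      # "xhtml", "png" or "file" (labels assumed for this sketch)
    payload: str   # markup or a file reference


def pick_layer(layers: Sequence[Layer],
               can_render: Callable[[Layer], bool]) -> Optional[Layer]:
    """Return the highest layer that the RM-tool is able to render.

    `layers` is ordered from the lowest level (XHTML) to the highest one
    (file of arbitrary format).
    """
    for layer in reversed(layers):   # start with the highest available layer
        if can_render(layer):
            return layer             # the first layer that works is used
    return None                      # not even the lowest layer was usable


artifact = [
    Layer("xhtml", "<div>Figure placeholder</div>"),
    Layer("png", "media/figure.png"),
    Layer("file", "media/figure.vsd"),
]

# A tool that understands XHTML and PNG, but not the arbitrary third layer.
chosen = pick_layer(artifact, lambda layer: layer.kind in {"xhtml", "png"})
print(chosen.kind)
```

Here the tool cannot handle the third layer and therefore falls back onto the PNG layer.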
Currently, the tool only writes the first two layers. This implies that all embedded media which
is not already in PNG format needs to receive special treatment. For this purpose the media
subfolder contains two CSV-files:
images.csv deals with all graphical objects which can be extracted as a separate file from the
DOC input. Those raw (unconverted) files are saved alongside the CSV and are usually of
Windows Metafile (WMF) or Windows Enhanced Metafile (EMF) format.
Each line in images.csv represents a reference to an individual object and stores the
target dimensions (width and height) along with it. By feeding this file to a dedicated macro
designed for Microsoft Visio, all those objects can be batch-converted into PNG.
The tool can also be reconfigured to use different conversion approaches which do not
rely on proprietary software from the Microsoft Office family. However, those alternatives
(namely: ImageMagick’s convert on Windows with GDI-support and libwmf on Unix, both
of which are open-source) do not provide comparable quality.
shapes.csv deals with shapes in the so-called “Office Drawing Binary Format” as specified in
[Mic14b]. These are commonly created through the drawing tools natively provided by
Microsoft Word. Such shapes cannot exist in isolation (i.e. they cannot be extracted and
legally saved into a separate file) [Mic11]. Thus, shapes.csv only states offsets (similar to
the startOffset used for backward tracing in Section 2.2) of those objects in the original
input DOC together with the filename where the resulting PNG is expected to go. The actual
extraction is performed by another macro, which requires both the original DOC-file
and shapes.csv as its input. Although this macro runs inside Microsoft Word, it needs
Microsoft Visio to be present as well.
There is no viable alternative²⁵ for the handling of such content, except for one special
kind of drawing (see Section 3.3.3). Formalized directly by the tool, these drawings use a very
limited subset of the drawing format discussed above and are therefore exempted from the
file shapes.csv. Hence, processing shapes.csv is the only step in which the tool must rely on
external proprietary software.
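As an illustration of the images.csv workflow with the open-source fallback mentioned above, the following sketch turns CSV rows into ImageMagick command lines. The column layout (source file, width, height, comma-separated, no header row) is an assumption made for this example; the actual layout of images.csv is not specified here. The function name `build_convert_commands` is likewise hypothetical.

```python
import csv
import io
from pathlib import PurePath
from typing import List


def build_convert_commands(csv_text: str) -> List[List[str]]:
    """Build one ImageMagick `convert` call per row of the CSV text.

    Assumed row layout (hypothetical): source file, width, height.
    """
    commands = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # tolerate blank lines
        source = row[0].strip()
        width, height = int(row[1]), int(row[2])
        target = str(PurePath(source).with_suffix(".png"))
        # "WxH!" asks ImageMagick to resize to exactly these dimensions.
        commands.append(
            ["convert", source, "-resize", f"{width}x{height}!", target]
        )
    return commands


sample = "image1.wmf,640,480\nimage2.emf,300,200\n"
commands = build_convert_commands(sample)
for command in commands:
    print(" ".join(command))
```

The commands are only assembled, not executed. Each list could be passed to `subprocess.run(command, check=True)` on a machine where ImageMagick is installed with WMF/EMF support.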
In the example of Listing 5 both CSV-files are explicitly referenced (Lines 44–45). If the input
file happens to contain only one kind of media or no media at all, the non-applicable lines are
omitted and the corresponding CSV-files will not be present either.
As stated in Section 2.3.2, the input documents contain a fair amount of OLE-data. Using the
approach outlined above, these data will always be flattened to WMF or EMF²⁶. By utilizing
ReqIF’s third content layer, which can hold arbitrary data, one could also link these original
OLE-BLOBs to the ReqIF output file. However, only a few RM-tools can actually take advantage of
this option. Besides, the focus of the tool was primarily on providing a decent input to
implementers of a system, rather than to authors of a specification willing to alter the embedded
graphics (which is why one would embed the original OLE-data in the first place). Lastly, this
approach will not work for the non-independent data referenced by shapes.csv unless it is
embedded into an artificial wrapper document²⁷, which is quite an onerous task.
²⁵ Although LibreOffice/OpenOffice and their headless variant unoconv claim to support such drawings, they,
in fact, fail miserably with those embedded in the Subset-026.
²⁶ In fact, this is performed by Microsoft Word automatically in order to display something meaningful in case the
Extracting the original MTEF-representation of equations is also unlikely to be worthwhile since
edits can only be performed using Microsoft’s own Equation editor for as long as those objects
are embedded in an Office document. Alternatively, Design Science’s MathType software, which
the Microsoft editor derives from, may still be used even after they have been extracted.
However, that is a rather exotic piece of proprietary software without any significant market
penetration. Alas, a truly useful formalization of such equations as TeX- or MathML-markup is hard to
obtain because only a limited open-source implementation of the MTEF file format is available
[SP12] and ReqIF lacks support for any of the aforementioned markups. Fortunately, this
situation has somewhat improved with the XML-based successor of the DOC file format (*.docx)
where equations are stored in the openly documented Office MathML (OMML) format, a
competitor to MathML [Mur06].
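To illustrate why the *.docx situation is more accessible, the following sketch lists the OMML equations of a document using only the Python standard library. It is not part of the tool discussed here. The function name `list_equations` is hypothetical, and the in-memory archive is a stand-in containing only the main document part, not a complete, valid *.docx file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# Namespace of Office MathML as used inside WordprocessingML documents.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def list_equations(docx_bytes: bytes) -> List[str]:
    """Return the plain text of every m:oMath element in the main part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return ["".join(node.itertext()) for node in root.iter(f"{{{M_NS}}}oMath")]


# Hypothetical stand-in for a real *.docx: it holds only word/document.xml.
DOCUMENT_XML = (
    "<w:document"
    ' xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"'
    f' xmlns:m="{M_NS}">'
    "<w:body><w:p><m:oMath><m:r><m:t>v = s / t</m:t></m:r></m:oMath></w:p></w:body>"
    "</w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", DOCUMENT_XML)

equations = list_equations(buffer.getvalue())
print(equations)
```

Because the equations are ordinary XML elements inside a ZIP archive, no proprietary editor is needed to locate them. Converting them to TeX or MathML would still require a separate transformation step.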
In document
On the domain-specific formalization of requirement specifications - a case study of ETCS
(Page 45-47)