Basic lenses are a natural class of well-behaved bidirectional transformations that provide the semantic foundations of bidirectional programming languages. Their design emphasizes both robustness and ease of use, guaranteeing totality and strong well-behavedness conditions, formulated as intuitive round-tripping laws. Many familiar transformations can be interpreted as basic lenses, including the identity and constant functions, composition, iteration, conditionals, product, and many others. In the domain of strings, these constructs can be used to elegantly describe updatable views over many formats of practical interest.
Chapter 4
Quotient Lenses
“Good men must not obey the laws too well.”
—Ralph Waldo Emerson
The story described in the previous chapter is an appealing one... but unfortunately, it is not perfectly true! In the real world, most bidirectional transformations do not obey the basic lens laws. Or rather, they obey them in spirit, but not to the letter, i.e., only “modulo unimportant details.” The nature of these details varies from one application to another: examples include whitespace, artifacts of representing richer structures (relations, trees, and graphs) as text, escaping of atomic data (XML PCDATA, vCard, and BibTeX values), ordering of fields in record-structured data (BibTeX fields, XML attributes), wrapping of long lines in ASCII formats (RIS bibliographies, UniProtKB genomic databases), and duplicated information (aggregate values, tables of contents).
To illustrate, consider the composers lens again. The information about each composer could be larger than fits comfortably on a single line in the ASCII view. We might then want to relax the type of the view to allow lines to be broken (optionally) using a newline followed by at least one space, so that
    Jean Sibelius, 1865-1957
and
    Jean Sibelius,
      1865-1957
would be accepted as equivalent, alternate presentations of the same view. But now we have a problem: the PG law requires that get (put v s) = v, which means that the lens must map these views, which we intuitively regard as equivalent, to different XML sources; i.e., the presence or absence of linebreaks in the view must be reflected in the source. We could build a lens that does this, e.g., by storing the line break inside the PCDATA string...
    <composer>
      <name>Jean Sibelius
      </name>
      <lived>1865-1957</lived>
      <nationality>Finnish</nationality>
    </composer>
...but this “solution” isn’t very attractive. For one thing, it places an unnatural demand on the XML representation; indeed, possibly an unsatisfiable demand if the application using the source requires that the PCDATA not contain newlines. For another, writing the lens that handles and propagates linebreaks involves extra work. Moreover, this warping of the XML format and complicated lens programming is all for the purpose of maintaining information that we don’t actually care about! A much better alternative is to relax the lens laws to accommodate this transformation. Finding a way to do this gracefully is the goal of this chapter.
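As a rough illustration of the equivalence we have in mind, the following Python sketch treats two views as equivalent when they differ only in optional line breaks. The function names, and the choice of a single space as the canonical form of a break, are assumptions made for this sketch; they are not part of the composers lens itself.

```python
import re


def canonize_view(view: str) -> str:
    """Hypothetical helper: replace each optional line break (a newline
    followed by one or more spaces) with a single space."""
    return re.sub(r"\n +", " ", view)


def equivalent_views(v1: str, v2: str) -> bool:
    """Two views are equivalent if they agree after canonization."""
    return canonize_view(v1) == canonize_view(v2)


one_line = "Jean Sibelius, 1865-1957"
wrapped = "Jean Sibelius,\n  1865-1957"

print(equivalent_views(one_line, wrapped))  # True
print(one_line == wrapped)                  # False
```

The two strings are distinct as raw text but fall into the same equivalence class, which is exactly the situation the strict lens laws cannot accommodate.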
Several ways of treating inessential data have been explored in previous work.
1. We can be informal, stating the basic lens laws in their strict form and explaining that they “essentially hold” for our program, perhaps providing evidence in support of this claim by describing how unimportant details are processed algorithmically. In many applications, being informal is a perfectly acceptable strategy, and several bidirectional languages adopt it. For example, the biXid language, which describes XML to XML conversions using pairs of intertwined tree grammars, provides no explicit guarantees about round-trip behavior, but its designers clearly intend it to be “morally bijective” (Kawanaka and Hosoya, 2006). The PADS system is similar (Fisher and Gruber, 2005).
2. We can weaken the laws. The designers of the X language have argued that PG should be replaced with a weaker “round-trip and a half” version (Hu et al., 2004):
\[
\frac{s' = \mathit{put}\ v\ s}{\mathit{put}\ (\mathit{get}\ s')\ s' = s'} \quad (\text{PGP})
\]
Their reason for advocating this law is that they want to support a duplication operator (e.g., for augmenting a document with a table of contents), but because the duplicated data is not preserved exactly on round trips (consider making a change to just one copy of the duplicated data), the PG law is not satisfied.
[Figure 4.1: Lens architecture with “canonizers at the edges”. Diagram labels: concrete structures, canonizer (canonize / choose), canonical concrete structures, lens (get / put), canonical abstract structures, canonizer (canonize / choose), abstract structures.]
The weaker PGP law imposes some constraints on the behavior of lenses, but it opens the door to a wide range of unintended behaviors, e.g., lenses with constant put functions, lenses whose get component is the identity and whose put component is put v s = s, etc.¹
3. We can divide bidirectional programs into a “core component” that is a lens in the strict sense and “canonization” phases that operate at the perimeters of the transformation,
standardizing the representation of inessential data. See Figure 4.1.
For example, in our previous work on lenses for trees, the end-to-end transformations on actual strings (i.e., concrete representations of trees in the filesystem) only obey the lens laws up to the equivalence induced by a viewer: a parser and pretty printer mapping between raw strings and more structured representations of trees (Foster et al., 2007b). Similarly, XSugar, a language for converting between XML and ASCII, guarantees that its transformations are bijective modulo a fixed relation on input and output structures obtained by normalizing “unordered” productions, “ignorable” non-terminals, and the representation of XML (Brabrand et al., 2008).²
¹ Later work by the same authors (the journal version of Liu et al. 2007) excludes such transformations by decorating data with “edit tags,” ordering this data according to a “more edited than” relation, and adding a new law stipulating that doing a put followed by a get must yield a “more edited” view.
This approach is quite workable when the data formats and canonizers are generic. However, for ad-hoc data including textual databases, bibliographies, configuration files, etc., it rapidly becomes impractical: the two components of the canonization transformation themselves become difficult to write and maintain. In particular, the schema of the data is recapitulated, redundantly, in the lens and in each component of the canonizer! In other words, we end up back in the situation that lenses were designed to avoid. In our experience, these difficulties quickly become unmanageable for most formats of practical interest.
4. We can develop a more refined account of the whole lens framework that allows us to say, precisely and truthfully, that the lens laws hold modulo a particular equivalence relation. This is the approach we pursue in this chapter. The main advantage over the approach using viewers, as we will see, is that it allows us to define and use canonizers anywhere in a lens program, not only at the perimeters.
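To see concretely why the weakened law in option 2 admits unintended behaviors, consider the degenerate lens mentioned there, whose get is the identity and whose put discards the view. The Python sketch below is only an illustration; the function names and the sample strings are assumptions of the sketch. It checks both laws on a small sample and finds that PGP holds while PG fails.

```python
from itertools import product


def get(s: str) -> str:
    # The get component is the identity.
    return s


def put(v: str, s: str) -> str:
    # The put component ignores the updated view entirely.
    return s


def put_get(v: str, s: str) -> bool:
    # PG: get (put v s) = v
    return get(put(v, s)) == v


def put_get_put(v: str, s: str) -> bool:
    # PGP: if s' = put v s, then put (get s') s' = s'
    s1 = put(v, s)
    return put(get(s1), s1) == s1


samples = ["a", "b", "c"]
pairs = list(product(samples, repeat=2))

print(all(put_get_put(v, s) for v, s in pairs))  # True
print(all(put_get(v, s) for v, s in pairs))      # False
```

The lens never propagates an update, yet it passes the PGP check on every sampled pair, which is the kind of behavior the stricter PG law rules out.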
This chapter is organized as follows. Section 4.1 presents the relaxed semantic space of quotient lenses. Section 4.2 describes a number of generic combinators: coercions from basic lenses to quotient lenses and from quotient lenses to canonizers, operators for quotienting a lens by a canonizer, and sequential composition. Section 4.3 defines quotient lens versions of the regular operators (concatenation, union, and Kleene star). Section 4.4 introduces new primitives that are possible in the relaxed space of quotient lenses. Section 4.5 discusses the issue of typechecking quotient lenses. Section 4.6 illustrates some uses of quotient lenses on a large example: a lens for converting between XML and ASCII versions of a large genomic database. We conclude in Section 4.7.
4.1 Semantics
At the semantic level, the definition of quotient lenses is a straightforward refinement of basic lenses. We enrich the types of lenses with equivalence relations (instead of the basic lens type S ⇐⇒ V, we write S/∼S ⇐⇒ V/∼V, where ∼S is an equivalence on S and ∼V is an equivalence on V) and we relax the lens laws accordingly.
² [...]able data is interleaved with the rest of the transformation; in this respect, XSugar can be regarded as a special case of the framework proposed in this chapter.
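4.1.1 Definition [Quotient Lens]: Let S ⊆ U and V ⊆ U be sets of sources and views and let ∼S and ∼V be equivalence relations on S and V. A quotient lens l has components with the same types as a basic lens
\[
l.\mathit{get} \in S \to V \qquad l.\mathit{put} \in V \to S \to S \qquad l.\mathit{create} \in V \to S
\]
but is only required to obey the lens laws up to ∼S and ∼V:
\[
\begin{array}{ll}
l.\mathit{put}\ (l.\mathit{get}\ s)\ s \sim_S s & (\text{GP}) \\
l.\mathit{get}\ (l.\mathit{put}\ v\ s) \sim_V v & (\text{PG}) \\
l.\mathit{get}\ (l.\mathit{create}\ v) \sim_V v & (\text{CG})
\end{array}
\]
Additionally, the components of every quotient lens must respect ∼S and ∼V:
\[
\frac{s \sim_S s'}{l.\mathit{get}\ s \sim_V l.\mathit{get}\ s'}\ (\text{GE})
\qquad
\frac{v \sim_V v' \quad s \sim_S s'}{l.\mathit{put}\ v\ s \sim_S l.\mathit{put}\ v'\ s'}\ (\text{PE})
\qquad
\frac{v \sim_V v'}{l.\mathit{create}\ v \sim_S l.\mathit{create}\ v'}\ (\text{CE})
\]
We write S/∼S ⇐⇒ V/∼V for the set of quotient lenses between S (modulo ∼S) and V (modulo ∼V).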
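The definition can be exercised on a toy example. The Python sketch below is an assumption-laden illustration, not the formal development: the class name, the whitespace-normalizing lens, and the finite sample sets are all invented for the sketch, and checking a finite sample does not prove the laws. Sources are single-spaced strings compared by equality, while views may contain arbitrary runs of whitespace and are compared modulo that whitespace.

```python
from dataclasses import dataclass, replace
from itertools import product
from operator import eq
from typing import Callable


@dataclass(frozen=True)
class QuotientLens:
    get: Callable[[str], str]
    put: Callable[[str, str], str]
    create: Callable[[str], str]
    eq_s: Callable[[str, str], bool]  # the equivalence ~S on sources
    eq_v: Callable[[str, str], bool]  # the equivalence ~V on views


def normalize(text: str) -> str:
    # Collapse every run of whitespace to a single space.
    return " ".join(text.split())


def check_laws(l, sources, views):
    """Test the six quotient lens laws on finite samples."""
    src_pairs = [(s, t) for s, t in product(sources, repeat=2) if l.eq_s(s, t)]
    view_pairs = [(v, w) for v, w in product(views, repeat=2) if l.eq_v(v, w)]
    return {
        "GP": all(l.eq_s(l.put(l.get(s), s), s) for s in sources),
        "PG": all(l.eq_v(l.get(l.put(v, s)), v) for v in views for s in sources),
        "CG": all(l.eq_v(l.get(l.create(v)), v) for v in views),
        "GE": all(l.eq_v(l.get(s), l.get(t)) for s, t in src_pairs),
        "PE": all(l.eq_s(l.put(v, s), l.put(w, t))
                  for v, w in view_pairs for s, t in src_pairs),
        "CE": all(l.eq_s(l.create(v), l.create(w)) for v, w in view_pairs),
    }


# Hypothetical lens: put and create store the whitespace-normalized view.
ws_lens = QuotientLens(
    get=lambda s: s,
    put=lambda v, s: normalize(v),
    create=normalize,
    eq_s=eq,
    eq_v=lambda v, w: normalize(v) == normalize(w),
)

SOURCES = ["Jean Sibelius", "Aaron Copland"]
VIEWS = ["Jean Sibelius", "Jean   Sibelius", "Aaron Copland"]

quotient_result = check_laws(ws_lens, SOURCES, VIEWS)
strict_result = check_laws(replace(ws_lens, eq_v=eq), SOURCES, VIEWS)

print(all(quotient_result.values()))  # True
print(strict_result["PG"])            # False
```

On these samples the lens passes all six checks when views are compared modulo whitespace, but fails PG as soon as ∼V is replaced by equality, since the extra spaces in a view are not reproduced by get after put.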
The relaxed round-tripping laws are just the basic lens laws on the equivalence classes S/∼S and V/∼V, and when we pick ∼S and ∼V to be equality (the finest equivalence relation) they are equivalent to the basic laws precisely. However, although we reason about the behavior of quotient lenses as if they operated on equivalence classes, note that the component functions actually transform members of the underlying sets of sources and views; i.e., the type of get is S → V, not S/∼S → V/∼V. The second group of laws ensures that the components of a quotient lens treat equivalent structures equivalently. They play a critical role in (among other things) the proof that the typing rule for composition, defined below, is sound.
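Because the components operate on representatives rather than on equivalence classes, nothing prevents a get function from leaking an inessential detail of the source into the view. The short Python sketch below illustrates what the GE law rules out; the function names and the choice of equivalences (whitespace-insensitive on sources, equality on views) are assumptions of the sketch.

```python
def normalize(text: str) -> str:
    # Collapse every run of whitespace to a single space.
    return " ".join(text.split())


def eq_s(s: str, t: str) -> bool:
    # ~S: sources are equivalent modulo whitespace.
    return normalize(s) == normalize(t)


def eq_v(v: str, w: str) -> bool:
    # ~V: views are compared by equality.
    return v == w


def leaky_get(s: str) -> str:
    # Copies the source verbatim, including inessential whitespace.
    return s


def respectful_get(s: str) -> str:
    # Discards the inessential whitespace before producing the view.
    return normalize(s)


s1, s2 = "Jean Sibelius", "Jean  Sibelius"

print(eq_s(s1, s2))                                  # True
print(eq_v(leaky_get(s1), leaky_get(s2)))            # False: GE is violated
print(eq_v(respectful_get(s1), respectful_get(s2)))  # True: GE holds here
```

The first get maps equivalent sources to inequivalent views, so it could not be the get component of a quotient lens of this type, whereas the second respects the equivalences on this pair.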