Automata Learning: An Algebraic Approach

(1)

arXiv:1911.00874v3 [cs.FL] 28 Aug 2020

Automata Learning: An Algebraic Approach

Henning Urbat

∗ Friedrich-Alexander-Universität Erlangen-Nürnberg Erlangen, Germany [email protected]

Lutz Schröder

∗ Friedrich-Alexander-Universität Erlangen-Nürnberg Erlangen, Germany [email protected]

Abstract

We propose a generic categorical framework for learning unknown formal languages of various types (e.g. finite or infinite words, weighted and nominal languages). Our ap-proach is parametric in a monadTthat represents the given type of languages and their recognizing algebraic struc-tures. Using the concept of an automata presentation of T-algebras, we demonstrate that the task of learning a T-recognizable language can be reduced to learning an ab-stract form of algebraic automaton whose transitions are modeled by a functor. For the important case of adjoint automata, we devise a learning algorithm generalizing An-gluin’sL∗_{. The algorithm is phrased in terms of categorically} described extension steps; we provide for a termination and complexity analysis based on a dedicated notion of finite-ness. Our framework applies to structures likeω-regular lan-guages that were not within the scope of existing categori-cal accounts of automata learning. In addition, it yields new learning algorithms for several types of languages for which no such algorithms were previously known at all, includ-ing sorted languages, nominal languages with name bind-ing, and cost functions.

Keywords Automata Learning, Monads, Algebras

1 Introduction

Active automata learning is the task of inferring a finite representation of an unknown formal language by asking questions to a teacher. Such learning situations naturally arise, e.g., in software verification, where the “teacher” is some reactive system and one aims to construct a formal model of it by running suitable tests [61]. Starting with An-gluin’s [8] pioneering work on learning regular languages, active learning algorithms have been developed for count-less types of systems and languages, including ω-regular languages [9,32], tree languages [30], weighted languages [12,63], and nominal languages [47]. Most of these exten-sions are tailor-made modifications of Angluin’s L∗ algo-rithm and thus bear close structural analogies. This has mo-tivated recent work towards a uniform category theoretic understanding of automata learning, based on modelling state-based systems as coalgebras [14, 65]. In the present

∗_{The authors acknowledge support by Deutsche Forschungsgemeinschaft} (DFG) under project SCHR 1118/8-2.

, ,

.

paper, we propose a novelalgebraicapproach to automata learning.

Our contributions are two-fold. First, we study the prob-lem of learning an abstract form of automata originally in-troduced by Arbib and Manes [10] in the context of mini-mization: given an endofunctorF on a categoryD_and ob-jectsI,O ∈ D, anF-automaton consists of an objectQ of states and morphismsδQ,iQ andfQas shown below,

repre-senting transitions, initial states and ﬁnal states (or outputs). FQ δQ I iQ / /_Q fQ / /_O

TakingFQ=Σ×QonSetwithI =1 andO ={0,1}yields

classical deterministic automata, but also several other no-tions of automata (e.g. weighted automata, residual nonde-terministic automata, and nominal automata) arise as in-stances. As our ﬁrst main result, we devise a generalized

L∗ algorithm for adjointF-automata, i.e. automata whose type functorF admits a right adjointG, based on alternat-ing moves along the initial chain for the functorI+Fand

the ﬁnal cochain for the functorO ×G. Our generic algo-rithm subsumes knownL∗-type algorithms for all the above classes of automata, and its analysis yields uniform proofs of their correctness and termination. In addition, it also instan-tiates to a number of new learning algorithms, e.g. for sorted automata and for several versions of nominal automata with name binding.

We subsequently show that learning algorithms for F-automata (including our generalizedL∗algorithm) apply far beyond the realm of automata: they can be used to learn languages representable bymonads[7,59]. Given a monad Ton the categoryD_{, we model a}_language_{as a morphism} L:T I → O inD_{. At this level of generality, one obtains} a concept ofT-recognizable language(i.e. a language recog-nized by a ﬁniteT-algebra) that uniformly captures numer-ous automata-theoretic classes of languages. For instance, regular andω-regular languages (the languages accepted by classical ﬁnite automata and Büchi automata, respectively) correspond precisely to T-recognizable languages for the monads representing semigroups and Wilke algebras,

TI =I+onSet and T(I,J)=(I+,Iup₊I∗×J)onSet2. HereIupdenotes the set of ultimately periodic inﬁnite words over the alphabetI. Forω-regular languages, Farzan et al.

(2)

[32] proposed an algorithm that learns a languageL ⊆ Iω of inﬁnite words by learning the set of lassos inL, i.e. the regular language ofﬁnitewords given by

lasso(L)={u$v :u ∈I∗,v∈I+,uvω ∈L} ⊆ (I₊{$})∗. We show that this idea extends to generalT-recognizable languages, using the concept of an automata presenta-tion. Such a presentation allows for thelinearizationof T-recognizable languages, i.e. a reduction to “regular” lan-guages accepted by ﬁniteF-automata for suitableF.

In combination, our results yield a generic strategy for learning an unknownT-recognizable languageL:T I →O: (1) ﬁnd an automata presentation for the freeT-algebraT I; (2) learn the minimal automaton for the linearization ofL. This approach turns out to be applicable to a wide range of languages. In particular, it covers several settings for which no learning algorithms are known, e.g. cost functions [23]. Related work.A categorical interpretation of several key concepts in Angluin’sL∗ algorithm for classical automata was ﬁrst given by Jacobs and Silva [37], and later extended

toF-automata in a category, i.e. to similar generality as in

the present paper, by van Heerdt, Sammartino, and Silva [64]. Their main contribution is an abstract categorical framework (CALF) for correctness proofs of learning algo-rithms, while a concrete generic algorithm is not given. Van Heerdt et al. [65] also study learning automata with side ef-fects modelled via monads; this use of monads is unrelated to the monad-based abstraction of algebraic recognition in the present paper. Barlocco, Kupke, and Rot [14] develop a learning algorithm forsetcoalgebras (with all underlying concepts phrased categorically), parametric in a coalgebraic logic. Its scope is quite diﬀerent from our generalizedL∗ al-gorithm: via genericity over the branching type it covers, e.g., labeled transition systems, but unlike our algorithm it does not apply to, e.g., nominal automata. The connections between the two approaches are further discussed in Re-mark4.16.

Automata learning can be seen as an interactive ver-sion of automata minimization, which has been extensively studied from a (co-)algebraic perspective [5,10,16,24, 35,

63]. In particular, our chain-based iterative learning algo-rithm resembles the coalgebraic approach to partition re-ﬁnement [1].

2 Preliminaries

We proceed to recall concepts from category theory and the theory of nominal sets that we will use throughout the paper. Readers should be familiar with basic notions such as functors, (co-)limits and adjunctions; see, e.g., Mac Lane [41].

Functor (co-)algebras.LetH:D_→D_{be an endofunctor} on a categoryD_{. An}_H_-algebra_{is a pair}₍_A,_α₎_{consisting of}

an objectA∈D_{and a morphism}_α_:_HA_→_{A. A} homomor-phismh:(A,α) → (B,β)betweenH-algebras is a morphism h:A→ B such thath·α = β·Fh. AnH-algebra(A,α)is

initialif for everyH-algebra(B,β)there is a unique homo-morphism(A,α) → (B,β); we generally denote the initial algebra ofH (unique up to isomorphism if it exists) asµH. IfD_{is cocomplete and}_H_{preserves ﬁltered colimits,}_µH _can be constructed as the colimit of theinitialω-chain forH [6]:

µH = colim(0

¡

−

→H0−−→H¡ H20−−−H→2¡ H30→ · · · ), where ¡ is the unique morphism from the initial object 0

of D _into _{H0, and} _Hn _means_H _applied_n _{times. Letting}

jn:Hn0 → µH (n ∈ N) denote the colimit cocone, we

ob-tain theH-algebra structure onµH as the unique morphism α:H(µH) →µHsatisfying

α·H jn =jn+1 for alln∈N.

Dually, one has notions of acoalgebrafor the endofunctor H, acoalgebra homomorphism, and a ﬁnal coalgebra. Coal-gebras provide an abstract notion of state-based transition system: We think of the base objectAof anH-coalgebra as an object ofstates, and of its structure mapα:A→HAas assigning to each state a structured collection of successors. Coalgebra homomorphisms are behaviour-preserving maps, and ﬁnal coalgebras have abstracted behaviours as states. Monad algebras.AmonadT =(T,µ,η)on a categoryD is given by an endofunctorT:D _→ D _{and two natural} transformationsη:IdD →T andµ:TT →T (theunitand multiplication) such that the following diagrams commute:

TTT T µ // µT TT µ TT _µ //_T T T η // ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ TT µ T ηT o o ✉✉✉✉ ✉✉✉ ✉✉✉✉ ✉✉✉ T

A T-algebra is an algebra(A,α)for the endofunctorT for which the following diagrams commute:

TT A µA // T α T A α T A _α //_A A ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ ηA / /T A α A

AhomomorphismofT-algebras is just a homomorphism of the underlyingT-algebras. For eachX ∈ D_{, the}_T-algebra TX =(T X,µX)is called thefreeT-algebraonX.

Monads form a categorical abstraction of algebraic theo-ries [43]. In fact, every algebraic theory (given by a ﬁnitary signatureΓ _{and a set}_E _{of equations between} Γ_-terms)

in-duces of monadTonSetwhereT Xis the underlying set of the free(Γ_,_E₎_{-algebra on}_X _{(i.e. the set of all}Γ_{-terms over}

X modulo equations inE), and the mapsηX:X →T X and

µX:TT X → T X are given by inclusion of variables and 2

(3)

ﬂattening of terms, respectively. Then the categories of T-algebras and(Γ,E)-algebras are isomorphic. Conversely, ev-ery monadTonSetwithTpreserving ﬁltered colimits arises from some algebraic theory(Γ_,_E₎_{in this way.}

Similarly, every ordered algebraic theory [17], given by a signatureΓ_{and a set}_E _{of inequations}_s _≤ _t _between Γ

-terms, yields a monadTon the categoryPosof posets whose algebras are ordered Γ_{-algebras (i.e.}Γ_{-algebras on a poset}

with monotone operations) satisfying the inequations inE. Free monads. LetH:D _→ D _{be an endofunctor on a} category D _{with coproducts, and suppose that, for each} X ∈ D_{, the initial algebra} _µ₍_X ₊_H₎ _{for the functor}_X ₊ H exists. Then H induces a monad TH, the free monad

overH [15]. It is given on objects byTHX = µ(X +H); its

action on morphisms and the unit and multiplication are deﬁned via initiality of the algebrasµ(X+H). Then the

cat-egories ofTH-algebras and H-algebras are isomorphic: If

B+H(THB)

[iB,αB]

−−−−−→THBdenotes theB+H-algebra structure

ofTHB=µ(B+H), the isomorphism is given on objects by

(THB β − →B) 7→ (H B−−−→H iB H(THB) αB −−→THB β − →B) and on morphisms byh7→h.

Factorization systems.Afactorization system(E,M)in a categoryD_{is given by two classes}_E_and_M_{of morphisms} such that (i) E andM are closed under composition and contain all isomorphisms, (ii) every morphism f has a fac-torization f =m·ewithe ∈ Eandm ∈ M, and (iii) the

diagonal ﬁll-inproperty holds: given a commutative square m·f =д·ewithe ∈ Eandm∈ M, there exists a unique

morphismd withf =d ·eandд =m·d. The morphisms

mandein (i) are unique up to isomorphism and are called theimageandcoimageoff. Categories of (co-)algebras typ-ically inherit factorizations from their underlying category: (1) IfH:D _→ D _{is an endofunctor with}_H_{(E) ⊆ E}_{, the} factorization system(E,M)forDlifts to the category of H-algebras, that is, everyH-algebra homomorphism uniquely factorizes into a homomorphism inEfollowed by a homo-morphism inM. Dually, ifH(M) ⊆ M, then the category ofH-coalgebras has a factorization system lifting(E,M). (2) IfTis a monad onD _with_T_{(E) ⊆ E}_{, the factorization} system(E,M)forDlifts to the category ofT-algebras. A factorization system(E,M)isproperif every morphism in E is epic and every morphism in M is monic. When-ever a proper factorization system (E,M) is ﬁxed, quo-tients and subobjectsin D _{are represented by morphisms} inE andM, respectively. In particular, in the situation of (1) and (2) above, we representquotient (co-)algebras and sub(co-)algebras by homomorphisms in E andM, respec-tively.

Closed categories.Asymmetric monoidal categoryis a cat-egoryD_{equipped with a functor} _⊗:D_×D _→ D_(tensor

product), an objectID ∈D_{(tensor unit), and isomorphisms}

(X⊗Y)⊗Z X⊗(Y⊗Z),X⊗Y Y⊗X,ID⊗X X X⊗ID, natural inX,Y,Z∈D, satisfying coherence laws [41, Chap-ter VII].D_is_closed_{if the endofunctor}_X_{⊗ (−)}:D_→D_has a right adjoint (denoted by[X,−]) for everyX ∈D, i.e. there is a natural isomorphismD₍_X _⊗_Y_,_Z₎D₍_Y_,_[_X_,_Z_])_. Nominal sets.Fix a countably inﬁnite setAofnames, and let Perm(A)be the group of all permutationsπ:A→Awith π(a)=afor all but ﬁnitely manya. Anominal set[51] is a

setXwith a group action·: Perm(A)×X →Xsubject to the following property: for eachx ∈X there is a ﬁnite setS ⊆A

(asupportofx) such that everyπ ∈Perm(A)that leaves all elements ofSfixed satisfiesπ·x =x. This implies thatxhas a least supportsupp(x) ⊆A. The idea is thatxis a syntactic object with bound and free variables (e.g. aλ-term modulo α-equivalence), and thatsupp(x)is its set of free variables. A nominal setX isorbit-finiteif the number of orbits (i.e. equivalence classes of the relationx ≡ yiffx = π ·y for

someπ) is ﬁnite. A mapf:X →Y between nominal sets is equivariantiff(π·x)=π·f(x)forx ∈X andπ ∈Perm(A).

3 Automata in a Category

We next develop the abstract categorical notion of automa-ton that underlies our generic learning algorithm.

Notation 3.1. For the rest of this paper, let us ﬁx

(1) a categoryD_{with a proper factorization system}_(E,M), (2) an endofunctorF:D_→D_{, and}

(3) two objectsI,O ∈D_.

Deﬁnition 3.2(Automaton (cf. [5,10])). An(F-)automaton is given by an objectQ ∈D_{of states and three morphisms}

δQ:FQ →Q, iQ:I →Q, fQ:Q →O,

representing transitions, initial states, and ﬁnal states (or outputs), respectively. Ahomomorphismbetween automata

(Q,δQ,iQ,fQ)and(Q′,δQ′,iQ′,fQ′)is a morphismh:Q →

Q′inD_{such that the following diagrams commute:}

FQ δQ // F h Q h FQ′ δQ′ / /_Q′ I iQ // iQ′❋❋❋_#_# ❋ ❋ ❋ ❋ Q h fQ / /O Q′ fQ′ ; ; ✇ ✇ ✇ ✇ ✇ ✇

Example 3.3 (Σ_-automata)_. _{Suppose that} ₍D,⊗,ID) is a symmetric monoidal closed category. Choosing the data

F =Σ⊗ (−), I ₌ID, and O ∈D(arbitrary) for a ﬁxed input alphabet Σ _∈ D _{yields Goguen’s notion} of a Σ_-automaton_[₃₅_{]. In our applications, we shall work}

with the categoriesSet(sets and functions),Pos(posets and monotone maps),JSL(join-semilattices with⊥and semilat-tice homomorphisms preserving⊥),K-Vec (vector spaces over ﬁeldKand linear maps) andNom (nominal sets and equivariant maps). The factorization systems and monoidal

(4)

structures are given in the table below. In the fourth row,

⊗is the usual tensor product of vector spaces representing bilinear maps. Similarly, in the third row, ⊗ is the tensor product of semilattices representing bimorphisms [13], i.e. semilattice morphismsh:A⊗B →C correspond to maps h′:A×B→Cpreserving∨and⊥in each component.

D _(E,M) ⊗ ID O

Set (surjective, injective) × 1 {0,1}

Pos (surjective, embedding) × 1 {0<1} JSL (surjective, injective) ⊗ {0<1} {0<1}

K-Vec (surjective, injective) ⊗ K K

Nom (surjective, injective) × 1 {0,1}

Table 1.Symmetric monoidal closed categories

We choose the input alphabetΣ_∈D_{to be a finite set, a} dis-crete finite poset, a free semilattice on a finite set, a finite-dimensional vector space, and the nominal setAof atoms, respectively, and the output objectO ∈ D_{as shown in the} last column. ThenΣ_{-automata are precisely classical}

deter-ministic automata [53], ordered automata [50], semilattice automata [39], linear weighted automata [31], and nominal automata [18]. See Example3.9and3.10for further details.

Example 3.4 (Tree automata). Let Γ _{be a signature and}

FΓQ=Ýn∈NÝ_γ_∈Γ_nQnonSetthe induced polynomial func-tor, withΓ_n_{the set of}_{n-ary operations in}Γ_{. Choosing}_I ₌_∅

andO =2, anFΓ-automaton is a (bottom-up) tree

automa-ton over Γ_[₂₅_{], shortly a}Γ_{-automaton. For the analogous}

functorFΓ onPosandO = {0 < 1}, we obtainorderedΓ -automata.

In the following, we focus onadjoint automata, i.e. au-tomata whose transition typeFis a left adjoint:

Assumptions 3.5. For the rest of this section and in Sec-tion 4, our data is required to satisfy the following condi-tions:

(1) D_{is complete and cocomplete; in particular,}D _{has an} initial object 0 and a terminal object 1.

(2) The unique morphism ¡ : 0 → I lies in M, and the unique morphism ! :O→1 lies inE.

(3) The functorF:D_→D_{has a right adjoint}_G_:D _→D_. (4) The functorFpreserves quotients (F(E) ⊆ E).

Example 3.6. Every symmetric monoidal closed category D_with_F ₌ Σ_{⊗ −}_{satisﬁes Assumption}₍₃₎_{: closedness}

as-serts precisely thatF has the right adjointG =[Σ,−]. The categoriesD_{of Table}₁_{also satisfy the remaining} assump-tions.

Remark 3.7. The key feature of our adjoint setting is that automata can be dually viewed asalgebras andcoalgebras for suitable endofunctors. In more detail:

(1) An automatonQcorresponds precisely to an algebra

(FIQ αQ

−−→Q) = (I+FQ

[iQ,δQ]

−−−−−−→Q)

for the endofunctorFI = I +F equipped with an output

morphism fQ:Q →O. SinceFI preserves ﬁltered colimits

(using that the left adjointF preserves all colimits and the functorI+(−)preserves ﬁltered colimits), the initial algebra

µFI forFI emerges as the colimit of the initialω-chain:

µFI = colim(0 ¡ − →FI0 FI¡ −−→F_I20 F 2 I¡ −−→F_I30→ · · · ). The colimit injections and theFI-algebra structure onµFI

are denoted by

jn:FIn0→µFI (n∈N) and α:FI(µFI) →µFI. For any automatonQ(viewed as anFI-algebra), we write

eQ:µFI →Q

for the uniqueFI-algebra homomorphism fromµFI intoQ.

(2) Dually, replacingδQ:FQ →Qby its adjoint transpose

δ_Q@:Q →GQ, an automaton can be presented as a coalge-bra

(Q−−→γQ GOQ) = (Q

hfQ,δ_Q@i

−−−−−−→O×GQ)

for the endofunctorGO = O ×G equipped with an

ini-tial stateiQ:I → Q. SinceGO preserves coﬁltered limits,

the ﬁnal coalgebraνGO arises as the limit of the ﬁnalωop

-cochain: νGO =lim(1←−! GO1 GO! ←−−−G2_O1 G 2 O! ←−−−G_O31← · · · ). The limit projections and the GO-coalgebra structure on

νGO are denoted by

j_k′:νGO →GOk1 (k ∈N) and νGO

γ

−

→GO(νGO). For any automatonQ(viewed as aGO-coalgebra), we write

mQ:Q→νGO

for the uniqueGO-coalgebra homomorphism intoνGO.

Deﬁnition 3.8(Language). (1) Alanguageis a morphism L:µFI →O.

(2) The languageacceptedby an automatonQis deﬁned by LQ = (µFI

eQ

−−→Q−−→fQ O).

Example 3.9(Σ_{-automata, continued)}_. _{(1) In the setting}

of Example3.3, the initial algebraµFI and the initial chain

for the functorFI =ID+Σ_{⊗ −}_{can be described as follows}

[35]. LetΣn₌Σ_⊗Σ_{⊗ · · · ⊗}Σ_{denote the}_{nth tensor power}

ofΣ_(whereΣ0₌_{ID), and put} Σ<n = Þ m<n Σm ₍_n_∈_N₎ _and Σ∗₌Þ n∈N Σn_.

ThenµFIis carried by the objectΣ∗of words, and the initial

chain is given by the coproduct injections

Σ<0

֌Σ<1֌Σ<2֌Σ<3֌· · ·.

(5)

(2) For the functorGO =O× [Σ,−]the ﬁnal coalgebraνGO

is carried by the object[Σ∗,O]of languages and we have the ﬁnal cochain

[Σ<0

,O] ← [Σ<1

,O] ← [Σ<2,O] ← [Σ<3,O] ← · · · with connecting morphisms given by restriction. To see this, consider the contravariant functorP =[−,O]:D → Dop.

It is not diﬃcult to verify thatP is a left adjoint (with right adjointPop_{) and that there is a natural isomorphism}

PFI G_OopP.

IfAlgFI andCoalgGO denote the categories ofFI-algebras

andGO-coalgebras, it follows [36, Theorem 2.4] thatP lifts

to a left adjointP:AlgFI → (CoalgGO)opgiven by

(FIQ αQ

−−→Q) 7→ (PQ−−−→P αQ PFIQGOPQ).

Since left adjoints preserve initial objects,Pmaps the initial algebraµFI to the ﬁnal coalgebraνGO, i.e. one hasνGO =

P(µFI)with the coalgebra structure

γ =(νGO =P(µFI) P α

−−→PFI(µFI)GOP(µFI)=GO(νGO) ).

Moreover, applyingP to the initial chain forFI yields the

ﬁnal cochain forGO:

(1←−! GO1 GO!

←−−−G_O21· · · )=(P0←−P¡ PFI0 P FI¡

←−−−PF_I20· · · ). SinceµFI =Σ∗andP =[−,O], we obtain the above

descrip-tion ofνGO and of the ﬁnal cochain forGO.

(3) For the categories of Table1, the categorical notion of (accepted) language given in Deﬁnition3.8thus specializes to the familiar ones. For illustration, let us spell out the case D₌_{Set. A}Σ_{-automaton in}_Set_{is precisely a classical}

deter-ministic automaton: it is given by a setQof states, a transi-tion mapδQ:Σ×Q →Q, a mapiQ: 1→Q (representing

an initial stateq0 = iQ(∗)), and a map fQ:Q → 2

(rep-resenting a set f_Q−1[1]of ﬁnal states). From (1) and (2) we obtain the well-known description of the initial algebra for FI = 1+Σ× −as the set Σ∗ of ﬁnite words overΣ(with

algebra structureα: 1+Σ×Σ∗ → Σ∗given by∗ 7→εand

(a,w) 7→wa) and of the ﬁnal coalgebra forGO =2× [Σ,−] as the set[Σ∗_,₂_] PΣ∗_{of all languages}_L _⊆ Σ∗_[₅₅_{]. The}

unique FI-algebra homomorphismeQ:Σ∗ → Q maps a

wordw ∈ Σ∗_{to the state of}_Q _{reached on input}_{w. Thus,}

the languageLQ = fQ·eQ accepted byQ is the usual

con-cept:w lies inLQ if and only ifQ reaches a ﬁnal state on

inputw.

Example 3.10(Nominal automata). Our notion of automa-ton (Deﬁnition3.2) has several natural instantiations to the categoryNomof nominal sets and equivariant maps. (1) The simplest instance was already mentioned in Exam-ple 3.3: a Σ_{-automaton in} _Nom _{corresponds precisely to}

anominal deterministic automaton[18]. For simplicity, we choose the alphabetΣ₌_A_{. A nominal automaton is given}

by a nominal setQof states, an equivariant transition map

δQ:A×Q → Q, an equivariant mapiQ: 1 → Q

(repre-senting an equivariant initial stateq0∈Q), and an equivari-ant map fQ:Q → 2 (representing an equivariant subset

F ⊆ Q of ﬁnal states). The initial algebraA∗ is the nomi-nal set of words overAwith group actionπ · (a1. . .an)=

(π·a1). . .(π·an)fora1. . .an ∈A∗andπ ∈Perm(A). Thus,

a languageL:A∗ →2 corresponds to an equivariant set of words overA.

Nominal automata with orbit-ﬁnite state space are known to be expressively equivalent to Kaminski and Francez’ [38]deterministic ﬁnite memory automata. (2) NowNomcarries a further symmetric monoidal closed structure, theseparated product∗given on objects by

X ∗Y ={ (x,y) ∈X×Y : x#y},

wherex#ymeans thatsupp(x) ∩supp(y) = ∅. The right

adjoint ofF =A∗ (−)is theabstraction functorG=[A](−)

[51] which maps a nominal setX to the quotient ofA×X modulo the equivalence relation∼deﬁned by(a,x) ∼ (b,y)

iﬀ(ac) ·x =(bc) ·yfor some (equivalently, all)c ∈ Awith c#a,b,x,y. We writehaixfor the equivalence class of(a,x), which we think of as the result of binding the nameainx. F-automata are precisely theseparated automata recently introduced by Moerman and Rot [46].

(3) By combining the adjunctions of (1) and (2), we obtain the adjoint pair of functorsF ⊣Gwith

F =A× (−)+A∗ (−), G₌[A,−] × [A](−). The ensuing notion of automaton coincides with one used in Kozen et al.’s [40] coalgebraic representation of nominal Kleene algebra [33]. Such automata have two types of transi-tions,freetransitions ([A,−]) andboundtransitions ([_A](−)). They acceptbar languages[56]: putting ¯A=A∪ {ha|a ∈A}

(changing the original notation from|atohafor compatibil-ity withdynamic sequencesas discussed next), abar stringis just a word over ¯A. We considerhaas bindingato the right. This gives rise to the expected notions of free names and α-equivalence≡α. A bar string iscleanif its bound names are

mutually distinct and distinct from all its free names. Simpli-fying slightly, we deﬁne abar languageto be an equivariant set of bar strings moduloα-equivalence, i.e. an equivariant subset of ¯A∗/≡α. The initial algebraµF1is the nominal set of

clean bar strings. A language in our sense is thus an equivari-ant set of clean bar strings; such languages are in bijective correspondence with bar languages [56].

(4) We note next that[A](−)is itself a left adjoint, our ﬁrst example of a left adjoint that is not of the formΣ_{⊗ −}_{for a}

closed structure⊗. The right adjointRis given on objects byRX = {f ∈ [A,X] : a#f(a)for alla ∈A}[51]. We

extend the above notion of automaton with this feature, i.e. we now work with the adjoint pairF ⊣Ggiven by

F =A× (−)+A∗ (−)+[A](−), G₌[A,−] × [A](−) ×R.

(6)

The initial algebra µF1 now consists of words built from three types of letters; we denote the new type of letters in-duced by the new summand[A](−)inF byai (fora ∈ A). Recalling that words grow to the right, we see thataibinds to the left. We readaiasdeallocatingthe name or resourcea. Languages in this model consist ofdynamic sequences[34]. We associate such languages with a species of nominal au-tomata having three types of transitions: free and bound transitions as above, anddeallocating transitionsq −−a→i q′ witha#q′. To the best of our knowledge, this notion of nom-inal automaton has not appeared in the literature before.

Example 3.11(SortedΣ_-automata)_. _{In our applications in}

Section 5, we shall encounter a generalized version of Σ

-automata where (1) the input objectI is arbitrary, not nec-essarily equal to the tensor unitID, and (2) the automaton has a sorted object of states and consumes sorted words. This reﬂects the fact that the algebraic structures arising in algebraic language theory are often sorted. For brevity, we only treat the case of sorted automata in Set. Fix a set S of sorts and a family of sets Σ ₌ ₍Σ_s_,_t₎_s_,_t_∈_S_{; we think}

of the elements of Σ_s_,_t _{as letters with domain sort}_s _and

codomain sortt. We instantiate our setting to the adjoint pairF ⊣ G:SetS → SetS deﬁned as follows forQ ∈ SetS ands,t∈S:

(FQ)t =Ýs∈SΣs,t×Qs, (GQ)s =Ît∈S[Σs,t,Qt]. ChoosingI ∈ SetS _{arbitrary and the output object}_O

= 2,

theS-sorted set with two elements in each component, an

F-automaton is asorted Σ_{-automaton. It is given by an}

S-sorted set of statesQ, transitionsδQ,s,t: Σs,t ×Qt → Qt

(s,t ∈ S), initial states i:I → Q and an output map

fQ:Q→2 (representing anS-sorted set of ﬁnal states). The

initial algebraµFIis theS-sorted set of all well-sorted words

overΣ_{with an additional ﬁrst letter from}_I_{. More precisely,}

(µFI)t consists of all wordsxa1. . .anwithx ∈Ýs∈SIs and

a1, . . . ,an ∈Ýr,sΣr,ssuch that the sorts of consecutive

let-ters match, i.e. there exist sortss =s0,s1, . . . ,sn =t ∈ S

such thatx ∈Is andai ∈Σsi−1,si fori=1, . . . ,n. In particu-lar, in the single-sorted case we haveµFI =I×Σ∗. For any

well-sorted input wordw=xa1. . .anone obtains the run x

−

→q0−→a1 q1→ · · ·−−→an qn

inQwhereq0 =iQ,s(x)andqi =δQ,si−1,si(ai,qi−1)fori = 1, . . . ,n, andwis accepted if and only ifqnis a ﬁnal state.

We conclude with a discussion of minimal automata.

Deﬁnition 3.12(Minimal automaton). An automatonQis called (1)reachableif the uniqueFI-algebra homomorphism

eQ: µFI → Q lies inE, and (2)minimalif it is reachable

and for every reachable automatonQ′withLQ =LQ′, there

exists a unique automata homomorphism fromQ′toQ.

Theorem 3.13. For every languageLthere exists a minimal automatonMin(L)acceptingL, unique up to isomorphism.

Proof sketch. We describe the construction of the minimal automaton. By equippingµFIwith the ﬁnal statesL:µFI →

O, we can viewµFIas aGO-coalgebra. Consider the(E,M)

-factorization of the unique coalgebra homomorphismmµ FI: mµ FI = ( µFI eMin(L) / ///_Min(L)// mMin(L) / /_νG_O ). The object Min(L) can be uniquely equipped with an au-tomaton structure for whicheMin(L)is anFI-algebra

homo-morphism andmMin(L)is aGO-coalgebra homomorphism.

This automaton is the minimal acceptor forL.

The minimization theorem and its proof are closely re-lated to the classical work of Arbib and Manes [10] on the minimal realization of dynamorphisms, i.e. F-algebra homomorphisms from µFI into νGO. Under diﬀerent

as-sumptions on the type functorF and the base categoryD (e.g. co-wellpoweredness), minimization results were also established by Adámek and Trnková [5] and, recently, by van Heerdt et al. [63].

4 A Categorical

L

∗

Algorithm

To motivate our learning algorithm for adjoint automata, we recall Angluin’s classicalL∗_{algorithm [}₈_{] for learning an} un-knownΣ_-automaton_Q _in_{Set. The algorithm assumes that}

the learner has access to an oracle (theteacher) that can be asked two types of questions:

(1) Membership queries:given a wordw ∈Σ∗_{, is}_w _∈_L_Q_?

(2) Equivalence queries:given an automatonH, isLH =LQ?

If the answer in (2) is “no”, the teacher discloses a counterex-ample, i.e. a wordw ∈LQ\LH∪LH\LQ, to the learner.

The idea of L∗ is to compute a sequence of approxima-tions of the unknown automatonQ by considering ﬁnite (co-)restrictions of the morphismmQ ·eQ, as indicated by

the diagram below. Note that the kernel ofmQ ·eQ is

pre-cisely the well-knownNerode congruenceofLQ. Σ<0_/_/ _/_/_{· · ·} _Σ<N _/_/ _/_/Σ<N+1_/_/ _/_/_{· · ·} Σ∗ eQ ✤ ✤ ✤ ✤ ✤ S hS,T % % eS,T O O O O HS,T mS,T Q mQ ✤ ✤ ✤ ✤ ✤ [T,2] [Σ<0 ,2]oooo ·· [Σ<K ,2] O O O O [Σ<K+1 ,2] o o o o oooo ·· [Σ∗_,₂_] (1)

In more detail, the algorithm maintains a pair(S,T)of ﬁnite setsS,T ⊆ Σ∗_{(“states” and “tests”). For any such pair, the} restriction ofmQ·eQ to the domainSand codomain[T,2], hS,T:S→ [T,2], hS,T(s)(t)=LQ(st) fors ∈S,t ∈T, is called theobservation tablefor(S,T). It is usually repre-sented as an|S| × |T|-matrix with binary entries. The learner

(7)

can computehS,Tvia membership queries. The pair(S,T)is

closedif for eachs ∈Sanda∈Σ_{there exists}_s′_∈_S_with

hS∪SΣ,T(sa)=hS,T(s′). It isconsistentif, for alls,s′∈S,

hS,T(s)=hS,T(s

′₎ _implies

hS,T∪ΣT(s)=hS,T∪ΣT(s

′₎ .

Initially, one putsS =T = {ε}. If at some stage the pair

(S,T)is not closed or not consistent, eitherSorT can be extended by invoking one of the following two procedures:

ExtendS

Input:A pair(S,T)that is not closed. (0) Chooses ∈Sanda∈Σ_{such that}

hS∪SΣ,T(sa),hS,T(s′) for alls′∈S.

(1) PutS:=S∪ {sa}.

ExtendT

Input:A pair(S,T)that is not consistent. (0) Chooses,s′∈S,t ∈T anda ∈Σ_{such that}

hS,T(s)=hS,T(s′) and hS,T∪ΣT(s)(at),hS,T∪ΣT(s′)(at). (1) PutT :=T ∪ {at}.

The two procedures are applied repeatedly until the pair

(S,T)is closed and consistent. Then, one constructs an au-tomatonHS,T, thehypothesis for(S,T). Its set of states is

the imagehS,T[S], the transitionsδS,T: Σ×HS,T → HS,T

are given byδS,T(a,hS,T(s)) = hS∪SΣ,T(sa) fors ∈ S and

a ∈Σ_{, the initial state is}_h_S_,_T₍_ε₎_{, and a state}_h_S_,_T₍_s₎_{is ﬁnal}

ifs∈LQ(i.e.hS,T(s)(ε)=1). Note that the well-deﬁnedness

ofδS,T is equivalent to(S,T)being closed and consistent.

The learner now tests whether LHS,T = LQ by asking an equivalence query. If the answer is “yes”, the algorithm terminates successfully; otherwise, the teacher’s counterex-ample and all its preﬁxes are added toS. In summary:

L∗Algorithm

Goal:Learn an automaton equivalent to an unknown au-tomatonQ.

(0) InitializeS=T ={ε}.

(1) While(S,T)is not closed or not consistent: (a) If(S,T)is not closed: ExtendS. (b) If(S,T)is not consistent: ExtendT. (2) Construct the hypothesisHS,T.

(a) IfLHS,T =LQ: ReturnHS,T.

(b) IfLHS,T ,LQ: PutS:=S∪C, whereCis the set of preﬁxes of the teacher’s counterexample.

(3) Go to(1).

The algorithm runs in polynomial time w.r.t. the size of the minimal automaton Min(LQ) and the length of

the longest counterexample provided by the teacher. The learned automaton (i.e. the correct hypothesis returned in Step (2a)) is isomorphic to Min(LQ). Correctness and

termination rest on the invariant that S is preﬁx-closed andT is suﬃx-closed. Note that ifT ⊆ Σ<K_{, then}_T _yields

a quotient [Σ<K_,₂_]

։ [T,2] given by restriction. In the

following,T is represented via this quotient.

We shall now develop all ingredients of L∗ for adjoint F-automata. This requires additional assumptions, which hold for all the functors discussed in Example3.3,3.10and3.11:

Assumptions 4.1. On top of our Assumptions3.5, we re-quire for the rest of this section thatFI =I +F preserves

subobjects (FI(M) ⊆ M) and pullbacks ofM-morphisms,

and thatGO =O×Gpreserves quotients (GO(E) ⊆ E).

Our categorical learning algorithm generalizes (1) to the diagram shown below, where the upper and lower part are given by the initial chain forFIand the ﬁnal cochain forGO:

F_I00// ¡ //· · · FIN0 jN ) ) / / FN I ¡ / /_FN+1 I 0_F// N+1 I ¡ / /_{· · ·} _µF_I eQ ✤ ✤ ✤ ✤ ✤ S hs,t ' ' es,t O O s O O Hs,t ms,t Q mQ ✤ ✤ ✤ ✤ ✤ T G_O01oooo ! · · · _GK O1 t O O O O GK_O+11 GK O! o o o o G · · · K+1 O ! o o o o _νG_O j′ K i i (2)

The algorithm maintains a pair(s,t)of anFI-subcoalgebra

and aGO-quotient algebra

s:(S,σ)֌(FIN0,F N

I ¡), t:(G

K

O1,GKO!)։(T,τ), (3)

withN,K >0. ForΣ-automata inSet, this means precisely thatSis a preﬁx-closed subset ofΣ<N_{, and that}_T_represents

a suﬃx-closed subset ofΣ<K_.

Initially, one takesN =K=1,s=idI andt=idO, which

corresponds to Step (0) of the originalL∗algorithm.

Remark 4.2. By Assumptions3.5(2)and4.1, every subcoal-gebras:(S,σ) ֌ (FIN0,F

N

I ¡) induces the two

subcoalge-bras (S,σ) // FN I ¡·s / / (F_IN+10,F_IN+1¡) oo FIs oo (FIS,FIσ). In the case ofΣ_{-automata in}_{Set, the construction of these}

two subcoalgebras corresponds to viewing a preﬁx-closed subsetS ⊆Σ<N _{as a subset of}Σ<N+1_{, and to extending}_S_to the preﬁx-closed subsetSΣ_{∪ {}_ε_}₌_S_∪_SΣ_⊆Σ<N+1_{. A dual}

remark applies to quotient algebras of(G_OK1,GK O!).

Deﬁnition 4.3(Observation table). Let(s,t)be a pair as in (3), and letQbe an automaton. Theobservation tablefor

(8)

(s,t)w.r.t.Qis the morphism hQ_s_,_t = (S→−s FIN0 jN −−→µFI eQ −−→Q−m−−→Q νGO j′ K −−→G_OK1→−t T). Its(E,M)-factorization is denoted by

hQ_s,t = (S e_s,tQ / ///H_sQ,t // m_s,tQ / /T ).

In the following, we ﬁxQ (the unknown automaton to be learned) and omit the superscripts(−)Q.

Remark 4.4. In our categorical setting, membership queries are replaced by the assumption that the learner can compute the observation tablehQ_s,t for each pair(s,t).

Im-portantly, this morphism depends only on the language of Q: one can show that for every automatonQ′withLQ =LQ′

one hasmQ·eQ =mQ′·e_Q′, whencehQ

s,t =h

Q′ s,t.

Deﬁnition 4.5(Closed/Consistent pair). For any pair(s,t)

as in (3), let cls,t andcss,t be the unique diagonal ﬁll-ins

making all parts of the diagram below commute:

Hs,GOt css,t ✤ ✤ / / ms,GO t / /_G_O_T τ S es,GO t 7 777 ♥ ♥ ♥ ♥ ♥ ♥ ♥ ♥ ♥ es,t / /// σ Hs,t // ms,t / / cls,t ✤ ✤ T FIS _e FI s,t / ///_H_F Is,t 6 6 mFI s,t 6 6 ♥ ♥ ♥ ♥ ♥ ♥ ♥ ♥ ♥ ♥

The pair(s,t)isclosedifcls,t is an isomorphism, and

consis-tentifcss,tis an isomorphism.

If(s,t)is not closed or not consistent, at least one of the two dual procedures below applies. “Extends” replacesS֌

F_IN0 by a new subcoalgebraS′ ֌ FIN+10, i.e. it moves to

the right in the initial chain forFI. Analogously, “Extendt”

replacesG_OK1։T by a new quotient algebraGKO+11։T

′_, and thus moves to the right in the ﬁnal cochain forGO.

Extends

Input:A pair(s,t)as in (3) that is not closed.

(0) Choose an objectS′andM-morphismss0:S֌S′and s1:S′֌FISsuch that σ =s1·s0 and eFIs,t ·s1∈ E. (1) Replaces:(S,σ)֌(FIN0,F N I ¡)by the subcoalgebra FIs·s1:(S′,FIs0·s1)֌(FIN+10,FIN+1¡).

Remark 4.6. (1) One trivial choice in Step (0) is S′=FIS s0=σ, s₁₌id.

To get an eﬃcient implementation of the algorithm, one aims to choose the subobjects1:S′֌FISas small as

pos-sible.

(2) The update ofs in Step (1) is well-deﬁned, i.e.FIs ·s1

is a subcoalgebra. Indeed, the commutative diagram below shows thatFIs·s1is a coalgebra homomorphism:

F_IN+10 F N+1 I ¡ / /FI(F_IN+10) FIS FIs O O FIσ / /_F_I_F_I_S FIFIs O O SOO′ s1 O O s1 //FIS FIσ_♦_♦♦77 ♦ ♦ ♦ ♦ ♦ FIs0 / /FIS′ O OFIs1 O O

Moreover, sinces,s1∈ MandFIpreservesM(see

Assump-tions4.1), we haveFIs·s1∈ M.

(3) In the case ofΣ_{-automata in}_{Set, the condition}_σ ₌_s₁_·_s₀

states thatS ⊆ S′ ⊆ S ∪SΣ ₌ _SΣ_{∪ {}_ε_}_{. The condition}

eFIs,t ·s1 ∈ E asserts that given s ∈ S anda ∈ Σsuch

thathS∪SΣ,T(sa),hS,T(r)for allr ∈S, there existss′∈S′

withhS∪SΣ,T(sa)=hS∪SΣ,T(s′). Thus, “Extends” subsumes

several executions of “ExtendS” in the originalL∗algorithm.

Extendt

Input:A pair(s,t)as in (3) that is not consistent.

(0) Choose an objectT′andE-morphismst0:GOT ։T′

andt1:T′։T such that

τ =t1·t0 and t0·ms,GOt ∈ M.

(1) Replacet:(GK_O1,G_OK!)։(T,τ)by the quotient algebra t0·GOt:(GOK+11,GKO+1!)։(T

′

,t0·GOt1).

Remark 4.7. (1) Dually to Remark4.6, a trivial choice in Step (0) is given byT′=GOT,t0=id,t1=τ, and Step (1) is

well-deﬁned, i.e.t0·GOtis a quotient algebra.

(2) In the case ofΣ_{-automata in}_{Set, we view the quotients}

T andT′as subsets ofΣ<K _and_Σ<K+1_{, respectively, using} the above identiﬁcation between subsets and quotients. The conditionτ =t1·t0then states thatT ⊆T′⊆T ∪ΣT. The

conditiont0·ms,GOt ∈ Mstates that every inconsistency admits a witness inT′: givens,s′∈SwithhS,T(s)=hS,T(s′)

but hS,T∪ΣT(s) , hS,T∪ΣT(s′), there exists t′ ∈ T′ with

hS,T′(s)(t′) , hS,T′(s′)(t′). Thus, “Extendt” subsumes

sev-eral executions of “ExtendT” in the originalL∗_algorithm. If(s,t)is both closed and consistent, then we can deﬁne an automaton structure onHs,t:

Deﬁnition 4.8 (Hypothesis). Let the pair (s,t) be closed and consistent. Thehypothesisfor(s,t)is the automaton

(Hs,t,δs,t,is,t,fs,t)

with states Hs,t and structure deﬁned below. Here, inl/inr

are coproduct injections,outl/outrare product projections, and(−)#denotes adjoint transpose along the adjunctionF ⊣

G:

(9)

(1) The transitionsδs,t:F Hs,t →Hs,t are given by the

di-agonal ﬁll-in of the commutative square F S ls,t F es,t / ///F Hs,t δs,t x x r r r _r# s,t Hs,t // _m s,t / /_T

with the two vertical morphisms deﬁned by ls,t =(F S−−→inr I+F S=FIS e_{FI s},t −−−−→HFIs,t cl−1_s,_t −−−→Hs,t), rs,t =(Hs,t cs−1_s,_t −−−→Hs,GOt ms,GO t −−−−−−→GOT =O×GT −−−→outr GT). (2) The initial states are

is,t = (I −−inl→I+F S=FIS e_{FI s},t

−−−−→HFIs,t

cl−1_s,t

−−−→Hs,t). (3) The ﬁnal states are

fs,t = (Hs,t cs−1_s,t −−−→Hs,GOt ms,GO t −−−−−−→GOT =O×GT outl −−−→O).

Remark 4.9. The square deﬁningδs,t commutes: both legs

can be shown to be equal toF S −−→inr I+F S =FIS h_{FI s},t

−−−−→T. The idea of constructing theF-algebra structure of a hypoth-esis via diagonal ﬁll-in originates in the abstract framework of CALF [64]. An important diﬀerence is that in the latter the existence of the two vertical morphisms of the corre-sponding square is postulated, while our present setting fea-tures a concrete description ofls,tandrs,t.

Recall that in L∗, if a hypothesis HS,T is not correct (i.e.

LHS,T ,LQ), the learner receives a counterexamplew ∈ Σ ∗ from the teacher and adds the setC of all its preﬁxes toS. Identifying the wordwwith this set, the concept of a coun-terexample has the following categorical version:

Deﬁnition 4.10(Counterexample). Let(s,t)be closed and consistent. Acounterexample forHs,t is a subcoalgebra

c:(C,γ)֌(FIM0,F M

I ¡) for someM >0

such thatHs,t andQdo not agree on inputs fromC, that is,

LHs,t ·jM ·c , LQ·jM ·c.

Remark 4.11. (1) IfLHs,t ,LQ, then a counterexample al-ways exists. Indeed, since the colimit injectionsjM:F_IM0→

µFI are jointly epimorphic, one hasLHs,t ·jM ,LQ ·jM for someM > 0 and thus(C,γ) ₌ (F_IM0,F_IM¡)is a counterex-ample. To obtain an eﬃcient algorithm, it is often assumed that the teacher delivers aminimalcounterexample, i.e.M is minimal and no proper subcoalgebra is a counterexample. (2) Given a counterexamplec:(C,γ) ֌ (FIM0,FIM¡), one

can addc to the subcoalgebras:(S,σ) ֌ (FIN0,FIN¡)as

follows: by Remark4.2, we can assume thatM = N, and

then form the supremums∨c:(S∨C,σ∨γ)֌(FIN0,FIN¡)

ofsandc in the lattice of subcoalgebras of(FN

I 0,FIN¡), viz.

the image of the homomorphism[s,c]:S+C→FIN0.

With all these ingredients at hand, we obtain the following abstract learning algorithm for adjointF-automata:

GeneralizedL∗Algorithm

Goal:Learn an automaton equivalent to an unknown au-tomatonQ.

(0) InitializeN =K=1,s=idI andt=idO.

(1) While(s,t)is not closed or not consistent: (a) If(s,t)is not closed: Extends. (b) If(s,t)is not consistent: Extendt. (2) Construct the hypothesisHs,t.

(a) IfLHs,t =LQ: ReturnHs,t. (b) IfLHs,t

,LQ: Replace the subcoalgebrasbys∨c,

wherecis the teacher’s counterexample. (3) Go to (1).

To prove the termination and correctness of GeneralizedL∗, we need a ﬁniteness assumption on the unknown

automa-tonQ. We call aD_-object_Q_Noetherian_{if both its poset of}

subobjects (ordered bym≤m′_iﬀ_m

=m′·pfor somep) and

that of its quotients (ordered bye≤e′iﬀe=q·e′for some

q) contain no inﬁnite strictly ascending chains.

Theorem 4.12. IfQ is Noetherian, then the generalizedL∗

algorithm terminates and returnsMin(LQ).

Remark 4.13. Under a slightly stronger ﬁniteness condi-tion on Q, we obtain a complexity bound. Suppose that Q has ﬁniteheight n, that is,nis the maximum length of any strictly ascending chain of subobjects or quotients ofQ. Then Steps (1a), (1b) and (2b) are executedO(n)times.

Example 4.14. In D ₌ _Set, _Pos, _JSL, _K_{-Vec, and} _Nom, the Noetherian objects are precisely the finite sets, finite posets, finite semilattices, finite-dimensional vector spaces and orbit-finite nominal sets. The height ofQis equal to the number of elements ofQ(forD₌_Set,_{Pos) or the dimension} (forD₌_K_{-Vec). For}D₌_{Nom, the height of an orbit-finite} setQcan be shown to be polynomial in the number of orbits ofQand max{ |supp(q)| |q ∈Q}, using upper bounds on the length of subgroup chains in symmetric groups [11].

Remark 4.15. In the generalizedL∗algorithm, counterex-amples are added toS. Dually, one may opt to add them toT instead; forΣ_{-automata in}_{Set, this corresponds to a}

modifi-cation of Angluin’s algorithm due to Maler and Pnueli [42] that makes it possible to avoid inconsistent observation ta-bles, i.e. all tables constructed in the modified algorithm are consistent. In this dual approach, the accepted language of an automatonQis defined coalgebraically as the morphism

L_Q′ = (I iQ

−−→Q−m−−→Q νGO),

and acounterexampleis a quotient algebrac:(G_OM1,GM

O!)։

(C,γ)for someM >0 such thatc·j′_M·L′_H

s,t ,c·j ′

M·L

′

Q. In

Step (2b), a counterexamplecis added to the quotient alge-brat:(G_OK1,G−OK!)։(T,τ)by forming the supremum of

(10)

tandc. To guarantee termination, our original requirement thatFI preserves pullbacks ofM-morphisms (see

Assump-tions4.1) needs to be replaced by the dual requirement that GOpreserves pushouts ofE-morphisms.

Remark 4.16. We elaborate on the connection between GeneralizedL∗ _{and the learning algorithm for coalgebras} due to Barlocco et al. [14]. The latter is concerned with coalgebras whose semantics is given in terms of a coalge-braic logic, i.e. a natural transformation δ:Lop_P _→ _PB whereL:A _→ A _and_B_:B _→ B_{are endofunctors and} P:B _→Aop_{is a left adjoint (see the left-hand square} be-low). Aop δ ❑ ❑ ❑ ❑ ❑ ❑ ! ) ❑ ❑ ❑_❑_❑ ❑ Lop B P o o B Aop _B P o o (Dop₎op (Gop_O)op id ▲ ▲ ▲_▲_▲ ▲ " * ▲ ▲ ▲ ▲ ▲ ▲ D Id o o GO (Dop₎op _D Id o o

Here, L represents the syntax (usually modalities over a propositional base logic embodied by A_{), and} _B _the be-haviour (deﬁning the branching type of coalgebras onB_). The coalgebraic semantics ofF-automata corresponds to the trivial logic shown in the right-hand square. In this sense, F-automata are formally covered by the framework of [14].

While GeneralizedL∗is based on Angluin’sL∗algorithm, the coalgebraic learning algorithm in op. cit. generalizes Maler and Pnueli’s approach, and thus needs to keep obser-vation tables consistent (Remark4.15). To this end, tables are required to satisfy a property calledsharpness, which entails that the existence of extensions of non-closed tables is nontrivial and can only be guaranteed under strong as-sumptions on epimorphisms in the base category (e.g., all epimorphisms must split). Thus, the algorithm is eﬀectively limited to coalgebras inSetand does not apply, e.g., to Σ

-automata inNom; see Appendix. In our GeneralizedL∗_{, no} such assumptions are needed since table extensions always exist (Remark4.6). This makes our algorithm applicable in categories beyondSet, including the ones in Example3.3. GeneralizedL∗ provides a unifying perspective on known learning algorithms for several notions of deterministic au-tomata, including classicalΣ_{-automata (}D ₌ _Set_[₈_]), lin-ear weighted automata (D ₌ _K_-Vec_[₁₂_{]) and nominal} au-tomata (D₌_Nom_[₂₁_,₄₇_{]). For}D₌_{JSL, finite semilattice} automata can be interpreted as nondeterministicfinite au-tomata by means of an equivalence between the category of finite semilattices and a suitable category of finite clo-sure spaces and relational morphisms [3,48]. For any reg-ular languageL, the minimal Σ_-automaton _Min₍_L₎_in _JSL

corresponds under this equivalence to the minimalresidual ﬁnite state automaton (RFSA)[28], a canonical nondetermin-istic acceptor forLwhose states are the join-irreducible ele-ments ofMin(L). Consequently, theNL∗algorithm for learn-ing RFSA due to Bollig et al. [20] is also subsumed by our

categorical setting. We note that althoughNL∗learns a min-imal RFSA, the intermediate hypotheses arising in the algo-rithm are not necessarily RFSA, but general nondeterminis-tic ﬁnite automata. Our categorical perspective provides an explanation of this phenomenon: it shows thatNL∗ implic-itly computes deterministic ﬁnite automata over JSL, and not every such automaton corresponds to an RFSA.

Finally, our algorithm instantiates to new learning algo-rithms for nominal languages with name binding, includ-ing languages of dynamic sequences (Example 3.10), and for sorted languages (Example3.11). A special instance of sorted automata where all transitions are sort-preserving (i.e. Σ_s_,_t ₌ _∅_for_s ,t) appeared in the work of Moerman

[45] on learning product automata.

In each of the above settings, in order to turn General-izedL∗_{into a concrete algorithm, one only needs to provide} a suitable data structure for representing observation tables hs,t by ﬁnite means, and a strategy for choosing the objects

S′andT′in the procedures “Extends” and “Extendt”. We emphasize that these design choices can be non-trivial and depend on the specific structure of the underlying category D_{. The typical approach is to represent the map}_h_s_,_t_:_S_→_T by restricting the objectsSandT to finite sets of generators. For instance, finite-dimensional vector spaces can be repre-sented by their bases (D ₌ _K_{-Vec), finite semilattices by} their join-irreducible elements (D ₌ _{JSL) and orbit-finite} sets by subgroups of finite symmetric groups (D ₌_Nom).

Our above results demonstrate, however, that the core of our learning algorithm is independent from such implemen-tation details; in particular, its correctness and termination, and parts of the complexity analysis, always come for free as instances of the general results in Theorem4.12and Re-mark4.13. In this way, the categorical approach provides a clean separation between generic structures and design choices tailored to a speciﬁc application. This leads to a sim-pliﬁed derivation of learning algorithms in new settings.

5 Learning Monad-Recognizable

Languages

In this section, we investigate languages recognizable by monad algebras and show that the task of learning them can be reduced to learningF-automata.

Notation 5.1. Fix a monadT=(T,µ,η)onDthat preserves quotients (T(E) ⊆ E). We continue to work with the fixed objectsI,O ∈D_{of inputs and outputs (with}_I_{now thought} of as an input alphabet, so not normally the monoidal unit). Finally, we fix a full subcategoryD_f _⊆D_{closed under} sub-objects and quotients, and call the sub-objects ofD_f _the_finite objects ofD_.

Example 5.2. ChooseSetf,Posf,JSLf,K-Vecf andNomf

to be the class of all Noetherian objects (see Example4.14). Our monads of interest model formal languages:

(11)

D _T

Set T+X =X+

Set2 T∞(X,Y)₌(X+,Xup₊X∗Y) Set TΓX =Γ-trees overX

JSL T∗X =free idempotent semiring onX

K-Vec T∗X =freeK-algebra onX

Pos TSX =free stabilization algebra onX

Nom T∗X =X∗

In the second row,Xup

= {vwω : v ∈ X∗,w ∈ X+} denotes the set of ultimately periodic words overX, and in the third row,Γ_{is a ﬁnitary algebraic signature. Finite}

alge-bras for the above seven monads correspond to finite semi-groups, finite Wilke algebras [66], finite Γ_-algebras,

finite-dimensional K-algebras, finite stabilization algebras [26], and orbit-finite nominal monoids [19], respectively. In the present setting, we shall consider the following gen-eralized concept of a language:

Deﬁnition 5.3(Language). Alanguageis a morphism

L:T I →O in D.

It is calledrecognizableif there exists aT-homomorphism e:(T I,µI) → (A,α)into a ﬁniteT-algebra(A,α)and a

mor-phismp:A→OinD_with_L₌_p_·_e. T I e❆❆❆❆ ❆ L / /O A p ? ? ⑧ ⑧ ⑧ ⑧ ⑧

In this case, we say thaterecognizesL(viap).

Remark 5.4. The above deﬁnition generalizes the concepts of the previous sections. Indeed, if F is functor for which the free monadTF (see Section 2) exists, then a language

L:TFI →Oin the sense of Deﬁnition5.3is precisely a

lan-guageL:µFI →O in the sense of Deﬁnition3.8. Moreover,

since the categories ofF-algebras andTF-algebras are

iso-morphic,LisTF-recognizable if and only ifLisregular, i.e.

accepted by some ﬁniteF-automaton.

Example 5.5. Many important automata-theoretic classes of languages can be characterized algebraically as recogniz-able languages for a monad. For the monads of Example5.2

we obtain the following languages: D _T _{T-recognizable languages} Set T₊ regular languages [50]

Set2 T∞ ω-regular languages [49]

Set TΓ tree languages overΓ_[₂₅_]

JSL T∗ regular languages [52]

K-Vec T∗ recognizable weighted languages [54] Pos TS regular cost functions [23]

Nom T∗ monoid-recognizable data languages [19] In the following, we focus on (ω-)regular languages and cost functions; see [58,60] for details on the remaining examples.

(1) For the semigroup monadT₊onSetwe obtain the clas-sical concept of algebraic language recognition: a language L⊆I+_{is recognizable if there exists a semigroup morphism} e:I+_→_S_{into a ﬁnite semigroup}_S_{and a subset}_P _⊆_S_with L=e−1[P]. Recognizable languages are exactly the (ε-free)

regular languages [50]. In fact, the expressive equivalence between Σ_{-automata in}_Set_{and semigroups generalizes to} Σ_{-automata in symmetric monoidal closed categories [}₄_].

(2) Languages of inﬁnite words can be captured alge-braically as follows. A Wilke algebra[66] is a two-sorted set(S+,Sω)with a product·:S+×S+→ S+, a mixed prod-uct·:S+×Sω→Sωand a unary operation(−)ω:S+→Sω

subject to the laws

(st)u=s(tu), (st)z₌s(tz), s(ts)ω ₌(st)ω, (sn)ω₌sω, for alls,t,u ∈ S+,z ∈ Sω andn > 0. The free Wilke al-gebra generated by the two-sorted set(X,Y)isT∞(X,Y)₌

(X+_,_Xup

+X∗Y)with the two products given by

concatena-tion of words, andwω =www. . .forw ∈X+. In particular, choosing the input object(I,∅)for some setI and the out-put objectO =({0,1},{0,1}), we haveT∞(I,∅)₌(I+,Iup), and thus a languageL:T∞(I,∅) →Ospecifies a set of finite or ultimately periodic infinite words. Languages recogniz-able by Wilke algebras correspond toω-regular languages, i.e. languages accepted by Büchi automata [49,66].

(3) Regular cost functions were introduced by Colcom-bet [23] as a quantitative extension of regular languages that provides a unifying framework for studying limited-ness problems. Acost functionover the alphabetIis a func-tionf:I∗→N∪ {∞}. Two cost functionsf andдare iden-tified if, for every subsetA⊆N, the functionf is bounded onAiffд is bounded onA. Regular cost functions corre-spond to languages recognizable by finitestabilization al-gebras. The latter are ordered algebras over the signature

Γ ₌ _{₁_/_0,_·/_2,₍₋₎#_/_1,₍₋₎ω_/₁_}_{, with}_−/_n_{denoting arities,}

subject to suitable inequations; see [26,58]. We letTS

de-note the monad onPosinduced by this ordered algebraic theory.

Our generic approach to learningT-recognizable languages is based on the idea of presenting the free algebra TI =

(T I,µI)and its ﬁnite quotient algebras as automata:

Deﬁnition 5.6(T-reﬁnable). A quotiente:T I ։AinDis

T-reﬁnableif there exists a ﬁnite quotient algebrae′:TI ։

(B,β)ofTIand a morphismf:B ։Awithe =f ·e′.

Deﬁnition 5.7(Automata presentation). Anautomata pre-sentationof the freeT-algebraTIis given by an endofunctor FonD_{and an}_F_{-algebra structure}_δ_:_{FT I} _→_{T I} _{such that} (1) F(E) ⊆ E, the initial algebraµFI exists, and every

reg-ular language L: µFI → O admits a minimal automaton Min(L);

(2) theFI-algebra(T I,[ηI,δ])is reachable (i.e.eT I ∈ E);

(3) a T-reﬁnable quotient e:T I ։ A in D carries a

T-algebra quotient iﬀe carries anF-algebra quotient; that is,

(12)

there existsαAmaking the left-hand square below commute

iﬀ there existsδAmaking the right-hand square commute.

TT I µI // T e T I e T A ∃αA / / ❴ ❴ ❴ A ⇐⇒ FT I δ // F e T I e F A ∃δA / / ❴ ❴ ❴ A

If in (3) only the implication “⇒” is required,(F,δ)is called aweak automata presentation.

Remark 5.8. (1) Examples of functorsFfor which the ﬁrst condition is satisﬁed include all functors satisfying the As-sumptions 3.5, see Remark 3.7(1) and Theorem 3.13, and polynomial functorsF =FΓ onSetorPosfor a signatureΓ.

Recall from Example3.4thatFΓ-automata areΓ_-automata.

(2) Presentations ofT-algebras as (sorted)Σ_{-automata were}

previously studied by Urbat, Adámek, Chen, and Milius [59] for the special case where D _{is a variety of algebras and}

Σ_∈D_{is a free algebra, and called}_{unary presentations.}

Example 5.9. For all monads of Example5.2, free algebras admit an automata presentation (in fact, a presentation as (sorted)Σ_{-automata [}₅₈_–₆₀_{]). Here we consider three cases:}

(1) Semigroups. The free semigroup T₊I = I+ has a Σ

-automata presentationδ:Σ_×_I+_→_I+_{given by the alphabet}

Σ₌_{→_a _: _a_∈_I_{} ∪ {}←_a _: _a _∈_I_}

and the transitions

δ(→a,w)=wa and δ(←a,w)=aw for w ∈I+_,_a_∈_I_. Recall from Example3.11 thatµFI = I ×Σ∗. The unique

homomorphismeI+:I×Σ∗→I+interprets a word inI×Σ∗ as a list of instructions for forming a word inI+_{, e.g.}

eI+(a

→

a→b←b→a) = baaba.

For a weak automata presentation ofI+_{, it suﬃces to take} the restrictionδ′:Σ′_×_I+_→_I+_of_δ_whereΣ′₌_{→_a : _a_∈_I_}_.

(2) Wilke algebras. The free Wilke algebra T∞(I,∅) =

(I+_,_Iup₎_{can be presented as a two-sorted}_Σ_{-automaton with} the sorted alphabetΣ₌₍Σ

+,+,Σ+,ω,Σω,ω,∅)given by Σ +,+={ → a :a∈I} ∪ {←a :a∈I} Σ +,ω={ω} ∪ {→vω:v∈I+} Σ_ω_,_ω₌_{_a ←:a∈I}

and the transitions below, wherev,w ∈I+_,_z_∈_Iup_,_a _∈_I_: δ+,+( → a,w)=wa, δ+,+( ← a,w)=aw, δ+,ω(ω,w)=wω, δ₊,ω( → vω,w)=wvω, δω,ω(a ←,z)=az.

Recall from Example3.11that the initial algebraµFIconsists

of sorted words overΣ_{with an additional ﬁrst letter from}_I_.

The homomorphisme(I+,Iup₎:µF_I → (I+,Iup)views such a word as an instruction for forming a word in(I+_,_Iup₎_{, e.g.}

e(I+,Iup₎(a → b→aωa ←a←) = aa(aba) ω .

To obtain a weak automata presentation, it suﬃces to re-strictΣ

+,+andΣ+,ω to the ﬁnite subalphabetsΣ′₊,+ = {

→

a : a ∈I}andΣ′

+,ω={ω}. AΣ

′_{-automaton is similar to a} fam-ily of DFAs, a concept recently employed by Angluin and Fisman [9] for learningω-regular languages.

(3) Stabilization algebras.Suppose thatTis a monad onSet orPosinduced by a ﬁnitary signatureΓ_{and (in-)equations}_E;

see Section2. ThenTIcan be presented as theΓ_-automaton

δ: FΓ(T I) → T I given by the Γ-algebra structure on the free(Γ,E)-algebraT I. The initial algebraµ(FΓ)I is the

alge-braTΓI ofΓ_{-terms over}_I_{, and the unique homomorphism}

eT I:TΓI ։ T I interprets Γ-terms inT I. In particular, for

the monadT=TSonPos, the free stabilization algebraTSI

admits aΓ_{-automata presentation for the signature}Γ_of

Ex-ample5.5(3).

From now on, we ﬁx a weak automata presentation(F,δ)of the freeT-algebraTI.

Deﬁnition 5.10(Linearization). Thelinearizationof a lan-guageL:T I →Ois given by

lin(L) = (µFI eT I

/

///_{T I} L //_O ₎.

Example 5.11. (1) Semigroups. Take the Σ_-automata

pre-sentation of Example5.9(1). GivenL ⊆ I+_{, the language}

lin(L) ⊆ I × Σ∗ _{consists of all possible ways of}

gener-ating words in L by starting with a letter a ∈ I and adding letters on the left or on the right. For instance, if L contains the word abc, then lin(L) contains the words a→b→c,b←a→c,b→c←a,c

←

b←a.

(2) Wilke algebras. Take the weak presentation of

Exam-ple5.9(2). GivenL⊆ (I+_,_Iup₎_{, the language}_lin₍_L₎_{consists of}

all possible ways of generating words inLby starting with a lettera ∈I and repeatedly applying any of the following operations: (i) right concatenation of a finite word with a letter; (ii) left concatenation of an infinite word with a let-ter; (iii) taking theω-power of a finite word. For instance, if Lcontains the word(ab)ω, thenlin(L)containsa→b ω,b→aωa

←,

a→b ωb

←a←,b →

aωa

←b←a←, . . .. Thus,lin(L)is a two-sorted version of

the languagelasso(L)mentioned in the Introduction. (3) Stabilization algebras.Take the presentation of Exam-ple5.9(3). Given a languageL ⊆ TSI, the setlin(L) ⊆ TΓI

consists of allΓ_{-trees whose interpretation in}_T_S_I _{lies in}_L.

As demonstrated by the above examples, the linearization allows us to identify a languageL:T I → O with a lan-guagelin(L): µFI → O of ﬁnite words or trees. Since the

morphism eT I: µFI ։ T I is assumed to be epic by

Def-inition 5.7(2), this identiﬁcation is unique; that is, lin(L)

uniquely determinesL. In particular, in order to learnL, it is suﬃcient to learnlin(L). This approach is supported by the following result:

(13)

Theorem 5.12. IfL:T I →Ois aT-recognizable language, then its linearizationlin(L):µFI →Ois regular, i.e. accepted

by some ﬁniteF-automaton.

Proof sketch. Lete:TI → (A,α)be aT-homomorphism rec-ognizingLviap:A→O. By replacingewith its coimage, we may assume thate∈ E. The weak automata presentation yields anF-algebra structure onAmakingeanF-algebra ho-momorphism. ThenA, viewed as an automaton with initial statese·ηI:I →Aand ﬁnal statesp, acceptslin(L).

In view of this theorem, one can apply any learning algo-rithm for ﬁniteF-automata (e.g. GeneralizedL∗for the case of adjoint automata, or a learning algorithm for tree au-tomata [30] ifF is a polynomial functor) to learn the mini-mal automatonQLforlin(L). This automaton, together with

the epimorphismeT I, constitutes a ﬁnite representation of

the unknown languageL:T I → O. If the given automata presentation forTIis non-weak, we can go one step further and infer fromQLa minimal algebraic representation ofL:

Deﬁnition 5.13(SyntacticT-algebra). LetL:T I → O be recognizable. A syntactic T-algebra for L is a quotient T-algebraeL:TI ։Syn(L)ofTIsuch that (1)eLrecognizesL,

and (2)eLfactorizes through every ﬁnite quotientT-algebra

e:TI։(A,α)recognizingL. TI e //// eL▲▲▲%%%% ▲ ▲ ▲ ▲ (A,α) ✤ ✤ Syn(L)

Theorem 5.14. Let (F,δ)be an automata presentation for TI. Then everyT-recognizable languageL:T I →Ohas a syn-tacticT-algebraSyn(L), and its correspondingF-automaton (via the given presentation) is the minimal automaton for

lin(L):

Syn(L) Min(lin(L)).

This theorem asserts that we can uniquely equip the learned minimalF-automatonQL = Min(lin(L))with aT-algebra

structureαL:T QL→QLfor which the unique automata

ho-momorphismeL:T I ։ QL is aT-algebra homomorphism

eL:TI ։(QL,αL). TheneLis the syntactic algebra forL.

Remark 5.15. To make the construction ofSyn(L) from the learned automatonQLeﬀective, we need to assume that

the morphisms eQL,eT I,T eQL,T eT I and µI can be repre-sented as (sorted families of) computable maps and more-over the mapseT I andT eQL admit computable (not neces-sarily morphic) right inversesmandn, respectively. Then theT-algebra structureαL ofSyn(L)can be represented as

the computable mapeQL·m·µI·T eT I·n; see the commutative diagram below.

T(µFI) T eT I

/

///

T e_QL▼▼▼&&&& ▼ ▼ ▼ TT I T eL µI / /_{T I} eL µFI eT I o o o o e_QL z z z z ✉✉✉✉ ✉✉ T QL _α L //QL

Example 5.16. This computation strategy works for all monads of Example5.2. We consider our running examples: (1) Semigroups.For the Σ_{-automata presentation of}

Exam-ple5.9(1)andL⊆I+_{, we compute the semigroup structure}

•:QL ×QL → QL onQL from its automaton structure as

follows. Givenq,q′ ∈ QL choose words w,w′ ∈ I ×Σ∗

witheQL(w)=q,eQL(w ′₎

=q′, i.e. witnesses for the

reach-ability ofq andq′. Next, choosev ∈ I ×Σ∗ _with_e_I+(v) ₌ eI+(w)e_I+(w′) ∈I+, and putq•q′:₌e_Q_L(v).

(2) Wilke algebras.Analogous to the case of semigroups. (3) Cost functions.For a monadTonSetorPosgiven by a signature Γ _{and (in-)equations} _E _{and the}Γ_-automata

pre-sentation of TI in Example5.9(3), the computation ofαL

is trivial: the structure of theΓ_-algebra_Syn₍_L₎ _{is just the}

automaton structure of QL. In particular, this applies to

the monadTS on Posrepresenting cost functions

(Exam-ple5.2(3)). Thus, we obtain the ﬁrst learning algorithm for this class of languages.

6 Conclusions and Future Work

We have presented a generic algorithm (GeneralizedL∗) for learningF-automata that forms a uniform abstraction ofL∗ -type algorithms, their correctness proofs, and parts of their complexity analysis, and instantiates to several new learn-ing algorithms, e.g. for various notions of nominal automata with name binding. Moreover, we have shown how to ex-tend the scope of GeneralizedL∗, and other learning algo-rithms forF-automata, to languages recognizable by monad algebras. This gives rise to a generic approach to learning numerous types of languages, including cases for which no learning algorithms are known (e.g. cost functions).

The next step is to turn our high-level categorical ap-proach into an implementation-level algorithm, parametric in the monadTand its automata presentation, with corre-sponding tool support. We expect that the recent work on coalgebraic minimization algorithms and their implementa-tion [27,29] can provide guidance. It should be illuminating to experimentally compare the performance of the generic algorithm with tailor-made algorithms for speciﬁc types of automata.

Our generalizedL∗ algorithm is concerned with adjoint F-automata and applies to a wide variety of automata on ﬁ-nite words (including weighted, residual nondeterministic, and nominal automata), but presently not to tree automata. To deal with the latter, the adjointness of the type functorF needs to be relaxed, which entails that a coalgebraic seman-tics is no longer directly available. A categorical approach to learning tree automata, assuming a purely algebraic point of view, was recently proposed by van Heerdt et al [62]. The subtle interplay between the algebraic and coalgebraic as-pects underlying learning algorithms is up for further in-vestigation.