The complexity and generality of learning answer set programs


Contents lists available at ScienceDirect

Artificial Intelligence

www.elsevier.com/locate/artint


Mark Law*, Alessandra Russo, Krysia Broda

Department of Computing, Imperial College London, SW7 2AZ, United Kingdom

Article info

Article history:
Received 2 September 2016
Received in revised form 20 February 2018
Accepted 15 March 2018
Available online 21 March 2018

Keywords:
Non-monotonic logic-based learning
Answer Set Programming
Complexity of non-monotonic learning

Abstract

Traditionally most of the work in the field of Inductive Logic Programming (ILP) has addressed the problem of learning Prolog programs. On the other hand, Answer Set Programming is increasingly being used as a powerful language for knowledge representation and reasoning, and is also gaining increasing attention in industry. Consequently, the research activity in ILP has widened to the area of Answer Set Programming, witnessing the proposal of several new learning frameworks that have extended ILP to learning answer set programs. In this paper, we investigate the theoretical properties of these existing frameworks for learning programs under the answer set semantics. Specifically, we present a detailed analysis of the computational complexity of each of these frameworks with respect to the two decision problems of deciding whether a hypothesis is a solution of a learning task and deciding whether a learning task has any solutions. We introduce a new notion of generality of a learning framework, which enables us to define a framework to be more general than another in terms of being able to distinguish one ASP hypothesis solution from a set of incorrect ASP programs. Based on this notion, we formally prove a generality relation over the set of existing frameworks for learning programs under answer set semantics. In particular, we show that our recently proposed framework, Context-dependent Learning from Ordered Answer Sets, is more general than brave induction, induction of stable models, and cautious induction, and maintains the same complexity as cautious induction, which has the highest complexity of these frameworks.

1. Introduction

Over the last two decades there has been a growing interest in Inductive Logic Programming (ILP) [1], where the goal is to learn a logic program called a hypothesis, which together with a given background knowledge base, explains a set of examples. The main advantage that ILP has over traditional statistical machine learning approaches is that the learned hypotheses can be easily expressed in plain English and explained to a human user, so facilitating a closer interaction between humans and machines. Traditional ILP frameworks have focused on learning definite logic programs [1–6] and normal logic programs [7,8]. On the other hand, Answer Set Programming (ASP) [9] is a powerful language for knowledge representation and reasoning. ASP is closely related to other declarative paradigms such as SAT, SMT and Constraint Programming, which have each been used for inductive reasoning [10–12]. Compared with these other paradigms, due to its non-monotonicity, ASP is particularly suited for common-sense reasoning [13–15]. Because of its expressiveness and efficient solving, ASP is also increasingly gaining attention in industry [16]; for example, in decision support systems [17], in e-tourism [18] and in product configuration [19]. Consequently, the scope of ILP has recently been extended to learning answer set programs from examples of partial solutions of a given problem, with the intention being to provide algorithms that support automated learning of complex declarative knowledge. Learning ASP programs allows us to learn a variety of declarative non-monotonic, common-sense theories, including for instance Event Calculus [20] theories [21] and theories for scheduling problems and agents' preference models, both from real user data [22] and from synthetic data [23,24].

* Corresponding author at: Department of Computing, Huxley Building, 180 Queen's Gate, Imperial College London, London, SW7 2AZ, United Kingdom. E-mail address: [email protected] (M. Law).

https://doi.org/10.1016/j.artint.2018.03.005

Learning ASP programs has several advantages when compared to learning Prolog programs. Firstly, when learning Prolog programs, the goal-directed SLDNF procedure of Prolog must be taken into account. Specifically, when learning programs with negation, it must be ensured that the programs are stratified, or otherwise the learned program may loop under certain queries. As ASP is declarative, no such consideration need be taken into account when learning ASP programs. A second, more fundamental advantage of learning ASP programs is that the theory learned can be expressed using extra types of rules that are not available in Prolog, such as choice rules and weak constraints. Learning choice rules allows us to learn non-deterministic concepts; for instance, we may learn that a coin may non-deterministically land on either heads or tails, but never both. This could be achieved by learning the simple choice rule 1{heads, tails}1. Learning choice rules is different from probabilistic ILP settings such as [25–27] where, in similar coin problems, the focus would be on learning the probabilities of the two outcomes of a coin. Learning weak constraints enables a natural extension of ILP to preference learning [23], which has proved effective in problem domains such as learning preference models for scheduling [23] and for urban mobility [24].

Several algorithms aimed at learning under the answer set semantics, and different frameworks for learning ASP programs, have recently been introduced in the literature. [28] presented the notions of brave induction ($ILP_b$) and cautious induction ($ILP_c$), based respectively on the well-established notions of entailment under the answer set semantics [13,29] of brave entailment (when an atom is true in at least one answer set) and cautious entailment (when an atom is true in all answer sets). In brave induction, at least one answer set must cover the examples, whereas in cautious induction, every answer set must cover the examples. Brave induction is actually a special case of an earlier learning framework, called induction of stable models ($ILP_{sm}$) [30], in which examples are partial interpretations. A hypothesis is a solution of an induction of stable models task if, for each of the example partial interpretations, there is an answer set of the hypothesis combined with the background knowledge that covers that partial interpretation. Brave induction is equivalent to induction of stable models with exactly one (partial interpretation) example.

Each of the above frameworks for learning ASP programs is unable to learn some types of ASP programs [31]; for example, brave induction alone cannot learn programs containing hard constraints. In [31], we presented a learning framework, called Learning from Answer Sets ($ILP_{LAS}$), which unifies brave and cautious induction and is able to learn ASP programs containing normal rules, choice rules and hard constraints. In spite of the increased expressivity, none of the above approaches can learn weak constraints, which are able to capture preference learning. Informally, learning weak constraints consists of identifying conditions for ordering answer sets. The learning task in this case would require examples of orderings over partial interpretations. To tackle this aspect of learning ASP programs, we have extended the Learning from Answer Sets framework to Learning from Ordered Answer Sets ($ILP_{LOAS}$) [23] and demonstrated that our algorithm¹ is able to learn preferences in a scheduling domain. More recently, we have extended the $ILP_{LOAS}$ framework to $ILP^{context}_{LOAS}$, with context-dependent examples, which come together with extra contextual information [24].

In this paper, we explore both the expressive power and the computational complexity of each framework. The former is important, as it allows us to identify the class of problems that each framework can solve, whereas the latter gives an indication of the price paid for using each framework. We characterise the expressive power of a framework in terms of new notions called one-to-one-distinguishability, one-to-many-distinguishability and many-to-many-distinguishability. The intuition of one-to-one-distinguishability is that, given some fixed background knowledge $B$ and sufficient examples, the framework should be able to distinguish a target hypothesis $H_1$ from another, unwanted, hypothesis $H_2$. This means that there should be at least one task $T$ (of the given framework) with background knowledge $B$, such that $H_1$ is a solution of $T$, and $H_2$ is not. We characterise the one-to-one-distinguishability class of a framework $\mathcal{F}$ (written $\mathcal{D}^1_1(\mathcal{F})$) as the set of tuples $\langle B, H_1, H_2 \rangle$ for such $B$'s, $H_1$'s and $H_2$'s, and state that a framework $\mathcal{F}_1$ is more $\mathcal{D}^1_1$-general than another framework $\mathcal{F}_2$ if $\mathcal{F}_2$'s one-to-one-distinguishability class is a strict subset of $\mathcal{F}_1$'s one-to-one-distinguishability class.

One-to-many-distinguishability relates to the task of finding a single target hypothesis from within a set of possible hypotheses. It upgrades the notion of one-to-one-distinguishability classes to one-to-many-distinguishability classes. These are tuples of the form $\langle B, H, S \rangle$ for which a framework has at least one task that includes $H$ and none of the (unwanted) hypotheses in $S$ as an inductive solution. Many-to-many-distinguishability upgrades this notion to many-to-many-distinguishability classes. These contain tuples of the form $\langle B, S_1, S_2 \rangle$, where $S_1$ is a set of target hypotheses, for which a framework must have a task that accepts each hypothesis in $S_1$ and no hypothesis in $S_2$ as an inductive solution. We show that, under these three measures, $ILP^{context}_{LOAS}$ is more general than $ILP_{LOAS}$, which is more general than $ILP_{LAS}$. We also show that $ILP_{LAS}$ is more general than both $ILP_{sm}$ and $ILP_c$. Although $ILP_{sm}$ is equally $\mathcal{D}^1_1$-general to $ILP_b$, we show that $ILP_{sm}$ is more general than $ILP_b$ under the one-to-many and many-to-many generality measures.

¹ Our ILASP system for solving ILP


Despite the different generalities of $ILP_c$, $ILP_{LAS}$, $ILP_{LOAS}$ and $ILP^{context}_{LOAS}$, we show that the computational complexity of all four frameworks is the same, both for the decision problem of verifying that a given hypothesis is a solution of a given learning task, and for the problem of deciding whether a given learning task has any solutions. Similarly, we also show that $ILP_{sm}$ and $ILP_b$ have the same computational complexities for both decision problems, despite the former being more general than the latter under two of our generality measures.

We begin, in Section 2, by reviewing the background material necessary for the rest of the paper. In Section 3 we recall the definitions of each of the learning frameworks and in Sections 4 and 5 we prove the complexities and generalities (respectively) of each learning framework. We conclude the paper with a discussion of the related and future work.

2. Background

2.1. Answer Set Programming

In this section we introduce the concepts needed in the paper. Given any atoms h, h_1, ..., h_k, b_1, ..., b_n, c_1, ..., c_m, a rule h :- b_1, ..., b_n, not c_1, ..., not c_m is called a normal rule, with h as the head and b_1, ..., b_n, not c_1, ..., not c_m (collectively) as the body ("not" represents negation as failure); a rule :- b_1, ..., b_n, not c_1, ..., not c_m, with an empty head, is a hard constraint; a choice rule is a rule l{h_1, ..., h_k}u :- b_1, ..., b_n, not c_1, ..., not c_m (where l and u are integers) and its head is called an aggregate. A rule $R$ is safe if each variable in $R$ occurs in at least one positive literal in the body of $R$. In this paper we will use $ASP^{ch}$ to denote the set of choice programs $P$, which are programs composed of safe normal rules, choice rules, and hard constraints. Given a rule $R$, we will write $head(R)$ to denote the head of $R$, $body(R)$ to denote the body of $R$ and $body^+(R)$ (resp. $body^-(R)$) to denote the atoms that occur positively (resp. negatively) in the body of $R$. Given a program $P$, we will also write $Atoms(P)$ to denote the atoms in $P$. We will also extend this notation to fragments of a program.

The Herbrand Base of any program $P \in ASP^{ch}$, denoted $HB_P$, is the set of variable-free (ground) atoms that can be formed from predicates and constants in $P$. The subsets of $HB_P$ are called the (Herbrand) interpretations of $P$. A ground aggregate l{h_1, ..., h_k}u is satisfied by an interpretation $I$ iff $l \leq |I \cap \{h_1, \ldots, h_k\}| \leq u$.

As we restrict our programs to sets of normal rules, (hard) constraints and choice rules, we can use the simplified definitions of the reduct for choice rules presented in [33]. Given a program $P$ and a Herbrand interpretation $I \subseteq HB_P$, the reduct $P^I$ is constructed from $ground(P)$ (the set of ground instances of rules in $P$) in 4 steps: firstly, remove rules whose bodies contain the negation of an atom in $I$; secondly, remove all negative literals from the remaining rules; thirdly, replace the head of any hard constraint, or any choice rule whose head is not satisfied by $I$, with $\bot$ (where $\bot \notin HB_P$); and finally, replace any remaining choice rule l{h_1, ..., h_m}u :- b_1, ..., b_n with the set of rules $\{h_i \texttt{ :- } b_1, \ldots, b_n \mid h_i \in I \cap \{h_1, \ldots, h_m\}\}$. Any $I \subseteq HB_P$ is an answer set of $P$ if it is the minimal model of the reduct $P^I$. Throughout the paper we denote the set of answer sets of a program $P$ with $AS(P)$.
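The four-step reduct construction can be illustrated with a small brute-force sketch in Python. This is only an illustration under stated assumptions, not the solver used in the paper: the tuple encoding of ground rules, the `BOTTOM` marker and the function names are hypothetical choices made here, and the enumeration of every interpretation is exponential in the size of the Herbrand base.

```python
from itertools import combinations

BOTTOM = "#false"  # stands for the symbol ⊥, assumed never to be in the Herbrand base

# Hypothetical encoding of a ground rule as (head, pos_body, neg_body), where head is
#   ("atom", a)                 for a normal rule,
#   ("constraint",)             for a hard constraint,
#   ("choice", lb, atoms, ub)   for a choice rule lb{atoms}ub.

def powerset(atoms):
    """All subsets of atoms, smallest first, as Python sets."""
    items = sorted(atoms)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def reduct(program, interp):
    """Build the reduct as a list of definite rules (head_atom, body)."""
    definite = []
    for head, pos, neg in program:
        if set(neg) & interp:                 # step 1: a negated atom is in interp
            continue
        body = frozenset(pos)                 # step 2: negative literals are dropped
        if head[0] == "constraint":           # step 3: constraint head becomes ⊥
            definite.append((BOTTOM, body))
        elif head[0] == "choice":
            _, lb, atoms, ub = head
            chosen = interp & set(atoms)
            if lb <= len(chosen) <= ub:       # step 4: one rule per chosen head atom
                definite.extend((a, body) for a in chosen)
            else:                             # step 3: unsatisfied aggregate becomes ⊥
                definite.append((BOTTOM, body))
        else:
            definite.append((head[1], body))
    return definite

def least_model(definite):
    """Least model of a definite program, computed by naive fixpoint iteration."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in definite:
            if body <= model and head not in model:
                model.add(head)
                changed = True
    return model

def answer_sets(program, herbrand_base):
    """Interpretations that equal the least model of their own reduct."""
    return [i for i in powerset(herbrand_base)
            if least_model(reduct(program, i)) == i]

# The coin choice rule 1{heads, tails}1 has the two answer sets {heads} and {tails}.
coin = [(("choice", 1, ["heads", "tails"], 1), [], [])]
print(answer_sets(coin, {"heads", "tails"}))
```

Deriving ⊥ always makes the least model differ from the candidate interpretation, which is how violated constraints and unsatisfied aggregates rule a candidate out.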

We say a program $P$ bravely entails an atom $a$ (written $P \models_b a$) if there is at least one answer set $A$ of $P$ such that $a \in A$. Similarly, $P$ cautiously entails $a$ (written $P \models_c a$) if for every answer set $A$ of $P$, $a \in A$.
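As a minimal sketch of the two entailment relations, the Python functions below take the answer sets of a program as an already computed list of sets; the function names are illustrative and not taken from the paper. Note that, read literally, cautious entailment holds vacuously for a program with no answer sets.

```python
def bravely_entails(answer_sets, atom):
    """True iff the atom is in at least one answer set."""
    return any(atom in a for a in answer_sets)

def cautiously_entails(answer_sets, atom):
    """True iff the atom is in every answer set (vacuously true if there are none)."""
    return all(atom in a for a in answer_sets)

# Answer sets of the coin rule 1{heads, tails}1, written out by hand.
coin_answer_sets = [{"heads"}, {"tails"}]
print(bravely_entails(coin_answer_sets, "heads"))     # heads holds in some answer set
print(cautiously_entails(coin_answer_sets, "heads"))  # but not in all of them
```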

Unlike hard constraints in ASP, weak constraints do not affect what is, or is not, an answer set of a program $P$. Hence the above definitions also apply to programs with weak constraints. Weak constraints create an ordering over $AS(P)$ specifying which answer sets are "preferred" to others. A weak constraint is of the form :~ b_1, ..., b_n, not c_1, ..., not c_m. [w@l, t_1, ..., t_k] where b_1, ..., b_n, c_1, ..., c_m are atoms, w and l are terms specifying the weight and the level, and t_1, ..., t_k are terms. A weak constraint $W$ is safe if every variable in $W$ occurs in at least one positive literal in the body of $W$. At each priority level l, the aim is to discard any answer set which does not minimise the sum of the weights of the ground weak constraints with level l whose bodies are true. The higher levels are minimised first. The terms t_1, ..., t_k specify which ground weak constraints should be considered unique [34].

For any program $P$ and an interpretation $A$, $weak(P, A)$ is the set of tuples $(w, l, t_1, \ldots, t_k)$ for which there is some :~ b_1, ..., b_n, not c_1, ..., not c_m. [w@l, t_1, ..., t_k] in $ground(P)$ such that $A$ satisfies b_1, ..., b_n, not c_1, ..., not c_m.

For each level $l$, the score of the interpretation $A$ is the sum of the weights of tuples with level $l$; formally, $P^l_A = \sum_{(w, l, t_1, \ldots, t_k) \in weak(P, A)} w$. For $A_1, A_2 \in AS(P)$, $A_1$ dominates $A_2$ (written $A_1 \succ_P A_2$) iff $\exists l$ such that $P^l_{A_1} < P^l_{A_2}$ and $\forall m > l$, $P^m_{A_1} = P^m_{A_2}$. An answer set $A \in AS(P)$ is optimal if it is not dominated by any $A_2 \in AS(P)$.

Example 1. Let $P$ be the program {0{p(1), p(2), p(3)}3.}. $P$ has 8 answer sets, which are the various combinations of making each of the three p atoms true or false. Consider the two weak constraints :~ p(X). [1@1] and :~ p(X). [1@1, X]. The first weak constraint states that if any of the p atoms is true then a penalty of one must be paid. This penalty is only paid once, regardless of whether 1, 2 or 3 of the p atoms are true. Conversely, the second weak constraint says that a penalty of 1 must be paid for each of the p atoms that is true. In both cases, $\emptyset$ is the only optimal answer set; however, in the first case, none of the remaining answer sets dominate each other, whereas in the second case, the answer sets with only one p atom dominate those with 2 p atoms, which in turn each dominate the single answer set with 3 p atoms.
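The two cases of Example 1 can be replayed with a short Python sketch. It is hedged in two ways: the eight answer sets are written out directly as the subsets of {p(1), p(2), p(3)} rather than computed by a solver, and the helper names (`weak`, `score`, `dominates`, `optimal`) are illustrative. The key point is that `weak` returns a set of tuples, so the constraint without the term X contributes the single tuple (1, 1) however many p atoms are true.

```python
from itertools import combinations

ATOMS = (1, 2, 3)  # the integer i stands for the ground atom p(i)
ANSWER_SETS = [frozenset(c) for r in range(4) for c in combinations(ATOMS, r)]

def weak(interp, with_term):
    """Tuples (w, l, t...) of the ground weak constraints whose body p(x) holds.

    with_term=False models  :~ p(X). [1@1]     (all instances share the tuple (1, 1))
    with_term=True  models  :~ p(X). [1@1, X]  (one tuple (1, 1, x) per true p(x))
    """
    return {(1, 1, x) if with_term else (1, 1) for x in interp}

def score(tuples, level):
    """Sum of the weights of the tuples at the given level."""
    return sum(t[0] for t in tuples if t[1] == level)

def dominates(t1, t2):
    """t1 scores lower at some level and equally at every higher level."""
    levels = {t[1] for t in t1 | t2}
    return any(
        score(t1, l) < score(t2, l)
        and all(score(t1, m) == score(t2, m) for m in levels if m > l)
        for l in levels
    )

def optimal(with_term):
    """Answer sets that are not dominated by any other answer set."""
    return [a for a in ANSWER_SETS
            if not any(dominates(weak(b, with_term), weak(a, with_term))
                       for b in ANSWER_SETS)]

print(optimal(False), optimal(True))  # the empty answer set is optimal in both cases
print(dominates(weak({1}, False), weak({1, 2}, False)))  # equal penalty, no dominance
print(dominates(weak({1}, True), weak({1, 2}, True)))    # one penalty beats two
```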

Note that the definition of weak constraints used in this paper is in line with the recent ASP standard established in [34]. The syntax of some previous definitions of weak constraints, such as [13], does not include the terms t_1, ..., t_k and considers every ground instance of every weak constraint individually. This semantics can be achieved using the notion of weak constraints in [34]. Any weak constraint :~ body. [w : l] can be mapped to the weak constraint :~ body. [w@l, V_1, ..., V_n], where V_1, ..., V_n is the set of all variables that occur in body. If there are multiple weak constraints, to exactly preserve the semantics of [13], a unique term must be added to each weak constraint. For example, {:~ p(X). [1 : 1]; :~ q(X). [1 : 1]} would become {:~ p(X). [1@1, X, 1]; :~ q(X). [1@1, X, 2]}. With this additional term, $weak(P, \{p(a), q(a)\})$ (where $P$ is the program containing the two weak constraints) would be equal to $\{(1, 1, a, 1), (1, 1, a, 2)\}$, leading to a score of 2 at level 1; without the additional term, $weak(P, \{p(a), q(a)\})$ would equal $\{(1, 1, a)\}$, leading to a score of 1 at level 1.

Unless otherwise stated, when we refer to an ASP program in this paper, we mean a program consisting of a finite set of normal rules, choice rules, hard and weak constraints.

We now introduce some extra notation which will be useful in later sections. Given a set of interpretations $S$, the set $ord(P, S)$ captures the ordering of the interpretations given by the weak constraints in $P$. It generalises the dominates relation; so it not only includes $\langle A_1, A_2, < \rangle$ if $A_1 \succ_P A_2$, but it also includes tuples for other binary comparison operators. Formally:

$\langle A_1, A_2, < \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_1 \succ_P A_2$;
$\langle A_1, A_2, > \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_2 \succ_P A_1$;
$\langle A_1, A_2, \leq \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_2 \nsucc_P A_1$;
$\langle A_1, A_2, \geq \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_1 \nsucc_P A_2$;
$\langle A_1, A_2, = \rangle \in ord(P, S)$ if $A_1, A_2 \in S$, $A_1 \nsucc_P A_2$ and $A_2 \nsucc_P A_1$;
$\langle A_1, A_2, \neq \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_1 \succ_P A_2$ or $A_2 \succ_P A_1$.

Given an ASP program $P$, we write $ord(P)$ as a shorthand for $ord(P, AS(P))$. Two ASP programs $P$ and $Q$ are strongly equivalent (written $P \equiv_s Q$) if for every ASP program $R$, $AS(P \cup R) = AS(Q \cup R)$.
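A minimal Python sketch of how the ord relation can be built from a dominance test follows. It rests on an assumed reading of the six operators: "<" when the first interpretation dominates the second, ">" for the converse, "<=" and ">=" when the corresponding strict dominance fails, "=" when neither dominates, and "!=" when one of them does. The `dominates` argument is any two-place predicate supplied by the caller, and `ord_relation` is an illustrative name.

```python
def ord_relation(interps, dominates):
    """Set of triples (a1, a2, op) over interps, given a dominance predicate."""
    rel = set()
    for a1 in interps:
        for a2 in interps:
            d12, d21 = dominates(a1, a2), dominates(a2, a1)
            if d12:
                rel.add((a1, a2, "<"))       # a1 is preferred to a2
            if d21:
                rel.add((a1, a2, ">"))       # a2 is preferred to a1
            if not d21:
                rel.add((a1, a2, "<="))      # a2 is not preferred to a1
            if not d12:
                rel.add((a1, a2, ">="))      # a1 is not preferred to a2
            if not d12 and not d21:
                rel.add((a1, a2, "="))       # neither dominates the other
            if d12 or d21:
                rel.add((a1, a2, "!="))      # one of them dominates
    return rel

# Toy dominance used only for illustration: fewer atoms means a lower penalty.
empty, single = frozenset(), frozenset({"p"})
rel = ord_relation([empty, single], lambda a, b: len(a) < len(b))
print((empty, single, "<") in rel, (empty, single, "=") in rel)
```

Interpretations are represented as frozensets here so that the triples can be stored in a Python set.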

We now recall the splitting set theorem from [35], which we use in the proofs throughout the paper. This theorem relies on the notions of a splitting set and the partial evaluation of a logic program. Given a program $P$, a set $U \subseteq HB_P$ is a splitting set of $P$ if and only if for every rule $R \in ground(P)$ such that $Atoms(head(R)) \cap U \neq \emptyset$, $Atoms(R) \subseteq U$. Given a ground rule $R$ and a set of atoms $U$, we write $R \setminus U$ to denote the rule $R$ with all (positive or negative) occurrences of atoms in $U$ removed from the body of $R$. Given a program $P$, a splitting set $U$ of $P$ and a set $X \subseteq U$, the partial evaluation of $P$ with respect to $U$ and $X$, written $e_U(P, X)$, is the program $\{R \setminus U \mid R \in ground(P),\ Atoms(head(R)) \cap U = \emptyset,\ (body^+(R) \cap U) \subseteq X,\ body^-(R) \cap X = \emptyset\}$.

Theorem 1. Given any ground ASP program $P$, and splitting set $U$ of $P$, $AS(P) = \{X \cup Y \mid X \in AS(\{R \in P \mid Atoms(head(R)) \cap U \neq \emptyset\}),\ Y \in AS(e_U(P, X))\}$.

The intuition behind the splitting set theorem is that if a set of atoms $U$ is known to split the program $P$, then we can find the answer sets of the subprogram that defines the atoms in $U$ first. For each of these answer sets $X$, we can partially evaluate $P$ using $X$ and solve this partially evaluated program for answer sets. The splitting set theorem then guarantees that for each answer set $Y$ of the partially evaluated program, $X \cup Y$ is an answer set of $P$. Furthermore, every answer set of $P$ can be constructed in this way.

2.2. Complexity theory

We assume the reader is familiar with the fundamental concepts of complexity, such as Turing machines and reductions; for a detailed explanation, see [36].

Many of the decision problems for ASP are known to be complete for classes in the polynomial hierarchy [37]. The classes of the polynomial hierarchy are defined as follows: $P$ is the class of all problems which can be solved in polynomial time by a Deterministic Turing Machine (DTM); $\Delta^P_0 = \Sigma^P_0 = \Pi^P_0 = P$; $\Delta^P_{k+1} = P^{\Sigma^P_k}$ is the class of all problems which can be solved by a DTM in polynomial time with a $\Sigma^P_k$ oracle; $\Sigma^P_{k+1} = NP^{\Sigma^P_k}$ is the class of all problems which can be solved by a non-deterministic Turing Machine in polynomial time with a $\Sigma^P_k$ oracle; finally, $\Pi^P_{k+1} = \text{co-}NP^{\Sigma^P_k}$ is the class of all problems whose complement can be solved by a non-deterministic Turing Machine in polynomial time with a $\Sigma^P_k$ oracle. $\Sigma^P_1$ and $\Pi^P_1$ are $NP$ and co-$NP$ (respectively), where $NP$ is the class of problems which can be solved by a non-deterministic Turing machine in polynomial time and co-$NP$ is the class of problems whose complement is an $NP$ problem. $DP$ is the class of problems $D$ that can be mapped to a pair of problems $D_1$ and $D_2$ such that $D_1 \in NP$, $D_2 \in \text{co-}NP$, and for each instance $I$ of $D$, $I$ answers "yes" if and only if both of the mapped instances $I_1$ and $I_2$ (of $D_1$ and $D_2$, respectively) answer "yes". It is well known [36] that the following inclusions hold: $P \subseteq NP \subseteq DP \subseteq \Delta^P_2 \subseteq \Sigma^P_2$ and $P \subseteq \text{co-}NP \subseteq DP \subseteq \Delta^P_2 \subseteq \Pi^P_2$.

3. Learning frameworks

In this section, we give the definitions of the six learning frameworks we analyse in this paper. The first three (brave induction, cautious induction and induction of stable models) are not our own. We reformulate, but preserve the meaning of, the original definitions for easier comparison with our own.

It is common in ILP for a task to have a hypothesis space (the set of all rules which can appear in hypotheses). The purpose of the hypothesis space is two-fold: firstly, it allows the task to be restricted to those solutions which are in some way interesting; secondly, it aids the computational search for inductive solutions. Tasks for brave and cautious induction and for induction of stable models were originally presented with no hypothesis space [28,30] as they were mainly considered theoretically, without the specification of efficient algorithmic computations. The only publicly available algorithms for brave induction [38,39] make use of a hypothesis space defined by mode declarations [40]. In this paper, we "upgrade" each of brave induction, cautious induction and induction of stable models with a hypothesis space $S_M$.

3.1. Notation and terminology

An ILP learning framework $\mathcal{F}$ defines what a learning task of $\mathcal{F}$ is and what an inductive solution is for a given learning task of $\mathcal{F}$. For each framework a task is a tuple $\langle B, S_M, E \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is a set of ASP rules called the hypothesis space, and $E$ is a tuple called the examples. The structure of $E$ depends on the type of ILP framework. Each of the papers [28], [30], [31] and [23] presented learning frameworks with different languages for $B$ and $S_M$; for example, induction of stable models was presented only for normal logic programs. It would be unfair to say that induction of stable models is not general enough to learn programs with choice rules, simply because they were not considered in the original paper (in fact, induction of stable models is general enough to learn some programs with choice rules). For a fair comparison we therefore assume in this paper that every learning framework has a background knowledge $B$ and hypothesis space $S_M$ that consist of normal rules, choice rules, hard constraints and weak constraints.

Given a framework $\mathcal{F}$ and a learning task $T_{\mathcal{F}} = \langle B, S_M, E \rangle$ of $\mathcal{F}$, a hypothesis is any subset of the hypothesis space $S_M$. In Section 5, we consider tasks with unrestricted hypothesis spaces (written $\langle B, E \rangle$), in which case any ASP program can be called a hypothesis. An inductive solution is a hypothesis that, together with the background knowledge $B$, satisfies some conditions on $E$ (given by the particular learning framework $\mathcal{F}$). We write $ILP_{\mathcal{F}}(T_{\mathcal{F}})$ to denote the set of all inductive solutions of $T_{\mathcal{F}}$. Throughout the paper, we use the term covers to apply to any kind of example: i.e. given an $\mathcal{F}$ task $\langle B, S_M, E \rangle$, we say that a hypothesis $H$ covers an example $e$ (any element of any component of $E$) if it meets the particular conditions that the framework $\mathcal{F}$ puts on $H$ and $e$.

3.2. Framework definitions

Brave induction ($ILP_b$), first presented in [28], defines an inductive task in which all examples are ground atoms that should be covered in at least one answer set, i.e. entailed under brave entailment in ASP. The original definition did not consider atoms which should not be present in an answer set, namely negative examples. The two publicly available algorithms that realise brave induction, on the other hand, do allow for negative examples. We therefore upgrade the definition in this paper to allow negative examples³ as follows.

Definition 1. A brave induction ($ILP_b$) task $T_b$ is a tuple $\langle B, S_M, \langle E^+, E^- \rangle \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E^+$ and $E^-$ are sets of ground atoms called the positive and negative examples (respectively). A hypothesis $H \subseteq S_M$ is said to be an inductive solution of $T_b$ (written $H \in ILP_b(T_b)$) if and only if $\exists A \in AS(B \cup H)$ such that $E^+ \subseteq A$ and $E^- \cap A = \emptyset$.

Cautious induction ($ILP_c$) was also first presented in [28]. It defines an inductive task in which all of the examples should be covered in every answer set (i.e. entailed under cautious entailment in ASP) and in which $B \cup H$ should be satisfiable (have at least one answer set). Similarly to brave induction, the original definition did not consider negative examples, but in Definition 2 we upgrade the framework to include negative examples.

Definition 2. A cautious induction ($ILP_c$) task $T_c$ is a tuple $\langle B, S_M, \langle E^+, E^- \rangle \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E^+$ and $E^-$ are sets of ground atoms called the positive and negative examples (respectively). A hypothesis $H \subseteq S_M$ is said to be an inductive solution of $T_c$ (written $H \in ILP_c(T_c)$) if and only if $AS(B \cup H) \neq \emptyset$ and $\forall A \in AS(B \cup H)$, $E^+ \subseteq A$ and $E^- \cap A = \emptyset$.
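The acceptance conditions of brave and cautious induction can be sketched in Python as checks over the answer sets of $B \cup H$. The sketch makes two simplifying assumptions: the answer sets are supplied as an already computed list of sets, and the membership test $H \subseteq S_M$ is left out. The function names are illustrative.

```python
def covers(answer_set, e_pos, e_neg):
    """All positive atoms are present and no negative atom is."""
    return e_pos <= answer_set and not (e_neg & answer_set)

def is_brave_solution(answer_sets, e_pos, e_neg):
    """Brave induction: at least one answer set covers the examples."""
    return any(covers(a, e_pos, e_neg) for a in answer_sets)

def is_cautious_solution(answer_sets, e_pos, e_neg):
    """Cautious induction: the program is satisfiable and every answer set covers."""
    return len(answer_sets) > 0 and all(covers(a, e_pos, e_neg) for a in answer_sets)

# Answer sets of B ∪ H when B is empty and H is the coin rule 1{heads, tails}1.
coin_answer_sets = [{"heads"}, {"tails"}]
print(is_brave_solution(coin_answer_sets, {"heads"}, {"tails"}))
print(is_cautious_solution(coin_answer_sets, {"heads"}, set()))
```

With these inputs the brave check succeeds, because the answer set {heads} covers the examples, while the cautious check fails, because {tails} does not contain heads.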

Brave induction alone can only reason about what should be true (or false) in a single answer set of $B \cup H$. It cannot specify other brave tasks such as enforcing that two atoms are both bravely entailed, but not necessarily in the same answer set. Induction of stable models [30] ($ILP_{sm}$), on the other hand, generalises the notion of brave induction as shown in Definition 4. The following terminology is first introduced.

Definition 3. A partial interpretation $e$ is a pair of sets of ground atoms $\langle e^{inc}, e^{exc} \rangle$. An interpretation $I$ is said to extend $e$ iff $e^{inc} \subseteq I$ and $e^{exc} \cap I = \emptyset$.

³ Note that in $ILP_b$ a negative example e_i can be easily simulated by adding a rule a_i :- not e_i to the background knowledge and giving a_i as a


Definition 4. An induction of stable models ($ILP_{sm}$) task $T_{sm}$ is a tuple $\langle B, S_M, \langle E \rangle \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E$ is a set of partial interpretations called the examples. A hypothesis $H$ is said to be an inductive solution of $T_{sm}$ (written $H \in ILP_{sm}(T_{sm})$) if and only if $H \subseteq S_M$ and $\forall e \in E$, $\exists A \in AS(B \cup H)$ such that $A$ extends $e$.

Note that a brave induction task can be thought of as a special case of induction of stable models, with exactly one (partial interpretation) example.

We now consider the Learning from Answer Sets framework introduced in [31]. This is the first framework capable of unifying the concepts of brave and cautious induction. The idea is to use examples of partial interpretations which should or should not be extended by answer sets of $B \cup H$.

Definition 5. A Learning from Answer Sets task is a tuple $T = \langle B, S_M, \langle E^+, E^- \rangle \rangle$ where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E^+$ and $E^-$ are sets of partial interpretations called, respectively, the positive and negative examples. A hypothesis $H \subseteq S_M$ is an inductive solution of $T$ (written $H \in ILP_{LAS}(T)$) if and only if:

1. $\forall e^+ \in E^+$, $\exists A \in AS(B \cup H)$ such that $A$ extends $e^+$
2. $\forall e^- \in E^-$, $\nexists A \in AS(B \cup H)$ such that $A$ extends $e^-$

Note that this definition combines properties of both the brave and cautious semantics: the positive examples must each be bravely entailed, whereas the negation of each negative example must be cautiously entailed.
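As a hedged sketch of the conditions on partial interpretations, the Python functions below check the induction of stable models condition and the two Learning from Answer Sets conditions over a precomputed list of answer sets of $B \cup H$. A partial interpretation is encoded as a pair of sets (inclusions, exclusions); this encoding, the function names and the omission of the $H \subseteq S_M$ check are assumptions made for illustration.

```python
def extends(interp, example):
    """interp contains every inclusion and none of the exclusions."""
    inc, exc = example
    return inc <= interp and not (exc & interp)

def is_sm_solution(answer_sets, examples):
    """Every example is extended by at least one answer set."""
    return all(any(extends(a, e) for a in answer_sets) for e in examples)

def is_las_solution(answer_sets, e_pos, e_neg):
    """Positive examples are each extended by some answer set; negative ones by none."""
    positives_ok = all(any(extends(a, e) for a in answer_sets) for e in e_pos)
    negatives_ok = not any(extends(a, e) for e in e_neg for a in answer_sets)
    return positives_ok and negatives_ok

# Coin task: both outcomes must be possible, never both together and never neither.
e_pos = [({"heads"}, set()), ({"tails"}, set())]
e_neg = [({"heads", "tails"}, set()), (set(), {"heads", "tails"})]

exactly_one = [{"heads"}, {"tails"}]                        # answer sets of 1{heads, tails}1
any_subset = [set(), {"heads"}, {"tails"}, {"heads", "tails"}]  # answer sets of 0{heads, tails}2
print(is_las_solution(exactly_one, e_pos, e_neg))
print(is_las_solution(any_subset, e_pos, e_neg))
```

The looser choice rule passes the positive examples but is rejected by the negative ones, which is the kind of separation that brave examples alone cannot express.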

Example 2. Consider an $ILP_{LAS}$ learning task whose background knowledge $B$ contains definitions of the structure of a 4x4 Sudoku board; i.e. definitions of cell, same_row, same_col and same_block (where same_row, same_col and same_block are true only for two different cells in the same row, column or block).

B = {
    cell((1, 1)). cell((1, 2)). ... cell((4, 4)).
    same_row((X1, Y), (X2, Y)) :- cell((X1, Y)), cell((X2, Y)), X1 != X2.
    same_col((X, Y1), (X, Y2)) :- cell((X, Y1)), cell((X, Y2)), Y1 != Y2.
    block((1, 1), 1). block((1, 2), 1). block((2, 1), 1). block((2, 2), 1).
    block((3, 1), 2). block((3, 2), 2). block((4, 1), 2). block((4, 2), 2).
    block((1, 3), 3). block((1, 4), 3). block((2, 3), 3). block((2, 4), 3).
    block((3, 3), 4). block((3, 4), 4). block((4, 3), 4). block((4, 4), 4).
    same_block(C1, C2) :- block(C1, B), block(C2, B), C1 != C2.
}

For the purposes of this example, we will consider only a small hypothesis space $S_M$, but in practice this would be much larger.

S_M = {
    0 {value(C, 1), value(C, 2), value(C, 3), value(C, 4)} 1 :- cell(C).
    1 {value(C, 1), value(C, 2), value(C, 3), value(C, 4)} 1 :- cell(C).
    1 {value(C, 1), value(C, 2), value(C, 3), value(C, 4)} 2 :- cell(C).
    :- same_row(C1, C2), value(C1, V), value(C2, V).
    :- same_col(C1, C2), value(C1, V), value(C2, V).
    :- same_block(C1, C2), value(C1, V), value(C2, V).
}

E+ = {
    ⟨{value((1, 1), 1)}, ∅⟩
}

E− = {
    ⟨{value((1, 1), 1), value((1, 3), 1)}, ∅⟩
    ⟨{value((1, 1), 1), value((3, 1), 1)}, ∅⟩
    ⟨{value((1, 1), 1), value((2, 2), 1)}, ∅⟩
    ⟨{value((1, 1), 1), value((1, 1), 2)}, ∅⟩
    ⟨∅, {value((1, 1), 1), value((1, 1), 2), value((1, 1), 3), value((1, 1), 4)}⟩
}

We need to be able to say that there should be at least one answer set that assigns a value to a cell, or otherwise the empty hypothesis would be sufficient. This is captured by our positive example, which causes at least one of the choice rules to be part of a solution in order to be covered. Our first three negative examples require the three constraints to be also included in a solution. Without each one of these negative examples, at least one constraint could be left out of the solution. The fourth negative example means that the upper bound of the counting aggregate in the choice rule must be 1, as otherwise there would be answer sets in which cell (1, 1) was assigned to both 1 and 2. Finally, the fifth negative example forces the lower bound of the choice rule to be 1, as otherwise there would be answer sets in which (1, 1) was not assigned to any of the values between 1 and 4. Hence, one possible inductive solution is:

H = {
    1 {value(C, 1), value(C, 2), value(C, 3), value(C, 4)} 1 :- cell(C).
    :- same_row(C1, C2), value(C1, V), value(C2, V).
    :- same_col(C1, C2), value(C1, V), value(C2, V).
    :- same_block(C1, C2), value(C1, V), value(C2, V).
}

The only other solutions within the hypothesis space $S_M$ are those that contain $H$ and also extra redundant choice rules, such as 0 {value(C, 1), value(C, 2), value(C, 3), value(C, 4)} 1 :- cell(C).

Note that we need $ILP_{LAS}$'s combination of brave and cautious induction to separate the correct hypothesis from the incorrect hypotheses.

• If we instead use brave induction, whichever examples we use, if $H$ is a solution, then any of the choice rules on their own is also a solution. For instance, consider the hypothesis $H'$, containing only the choice rule 0 {value(C, 1), value(C, 2), value(C, 3), value(C, 4)} 1 :- cell(C). For any examples $E^+$, $E^-$ such that $H \in ILP_b(\langle B, \langle E^+, E^- \rangle \rangle)$, there must be an answer set $A$ of $B \cup H$ such that $E^+ \subseteq A$ and $E^- \cap A = \emptyset$. As $AS(B \cup H) \subset AS(B \cup H')$, any such answer set is also an answer set of $B \cup H'$; and hence, $H'$ is also a solution of the task.

• If we use cautious induction, we have to give examples which are either true in every answer set, or false in every answer set. Therefore, we could not give any examples about the value predicate: for each atom value(x, y) (where x and y range from 1 to 4), there is at least one answer set of $B \cup H$ that contains value(x, y) and at least one that does not; this means that if value(x, y) is given as either a positive or negative example, $H$ will not be a solution of the task. This means that for any $ILP_c$ task $T_c$ such that $H$ is a solution, any subset of the hypothesis space $S_M$ is also a solution of $T_c$.

Note that none of the learning frameworks we have considered so far ($ILP_{LAS}$ included) can incentivise learning a weak constraint. This is because the frameworks only have examples of what should be in some, all or none of the answer sets of $B \cup H$. Any solution $H$ containing a weak constraint $W$ will have the same answer sets with $W$ removed, and $H \setminus \{W\}$ would therefore be a shorter (more optimal⁵) solution. The notion of ordering examples is needed to incentivise learning weak constraints, in order to enforce which answer sets of $B \cup H$ should dominate other answer sets.

Definition 6. An ordering example is a tuple $o = \langle e_1, e_2, op \rangle$ where $e_1$ and $e_2$ are partial interpretations and $op$ is a binary comparison operator ($<$, $>$, $=$, $\leq$, $\geq$ or $\neq$). An ASP program $P$ bravely respects $o$ iff $\exists A_1, A_2 \in AS(P)$ such that all of the following conditions hold: (i) $A_1$ extends $e_1$; (ii) $A_2$ extends $e_2$; and (iii) $\langle A_1, A_2, op \rangle \in ord(P)$. $P$ cautiously respects $o$ iff $\nexists A_1, A_2 \in AS(P)$ such that all of the following conditions hold: (i) $A_1$ extends $e_1$; (ii) $A_2$ extends $e_2$; and (iii) $\langle A_1, A_2, op \rangle \notin ord(P)$.

Note that Definition 6 generalises our initial definition of ordering examples given in [23], where ordering examples had only the operator $<$, and we could not express examples of pairs of answer sets which were equally preferred. In Section 5 we show that this extension allows us to learn a wider class of programs. We now define the notion of Learning from Ordered Answer Sets ($ILP_{LOAS}$).

Definition 7. A Learning from Ordered Answer Sets task is a tuple $T = \langle B, S_M, E^+, E^-, O^b, O^c \rangle$ where $B$ is an ASP program, called the background knowledge, $S_M$ is the hypothesis space, $E^+$ and $E^-$ are sets of partial interpretations called, respectively, positive and negative examples, and $O^b$ and $O^c$ are sets of ordering examples over $E^+$ called brave and cautious orderings. A hypothesis $H \subseteq S_M$ is an inductive solution of $T$ (written $H \in ILP_{LOAS}(T)$) if and only if:

1. $H \in ILP_{LAS}(\langle B, S_M, E^+, E^- \rangle)$
2. $\forall o \in O^b$, $B \cup H$ bravely respects $o$
3. $\forall o \in O^c$, $B \cup H$ cautiously respects $o$

Note that the orderings are only over positive examples. We chose to make this restriction as there does not appear to be any scenario where a hypothesis would need to respect orderings which are not extended by any pair of answer sets of $B \cup H$.

Example 3. Consider the $ILP_{LOAS}$ task $T = \langle B, S_M, E^+, E^-, O^b, O^c \rangle$ where the individual components of the task are as follows:


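• $B = \{\texttt{0\{p, q\}2.}\}$
• $S_M$ is unrestricted (i.e. $S_M$ is the set of all normal rules, choice rules and hard and weak constraints).
• $E^+ = \{e_1^+, e_2^+\}$ where $e_1^+ = \langle \{p\}, \emptyset \rangle$ and $e_2^+ = \langle \emptyset, \{p\} \rangle$
• $E^- = \emptyset$
• $O^b = \{\langle e_1^+, e_2^+, < \rangle\}$
• $O^c = \{\langle e_1^+, e_1^+, = \rangle\}$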

The positive examples of this task are already satisfied by the background knowledge, which has the answer sets $\emptyset$, $\{p\}$, $\{q\}$ and $\{p, q\}$. As there are no negative examples, it remains to find a set of weak constraints such that there is at least one answer set which contains p which is preferred to at least one answer set which does not contain p, and all answer sets which contain p are equally optimal.

One such hypothesis is the single weak constraint :~ not p.[1@1].
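The check in Example 3 can be reproduced by brute force. The following is a minimal sketch, not the authors' implementation: it assumes that the four answer sets of $B$ are simply enumerated as subsets of {p, q}, that the hypothesis adds only the weak constraint (so the answer sets are unchanged), and that with a single priority level a lower penalty means "preferred". All function names are hypothetical.

```python
from itertools import combinations

def answer_sets_of_b():
    """Answer sets of B = {0{p, q}2.}: every subset of {p, q}."""
    atoms = ("p", "q")
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def cost_with_h(answer_set):
    """Penalty at level 1 under the weak constraint ':~ not p.[1@1]'."""
    return 0 if "p" in answer_set else 1

def extends(answer_set, example):
    """A extends <inc, exc> iff inc is a subset of A and exc is disjoint from A."""
    inc, exc = example
    return inc <= answer_set and not (exc & answer_set)

# Assumption: only one priority level is used, so comparing penalties suffices.
OPS = {"<": lambda x, y: x < y, "=": lambda x, y: x == y}

def pairs(e1, e2):
    """All pairs of answer sets (A1, A2) with A1 extending e1 and A2 extending e2."""
    sets = answer_sets_of_b()
    return [(a1, a2) for a1 in sets if extends(a1, e1)
            for a2 in sets if extends(a2, e2)]

def bravely_respects(cost, e1, e2, op):
    return any(OPS[op](cost(a1), cost(a2)) for a1, a2 in pairs(e1, e2))

def cautiously_respects(cost, e1, e2, op):
    return all(OPS[op](cost(a1), cost(a2)) for a1, a2 in pairs(e1, e2))

E1 = (frozenset({"p"}), frozenset())   # e1+ = <{p}, {}>
E2 = (frozenset(), frozenset({"p"}))   # e2+ = <{}, {p}>

print(bravely_respects(cost_with_h, E1, E2, "<"))
print(cautiously_respects(cost_with_h, E1, E1, "="))
```

Passing a constant cost function instead of `cost_with_h` models the empty hypothesis; in that case the brave ordering is no longer respected, which is why a weak constraint has to be learned.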

The frameworks discussed so far have examples which can only express the properties of a learned hypothesis $H$ together with a fixed background knowledge $B$. These properties are on the answer sets of $B \cup H$ (and the ordering of these answer sets). In [24], we presented a new learning framework that uses context-dependent examples. Each example comes with its own context, which is an $ASP^{ch}$ program $C$. Examples then express properties of $B \cup H \cup C$, meaning that by using multiple examples (with different contexts), we can express that $B \cup H \cup C_1$ should have some properties and that $B \cup H \cup C_2$ should have different properties.

Definition 8. A context-dependent partial interpretation (CDPI) is a pair $\langle e, C \rangle$, where $e$ is a partial interpretation and $C$ is an $ASP^{ch}$ program, called a context. A context-dependent ordering example (CDOE) $o$ is a tuple $\langle \langle e_1, C_1 \rangle, \langle e_2, C_2 \rangle, op \rangle$, where the first two elements are CDPIs and $op$ is a binary comparison operator ($<$, $>$, $=$, $\leq$, $\geq$ or $\neq$). $P$ is said to bravely respect $o$ if $\exists A_1 \in AS(P \cup C_1)$, $\exists A_2 \in AS(P \cup C_2)$ such that $A_1$ extends $e_1$, $A_2$ extends $e_2$ and $\langle A_1, A_2, op \rangle \in ord(P, AS(P \cup C_1) \cup AS(P \cup C_2))$. A program $P$ is said to cautiously respect $o$ if $\forall A_1 \in AS(P \cup C_1)$, $\forall A_2 \in AS(P \cup C_2)$ such that $A_1$ extends $e_1$ and $A_2$ extends $e_2$, $\langle A_1, A_2, op \rangle \in ord(P, AS(P \cup C_1) \cup AS(P \cup C_2))$.

When examples are given with empty contexts, they are equivalent to examples in $ILP_{LOAS}$. Note also that contexts do not contain weak constraints. In fact, the ordering operator of a program $P$ defines the ordering over two answer sets based on the weak constraints in that one program $P$. So, given a CDOE $\langle \langle e_1, C_1 \rangle, \langle e_2, C_2 \rangle \rangle$ such that $C_1$ and $C_2$ contain different weak constraints, it is not clear which program to consider for computing the ordering of answer sets, i.e. whether they should be checked against the weak constraints in $P$, $P \cup C_1$, $P \cup C_2$ or $P \cup C_1 \cup C_2$.

We now present a formal definition of the $ILP_{LOAS}^{context}$ framework.

Definition 9. A Context-dependent Learning from Ordered Answer Sets ($ILP_{LOAS}^{context}$) task is a tuple $T = \langle B, S_M, E^+, E^-, O^b, O^c \rangle$ where $B$ is an ASP program called the background knowledge, $S_M$ is the set of rules allowed in the hypotheses (the hypothesis space), $E^+$ and $E^-$ are finite sets of CDPIs called, respectively, positive and negative examples, and $O^b$ and $O^c$ are finite sets of CDOEs over $E^+$ called, respectively, brave and cautious context-dependent orderings. A hypothesis $H \subseteq S_M$ is an inductive solution of $T$ (written $H \in ILP_{LOAS}^{context}(T)$) if and only if:

1. $\forall \langle e^+, C \rangle \in E^+$, $\exists A \in AS(B \cup C \cup H)$ such that $A$ extends $e^+$
2. $\forall \langle e^-, C \rangle \in E^-$, $\nexists A \in AS(B \cup C \cup H)$ such that $A$ extends $e^-$
3. $\forall o \in O^b$, $B \cup H$ bravely respects $o$
4. $\forall o \in O^c$, $B \cup H$ cautiously respects $o$

In [24], we showed that context-dependent examples could be used to simplify the encoding of certain tasks, by splitting the background knowledge into contexts that were only relevant to particular examples. Although any $ILP_{LOAS}^{context}$ task can be transformed into an $ILP_{LOAS}$ task, in general this requires parts of the examples to be encoded in the background knowledge. Example 4 shows such a transformation.

Example 4. Consider a simple scenario where we have a machine that has a single configuration parameter a, which is allowed to take any natural number as its value. A user is allowed to input another natural number b, and if a > b, the machine should beep.

Two example scenarios could be encoded as the context-dependent positive examples $\langle \langle \{\texttt{beep}\}, \emptyset \rangle, \{\texttt{value(a, 3). value(b, 2).}\} \rangle$ and $\langle \langle \emptyset, \{\texttt{beep}\} \rangle, \{\texttt{value(a, 4). value(b, 20).}\} \rangle$. A task containing these examples and an empty background knowledge requires an inductive solution that when combined with the context of the first example would have at least one answer set containing beep, and when combined with the second example would have at least one answer set not containing beep. If we were expressing the same task in $ILP_{LOAS}$ the above two scenarios would be represented considering the background knowledge:


Table 1
A summary of the available systems for learning under the answer set semantics.

Framework                 Systems
$ILP_b$                   XHAIL [42], ASPAL [38] and RASPAL [43]
$ILP_{sm}$                -
$ILP_c$                   -
$ILP_{LAS}$               ILASP [32]
$ILP_{LOAS}$              ILASP [32]
$ILP_{LOAS}^{context}$    ILASP [32]

Table 2
A summary of the complexity of the various learning frameworks.

Framework                 Complexity of verification    Complexity of deciding satisfiability
$ILP_b$                   NP-complete                   NP-complete
$ILP_{sm}$                NP-complete                   NP-complete
$ILP_c$                   $D^P$-complete                $\Sigma_2^P$-complete
$ILP_{LAS}$               $D^P$-complete                $\Sigma_2^P$-complete
$ILP_{LOAS}$              $D^P$-complete                $\Sigma_2^P$-complete
$ILP_{LOAS}^{context}$    $D^P$-complete                $\Sigma_2^P$-complete

$$
B = \left\{
\begin{array}{l}
\texttt{1 \{value(a, 3), value(a, 4)\} 1.} \\
\texttt{1 \{value(b, 2), value(b, 20)\} 1.}
\end{array}
\right\}
$$

The context-dependent examples could then be mapped to the non context-dependent examples $\langle \{\texttt{value(a, 3)}, \texttt{value(b, 2)}, \texttt{beep}\}, \emptyset \rangle$ and $\langle \{\texttt{value(a, 4)}, \texttt{value(b, 20)}\}, \{\texttt{beep}\} \rangle$. In fact, in [24], we show that there is a general mapping from $ILP_{LOAS}^{context}$ to $ILP_{LOAS}$. This mapping, just as the simplified mapping here, depends on encoding the examples in the background knowledge, which abuses the purpose of the background knowledge. The contexts in context-dependent examples allow us instead to separate information that is truly background knowledge, which applies in all scenarios, from information that is part of a particular example.
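To make the role of contexts concrete, the two context-dependent examples of Example 4 can be checked against a candidate hypothesis. This is only an illustrative sketch under strong assumptions: the background knowledge is empty, each context consists of facts only, and the candidate hypothesis (the hypothetical rule beep :- value(a, A), value(b, B), A > B, which does not appear in the text) is modelled as a Python function, so each program has exactly one answer set.

```python
def answer_set(context, hypothesis):
    """Single answer set of H ∪ C, assuming C holds only facts value(name, n)."""
    facts = {f"value({name},{n})" for name, n in context.items()}
    return facts | hypothesis(context)

def h_beep(context):
    """Hypothetical hypothesis: beep :- value(a, A), value(b, B), A > B."""
    return {"beep"} if context["a"] > context["b"] else set()

def covers(example, hypothesis):
    """Check that the answer set of H ∪ C extends the partial interpretation."""
    (inclusions, exclusions), context = example
    a = answer_set(context, hypothesis)
    return inclusions <= a and not (exclusions & a)

# The two CDPIs from Example 4: <<{beep}, {}>, C1> and <<{}, {beep}>, C2>.
E_POS = [
    (({"beep"}, set()), {"a": 3, "b": 2}),
    ((set(), {"beep"}), {"a": 4, "b": 20}),
]

print(all(covers(e, h_beep) for e in E_POS))
```

A hypothesis that always beeps, or never beeps, covers only one of the two examples, which is what forces the comparison between a and b to be learned.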

3.3. Systems for learning under the answer set semantics

The current publicly available systems for ILP can be categorised according to the six frameworks presented in this section (Table 1). It should be noted that although there are no systems which directly solve $ILP_c$ or $ILP_{sm}$ tasks, both can be simply translated into $ILP_{LAS}$ tasks, and can therefore be solved by the ILASP system.

The ILED [21] system is an incremental extension of XHAIL that is specifically targeted at learning Event Calculus [20] theories. The underlying mechanism is based on brave induction, but each of its examples is in terms of two sequential time points.

4. Complexity

In this section, we discuss the complexity of each of the learning frameworks presented in Section 3 with respect to two decision problems: verification, deciding whether a given hypothesis $H$ is an inductive solution of a task $T$; and satisfiability, deciding whether a learning task $T$ has any inductive solutions. A summary of the results is shown in Table 2. To aid readability, the proofs of the propositions stated in this section are given in the appendix. All complexities discussed in this section are for propositional versions of the frameworks (both the background knowledge and hypothesis space of each learning task are ground).

4.1. Learning from answer sets with stratified summing aggregates

As there are existing results on the complexity of solving aggregate stratified programs, it is useful to introduce a new learning framework $ILP_{LAS}^{s}$, which is a generalization of $ILP_{LAS}$ that allows summing aggregates in the bodies of rules, as long as they are stratified. The existing results on the complexity of these programs then allow us to prove the complexity of $ILP_{LAS}^{s}$. Hence, as we can show that $ILP_{LOAS}$ reduces to $ILP_{LAS}^{s}$, this is helpful in proving the complexity of $ILP_{LOAS}$.

A summing aggregate $s$ is of the form $l\ \#\mathit{sum}\{a_1 = w_1, \ldots, a_n = w_n\}\ u$, where $l$, $u$ and $w_1, \ldots, w_n$ are integers and $a_1, \ldots, a_n$ are atoms. $s$ is satisfied by an interpretation $I$ if and only if $l \leq \sum_{w_i \in WS} w_i \leq u$, where $WS$ is the set $\{w_i \mid i \in [1..n], a_i \in I\}$. We now recall the definition of aggregate stratification from [44]. We slightly simplify the definition by considering only propositional programs without disjunction.

Definition 10. A propositional logic program $P$, in which aggregates occur only in bodies of rules, is stratified on an aggregate $agg$ if there is a level mapping $\|\cdot\|$ from $Atoms(P)$ to ordinals, such that for each rule $R \in P$, the following holds:


1. $\forall b \in Atoms(body(R))$: $\|b\| \leq \|head(R)\|$
2. If $agg \in body(R)$, then $\forall b \in Atoms(agg)$: $\|b\| < \|head(R)\|$

$P$ is said to be aggregate stratified if it is stratified on every aggregate in $P$.
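The two notions just introduced, satisfaction of a summing aggregate and the level-mapping conditions of Definition 10, can be sketched in a few lines. This is a simplified illustration rather than a decision procedure: it assumes one weight per atom of the aggregate, it represents a rule as a hypothetical triple (head, ordinary body atoms, list of aggregates given by their atoms), and it only verifies a level mapping supplied by the caller against all aggregates at once instead of searching for one.

```python
def sum_aggregate_holds(lower, upper, weighted_atoms, interpretation):
    """l #sum{a1 = w1, ..., an = wn} u: sum the weights of the true atoms."""
    total = sum(w for atom, w in weighted_atoms if atom in interpretation)
    return lower <= total <= upper

def respects_levels(level, rules):
    """Check conditions 1 and 2 of Definition 10 for a given level mapping."""
    for head, body_atoms, aggregates in rules:
        # Condition 1: ordinary body atoms may not be above the head.
        if any(level[b] > level[head] for b in body_atoms):
            return False
        # Condition 2: atoms inside an aggregate must be strictly below the head.
        if any(level[b] >= level[head] for agg in aggregates for b in agg):
            return False
    return True

# t :- a, 1 #sum{a = 1, b = 2} 2.
RULES_OK = [("t", ["a"], [["a", "b"]])]
# a :- 1 #sum{a = 1} 1.   (recursion through the aggregate)
RULES_BAD = [("a", [], [["a"]])]

print(sum_aggregate_holds(1, 2, [("a", 1), ("b", 2)], {"a"}))
print(respects_levels({"a": 0, "b": 0, "t": 1}, RULES_OK))
print(respects_levels({"a": 0}, RULES_BAD))
```

No level mapping can satisfy the second program, since the atom a would have to be strictly below itself; this is the recursion through aggregates that stratification rules out.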

The intuition is that aggregate stratification forbids recursion through aggregates. In general, aggregate stratified programs have a lower complexity than programs that are not aggregate stratified. Aggregate stratification has nothing to do with negation as failure, and therefore, whether a program is aggregate stratified is unrelated to whether it is stratified in the usual sense. Note that constraints and choice rules can be added into any aggregate stratified program without breaking stratification so long as no atoms in the head of the choice rule are on a lower level than any atom in the body. This is illustrated by the following example.

Example 5. Any constraint $\texttt{:- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$ can be rewritten as $s \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } s$ where $s$ is a new atom. $s$ can then be mapped to a higher level than any other atom.

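A choice rule $l\{h_1, \ldots, h_o\}u \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$ can be rewritten as:

$$
\begin{array}{l}
h_1 \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } \hat{h}_1. \\
\hat{h}_1 \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } h_1. \\
\ldots \\
h_o \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } \hat{h}_o. \\
\hat{h}_o \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } h_o. \\
s \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \{h_1, \ldots, h_o\}\, l - 1, \texttt{not } s. \\
s' \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, u + 1\, \{h_1, \ldots, h_o\}, \texttt{not } s'.
\end{array}
$$

where $\hat{h}_1, \ldots, \hat{h}_o, s, s'$ are all new atoms. $s$ and $s'$ can both be given a new highest level and each $\hat{h}_i$ can be given the same level as $h_i$ (if they did not occur in the previous program then they should be given a new level one below $s$ and $s'$).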

Provided the previous program was aggregate stratified, then this new one is too. To avoid constantly using this mapping, we will refer to programs with choice rules and constraints as also being aggregate stratified.
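The constraint rewriting of Example 5 can be checked on a small instance by enumerating answer sets by brute force. The sketch below is an illustration only: it handles propositional normal rules without aggregates, represents a rule as a hypothetical triple (head, positive body, negative body), and computes answer sets via the standard reduct and least-model construction.

```python
from itertools import combinations

def least_model(definite_rules):
    """Least model of a set of definite rules (head, positive_body)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model

def answer_sets(rules, constraints=()):
    """Enumerate answer sets of normal rules, filtered by hard constraints."""
    atoms = sorted({h for h, _, _ in rules}
                   | {a for _, pos, neg in rules for a in pos | neg})
    found = set()
    for r in range(len(atoms) + 1):
        for cand in combinations(atoms, r):
            x = set(cand)
            # Reduct: drop rules blocked by x, then drop the negative bodies.
            reduct = [(h, pos) for h, pos, neg in rules if not (neg & x)]
            violated = any(pos <= x and not (neg & x) for pos, neg in constraints)
            if least_model(reduct) == x and not violated:
                found.add(frozenset(x))
    return found

fs = frozenset
# a :- not b.   b :- not a.
CHOICE = [("a", fs(), fs({"b"})), ("b", fs(), fs({"a"}))]
# Original program: CHOICE plus the constraint ':- a.'
ORIGINAL = answer_sets(CHOICE, constraints=[(fs({"a"}), fs())])
# Rewritten program: the constraint becomes 's :- a, not s.' with s a new atom.
REWRITTEN = answer_sets(CHOICE + [("s", fs({"a"}), fs({"s"}))])

print(ORIGINAL == REWRITTEN)
```

Because the new atom s never appears in an answer set of the rewritten program, the two sets of answer sets can be compared directly.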

Lemma 1. [44] Deciding whether an aggregate stratified propositional program without disjunction cautiously entails an atom is co-NP-complete.

Corollary 1. Deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom is NP-complete.

Proof. We first show that deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom is in NP. We do this by showing that there is a polynomial reduction from this problem to the complement of the problem in Lemma 1 (which by definition of co-NP must be in NP). The complement of the problem in Lemma 1 is deciding whether a non-disjunctive aggregate stratified program does not cautiously entail an atom. Take any non-disjunctive aggregate stratified program $P$ and any atom $a$ and let $neg\_a$ be an atom that does not occur in $P$. $P \models_b a$ if and only if $P \cup \{neg\_a \texttt{ :- not } a.\} \not\models_c neg\_a$. So the decision problem is in NP.

It remains to show that deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom is NP-hard. We do this by showing that any problem in NP can be reduced in polynomial time to deciding the satisfiability of an aggregate stratified propositional program without disjunction.

Consider an arbitrary NP problem $D$. The complement of $D$, $\bar{D}$, must be in co-NP (by definition of co-NP). Hence, by Lemma 1, there is a polynomial reduction from $\bar{D}$ to deciding whether an aggregate stratified propositional program without disjunction cautiously entails an atom. We define the polynomial reduction from $D$ to deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom as follows: for any instance $I$ of $D$, let $P$ and $a$ be the program and atom given by the polynomial reduction from the complement of $I$ to deciding cautious entailment; define $P'$ as the program $P \cup \{neg\_a \texttt{ :- not } a.\}$ (where $neg\_a$ is a new atom). $I$ returns true if and only if $P \not\models_c a$ if and only if $P' \models_b neg\_a$. Hence, as $P'$ is still aggregate stratified (the new atom $neg\_a$ can be put in the top stratum), this is a polynomial reduction from $D$ to deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom. Hence, the decision problem is NP-hard. $\square$
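The equivalence used in the first half of the proof, that $P$ bravely entails $a$ exactly when $P$ extended with neg_a :- not a. does not cautiously entail neg_a, can be sanity-checked by brute force on small programs. The sketch below is only an illustration under simplifying assumptions: programs are propositional normal programs without aggregates, rules are hypothetical triples (head, positive body, negative body), and answer sets are enumerated exhaustively, which is exponential and unrelated to the complexity bounds themselves.

```python
from itertools import combinations

def least_model(definite_rules):
    """Least model of a set of definite rules (head, positive_body)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model

def answer_sets(rules):
    """Enumerate the answer sets of a propositional normal program."""
    atoms = sorted({h for h, _, _ in rules}
                   | {a for _, pos, neg in rules for a in pos | neg})
    found = []
    for r in range(len(atoms) + 1):
        for cand in combinations(atoms, r):
            x = set(cand)
            reduct = [(h, pos) for h, pos, neg in rules if not (neg & x)]
            if least_model(reduct) == x:
                found.append(frozenset(x))
    return found

def bravely_entails(rules, atom):
    return any(atom in s for s in answer_sets(rules))

def cautiously_entails(rules, atom):
    return all(atom in s for s in answer_sets(rules))

def reduction_agrees(rules, atom):
    """P |=_b a  iff  P ∪ {neg_a :- not a.} does not cautiously entail neg_a."""
    neg_atom = "neg_" + atom
    extended = rules + [(neg_atom, frozenset(), frozenset({atom}))]
    return bravely_entails(rules, atom) == (not cautiously_entails(extended, neg_atom))

fs = frozenset
# a :- not b.   b :- not a.   c :- a.
P = [("a", fs(), fs({"b"})), ("b", fs(), fs({"a"})), ("c", fs({"a"}), fs())]

print(all(reduction_agrees(P, atom) for atom in "abcd"))
```

The atom d does not occur in P, so it covers the case where brave entailment fails and neg_d is in every answer set of the extended program.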

We can now introduce our extra learning task, Learning from Answer Sets with Stratified Aggregates ($ILP_{LAS}^{s}$). It is the same as Learning from Answer Sets, except for allowing summing aggregates in the bodies of the rules in $B$ and $S_M$, as long as $B \cup S_M$ is aggregate stratified. Note that the condition of $B \cup S_M$ being aggregate stratified implies that for any hypothesis $H \subseteq S_M$, $B \cup H$ is aggregate stratified.

Fig. 1. Chains of polynomial reductions where each arrow denotes that there is a polynomial reduction from one framework to another.

4.2. Relationships between the learning tasks

In this section we prove for both decision problems that $ILP_b$ and $ILP_{sm}$ both reduce to each other polynomially. We also show that for both decision problems there is a chain of polynomial reductions from $ILP_c$ to $ILP_{LAS}$ to $ILP_{LOAS}^{context}$ to $ILP_{LOAS}$ to $ILP_{LAS}^{s}$. This chain of reductions is then used in proving that all four tasks share the same complexity for both decision problems. By proving that $ILP_c$ is $O$-hard and $ILP_{LAS}^{s}$ is in $O$ for some complexity class $O$, we prove that all four tasks are $O$-complete. Similarly, as $ILP_b$ and $ILP_{sm}$ both reduce polynomially to each other for both decision problems, if for one of the problems $ILP_b$ is $O$-complete for some class then so is $ILP_{sm}$. The chains of reductions are shown in Fig. 1.

Proposition 1 shows that the complexity of $ILP_b$ and $ILP_{sm}$ coincide for both decision problems.

Proposition 1.

1. Deciding both verification and satisfiability for $ILP_b$ reduces polynomially to the corresponding $ILP_{sm}$ decision problem.
2. Deciding both verification and satisfiability for $ILP_{sm}$ reduces polynomially to the corresponding $ILP_b$ decision problem.

Proposition 2 shows that there is a chain of polynomial reductions from $ILP_c$ to $ILP_{LAS}$ to $ILP_{LOAS}$ to $ILP_{LOAS}^{context}$ to $ILP_{LAS}^{s}$ for both decision problems.

Proposition 2.

1. Deciding both verification and satisfiability for $ILP_c$ reduces polynomially to the corresponding $ILP_{LAS}$ decision problem.
2. Deciding both verification and satisfiability for $ILP_{LAS}$ reduces polynomially to the corresponding $ILP_{LOAS}^{context}$ decision problem.
3. Deciding both verification and satisfiability for $ILP_{LOAS}^{context}$ reduces polynomially to the corresponding $ILP_{LOAS}$ decision problem.
4. Deciding both verification and satisfiability for $ILP_{LOAS}$ reduces polynomially to the corresponding $ILP_{LAS}^{s}$ decision problem.

4.3. Complexity of deciding verification and satisfiability for each framework

For each of the learning frameworks, we prove the complexity of deciding verification and satisfiability. We start with the $ILP_b$ and $ILP_{sm}$ frameworks, for which both decision problems are NP-complete.

Proposition 3. Verifying whether a given $H$ is an inductive solution of a general $ILP_b$ task is NP-complete.

Corollary 2. Verifying whether a given $H$ is an inductive solution of a general $ILP_{sm}$ task is NP-complete.

Proposition 4. Deciding the satisfiability of a general $ILP_b$ task is NP-complete.

Corollary 3. Deciding the satisfiability of a general $ILP_{sm}$ task is NP-complete.

We have now proven the complexity of deciding verification and satisfiability for $ILP_b$ and $ILP_{sm}$, proving the corresponding entries in Table 2. It remains to show the complexities for $ILP_c$, $ILP_{LAS}$, $ILP_{LOAS}$ and $ILP_{LOAS}^{context}$.

As we have shown that $ILP_c$ reduces to $ILP_{LAS}$ which, in turn, reduces to $ILP_{LOAS}$, which reduces to $ILP_{LOAS}^{context}$, and that $ILP_{LOAS}^{context}$ reduces to $ILP_{LAS}^{s}$ (all in polynomial time), to prove the complexity of verifying a hypothesis for each framework, it suffices to show that $ILP_c$ is $D^P$-hard (thus also proving the hardness for each of the other frameworks) and that $ILP_{LAS}^{s}$ is a member of $D^P$ (thus proving membership for the other frameworks). This shows that each framework is both a member of $D^P$ and also $D^P$-hard, and therefore must be $D^P$-complete.