Contents lists available at ScienceDirect
Artificial Intelligence
www.elsevier.com/locate/artint
The complexity and generality of learning answer set programs
Mark Law ∗, Alessandra Russo, Krysia Broda
Department of Computing, Imperial College London, SW7 2AZ, United Kingdom
Article info

Article history:
Received 2 September 2016
Received in revised form 20 February 2018
Accepted 15 March 2018
Available online 21 March 2018

Keywords:
Non-monotonic logic-based learning
Answer Set Programming
Complexity of non-monotonic learning

Abstract

Traditionally most of the work in the field of Inductive Logic Programming (ILP) has addressed the problem of learning Prolog programs. On the other hand, Answer Set Programming is increasingly being used as a powerful language for knowledge representation and reasoning, and is also gaining increasing attention in industry. Consequently, the research activity in ILP has widened to the area of Answer Set Programming, witnessing the proposal of several new learning frameworks that have extended ILP to learning answer set programs. In this paper, we investigate the theoretical properties of these existing frameworks for learning programs under the answer set semantics. Specifically, we present a detailed analysis of the computational complexity of each of these frameworks with respect to the two decision problems of deciding whether a hypothesis is a solution of a learning task and deciding whether a learning task has any solutions. We introduce a new notion of generality of a learning framework, which enables us to define a framework to be more general than another in terms of being able to distinguish one ASP hypothesis solution from a set of incorrect ASP programs. Based on this notion, we formally prove a generality relation over the set of existing frameworks for learning programs under answer set semantics. In particular, we show that our recently proposed framework, Context-dependent Learning from Ordered Answer Sets, is more general than brave induction, induction of stable models, and cautious induction, and maintains the same complexity as cautious induction, which has the highest complexity of these frameworks.

© 2018 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction
Over the last two decades there has been a growing interest in Inductive Logic Programming (ILP) [1], where the goal is to learn a logic program called a hypothesis, which together with a given background knowledge base, explains a set of examples. The main advantage that ILP has over traditional statistical machine learning approaches is that the learned hypotheses can be easily expressed in plain English and explained to a human user, so facilitating a closer interaction between humans and machines. Traditional ILP frameworks have focused on learning definite logic programs [1–6] and normal logic programs [7,8]. On the other hand, Answer Set Programming [9] is a powerful language for knowledge representation and reasoning. ASP is closely related to other declarative paradigms such as SAT, SMT and Constraint Programming, which have each been used for inductive reasoning [10–12]. Compared with these other paradigms, due to its non-monotonicity, ASP is particularly suited for common-sense reasoning [13–15]. Because of its expressiveness and efficient solving, ASP is also increasingly gaining attention in industry [16]; for example, in decision support systems [17], in e-tourism [18] and in product configuration [19]. Consequently, the scope of ILP has recently been extended to learning answer set programs from examples of partial solutions of a given problem, with the intention being to provide algorithms that support automated learning of complex declarative knowledge. Learning ASP programs allows us to learn a variety of declarative non-monotonic, common-sense theories, including for instance Event Calculus [20] theories [21] and theories for scheduling problems and agents' preference models, both from real user data [22] and from synthetic data [23,24].

∗ Corresponding author at: Department of Computing, Huxley Building, 180 Queen's Gate, Imperial College London, London, SW7 2AZ, United Kingdom.
E-mail address: [email protected] (M. Law).
https://doi.org/10.1016/j.artint.2018.03.005
0004-3702/© 2018 The Authors. Published by Elsevier B.V.
Learning ASP programs has several advantages when compared to learning Prolog programs. Firstly, when learning Prolog programs, the goal-directed SLDNF procedure of Prolog must be taken into account. Specifically, when learning programs with negation, it must be ensured that the programs are stratified, or otherwise the learned program may loop under certain queries. As ASP is declarative, no such consideration need be taken into account when learning ASP programs. A second, more fundamental advantage of learning ASP programs is that the theory learned can be expressed using extra types of rules that are not available in Prolog, such as choice rules and weak constraints. Learning choice rules allows us to learn non-deterministic concepts; for instance, we may learn that a coin may non-deterministically land on either heads or tails, but never both. This could be achieved by learning the simple choice rule 1{heads, tails}1. Learning choice rules is different from probabilistic ILP settings such as [25–27] where, in similar coin problems, the focus would be on learning the probabilities of the two outcomes of a coin. Learning weak constraints enables a natural extension of ILP to preference learning [23], which has proved effective in problem domains such as learning preference models for scheduling [23] and for urban mobility [24].

Several algorithms aimed at learning under the answer set semantics, and different frameworks for learning ASP programs, have recently been introduced in the literature. The notions of brave induction ($ILP_b$) and cautious induction ($ILP_c$) were presented in [28], based respectively on the well-established notions of entailment under the answer set semantics [13,29]: brave entailment (when an atom is true in at least one answer set) and cautious entailment (when an atom is true in all answer sets). In brave induction, at least one answer set must cover the examples, whereas in cautious induction, every answer set must cover the examples. Brave induction is actually a special case of an earlier learning framework, called induction of stable models ($ILP_{sm}$) [30], in which examples are partial interpretations. A hypothesis is a solution of an induction of stable models task if, for each of the example partial interpretations, there is an answer set of the hypothesis combined with the background knowledge that covers that partial interpretation. Brave induction is equivalent to induction of stable models with exactly one (partial interpretation) example.
Each of the above frameworks for learning ASP programs is unable to learn some types of ASP programs [31]; for example, brave induction alone cannot learn programs containing hard constraints. In [31], we presented a learning framework, called Learning from Answer Sets ($ILP_{LAS}$), which unifies brave and cautious induction and is able to learn ASP programs containing normal rules, choice rules and hard constraints. In spite of the increased expressivity, none of the above approaches can learn weak constraints, which are able to capture preference learning. Informally, learning weak constraints consists of identifying conditions for ordering answer sets. The learning task in this case would require examples of orderings over partial interpretations. To tackle this aspect of learning ASP programs, we have extended the Learning from Answer Sets framework to Learning from Ordered Answer Sets ($ILP_{LOAS}$) [23] and demonstrated that our algorithm$^1$ is able to learn preferences in a scheduling domain. More recently, we have extended the $ILP_{LOAS}$ framework to $ILP^{context}_{LOAS}$, with context-dependent examples, which come together with extra contextual information [24].
In this paper, we explore both the expressive power and the computational complexity of each framework. The former is important, as it allows us to identify the class of problems that each framework can solve, whereas the latter gives an indication of the price paid for using each framework. We characterise the expressive power of a framework in terms of new notions called one-to-one-distinguishability, one-to-many-distinguishability and many-to-many-distinguishability. The intuition of one-to-one-distinguishability is that, given some fixed background knowledge $B$ and sufficient examples, the framework should be able to distinguish a target hypothesis $H_1$ from another, unwanted, hypothesis $H_2$. This means that there should be at least one task $T$ (of the given framework) with background knowledge $B$, such that $H_1$ is a solution of $T$, and $H_2$ is not. We characterise the one-to-one-distinguishability class of a framework $\mathcal{F}$ (written $\mathcal{D}^1_1(\mathcal{F})$) as the set of tuples $\langle B, H_1, H_2 \rangle$ for such $B$'s, $H_1$'s and $H_2$'s, and state that a framework $\mathcal{F}_1$ is more $\mathcal{D}^1_1$-general than another $\mathcal{F}_2$ if $\mathcal{F}_2$'s one-to-one-distinguishability class is a strict subset of $\mathcal{F}_1$'s one-to-one-distinguishability class. One-to-many-distinguishability relates to the task of finding a single target hypothesis from within a set of possible hypotheses. It upgrades the notion of one-to-one-distinguishability classes to one-to-many-distinguishability classes. These are tuples of the form $\langle B, H, S \rangle$ for which a framework has at least one task that includes $H$ and none of the (unwanted) hypotheses in $S$ as an inductive solution. Many-to-many-distinguishability upgrades this notion to many-to-many-distinguishability classes. These contain tuples of the form $\langle B, S_1, S_2 \rangle$, where $S_1$ is a set of target hypotheses, for which a framework must have a task that accepts each hypothesis in $S_1$ and no hypothesis in $S_2$ as an inductive solution. We show that, under these three measures, $ILP^{context}_{LOAS}$ is more general than $ILP_{LOAS}$, which is more general than $ILP_{LAS}$. We also show that $ILP_{LAS}$ is more general than both $ILP_{sm}$ and $ILP_c$. Although $ILP_{sm}$ is equally $\mathcal{D}^1_1$-general to $ILP_b$, we show that $ILP_{sm}$ is more general than $ILP_b$ under the one-to-many and many-to-many generality measures.

$^1$ Our ILASP system for solving $ILP$
Despite the different generalities of $ILP_c$, $ILP_{LAS}$, $ILP_{LOAS}$ and $ILP^{context}_{LOAS}$, we show that the computational complexity of all four frameworks is the same, both for the decision problem of verifying that a given hypothesis is a solution of a given learning task, and for the problem of deciding whether a given learning task has any solutions. Similarly, we also show that $ILP_{sm}$ and $ILP_b$ have the same computational complexities for both decision problems, despite the former being more general than the latter under two of our generality measures.

We begin, in Section 2, by reviewing the background material necessary for the rest of the paper. In Section 3 we recall the definitions of each of the learning frameworks and in Sections 4 and 5 we prove the complexities and generalities (respectively) of each learning framework. We conclude the paper with a discussion of the related and future work.
2. Background
2.1. Answer Set Programming
In this section we introduce the concepts needed in the paper. Given any atoms $h, h_1, \ldots, h_k, b_1, \ldots, b_n, c_1, \ldots, c_m$, the rule $h \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$ is called a normal rule, with $h$ as the head and $b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$ (collectively) as the body ("not" represents negation as failure); a rule $\texttt{:- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$, with an empty head, is a hard constraint; a choice rule is a rule $l\{h_1, \ldots, h_k\}u \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$ (where $l$ and $u$ are integers) and its head is called an aggregate. A rule $R$ is safe if each variable in $R$ occurs in at least one positive literal in the body of $R$. In this paper we will use $ASP^{ch}$ to denote the set of choice programs $P$, which are programs composed of safe normal rules, choice rules, and hard constraints. Given a rule $R$, we will write $head(R)$ to denote the head of $R$, $body(R)$ to denote the body of $R$ and $body^+(R)$ (resp. $body^-(R)$) to denote the atoms that occur positively (resp. negatively) in the body of $R$. Given a program $P$, we will also write $Atoms(P)$ to denote the atoms in $P$. We will also extend this notation to fragments of a program.

The Herbrand Base of any program $P \in ASP^{ch}$, denoted $HB_P$, is the set of variable-free (ground) atoms that can be formed from predicates and constants in $P$. The subsets of $HB_P$ are called the (Herbrand) interpretations of $P$. A ground aggregate $l\{h_1, \ldots, h_k\}u$ is satisfied by an interpretation $I$ iff $l \leq |I \cap \{h_1, \ldots, h_k\}| \leq u$.
As we restrict our programs to sets of normal rules, (hard) constraints and choice rules, we can use the simplified definitions of the reduct for choice rules presented in [33]. Given a program $P$ and a Herbrand interpretation $I \subseteq HB_P$, the reduct $P^I$ is constructed from $ground(P)$ (the set of ground instances of rules in $P$) in 4 steps: firstly, remove rules whose bodies contain the negation of an atom in $I$; secondly, remove all negative literals from the remaining rules; thirdly, replace the head of any hard constraint, or any choice rule whose head is not satisfied by $I$, with $\bot$ (where $\bot \notin HB_P$); and finally, replace any remaining choice rule $l\{h_1, \ldots, h_m\}u \texttt{ :- } b_1, \ldots, b_n$ with the set of rules $\{h_i \texttt{ :- } b_1, \ldots, b_n \mid h_i \in I \cap \{h_1, \ldots, h_m\}\}$. Any $I \subseteq HB_P$ is an answer set of $P$ if it is the minimal model of the reduct $P^I$. Throughout the paper we denote the set of answer sets of a program $P$ with $AS(P)$.
We say a program $P$ bravely entails an atom $a$ (written $P \models_b a$) if there is at least one answer set $A$ of $P$ such that $a \in A$. Similarly, $P$ cautiously entails $a$ (written $P \models_c a$) if for every answer set $A$ of $P$, $a \in A$.

Unlike hard constraints in ASP, weak constraints do not affect what is, or is not, an answer set of a program $P$. Hence the above definitions also apply to programs with weak constraints. Weak constraints create an ordering over $AS(P)$ specifying which answer sets are "preferred" to others. A weak constraint is of the form $:\sim b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m.[w@l, t_1, \ldots, t_k]$ where $b_1, \ldots, b_n, c_1, \ldots, c_m$ are atoms, $w$ and $l$ are terms specifying the weight and the level, and $t_1, \ldots, t_k$ are terms. A weak constraint $W$ is safe if every variable in $W$ occurs in at least one positive literal in the body of $W$. At each priority level $l$, the aim is to discard any answer set which does not minimise the sum of the weights of the ground weak constraints with level $l$ whose bodies are true. The higher levels are minimised first. The terms $t_1, \ldots, t_k$ specify which ground weak constraints should be considered unique [34].

For any program $P$ and an interpretation $A$, $weak(P, A)$ is the set of tuples $(w, l, t_1, \ldots, t_k)$ for which there is some $:\sim b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m.[w@l, t_1, \ldots, t_k]$ in $ground(P)$ such that $A$ satisfies $b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$. For each level $l$, the score of the interpretation $A$ is the sum of the weights of tuples with level $l$; formally, $P^l_A = \sum_{(w, l, t_1, \ldots, t_k) \in weak(P, A)} w$. For $A_1, A_2 \in AS(P)$, $A_1$ dominates $A_2$ (written $A_1 \succ_P A_2$) iff $\exists l$ such that $P^l_{A_1} < P^l_{A_2}$ and $\forall m > l$, $P^m_{A_1} = P^m_{A_2}$. An answer set $A \in AS(P)$ is optimal if it is not dominated by any $A_2 \in AS(P)$.
Example 1. Let $P$ be the program {0{p(1), p(2), p(3)}3.}. $P$ has 8 answer sets, which are the various combinations of making each of the three p atoms true or false. Consider the two weak constraints :∼ p(X).[1@1] and :∼ p(X).[1@1, X]. The first weak constraint states that if any of the p atoms is true then a penalty of one must be paid. This penalty is only paid once, regardless of whether 1, 2 or 3 of the p atoms are true. Conversely, the second weak constraint says that a penalty of 1 must be paid for each of the p atoms that is true. In both cases, $\emptyset$ is the only optimal answer set; however, in the first case, none of the remaining answer sets dominate each other, whereas in the second case, the answer sets with only one p atom dominate those with 2 p atoms, which in turn each dominate the single answer set with 3 p atoms.

Note that the definition of weak constraints used in this paper is in line with the recent ASP standard established in [34]. Some previous definitions of weak constraints, such as [13], do not include the terms $t_1, \ldots, t_k$ in their syntax and consider every ground instance of every weak constraint individually. This semantics can be achieved using the notion of weak constraints in [34]. Any weak constraint :∼ body.[w : l]$^2$ can be mapped to the weak constraint :∼ body.[w@l, $V_1, \ldots, V_n$], where $V_1, \ldots, V_n$ is the set of all variables that occur in body. If there are multiple weak constraints, to exactly preserve the semantics of [13], a unique term must be added to each weak constraint. For example, {:∼ p(X).[1 : 1]; :∼ q(X).[1 : 1]} would become {:∼ p(X).[1@1, X, 1]; :∼ q(X).[1@1, X, 2]}. With this additional term, $weak(P, \{p(a), q(a)\})$ (where $P$ is the program containing the two weak constraints) would be equal to $\{(1, 1, a, 1), (1, 1, a, 2)\}$, leading to a score of 2 at level 1; without the additional term, $weak(P, \{p(a), q(a)\})$ would equal $\{(1, 1, a)\}$, leading to a score of 1 at level 1.

Unless otherwise stated, when we refer to an ASP program in this paper, we mean a program consisting of a finite set of normal rules, choice rules, hard and weak constraints.
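The score and dominance definitions can be checked mechanically on Example 1. The following Python sketch is only an illustration under stated assumptions: it takes the 8 answer sets described in Example 1 as given (all subsets of the three p atoms) rather than computing them, and the helper names (`weak_first`, `weak_second`, `score`, `dominates`, `optimal`) are hypothetical, chosen for this illustration.

```python
from itertools import combinations

# The 8 answer sets discussed in Example 1: every subset of {p(1), p(2), p(3)}.
ATOMS = ["p(1)", "p(2)", "p(3)"]
ANSWER_SETS = [frozenset(c) for r in range(len(ATOMS) + 1)
               for c in combinations(ATOMS, r)]

def weak_first(answer_set):
    """weak(P, A) for ':~ p(X).[1@1]': all ground instances share the tuple (1, 1)."""
    return {(1, 1) for _ in answer_set}

def weak_second(answer_set):
    """weak(P, A) for ':~ p(X).[1@1, X]': one tuple (1, 1, X) per true p(X)."""
    return {(1, 1, atom[2:-1]) for atom in answer_set}

def score(tuples, level):
    """Sum of the weights of the tuples at the given level."""
    return sum(t[0] for t in tuples if t[1] == level)

def dominates(weak, a1, a2):
    """A1 dominates A2 iff A1 scores lower at the highest level where they differ."""
    w1, w2 = weak(a1), weak(a2)
    for level in sorted({t[1] for t in w1 | w2}, reverse=True):
        s1, s2 = score(w1, level), score(w2, level)
        if s1 != s2:
            return s1 < s2
    return False

def optimal(weak):
    """Answer sets that are not dominated by any other answer set."""
    return [a for a in ANSWER_SETS
            if not any(dominates(weak, b, a) for b in ANSWER_SETS)]

if __name__ == "__main__":
    print(optimal(weak_first))
    print(optimal(weak_second))
```

Because a set is used for $weak(P, A)$, the first weak constraint contributes the single tuple (1, 1) however many p atoms are true, which is exactly why its penalty is paid only once.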
We now introduce some extra notation which will be useful in later sections. Given a set of interpretations $S$, the set $ord(P, S)$ captures the ordering of the interpretations given by the weak constraints in $P$. It generalises the dominates relation; so it not only includes $\langle A_1, A_2, < \rangle$ if $A_1 \succ_P A_2$, but it also includes tuples for other binary comparison operators. Formally, $\langle A_1, A_2, < \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_1 \succ_P A_2$; $\langle A_1, A_2, > \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_2 \succ_P A_1$; $\langle A_1, A_2, \leq \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_2 \not\succ_P A_1$; $\langle A_1, A_2, \geq \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_1 \not\succ_P A_2$; $\langle A_1, A_2, = \rangle \in ord(P, S)$ if $A_1, A_2 \in S$, $A_1 \not\succ_P A_2$ and $A_2 \not\succ_P A_1$; $\langle A_1, A_2, \neq \rangle \in ord(P, S)$ if $A_1, A_2 \in S$ and $A_1 \succ_P A_2$ or $A_2 \succ_P A_1$. Given an ASP program $P$, we write $ord(P)$ as a shorthand for $ord(P, AS(P))$.

Two ASP programs $P$ and $Q$ are strongly equivalent (written $P \equiv_s Q$) if for every ASP program $R$, $AS(P \cup R) = AS(Q \cup R)$.

We now recall the splitting set theorem from [35], which we use in the proofs throughout the paper. This theorem relies on the notions of a splitting set and the partial evaluation of a logic program. Given a program $P$, a set $U \subseteq HB_P$ is a splitting set of $P$ if and only if for every rule $R \in ground(P)$ such that $Atoms(head(R)) \cap U \neq \emptyset$, $Atoms(R) \subseteq U$. Given a ground rule $R$ and a set of atoms $U$, we write $R \backslash U$ to denote the rule $R$ with all (positive or negative) occurrences of atoms in $U$ removed from the body of $R$. Given a program $P$, a splitting set $U$ of $P$ and a set $X \subseteq U$, the partial evaluation of $P$ with respect to $U$ and $X$, written $e_U(P, X)$, is the program $\{R \backslash U \mid R \in ground(P), Atoms(head(R)) \cap U = \emptyset, (body^+(R) \cap U) \subseteq X, body^-(R) \cap X = \emptyset\}$.

Theorem 1. Given any ground ASP program $P$, and splitting set $U$ of $P$, $AS(P) = \{X \cup Y \mid X \in AS(\{R \in P \mid Atoms(head(R)) \cap U \neq \emptyset\}), Y \in AS(e_U(P, X))\}$.

The intuition behind the splitting set theorem is that if a set of atoms $U$ is known to split the program $P$, then we can find the answer sets of the subprogram that defines the atoms in $U$ first. For each of these answer sets $X$, we can partially evaluate $P$ using $X$ and solve this partially evaluated program for answer sets. The splitting set theorem then guarantees that for each answer set $Y$ of the partially evaluated program, $X \cup Y$ is an answer set of $P$. Furthermore, every answer set of $P$ can be constructed in this way.

2.2. Complexity theory
We assume the reader is familiar with the fundamental concepts of complexity, such as Turing machines and reductions; for a detailed explanation, see [36].

Many of the decision problems for ASP are known to be complete for classes in the polynomial hierarchy [37]. The classes of the polynomial hierarchy are defined as follows: $P$ is the class of all problems which can be solved in polynomial time by a Deterministic Turing Machine (DTM); $\Sigma^P_0 = \Pi^P_0 = \Delta^P_0 = P$; $\Delta^P_{k+1} = P^{\Sigma^P_k}$ is the class of all problems which can be solved by a DTM in polynomial time with a $\Sigma^P_k$ oracle; $\Sigma^P_{k+1} = NP^{\Sigma^P_k}$ is the class of all problems which can be solved by a non-deterministic Turing Machine in polynomial time with a $\Sigma^P_k$ oracle; finally, $\Pi^P_{k+1} = \text{co-}NP^{\Sigma^P_k}$ is the class of all problems whose complement can be solved by a non-deterministic Turing Machine in polynomial time with a $\Sigma^P_k$ oracle. $\Sigma^P_1$ and $\Pi^P_1$ are $NP$ and co-$NP$ (respectively), where $NP$ is the class of problems which can be solved by a non-deterministic Turing machine in polynomial time and co-$NP$ is the class of problems whose complement is an $NP$ problem.

$DP$ is the class of problems $D$ that can be mapped to a pair of problems $D_1$ and $D_2$ such that $D_1 \in NP$, $D_2 \in \text{co-}NP$, and for each instance $I$ of $D$, $I$ answers "yes" if and only if both of the mapped instances $I_1$ and $I_2$ (of $D_1$ and $D_2$, respectively) answer "yes". It is well known [36] that the following inclusions hold: $P \subseteq NP \subseteq DP \subseteq \Delta^P_2 \subseteq \Sigma^P_2$ and $P \subseteq \text{co-}NP \subseteq DP \subseteq \Delta^P_2 \subseteq \Pi^P_2$.

3. Learning frameworks
In this section, we give the definitions of the six learning frameworks we analyse in this paper. The first three (brave induction, cautious induction and induction of stable models) are not our own. We reformulate, but preserve the meaning of, the original definitions for easier comparison with our own.

It is common in ILP for a task to have a hypothesis space (the set of all rules which can appear in hypotheses). The purpose of the hypothesis space is two-fold: firstly, it allows the task to be restricted to those solutions which are in some way interesting; secondly, it aids the computational search for inductive solutions. Tasks for brave and cautious induction and for induction of stable models were originally presented with no hypothesis space [28,30] as they were mainly considered theoretically without the specifications of efficient algorithmic computations. The only publicly available algorithms for brave induction [38,39] make use of a hypothesis space defined by mode declarations [40]. In this paper, we "upgrade" each of brave induction, cautious induction and induction of stable models with a hypothesis space $S_M$.

3.1. Notation and terminology
An ILP learning framework $\mathcal{F}$ defines what a learning task of $\mathcal{F}$ is and what an inductive solution is for a given learning task of $\mathcal{F}$. For each framework a task is a tuple $\langle B, S_M, E \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is a set of ASP rules called the hypothesis space, and $E$ is a tuple called the examples. The structure of $E$ depends on the type of ILP framework. Each of the papers [28], [30], [31] and [23] presented learning frameworks with different languages for $B$ and $S_M$; for example, induction of stable models was presented only for normal logic programs. It would be unfair to say that induction of stable models is not general enough to learn programs with choice rules, simply because they were not considered in the original paper (in fact, induction of stable models is general enough to learn some programs with choice rules). For a fair comparison we therefore assume in this paper that every learning framework has a background knowledge $B$ and hypothesis space $S_M$ that consist of normal rules, choice rules, hard constraints and weak constraints.

Given a framework $\mathcal{F}$ and a learning task $T_{\mathcal{F}} = \langle B, S_M, E \rangle$ of $\mathcal{F}$, a hypothesis is any subset of the hypothesis space $S_M$. In Section 5, we consider tasks with unrestricted hypothesis spaces (written $\langle B, E \rangle$), in which case any ASP program can be called a hypothesis. An inductive solution is a hypothesis that, together with the background knowledge $B$, satisfies some conditions on $E$ (given by the particular learning framework $\mathcal{F}$). We write $ILP_{\mathcal{F}}(T_{\mathcal{F}})$ to denote the set of all inductive solutions of $T_{\mathcal{F}}$. Throughout the paper, we use the term covers to apply to any kind of example: i.e. given an $\mathcal{F}$ task $\langle B, S_M, E \rangle$, we say that a hypothesis $H$ covers an example $e$ (any element of any component of $E$) if it meets the particular conditions that the framework $\mathcal{F}$ puts on $H$ and $e$.

3.2. Framework definitions
Brave induction ($ILP_b$), first presented in [28], defines an inductive task in which all examples are ground atoms that should be covered in at least one answer set, i.e. entailed under brave entailment in ASP. The original definition did not consider atoms which should not be present in an answer set, namely negative examples. The two publicly available algorithms that realise brave induction, on the other hand, do allow for negative examples. We therefore upgrade the definition in this paper to allow negative examples$^3$ as follows.
Definition 1. A brave induction ($ILP_b$) task $T_b$ is a tuple $\langle B, S_M, E^+, E^- \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E^+$ and $E^-$ are sets of ground atoms called the positive and negative examples (respectively). A hypothesis $H \subseteq S_M$ is said to be an inductive solution of $T_b$ (written $H \in ILP_b(T_b)$) if and only if $\exists A \in AS(B \cup H)$ such that $E^+ \subseteq A$ and $E^- \cap A = \emptyset$.

Cautious induction ($ILP_c$) was also first presented in [28]. It defines an inductive task where all of the examples should be covered in every answer set (i.e. entailed under cautious entailment in ASP) and where $B \cup H$ should be satisfiable (have at least one answer set). Similarly to brave induction, the original definition did not consider negative examples, but in Definition 2 we upgrade the framework to include negative examples.

Definition 2. A cautious induction ($ILP_c$) task $T_c$ is a tuple $\langle B, S_M, E^+, E^- \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E^+$ and $E^-$ are sets of ground atoms called the positive and negative examples (respectively). A hypothesis $H \subseteq S_M$ is said to be an inductive solution of $T_c$ (written $H \in ILP_c(T_c)$) if and only if $AS(B \cup H) \neq \emptyset$ and $\forall A \in AS(B \cup H)$, $E^+ \subseteq A$ and $E^- \cap A = \emptyset$.

Brave induction alone can only reason about what should be true (or false) in a single answer set of $B \cup H$. It cannot specify other brave tasks such as enforcing that two atoms are both bravely entailed, but not necessarily in the same answer set. Induction of stable models [30] ($ILP_{sm}$), on the other hand, generalises the notion of brave induction as shown in Definition 4. The following terminology is first introduced.

Definition 3. A partial interpretation $e$ is a pair of sets of ground atoms $\langle e^{inc}, e^{exc} \rangle$. An interpretation $I$ is said to extend $e$ iff $e^{inc} \subseteq I$ and $e^{exc} \cap I = \emptyset$.

$^3$ Note that in $ILP_b$ a negative example $e_i$ can be easily simulated by adding a rule $a_i \texttt{ :- not } e_i$ to the background knowledge and giving $a_i$ as a
Definition 4. An induction of stable models ($ILP_{sm}$) task $T_{sm}$ is a tuple $\langle B, S_M, E \rangle$, where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E$ is a set of partial interpretations called the examples. A hypothesis $H$ is said to be an inductive solution of $T_{sm}$ (written $H \in ILP_{sm}(T_{sm})$) if and only if $H \subseteq S_M$ and $\forall e \in E$, $\exists A \in AS(B \cup H)$ such that $A$ extends $e$.

Note that a brave induction task can be thought of as a special case of induction of stable models, with exactly one (partial interpretation) example.
We now consider the Learning from Answer Sets framework introduced in [31]. This is the first framework capable of unifying the concepts of brave and cautious induction. The idea is to use examples of partial interpretations which should or should not be extended by answer sets of $B \cup H$.

Definition 5. A Learning from Answer Sets task is a tuple $T = \langle B, S_M, E^+, E^- \rangle$ where $B$ is an ASP program called the background knowledge, $S_M$ is the hypothesis space and $E^+$ and $E^-$ are sets of partial interpretations called, respectively, the positive and negative examples. A hypothesis $H \subseteq S_M$ is an inductive solution of $T$ (written $H \in ILP_{LAS}(T)$) if and only if:
1. $\forall e^+ \in E^+$, $\exists A \in AS(B \cup H)$ such that $A$ extends $e^+$
2. $\forall e^- \in E^-$, $\nexists A \in AS(B \cup H)$ such that $A$ extends $e^-$

Note that this definition combines properties of both the brave and cautious semantics: the positive examples must each be bravely entailed, whereas the negation of each negative example must be cautiously entailed.
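To make Definition 5 concrete, the following Python sketch enumerates answer sets of a small ground program by brute force, using the four-step reduct from Section 2.1, and then checks the two $ILP_{LAS}$ conditions. It is an illustration under stated assumptions, not the ILASP algorithm: the rule encoding (`rule`, `BOTTOM`) and all function names are hypothetical, the program must already be ground, and the coin task is a toy built from the choice rule 1{heads, tails}1 mentioned in the introduction.

```python
from itertools import combinations

BOTTOM = "#bottom"  # stands for the symbol ⊥, which is not in the Herbrand base

def rule(head, pos=(), neg=()):
    """head: an atom (normal rule), None (hard constraint) or (l, atoms, u) (choice rule)."""
    return {"head": head, "pos": frozenset(pos), "neg": frozenset(neg)}

def reduct(program, I):
    """Build the reduct P^I as a list of definite rules (head, positive body)."""
    definite = []
    for r in program:
        if r["neg"] & I:                      # step 1: body negates an atom of I
            continue
        head, body = r["head"], r["pos"]      # step 2: negative literals are dropped
        if head is None:                      # step 3: hard constraint
            definite.append((BOTTOM, body))
        elif isinstance(head, tuple):
            lower, atoms, upper = head
            if not lower <= len(I & set(atoms)) <= upper:
                definite.append((BOTTOM, body))   # step 3: unsatisfied aggregate
            else:                             # step 4: keep the heads chosen in I
                definite.extend((h, body) for h in atoms if h in I)
        else:
            definite.append((head, body))
    return definite

def minimal_model(definite):
    """Least model of a definite program, computed as a fixpoint."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in definite:
            if body <= model and head not in model:
                model.add(head)
                changed = True
    return model

def answer_sets(program, atoms):
    """All I over `atoms` that equal the minimal model of the reduct P^I."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)
            if minimal_model(reduct(program, frozenset(c))) == frozenset(c)]

def extends(A, e):
    """A extends the partial interpretation e = (inclusions, exclusions)."""
    inc, exc = e
    return set(inc) <= A and not (set(exc) & A)

def is_las_solution(B, H, atoms, e_pos, e_neg):
    """Check both conditions of Definition 5 on AS(B ∪ H)."""
    models = answer_sets(B + H, atoms)
    return (all(any(extends(A, e) for A in models) for e in e_pos)
            and not any(extends(A, e) for A in models for e in e_neg))

# Toy task: heads and tails must each be possible, but never both and never neither.
ATOMS = {"heads", "tails"}
POS = [({"heads"}, set()), ({"tails"}, set())]
NEG = [({"heads", "tails"}, set()), (set(), {"heads", "tails"})]
COIN = [rule((1, ("heads", "tails"), 1))]    # 1{heads, tails}1.
LOOSE = [rule((0, ("heads", "tails"), 2))]   # 0{heads, tails}2.

if __name__ == "__main__":
    print(is_las_solution([], COIN, ATOMS, POS, NEG))
    print(is_las_solution([], LOOSE, ATOMS, POS, NEG))
```

In this toy task the positive examples play the brave role and the negative examples the cautious one, so only the hypothesis with both bounds equal to 1 is accepted. Enumerating every subset of the atoms is exponential and is meant only for checking tiny examples by hand.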
Example 2. Consider an $ILP_{LAS}$ learning task whose background knowledge $B$ contains definitions of the structure of a 4x4 Sudoku board; i.e. definitions of cell, same_row, same_col and same_block (where same_row, same_col and same_block are true only for two different cells in the same row, column or block).

$B$ =
    cell((1, 1)). cell((1, 2)). . . . cell((4, 4)).
    same_row((X1, Y), (X2, Y)) :- cell((X1, Y)), cell((X2, Y)), X1 != X2.
    same_col((X, Y1), (X, Y2)) :- cell((X, Y1)), cell((X, Y2)), Y1 != Y2.
    block((1, 1), 1). block((1, 2), 1). block((2, 1), 1). block((2, 2), 1).
    block((3, 1), 2). block((3, 2), 2). block((4, 1), 2). block((4, 2), 2).
    block((1, 3), 3). block((1, 4), 3). block((2, 3), 3). block((2, 4), 3).
    block((3, 3), 4). block((3, 4), 4). block((4, 3), 4). block((4, 4), 4).
    same_block(C1, C2) :- block(C1, B), block(C2, B), C1 != C2.

For the purposes of this example, we will consider only a small hypothesis space $S_M$ but in practice this would be much larger.$^4$

$S_M$ =
    0{value(C, 1), value(C, 2), value(C, 3), value(C, 4)}1 :- cell(C).
    1{value(C, 1), value(C, 2), value(C, 3), value(C, 4)}1 :- cell(C).
    1{value(C, 1), value(C, 2), value(C, 3), value(C, 4)}2 :- cell(C).
    :- same_row(C1, C2), value(C1, V), value(C2, V).
    :- same_col(C1, C2), value(C1, V), value(C2, V).
    :- same_block(C1, C2), value(C1, V), value(C2, V).

$E^+$ =
    ⟨{value((1, 1), 1)}, ∅⟩

$E^-$ =
    ⟨{value((1, 1), 1), value((1, 3), 1)}, ∅⟩
    ⟨{value((1, 1), 1), value((3, 1), 1)}, ∅⟩
    ⟨{value((1, 1), 1), value((2, 2), 1)}, ∅⟩
    ⟨{value((1, 1), 1), value((1, 1), 2)}, ∅⟩
    ⟨∅, {value((1, 1), 1), value((1, 1), 2), value((1, 1), 3), value((1, 1), 4)}⟩
We need to be able to say that there should be at least one answer set that assigns a value to a cell, or otherwise the empty hypothesis would be sufficient. This is captured by our positive example, which causes at least one of the choice rules to be part of a solution in order to be covered. Our first three negative examples require the three constraints to be also included in a solution. Without each one of these negative examples, at least one constraint could be left out of the solution. The fourth negative example means that the upper bound of the counting aggregate in the choice rule must be 1, as otherwise there would be answer sets in which cell (1, 1) was assigned to both 1 and 2. Finally, the fifth negative example forces the lower bound of the choice rule to be 1, as otherwise there would be answer sets in which (1, 1) was not assigned to any of the values between 1 and 4. Hence, one possible inductive solution is:

$H$ =
    1{value(C, 1), value(C, 2), value(C, 3), value(C, 4)}1 :- cell(C).
    :- same_row(C1, C2), value(C1, V), value(C2, V).
    :- same_col(C1, C2), value(C1, V), value(C2, V).
    :- same_block(C1, C2), value(C1, V), value(C2, V).

The only other solutions within the hypothesis space $S_M$ are those that contain $H$ and also extra redundant choice rules, such as 0{value(C, 1), value(C, 2), value(C, 3), value(C, 4)}1 :- cell(C).

Note that we need $ILP_{LAS}$'s combination of brave and cautious induction to separate the correct hypothesis from the incorrect hypotheses.
• If we instead use brave induction, whichever examples we use, if $H$ is a solution, then any of the choice rules on their own is also a solution. For instance, consider the hypothesis $H'$, containing only the choice rule 0{value(C, 1), value(C, 2), value(C, 3), value(C, 4)}1 :- cell(C). For any examples $E^+$, $E^-$ such that $H \in ILP_b(\langle B, E^+, E^- \rangle)$, there must be an answer set $A$ of $B \cup H$ such that $E^+ \subseteq A$ and $E^- \cap A = \emptyset$. As $AS(B \cup H) \subset AS(B \cup H')$, any such answer set is also an answer set of $B \cup H'$; and hence, $H'$ is also a solution of the task.
• If we use cautious induction, we have to give examples which are either true in every answer set, or false in every answer set. Therefore, we could not give any examples about the value predicate: for each atom value(x, y) (where x and y range from 1 to 4), there is at least one answer set of $B \cup H$ that contains value(x, y) and at least one that does not; this means that if value(x, y) is given as either a positive or negative example, $H$ will not be a solution of the task. This means that for any $ILP_c$ task $T_c$ such that $H$ is a solution, any subset of the hypothesis space $S_M$ is also a solution of $T_c$.
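The two arguments above can be replayed exhaustively on a much smaller analogue. The Python sketch below is an illustration under stated assumptions rather than part of the Sudoku example: it replaces $H$ and $H'$ by the coin hypotheses 1{heads, tails}1 and 0{heads, tails}1, lists their answer sets by hand, and uses hypothetical helper names. It then searches every pair of example sets over the two atoms for a brave or cautious task that accepts the first hypothesis but rejects the second.

```python
from itertools import combinations

def subsets(atoms):
    """All subsets of a collection of atoms, as frozensets."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def brave_solution(answer_sets, e_pos, e_neg):
    """Definition 1: some answer set contains E+ and avoids E-."""
    return any(e_pos <= A and not (e_neg & A) for A in answer_sets)

def cautious_solution(answer_sets, e_pos, e_neg):
    """Definition 2: satisfiable, and every answer set contains E+ and avoids E-."""
    return bool(answer_sets) and all(
        e_pos <= A and not (e_neg & A) for A in answer_sets)

ATOMS = {"heads", "tails"}
# Answer sets listed by hand (an assumption of this sketch, not computed here).
AS_H = [frozenset({"heads"}), frozenset({"tails"})]   # 1{heads, tails}1.
AS_H_PRIME = AS_H + [frozenset()]                      # 0{heads, tails}1.

TASKS = [(e_pos, e_neg) for e_pos in subsets(ATOMS) for e_neg in subsets(ATOMS)]

# Brave tasks that accept H but reject H'.
separating_brave = [t for t in TASKS
                    if brave_solution(AS_H, *t)
                    and not brave_solution(AS_H_PRIME, *t)]

# Cautious tasks that accept H at all.
cautious_tasks_for_h = [t for t in TASKS if cautious_solution(AS_H, *t)]

if __name__ == "__main__":
    print(separating_brave)
    print(cautious_tasks_for_h)
```

No brave task separates the two hypotheses, because every answer set of the first is also an answer set of the second. The only cautious task accepting the first hypothesis has empty example sets, which the second hypothesis satisfies as well.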
Note that none of the learning frameworks we have considered so far ($ILP_{LAS}$ included) can incentivise learning a weak constraint. This is because the frameworks only have examples of what should be in some, all or none of the answer sets of $B \cup H$. Any solution $H$ containing a weak constraint $W$ will have the same answer sets with $W$ removed and $H \backslash \{W\}$ would therefore be a shorter (more optimal$^5$) solution. The notion of ordering examples is needed to incentivise learning weak constraints, in order to enforce which answer sets of $B \cup H$ should dominate other answer sets.

Definition 6. An ordering example is a tuple $o = \langle e_1, e_2, op \rangle$ where $e_1$ and $e_2$ are partial interpretations and $op$ is a binary comparison operator ($<$, $>$, $=$, $\leq$, $\geq$ or $\neq$). An ASP program $P$ bravely respects $o$ iff $\exists A_1, A_2 \in AS(P)$ such that all of the following conditions hold: (i) $A_1$ extends $e_1$; (ii) $A_2$ extends $e_2$; and (iii) $\langle A_1, A_2, op \rangle \in ord(P)$. $P$ cautiously respects $o$ iff $\nexists A_1, A_2 \in AS(P)$ such that all of the following conditions hold: (i) $A_1$ extends $e_1$; (ii) $A_2$ extends $e_2$; and (iii) $\langle A_1, A_2, op \rangle \notin ord(P)$.

Note that Definition 6 generalises our initial definition of ordering examples given in [23], where ordering examples had only the operator $<$, and we could not express examples of pairs of answer sets which were equally preferred. In Section 5 we show that this extension allows us to learn a wider class of programs. We now define the notion of Learning from Ordered Answer Sets ($ILP_{LOAS}$).
Definition 7. A Learning from Ordered Answer Sets task is a tuple $T = \langle B, S_M, E^+, E^-, O^b, O^c \rangle$ where $B$ is an ASP program, called the background knowledge, $S_M$ is the hypothesis space, $E^+$ and $E^-$ are sets of partial interpretations called, respectively, positive and negative examples, and $O^b$ and $O^c$ are sets of ordering examples over $E^+$ called brave and cautious orderings. A hypothesis $H \subseteq S_M$ is an inductive solution of $T$ (written $H \in ILP_{LOAS}(T)$) if and only if:
1. $H \in ILP_{LAS}(\langle B, S_M, E^+, E^- \rangle)$
2. $\forall o \in O^b$, $B \cup H$ bravely respects $o$
3. $\forall o \in O^c$, $B \cup H$ cautiously respects $o$

Note that the orderings are only over positive examples. We chose to make this restriction as there does not appear to be any scenario where a hypothesis would need to respect orderings which are not extended by any pair of answer sets of $B \cup H$.

Example 3. Consider the $ILP_{LOAS}$ task $T = \langle B, S_M, E^+, E^-, O^b, O^c \rangle$ where the individual components of the task are as follows:
• $B = \{\texttt{0\{p, q\}2.}\}$
• $S_M$ is unrestricted (i.e. $S_M$ is the set of all normal rules, choice rules and hard and weak constraints).
• $E^+ = \{e_1^+, e_2^+\}$ where $e_1^+ = \langle \{p\}, \emptyset \rangle$ and $e_2^+ = \langle \emptyset, \{p\} \rangle$
• $E^- = \emptyset$
• $O^b = \{\langle e_1^+, e_2^+, < \rangle\}$
• $O^c = \{\langle e_1^+, e_1^+, = \rangle\}$

The positive examples of this task are already satisfied by the background knowledge, which has the answer sets $\emptyset$, $\{p\}$, $\{q\}$ and $\{p, q\}$. As there are no negative examples, it remains to find a set of weak constraints such that there is at least one answer set which contains `p` which is preferred to at least one answer set which does not contain `p`, and all answer sets which contain `p` are equally optimal. One such hypothesis is the single weak constraint `:~ not p.[1@1]`.

The frameworks discussed so far have examples which can only express the properties of a learned hypothesis $H$ together with a fixed background knowledge $B$. These properties are on the answer sets of $B \cup H$ (and the ordering of these answer sets). In [24], we presented a new learning framework that uses context-dependent examples. Each example comes with its own context, which is an $ASP^{ch}$ program $C$. Examples then express properties of $B \cup H \cup C$, meaning that by using multiple examples (with different contexts), we can express that $B \cup H \cup C_1$ should have some properties and that $B \cup H \cup C_2$ should have different properties.

Definition 8. A context-dependent partial interpretation (CDPI) is a pair $\langle e, C \rangle$, where $e$ is a partial interpretation and $C$ is an $ASP^{ch}$ program, called a context. A context-dependent ordering example (CDOE) $o$ is a tuple $\langle \langle e_1, C_1 \rangle, \langle e_2, C_2 \rangle, op \rangle$, where the first two elements are CDPIs and $op$ is a binary comparison operator ($<$, $>$, $=$, $\leq$, $\geq$ or $\neq$). $P$ is said to bravely respect $o$ if $\exists A_1 \in AS(P \cup C_1)$, $\exists A_2 \in AS(P \cup C_2)$ such that $A_1$ extends $e_1$, $A_2$ extends $e_2$ and $\langle A_1, A_2, op \rangle \in ord(P, AS(P \cup C_1) \cup AS(P \cup C_2))$. A program $P$ is said to cautiously respect $o$ if $\forall A_1 \in AS(P \cup C_1)$, $\forall A_2 \in AS(P \cup C_2)$ such that $A_1$ extends $e_1$ and $A_2$ extends $e_2$, $\langle A_1, A_2, op \rangle \in ord(P, AS(P \cup C_1) \cup AS(P \cup C_2))$.

When examples are given with empty contexts, they are equivalent to examples in $ILP_{LOAS}$. Note also that contexts do not contain weak constraints. In fact, the operator $\succ_P$ defines the ordering over two answer sets based on the weak constraints in one program $P$. So, given a CDOE $\langle \langle e_1, C_1 \rangle, \langle e_2, C_2 \rangle, op \rangle$ such that $C_1$ and $C_2$ contain different weak constraints, it is not clear which program to consider for computing the ordering of answer sets, i.e. whether they should be checked against the weak constraints in $P$, $P \cup C_1$, $P \cup C_2$ or $P \cup C_1 \cup C_2$. We now present a formal definition of the $ILP_{LOAS}^{context}$ framework.
Definition 9. A Context-dependent Learning from Ordered Answer Sets ($ILP_{LOAS}^{context}$) task is a tuple $T = \langle B, S_M, E^+, E^-, O^b, O^c \rangle$ where $B$ is an ASP program called the background knowledge, $S_M$ is the set of rules allowed in the hypotheses (the hypothesis space), $E^+$ and $E^-$ are finite sets of CDPIs called, respectively, positive and negative examples, and $O^b$ and $O^c$ are finite sets of CDOEs over $E^+$ called, respectively, brave and cautious context-dependent orderings. A hypothesis $H \subseteq S_M$ is an inductive solution of $T$ (written $H \in ILP_{LOAS}^{context}(T)$) if and only if:
1. $\forall \langle e^+, C \rangle \in E^+$, $\exists A \in AS(B \cup C \cup H)$ s.t. $A$ extends $e^+$
2. $\forall \langle e^-, C \rangle \in E^-$, $\nexists A \in AS(B \cup C \cup H)$ s.t. $A$ extends $e^-$
3. $\forall o \in O^b$, $B \cup H$ bravely respects $o$
4. $\forall o \in O^c$, $B \cup H$ cautiously respects $o$

In [24], we showed that context-dependent examples could be used to simplify the encoding of certain tasks, by splitting the background knowledge into contexts that were only relevant to particular examples. Although any $ILP_{LOAS}^{context}$ task can be transformed into an $ILP_{LOAS}$ task, in general this requires parts of the examples to be encoded in the background knowledge. Example 4 shows such a transformation.
Example 4. Consider a simple scenario where we have a machine that has a single configuration parameter `a`, which is allowed to take any natural number as its value. A user is allowed to input another natural number `b`, and if `a > b`, the machine should beep. Two example scenarios could be encoded as the context-dependent positive examples $\langle \langle \{\texttt{beep}\}, \emptyset \rangle, \{\texttt{value(a, 3). value(b, 2).}\} \rangle$ and $\langle \langle \emptyset, \{\texttt{beep}\} \rangle, \{\texttt{value(a, 4). value(b, 20).}\} \rangle$. A task containing these examples and an empty background knowledge requires an inductive solution that when combined with the context of the first example would have at least one answer set containing `beep`, and when combined with the second example would have at least one answer set not containing `beep`. If we were expressing the same task in $ILP_{LOAS}$ the above two scenarios would be represented considering the background knowledge:

$B = \{\texttt{1\{value(a, 3), value(a, 4)\}1.}\ \ \texttt{1\{value(b, 2), value(b, 20)\}1.}\}$

The context-dependent examples could then be mapped to the non context-dependent examples $\langle \{\texttt{value(a, 3)}, \texttt{value(b, 2)}, \texttt{beep}\}, \emptyset \rangle$ and $\langle \{\texttt{value(a, 4)}, \texttt{value(b, 20)}\}, \{\texttt{beep}\} \rangle$. In fact, in [24], we show that there is a general mapping from $ILP_{LOAS}^{context}$ to $ILP_{LOAS}$. This mapping, just as the simplified mapping here, depends on encoding the examples in the background knowledge, which abuses the purpose of the background knowledge. The contexts in context-dependent examples allow us instead to separate information that is truly background knowledge, which applies in all scenarios, from information that is part of a particular example.

Table 1
A summary of the available systems for learning under the answer set semantics.

| Framework | Systems |
|---|---|
| $ILP_b$ | XHAIL [42], ASPAL [38] and RASPAL [43] |
| $ILP_{sm}$ | |
| $ILP_c$ | |
| $ILP_{LAS}$ | ILASP [32] |
| $ILP_{LOAS}$ | ILASP [32] |
| $ILP_{LOAS}^{context}$ | ILASP [32] |

Table 2
A summary of the complexity of the various learning frameworks.

| Framework | Complexity of verification | Complexity of deciding satisfiability |
|---|---|---|
| $ILP_b$ | NP-complete | NP-complete |
| $ILP_{sm}$ | NP-complete | NP-complete |
| $ILP_c$ | DP-complete | $\Sigma_2^P$-complete |
| $ILP_{LAS}$ | DP-complete | $\Sigma_2^P$-complete |
| $ILP_{LOAS}$ | DP-complete | $\Sigma_2^P$-complete |
| $ILP_{LOAS}^{context}$ | DP-complete | $\Sigma_2^P$-complete |

3.3. Systems for learning under the answer set semantics
The current publicly available systems for ILP can be categorised according to the six frameworks presented in this section (Table 1). It should be noted that although there are no systems which directly solve $ILP_c$ or $ILP_{sm}$ tasks, both can be simply translated into $ILP_{LAS}$ tasks, and can therefore be solved by the ILASP system.

The ILED [21] system is an incremental extension of XHAIL that is specifically targeted at learning Event Calculus [20] theories. The underlying mechanism is based on brave induction, but each of its examples is expressed in terms of two sequential time points.
4. Complexity
In this section, we discuss the complexity of each of the learning frameworks presented in Section 3 with respect to two decision problems: verification, deciding whether a given hypothesis $H$ is an inductive solution of a task $T$; and satisfiability, deciding whether a learning task $T$ has any inductive solutions. A summary of the results is shown in Table 2. To aid readability, the proofs of the propositions stated in this section are given in the appendix. All complexities discussed in this section are for propositional versions of the frameworks (both the background knowledge and hypothesis space of each learning task are ground).
4.1. Learning from answer sets with stratified summing aggregates
As there are existing results on the complexity of solving aggregate stratified programs, it is useful to introduce a new learning framework $ILP_{LAS}^{s}$, which is a generalisation of $ILP_{LAS}$ that allows summing aggregates in the bodies of rules, as long as they are stratified. The existing results on the complexity of these programs then allow us to prove the complexity of $ILP_{LAS}^{s}$. Hence, as we can show that $ILP_{LOAS}$ reduces to $ILP_{LAS}^{s}$, this is helpful in proving the complexity of $ILP_{LOAS}$.

A summing aggregate $s$ is of the form $l\ \texttt{\#sum}\{a_1 = w_1, \ldots, a_n = w_n\}\ u$, where $l$, $u$ and $w_1, \ldots, w_n$ are integers and $a_1, \ldots, a_n$ are atoms. $s$ is satisfied by an interpretation $I$ if and only if $l \leq \sum_{w_i \in WS} w_i \leq u$, where $WS$ is the set $\{w_i \mid i \in [1..n], a_i \in I\}$. We now recall the definition of aggregate stratification from [44]. We slightly simplify the definition by considering only propositional programs without disjunction.
Definition 10. A propositional logic program $P$, in which aggregates occur only in bodies of rules, is stratified on an aggregate $agg$ if there is a level mapping $\|\cdot\|$ from $Atoms(P)$ to ordinals, such that for each rule $R \in P$, the following holds:
1. $\forall b \in Atoms(body(R))$: $\|b\| \leq \|head(R)\|$
2. If $agg \in body(R)$, then $\forall b \in Atoms(agg)$: $\|b\| < \|head(R)\|$

$P$ is said to be aggregate stratified if it is stratified on every aggregate in $P$.
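The semantics of a summing aggregate and the two conditions of Definition 10 can be sketched in a few lines. This is a minimal illustration, not an implementation from [44]: the class and function names are hypothetical, rule heads are single atoms, the level mapping is supplied by the caller (the sketch checks a given mapping rather than searching for one), and the sum counts one weight per atom that is true in the interpretation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumAggregate:
    """l #sum{a1 = w1, ..., an = wn} u, stored as bounds plus (atom, weight) pairs."""
    lower: int
    upper: int
    weighted_atoms: tuple

    def atoms(self):
        return {atom for atom, _ in self.weighted_atoms}

    def satisfied_by(self, interpretation):
        # Assumption: one weight is counted per atom that is true in the interpretation.
        total = sum(w for atom, w in self.weighted_atoms if atom in interpretation)
        return self.lower <= total <= self.upper


@dataclass(frozen=True)
class Rule:
    head: str
    body_atoms: frozenset      # atoms of ordinary body literals, positive or negated
    aggregates: tuple = ()     # SumAggregate objects occurring in the body


def respects_level_mapping(rules, level):
    """Check conditions 1 and 2 of Definition 10 for one given level mapping."""
    for rule in rules:
        head_level = level[rule.head]
        if any(level[b] > head_level for b in rule.body_atoms):
            return False       # condition 1 violated: a body atom sits above the head
        for agg in rule.aggregates:
            if any(level[b] >= head_level for b in agg.atoms()):
                return False   # condition 2 violated: aggregate atom not strictly below
    return True


# q :- 1 #sum{p = 1, r = 2} 3.   (no recursion through the aggregate)
AGG = SumAggregate(1, 3, (("p", 1), ("r", 2)))
STRATIFIED = [Rule("q", frozenset(), (AGG,))]

# p :- 1 #sum{p = 1} 1.          (p depends on itself through the aggregate)
RECURSIVE = [Rule("p", frozenset(), (SumAggregate(1, 1, (("p", 1),)),))]

print(AGG.satisfied_by({"p", "r"}))
print(respects_level_mapping(STRATIFIED, {"p": 0, "r": 0, "q": 1}))
print(respects_level_mapping(RECURSIVE, {"p": 0}))
```

For the recursive rule no level mapping can succeed, since condition 2 would require the level of `p` to be strictly below itself.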
The intuition is that aggregate stratification forbids recursion through aggregates. In general aggregate stratified programs have a lower complexity than non-aggregate stratified programs. Aggregate stratification has nothing to do with negation as failure, and therefore, whether a program is aggregate stratified is unrelated to whether it is stratified in the usual sense. Note that constraints and choice rules can be added into any aggregate stratified program without breaking stratification so long as no atoms in the head of the choice rule are on a lower level than any atom in the body. This is illustrated by the following example.
Example 5. Any constraint $\texttt{:- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$ can be rewritten as $s \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } s$ where $s$ is a new atom. $s$ can then be mapped to a higher level than any other atom.

A choice rule $l\{h_1, \ldots, h_o\}u \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m$ can be rewritten as:

$h_1 \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } h'_1.$
$h'_1 \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } h_1.$
$\ldots$
$h_o \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } h'_o.$
$h'_o \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \texttt{not } h_o.$
$s \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, \{h_1, \ldots, h_o\}\,l - 1, \texttt{not } s.$
$s' \texttt{ :- } b_1, \ldots, b_n, \texttt{not } c_1, \ldots, \texttt{not } c_m, u + 1\,\{h_1, \ldots, h_o\}, \texttt{not } s'.$

where $h'_1, \ldots, h'_o, s, s'$ are all new atoms. $s$ and $s'$ can both be given a new highest level and each $h'_i$ can be given the same level as $h_i$ (if they did not occur in the previous program then they should be given a new level one below $s$ and $s'$). Provided the previous program was aggregate stratified, then this new one is too. To avoid constantly using this mapping, we will refer to programs with choice rules and constraints as also being aggregate stratified.
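The constraint rewriting in Example 5 can be checked on a toy program with a brute-force enumeration of answer sets. The sketch below is illustrative only: it handles ground normal rules without aggregates, encodes a rule as a hypothetical `(head, positive_body, negative_body)` tuple, and tries every subset of atoms, so it is exponential and unsuitable for anything but very small programs.

```python
from itertools import combinations

# A ground normal rule is encoded as (head, positive_body, negative_body);
# this tuple encoding is an assumption of the sketch, not notation from the text.


def least_model(definite_rules):
    """Least model of a negation-free program given as (head, positive_body) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model


def answer_sets(rules):
    """Enumerate answer sets by testing every candidate against its reduct."""
    atoms = sorted({h for h, _, _ in rules} | {a for _, pos, neg in rules for a in pos | neg})
    found = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(atoms, size):
            x = set(candidate)
            # Reduct: drop rules blocked by x, then drop the remaining negative bodies.
            reduct = [(h, pos) for h, pos, neg in rules if not (neg & x)]
            if least_model(reduct) == x:
                found.append(frozenset(x))
    return found


def rewrite_constraint(pos, neg, fresh="s"):
    """Rewrite ':- pos, not neg' as 'fresh :- pos, not neg, not fresh'."""
    return (fresh, frozenset(pos), frozenset(neg) | {fresh})


BASE = [("p", frozenset(), frozenset({"q"})),    # p :- not q.
        ("q", frozenset(), frozenset({"p"}))]    # q :- not p.
CONSTRAINED = BASE + [rewrite_constraint({"p"}, set())]   # encodes ':- p.'

print(answer_sets(BASE))
print(answer_sets(CONSTRAINED))
```

On this toy program the base rules have the two answer sets $\{p\}$ and $\{q\}$, while adding the rewritten constraint leaves only $\{q\}$; the fresh atom `s` never appears in an answer set, which is what lets it sit on a new highest level.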
Lemma 1. [44] Deciding whether an aggregate stratified propositional program without disjunction cautiously entails an atom is co-NP-complete.
Corollary 1. Deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom is NP-complete.
Proof. We first show that deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom is in NP. We do this by showing that there is a polynomial reduction from this problem to the complement of the problem in Lemma 1 (which by definition of co-NP must be in NP). The complement of the problem in Lemma 1 is deciding whether a non-disjunctive aggregate stratified program does not cautiously entail an atom. Take any non-disjunctive aggregate stratified program $P$ and any atom $a$ and let $\texttt{neg\_a}$ be an atom that does not occur in $P$. $P \models_b a$ if and only if $P \cup \{\texttt{neg\_a :- not a.}\} \not\models_c \texttt{neg\_a}$. So the decision problem is in NP.

It remains to show that deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom is NP-hard. We do this by showing that any problem in NP can be reduced in polynomial time to deciding the satisfiability of an aggregate stratified propositional program without disjunction.

Consider an arbitrary NP problem $D$. The complement of $D$, $\bar{D}$, must be in co-NP (by definition of co-NP). Hence, by Lemma 1, there is a polynomial reduction from $\bar{D}$ to deciding whether an aggregate stratified propositional program without disjunction cautiously entails an atom. We define the polynomial reduction from $D$ to deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom as follows: for any instance $I$ of $D$, let $P$ and $a$ be the program and atom given by the polynomial reduction from the complement of $I$ to deciding cautious entailment; define $P'$ as the program $P \cup \{\texttt{neg\_a :- not a.}\}$ (where $\texttt{neg\_a}$ is a new atom). $I$ returns true if and only if $P \not\models_c a$ if and only if $P' \models_b \texttt{neg\_a}$. Hence, as $P'$ is still aggregate stratified (the new atom $\texttt{neg\_a}$ can be put in the top stratum), this is a polynomial reduction from $D$ to deciding whether an aggregate stratified propositional program without disjunction bravely entails an atom. Hence, the decision problem is NP-hard. □
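The equivalence used in the membership part of the proof can be unpacked step by step. The derivation below is a restatement rather than part of the original proof; it writes $P'$ for $P \cup \{\texttt{neg\_a :- not a.}\}$ and relies on the fact that, because $\texttt{neg\_a}$ does not occur in $P$, every answer set of $P'$ is an answer set $A$ of $P$ extended with $\texttt{neg\_a}$ exactly when $a \notin A$.

```latex
\begin{aligned}
P \models_b a
  &\iff \exists A \in AS(P) \text{ such that } a \in A \\
  &\iff \exists A' \in AS(P') \text{ such that } \texttt{neg\_a} \notin A' \\
  &\iff P' \not\models_c \texttt{neg\_a}
\end{aligned}
```

Reading the same chain with $a$ and $\texttt{neg\_a}$ swapped in role gives the hardness direction: $P \not\models_c a$ holds exactly when some answer set of $P$ omits $a$, which is exactly when some answer set of $P'$ contains $\texttt{neg\_a}$.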
We can now introduce our extra learning task, Learning from Answer Sets with Stratified Aggregates ($ILP_{LAS}^{s}$). It is the same as Learning from Answer Sets, except for allowing summing aggregates in the bodies of the rules in $B$ and $S_M$, as long as $B \cup S_M$ is aggregate stratified. Note that the condition of $B \cup S_M$ being aggregate stratified implies that for any hypothesis $H \subseteq S_M$, $B \cup H$ is aggregate stratified.

Fig. 1. Chains of polynomial reductions where each arrow denotes that there is a polynomial reduction from one framework to another.

4.2. Relationships between the learning tasks
In this section we prove for both decision problems that $ILP_b$ and $ILP_{sm}$ both reduce to each other polynomially. We also show that for both decision problems there is a chain of polynomial reductions from $ILP_c$ to $ILP_{LAS}$ to $ILP_{LOAS}$ to $ILP_{LOAS}^{context}$ to $ILP_{LAS}^{s}$. This chain of reductions is then used in proving that all four tasks ($ILP_c$, $ILP_{LAS}$, $ILP_{LOAS}$ and $ILP_{LOAS}^{context}$) share the same complexity for both decision problems. By proving that $ILP_c$ is $\mathcal{O}$-hard and $ILP_{LAS}^{s}$ is in $\mathcal{O}$ for some complexity class $\mathcal{O}$, we prove that all four tasks are $\mathcal{O}$-complete. Similarly, as $ILP_b$ and $ILP_{sm}$ both reduce polynomially to each other for both decision problems, if for one of the problems $ILP_b$ is $\mathcal{O}$-complete for some class then so is $ILP_{sm}$. The chains of reductions are shown in Fig. 1. Proposition 1 shows that the complexities of $ILP_b$ and $ILP_{sm}$ coincide for both decision problems.
Proposition 1.
1. Deciding both verification and satisfiability for $ILP_b$ reduces polynomially to the corresponding $ILP_{sm}$ decision problem.
2. Deciding both verification and satisfiability for $ILP_{sm}$ reduces polynomially to the corresponding $ILP_b$ decision problem.
Proposition 2 shows that there is a chain of polynomial reductions from $ILP_c$ to $ILP_{LAS}$ to $ILP_{LOAS}$ to $ILP_{LOAS}^{context}$ to $ILP_{LAS}^{s}$ for both decision problems.
Proposition 2.
1. Deciding both verification and satisfiability for $ILP_c$ reduces polynomially to the corresponding $ILP_{LAS}$ decision problem.
2. Deciding both verification and satisfiability for $ILP_{LAS}$ reduces polynomially to the corresponding $ILP_{LOAS}$ decision problem.
3. Deciding both verification and satisfiability for $ILP_{LOAS}$ reduces polynomially to the corresponding $ILP_{LOAS}^{context}$ decision problem.
4. Deciding both verification and satisfiability for $ILP_{LOAS}^{context}$ reduces polynomially to the corresponding $ILP_{LAS}^{s}$ decision problem.

4.3. Complexity of deciding verification and satisfiability for each framework
For each of the learning frameworks, we prove the complexity of deciding verification and satisfiability. We start with the $ILP_b$ and $ILP_{sm}$ frameworks, for which both decision problems are NP-complete.
Proposition 3. Verifying whether a given $H$ is an inductive solution of a general $ILP_b$ task is NP-complete.

Corollary 2. Verifying whether a given $H$ is an inductive solution of a general $ILP_{sm}$ task is NP-complete.

Proposition 4. Deciding the satisfiability of a general $ILP_b$ task is NP-complete.

Corollary 3. Deciding the satisfiability of a general $ILP_{sm}$ task is NP-complete.
We have now proven the complexity of deciding verification and satisfiability for $ILP_b$ and $ILP_{sm}$, proving the corresponding entries in Table 2. It remains to show the complexities for $ILP_c$, $ILP_{LAS}$, $ILP_{LOAS}$ and $ILP_{LOAS}^{context}$.

As we have shown that $ILP_c$ reduces to $ILP_{LAS}$ which, in turn, reduces to $ILP_{LOAS}$, which reduces to $ILP_{LOAS}^{context}$, and that $ILP_{LOAS}^{context}$ reduces to $ILP_{LAS}^{s}$ (all in polynomial time), to prove the complexity of verifying a hypothesis for each framework, it suffices to show that $ILP_c$ is DP-hard (thus also proving the hardness for each of the other frameworks) and that $ILP_{LAS}^{s}$ is a member of DP (thus proving membership for the other frameworks). This shows that each framework is both a member of DP and also DP-hard, and therefore must be DP-complete.
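The argument can be stated generically for any chain of polynomial reductions. In the sketch below, $F_1, \ldots, F_k$ are placeholder names for the verification problems of the frameworks in the chain, with $F_1$ standing for $ILP_c$ and $F_k$ for $ILP_{LAS}^{s}$, and $\leq_p$ denotes polynomial-time reducibility; it uses only the transitivity of $\leq_p$ and the closure of DP under polynomial reductions.

```latex
\begin{aligned}
&\text{Given } F_1 \leq_p F_2 \leq_p \cdots \leq_p F_k,\quad
  F_1 \text{ is DP-hard},\quad F_k \in \text{DP}: \\
&\quad F_1 \leq_p F_i \;\Longrightarrow\; F_i \text{ is DP-hard}
  && \text{for each } 1 \leq i \leq k, \\
&\quad F_i \leq_p F_k \in \text{DP} \;\Longrightarrow\; F_i \in \text{DP}
  && \text{for each } 1 \leq i \leq k, \\
&\quad \text{hence every } F_i \text{ is DP-complete.}
\end{aligned}
```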