Contents lists available at ScienceDirect

Expert Systems With Applications

journal homepage: www.elsevier.com/locate/eswa

Intelligent conversation system using multiple classification ripple down rules and conversational context

David Herbert, Byeong Ho Kang

Discipline of ICT, University of Tasmania, Hobart, Tasmania, Australia

ARTICLE INFO

Article history:
Received 27 November 2017
Revised 25 June 2018
Accepted 27 June 2018
Available online 28 June 2018

Keywords:
Knowledge base systems
Textual question answering
MCRDR
Case based reasoning
Pattern matching

ABSTRACT

We introduce an extension to Multiple Classification Ripple Down Rules (MCRDR), called Contextual MCRDR (C-MCRDR). We apply C-MCRDR knowledge-base systems (KBS) to the Textual Question Answering (TQA) and Natural Language Interface to Databases (NLIDB) paradigms in restricted domains as a type of spoken dialog system (SDS) or conversational agent (CA). C-MCRDR implicitly maintains topical conversational context, and intra-dialog context is retained allowing explicit referencing in KB rule conditions and classifications. To facilitate NLIDB, post-inference C-MCRDR classifications can include generic query referencing – query specificity is achieved by the binding of pre-identified context. In contrast to other scripted, or syntactically complex systems, the KB of the live system can easily be maintained courtesy of the RDR knowledge engineering approach. For evaluation, we applied this system to a pedagogical domain that uses a production database for the generation of offline course-related documents. Our system complemented the domain by providing a spoken or textual question-answering alternative for undergraduates based on the same production database. The developed system incorporates a speech-enabled chatbot interface via Automatic Speech Recognition (ASR), and experimental results from a live, integrated feedback rating system showed significant user acceptance, indicating the approach is promising, feasible and further work is warranted. Evaluation of the prototype's viability found the system responded appropriately for 80.3% of participant requests in the tested domain, and it responded inappropriately for 19.7% of requests due to incorrect dialog classifications (4.4%) or out of scope requests (15.3%). Although the semantic range of the evaluated domain was relatively shallow, we conjecture that the developed system is readily adoptable as a CA NLIDB tool in other more semantically-rich domains and it shows promise in single or multi-domain environments.

© 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Conversational Agents (CA) and Natural Language Interfaces to Databases (NLIDB) systems typically require the system developer/author to have high-level skills in constructing either complex semantic or syntactic grammars, or highly technical scripting languages to parse user utterances, as well as database querying languages such as SQL. This introduces a clear, unwarranted separation between the system author and a domain expert – ideally the domain expert should be able to author and maintain the knowledge required by the system, but it is unreasonable to expect domain experts to have high-level technical or linguistic analysis

Corresponding author.

E-mail addresses: david.herbert@utas.edu.au (D. Herbert), byeong.kang@utas.edu.au (B.H. Kang).

URL: http://www.utas.edu.au/profiles/staff/ict/david-paul-herbert (D. Herbert), http://www.utas.edu.au/profiles/staff/ict/byeong-kang (B.H. Kang)

skills (Androutsopoulos, Ritchie, & Thanisch, 1995; Smith, Crockett, Latham, & Buckingham, 2014). We propose a solution to this that allows an expert in the field to maintain knowledge that is used to create CAs with NLIDB capabilities. Our research uses a derivation of the knowledge engineering approach, Ripple Down Rules (RDR) (Compton & Jansen, 1990), called Contextual MCRDR (C-MCRDR).

RDR recognises the problem of eliciting knowledge from the domain expert – they have time constraints and they cannot usually provide a wholistic response in attempts to capture their knowledge (Biermann, 1998; Kang, Compton, & Preston, 1995). RDR removes this knowledge acquisition bottleneck by allowing the expert to build a knowledge-base (KB) incrementally, as they only have to provide justification of a conclusion in a local context as it arises. We considered the application of RDR to the CA and NLIDB paradigms, and defined cases to be examples of user dialog – questions and statements that are relevant to the domain. The domain expert considers what should be appropriate responses – in RDR

https://doi.org/10.1016/j.eswa.2018.06.049


terms, they are classifying each input case (which has pattern-matched components of the user utterance as attributes) by a response.

Standard RDR can only provide a single classification for each case, and a natural extension is to allow multiple classifications – Multiple Classification Ripple Down Rules (MCRDR) (Kang, 1995) is one such extension. In this paper we introduce C-MCRDR, which is a significant extension to MCRDR that facilitates constrained NL conversation via pattern-matching.

1.1. Contribution summary

The key features and contributions of C-MCRDR that facilitate CA and NLIDB services discussed in later sections are as follows:

1. Implicit retention of topical conversational context by adopting a stack-based modification to MCRDR's inference mechanism;
2. Intra-dialog contextual referencing via context-based variable definition and assignment (via regular expression pattern-matching) of relevant context that is maintained between dialog utterances;
3. Rule-count reduction and NLIDB via post-inference deferred classifications with database querying expressions (bound by relevant context variables);
4. Brittleness mitigation by:
   (a) Pattern-matching of utterances to key terms using a lexical or phrasal paraphrasing approach;
   (b) Utterance suggestion (rule lookahead) based on current topical context when an utterance is not recognised;
5. ASR transcription correction (when speech is used) by pre-processing terms using a set of corrective rules prior to inference;
6. Speech to Text (STT) correction by pre-processing terms using a set of corrective rules;
7. Dynamic rule maintenance of the live system courtesy of the RDR knowledge engineering approach.

We conducted a usability evaluation study of a pilot system application of C-MCRDR and the results were very promising and positive, which indicates the C-MCRDR approach to CAs and NLIDB is viable and worth further consideration. We will be further leveraging the system as a component in the command and control of autonomous systems via constrained NL.

The paper is organised by the following sections: Section 2 reviews related work associated with RDR and chat-based querying. We present C-MCRDR's modifications to standard MCRDR and the developed conversational system in Section 3. Sections 4, 5 and 6 detail the developed system's architecture, the methodology adopted in developing and evaluating the chat system, and the results of a pilot evaluation in a target domain, respectively. We summarise the main results of this work together with proposals of future research in Sections 7 and 8.

2. Related work

2.1. Ripple down rules (RDR)

Ripple Down Rules (Compton & Jansen, 1990) arose from experiences the researchers had from maintaining a thyroid diagnosis expert system, GARVAN-ES1 (Horn, Compton, Lazarus, & Quinlan, 1985). During the maintenance of the original system they discovered that an approximate doubling of rule count in the KB only increased the accuracy of the system's diagnosis a couple of percentage points. It would typically take half a day for a new rule to be added due to the constraints of several factors, such as an expert endocrinologist's time, interpretation by the knowledge engineer, and extensive verification and validation to ensure the new

Fig. 1. RDR tree structure: a_x – antecedent, c_x – consequent.

rule did not compromise the existing KB. Instead, with RDR, the expert can add rules incrementally: they justify their new classification of a case in the context in which it arises. This is in contrast to other knowledge acquisition methods such as Repertory Grids (Gaines & Shaw, 1993), Formal Concept Analysis (Wille, 1992) and standard Case Base Reasoning (Aamodt & Plaza, 1994).

The original RDR structure is a binary tree, with each node comprised of a rule that consists of an antecedent and a consequent. During inference, the attributes of the current case to be classified are evaluated, starting at the root node (Fig. 1). If the antecedent's conditions are satisfied (a_0), evaluation passes to a child node (termed the except edge, here R_2). If the parent node's rule conditions are not satisfied, evaluation alternatively follows the other child edge, if-not, R_1. Either or both of the child nodes may not be present. Classification is the result of the last node to be satisfied. The root node (R_0) usually provides a default classification and a superficial antecedent to ensure all cases will be trivially assigned a class if no further rules are satisfied.

Multiple Classification Ripple Down Rules (MCRDR) (Kang, 1995) extends RDR's single classification inference outcome by allowing a case to have multiple classifications concurrently – in MCRDR's n-ary tree, inference considers all child nodes whose parent rules are satisfied, and evaluation concludes with each possible inference path terminated either by a satisfied leaf node or a satisfied node that has no satisfied children.
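The n-ary generalisation can be sketched similarly (again our illustrative code, under the assumption that each satisfied terminal node contributes one classification):

```python
# Minimal sketch of MCRDR's n-ary inference: every child of a satisfied
# node is tried, and each path contributes the classification of its
# last satisfied node.
class MCNode:
    def __init__(self, antecedent, consequent, children=()):
        self.antecedent = antecedent      # callable: case -> bool
        self.consequent = consequent
        self.children = list(children)

def mcrdr_classify(node, case):
    """Return the set of classifications for `case` (possibly several)."""
    if not node.antecedent(case):
        return set()
    results = set()
    for child in node.children:
        results |= mcrdr_classify(child, case)
    # A satisfied node with no satisfied children terminates a path.
    return results or {node.consequent}

root = MCNode(lambda c: True, "default", [
    MCNode(lambda c: "fever" in c, "influenza"),
    MCNode(lambda c: "rash" in c, "measles"),
])
```

A case with both "fever" and "rash" now receives two concurrent classifications, which single-classification RDR cannot express.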

RDR and the MCRDR variants have had excellent research and commercial outcomes in the last two or more decades (Richards, 2009). For example, RDR and variants are used across diverse research and application areas: telehealth (Han et al., 2013); breast cancer detection (Miranda-Mena et al., 2006); legal text citation (Galgani, Compton, & Hoffmann, 2015); flight control systems (Shirazi & Sammut, 2008); robot vision systems (Pham & Sammut, 2005); induction (Gaines & Compton, 1992; 1995); clinical pathology reporting (Compton, 2011); and a helpdesk information retrieval mechanism (Ho Kang, Yoshida, Motoda, & Compton, 1997). For rapidity of development and implementation, (Han, Yoon, Kang, & Park, 2014) shows the MCRDR-backed KB methodology is closely aligned with the Agile software development approach.

2.2. Syntax and semantic parse trees


parse trees need to be redeveloped when a new domain is being considered.

2.3. CAs and NLIDB

When associated with domain-specific databases, question-answering systems (QAs) are called Natural Language Interfaces to Data Bases (NLIDB), although such systems are primarily concerned with NL querying interfaces to databases (and not as spoken dialog systems or conversational agents per se). These systems parse restricted or constrained NL expressions into specific structured query languages such as SQL or XML (Bais, Machkour, & Koutti, 2016; Li, Yang, & Jagadish, 2005; Nguyen, Nguyen, & Pham, 2017), or domain-neutral ontologies (Chaw, 2009), allowing non-specialist users to phrase questions against a domain-specific database in a constrained subset of NL. One such example is NaLIX (Li et al., 2005). NaLIX's intent is in essence a natural language querying system, with English grammar terms aligned using a machine learning approach to classify terms with associated typical querying constructs such as value joins and nesting. The intermediate representation language approach in (Bais et al., 2016) also adopts machine learning to generate context free grammars for domain-dependent terminal symbols after the input has been tokenised and tagged by part-of-speech analysis. Localised dictionaries again match schema-specific attributes to NL synonyms, an approach that is widely used, but requires redevelopment when porting to other domains. The ASKME system (Chaw, 2009) uses a specially restricted version of English called Computer Processable Language (CPL) (Clark et al., 2005) to avoid issues such as ambiguity that arise in NL processing of general English. CPL – which has restrictions on grammar, style and vocabulary – is interpreted by a syntactic parser, however it still requires a degree of user training to teach them how to construct their requests. The SEEKER and Aneesah systems (Shabaz, O'Shea, Crockett, & Latham, 2015; Smith et al., 2014) are similar NLIDBs coupled to CAs. SEEKER uses a commercial CA to map appropriate SQL templates to evaluate against an underlying database and, in contrast to other systems, it allows a targeted followup of refinement of query results via a GUI. CA scripts are used to pattern-match keywords in utterances, and then the most appropriate SQL template is chosen via an expert system based on matched variables in the utterance; the SQL templates are produced from a pre-establishment phase requiring a lengthy questionnaire capturing common NL phrases used and their resulting associated queries. In contrast, although somewhat similar, Aneesah (Shabaz et al., 2015) dynamically builds an SQL query (although it is not clear how the mapping of utterance to query is performed) after a CA engine utilises a scripted-capture of user utterances. In both cases misinterpretations require careful, offline maintenance of the CA engine scripts, and both are evaluated via satisfaction and task-based questionnaires of very small participant groups (10 and 20 participants respectively). A key component of SEEKER is the ability to further refine query results. We would argue the addition of further child rules associated with querying in the C-MCRDR decision tree allows for additional context (variables) where relevant, and thus more specific queries can be evaluated when and if the user provides more context.

An alternative approach, instead of pre-defining query templates to potentially map to user input, is to automatically generate all search queries based on a corpus of existing questions in community question answering (cQA) platforms such as Stack Exchange (Figueroa, 2017). This means a very large corpus of queries is generated against the existing knowledge-base that can then be used to provide related questions (and answers) to a user's initial query. The authors extract a number of attributes from a query based on computed results from various sources including CoreNLP (Manning et al., 2014) and WordNet (Miller, 1995) – such as sentences (using part of speech tagging, named entity recognition and sentiment analysis), semantic connections, and others.

2.4. KBS Brittleness

These NLIDB and NLIDB-with-CA approaches are advantageous to the user as they are not required to learn and adopt technical database query formalisms (Smith et al., 2014), but practicalities of the system's linguistic constraints lead to frustration – the user must overcome the inherent brittleness of the system and still be conversant in the linguistic coverage as to what can and cannot be understood (Androutsopoulos et al., 1995), and a mild appreciation of underlying database schemas is hard to avoid. To try and overcome this schema unfamiliarity, NaLIX (Li et al., 2005) employs an ontological-based term expansion module integrated with WordNet (Miller, 1995). Matched terms however require some final markup of the actual domain-dependent schemas to be undertaken. Another example of overcoming brittleness is a paraphrase generation system (Oh, Choi, Gweon, Heo, & Ryu, 2015) which automatically generates paraphrases (for Korean), initially based on Korean thesauri. We adopt the same approach in our system, although we do not automatically generate the lexical or phrasal paraphrases (Madnani & Dorr, 2010) – the onus is on the domain expert to do so manually when they populate the dictionary. However, the C-MCRDR system prompts users with utterance suggestions (see Section 3.4) when their utterance is not recognised.

2.5. RDR-Based CAs and NLIDB

The KbQAS (Nguyen et al., 2017) and LEXA classification systems (Galgani et al., 2015) are close to our research due to the fact they apply RDR to generate rules for the semantic analysis of tokenised and tagged input. KbQAS answers Vietnamese questions whereas LEXA examines legal citations, and both use the GATE NLP analysis framework (Cunningham, Maynard, Bontcheva, & Tablan, 2002). The rule knowledge acquisition (KA) stages in KbQAS require the domain expert to have significant knowledge and expertise in the JAPE grammar and its annotations (Cunningham et al., 1999). This is contrary to one of our aims, i.e. to develop a system in arbitrary domains where the domain expert does not require significant linguistic or programming skills such as JAPE. The KA interface has to be designed carefully to provide a suitable abstraction of the syntactic and semantic interpretations of the rule attributes added to the KB.

In contrast to pure NLIDB systems, we do not map input expressions directly (or indirectly) to database queries; instead, the expert can predefine queries without SQL syntactical knowledge (via a simplified GUI interface) that are themselves expressed in an intermediate XML form for database independence, and refer to them as required when determining the appropriate response to an NL input question. In effect, the expert applying our system to their domain incrementally develops rules analogous to authoring chat scripts for a chatbot (for example, the ALICE chatbot and AIML (Wallace, 2011)), but they are not required to know an esoteric authoring syntax.

2.6. RDR and context


set of attributes that have been logged (e.g. the aileron has adjusted by 5 degrees from the previous reading). This is in contrast to our system, as context in LDRDR is used during the KA phase, whereas our contextual retention is across case-based inference requests (i.e. post acquisition).

3. Approach

C-MCRDR can be readily applied to conversation systems by inferring appropriate responses from user utterances. We observed the clear, clean separation of knowledge from the inference engine in a contemporary KBS is analogous to the separation of chat scripts (such as AIML) from a chatbot program (Compton, 2011; Wallace, 2011). Chat scripts embody conversational responses as a form of knowledge, and it is this knowledge that can be captured and represented in a KBS approach. Inter-domain applicability, commonly criticised for CA and NLIDB systems, can be partially addressed in our system by providing interface support for KB rule, query, dictionary and ASR correction rule reuse for relevant commonality between domains. We detail the modifications and features of C-MCRDR to support CA services and NLIDB in this section.

3.1. Topical conversational context - stacking

Human chat discourse follows one or more topics during a conversation, the automated detection of which is an ongoing area of research (e.g. TF-IDF approaches (Adams & Martell, 2008)). Humans naturally maintain an appreciation of what is being discussed without having to continually re-emphasise it. For example, the following shows a stilted, unnatural conversational excerpt:

Speaker 1: "The weather is going to be bad today. As for the weather, the prognosis is for rain."

Speaker 2: "What will be the weather's temperature?"

Speaker 1: "The weather's temperature will be 10 degrees Celsius."

A speaker intuitively understands the current topic is weather – all subsequent explicit mentions of weather are redundant. To achieve this more natural, contextual response, the C-MCRDR inference algorithm is altered by influencing the starting rules where we begin inference in the decision tree. We achieved this via a stacking of inference results (Glina & Kang, 2010; Mak, Kang, Sammut, & Kadous, 2004) – each stack frame contains a set of satisfied rule(s) from the previous inference request. Frames are popped from the stack in LIFO order, and inference simply assumes each rule in the frame's rule set is satisfied as a starting point to evaluate child rules. For each dialog session, C-MCRDR implicitly assumes each case is part of a temporal sequence (i.e. components of a continuing dialog) and it is not initially independent of other preceding cases. If no child rules are satisfied in the current stack frame, we temporarily discard the frame and attempt inference again with the new top-of-the-stack rule set. Eventually, if no further stacked rule sets are available, the default rule is satisfied.

The C-MCRDR inference algorithm is summarised in Table 1. It should be noted frames are never deleted from the inference result stack during a dialog session, and inference result frames only containing R_0 are never pushed to the stack (apart from the first frame). Future work may relax the former constraint in situations where a topic may need to be invalidated.

3.2. Context variables

An additional aspect to maintain conversational (or factual) continuity is the retention of context-specific data found in cases as they arise on a temporal basis.

Table 1
C-MCRDR Algorithm: Inference function I(case, rule) returns a set of n satisfied rules {r_0..r_n}.

  INITIALISE: LET stack S ← ∅
  FUNCTION C-MCRDR(S, Case c)
    LET stack S′ ← S
    LET rule set R ← pop(S′)
    WHILE R ≠ ∅ DO:
      LET R ← ⋃_{i=1..n} I(c, r_i) for r_i ∈ R
      IF R = ∅ THEN
        LET R ← pop(S′)
      ELSE
        push(S, R)
        RETURN R
    LET R ← I(c, r_default)
    RETURN R
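The inference loop of Table 1 can be sketched as runnable Python. This is our simplification: `eval_from` plays the role of I(case, rule), frames are inspected without being deleted, and the special case for the first frame is omitted.

```python
# Sketch of stack-based C-MCRDR inference (illustrative, simplified).
class Rule:
    def __init__(self, name, antecedent, children=()):
        self.name = name
        self.antecedent = antecedent      # callable: case -> bool
        self.children = list(children)

def eval_from(rule, case):
    """I(case, rule): treat `rule` as satisfied and collect the satisfied
    terminal descendants, MCRDR-style."""
    hits = set()
    for child in rule.children:
        if child.antecedent(case):
            hits |= eval_from(child, case) or {child}
    return hits

def c_mcrdr_infer(stack, case, root):
    # Inspect frames in LIFO order without deleting them (frames are
    # never removed from the stack during a dialog session).
    for frame in reversed(stack):
        hits = set()
        for r in frame:
            hits |= eval_from(r, case)
        if hits:
            stack.append(hits)        # push the new topical context
            return hits
    # No stacked context helped: fall back to the default rule R0.
    hits = eval_from(root, case) or {root}
    if hits != {root}:                # R0-only frames are never pushed
        stack.append(hits)
    return hits

# Tiny KB mirroring the pedagogical domain: a unit topic rule with a
# follow-up rule that relies on the retained topical context.
r_coord = Rule("coordinator", lambda c: "coordinator" in c)
r_unit = Rule("unit", lambda c: "KIT101" in c, [r_coord])
root = Rule("R0", lambda c: True, [r_unit])
stack = []
```

Here "Who is the coordinator?" only satisfies a rule when inference starts from the previously pushed KIT101 frame, which is the stack's whole purpose.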

One or more of a rule's antecedent attributes may be satisfied by the presence of lexical terms. Generally, the actual synonyms that matched may need to be retained for future reference. For example, a rule might be satisfied if an utterance contains the term /animal/. More specifically, we may be more interested in which animal matched, such as aardvark. As a dialog progresses, C-MCRDR can retain this matched data via context variables. Context variables (together with system context variables, or default values) could be considered as a simplified notion of slots in a frame as first put forward by Minsky (Minsky, 1975), with slots retained (temporally) across all subsequent frames. Variables are defined by the domain expert (guided by a GUI) with simple meta-rules referring to dictionary terms, literal values, or arbitrary regular expressions. The expert can reference variable values in a rule's antecedent or consequent with an XML-style syntax (not shown), which is also guided by a GUI. Variable assignment can also occur as a side-effect of a consequent (via an action – this feature was implemented but not used in the pilot system).

Implementation practicalities dictate that the developed system adopts a closed-world approach – this means the domain expert can pre-define any assumed factual world knowledge that is required. We take the approach of simply defining meta-rules again in the form of system context variables with default values that can be overridden by more specific context. For example, consider the question "What year is it?". The expert can define via a GUI a default value such as (@SYSTEMyear = 2018). A standard variable, (@year), would override this value if an utterance contained the appropriate variable-matching criteria (for example, "The year is 2021"). This default context has extremely useful consequences when binding queries as detailed in the next section.

3.3. Post-inference classification binding


Fig. 2. MCRDR rules prior to context and queries.

Table 2
Pre-existing domain database "polygon" – tables names and colours.

  names:                  colours:
  sides   name            type       colour
  3       trigon          trigon     red
  4       tetragon        tetragon   orange
  5       pentagon        pentagon   yellow
  6       hexagon         hexagon    green
  7       heptagon        heptagon   blue

Fig. 3. C-MCRDR rules after addition of context and queries.

templates abstract the underlying referential table schema (by a simple mapping), so queries are potentially reusable in other domains that have compatible table schema. Standard queries (such as a simple primary-key join) can be defined for cross-domain applicability by the system GUI.

3.3.1. Rule-count reduction

Substantial rule-count reduction in the KB is a significant beneficial side-effect of the inclusion of post-inference queries; standard MCRDR tends to create larger decision trees compared to other inductive methods (due to over-generalisation and subsequent rule refinement by the domain expert for special cases as they arise (Kang, 1995)), and the complexity of the decision tree is increased as a consequence. In domains where considerable knowledge is already present in the form of in situ databases (as is the case for our target pedagogical domain), the potential rule-bloat can be drastically reduced by incorporating querying into the interim inference results.

The significant reduction in KB rule bloat is best demonstrated by an example. Fig. 2 gives an example of a question's classification of regular polygons and instances of their colour prior to any optimisations. Here ten MCRDR rules are required to answer questions associated with the polygon's name and colour, such as "what is the colour of the polygon with 5 sides?". Assuming the existence of a domain database (Table 2), the KB can be reduced to two C-MCRDR rules in addition to the default rule (Fig. 3). Here the literal C-MCRDR rule conclusions containing query references are shown – these are parsed by the system's output parser and the corresponding query XML code (not shown) is executed with the relevant binding context (the @sides variable, which is assigned the numerical portion of the /sides/ term defined in Table 3).

Query 1 (<QREF>1) in Fig. 3 is a simple select (name) on the names table with sides as the key, whereas query 2 selects colour

Table 3
Dictionary term definition.

  Term      Synonym             Input matches
  /sides/   /RE:"(\d) sides"/   7 sides; 99 sides

based on a table join matching name with type. The example KB shown in Fig. 3 now allows C-MCRDR conversations such as:

1. "what is the name of the polygon with 7 sides?" – heptagon.
2. "what is its colour?" – blue.

This example exhibits an 80% reduction of the rule count (10 rules to 2) in the KB (excluding the default, R_0 rule) compared to the non-query example shown in Fig. 2.
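Assuming the Table 2 schema, the deferred-query binding can be sketched with an in-memory database. The SQL below is ours (the system actually stores queries in an intermediate XML form), and the QREF numbering mirrors Fig. 3.

```python
import sqlite3

# Build the Table 2 "polygon" database in memory.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE names(sides INTEGER, name TEXT);
    CREATE TABLE colours(type TEXT, colour TEXT);
    INSERT INTO names VALUES (3,'trigon'),(4,'tetragon'),(5,'pentagon'),
                             (6,'hexagon'),(7,'heptagon');
    INSERT INTO colours VALUES ('trigon','red'),('tetragon','orange'),
        ('pentagon','yellow'),('hexagon','green'),('heptagon','blue');
""")

QUERIES = {  # QREF id -> (parameterised SQL, context variables it binds)
    1: ("SELECT name FROM names WHERE sides = :sides", ["sides"]),
    2: ("SELECT colour FROM colours JOIN names ON type = name "
        "WHERE sides = :sides", ["sides"]),
}

def run_query(qref, context):
    """Bind the pre-identified context variables and evaluate the query."""
    sql, bound = QUERIES[qref]
    return db.execute(sql, {v: context[v] for v in bound}).fetchone()[0]
```

With @sides captured as 7 from the first utterance, query 1 yields the polygon's name and query 2 reuses the same retained context for the follow-up colour question.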

3.4. Brittleness

To address brittleness, in addition to adopting a paraphrasing approach, the system can provide examples of utterances that are understood from the current context – this is triggered if the last user utterance only satisfied the default root rule (which is I don't understand), although a meta-rule, What can I say?, is also a trigger. Suggestions are selected based on attributes that will satisfy a subset of child rules (a lookahead) from the starting rule set in the top stack frame, and they are varied by randomly selecting instances in any lexical terms referred to by the rule attributes. For example, in our pedagogical domain, a conversational snippet might be (the underlined lexical term has been replaced between each suggestion):

User: I like fish
System: I don't understand. You could try asking me "who is the lecturer?" or "what is the prior learning?"
User: huh?
System: I don't understand. You could try asking me "who is the teacher?" or "what is the prerequisite?"

The developed system as detailed in the next section also includes a domain-specific feature to help overcome brittleness. If the system is unable to provide results to the client user's satisfaction, the client browser interface provides a button to request unit coordinator assistance if the client enters their personal email address. This triggers an automatic inference request for the lecturer details associated with the unit currently being discussed (e.g. what is the lecturer's email address?), and the lecturer is then emailed a transcript of the entire conversation. The lecturer can then respond to the client's unfulfilled queries via email exchange.

4. System architecture

4.1. Implementation

C-MCRDR was incorporated in the development of the Intelligent Conversation System for the target pedagogical domain, however the implemented application was designed to be non domain-specific. The architecture can be seen in Fig. 4 and it consists of the following:

1. Java EE application (with a Tomcat Application Server back-end). GUI interfaces include:
   (a) the provision of knowledge-acquisition from the Domain Expert (DE);
   (b) dictionary management;
   (c) query construction and preview;


Fig. 4. ICS architecture.

   (e) speech-to-text (STT) and text-to-speech (TTS) pre- and post-processing;
2. Client web interface (Google Chrome is required for Google Speech API) written in a combination of JSP and Javascript/JQuery;
3. Technology-specific handler applications (Java Servlets, not shown in Fig. 4) for alternate speech-only interfaces via commodity voice-enabled Intelligent Personal Assistant (IPA) devices such as Google Home and Amazon Echo;
4. External RDBMS that contains referential domain knowledge.

Client (user) input arrives at the top left of Fig. 4 and the final response occurs at the bottom left. IPA ASR input (not shown) occurs at the STT Pre-processing stage (allowing ASR transcription error correction), and IPA output occurs at the TTS Pre-processing stage.

5. Evaluation setup

A pilot C-MCRDR system was implemented, with a KB that was targeted against a suitable domain. Criteria used to determine a suitable domain included requiring the existence of an in situ database, a domain expert with suitable knowledge of the existing systems, suitability of a question and answer paradigm approach, and the availability of participants for a feasibility evaluation study. A web-based documentation generation and retrieval system for course data at the authors' University matched the criteria. As the primary author of this manuscript was the original software developer for the existing documentation generation system, the author assumed the domain expert role. The documentation system is frequently accessed by undergraduate students (who were recruited as participant volunteers) and its database contains details about each unit taught by the ICT discipline, such as a unit's title, 6-character unit code, lecturing staff, teaching pattern, learning outcomes, assessment items and due dates, et cetera.

Table 4
Required number of standard MCRDR rules (Nf = number of fields, Nu = number of units, Calc = total calculation, Total = total number of equivalent standard MCRDR rules).

                                      Nf   Nu   Calc      Total
  Standard items (1 per unit)         28   34   28×34×1   952
  Assessment items (4 per unit)       2    34   2×34×4    272
  Global queries (independent of Nu)                      1
                                                Total:    1225
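The Table 4 totals can be checked arithmetically; the snippet below simply reproduces the stated calculation.

```python
# Reproducing the Table 4 rule-count calculation: 28 database fields per
# unit outline and 2 fields for each of 4 assessment items, across 34
# units, plus one global query independent of the unit count.
units = 34
standard_rules   = 28 * units * 1   # standard items, 1 per unit
assessment_rules = 2 * units * 4    # assessment items, 4 per unit
global_rules     = 1
total_rules = standard_rules + assessment_rules + global_rules
```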

We then evaluated whether this domain can be complemented by a question and answer paradigm where students can ask common questions such as "Who is the lecturer for KIT101?" and "How many assignments are there for KIT102?". The KB was constructed by the primary author with an anticipatory coverage of the key items that students refer to in the documentation system, and during evaluation, no rules were added or refined (even though C-MCRDR KA could have easily facilitated misclassification corrections). Participants were asked to assess both the usability of the system and the appropriateness of each of the system's responses to their questions on a 5-point Likert scale via an integrated feedback system.

6. Results and discussion

The results in this section detail the effect of applying C-MCRDR in the domain. We examine the domain's effective rule-count reduction, the rules satisfied, overall system performance (in terms of appropriate responses) and user acceptance following the evaluation trial.

6.1. Rule-count reduction

One of the key features of C-MCRDR is the reduction in rule count due to post-inference querying and implicit reference to topical context. This is demonstrated by the results from the target domain: here the ICT Discipline delivered 34 distinct units that required generation of unit outline documents during the data-gathering phase of this research, Semester 1 2017. Associated with these units were 140 individual assessment tasks (giving an average of 4 assessment items per unit), 20 coordinating lecturers and 23 distinct teaching teams (consisting of 1 or 2 people). Overall, in terms of database accesses, each individual unit outline document refers to 28 specific database fields, and each of the 4 associated assessment items refers to two database fields. Table 4 shows the standard MCRDR approach's minimum number of rules needed to provide the same classification outcomes as C-MCRDR in this domain.

The totals shown in Table 4 are the number of MCRDR rules required to provide complete coverage of equivalent responses for each distinct database data item. Without post-inference query referencing, the contents of the referential database would in effect need to be flattened and then directly encoded by rules in the KB. For a relatively simple domain, such a high rule count (1225) is excessive.


Table 5

Comparison user dialog (1: C-MCRDR, 2: MCRDR).

  Question                                              R#      Response
  1: I'd like to talk about KIT101                      {R1}    We're discussing KIT101 Programming
  2: I'd like to talk about KIT101                      {R1}
  1: Who is the coordinator?                            {R2}    The coordinator is Dr Jane Smith
  2: Who is the KIT101 coordinator?                     {R1}
  1: What are their contact details?                    {R3}    Jane Smith: extension x9992, jsmith@x.y.edu
  2: What are the KIT101 coordinator contact details?   {R3}
  1: What's the assessment pattern?                     {R4}    The unit has 7 assignments
  2: What is the KIT101 assessment pattern?             {R4}
  1: Let's discuss KIT371                               {R1}    We're discussing KIT371 Algorithms
  2: Let's discuss KIT371                               {R133}
  1: Contact details?                                   {R3}    John Jones: extension x9971, jjones@x.y.edu
  2: What's the KIT371 coordinator contact details?     {R135}

Fig.5. ExampleC-MCRDRKBforconversationsystem.

Fig.6. ExampleMCRDRKBforconversationsystem.

versationalsamplescomparingthetwoapproachescanbeseenin Table5.

6.2. Evaluation trial

In total, 41 sessions by undergraduate volunteers in the ICT Discipline were logged. Although this is a relatively small sample size, it is larger than and comparable with other similar satisfaction studies (Shabaz et al., 2015; Smith et al., 2014). The sessional data was grouped based on the number of inference requests that were made, and where relevant, participant response ratings were averaged per group.

6.2.1. Rules satisfied during evaluation

It is interesting to note that the frequency distribution, R_f, of rules satisfied during the evaluation, plotted (Fig. 7) against the log of their rank, R_r (which is ordered by decreasing frequency), yields a fitted model that is indicative of a power relationship for R_f, i.e.

R_f = 186.26 × (1 / R_r^α), with α = 1.37

Fig. 7. Rule frequency by rule rank.

This relationship is of interest as it is close to Zipf's law (with α = 1) (Zipf, 1949), where it was observed (and subsequently shown for many other non-linguistic phenomena) that the frequency of words in written texts is inversely proportional to their ranking. This result is perhaps intuitive as the rules' antecedents are closely aligned with the pattern matching of words, albeit in a highly reduced and constrained domain-specific dictionary subset of English.
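A fit of this form can be reproduced in outline with ordinary least squares in log-log space, since log R_f = log C − α log R_r is a straight line. The frequencies below are synthetic values generated from the reported parameters (C = 186.26, α = 1.37) as a sanity check; the actual per-rule counts are not reproduced here.

```python
import math

def fit_power_law(ranks, freqs):
    """Fit freqs ~ C / ranks**alpha by least squares in log-log space:
    log f = log C - alpha * log r is linear, so a simple linear regression
    of log f on log r recovers alpha (negated slope) and C (exp intercept)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # (C, alpha)

# Synthetic frequencies from the reported parameters, for the 36 non-default rules.
ranks = range(1, 37)
freqs = [186.26 / r ** 1.37 for r in ranks]
C, alpha = fit_power_law(ranks, freqs)
print(round(C, 2), round(alpha, 2))  # -> 186.26 1.37
```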

6.2.2. Question and response categorisation

Once the sessional data was obtained, we then manually examined the log files to categorise the system's responses to inference requests – was the participant's utterance valid, did the system capture the correct intent, et cetera. This led to the definition of four categories (see Table 6) that were applied in the discussion of results that follow.

6.2.3. System performance


Table 6
Inference Result Categories (C).

C | Inference source and response
1 | Valid question, valid system response
2 | Valid question, misinterpreted system response
3 | Valid question, invalid (default) system response
4 | Invalid question, default system response

Table 7
Categorised System Responses (N_i = number of inference requests).

Category | N_i | % Total
1 | 252 | 52.72
2 | 21 | 4.39
3 | 73 | 15.27
4 | 132 | 27.62
Total | 478 | 100

Responses that satisfied only the default rule, Rule 0 (categories C3 and C4), did not produce a useful result for the user, while responses that satisfied Rules 1-36 (categories C1 and C2) had a combined frequency of 57%. However, this does not consider when Rule 0 responses are appropriate, or when Rules 1-36 responses are inappropriate, so we instead consider two overall categories – appropriate system responses and inappropriate system responses. In Table 7, all inference requests (N_i) are grouped into their C1-C4 categories and expressed as a percentage of the total of all inference requests. We can then summarise the system's performance in terms of the appropriateness of system response:

1. The system is responding appropriately (combine categories C1 and C4): 80.3% of all responses

2. The system has inappropriate responses (combine categories C2 and C3): 19.7% of all responses
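The two combined rates follow directly from the Table 7 counts; a minimal arithmetic check:

```python
# Category counts from Table 7 (number of inference requests per category).
counts = {1: 252, 2: 21, 3: 73, 4: 132}
total = sum(counts.values())            # 478

appropriate = counts[1] + counts[4]     # valid answers plus correct defaults
inappropriate = counts[2] + counts[3]   # misinterpretations plus missed valid questions

print(f"{100 * appropriate / total:.1f}%")    # -> 80.3%
print(f"{100 * inappropriate / total:.1f}%")  # -> 19.7%
```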

The overall appropriate system response rate of 80.3% is very promising considering the fact that the developed C-MCRDR KB is a minimal encoding of the domain knowledge. This result can be improved by reducing the category C2 and C3 rates – these types of responses can be corrected by the addition of new rules, which in C-MCRDR terms is a refinement of the final classifications; however, during evaluation this was not conducted. C2 responses indicate key terms in the utterance (the actual intent) have been ignored or misinterpreted, whereas C3 responses indicate the system is unable to respond to a valid (but out-of-scope) question. The category C4 rates, while still appropriate as responses to nonsensical input, are difficult to reduce without more intensive user training, further brittleness mitigation, or constraining input mechanisms to text-only (as the largest contributor to this category came from ASR errors – see Section 6.2.4). Further rule maintenance would improve the overall results and this will be considered in future work.

Analysing system performance by category C1 responses alone yields Fig. 8. The Ideal response rate is shown, [NCat1 = N_i], as is the Best fit linear regression model, [NCat1 = 0.52 × N_i]. This model shows that fundamentally, across all participation sessions, 52% of inference requests are valid and receive a non-default response. There are two obvious outliers – one at N_i = 8, NCat1 = 1 and another at N_i = 14, NCat1 = 1. In both cases their primary input was via speech (75% and 78% of their requests respectively), and they failed to articulate a single unit code.

Fig. 8. Number of Category 1 Responses [NCat1 = 0.52 × N_i], R² = 0.93, p < 2.2 × 10^−16.

Table 8
Category models.

Cat | Model | Statistics
1 | NCat1 = 0.52 × N_i | R² = 0.93, p < 2.2 × 10^−16
2 | NCat2 = 0.042 × N_i | R² = 0.19, p < 2.3 × 10^−3
3 | NCat3 = 0.15 × N_i | R² = 0.54, p < 1.6 × 10^−8
4 | NCat4 = 0.29 × N_i | R² = 0.71, p < 1.6 × 10^−12

Table 8 contains a summary of the linear regression model results for each category against the number of inference requests (scatter plots are not shown), which approximately agree with the results in Table 7. For categories C2, C3 and C4 there is considerable variability as indicated by the low R² values; there were several outliers in the data, for example, for the session with N_i = 12, 5 out of 12 (42%) of requests were category 2 misinterpretations (the actual intents were out-of-scope). Examples of utterances in all categories can be found in Table 9.
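The per-category models in Table 8 are regressions through the origin (NCat = k × N_i), for which the least-squares slope has a simple closed form. A minimal sketch, using hypothetical session data rather than the study's logs:

```python
def slope_through_origin(x, y):
    """Least-squares slope k for the zero-intercept model y = k * x:
    minimising sum((y - k*x)**2) gives k = sum(x*y) / sum(x*x)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Hypothetical per-session data (illustrative only, not the study's data):
# inference requests per session vs. Category 1 responses per session.
n_i    = [4, 8, 10, 12, 20]
n_cat1 = [2, 4, 5, 6, 11]
k = slope_through_origin(n_i, n_cat1)
print(round(k, 2))  # -> 0.53
```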

6.2.4. Automatic speech recognition (ASR) errors

The speech input and output components of the system were not fully utilised or embraced in this domain. This is not surprising as most evaluation was conducted in noisy laboratory-based environments, which made spoken questions and responses harder to correctly recognise and hear. There was a scarcity of data logged, with only 11 sessions using speech; the summary is shown in Table 10. The ASR Error Rate is the mean percentage of ASR transcription errors that occurred across all session requests that used speech, and the ASR Duration is the average measure, for sessions where speech was used, of how many requests were actually conducted by speech.

The high standard deviations in Table 10 are indicative that particular (outlier) users had greater success with speech input. For example, a session at N_i = 27 used ASR for 90% of their session, yet only suffered 11% ASR errors. In contrast, a session at N_i = 14 used ASR for 79% of their session duration, and suffered 64% ASR errors. The majority of participation sessions immediately modified the input mechanism from speech to text, but for those sessions that started with speech, 72.7% persevered with it for an average of 75.91% of the session requests. We mitigate ASR transcription errors for common mistakes (for example, correcting unit code utterances) via rule-based corrections. During evaluation, however, no additional corrective rules were added beyond the initial set.

6.2.5. System satisfaction feedback


Table 9
Actual category utterance examples.

Cat | Utterance | Response | Comment
1 | what is the assessment pattern? | R1-36 | Valid question and response
1 | what are the prerequisites? | R1-36 | Valid question and response
2 | what computer labs can I use? | R1-36 | Misinterpreted, out-of-scope; "labs" is a synonym for "classes" and the response summarised the types of classes in the unit. The actual intent was to determine which computer rooms were available (timetabling)
2 | KIT001 tutorial times? | R1-36 | Misinterpreted, out-of-scope; KIT001 matches a unit code (Rule 3), and "tutorial times" was ignored. Timetable schedules were out-of-scope but were frequently asked for
3 | who is the tutor? | R0 | Valid question, but out-of-scope; tutor allocation requires timetabling data
3 | what work can KIT001 lead to? | R0 | Valid question – some analysis and inclusion of more synonyms such as "lead to" for the dictionary term "learning outcomes" may have facilitated this question
4 | when cold idle | R0 | Invalid question
4 | what about Co 80205 | R0 | Invalid question

Table 10
ASR data, |N_i| = 11.

Statistic | μ | σ
ASR Error Rate | 26.14% | 23.20%
ASR Duration | 75.91% | 35.18%

Fig. 9. Count of System Feedback Scores (|N_i| = 21).

The majority of overall system feedback scores (on a 5-point Likert scale, 1 = I am very dissatisfied, 5 = I am very satisfied) were ratings at levels 4 and 5 (28.57% and 33.33% respectively), and there was a combined (levels 1 and 2) negative rating of 14.28% – see Fig. 9. Overall, only 3 users rated the system below 3.

6.2.6. Rule satisfaction feedback

Approximately one quarter (26.6%) of individual question responses received feedback (|N_i| = 127, out of |N_total| = 478), and on average, for sessions where feedback was recorded, 45.3% of the session's questions had feedback. This feedback rate may have been improved if the user interface withheld further inference responses until the current response had feedback, but this punitive-style measure could unduly influence the negative feedback response rate. Although not shown here, a breakdown of feedback against individually satisfied rules is very promising as the combined positive feedback (levels 4 and 5) is 53.55% and dissatisfaction (levels 1 and 2) is relatively low – 28.35%. Nearly a fifth (18.1%) of feedback was at the ambivalence level (3). Unsurprisingly the default rule (Rule 0) rates poorly. This is indicative of not receiving a useful response, although 3 sessions interestingly rated this rule at rank 5. Rule 0 achieved the lowest feedback score, but the highest number of feedback responses (38).

Fig. 10. Count of Combined Category Feedback Scores: Appropriate System Responses (|N_i| = 94); Inappropriate System Responses (|N_i| = 33).

6.2.7. Category satisfaction feedback

The final figure, Fig. 10, shows the breakdown of feedback scores when considering the appropriateness of the system response. Considering the overall positive, negative and ambivalent feedback results, we have:

Appropriate system responses: positive feedback: 70.22%, |N_i| = 66; negative feedback: 18.09%, |N_i| = 17; ambivalent feedback: 11.70%, |N_i| = 11;

Inappropriate system responses: positive feedback: 6.06%, |N_i| = 2; negative feedback: 57.57%, |N_i| = 19; ambivalent feedback: 36.36%, |N_i| = 12.


Table 11
Category Rating Total (N_f = number of feedback responses).

Category | N_f | % Total
1 | 75 | 59.06
2 | 14 | 11.02
3 | 19 | 14.96
4 | 19 | 14.96
Total | 127 | 100

… don't understand), which may lead to some frustration. When considering system misbehaviour, unsurprisingly the combined negative feedback scores are high (57.57%); however, it is apparent that users are more reticent to leave feedback for system misbehaviour compared to appropriate behaviour, as this type of response only received about a third of the feedback responses (|N_i| = 33 compared to |N_i| = 94).

Table 11 summarises the overall counts of feedback provided for each individual category. It is of no surprise that category C1 responses received the most feedback, and although not shown, 84.0% (|N_i| = 63) of those were positive feedback (Rank 4: 12%, |N_i| = 9; Rank 5: 72%, |N_i| = 54).

7. Conclusions

In this work our C-MCRDR KBS modified standard MCRDR to facilitate the retention of topical and attribute-based context between inference requests, and to reduce the KB rule-count when used in domains with in situ databases by including generic querying bound by the intra-dialog context. This was used to develop a prototype natural language chat system consisting of dialog rules that can be authored by domain experts who possess base-level ICT technical knowhow and a minimal linguistic analytical skillset. The system adopts a simple GUI interface that incorporates dialog pattern matching, variable and dictionary definition (lexical and phrasal paraphrases), technology-agnostic query construction, and KB rule construction based on these components.

The system was evaluated in a pedagogical question and answer domain by undergraduate students in order to ascertain the feasibility of the approach. In terms of performance, the system responded appropriately to requests 80.3% of the time and inappropriately 19.7% of the time (due to being presented with valid requests that were misinterpreted or valid requests that were actually out of scope). 61.9% of participants' satisfaction feedback scores rated the overall system at levels 4 (I am satisfied) and 5 (I am very satisfied) on a 5-point Likert scale. When considering individual inference responses, 70.22% of participants' satisfaction feedback scores rated them at levels 4 (My answer is correct but some information is missing) and 5 (My answer is exactly what you are seeking) when the system was responding appropriately. For inappropriate responses, the satisfaction ratings drop to scores of 1 (My answer is totally wrong) and 2 (My answer is partially wrong) for 57.57% of participants' satisfaction ratings (and an ambivalent rating of 3 for 36.36% of scores).

Finally, the C-MCRDR KB in this domain achieved a very significant 97% reduction in the rule count compared to the equivalent, standard MCRDR approach. The satisfaction ratings and the significant rule-count reduction demonstrate the feasibility of the approach and show significant promise for expansion to a production system in the existing and other compatible domains.

8. Futurework

Future work includes the addition of more rules (with additional database references) by the domain expert for the Unit Outline domain to cover cases not adequately classified (for example, requests for timetabling data), integration and adoption in other domains, provision for the analysis of more syntactic and semantic features such as Part-Of-Speech (POS) tagging and Named Entity Recognition (NER) (Hirschberg & Manning, 2015; Manning et al., 2014), and expanding dictionary terms by enabling references to ontological databases such as WordNet (Miller, 1995) for synonym matching. Consideration will also be given to expanding the rule-based approach to correcting ASR transcription errors, especially when IPA devices are used – this includes determining regions in the decision tree where a more contextual correction mechanism can be adopted. Inter-domain applicability will be addressed by providing further support for KB rule, query, dictionary and ASR correction rule reuse through suitable interface options, saving a domain expert considerable time when considering a new domain. The developed system will also be integrated as the primary user interface component in a wider study for the control of autonomous systems by natural language in order to achieve hierarchical task-based goals.

Acknowledgements

This research has been supported financially via a grant (number FA2386-16-1-4045) from the Asian Office of Aerospace Research and Development (AOARD). The research is also supported by an Australian Government Research Training Program Scholarship and it has University of Tasmania Ethics Approval, number H0016281.

References

Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7(1), 39–59.

Adams, P. H., & Martell, C. H. (2008). Topic detection and extraction in chat. In Semantic computing, 2008 IEEE international conference on (pp. 581–588). IEEE.

Androutsopoulos, I., Ritchie, G. D., & Thanisch, P. (1995). Natural language interfaces to databases – an introduction. Natural Language Engineering, 1(1), 29–81. doi:10.1017/S135132490000005X.

Bais, H., Machkour, M., & Koutti, L. (2016). Querying database using a universal natural language interface based on machine learning. In Information technology for organizations development (IT4OD), 2016 international conference on (pp. 1–6). IEEE.

Biermann, J. (1998). HADES – a knowledge-based system for message interpretation and situation determination. In International conference on industrial, engineering and other applications of applied intelligent systems (pp. 707–716). Springer.

Chaw, S. Y. (2009). Addressing the brittleness of knowledge-based question-answering. Ph.D. thesis, University of Texas.

Clark, P., Harrison, P., Jenkins, T., Thompson, J. A., Wojcik, R. H., et al. (2005). Acquiring and using world knowledge using a restricted subset of English. In FLAIRS conference (pp. 506–511).

Compton, P. (2011). Challenges with rules. Technical report, Pacific Knowledge Systems. Retrieved December 12, 2017 from http://pks.com.au/technology/resources/

Compton, P., & Jansen, R. (1990). Knowledge in context: A strategy for expert system maintenance. AI'88, 292–306.

Cunningham, H., Maynard, D., & Tablan, V. (1999). JAPE: a Java Annotation Patterns Engine. Research Memorandum CS-99-06, Department of Computer Science, University of Sheffield.

Cunningham, H., Maynard, D., Bontcheva, K., & Tablan, V. (2002). A framework and graphical development environment for robust NLP tools and applications. In ACL (pp. 168–175).

Figueroa, A. (2017). Automatically generating effective search queries directly from community question-answering questions for finding related questions. Expert Systems with Applications, 77, 11–19. doi:10.1016/j.eswa.2017.01.041.

Gaines, B., & Compton, P. (1992). Induction of ripple down rules. Proceedings, Australian AI, 92, 349–354.

Gaines, B. R., & Compton, P. (1995). Induction of ripple-down rules applied to modeling large databases. Journal of Intelligent Information Systems, 5(3), 211–228.

Gaines, B. R., & Shaw, M. L. G. (1993). Knowledge acquisition tools based on personal construct psychology. The Knowledge Engineering Review, 8(1), 49–85. doi:10.1017/S0269888900000060.

Galgani, F., Compton, P., & Hoffmann, A. (2015). LEXA: Building knowledge bases for automatic legal citation classification. Expert Systems with Applications, 42(17), 6391–6407. doi:10.1016/j.eswa.2015.04.022.

Han, S. C., Mirowski, L., Jeon, S.-H., Lee, G.-S., Kang, B. H., & Turner, P. (2013). Expert systems and home-based telehealth: Exploring a role for MCRDR in enhancing diagnostics. In International conference, UCMA, SIA, CCSC, ACIT – 2013: vol. 22 (pp. 121–127).

Han, S. C., Yoon, H.-G., Kang, B. H., & Park, S.-B. (2014). Using MCRDR based agile approach for expert system development. Computing, 96(9), 897–908. doi:10.1007/s00607-013-0336-y.

Hendrix, G. G., Sacerdoti, E. D., Sagalowicz, D., & Slocum, J. (1978). Developing a natural language interface to complex data. ACM Transactions on Database Systems (TODS), 3(2), 105–147.

Hirschberg, J., & Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245), 261–266. doi:10.1126/science.aaa8685.

Ho Kang, B., Yoshida, K., Motoda, H., & Compton, P. (1997). Help desk system with intelligent interface. Applied Artificial Intelligence, 11(7–8), 611–631.

Horn, K. A., Compton, P., Lazarus, L., & Quinlan, J. R. (1985). An expert system for the interpretation of thyroid assays in a clinical laboratory. Australian Computer Journal, 17(1), 7–11.

Kang, B., Compton, P., & Preston, P. (1995). Multiple classification ripple down rules: evaluation and possibilities. In Proceedings 9th Banff knowledge acquisition for knowledge based systems workshop (pp. 17.1–17.20).

Kang, B. H. (1995). Validating knowledge acquisition: Multiple classification ripple down rules. Ph.D. thesis, University of New South Wales.

Li, Y., Yang, H., & Jagadish, H. V. (2005). NaLIX: an interactive natural language interface for querying XML. In Proceedings of the 2005 ACM SIGMOD international conference on management of data (pp. 900–902). ACM.

Madnani, N., & Dorr, B. J. (2010). Generating phrasal and sentential paraphrases: A survey of data-driven methods. Computational Linguistics, 36(3), 341–387.

Mak, P., Kang, B.-H., Sammut, C., & Kadous, W. (2004). Knowledge acquisition module for conversation agent. Technical report, School of Computing, University of Tasmania.

Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., & McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In Proceedings of 52nd annual meeting of the association for computational linguistics: system demonstrations (pp. 55–60).

Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38(11), 39–41. doi:10.1145/219717.219748.

Minsky, M. (1975). A framework for representing knowledge. The Psychology of Computer Vision, 73, 211–277.

Miranda-Mena, T. G., Sandra L. Benítez, U., Ochoa, J. L., Martínez-Béjar, R., Fernández-Breis, J. T., & Salinas, J. (2006). A knowledge-based approach to assign breast cancer treatments in oncology units. Expert Systems with Applications, 31(3), 451–457. doi:10.1016/j.eswa.2005.09.076.

Nguyen, D. Q., Nguyen, D. Q., & Pham, S. B. (2017). Ripple down rules for question answering. Semantic Web, 8(4), 511–532.

Oh, K.-J., Choi, H.-J., Gweon, G., Heo, J., & Ryu, P.-M. (2015). Paraphrase generation based on lexical knowledge and features for a natural language question answering system. In Big data and smart computing (BigComp), 2015 international conference on (pp. 35–38). IEEE.

Pham, K. C., & Sammut, C. (2005). RDRVision – learning vision recognition with ripple down rules. In Proceedings of the Australasian conference on robotics and automation (p. 7).

Richards, D. (2009). Two decades of ripple down rules research. The Knowledge Engineering Review, 24(2), 159–184. doi:10.1017/S0269888909000241.

Shabaz, K., O'Shea, J. D., Crockett, K. A., & Latham, A. (2015). Aneesah: A conversational natural language interface to databases. In World congress on engineering (pp. 227–232).

Shiraz, G. M., & Sammut, C. (1997). Combining knowledge acquisition and machine learning to control dynamic systems. In International joint conference on artificial intelligence (pp. 908–913). Morgan Kaufmann.

Shirazi, H., & Sammut, C. A. (2008). Acquiring control knowledge from examples using ripple-down rules and machine learning. Iranian Journal of Science and Technology, 32(B3), 295–304.

Smith, E. V., Crockett, K., Latham, A., & Buckingham, F. (2014). Seeker: A conversational agent as a natural language interface to a relational database. In Proceedings of the world congress on engineering (pp. 191–196). Newswood/International Association of Engineers.

Wallace, R. (2011). Artificial Intelligence Markup Language (AIML). Web page, A.L.I.C.E. AI Foundation. Retrieved October 24, 2017 from http://www.alicebot.org/TR/2011/

Waltz, D. L. (1978). An English language question answering system for a large relational database. Communications of the ACM, 21(7), 526–539. doi:10.1145/359545.359550.

Warren, D. H. D., & Pereira, F. C. N. (1982). An efficient easily adaptable system for interpreting natural language queries. Computational Linguistics, 8(3–4), 110–122.

Wille, R. (1992). Concept lattices and conceptual knowledge systems. Computers and Mathematics with Applications, 23(6–9), 493–515.

Woods, W. A., Kaplan, R. M., & Nash-Webber, B. (1972). The lunar sciences: Natural language information system: Final report. Bolt Beranek and Newman.

Zipf, G. K. (1949). Human behavior and the principle of least effort: An introduction to human ecology. Addison-Wesley.
