AGENT TECHNOLOGY FORPERSONALIZED INFORMATIONFILTERING: THEPIA
SYSTEM
SAHIN ALBAYRAK
∗
, STEFAN WOLLNY
†
, ANDREAS LOMMATZSCH
‡
, AND DRAGAN MILOSEVIC
‡
Abstrat. Astodaytheamountofaessibleinformationisoverwhelming,theintelligentandpersonalizedlteringofavailable
informationisa mainhallenge. Additionally,there isa growing need for the seamlessmobileand multi-modalsystemusage
throughoutthewholedaytomeettherequirementsofthemodernsoiety(anytime,anywhere, anyhow). Apersonalinformation
agentthatisdeliveringthe rightinformationat the righttimebyaessing,lteringand presentinginformationina
situation-awarematterisneeded. ApplyingAgent-tehnology ispromising,beausethe inherentapabilitiesofagentslikeautonomy,
pro-andreativenessoeranadequateapproah. Wedevelopedanagent-basedpersonalinformationsystemalledPIAforolleting,
ltering,and integratinginformationataommonpoint,oeringaessto theinformationbyWWW, e-mail,SMS,MMS,and
J2MElients. Pushand pulltehniquesareombinedallowingthe usertosearh expliitlyforspeiinformationon the one
handandtobeinformedautomatiallyaboutrelevantinformationdividedinpre-,workandrereationslotsonthe otherhand.
IntheoreofthePIAsystemadvanedlteringtehniquesaredeployedthroughmultiplelteringagentommunitiesfor
ontent-basedandollaborativeltering. Information-extratingagentsareonstantlygatheringnewrelevantinformationfromavarietyof
seletedsoures(internet,les,databases,web-servieset.). Apersonalagentforeahuserismanagingtheindividualinformation
provisioning,tailoredtotheneedsofthisspeiuser,knowingtheprole,theurrentsituationandlearningfromfeedbak.
Keywords. intelligentandpersonalizedltering,ubiquitousaess,reommendationsystems,agentsandomplexsystems,
agent-baseddeployedappliations,evolution,adaptationandlearning.
1. Introdution. Nowadays, desiredinformation oftenremains unfound, beauseit is hidden in ahuge
amount of unneessary and irrelevant data. On the Internet there are well maintained searh engines that
are highly useful if you want to do full-text keyword-searh [1℄, but they are not able to support you in a
personalized way and typially do not oer any push-serviesor in other words no information will be sent
to you when you are not ative. Also, as they normally do not adapt themselves to mobile devies, they
annot be used throughout a whole day beause you are not sitting in front of a standard browser all the
timeand whenyoureturn,these systemswilltreatyouin theverysamewayasifyouhaveneverbeenthere
before (nopersonalization, nolearning). Users whoare notfamiliar withdomain-spei keywordswon'tbe
ableto do suessfulresearh, beause no support is oered. Predened or auto-generatedkeywords for the
searh-domainsareneededtollthat gap. Asinformationdemands areontinuouslyinreasingtodayandthe
gatheringof information is time-onsuming, there is agrowingneed for a personalized support (Figure 1.1).
Labor-savinginformationisneededto inreaseprodutivityatworkandalso thereis aninreasingaspiration
for a personalized oer of general information, spei domain knowledge, entertainment, shopping, tness,
lifestyleandhealthinformation. Existingommerialpersonalizedsystemsarefarawayfromthatfuntionality,
astheyusuallydonotoermuhmorethanallowingtohoosethekindofthelayoutorolletingsomeofthe
oeredinformationhannels(andtheinformationwithin isnotpersonalized).
Tooveromethatsituationyouneedapersonalinformationagent(PIA)whoknowsthewayofyourthinking
andanreallysupportyouthroughoutthewholedaybyaessing,lteringandpresentinginformationtoyouin
asituation-awarematter(Figure1.1). Somesystemsexist(Fab[2℄,Amalthaea[3℄,WAIR[4℄,P-Tango[5℄,T
rip-Mather[6℄,PIAgent[7℄,Letizia[8℄,Let'sBrowse[9℄,Newt[10℄,WebWather[11℄,PEA[12℄,BAZAR[13℄)that
implementadvanedalgorithmitehnology,butdidnotbeomewidelyaepted,maybebeauseofrealworld
requirementslikeavailability,salabilityandadaptationtourrentandfuturestandardsandmobiledevies.
In this paper we present an agent-basedapproah for the eient, seamless and tailoredprovisioning of
relevantinformation on the basis of end-users' daily routine. As we assume the reader will befamiliar with
agent-tehnology(see[14,15℄foragoodintrodution),wewillonentrateonthepratialusageandthe
real-world advantages of agent-tehnology. After briey desribing theexisting systemsfrom whih thesienti
publiationsareavailable,wedesribethedesignandarhitetureandafterwardsdepitthesysteminSetion4.
2. Related work. The followingparagraphsare goingto briey present someofthe alreadymentioned
systems(Fab,Amalthaea,WAIR,P-Tango,PIAgent,Letizia,Let'sBrowse,Newt,WebWatherandPEA), for
whih webelievethatarerelatedtoourwork.
∗
DAI-Labor,TehnialUniversityBerlin,Germany(sahindai-lab.de)
†
HighPerformaneComputing,ZuseInstituteBerlin,Germany(wollnyzib.de)
‡
Fig.1.1.Demandforapersonalinformationagent
Fab[2℄isanautomatireommendationservieforinformationretrieval,whih isabletoovertimeadapt
toitsusers,whoonsequentlyreeiveinreasinglypersonalizeddouments. Bymaintainingbotholletionand
seletionagents,Fabsystemisagoodtest-bedfortryingoutdierentlteringstrategies, whih eitherollet
doumentsfrom the Webthat belong to theertain topi,orseletsomeofthe olleteddoumentsthat are
suitableforapartiularuser. Thereationofprolesthroughtheontent-basedanalysis,whihareafterwards
diretly ompared to nd similar users for ollaborative reommendations, represents the unique synergy of
these twofrequentlyombined lteringtehniques. Unfortunately, theusability ofthe whole systemdepends
onthe ability of theontent basedlteringto generate theusableproles, being theserious drawbakof the
Fabsystem.
Sine information disoveryand information lteringare proven to bethe suitable domains for applying
multi-agenttehnology,apersonalizedsystem,namedAmalthaea[3℄hasbeendeveloped. Itproativelytriesto
disoverfromvariousdistributedsourestheinformationthatisrelevanttoauser. Themulti-agenttehnology
is applied by maintainingtwodierent typesof agents, beinginformation lteringand information disovery
ones. The ways how these agents are managing to learn the user's interests and habits, to maintain their
ompetenebyadapting to thehanges inthe user'sinformationneeds,and toexplore thenewdomains that
maybeofinteresttoauser,dependonevolutionprogramming,beingmaybenotsoappliableforthelarge-sale
informationretrievaltasks.
Seekingthestateofauserprole,whihbestrepresentsatualinformationinterestsandthereforemaximize
the expeted value of the umulative user relevane feedbak, is formulated in WAIR multi-agentsystem [4℄
as the reinforement learning problem. The insuieny of expliit user ratings is tried to be overome by
using the lassiationapproah based on the neural network,whih exploits dierent impliit indiators of
interests in order to estimate the real relevane feedbak values. Unfortunately, the amount of the expliit
ratingsneededfortrainingthat lassierstill seemstobetoolarge. Thislearlylimitstheappliabilityofthe
WAIRsystem.
Tointelligentlydeliverapersonalizednewspaper, whih ontainsonlytheartiles ofhighestinterest that
areindividually seletedeverydayforeah andeveryuser,P-Tango[5℄systemproposesaframeworkfor
om-biningdierentlteringstrategies. Althoughtheurrentlyombinedstrategiesareonlytheontent-basedand
ollaborativeones,aproposedframeworkissigniant,byreasonofbeingextendibleto anylteringmethods.
Inspite of this extensibility, we believe that theagent-basedframeworkthat weproposein this paper,oers
betterexibilitywhentheintegrationandafterwardstheusageofnewstrategiesisonerned.
Astheinformationbeametheoneofmostsigniantresouresforbusinessandresearh,bothperiodially
sanning dierent information soures and pushing the found relevant artiles to interested users, have also
motivated thedevelopmentof PIAgent [7℄. Whileausage of various extratoragents eah supporting a
relevantartilesfromothers. Suhaneuralnetworkapproahhasstrengthinoptimistiallyprovidingexellent
lassiationauray. Unfortunately, its bigweaknessin oftenexpensivetrainingthat pratiallymakesthe
PIAgentto behardlyappliablefornowadaysinformationretrievaltasks.
Theintelligentassistanetotheuser,whoisbrowsingtheInternetfortheinterestinginformation,isprovided
by theautonomous interfae agent, named Letizia [8℄. It traks user behavior and uses various heurististo
antiipate, whih hyperlinksmay leadtothe potentiallyrelevantdouments, andwhih shouldbeignored by
reasonofpointingtojunkornotexistingpage. TheornerstonepropertyoftheLetiziasystemisinaskingthe
userneitherto expliitlystateitsinterestsby deningthe querynortoprovidetheexpliitfeedbakabouta
realrelevaneofreommendations. Althoughthisexpliitommuniationwiththeuseranspeeduplearning,
the priorityin designing theLetizia system has been given to bothletting theuser to browse withoutbeing
interruptedandaskingforhelp onlywhen beingunsurewhihlinktofollow.
TheMITMediaLaboratoryhasalsodevelopedanagent,whosejob isto hoose, fromthelinks reahable
ontheurrentWebpage,thosethatarelikelytobestsatisfytheinterestsofmultipleusers. Theagentisnamed
Let's Browse [9℄, by reason of providing the assistane to the group of humans in browsing, by suggesting
hyperlinkslikelytobeofommoninterests. Althoughthissystemdemonstrateshowdoumentsthataregood
for the group of users and that are in the neighborhood an be found, it generally does not respond to the
hallengeofndingthedatathatisloatedanywhereontheInternet.
Theability to both speialize to userinterests,adapt to preferene hangesand explore the newer
infor-mation domains makes thefoundation of theNewT [10℄, being onepersonalized multi-agentltering system
fornews artiles. As user information interestsare modeled asthe populationof theompeting proles, the
used learning mehanismsare both relevane feedbak, aswell asthe rossoverand mutation geneti
opera-tors. Thesereombination genetioperators aremainly responsiblefor theadaptationand explorationissues
byreatingmorettedfuture populations. Inthemeantime, auserprolealsolearnsthroughtheappliation
of the relevane feedbak tehniques. Takentogether these learning mehanismsmakethe so-alled Baldwin
eet[24℄,sayingthatapopulationevolvestowardsatterformmuhfaster,wheneveritsmembersareallowed
to learnduringtheir lifetime. Althoughthe Baldwinevolutionseemsto bemorepowerfulthan theevolution
approahusedinAmalthaea,ithasthesameweaknesseswhihlimititsappliabilityforlarge-saleinformation
retrieval.
Usersmaynd itdiultbothtoreate theappropriatequeriesand toloatetheinformation ofinterest
in thease ofhavingnospei knowledge aboutthe ontent ofthe underlying doumentolletion. On the
onehand,somesystemsaimtodeployeientlusteringalgorithms,whihwilldynamiallyproduethetable
ofontents,neededtofailitate theusers'browsingativities. Theornerstoneideais tobysomemeanshelp
auserrsttogetanoverviewonerningtheavailable ontent,andthento auratelyexpressitsinformation
needs. On the other hand, WebWather [11℄ ats as the tour guide that provides the assistane, whih is
similar to theguidane ofthe humanin thereal museum. It aompanies usersfrom page to page, suggests
appropriate hyperlinks, and learnsfrom the obtained experiene to improve its advie givingskills. Suh a
systemunfortunatelyanonlyloallyassisttheuser,whihbringsthesamedrawbaksbeingpresentin Letizia
andLet'sBrowsesystems.
PersonalEmail Assistant(PEA)[12℄ltersinomingmailsandranksthemaordingtotheirrelevanein
ordertohelpnowadaysusers,whoeasilyendupwithlargepartoftheirworkingdaybeingspentwithreading
emails. PEA maintains the personal user model that onsists of several proles and uses the evolutionary
algorithmstomovethatmodelonstantlylosertotheurrentinformationneeds. BydoingthatPEA aimsat
assistingusersindealingmoreeetivelywiththeirdailyloadofemailsso thatvaluableworkingtimeissaved
formoreprodutiveandreativetasks. Eventhoughtheevolutionstrategiesseemsto bepowerfulenoughfor
dealingwithemailsinthePEAsystem,theirusagein theInternet-likeenvironmentstillremainstobeagreat
hallenge.
3. Design of PIA: The Personal Information Agent. Tomeet the disussed requirements and to
support theuser in that matter, wedesigneda multi-agentsystemomposed offour lasses of agents: many
informationextratingagents,agentsthatimplementdierentlteringstrategies,agentsforprovidingdierent
kindsofpresentationandonepersonalagentforeahuser. Logially,allthisanbeseenasalassialthreetier
appliation(Figure3.1). Conerningtheinformationextration,generalsearhenginesontheonehandbutalso
Several agentsrealize dierent lteringstrategies(ontent-basedand ollaborativeltering [16℄, [5℄) that
haveto beombinedin an intelligentmatter. Alsoagentsforprovidinginformation ativelyviaSMS, MMS,
Fax,e-mail(push-servies)areneeded. AMulti-aessservieplatformhasto managethepresentationofthe
resultstailoredtotheuseddevieandtheurrentsituation. Thepersonalagentshould onstantlyimprovethe
knowledgeabouthis userbylearningfromthegivenfeedbak,whihisalsotakenforollaborativeltering,as
information thathasbeenrated ashighlyinterestingmightbe usefulforauserwith asimilarproleaswell.
Asusersusuallyarenotverykeenongivingexpliitfeedbak(ratings),impliitfeedbaklikethefat thatthe
userstoredanartileanalsobetakeninto aount[18℄.
A keywordassistantshould support the user in dening queries even if he is not familiar with a ertain
domain. PIAprovidesthreestrategiesforndingadequatekeywordsandforoptimizingexisting requests:
1. Keywords predened by experts for frequentlyrequested topis (or ategories) anhelp the
unexpe-riened user to nd the relevant keywords. The suggestionsprovided by domain experts are usually agood
startingpointforindividualrequests.
2. An alternativemethod for nding interesting keywordsis the extration of words and phrasesfrom
interestingpapers. This strategyhelps the userto identify the keyonepts from apaper that anbeuseful
for ndingother relevant douments. Inontrast to other approahes(like Googles Find similar douments)
thekeywordextrationgivesthe usertheopportunityto adapt extratedkeywordsaordingto thepersonal
interestsandpreferenes.
3. ForoptimizingexistingqueriesthePIAsystemsuggestskeywordsfromsimilarrequests. Foromputing
thesimilaritiesbetweenuserrequeststhesystemsanalysestheoverlappingofuserproles(basedonstems)and
theorelation between theuserratings. Keywordsthat frequently oure in therequests of similar usersare
suggestedto theuseraspotentiallyrelevantsearhterms.
Thewholesystemisdesignedtobehighlysalable,easytomodify,toadaptandtoimproveandtherefore
an agent-based approah that allows to integrate or to remove agents even at run-time is a smart hoie.
The dierent ltering tehniques are needed to provideaurate results, beause the weakness of individual
tehniquesshouldbeompensatedbythestrengthsofothers. Doumentsshouldbelogiallylusteredbytheir
domainstoallowfastaess,andforeahdoumentseveralmodels[19℄willbebuilt,allinludingstemmingand
stop-wordelimination,butsometailoredforveryeientretrievalatrun-timeandotherstosupportadvaned
lteringalgorithmsforahighauray.
Ifthesystemnotiesthattheontent-basedlteringisnotabletooersuientresults,additional
infor-mationshouldbeoeredbyollaborativeltering,i.e. informationthatwasratedasinterestingbyauserwith
asimilar prolewill bepresented.
Withthepush-servies,theuserandeidetogetnewintegratedrelevantinformationimmediatelyandon
amobiledevie, butforuserswhodonotwanttogetnewinformationimmediately,apersonalizednewsletter
also hasto beoered. This newsletter isolleting newrelevantinformation to beonvenientlydeliveredby
e-mails,allowinguserstostayinformedeveniftheyarenotativelyusingthesystemforsometime.
4. Deploymentand evaluation.
4.1. Overview. We implemented the system using Java and standard open soure database and
web-tehnologyandbasedontheJIACIVagentframework[20℄. JIACIVisFIPA2000ompliant[21℄,thatisitis
onformingtothelateststandards.
It onsists ofaommuniationinfrastrutureaswellasserviesfor administratingand organizingagents
(AgentManagementServie,AMSandDiretoryFailitator,DF).TheJIACIVframeworkprovidesavarietyof
managementandseurityfuntions,managementserviesinludingonguration,faultmanagementandevent
logging, seurityaspets inludingauthorization, authentiation and mehanismsfor measuringand ensuring
trust and therefore has been a good hoie to be used from the outset to the development of a real world
appliation.
WithinJIACIV,agentsarearrangedonplatforms,allowingthearrangementofagentsthatbelongtogether
withtheontrol ofat leastonemanager. A lot ofvisualtoolsareoeredto dealwithadministrationaspets.
Figure3.2showsaplatform,inthisasewithagentsforthebuildingofdierentmodelsspeializedfordierent
retrievalalgorithms.
TheprototypialsystemisurrentlyrunningonSun-Fire-880,Sun-Fire-480RandSunFireV65x,whereas
Fig.3.1. ThePIASystemseenasathreetierappliation
Fig.3.2. Severalagentsarebuildingdierentmodelsspeilised fordierentretrievalalgorithms
New ontent is stored, validated and onsolidated in a entral relational database(update-driven).
In-formation an be aessed by WWW, e-mail, SMS, MMS, and J2ME Clients, where the system adapts the
presentationaordingly,usingtheCC/PP(PreferenesProle) withatailoredlayoutforamobile phoneand
aPDA (seeSetion 4.6). Thepersonalizednewsletterandthepush-serviesaresentviae-mail,SMSorMMS.
The user anuse self-dened keywords for a request for information orhoose a ategoryand therefore the
systemwillusealistof keywordspredened byexpertsand updatedsmoothlybylearningfrom ollaborative
ltering. A ombination of both is also possible. The keyword assistant is able to extrat the most import
4.2. Gatheringnew information. Newinformationisonstantlyinsertedinthesystembyinformation
extrationagents,e.g. web-readeragentsoragentsthataresearhingspeieddatabasesordiretories.
Addi-tionalagentsfortheolletionofnewontentaneasilybeintegratedevenatruntime,asallthat isneessary
foranewagentistoregisterhimselfatthesystem,storetheextratedinformationatadeneddatabasetable
and inform the modeling-manager agent about the insertion. As a le reader-agentis onstantly observing
a speial diretory, manual insertionof douments an be done simply by drag-and-dropand an e-mail and
upload-interfaealsoexists.Soure analsobeintegratedbyWebservies. NewReadersanbereatedusing
aeasy-to-handletoolandanothertoolisenablingto onvenientlyobservetheextration-agents,asthis isthe
interfaetotheoutsidethatmightbeomeritialifforexamplethedata-formatofasoureishanged.
4.3. Pre-proessing for eient retrieval. The rst step of pre proessing information for eient
retrievalis theuseof distint tablesin the globaldatabasefordierentdomains like e.g. news, agent-related
papers,et. Dependingonthelteringrequest,tableswithnohaneofbeingrelevantanthereforebeomitted.
The next step is the building of several models for eah doument. Stemmingand stop-wordelimination is
implementedineverymodelbut dierentmodelsarebuiltbyomputingatermimportane eitherbasedonly
onloalfrequenies,orbasedontermfrequenyinversedoumentfrequeny(TFIDF)approah. Furthermore
numberofwordsthatshouldbeinludedinmodelsisdierentwhihmakesmodelseithermoreaurateormore
eient. Createdmodelsareindexedeitherondoumentorwordlevel,whihfailitatetheireientretrieval.
Themanageragentisassigningtheappropriatemodelingagentstostartbuildingtheirmodelsbutmightdeide
(orthehumansystemadministratorantellhim)atruntimetodelaylatesttime-onsumingmodelingativity
for a while if systemload is ritial at a moment. This feature is important for areal-world appliation, as
overloadinghasbeenamainreasonfortheun-usabilityof advaned aademisystems.
4.4. Filteringtehnology. Asthequalityofresultstoapartiularlteringrequestmightheavilydepend
ontheinformationdomain (news,sientipapers,onferenealls),dierentlteringommunitiesare
imple-mented. Foreah domain,there isat leastoneommunitywhih ontainsagentsbeingtailoredto dospei
lteringandmanagingtasksin aneientway. Insteadofhavingonlylteringagents(they willbedesribed
in Setion4.5),eahandeveryommunityhasalsooneso-alledmanageragentthatismainly responsiblefor
doingoordinationof lteringtasks,aswellasooperationwithothermanagers.
Theoordinationis basedonquality,CPU, DBandmemorytness values,whih arethemeasures being
assoiated to eah ltering agent [23℄. These measures respetively illustrate lteringagent suessfulnessin
thepast,itseieny inusingavailableCPUandDBresoures,andtheamountofmemorybeingrequiredfor
ltering. AhigherCPU,DBormemorytnessvaluemeansthatlteringagentneedsapartiularresoureina
smallerextentforperformingalteringtask. Thisfurther meansthat aninsuienyofapartiular resoure
hasasmallerinueneonlteringagentswith ahigherpartiulartnessvalue.
The introdued dierent tness values together with the knowledge about the urrent system runtime
performaneanmakeoordinationbesituationaware(seealso[23℄)inthewaythatwhenapartiularresoure
ishighlyloadedapriorityinoordinationshouldbegiventolteringagentsforwhihapartiularresourehas
aminor importane. Thissituation awareoordinationprovidesawayto balaneresponse time andltering
auray,whihisneededinoveromingtheproblemofndingaperfetlteringresultafterfewhoursoreven
fewdaysofanexpensiveltering.
Instead of assigning lteringtask to the agentwith the best ombination of tnessvalues in the urrent
runtimesituation,managerisgoingto employaproportionalseletionpriniple[24℄wheretheprobabilityfor
theagenttobehosentodoatuallteringisproportionaltothementionedombinationofitstnessvalues.By
notalwaysrelyingonlyonthemostpromisingagents,butalsosometimesoeringajobtootheragents,manager
gives ahane to eah and everyagentto improveits tness values. While theadaptation of quality tness
value anbeaomplished after the user feedbak beame available, the other tness valuesan be hanged
immediatelyafter thelteringwasnishedthroughtheresponse timeanalyses. Theadaptationsheme hasa
dereasinglearningratethatpreventsalreadylearnttnessvaluesofbeingdestroyed,whihfurthermeansthat
provenagentspaysmallerpenaltiesforbadjobsthanthenovieones[17℄.
The underlying oordinationativities, essentially responsible for the mentioned optimal exploitation of
availablesystemresoures,arerepresentedonFigure4.1,givingthesimplestpossibleseletionsenario. Under
the assumptionthat everythinggoesperfetly, the senariostarts with ajob reationand ends with aresult
agent (F) that enapsulates the partiular searhing algorithm (deployed ltering strategies are desribed in
Setion 4.5), is produed results, the manager M an adapt tness valuesbased on the measurement of the
responsetime. ThefoundlteringresultsarenallyreturnedbaktotheuseragentU,andthissimplesenario
ends.
Fig.4.1. Systemarhitetureillustratingagentommuniation forresoure-awareoordination
In the asewhere the reeived ltering task annot be suessfully loally aomplished usually beause
of belonging to unsupported information domain, manager agent has to ooperate with other ommunities.
Whileoordinationtakesplaeinsideeahandeverylteringommunitybetweenmanagerandlteringagents,
ooperationoursbetweenommunitiesamongmanageragents(seealsoFigure4.2). Theooperationisbased
oneitherndingaommunitywhihsupportsgivendomainorinsplittingreeivedtaskonsub-taskswherefor
eahsub-taskaommunitywithgood supportexists.
The information is usually sattered around many dierent, more or less dynami, distributed soures.
Twoornerstone hallengesthereforebeomebothnding whih souresshould beonsultedforresolvingthe
partiularrequest, aswellasputting the foundresultstogether. Whilethe hallenge of searhingfor soures
is known as the database seletion problem, the omposing of a nal result set is often simply referred as
theinformation fusion. Onelightooperationapproah,already published in [25℄, andwhih isbasedon the
appliationoftheintelligentooperativeagents,isgoingtobebrieyillustratedintherestofthissub-setion.
Thefundamental ooperationideaisbased ontheinstallation ofat least onelteringommunityaround
eahdatabase,aswellasonsettingupthesophistiatedmehanisms,whihenablethattheseommunitiesan
eientlyhelpeahotherwhileproessingtheinomingrequests. Althoughthelteringrequestanbesentto
any lteringommunity, themost suitableommunities will beautonomously found, and therequest will be
then dispathed to them. The foundresultswill be nally olleted,and only thebest ones will be returned
tothesenderofthelteringrequest. Themostappealingpropertybehindtheseooperativeproessingisthat
everythingisdonetransparentlyfortheuser, beingnotanymoreforedto manuallythink where therequest
shouldbesent,andwhih obtainedlteringresultsarereallythebestones.
Example(Coordination&Cooperation)Figure4.2presentsahighleveloverviewofthelteringframework
being omposed of three dierent ltering ommunities (FC), where eah ommunity hasone lter manager
agent(M)anddierentnumberofspeializedlteringagents(F).Therearetwodierentdatabases(DB) with
informationfromdierentdomains,andtheyareaessedatleastbyoneommunity. Onthegureooperation
isillustratedasairlewitharrowswhihonnetmanageragents.
Fig.4.2.Filteringframeworkwiththreedierentommunities
strategy, their short desriptions will be given in the following paragraphs in order to make this paper
self-ontainedandtoleartherolesoftheagentsonFigure3.2.
Byusing thetermfrequenyinversedoumentfrequenysheme,theimportanevaluesofdierentwords
anbeomputed,andeahandeverydoumentanbemodelledbyaorrespondingweightedvetor. Whilethe
so-alledLargeFiltering Strategy willalwaysbuild amodel withallwordsfrom adoument,OptimalTwenty,
Optimal Ten,andOptimal FiveFiltering Strategieswill takeintoonsiderationonlytwenty,ten,andvemost
important words, respetively. Themodels, being reatedby these optimal strategies, are thus smaller, and
onsequentlyanbefasterbothloadedintomemoryandomparedwithalteringrequest. Astheyareomitting
manywords,theymightbeatthesametimepotentiallylessaurate,andtheoordinationenginehasahane
todeidewhih onein thegivensituation anbethebestsolution.
Sine the examination of everysingle doument for eah request beomes infeasible even for aolletion
with the modest size, two dierent indexing ltering strategies have been also implemented. The rst one,
named Inverted List Filtering Strategy, reates for every word the list of doumentshaving that word. The
inbuiltsimpliation,tendingtodramatiallyredueasizeofinvertedlists,ismadebynotstoringthepositions
of words in the orresponding douments. While a strategy due to suh a design deision beomes more
eient, it loses its ability to support requests with a phrase. The seond Position Filtering Strategy will
not utilise suh a simpliation regarding not storing the positions, and thus will be ableto aurately nd
doumentswithrequestedphrases. Asthisseondstrategyisnaturallymoreexpensive,thetrade-o,between
providing the aurate results and responding quikly, beomes evident and unavoidable for requests with
phrases.
Thepropertyof fuzzylustering [24℄, to assign doumentsto multiple lusters together withspeifyinga
degreetowhihapartiularartilebelongstoagivenluster,hasbeenusedastheinspirationforarealisation
of a dediated Fuzzy Filtering Strategy. Whileits strength is in keeping short luster summaries in the high
speedmemory,itsgreatestweaknessliesinausedsimpliationtolusterdoumentsinadvanexedlusters.
Thefew dierent versions of this fuzzyltering strategyare nallyimplemented by limitingthe amount ofa
Everymentionedlteringstrategyisalso exploitedforreatingitsappropriatelone, whihwill takeinto
aount onlywordsfrom amanuallyreatedditionary. Bylimiting thevoabulary to fewthousandsinstead
to morethan half a million, underlying models are muh smaller, and thus theunderlying strategiesbeome
moreeient. Unfortunately,thepaidprieliesinthelostofasupportforallrequestswithwordsthatarenot
pre-seleted,resulting in thepotentiallylowerquality of foundltering results. These lone strategiesnally
provide even morefruitful playing ground for both ooperation and oordination mehanisms, whih should
provetheirapabilitieswhileresolvingthementionedtrade-oproblems.
4.6. Presentation. As oneof the main design priniples has been to support the user throughout the
wholeday,thePIAsystemprovidesseveraldierentaessmethodsandadaptsitsinterfaestotheuseddevie
(Figure 4.3). To fulll these requirements an agent platform (Multi Aess Servie Platform)wasdeveloped
thatoptimizesthegraphialuserinterfaefortheaessbyDesktopPCs,PDAs andsmartphones.
If the user wants to use the PIA system, the request is reeived by the Multi Aess Servie Platform
(MASP).TheMASPdelegatestherequesttoanagent,providingthelogiforthisservie. InthePIAsystem
therequestsareforwardedeitherto loginagentorthepersonalagent. Thehosenagentperformstheservie
spei ations and sendsthe MASP anabstrat desriptionof the formularthat should bepresentedto the
user. Forthis purpose the XML basedAbstrat Interation Desription Language(AIDL) has been dened.
BasedontheabstratdesriptionandthefeaturesoftheuseddevietheMASPgeneratesanoptimizedinterfae
presentedto the user. The onversionis implementedasaXSLTtransformation in whih the optimalXSLT
stylesheetisseletedbasedontheCC/PPinformationabouttheuser'sdevie.
TheMultiAessServiePlatformprovidesageneriinfrastrutureforprovidingdevieoptimizedinterfaes
for abig number of devies. Thebasi idea of MASP is to separate the appliation logi from the onrete
interfaedesign. Sotheappliationdeveloperdoesnothavetoopewiththespei harateristioftheeah
relevantdevieand anonentrateontheappliation spei dataowandinterationlogi.
Fig.4.3. InformationaessedbybrowserortailoredforpresentationonaPDAoramobilephone
For dening the interfae of an appliation the XML based Abstrat Interation Desription Language
(AIDL)hasbeendened. Thedenition of auserinteration (typiallyoneweb page)isstrutures asatree
of predened user interfae elements (e.g. label, input eld). An exemplary page desription is shown in
Program1.
TheabstratinterfaedesriptionanbeeasilytransformedintoHTML,PDAoptimizedHTMLorWML.
Ifthe userwantsto haveavoieinterfae, astylesheet for onvertingthe abstrat userinterfaedesription
Program1TheabstratinterationdesriptionofthePIAloginpage
<?xml version="1.0" enoding="UTF-8"?>
<senario name="loginPage">
<input>
<UIElement>
<list name="rootNode">
<UIElement>
<pageSetting name="user_language">
<value>de</value>
</pageSetting>
</UIElement>
<UIElement>
<label name="login__piaLoginQXQ 25">
<value>PIA-Login</value>
</label>
</UIElement>
<UIElement>
<list name="login__data">
<UIElement>
<label name="login__userName">
<value>Benutzername:</v alu e>
</label>
</UIElement>
<UIElement>
<fieldValue name="login__userName_d efau lt">
<value>andreas</value>
</fieldValue>
</UIElement>
</list>
</UIElement>
...
<servieLink name="reateAountServi eLi nk">
<provider address="tpip://127.0.0.1 :73 25" name="PIAgent"/>
<servie name="MAPPresent"/>
<param name="senario">reateA ount </p aram >
</servieLink>
</list>
</UIElement>
</input>
</senario>
ThetransformationoftheabstratinterfaedesriptionisdoneusingExtensibleStylesheetLanguageT
rans-formations (XSLT). A XSLT transformation is typiallywritten by adesigner who is an expert for reating
optimizeduserinterfaesforadevieonsideringthepreferenesoftherespetiveaudiene. Forsimplifyingthe
buildingofXSLTtransformations,theMASPprovidesasetofgenerirulesfortransformingthefrequent
ele-mentsoftheabstratuserinterfaedesriptionsintobasiHTMLorWML.Basedontheserulesmoreomplex
anddevieoptimizedXSLTtransformationsanbedened.
An important feature of the utilised MASP is the support of Composite Capability/Preferene Proles
(CC/PP). Consideringthespei features and properties (e.g. sreensize, supported ssversion, supported
imageformats)theuserinterfaedesigneranoptimizetheinterfaestothepropertiesoftherespetivedevie.
For onverting media data into a devie adequate format, the MASP provides a omponent for saling and
onvertingimagesandvideos.
Theomponentsand interfaesof theMulti-Aess-ServiePlatform are shownin Figure4.4. Userswho
wanttousethePIAservieinteratwiththeMultiAessPoint. TheMAPontainsomponentsfor
intera-tionwiththerespetivedevie (e.g. webserverorvoieserver) andomponentsforrenderingtheappliation
interfaeoptimized forsupported devies. ApprovedrenderingomponentsforHTML, WML and VoieXML
baseduser interfaesexists;omponentsforapplet basedomponentsareunder development. Forthe devie
independentinterfaedesriptiontheMASP usestheAbstratInterfaeDesriptionLanguage(AIDL)that is
andspei representationrules. AdditionallytheMedia Caheomponentprovidesthemediaontentaswell
asonnetivityto externalmediaproviders.
Fig.4.4.ThearhitetureoftheMASP
BesideofthefeaturesprovidedbyMASPthedesignoftheuserinterfaemustreateaneasytousesystem
evenondevieswithatinysreenandwithoutakeyboard. ThatiswhythePIAinterfaeprovidesadditional
navigationelementsonomplexformsandminimizestheuseoftextinputelds. Newresultsmathingadened
requestarepresentedrstasalistofshortbloksontainingonlytitle,abstratandsomemeta-information(as
thisisfamiliar toeveryuserfrom well-knownsearh-engines). This informationisalsowellreadableonPDAs
or even mobile phones. Important artiles an be storedin a repository. This allows theuser to hoose the
artilesonhisPDAhewantstoread laterathisdesktopPC.
Dependingonthepersonaloptionsspeiedbytheuser,oldinformationfoundforaspeiquerymaybe
deleted automatially stepby stepafter agiven time, that is, there isalwaysup to date information that is
presentedtotheuser(weallthissmartmode). Thisisforexampleonvenientforgettingpersonalizedltering
news. Theother optionisto keepthatinformationunlimited (global mode)foraqueryfore.g. basisienti
papers.
For the push-servies (that is, the system is beoming ative and sending the user information without
anexpliit request), theuser speies his working time (e.g. 9am to 5pm). This divides thedayin a pre-,
work,andarereationslot,wherethePIAsystemassumesdierentdemandsofinformation. Foreahslot an
adequatedeliveringtehnology anbehosen(e-mail, sms, mms, faxorVoie). Ifyoudeideto subsribeto
thepersonalizednewsletter,newrelevantinformation foryouwill beolletedand sentbye-mailorfax one
adaywithasimilarlayoutandstruture foronvenientreadingifyouhavenotseenitalreadybyother
pull-orpushservies. Thereforeyouanalsostayinformedwithouthavingtologintothesystemandifyoudonot
wantto getallnewinformationimmediately.
5. Conlusionand future work. Theimplementedsystemhasanaeptableruntimeperformaneand
showsthatitisagoodhoietodevelopapersonalinformationsystemusingagent-tehnologybasedonasolid
agent-framework like JIAC IV. Currently, PIA system supports more than 120 dierent web soures, grabs
dailyaround3.000newsemi-struturedandunstrutureddouments,hasalmost500.000alreadypre-proessed
artiles,andativelyhelpsaboutftysientistsrelatedtoourlaboratoryintheirinformationretrievalativities.
Theirfeedbakand evaluationis avaluable inputfor thefurther improvementof PIA.In thenear future we
planto inreasethenumberofuserstothousands, andthereforeweplanto work onthefurther optimization
of theltering algorithmsto beable to simultaneouslyrespond to multiple ltering requests. Also,wethink
aboutintegratingadditionalserviesfortheuserthatprovideinformationtailoredtohis geographialposition
REFERENCES
[1℄ S.Brinand L.Page,Theanatomyof alarge-salehyper textual(Web)searh engine,inPro.7thInternational World
WideWebConfereneonComputerNetworks,30(1-7),1998,pp.107117.
[2℄ M.BalabanoviandS.Yoav,FAB:Content-Based CollaborativeReommendation,CommuniationoftheACM,40(3),
1997,pp.6672.
[3℄ A.Moukas,Amalthaea: InformationDisoveryand FilteringusingaMultiagentEvolving Eosystem,inPratial
Appli-ationofIntelligentAgents&Multi-AgentTehnology,London1998.
[4℄ B.Zhang,andY.Seo,PersonalizedWeb-DoumentFilteringUsingReinforementLearning,AppliedArtiialIntelligene,
15(7),2001,pp.665685.
[5℄ M.ClaypoolandA.GokhaleandT.MirandaandP.MurnikovandD.NetesandN.Sartin,Combining
Content-Based and Collaborative Filtersin anOnline Newspaper, inPro. ACMSIGIRWorkshop on ReommenderSystems,
Berkeley,CA,August19,1999.
[6℄ J.DelgadoandR.Davidson,Knowledgebasesanduserprolingintravelandhospitalityreommendersystems,inPro.
oftheENTER2002Conferene,Innsbruk,Austria,2002,pp.116.
[7℄ D.KuropkaandT.Serries,PersonalInformationAgent,inPro.InformatikJahrestagung2001,pp.940946.
[8℄ H.Lieberman,Letizia: AnAgentThatAssistsWebBrowsing,inPro.International JointConfereneonArtiial
Intelli-gene,Montreal,August1995.
[9℄ H.Liebermanand N.VanDykeand A.Vivaqua,Let's Browse: ACollaborativeBrowsingAgent,Knowledge-Based
Systems,12(8),1999,pp.427431.
[10℄ B.Sheth,ALearningApproahtoPersonalized InformationFiltering,M.S.Thesis,MITdept,USA,1994.
[11℄ T.Joahims andD.FreitagandT. Mithell,WebWather: ATourGuidefortheWorldWideWeb,inPro.IJCAI
(1),1997,pp.770777.
[12℄ W. Winiwarter,PEAAPersonalEmail Assistant withEvolutionary Adaptation, International JournalofInformation
Tehnology,5(1),1999.
[13℄ C. Thomas and G. Fisher, Usingagents to improve the usability and usefulnessof theworld wide web, inPro. 5th
InternationalConfereneonUserModelling,1996,pp.512.
[14℄ N.JenningsandM.Wooldridge,Agent-orientedsoftwareengineering,HandbookofAgentTehnology(ed.J.Bradshaw),
AAAI/MITPress,2000.
[15℄ R.Sesseler andS.Albayrak,Agent-based MarketplaesforEletroniCommere,inPro.InternationalConfereneon
ArtiialIntelligene,IC-AI2001.
[16℄ P.Resnikand J.NeophytosandM.SuhakandP. Bergstrom andJ.Riedl,GroupLens: Anopen arhiteture
forollaborativelteringof netnews, inPro.ACMConfereneonComputer-SupportedCooperativeWork,1994,pp.
175186.
[17℄ S. Albayrak and D. Milosevi, Self Improving Coordination in Multi Agent Filtering Framework, in Pro.
IEEE/WIC/ACM International Joint Confereneon IntelligentAgenttehnology (IAT 04) and WebIntelligene (WI
04),Beijing,China,September2004.
[18℄ D.Nihols,ImpliitRatingandFiltering,inPro.5thDELOSWorkshoponFilteringandCollaborativeFiltering,Budapest,
Hungary,1997,pp.3136.
[19℄ D.Tauritz,AdaptiveInformationFiltering: oneptsandalgorithms,Ph.D.dissertation,LeidenUniversity,2002.
[20℄ S.FrikeandK.BsufkaandJ.KeiserandT.ShmidtandR.SesselerandS.Albayrak,Agent-basedTelemati
ServiesandTeleomAppliations,CommuniationsoftheACM,2001.
[21℄ FoundationforIntelligentPhysialAgents,www.fipa.org2004.
[22℄ L.JingandH.HuangandH.Shi,ImprovedFeatureSeletionApproahTFIDFinTextMining,inPro.1stInternational
ConfereneonMahineLearningandCybernetis,Beijing,2002.
[23℄ S.Albayrak andD.Milosevi,Situation-awareCoordination inMultiAgentFilteringFramework,inPro.19th
Inter-nationalSymposiumonComputerandInformationSienes(ISCIS04),Antalya,Turkey,2004.
[24℄ M.Zbigniew andD.Fogel,HowtoSolveIt: ModernHeuristis,Springer-VerlagNewYork,In.,NewYork,NY,2000.
[25℄ S. Albayrak and D. Milosevi, Cooperative Community Seletion in Multi Agent Filtering Framework, in Pro.
IEEE/WIC/ACMInternationalJointConfereneonIntelligentAgenttehnology(IAT05)andWebIntelligene(WI05),
CompiegneUniversityofTehnology,Frane,2005,pp.527535.
Editedby: MarinPaprzyki,NiranjanSuri
Reeived: Otober1,2006