• No results found

Agent Technology for Personalized Information Filtering: The PIA System

N/A
N/A
Protected

Academic year: 2020

Share "Agent Technology for Personalized Information Filtering: The PIA System"

Copied!
12
0
0

Loading.... (view fulltext now)

Full text

(1)

AGENT TECHNOLOGY FORPERSONALIZED INFORMATIONFILTERING: THEPIA

SYSTEM

SAHIN ALBAYRAK

, STEFAN WOLLNY

, ANDREAS LOMMATZSCH

, AND DRAGAN MILOSEVIC

Abstrat. Astodaytheamountofaessibleinformationisoverwhelming,theintelligentandpersonalizedlteringofavailable

informationisa mainhallenge. Additionally,there isa growing need for the seamlessmobileand multi-modalsystemusage

throughoutthewholedaytomeettherequirementsofthemodernsoiety(anytime,anywhere, anyhow). Apersonalinformation

agentthatisdeliveringthe rightinformationat the righttimebyaessing,lteringand presentinginformationina

situation-awarematterisneeded. ApplyingAgent-tehnology ispromising,beausethe inherentapabilitiesofagentslikeautonomy,

pro-andreativenessoeranadequateapproah. Wedevelopedanagent-basedpersonalinformationsystemalledPIAforolleting,

ltering,and integratinginformationataommonpoint,oeringaessto theinformationbyWWW, e-mail,SMS,MMS,and

J2MElients. Pushand pulltehniquesareombinedallowingthe usertosearh expliitlyforspeiinformationon the one

handandtobeinformedautomatiallyaboutrelevantinformationdividedinpre-,workandrereationslotsonthe otherhand.

IntheoreofthePIAsystemadvanedlteringtehniquesaredeployedthroughmultiplelteringagentommunitiesfor

ontent-basedandollaborativeltering. Information-extratingagentsareonstantlygatheringnewrelevantinformationfromavarietyof

seletedsoures(internet,les,databases,web-servieset.). Apersonalagentforeahuserismanagingtheindividualinformation

provisioning,tailoredtotheneedsofthisspeiuser,knowingtheprole,theurrentsituationandlearningfromfeedbak.

Keywords. intelligentandpersonalizedltering,ubiquitousaess,reommendationsystems,agentsandomplexsystems,

agent-baseddeployedappliations,evolution,adaptationandlearning.

1. Introdution. Nowadays, desiredinformation oftenremains unfound, beauseit is hidden in ahuge

amount of unneessary and irrelevant data. On the Internet there are well maintained searh engines that

are highly useful if you want to do full-text keyword-searh [1℄, but they are not able to support you in a

personalized way and typially do not oer any push-serviesor in other words no information will be sent

to you when you are not ative. Also, as they normally do not adapt themselves to mobile devies, they

annot be used throughout a whole day beause you are not sitting in front of a standard browser all the

timeand whenyoureturn,these systemswilltreatyouin theverysamewayasifyouhaveneverbeenthere

before (nopersonalization, nolearning). Users whoare notfamiliar withdomain-spei keywordswon'tbe

ableto do suessfulresearh, beause no support is oered. Predened or auto-generatedkeywords for the

searh-domainsareneededtollthat gap. Asinformationdemands areontinuouslyinreasingtodayandthe

gatheringof information is time-onsuming, there is agrowingneed for a personalized support (Figure 1.1).

Labor-savinginformationisneededto inreaseprodutivityatworkandalso thereis aninreasingaspiration

for a personalized oer of general information, spei domain knowledge, entertainment, shopping, tness,

lifestyleandhealthinformation. Existingommerialpersonalizedsystemsarefarawayfromthatfuntionality,

astheyusuallydonotoermuhmorethanallowingtohoosethekindofthelayoutorolletingsomeofthe

oeredinformationhannels(andtheinformationwithin isnotpersonalized).

Tooveromethatsituationyouneedapersonalinformationagent(PIA)whoknowsthewayofyourthinking

andanreallysupportyouthroughoutthewholedaybyaessing,lteringandpresentinginformationtoyouin

asituation-awarematter(Figure1.1). Somesystemsexist(Fab[2℄,Amalthaea[3℄,WAIR[4℄,P-Tango[5℄,T

rip-Mather[6℄,PIAgent[7℄,Letizia[8℄,Let'sBrowse[9℄,Newt[10℄,WebWather[11℄,PEA[12℄,BAZAR[13℄)that

implementadvanedalgorithmitehnology,butdidnotbeomewidelyaepted,maybebeauseofrealworld

requirementslikeavailability,salabilityandadaptationtourrentandfuturestandardsandmobiledevies.

In this paper we present an agent-basedapproah for the eient, seamless and tailoredprovisioning of

relevantinformation on the basis of end-users' daily routine. As we assume the reader will befamiliar with

agent-tehnology(see[14,15℄foragoodintrodution),wewillonentrateonthepratialusageandthe

real-world advantages of agent-tehnology. After briey desribing theexisting systemsfrom whih thesienti

publiationsareavailable,wedesribethedesignandarhitetureandafterwardsdepitthesysteminSetion4.

2. Related work. The followingparagraphsare goingto briey present someofthe alreadymentioned

systems(Fab,Amalthaea,WAIR,P-Tango,PIAgent,Letizia,Let'sBrowse,Newt,WebWatherandPEA), for

whih webelievethatarerelatedtoourwork.

DAI-Labor,TehnialUniversityBerlin,Germany(sahindai-lab.de)

HighPerformaneComputing,ZuseInstituteBerlin,Germany(wollnyzib.de)

(2)

Fig.1.1.Demandforapersonalinformationagent

Fab[2℄isanautomatireommendationservieforinformationretrieval,whih isabletoovertimeadapt

toitsusers,whoonsequentlyreeiveinreasinglypersonalizeddouments. Bymaintainingbotholletionand

seletionagents,Fabsystemisagoodtest-bedfortryingoutdierentlteringstrategies, whih eitherollet

doumentsfrom the Webthat belong to theertain topi,orseletsomeofthe olleteddoumentsthat are

suitableforapartiularuser. Thereationofprolesthroughtheontent-basedanalysis,whihareafterwards

diretly ompared to nd similar users for ollaborative reommendations, represents the unique synergy of

these twofrequentlyombined lteringtehniques. Unfortunately, theusability ofthe whole systemdepends

onthe ability of theontent basedlteringto generate theusableproles, being theserious drawbakof the

Fabsystem.

Sine information disoveryand information lteringare proven to bethe suitable domains for applying

multi-agenttehnology,apersonalizedsystem,namedAmalthaea[3℄hasbeendeveloped. Itproativelytriesto

disoverfromvariousdistributedsourestheinformationthatisrelevanttoauser. Themulti-agenttehnology

is applied by maintainingtwodierent typesof agents, beinginformation lteringand information disovery

ones. The ways how these agents are managing to learn the user's interests and habits, to maintain their

ompetenebyadapting to thehanges inthe user'sinformationneeds,and toexplore thenewdomains that

maybeofinteresttoauser,dependonevolutionprogramming,beingmaybenotsoappliableforthelarge-sale

informationretrievaltasks.

Seekingthestateofauserprole,whihbestrepresentsatualinformationinterestsandthereforemaximize

the expeted value of the umulative user relevane feedbak, is formulated in WAIR multi-agentsystem [4℄

as the reinforement learning problem. The insuieny of expliit user ratings is tried to be overome by

using the lassiationapproah based on the neural network,whih exploits dierent impliit indiators of

interests in order to estimate the real relevane feedbak values. Unfortunately, the amount of the expliit

ratingsneededfortrainingthat lassierstill seemstobetoolarge. Thislearlylimitstheappliabilityofthe

WAIRsystem.

Tointelligentlydeliverapersonalizednewspaper, whih ontainsonlytheartiles ofhighestinterest that

areindividually seletedeverydayforeah andeveryuser,P-Tango[5℄systemproposesaframeworkfor

om-biningdierentlteringstrategies. Althoughtheurrentlyombinedstrategiesareonlytheontent-basedand

ollaborativeones,aproposedframeworkissigniant,byreasonofbeingextendibleto anylteringmethods.

Inspite of this extensibility, we believe that theagent-basedframeworkthat weproposein this paper,oers

betterexibilitywhentheintegrationandafterwardstheusageofnewstrategiesisonerned.

Astheinformationbeametheoneofmostsigniantresouresforbusinessandresearh,bothperiodially

sanning dierent information soures and pushing the found relevant artiles to interested users, have also

motivated thedevelopmentof PIAgent [7℄. Whileausage of various extratoragents eah supporting a

(3)

relevantartilesfromothers. Suhaneuralnetworkapproahhasstrengthinoptimistiallyprovidingexellent

lassiationauray. Unfortunately, its bigweaknessin oftenexpensivetrainingthat pratiallymakesthe

PIAgentto behardlyappliablefornowadaysinformationretrievaltasks.

Theintelligentassistanetotheuser,whoisbrowsingtheInternetfortheinterestinginformation,isprovided

by theautonomous interfae agent, named Letizia [8℄. It traks user behavior and uses various heurististo

antiipate, whih hyperlinksmay leadtothe potentiallyrelevantdouments, andwhih shouldbeignored by

reasonofpointingtojunkornotexistingpage. TheornerstonepropertyoftheLetiziasystemisinaskingthe

userneitherto expliitlystateitsinterestsby deningthe querynortoprovidetheexpliitfeedbakabouta

realrelevaneofreommendations. Althoughthisexpliitommuniationwiththeuseranspeeduplearning,

the priorityin designing theLetizia system has been given to bothletting theuser to browse withoutbeing

interruptedandaskingforhelp onlywhen beingunsurewhihlinktofollow.

TheMITMediaLaboratoryhasalsodevelopedanagent,whosejob isto hoose, fromthelinks reahable

ontheurrentWebpage,thosethatarelikelytobestsatisfytheinterestsofmultipleusers. Theagentisnamed

Let's Browse [9℄, by reason of providing the assistane to the group of humans in browsing, by suggesting

hyperlinkslikelytobeofommoninterests. Althoughthissystemdemonstrateshowdoumentsthataregood

for the group of users and that are in the neighborhood an be found, it generally does not respond to the

hallengeofndingthedatathatisloatedanywhereontheInternet.

Theability to both speialize to userinterests,adapt to preferene hangesand explore the newer

infor-mation domains makes thefoundation of theNewT [10℄, being onepersonalized multi-agentltering system

fornews artiles. As user information interestsare modeled asthe populationof theompeting proles, the

used learning mehanismsare both relevane feedbak, aswell asthe rossoverand mutation geneti

opera-tors. Thesereombination genetioperators aremainly responsiblefor theadaptationand explorationissues

byreatingmorettedfuture populations. Inthemeantime, auserprolealsolearnsthroughtheappliation

of the relevane feedbak tehniques. Takentogether these learning mehanismsmakethe so-alled Baldwin

eet[24℄,sayingthatapopulationevolvestowardsatterformmuhfaster,wheneveritsmembersareallowed

to learnduringtheir lifetime. Althoughthe Baldwinevolutionseemsto bemorepowerfulthan theevolution

approahusedinAmalthaea,ithasthesameweaknesseswhihlimititsappliabilityforlarge-saleinformation

retrieval.

Usersmaynd itdiultbothtoreate theappropriatequeriesand toloatetheinformation ofinterest

in thease ofhavingnospei knowledge aboutthe ontent ofthe underlying doumentolletion. On the

onehand,somesystemsaimtodeployeientlusteringalgorithms,whihwilldynamiallyproduethetable

ofontents,neededtofailitate theusers'browsingativities. Theornerstoneideais tobysomemeanshelp

auserrsttogetanoverviewonerningtheavailable ontent,andthento auratelyexpressitsinformation

needs. On the other hand, WebWather [11℄ ats as the tour guide that provides the assistane, whih is

similar to theguidane ofthe humanin thereal museum. It aompanies usersfrom page to page, suggests

appropriate hyperlinks, and learnsfrom the obtained experiene to improve its advie givingskills. Suh a

systemunfortunatelyanonlyloallyassisttheuser,whihbringsthesamedrawbaksbeingpresentin Letizia

andLet'sBrowsesystems.

PersonalEmail Assistant(PEA)[12℄ltersinomingmailsandranksthemaordingtotheirrelevanein

ordertohelpnowadaysusers,whoeasilyendupwithlargepartoftheirworkingdaybeingspentwithreading

emails. PEA maintains the personal user model that onsists of several proles and uses the evolutionary

algorithmstomovethatmodelonstantlylosertotheurrentinformationneeds. BydoingthatPEA aimsat

assistingusersindealingmoreeetivelywiththeirdailyloadofemailsso thatvaluableworkingtimeissaved

formoreprodutiveandreativetasks. Eventhoughtheevolutionstrategiesseemsto bepowerfulenoughfor

dealingwithemailsinthePEAsystem,theirusagein theInternet-likeenvironmentstillremainstobeagreat

hallenge.

3. Design of PIA: The Personal Information Agent. Tomeet the disussed requirements and to

support theuser in that matter, wedesigneda multi-agentsystemomposed offour lasses of agents: many

informationextratingagents,agentsthatimplementdierentlteringstrategies,agentsforprovidingdierent

kindsofpresentationandonepersonalagentforeahuser. Logially,allthisanbeseenasalassialthreetier

appliation(Figure3.1). Conerningtheinformationextration,generalsearhenginesontheonehandbutalso

(4)

Several agentsrealize dierent lteringstrategies(ontent-basedand ollaborativeltering [16℄, [5℄) that

haveto beombinedin an intelligentmatter. Alsoagentsforprovidinginformation ativelyviaSMS, MMS,

Fax,e-mail(push-servies)areneeded. AMulti-aessservieplatformhasto managethepresentationofthe

resultstailoredtotheuseddevieandtheurrentsituation. Thepersonalagentshould onstantlyimprovethe

knowledgeabouthis userbylearningfromthegivenfeedbak,whihisalsotakenforollaborativeltering,as

information thathasbeenrated ashighlyinterestingmightbe usefulforauserwith asimilarproleaswell.

Asusersusuallyarenotverykeenongivingexpliitfeedbak(ratings),impliitfeedbaklikethefat thatthe

userstoredanartileanalsobetakeninto aount[18℄.

A keywordassistantshould support the user in dening queries even if he is not familiar with a ertain

domain. PIAprovidesthreestrategiesforndingadequatekeywordsandforoptimizingexisting requests:

1. Keywords predened by experts for frequentlyrequested topis (or ategories) anhelp the

unexpe-riened user to nd the relevant keywords. The suggestionsprovided by domain experts are usually agood

startingpointforindividualrequests.

2. An alternativemethod for nding interesting keywordsis the extration of words and phrasesfrom

interestingpapers. This strategyhelps the userto identify the keyonepts from apaper that anbeuseful

for ndingother relevant douments. Inontrast to other approahes(like Googles Find similar douments)

thekeywordextrationgivesthe usertheopportunityto adapt extratedkeywordsaordingto thepersonal

interestsandpreferenes.

3. ForoptimizingexistingqueriesthePIAsystemsuggestskeywordsfromsimilarrequests. Foromputing

thesimilaritiesbetweenuserrequeststhesystemsanalysestheoverlappingofuserproles(basedonstems)and

theorelation between theuserratings. Keywordsthat frequently oure in therequests of similar usersare

suggestedto theuseraspotentiallyrelevantsearhterms.

Thewholesystemisdesignedtobehighlysalable,easytomodify,toadaptandtoimproveandtherefore

an agent-based approah that allows to integrate or to remove agents even at run-time is a smart hoie.

The dierent ltering tehniques are needed to provideaurate results, beause the weakness of individual

tehniquesshouldbeompensatedbythestrengthsofothers. Doumentsshouldbelogiallylusteredbytheir

domainstoallowfastaess,andforeahdoumentseveralmodels[19℄willbebuilt,allinludingstemmingand

stop-wordelimination,butsometailoredforveryeientretrievalatrun-timeandotherstosupportadvaned

lteringalgorithmsforahighauray.

Ifthesystemnotiesthattheontent-basedlteringisnotabletooersuientresults,additional

infor-mationshouldbeoeredbyollaborativeltering,i.e. informationthatwasratedasinterestingbyauserwith

asimilar prolewill bepresented.

Withthepush-servies,theuserandeidetogetnewintegratedrelevantinformationimmediatelyandon

amobiledevie, butforuserswhodonotwanttogetnewinformationimmediately,apersonalizednewsletter

also hasto beoered. This newsletter isolleting newrelevantinformation to beonvenientlydeliveredby

e-mails,allowinguserstostayinformedeveniftheyarenotativelyusingthesystemforsometime.

4. Deploymentand evaluation.

4.1. Overview. We implemented the system using Java and standard open soure database and

web-tehnologyandbasedontheJIACIVagentframework[20℄. JIACIVisFIPA2000ompliant[21℄,thatisitis

onformingtothelateststandards.

It onsists ofaommuniationinfrastrutureaswellasserviesfor administratingand organizingagents

(AgentManagementServie,AMSandDiretoryFailitator,DF).TheJIACIVframeworkprovidesavarietyof

managementandseurityfuntions,managementserviesinludingonguration,faultmanagementandevent

logging, seurityaspets inludingauthorization, authentiation and mehanismsfor measuringand ensuring

trust and therefore has been a good hoie to be used from the outset to the development of a real world

appliation.

WithinJIACIV,agentsarearrangedonplatforms,allowingthearrangementofagentsthatbelongtogether

withtheontrol ofat leastonemanager. A lot ofvisualtoolsareoeredto dealwithadministrationaspets.

Figure3.2showsaplatform,inthisasewithagentsforthebuildingofdierentmodelsspeializedfordierent

retrievalalgorithms.

TheprototypialsystemisurrentlyrunningonSun-Fire-880,Sun-Fire-480RandSunFireV65x,whereas

(5)

Fig.3.1. ThePIASystemseenasathreetierappliation

Fig.3.2. Severalagentsarebuildingdierentmodelsspeilised fordierentretrievalalgorithms

New ontent is stored, validated and onsolidated in a entral relational database(update-driven).

In-formation an be aessed by WWW, e-mail, SMS, MMS, and J2ME Clients, where the system adapts the

presentationaordingly,usingtheCC/PP(PreferenesProle) withatailoredlayoutforamobile phoneand

aPDA (seeSetion 4.6). Thepersonalizednewsletterandthepush-serviesaresentviae-mail,SMSorMMS.

The user anuse self-dened keywords for a request for information orhoose a ategoryand therefore the

systemwillusealistof keywordspredened byexpertsand updatedsmoothlybylearningfrom ollaborative

ltering. A ombination of both is also possible. The keyword assistant is able to extrat the most import

(6)

4.2. Gatheringnew information. Newinformationisonstantlyinsertedinthesystembyinformation

extrationagents,e.g. web-readeragentsoragentsthataresearhingspeieddatabasesordiretories.

Addi-tionalagentsfortheolletionofnewontentaneasilybeintegratedevenatruntime,asallthat isneessary

foranewagentistoregisterhimselfatthesystem,storetheextratedinformationatadeneddatabasetable

and inform the modeling-manager agent about the insertion. As a le reader-agentis onstantly observing

a speial diretory, manual insertionof douments an be done simply by drag-and-dropand an e-mail and

upload-interfaealsoexists.Soure analsobeintegratedbyWebservies. NewReadersanbereatedusing

aeasy-to-handletoolandanothertoolisenablingto onvenientlyobservetheextration-agents,asthis isthe

interfaetotheoutsidethatmightbeomeritialifforexamplethedata-formatofasoureishanged.

4.3. Pre-proessing for eient retrieval. The rst step of pre proessing information for eient

retrievalis theuseof distint tablesin the globaldatabasefordierentdomains like e.g. news, agent-related

papers,et. Dependingonthelteringrequest,tableswithnohaneofbeingrelevantanthereforebeomitted.

The next step is the building of several models for eah doument. Stemmingand stop-wordelimination is

implementedineverymodelbut dierentmodelsarebuiltbyomputingatermimportane eitherbasedonly

onloalfrequenies,orbasedontermfrequenyinversedoumentfrequeny(TFIDF)approah. Furthermore

numberofwordsthatshouldbeinludedinmodelsisdierentwhihmakesmodelseithermoreaurateormore

eient. Createdmodelsareindexedeitherondoumentorwordlevel,whihfailitatetheireientretrieval.

Themanageragentisassigningtheappropriatemodelingagentstostartbuildingtheirmodelsbutmightdeide

(orthehumansystemadministratorantellhim)atruntimetodelaylatesttime-onsumingmodelingativity

for a while if systemload is ritial at a moment. This feature is important for areal-world appliation, as

overloadinghasbeenamainreasonfortheun-usabilityof advaned aademisystems.

4.4. Filteringtehnology. Asthequalityofresultstoapartiularlteringrequestmightheavilydepend

ontheinformationdomain (news,sientipapers,onferenealls),dierentlteringommunitiesare

imple-mented. Foreah domain,there isat leastoneommunitywhih ontainsagentsbeingtailoredto dospei

lteringandmanagingtasksin aneientway. Insteadofhavingonlylteringagents(they willbedesribed

in Setion4.5),eahandeveryommunityhasalsooneso-alledmanageragentthatismainly responsiblefor

doingoordinationof lteringtasks,aswellasooperationwithothermanagers.

Theoordinationis basedonquality,CPU, DBandmemorytness values,whih arethemeasures being

assoiated to eah ltering agent [23℄. These measures respetively illustrate lteringagent suessfulnessin

thepast,itseieny inusingavailableCPUandDBresoures,andtheamountofmemorybeingrequiredfor

ltering. AhigherCPU,DBormemorytnessvaluemeansthatlteringagentneedsapartiularresoureina

smallerextentforperformingalteringtask. Thisfurther meansthat aninsuienyofapartiular resoure

hasasmallerinueneonlteringagentswith ahigherpartiulartnessvalue.

The introdued dierent tness values together with the knowledge about the urrent system runtime

performaneanmakeoordinationbesituationaware(seealso[23℄)inthewaythatwhenapartiularresoure

ishighlyloadedapriorityinoordinationshouldbegiventolteringagentsforwhihapartiularresourehas

aminor importane. Thissituation awareoordinationprovidesawayto balaneresponse time andltering

auray,whihisneededinoveromingtheproblemofndingaperfetlteringresultafterfewhoursoreven

fewdaysofanexpensiveltering.

Instead of assigning lteringtask to the agentwith the best ombination of tnessvalues in the urrent

runtimesituation,managerisgoingto employaproportionalseletionpriniple[24℄wheretheprobabilityfor

theagenttobehosentodoatuallteringisproportionaltothementionedombinationofitstnessvalues.By

notalwaysrelyingonlyonthemostpromisingagents,butalsosometimesoeringajobtootheragents,manager

gives ahane to eah and everyagentto improveits tness values. While theadaptation of quality tness

value anbeaomplished after the user feedbak beame available, the other tness valuesan be hanged

immediatelyafter thelteringwasnishedthroughtheresponse timeanalyses. Theadaptationsheme hasa

dereasinglearningratethatpreventsalreadylearnttnessvaluesofbeingdestroyed,whihfurthermeansthat

provenagentspaysmallerpenaltiesforbadjobsthanthenovieones[17℄.

The underlying oordinationativities, essentially responsible for the mentioned optimal exploitation of

availablesystemresoures,arerepresentedonFigure4.1,givingthesimplestpossibleseletionsenario. Under

the assumptionthat everythinggoesperfetly, the senariostarts with ajob reationand ends with aresult

(7)

agent (F) that enapsulates the partiular searhing algorithm (deployed ltering strategies are desribed in

Setion 4.5), is produed results, the manager M an adapt tness valuesbased on the measurement of the

responsetime. ThefoundlteringresultsarenallyreturnedbaktotheuseragentU,andthissimplesenario

ends.

Fig.4.1. Systemarhitetureillustratingagentommuniation forresoure-awareoordination

In the asewhere the reeived ltering task annot be suessfully loally aomplished usually beause

of belonging to unsupported information domain, manager agent has to ooperate with other ommunities.

Whileoordinationtakesplaeinsideeahandeverylteringommunitybetweenmanagerandlteringagents,

ooperationoursbetweenommunitiesamongmanageragents(seealsoFigure4.2). Theooperationisbased

oneitherndingaommunitywhihsupportsgivendomainorinsplittingreeivedtaskonsub-taskswherefor

eahsub-taskaommunitywithgood supportexists.

The information is usually sattered around many dierent, more or less dynami, distributed soures.

Twoornerstone hallengesthereforebeomebothnding whih souresshould beonsultedforresolvingthe

partiularrequest, aswellasputting the foundresultstogether. Whilethe hallenge of searhingfor soures

is known as the database seletion problem, the omposing of a nal result set is often simply referred as

theinformation fusion. Onelightooperationapproah,already published in [25℄, andwhih isbasedon the

appliationoftheintelligentooperativeagents,isgoingtobebrieyillustratedintherestofthissub-setion.

Thefundamental ooperationideaisbased ontheinstallation ofat least onelteringommunityaround

eahdatabase,aswellasonsettingupthesophistiatedmehanisms,whihenablethattheseommunitiesan

eientlyhelpeahotherwhileproessingtheinomingrequests. Althoughthelteringrequestanbesentto

any lteringommunity, themost suitableommunities will beautonomously found, and therequest will be

then dispathed to them. The foundresultswill be nally olleted,and only thebest ones will be returned

tothesenderofthelteringrequest. Themostappealingpropertybehindtheseooperativeproessingisthat

everythingisdonetransparentlyfortheuser, beingnotanymoreforedto manuallythink where therequest

shouldbesent,andwhih obtainedlteringresultsarereallythebestones.

Example(Coordination&Cooperation)Figure4.2presentsahighleveloverviewofthelteringframework

being omposed of three dierent ltering ommunities (FC), where eah ommunity hasone lter manager

agent(M)anddierentnumberofspeializedlteringagents(F).Therearetwodierentdatabases(DB) with

informationfromdierentdomains,andtheyareaessedatleastbyoneommunity. Onthegureooperation

isillustratedasairlewitharrowswhihonnetmanageragents.

(8)

Fig.4.2.Filteringframeworkwiththreedierentommunities

strategy, their short desriptions will be given in the following paragraphs in order to make this paper

self-ontainedandtoleartherolesoftheagentsonFigure3.2.

Byusing thetermfrequenyinversedoumentfrequenysheme,theimportanevaluesofdierentwords

anbeomputed,andeahandeverydoumentanbemodelledbyaorrespondingweightedvetor. Whilethe

so-alledLargeFiltering Strategy willalwaysbuild amodel withallwordsfrom adoument,OptimalTwenty,

Optimal Ten,andOptimal FiveFiltering Strategieswill takeintoonsiderationonlytwenty,ten,andvemost

important words, respetively. Themodels, being reatedby these optimal strategies, are thus smaller, and

onsequentlyanbefasterbothloadedintomemoryandomparedwithalteringrequest. Astheyareomitting

manywords,theymightbeatthesametimepotentiallylessaurate,andtheoordinationenginehasahane

todeidewhih onein thegivensituation anbethebestsolution.

Sine the examination of everysingle doument for eah request beomes infeasible even for aolletion

with the modest size, two dierent indexing ltering strategies have been also implemented. The rst one,

named Inverted List Filtering Strategy, reates for every word the list of doumentshaving that word. The

inbuiltsimpliation,tendingtodramatiallyredueasizeofinvertedlists,ismadebynotstoringthepositions

of words in the orresponding douments. While a strategy due to suh a design deision beomes more

eient, it loses its ability to support requests with a phrase. The seond Position Filtering Strategy will

not utilise suh a simpliation regarding not storing the positions, and thus will be ableto aurately nd

doumentswithrequestedphrases. Asthisseondstrategyisnaturallymoreexpensive,thetrade-o,between

providing the aurate results and responding quikly, beomes evident and unavoidable for requests with

phrases.

Thepropertyof fuzzylustering [24℄, to assign doumentsto multiple lusters together withspeifyinga

degreetowhihapartiularartilebelongstoagivenluster,hasbeenusedastheinspirationforarealisation

of a dediated Fuzzy Filtering Strategy. Whileits strength is in keeping short luster summaries in the high

speedmemory,itsgreatestweaknessliesinausedsimpliationtolusterdoumentsinadvanexedlusters.

Thefew dierent versions of this fuzzyltering strategyare nallyimplemented by limitingthe amount ofa

(9)

Everymentionedlteringstrategyisalso exploitedforreatingitsappropriatelone, whihwill takeinto

aount onlywordsfrom amanuallyreatedditionary. Bylimiting thevoabulary to fewthousandsinstead

to morethan half a million, underlying models are muh smaller, and thus theunderlying strategiesbeome

moreeient. Unfortunately,thepaidprieliesinthelostofasupportforallrequestswithwordsthatarenot

pre-seleted,resulting in thepotentiallylowerquality of foundltering results. These lone strategiesnally

provide even morefruitful playing ground for both ooperation and oordination mehanisms, whih should

provetheirapabilitieswhileresolvingthementionedtrade-oproblems.

4.6. Presentation. As oneof the main design priniples has been to support the user throughout the

wholeday,thePIAsystemprovidesseveraldierentaessmethodsandadaptsitsinterfaestotheuseddevie

(Figure 4.3). To fulll these requirements an agent platform (Multi Aess Servie Platform)wasdeveloped

thatoptimizesthegraphialuserinterfaefortheaessbyDesktopPCs,PDAs andsmartphones.

If the user wants to use the PIA system, the request is reeived by the Multi Aess Servie Platform

(MASP).TheMASPdelegatestherequesttoanagent,providingthelogiforthisservie. InthePIAsystem

therequestsareforwardedeitherto loginagentorthepersonalagent. Thehosenagentperformstheservie

spei ations and sendsthe MASP anabstrat desriptionof the formularthat should bepresentedto the

user. Forthis purpose the XML basedAbstrat Interation Desription Language(AIDL) has been dened.

BasedontheabstratdesriptionandthefeaturesoftheuseddevietheMASPgeneratesanoptimizedinterfae

presentedto the user. The onversionis implementedasaXSLTtransformation in whih the optimalXSLT

stylesheetisseletedbasedontheCC/PPinformationabouttheuser'sdevie.

TheMultiAessServiePlatformprovidesageneriinfrastrutureforprovidingdevieoptimizedinterfaes

for abig number of devies. Thebasi idea of MASP is to separate the appliation logi from the onrete

interfaedesign. Sotheappliationdeveloperdoesnothavetoopewiththespei harateristioftheeah

relevantdevieand anonentrateontheappliation spei dataowandinterationlogi.

Fig.4.3. InformationaessedbybrowserortailoredforpresentationonaPDAoramobilephone

For dening the interfae of an appliation the XML based Abstrat Interation Desription Language

(AIDL)hasbeendened. Thedenition of auserinteration (typiallyoneweb page)isstrutures asatree

of predened user interfae elements (e.g. label, input eld). An exemplary page desription is shown in

Program1.

TheabstratinterfaedesriptionanbeeasilytransformedintoHTML,PDAoptimizedHTMLorWML.

Ifthe userwantsto haveavoieinterfae, astylesheet for onvertingthe abstrat userinterfaedesription

(10)

Program1TheabstratinterationdesriptionofthePIAloginpage

<?xml version="1.0" enoding="UTF-8"?>

<senario name="loginPage">

<input>

<UIElement>

<list name="rootNode">

<UIElement>

<pageSetting name="user_language">

<value>de</value>

</pageSetting>

</UIElement>

<UIElement>

<label name="login__piaLoginQXQ 25">

<value>PIA-Login</value>

</label>

</UIElement>

<UIElement>

<list name="login__data">

<UIElement>

<label name="login__userName">

<value>Benutzername:</v alu e>

</label>

</UIElement>

<UIElement>

<fieldValue name="login__userName_d efau lt">

<value>andreas</value>

</fieldValue>

</UIElement>

</list>

</UIElement>

...

<servieLink name="reateAountServi eLi nk">

<provider address="tpip://127.0.0.1 :73 25" name="PIAgent"/>

<servie name="MAPPresent"/>

<param name="senario">reateA ount </p aram >

</servieLink>

</list>

</UIElement>

</input>

</senario>

ThetransformationoftheabstratinterfaedesriptionisdoneusingExtensibleStylesheetLanguageT

rans-formations (XSLT). A XSLT transformation is typiallywritten by adesigner who is an expert for reating

optimizeduserinterfaesforadevieonsideringthepreferenesoftherespetiveaudiene. Forsimplifyingthe

buildingofXSLTtransformations,theMASPprovidesasetofgenerirulesfortransformingthefrequent

ele-mentsoftheabstratuserinterfaedesriptionsintobasiHTMLorWML.Basedontheserulesmoreomplex

anddevieoptimizedXSLTtransformationsanbedened.

An important feature of the utilised MASP is the support of Composite Capability/Preferene Proles

(CC/PP). Consideringthespei features and properties (e.g. sreensize, supported ssversion, supported

imageformats)theuserinterfaedesigneranoptimizetheinterfaestothepropertiesoftherespetivedevie.

For onverting media data into a devie adequate format, the MASP provides a omponent for saling and

onvertingimagesandvideos.

Theomponentsand interfaesof theMulti-Aess-ServiePlatform are shownin Figure4.4. Userswho

wanttousethePIAservieinteratwiththeMultiAessPoint. TheMAPontainsomponentsfor

intera-tionwiththerespetivedevie (e.g. webserverorvoieserver) andomponentsforrenderingtheappliation

interfaeoptimized forsupported devies. ApprovedrenderingomponentsforHTML, WML and VoieXML

baseduser interfaesexists;omponentsforapplet basedomponentsareunder development. Forthe devie

independentinterfaedesriptiontheMASP usestheAbstratInterfaeDesriptionLanguage(AIDL)that is

(11)

andspei representationrules. AdditionallytheMedia Caheomponentprovidesthemediaontentaswell

asonnetivityto externalmediaproviders.

Fig.4.4.ThearhitetureoftheMASP

BesideofthefeaturesprovidedbyMASPthedesignoftheuserinterfaemustreateaneasytousesystem

evenondevieswithatinysreenandwithoutakeyboard. ThatiswhythePIAinterfaeprovidesadditional

navigationelementsonomplexformsandminimizestheuseoftextinputelds. Newresultsmathingadened

requestarepresentedrstasalistofshortbloksontainingonlytitle,abstratandsomemeta-information(as

thisisfamiliar toeveryuserfrom well-knownsearh-engines). This informationisalsowellreadableonPDAs

or even mobile phones. Important artiles an be storedin a repository. This allows theuser to hoose the

artilesonhisPDAhewantstoread laterathisdesktopPC.

Dependingonthepersonaloptionsspeiedbytheuser,oldinformationfoundforaspeiquerymaybe

deleted automatially stepby stepafter agiven time, that is, there isalwaysup to date information that is

presentedtotheuser(weallthissmartmode). Thisisforexampleonvenientforgettingpersonalizedltering

news. Theother optionisto keepthatinformationunlimited (global mode)foraqueryfore.g. basisienti

papers.

For the push-servies (that is, the system is beoming ative and sending the user information without

anexpliit request), theuser speies his working time (e.g. 9am to 5pm). This divides thedayin a pre-,

work,andarereationslot,wherethePIAsystemassumesdierentdemandsofinformation. Foreahslot an

adequatedeliveringtehnology anbehosen(e-mail, sms, mms, faxorVoie). Ifyoudeideto subsribeto

thepersonalizednewsletter,newrelevantinformation foryouwill beolletedand sentbye-mailorfax one

adaywithasimilarlayoutandstruture foronvenientreadingifyouhavenotseenitalreadybyother

pull-orpushservies. Thereforeyouanalsostayinformedwithouthavingtologintothesystemandifyoudonot

wantto getallnewinformationimmediately.

5. Conlusionand future work. Theimplementedsystemhasanaeptableruntimeperformaneand

showsthatitisagoodhoietodevelopapersonalinformationsystemusingagent-tehnologybasedonasolid

agent-framework like JIAC IV. Currently, PIA system supports more than 120 dierent web soures, grabs

dailyaround3.000newsemi-struturedandunstrutureddouments,hasalmost500.000alreadypre-proessed

artiles,andativelyhelpsaboutftysientistsrelatedtoourlaboratoryintheirinformationretrievalativities.

Theirfeedbakand evaluationis avaluable inputfor thefurther improvementof PIA.In thenear future we

planto inreasethenumberofuserstothousands, andthereforeweplanto work onthefurther optimization

of theltering algorithmsto beable to simultaneouslyrespond to multiple ltering requests. Also,wethink

aboutintegratingadditionalserviesfortheuserthatprovideinformationtailoredtohis geographialposition

(12)

REFERENCES

[1℄ S.Brinand L.Page,Theanatomyof alarge-salehyper textual(Web)searh engine,inPro.7thInternational World

WideWebConfereneonComputerNetworks,30(1-7),1998,pp.107117.

[2℄ M.BalabanoviandS.Yoav,FAB:Content-Based CollaborativeReommendation,CommuniationoftheACM,40(3),

1997,pp.6672.

[3℄ A.Moukas,Amalthaea: InformationDisoveryand FilteringusingaMultiagentEvolving Eosystem,inPratial

Appli-ationofIntelligentAgents&Multi-AgentTehnology,London1998.

[4℄ B.Zhang,andY.Seo,PersonalizedWeb-DoumentFilteringUsingReinforementLearning,AppliedArtiialIntelligene,

15(7),2001,pp.665685.

[5℄ M.ClaypoolandA.GokhaleandT.MirandaandP.MurnikovandD.NetesandN.Sartin,Combining

Content-Based and Collaborative Filtersin anOnline Newspaper, inPro. ACMSIGIRWorkshop on ReommenderSystems,

Berkeley,CA,August19,1999.

[6℄ J.DelgadoandR.Davidson,Knowledgebasesanduserprolingintravelandhospitalityreommendersystems,inPro.

oftheENTER2002Conferene,Innsbruk,Austria,2002,pp.116.

[7℄ D.KuropkaandT.Serries,PersonalInformationAgent,inPro.InformatikJahrestagung2001,pp.940946.

[8℄ H.Lieberman,Letizia: AnAgentThatAssistsWebBrowsing,inPro.International JointConfereneonArtiial

Intelli-gene,Montreal,August1995.

[9℄ H.Liebermanand N.VanDykeand A.Vivaqua,Let's Browse: ACollaborativeBrowsingAgent,Knowledge-Based

Systems,12(8),1999,pp.427431.

[10℄ B.Sheth,ALearningApproahtoPersonalized InformationFiltering,M.S.Thesis,MITdept,USA,1994.

[11℄ T.Joahims andD.FreitagandT. Mithell,WebWather: ATourGuidefortheWorldWideWeb,inPro.IJCAI

(1),1997,pp.770777.

[12℄ W. Winiwarter,PEAAPersonalEmail Assistant withEvolutionary Adaptation, International JournalofInformation

Tehnology,5(1),1999.

[13℄ C. Thomas and G. Fisher, Usingagents to improve the usability and usefulnessof theworld wide web, inPro. 5th

InternationalConfereneonUserModelling,1996,pp.512.

[14℄ N.JenningsandM.Wooldridge,Agent-orientedsoftwareengineering,HandbookofAgentTehnology(ed.J.Bradshaw),

AAAI/MITPress,2000.

[15℄ R.Sesseler andS.Albayrak,Agent-based MarketplaesforEletroniCommere,inPro.InternationalConfereneon

ArtiialIntelligene,IC-AI2001.

[16℄ P.Resnikand J.NeophytosandM.SuhakandP. Bergstrom andJ.Riedl,GroupLens: Anopen arhiteture

forollaborativelteringof netnews, inPro.ACMConfereneonComputer-SupportedCooperativeWork,1994,pp.

175186.

[17℄ S. Albayrak and D. Milosevi, Self Improving Coordination in Multi Agent Filtering Framework, in Pro.

IEEE/WIC/ACM International Joint Confereneon IntelligentAgenttehnology (IAT 04) and WebIntelligene (WI

04),Beijing,China,September2004.

[18℄ D.Nihols,ImpliitRatingandFiltering,inPro.5thDELOSWorkshoponFilteringandCollaborativeFiltering,Budapest,

Hungary,1997,pp.3136.

[19℄ D.Tauritz,AdaptiveInformationFiltering: oneptsandalgorithms,Ph.D.dissertation,LeidenUniversity,2002.

[20℄ S.FrikeandK.BsufkaandJ.KeiserandT.ShmidtandR.SesselerandS.Albayrak,Agent-basedTelemati

ServiesandTeleomAppliations,CommuniationsoftheACM,2001.

[21℄ FoundationforIntelligentPhysialAgents,www.fipa.org2004.

[22℄ L.JingandH.HuangandH.Shi,ImprovedFeatureSeletionApproahTFIDFinTextMining,inPro.1stInternational

ConfereneonMahineLearningandCybernetis,Beijing,2002.

[23℄ S.Albayrak andD.Milosevi,Situation-awareCoordination inMultiAgentFilteringFramework,inPro.19th

Inter-nationalSymposiumonComputerandInformationSienes(ISCIS04),Antalya,Turkey,2004.

[24℄ M.Zbigniew andD.Fogel,HowtoSolveIt: ModernHeuristis,Springer-VerlagNewYork,In.,NewYork,NY,2000.

[25℄ S. Albayrak and D. Milosevi, Cooperative Community Seletion in Multi Agent Filtering Framework, in Pro.

IEEE/WIC/ACMInternationalJointConfereneonIntelligentAgenttehnology(IAT05)andWebIntelligene(WI05),

CompiegneUniversityofTehnology,Frane,2005,pp.527535.

Editedby: MarinPaprzyki,NiranjanSuri

Reeived: Otober1,2006

References

Related documents

RESULTS OF PRELIMS Heat 3 ASHA ROY... RESULTS OF PRELIMS

While the projects examined here have been partially just, and rather sensitive to existing patterns of social differentiation, which speaks about the sensitivity of project

By utilizing model-driven analysis, mobile software developers can quantitatively evaluate key performance and power consumption characteristics earlier in the software

Comparable with the recent completion of petroleum product price reforms parallel with the low international oil price, gas price reform (in terms of introducing a

The Black community is internally diverse, and Black students are navigating multiple intersecting identities, including [dis]abilities (Steward, 2008, 2009). The reality is that

Based on the above description, there are research questions such as: 1) Whether consumers read the labels on food products and when consumers read the

Consequently, the objective of this study was to test the agronomic performance of a group of commonly grown cereal crops including maize, sorghum, pearl and finger

Marcel Williams, Development Services Planner, stated, the petitioner is requesting to rezone a 3-acre portion of a 4.22-acre parcel on the west side of Beverly Street near its