• No results found

CiteSeerX — Efficient Concept Generation

N/A
N/A
Protected

Academic year: 2022

Share "CiteSeerX — Efficient Concept Generation"

Copied!
10
0
0

Loading.... (view fulltext now)

Full text

(1)

Anne Berry



Jean-PaulBordat y

AlainSigayret



Abstra t

Generating on epts de nedby a binary relationbetween a setP of properties

and a set O of obje ts is one of the important urrent problems en ountered in

Data Mining.

We presenta new algorithm whi h generates ea h on ept exa tly on e, using

graph-theoreti results. Thispro esshasatime omplexityofO(jPjm)permaximal

hain of the on ept latti e, where m denotes the number of non- rosses of the

relation,andusesadatastru turewhi hisofsmallpolynomialsize. Thisimproves

the urrent best-time algorithms for this problem, whi h require either O(2 jPj

)

spa e and O(jPj 2

) time per on ept, or, alternately, polynomial spa e but O(jPj 3

)

timeper on ept.

Our algorithm an be used, with no extra ost, to ompute the edges of the

latti e,and an, just as eÆ iently,generate onlyfrequent sets.

1 Introdu tion

Inthe ontext ofData Base Management andData Miningproblems, databases are oftenrepre-

sentedbya binaryrelationbetweena setP ofpropertiesand aset Oof obje ts. Oneof thetools

foranalyzing the data ontainedin the base is to ompute all possible ombinations of elements

of the relationinto maximal re tangles. These re tangles, alled on epts, are organized into a

hierar hi al stru ture alled a on ept latti e. This theory,thoughstudied bymathemati iansas

farba kastheNineteenth Century(see [1 ℄,[10 ℄), was madepopularand developed byWille and

his team ([10 ℄), and remains one of theimportant urrent trends ofresear h inData Mining and

Arti ialIntelligen e: on ept latti es areused in elds asvaried asthe dis overy of asso iation

rulesinData Bases([24℄), thegenerationoffrequent itemsets([23 ℄), ma hinelearning([15 ℄,[17 ℄)

andthe reorganizationof obje thierar hies([13℄, [5 ℄).

The main drawba k to this approa h is that a on ept latti e is, in general, of exponential

size. Asa onsequen e,itisofprimaryimportan etobeabletogenerate on eptseÆ iently,even

whenonlypart ofthe latti e isexplored.

Con ept generation has given rise to a steady ow of resear h for thepast thirty years. One

of the rst algorithms to be publishedin this eld is dueto Chein ([8 ℄); this algorithmgenerates

su essivelayersof thelatti e,byde ningpossible andidatesby ombinationsof on epts ofthe

previouslayer; ithas anexponentialworst-time omplexitypergenerated on ept (see [11 ℄).

Bordat'salgorithm ([7 ℄)wasan improvement,asit runsinO(n 3

)per on ept, wheren=jPj,

inaBreadth-Firstfashion;oneoftheinterestingfeaturesofthisalgorithmisthatitalso omputes

alltheedges ofthelatti e. Thistime omplexitywasre entlyimprovedbyNourineandRaynaud

([18℄)to O(n 2

) per on ept.

Allthese algorithmsrequireexponentialspa eand storethe omputed on epts.



LIMOS,bat. ISIMA,63173Aubiere edex,Fran e. Mail: berryisima.fr,sigayretisima.fr

y

LIRMM,161RueAda,34392 Montpellier,Fran e. Mail: bordatlirmm.fr.

(2)

lembe omes easier,though therunning timeper on ept is higher: thebest su h algorithm, due

toGanter([9 ℄), runsinO(n 3

)timeper on ept, usingtheinteresting notionof le ti order, whi h

avoids s anningallthepossiblesubsetsofproperties,without,however,avoidingre- omputingthe

same on ept O(n) times.

Inthispaper,we addresstheissueofeÆ iently omputingallthe on epts, en ounteringea h

exa tlyon e, usingonlypolynomialspa e.

Our ontribution isan algorithm whi h runs inO(nm) timepermaximal hain,where m de-

notesthenumberofnon- rossesoftherelation,andrequiresonlysmallpolynomialspa e(O(nm)).

This greatly improves [18 ℄, as we have a omparable omplexity without requiring exponential

spa e; it also signi antly improves Ganter's O(n 3

) time per on ept. Furthermore, the number

of on epts tends to be ome exponentialwhen therelationis dense; inthis ase, m is of order n,

andour omplexitybe omes O(n 2

) permaximal hain.

The algorithm we introdu e here presents similarities with [7 ℄. It uses ea h on ept AB

to generate the on epts whi h are just above AB in the latti e ( alled the over of AB),

working atea h step on thesub-relationR ((P A);B).

One of the new features whi h is addedis that ea h on ept inherits informationon the pre-

viously pro essed on epts, in order to avoid generating the same on ept more than on e, a

breakthroughin on ept generation.

Another di eren eis that we usea re ursive Depth-First Sear h of the latti e, whi h enables

to storeonlya polynomialnumberof bytes, asa on ept latti e, though itmaybeexponentially

large, is of small height (O(n)). Moreover, the fashion in whi h we ompute the over of ea h

on ept isdi erent.

Our approa h is based on ourexperien e on graphs. In [7℄, Bordat useda bipartite graphto

handlethe relation. In [4 ℄, Berry and Sigayret proposed a di erent en oding into a o-bipartite

graphforwhi hthey establisheda one-to-one orresponden ebetween the on eptsof thelatti e

andthe minimalseparators ofthe graph.

This is algorithmi ally interesting be ause, in the past de ade, mu h resear h has beendone

onusingminimalseparatorstoeÆ ientlysolvevariousgraphproblemssu has hordalembedding

([19℄,[2 ℄),andinparti ularseveralpapersdealwiththeeÆ ientenumerationofminimalseparators

([14℄, [21 ℄,[20 ℄,[3℄).

[4℄ pointed out that, using the underlying o-bipartite graph and these re ent results on the

emergingtheoryofminimalseparation, the urrent best algorithmsforgenerating on epts ould

easily be mat hed both in terms of time and spa e. In this paper, we use graph properties to

improve these. Thoughwewillnotexpli itlyuse resultsonminimalseparation, they underlyour

approa h (the readerisreferred to [4℄ forafullexplanation onthisrelationship).

In order to ompute the atoms of a on ept latti e, we use the graph notion of domination

betweenverti es: avertexxissaidtodominateanothervertexyiftheneighborhoodofxin ludes

the neighborhood of y. We use a propertyfrom [4℄: if all the verti es of a set A share thesame

neighborhood,andare nottogether dominatedbyanothervertexof theen odinggraphG

R ,then

Ade nes an atom AB ofthe on ept latti e.

Ourmain omplexityimprovement followsfromtheremarkthatmu hof theinformationne -

essarytodeterminethedominationrelationshipsbetweenverti es anbeinheritedasonemovesup

intothelatti e,alongapathfromthebottom tothetop( alleda maximal hain). Thisenables

usto avoid re omputing all the domination informationea h time a new on ept is en ountered

(3)

pro ess. Wewillalsopresentourdatastru ture,thedominationtable,andillustratethepro esses

involved witha smallexample.

2 Preliminaries

2.1 Latti es

Given a nitesetP of properties and a niteset O ofobje ts,abinary relationR isde ned asa

subsetoftheCartesian produ tPO. Given anelement ofPO,wewillrefer to itasa ross

of the relation if it is in R , as a non- ross if it is not. We will use n to denote jPj, and m to

denotethe numberof non- rosses.

A on ept, also alled a maximalre tangle or losed set of R , is a sub-produ t AB R

su h that 8x 2O B;9y 2Aj(y;x) 62 R , and 8x2P A;9y 2 Bj(x;y) 62R . A is alled the

intentof the on ept, B is alledtheextent.

The on epts, orderedbyin lusionontheintents, de nealatti e, alleda on eptlatti eor

Galoislatti e. A latti e is represented by its Hasse diagram: transitivityand re exivity ar s are

omitted. Con eptsareoftenreferred toaselementsof thislatti e. Su ha latti e,hasasmallest

element, alled the bottom element, and a greatest element, alled the top element. A path

from bottom to top is alled a maximal hain of the latti e. Ifthe relationR hasa fullline of

rosses,it iseasy to omputethe on epts of R from the on eptsof the sub-relationfrom whi h

thisline hasbeen removed ([4 ℄). In therest of this paper, we willdis ussonly relations with no

su h linesof rosses.

We will say that a on ept A 0

B 0

is a su essor of on ept AB if A  A 0

and there is

no intermediate on ept A 00

B 00

su h that A  A 0 0

 A 0

. The set of su essors of an element

is alled the over of this element. The su essors of the bottom element are alled atoms. A

on ept A 0

B 0

is andes endant of on ept AB ifAA 0

. Thenotions of prede essor and

an estor arede neddually.

Example2.1 Binaryrelation R ; the asso iated on ept latti eL(R ) isshown in Figure1.

Set of properties:

P =fa;b; ;d;e;f;g;hg;

Set of obje ts:

O=f1;2;3;4;5;6g.

a b d e f g h

1    

2     

3     

4  

5  

6  

2.2 Graphs

Given a relation R , [4 ℄ de ned an underlying en oding graph G

R

as the graph with vertex set

P[O;P and O are liques,and fora vertex x of P and a vertex y of O, there is an xy edgein

G

R

i (x;y) is not in R . For X P;Y O, we willdenote byG

R

(X[Y) thesubgraph of G

R

indu edbyvertex set X[Y.

We willuseonlyexternalneighborhoods,whi hwe willdenotebyN +

: ifx2P;N +

(x)=fy2

Oj(x;y)2Gg,and ifx2O;N +

(x)=fy2Pj(y;x)2Gg,whereGisthe urrentsubgraphofG

R .

(4)

In our example, N (a) = f1;4;5g, N (b) = f4;5;6g, N ( ) = f3;4;6g, N (d) = f2;3;6g,

N +

(e)=f2;3;5;6g, N +

(f)=f1;2;4;5;6g, N +

(g)=f1;4;5;6g, N +

(h)=N +

(a).

Inorder to omputethe over ofa on ept, we needseveral graphnotions.

De nition 2.2 A vertex x is said to dominate a vertex y i N +

(y) N +

(x); we will say that

this domination isstri t when x dominates y and N +

(y)6=N +

(x).

Anotherimportantnotionforthispaperisthat of maximal liquemodules.

De nition 2.3 A set X of properties is said to form a maximal lique module of G

R

if every

vertex x of X dominates every other vertex of X; we will all X a maxmod.

ThemaxmodsofG

R

de nea partitionofP;essentially,amaxmodbehavesasasinglevertex,

asall theverti esofamaxmodX sharethesameexternalneighborhood,denotedbyN +

(X). We

thusextendDe nition2.2tomaxmods: wewillsaythatamaxmodXdominatesanothermaxmod

Y ifN +

(Y)N +

(X).

Theorem 2.4 ([4 ℄)A on ept (A+X)B 0

overs a on ept AB i X, in G

R

((P A)[B),

isa non-dominating maxmod.

In our example, in G

R

, the partition of P into maxmods is: fa;hg, fbg, f g , fdg, feg, ffg,

fgg, fhg. feg dominates fdg , ffg dominates fgg, fgg dominates fa;hg and fbg; f g is neither

dominated nor dominating. The non-dominating maxmods of G

R

are: fa;hg, fbg, f g and fdg.

3 Algorithmi Pro ess

Theorem2.4 an be used to re ursively ompute the over of an element, starting with thebot-

tom element. This pro ess will generate ea h on ept exa tly as many times as the number of

prede essorsithas.

Ourstatedgoalistogenerateea h on eptexa tlyon e. Be ausetheorderingonthe on epts

isde nedbyin lusion,any on eptA 0

B 0

whi hisades endantof AB veri esAA 0

. Sin e

ouralgorithmworksupina depth- rstfashion,when amaxmod X isused to generatea on ept

(A+X)B,then all on epts ontaining X intheir intent will be de ned bythe re ursive all

upon (A+X)B. Ifabrother on ept of(A+X)B usesa maxmod ontaining somevertex x

ofX,this on ept hasalreadybeengeneratedpreviously. Ifweare arefultostoreinformationon

themaxmodswhi h have already beenused bya brother on ept or by a brother of an an estor

on ept,we an avoid omputingthe same on ept more thanon e.

General algorithmi pro ess on on ept AB:

Computeset ND of non-dominating maxmods of G

R

((P A)[B);

//Ea h maxmod X of ND de nes a on ept overing AB.

Computeset New of maxmodsof ND ontaining no already pro essed vertex;

//Ea h maxmod X of New de nes a new on ept overing AB.

For ea h maxmod X in New:

Re ursively apply pro ess to new on ept withintent A+X;

ADD all verti es of X to set of already pro essed verti es.

The bottlene k omplexity forthispro ess is omputingthe setof non-dominatingmaxmods,

(5)

amaxmodX of minimumdegree, omputethemaxmodswhi hdominate it,remove these andX

fromthevertexset,beforereiterating. AlgorithmMCS([22℄)wouldyielda ompatibleorderingof

themaxmodsinlinearO(m)time,and omputingthemaxmodsdominatingXwould ostthesame

time; globally, with this pro ess, we would obtain the set of non-dominating maxmods in O(m)

timepernon-dominating maxmod generated, whi h would add up to a worst-time omplexity of

O(nm)per on ept. Experimentally,thisalgorithm,implementedasexplainedabove,runsrapidly,

be ause oftheextrainformationonthealreadypro essedverti es,espe iallyifat ea hstepwhere

a maxmod X isadded to the setof already pro essed verti es, the maxmodswhi h dominate X

arealso added.

However, in order to improve the worse-timebehaviour,we propose a more sophisti ated ap-

proa h to omputingthenon-dominated maxmods,asdes ribedbelow.

Updating the domination information

InordertoeÆ ientlyanswerrequestsonthesetofnon-dominatingmaxmods,weuseadomination

table ontaining information on the urrent graph. As thisinformation an be inherited along a

maximal hain,maintaining thistable inthe ourseofthe Depth-Firsttraversalalong a maximal

hain avoids re omputing theentiredomination informationat ea h step ofthealgorithm.

The inheritan e me hanism involved is the following: when moving up into the latti e, say

froma on ept AB representedbytheunderlyinggraphG

R

((P A)[B)to a se ond on ept

(A+X)B 0

, overingthe rst,andrepresentedbytheunderlyinggraphG

R

((P (A+X)[(B

N +

(X)), two thingshappen:

1. SetX ofpropertiesdisappear.

2. SetN +

(X) ofobje tsdisappear.

In our example, when moving up from the bottom element ;O to element ah236, the new

subgraph will be de ned on (P fa;hg)[f2;3;6g, so that properties a and h will disappear, as

well as obje ts1, 4 and 5.

A vertexx isde nedasdominating anothervertexy ingraphG

R

ifwhenthere isan yiedge,

there also is an xi edge. Equivalently, if(y;i) is a non- ross of R , then (x;i) is also a non- ross

of R . Our idea, used to maintain Galois sub-hierar hies in [5 ℄, is to list into a table L, for ea h

pair ofproperties (x;y), theobje ts whi hprevent x from dominating y. Thismeans thatiffor

obje ti, (x;i)2R and (y;i)62R , iwillappear inthelistL[x;y℄.

The orresponding lists in our example are givenin table Lbelow:

a b d e f g h

a ; f1g f1;5g f1;4;5g f1;4g ; ; ;

b f6g ; f5g f4;5g f4g ; ; f6g

f3;6g f3g ; f4g f4g f3g f3g f3;6g

d f2;3;6g f2;3g f2g ; ; f3g f2;3g f2;3;6g

e f2;3;6g f2;3g f2;5g f5g ; f3g f2;3g f2;3;6g

f f2;6g f1;2g f1;2;5g f1;4;5g f1;4g ; f2g f2;6g

g f6g f1g f1;5g f1;4;5g f1;4g ; ; f6g

h ; f1g f1;5g f1;4;5g f1;4g ; ; ;

From this table L, we an seethat will dominate a whenobje ts1 and 5 have disappeared.

(6)

orrespond to distin tneighborsof y inG

R

,and ea h rowy ontainsn su hlists.

In our example, L[ ;a℄=f1;5g; (a;1) and (a;5) are edges of G

R .

UpdatingthetableLmeansforea h (x;y)-pairinP 2

,removingfromlistL[x;y℄theobje tswhi h

disappearfrom thegraphwhen movingup froma on ept to one of its su essors.

A tually, we are only on erned with the number of verti es whi h a vertex x dominates in

a given graph, so that ardinalities are suÆ ient for our data stru ture: a maxmod X will be

non-dominating when, for any x 2 X, the number of verti es whi h x dominates is exa tly jXj.

ThedominationtableT weusethus ontainsnumbersbetween0andjOj,T[x;y℄representingthe

sizeof list L[x;y℄, thusvertex x dominates vertex y i T[x;y℄=0. In order to have rapida ess

to thisinformation, we also keepa table D, s anningP,where D[x℄ givesthe numberof verti es

ysu hthatT[x;y℄=0,i.e. thenumberof verti eswhi hx dominates. A maxmodX willthusbe

non-dominatingif and onlyifforan arbitrary x 2X, D[x℄=jM(x)j, and the query: 'Whi h are

the non-dominating maxmods?' an beansweredinvery eÆ ient O(n)time usingtable D.

In our example, tablesT and D would be:

a b d e f g h

a 0 1 2 3 2 0 0 0

b 1 0 1 2 1 0 0 1

2 1 0 1 1 1 1 2

d 3 2 1 0 0 1 2 3

e 3 2 2 1 0 1 2 3

f 2 2 3 3 2 0 1 2

g 1 1 2 3 2 0 0 1

h 0 1 2 3 2 0 0 0

a b d e f g h

2 1 1 1 2 5 4 2

The pro ess for onstru ting theinitial dominationtable T from atable initializedto ontaining

zerovaluesisthe following:

For ea h x inP

For ea h y inP

For ea h z inO

If (x;z) 2R and (y;z)2= R thenadd 1 to T[x;y℄;

Thesetablesarepre-updatedatea h step todes ribethedominationrelationshipsinthenew

graph before a re ursive all, and then post-updated ba k to its original form. The global time

requiredfor pre-updating T along a maximal hain of the latti e doesnot ex eed the number of

bits ontainedinL,sothepre-updatingwill ostO(nm). Clearly,thepost-updating ostsexa tly

thesame. The updatingalgorithm isgiven inthenext se tion.

4 The algorithm

Thealgorithm isinitially alledbyCONCEPTS(;O;;) on anempty Markedset.

TablesT and D are onstru tedfrom G

R

asdes ribedinthe previousse tion.

Algorithm CONCEPTS

Input: A on ept AB,aset Markedof verti esof P

(7)

Initialization:

G G

R

((P A)[B);

Computethepartitionof P A inG into maxmods;

// Themaxmod whi h a vertex x2P belongs tois denoted by M(x).

For x inMarkeddo Marked Marked[M(x);

//1. Compute the setND of non-dominating maxmods of G.

ND ;;

For xin P A do

If D[x℄=jM(x)j thenND ND[M(x);

//2. If desirable,generate the over of AB.

For X inNDdo

A 0

A+X; B 0

O N

+

(X);

PRINT(A 0

B 0

);

//3. Generate the unpro essed des endants of AB.

For X inND Markeddo

A 0

A+X; B 0

O N

+

(X);

PRINT(A 0

B 0

);

//Whengenerating frequentsets, testsize of B 0

; iftoo small,take nextX in ND{Marked.

UPDATE(pre);

CONCEPTS(A 0

B 0

,Marked);

UPDATE(post);

Marked Marked[X;

End.

Algorithm UPDATE

Input: A variableV set to preorto post.

Output: TablesT and D aremodi ed using urrentvaluesof X and AB.

Begin

Choose arepresentativex inX;

//1. Updatetable D by simulating deletion of property set X.

For y in(P A) X do

If T[y;x℄=0then

If V =prethen D[y℄ D[y℄ jXj;

else D[y℄ D[y℄+jXj;

//2. Updatetables T andD by simulating deletion of obje ts in N +

(x).

For j inN +

(x) do

Z N

+

(j) X;

U (P A) Z X;

For (u;z) inUZ do

If V =prethen

T[u;z℄ T[u;z℄ 1;

If T[u;z℄=0 thenD[u℄ D[u℄+1;

else //V=post

T[u;z℄ T[u;z℄+1;

If T[u;z℄=1 thenD[u℄ D[u℄ 1;

End.

(8)

Ea hstepofAlgorithmCONCEPTSrequires omputingthemaxmodsofthegraph,whi h an

be doneinO(m)timeusingthealgorithmofHsuandMafrom [12 ℄;usingtableD, ndingtheset

ofnon-dominatingmaxmodsrequiresO(n)time; omparingthese withMarked ostsO(n) time,

thusa on ept is pro essedin globalO(m)time. Asdis ussed inSe tion3, Algorithm UPDATE

globally ostsO(nm)permaximal hain. Theglobaltime omplexityisthusinO(m)per on ept

plusO(nm) permaximal hain.

Spa e omplexity: the re ursive queue ontains at most O(n) on epts of size O(n) ea h,

Markedisof sizeO(n);T ontainsO(nm)bits; theglobalspa e omplexityisthusin O(nm).

Example4.1

Step1: Theexe utionstartswiththebottomelement;123456. InG=G

R

,thenon-dominating

maxmods are fa;hg, fbg, f g and fdg. The over of ;123456 is: ah236, b123, 125,

d145. ThesetMarkedofalreadypro essedverti esisempty. ah236is hosentobepro essed

next.

Step2: Con eptah  236is hosentobepro essednext;thetableisa ordinglypre-updated: sin e

obje ts1,4and5disappear,pairsfromtheCartesianprodu tsfb; ;d;egff;gg,fd;egfb; ;f;gg

andf ;dgfb;e;f;gg should ause the orrespondingnumbersfrom T do bede rementedby1.

Newtables T and Dobtained:

b d e f g

b 0 0 0 0 0 0

1 0 0 0 1 1

d 2 1 0 0 1 2

e 2 1 0 0 1 2

f 1 1 0 0 0 1

g 0 0 0 0 0 0

b d e f g

2 3 6 6 3 2

Graph G be omes G

R

(fb; ;d;e;f;g;2;3;6 g). Maxmods of G: fb;gg, f g, fd;eg, ffg ; non-

dominatingmaxmod: fb;gg. Con eptabgh23is generated.

Step3: abgh23ispro essed. Non-dominatingmaxmods: f gandffg. Con eptsab gh2 and

abfgh3 aregenerated; abfgh3 is hosen to be pro essednext.

Step 4: abfgh3 ispro essed. Non-dominatingmaxmod: f ;d;eg; topelement ab defgh; is

generated.

Step 5: ab defgh ; is pro essed; the graph G obtained is empty; no new on ept an be

generated.

Step6: step3re ursively allsab gh2,withMarked=ffg. Non-dominatingmaxmod: fd;e;fg;

sin ef isinMarked,no new on ept isgenerated.

Step 7: step 1 re ursively alls 125 with Marked=fa;hg. Non-dominating maxmods: fbg

andfdg. Con eptsb 12 and d15are generated. b 12is hosento be pro essednext.

Step 8: b 12 is pro essed, with Marked=fa;hg. Non-dominating maxmods: fd;eg and

fa;g;hg; sin e a and h are in Marked, only fd;eg will be used to generate a new on ept:

b de1.

Step9: b de1 ispro essed,withMarked=fa;hg. Non-dominatingmaxmod: fa;f;g;hg;sin e

aand hare inMarked,nonew on ept isgenerated.

Step10: step7re ursively alls d15,withMarked=fa;b;hg;fa;hgisinheritedfrom on ept

ah236, a brother of father 125, and fbg is inherited from brother on ept b 12. Non-

dominatingmaxmod: fb;eg. Sin eb isin Marked,no new on ept isgenerated.

Step 11: step 1 re ursively alls b123 with Marked=fa; ;hg. Non-dominating maxmods:

(9)

f g andfeg. Sin e is inMarked,only on ept de14 isgenerated.

Step13: de14ispro essed, withMarked=fa;b; ;hg. Non-dominatingmaxmod: fb; g. Sin e

b and are in Marked, no new on ept is generated. The re ursive queue is empty and the

algorithmterminates.

x 123456 φ

cd x 15

ah x 236

de x 14 abfgh x 3

abcdefgh x φ

{c} {d,e}

{d} {e}

{d}

{b}

{a,h} {c}

5

abcgh x 2 4 6

abgh x 23

3 8 bc x 12

bcde x 1 9

13

2 11 b x 123 7 c x 125 12 d x 145

{c,d,e}

{b,g}

[ah]

[abch]

[abh]

[ah]

[ach] [ah] [abch]

[f]

1

10 {f}

{b}

Figure 1: Con ept latti e L(R )of relation R ; the on epts are numbered in pre x order following

our sample re ursive exe ution; the inherited sets of already pro essed verti es appear between

bra kets. The edges of the Depth-First tree are labeled by the non-dominating maxmod used to

ompute ea h new on ept.

5 Con lusion

In this paper, we propose a new algorithmi approa h to on ept generation, whi h enables us

to improve all existing algorithms for this problem. Our omplexity analysis ould probably be

streamlined, as O(nm) per maximal hain is a very rough overestimation for the ost of the

updatingpro ess.

Another promisingapproa h to improve the omplexityfor thisproblem would be to usethe

propertythat thereis aone-to-one orresponden ebetweenthemaximal hainsof thelatti e and

the minimal triangulationsof the underlying o-bipartite graph ([4 ℄) and apply Meister's re ent

work (see [16 ℄) on thelinear-timetriangulationsof AT-free and Claw-free graphs,a super lassof

o-bipartitegraphs.

Referen es

[1℄ M. Barbutand B. Monjardet. Ordreet lassi ation. Classiques Ha hette,1970.

[2℄ A. Berry. A Wide-Range EÆ ient Algorithm forMinimal Triangulation. Pro eedings of the

10th AnnualACM-SIAMSymposium onDis reteAlgorithms(SODA'99),Baltimore,pp.860{

861, Jan. 1999.

[3℄ A. Berry, J.-P. Bordat and O. Cogis. Generating all the minimal separators of a graph.

International Journal of Foundations of Computer S ien e,11:397-404, 2000.

(10)

reteMaths and Data MiningWorkshop, 2nd SIAM Conferen e on Data Mining (SDM'02),

Arlington (VA), April2002, submittedto Dis rete AppliedMathemati s.

[5℄ A. Berry and A. Sigayret. Maintaining lass membership information. To appear in the

pro eedings of OOIS'02, Le ture Notes in Computer S ien e, Sept. 2002.

[6℄ G. Birkho . Latti e Theory. Ameri an Mathemati al So iety, 3rdEdition,1967.

[7℄ J.-P. Bordat. Cal ul pratique du treillis de Galois d'une orrespondan e. Mathematiques,

Informatique et S ien es Humaines,96:31{47, 1986.

[8℄ M. Chein. Algorithme de re her he de sous-matri es premieres d'une matri e. Bull. Math.

R.S. Roumanie, 13,1969.

[9℄ B. Ganter. Two basi algorithms in on ept analysis. Preprint 831, Te hnis he Ho hs hule

Darmstadt, 1984.

[10℄ B. Ganter and R.Wille. FormalCon ept Analysis. Springer,1999.

[11℄ A. Gueno he. Constru tion du treillis de Galois d'une relation binaire. Mathematiques,

Informatique et S ien es Humaines,121:23{34, 1993.

[12℄ W.-L. Hsuand T.-H.Ma. Substitutionde ompositionon hordalgraphsand itsappli ations.

SIAM Journal on Computing, 28:1004-1020, 1999.

[13℄ M. Hu hard,H.Di kyandH.Leblan . Galoislatti easaframeworktospe ifybuilding lass

hierar hiesalgorithms. Theoreti al Informati s andAppli ations, 34:521{548, 2000.

[14℄ T. Kloks and D. Krats h. Listing all minimal separators of a graph. SIAM Journal on

Computing, 27:605{613, 1998.

[15℄ M.LiquiereandJ.Sallantin.Stru turalma hinelearningwithGaloislatti esandGraphs.Pro .

of the 1998 Int. Conf. on Ma hine Learning (ICML'98) Morgan Kaufmann Ed,305-313.

[16℄ D. Meister. Minimaltriangulationsforsome lassesof AT-free graphs. Satellite workshop of

WG'02, Prague, 2002.

[17℄ E. Mephu Nguifo and P. Njiwoua. Using Latti e-Based Framework as a Tool for Feature

Extra tion.ECML, 304-309, 1998.

[18℄ L.NourineandO. Raynaud. A FastAlgorithmforbuildingLatti es. IPL,71:199{204, 1999.

[19℄ A. Parra and P. S heer. How to use the minimal separators of a graph for its hordal

triangulation. Pro eedings of the 22nd International Colloquium on Automata, Languages

and Programming (ICALP '95), Le ture Notes in Computer S ien e, 944:123{134, 1995.

[20℄ H. Shen. Separatorsare as simpleas utsets. Asian Computer S ien e Conferen e, Purket,

Thailand, De ember 10-13, 1999, Le tureNotes in Computer S ien e,172:347{358.

[21℄ H.ShengandW.Liang.EÆ ientenumerationofallminimalseparatorsinagraph.Theoreti al

Computer S ien e,180: 169{180, 1997.

[22℄ R. E.Tarjanand M. Yannakakis. Simplelinear-timealgorithms to test hordalityof graphs

[...℄. SIAM Journal of Computing, 13:566{579, 1984.

[23℄ P.Valt hef,R.Missaoui,andR.Godin.AFrameworkforIn rementalGenerationofFrequent

Closed Item Sets. Pro eedings of Dis rete Maths and Data Mining Workshop, 2nd SIAM

Conferen e on Data Mining(SDM'02), Arlington (VA), April2002.

[24℄ M. J. Zaki, S. Parthasarathy, M. Ogihara and W. Li. New Algorithms for Fast Dis overy

of Asso iation Rules. Pro eedings of 3rd International Conferen e on Database Systems for

Advan ed Appli ations, April1997.

References

Related documents

A notable aspect of these models is that they embody some version of a copying process: a node v links to other nodes by picking a random (other) node u in the graph, and copying

Standards for data quality outline a framework to enable the collection, analysis and use of good quality data to support the delivery of health and social care and to report

Patients who had viral load suppression at the time of their last paediatric visit were more than twice as likely to have viral load suppressed at their most recent adult clinic

• Make the information available to the different levels of the distribution channel (sales management and

In this paper, we obtain the bounds on the number of edges, vertices, domatic number, and domination number of the minimal dominating graph and vertex minimal dominating graph of

Discovery at Shadow Creek Ranch offers free personal fitness classes for our residents with a certified instructor lead team.. Should you need to find additional

and plans to achieve them PP1 – Policy & Program Management 7.1 Resources PP1 – Policy & Program Management 7.2 Competence PP2 – Embedding Business Continuity 7.3 Awareness

These differences and complexities thus provided a starting point for our knowledge exchange research workshops which set out to explore the various ways in which screen ‘talent’