and Characteristic Relations - Data Mining Foundations And Practice Tsau Young Lin (2008) pdf

For data sets with missing attribute values, the corresponding function ρ is incompletely specified (partial). A decision table with an incompletely specified function ρ will be called incompletely specified, or incomplete.

In the sequel we will assume that all decision values are specified, i.e., they are not missing. Also, we will assume that every missing attribute value is denoted by “?”, “*”, or “–”: lost values by “?”, “do not care” conditions by “*”, and attribute-concept values by “–”. Additionally, we will assume that for each case at least one attribute value is specified.

Incomplete decision tables are described by characteristic relations instead of indiscernibility relations. Also, elementary sets are replaced by characteristic sets. The characteristic set was called a (binary) neighborhood in [16–18]. An example of an incomplete table is presented in Table 2.

For incomplete decision tables the definition of a block of an attribute-value pair must be modified.

• If for an attribute a there exists a case x such that ρ(x, a) = ?, i.e., the corresponding value is lost, then the case x should not be included in any block [(a, v)], for any value v of attribute a.

• If for an attribute a there exists a case x such that the corresponding value is a “do not care” condition, i.e., ρ(x, a) = ∗, then the case x should be included in the blocks [(a, v)] for all specified values v of attribute a.

• If for an attribute a there exists a case x such that the corresponding value is an attribute-concept value, i.e., ρ(x, a) = −, then the case x should be included in the blocks [(a, v)] for all specified values v of attribute a that are members of the set V(x, a), where

V(x, a) = {ρ(y, a) | ρ(y, a) is specified, y ∈ U, ρ(y, d) = ρ(x, d)},

and d is the decision.

Table 2. An incomplete decision table (Temperature, Headache, and Nausea are attributes; Flu is the decision)

| Case | Temperature | Headache | Nausea | Flu |
|------|-------------|----------|--------|-----|
| 1    | High        | –        | No     | Yes |
| 2    | Very high   | Yes      | Yes    | Yes |
| 3    | ?           | No       | No     | No  |
| 4    | High        | Yes      | Yes    | Yes |
| 5    | High        | ?        | Yes    | No  |
| 6    | Normal      | Yes      | No     | No  |
| 7    | Normal      | No       | Yes    | No  |
| 8    | –           | Yes      | *      | Yes |

These modifications of the definition of the block of an attribute-value pair are consistent with the interpretation of the three kinds of missing attribute values: lost values, “do not care” conditions, and attribute-concept values. Also, note that the attribute-concept value is the most universal: if V(x, a) = ∅, the attribute-concept value reduces to a lost value, and if V(x, a) is the set of all values of attribute a, the attribute-concept value becomes a “do not care” condition.

For Table 2, for case 1, ρ(1, Headache) = −, and V(1, Headache) = {yes}, so we add case 1 to [(Headache, yes)]. For case 3, ρ(3, Temperature) = ?, hence case 3 is not included in any of the following sets: [(Temperature, high)], [(Temperature, very high)], and [(Temperature, normal)]. Similarly, ρ(5, Headache) = ?, so case 5 is included in neither [(Headache, yes)] nor [(Headache, no)]. Also, ρ(8, Temperature) = −, and V(8, Temperature) = {high, very high}, so case 8 is a member of both [(Temperature, high)] and [(Temperature, very high)]. Finally, ρ(8, Nausea) = ∗, so case 8 is included in both [(Nausea, no)] and [(Nausea, yes)]. Thus,

[(Temperature, high)] = {1, 4, 5, 8},
[(Temperature, very high)] = {2, 8},
[(Temperature, normal)] = {6, 7},
[(Headache, yes)] = {1, 2, 4, 6, 8},
[(Headache, no)] = {3, 7},
[(Nausea, no)] = {1, 3, 6, 8},
[(Nausea, yes)] = {2, 4, 5, 7, 8}.
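The three rules for blocks can be checked mechanically. The following is a minimal sketch, not the authors' implementation: it encodes Table 2 as a Python dictionary and rebuilds the blocks listed above. The names `TABLE`, `concept_values`, and `attribute_value_blocks` are hypothetical, the symbol "-" stands for the attribute-concept value, and the Headache and Flu entries for case 8 are inferred from the listed blocks and from V(8, Temperature), which is an assumption of this sketch.

```python
# Table 2 as case -> (Temperature, Headache, Nausea, Flu).
# "?" = lost value, "*" = "do not care" condition, "-" = attribute-concept value.
# Case 8 uses the values implied by the worked example (an assumption).
TABLE = {
    1: ("high", "-", "no", "yes"),
    2: ("very high", "yes", "yes", "yes"),
    3: ("?", "no", "no", "no"),
    4: ("high", "yes", "yes", "yes"),
    5: ("high", "?", "yes", "no"),
    6: ("normal", "yes", "no", "no"),
    7: ("normal", "no", "yes", "no"),
    8: ("-", "yes", "*", "yes"),
}
ATTRIBUTES = ("Temperature", "Headache", "Nausea")
MISSING = {"?", "*", "-"}


def concept_values(table, x, j):
    """V(x, a): specified values of attribute j over cases sharing x's decision."""
    decision = table[x][-1]
    return {
        row[j]
        for row in table.values()
        if row[-1] == decision and row[j] not in MISSING
    }


def attribute_value_blocks(table, attributes):
    """Return {(a, v): set of cases} following the three modified rules."""
    blocks = {}
    for j, a in enumerate(attributes):
        specified = {row[j] for row in table.values() if row[j] not in MISSING}
        for v in specified:
            blocks[(a, v)] = set()
        for x, row in table.items():
            value = row[j]
            if value == "?":        # lost: x joins no block of a
                targets = set()
            elif value == "*":      # "do not care": x joins every block of a
                targets = specified
            elif value == "-":      # attribute-concept: x joins blocks for V(x, a)
                targets = concept_values(table, x, j)
            else:                   # specified value: x joins exactly one block
                targets = {value}
            for v in targets:
                blocks[(a, v)].add(x)
    return blocks


BLOCKS = attribute_value_blocks(TABLE, ATTRIBUTES)
for pair in sorted(BLOCKS):
    print(pair, sorted(BLOCKS[pair]))
```

Under these assumptions the printed blocks agree with the seven blocks given in the text; for instance, case 8 appears in both Temperature blocks selected by V(8, Temperature) and in both Nausea blocks.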

For a case x ∈ U, the characteristic set K_B(x) is defined as the intersection of the sets K(x, a), for all a ∈ B.

If ρ(x, a) is specified, then K(x, a) is the block [(a, ρ(x, a))] of attribute a and its value ρ(x, a). If ρ(x, a) = ∗ or ρ(x, a) = ?, then K(x, a) = U. If ρ(x, a) = − and V(x, a) is nonempty, then K(x, a) is equal to the union of all blocks of attribute-value pairs (a, v), where v ∈ V(x, a). If ρ(x, a) = − and V(x, a) is empty, then K(x, a) = {x}.

The way of computing characteristic sets needs a comment. For both “do not care” conditions and lost values the corresponding set K(x, a) is equal to U because the corresponding attribute a does not restrict the set K_B(x): if ρ(x, a) = ∗, the value of the attribute a is irrelevant; if ρ(x, a) = ?, only existing values need to be checked. However, the case when ρ(x, a) = − is different, since the attribute a restricts the set K_B(x). Furthermore, the description of K_B(x) should be consistent with other (but similar) possible approaches to missing attribute values, e.g., an approach in which each missing attribute value is replaced by the most common attribute value restricted to a concept. Here the set V(x, a) contains a single element and the characteristic relation is an equivalence relation. Our definition is consistent with this special case in the sense that, for such a decision table, computing the characteristic relation using our definition gives the same result as computing the indiscernibility relation as for complete decision tables using the definitions from Sect. 2. For Table 2 and B = A,

K_A(1) = {1, 4, 5, 8} ∩ {1, 2, 4, 6, 8} ∩ {1, 3, 6, 8} = {1, 8},
K_A(2) = {2, 8} ∩ {1, 2, 4, 6, 8} ∩ {2, 4, 5, 7, 8} = {2, 8},
K_A(3) = U ∩ {3, 7} ∩ {1, 3, 6, 8} = {3},
K_A(4) = {1, 4, 5, 8} ∩ {1, 2, 4, 6, 8} ∩ {2, 4, 5, 7, 8} = {4, 8},
K_A(5) = {1, 4, 5, 8} ∩ U ∩ {2, 4, 5, 7, 8} = {4, 5, 8},
K_A(6) = {6, 7} ∩ {1, 2, 4, 6, 8} ∩ {1, 3, 6, 8} = {6},
K_A(7) = {6, 7} ∩ {3, 7} ∩ {2, 4, 5, 7, 8} = {7}, and
K_A(8) = ({1, 4, 5, 8} ∪ {2, 8}) ∩ {1, 2, 4, 6, 8} ∩ U = {1, 2, 4, 8}.
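The computation of the sets K(x, a) and of their intersection can be sketched as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the blocks are copied from the list given for Table 2, the row for case 8 is inferred from the worked example, "-" stands for the attribute-concept value, and the names `K` and `characteristic_set` are hypothetical.

```python
U = set(range(1, 9))
ATTRIBUTES = ("Temperature", "Headache", "Nausea")
SPECIAL = ("?", "*", "-")  # lost, "do not care", attribute-concept

# Table 2 as case -> (Temperature, Headache, Nausea, Flu); row 8 is inferred.
TABLE = {
    1: ("high", "-", "no", "yes"),
    2: ("very high", "yes", "yes", "yes"),
    3: ("?", "no", "no", "no"),
    4: ("high", "yes", "yes", "yes"),
    5: ("high", "?", "yes", "no"),
    6: ("normal", "yes", "no", "no"),
    7: ("normal", "no", "yes", "no"),
    8: ("-", "yes", "*", "yes"),
}

# Blocks of attribute-value pairs, copied from the text.
BLOCKS = {
    ("Temperature", "high"): {1, 4, 5, 8},
    ("Temperature", "very high"): {2, 8},
    ("Temperature", "normal"): {6, 7},
    ("Headache", "yes"): {1, 2, 4, 6, 8},
    ("Headache", "no"): {3, 7},
    ("Nausea", "no"): {1, 3, 6, 8},
    ("Nausea", "yes"): {2, 4, 5, 7, 8},
}


def K(x, j):
    """K(x, a) for the attribute with index j, following the four cases."""
    a, value = ATTRIBUTES[j], TABLE[x][j]
    if value in ("?", "*"):
        return set(U)                      # attribute does not restrict K_B(x)
    if value == "-":
        decision = TABLE[x][-1]
        V = {r[j] for r in TABLE.values()
             if r[-1] == decision and r[j] not in SPECIAL}
        if not V:
            return {x}                     # empty V(x, a)
        return set().union(*(BLOCKS[(a, v)] for v in V))
    return set(BLOCKS[(a, value)])         # specified value: a single block


def characteristic_set(x, attribute_indices=range(len(ATTRIBUTES))):
    """K_B(x): intersection of K(x, a) over the chosen attributes."""
    result = set(U)
    for j in attribute_indices:
        result &= K(x, j)
    return result


for x in sorted(U):
    print(x, sorted(characteristic_set(x)))
```

Passing a subset of attribute indices, e.g. `characteristic_set(5, [0, 2])`, gives K_B(x) for a proper subset B of A in the same way.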

The characteristic set K_B(x) may be interpreted as the smallest set of cases that are indistinguishable from x using all attributes from B and the given interpretation of missing attribute values. Thus, K_A(x) is the set of all cases that cannot be distinguished from x using all attributes. Also, note that the previous definition is an extension of the definition of K_B(x) from [7–9]: for decision tables with only lost values and “do not care” conditions, both definitions are identical.

The characteristic relation R(B) is a relation on U defined for x, y ∈ U as follows:

(x, y) ∈ R(B) if and only if y ∈ K_B(x).

The characteristic relation R(B) is reflexive but – in general – it does not need to be symmetric or transitive. Also, the characteristic relation R(B) is known if we know the characteristic sets K_B(x) for all x ∈ U. In our example,

R(A) = {(1, 1), (1, 8), (2, 2), (2, 8), (3, 3), (4, 4), (4, 8), (5, 4), (5, 5), (5, 8), (6, 6), (7, 7), (8, 1), (8, 2), (8, 4), (8, 8)}.
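The relation R(A) and its properties can be verified directly from the characteristic sets. The sketch below is illustrative only: it takes the sets K_A(x) as given in the text and uses hypothetical helper names (`is_reflexive`, `is_symmetric`, `is_transitive`).

```python
# Characteristic sets K_A(x) for Table 2, copied from the text.
K_A = {
    1: {1, 8},
    2: {2, 8},
    3: {3},
    4: {4, 8},
    5: {4, 5, 8},
    6: {6},
    7: {7},
    8: {1, 2, 4, 8},
}
U = set(K_A)

# (x, y) is in R(A) if and only if y is in K_A(x).
R = {(x, y) for x, ys in K_A.items() for y in ys}


def is_reflexive(relation, universe):
    return all((x, x) in relation for x in universe)


def is_symmetric(relation):
    return all((y, x) in relation for x, y in relation)


def is_transitive(relation):
    return all(
        (x, w) in relation
        for x, y in relation
        for z, w in relation
        if y == z
    )


print(sorted(R))
print("reflexive:", is_reflexive(R, U))
print("symmetric:", is_symmetric(R))
print("transitive:", is_transitive(R))
```

For this example the checks report that R(A) is reflexive but neither symmetric nor transitive: (5, 4) ∈ R(A) while (4, 5) ∉ R(A), and (1, 8), (8, 2) ∈ R(A) while (1, 2) ∉ R(A).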

For decision tables in which all missing attribute values are lost, a special characteristic relation LV(B) was defined by Stefanowski and Tsoukias in [24], see also [23, 25]. The characteristic relation LV(B) is reflexive, but – in general – it does not need to be symmetric or transitive.

For decision tables in which all missing attribute values are “do not care” conditions, a special characteristic relation DCC(B) was defined by Kryszkiewicz in [14], see also, e.g., [15]. The relation DCC(B) is reflexive and symmetric but – in general – not transitive.

Obviously, the characteristic relations LV(B) and DCC(B) are special cases of the characteristic relation R(B). For a completely specified decision table, the characteristic relation R(B) is reduced to IND(B).
