5 Variants of Decision Trees - Data Mining Foundations And Practice Tsau Young Lin (2008) pdf

General fuzzy decision trees provide a common framework for expressing different types of decision trees. An instance of general fuzzy decision trees is characterized by the following parameters:

1. data format in the FDT,

2. rule form in the FDL,

3. bipolar interpretation of data and wffs (disjunctive or conjunctive),

4. assignment of class labels to decision tree nodes,

5. computation of degrees of concentration.

In this section, we consider instances related to the classical decision tree [13], fuzzy decision tree [6], and multi-valued decision tree [2].

5.1 Classical Decision Trees Revisited

One classical instance of general fuzzy decision trees is characterized by the following parameters:

1. data format in the FDT: $f_i(x)$ is a singleton subset of $V_i$ for all $f_i \in A$ and $x \in U$.

2. rule form in the FDL: we restrict $L_i = V_i$ for all $f_i \in A$.

3. interpretation of wffs: with the restrictions on the FDT and FDL, the conjunctive and disjunctive interpretations collapse. We choose the t-norm $\otimes = \min$, so $\otimes$, $\oplus$, and $\rightarrow_{\otimes}$ correspond respectively to the classical Boolean operations $\wedge$, $\vee$, and $\rightarrow$. Therefore, $E(x, \varphi) \in \{0, 1\}$ holds for each $x \in U$ and wff $\varphi$. In particular, $E(x, (a_i, v)) = 1$ iff $f_i(x) = v$. Consequently, $U_s$ is a crisp subset of $U$ for each node $s$ of the decision tree.

4. assignment of class labels to decision tree nodes: we use the average support. Let $|U_s| = n_s$ and $V_m = \{v_1, \ldots, v_k\}$, and assume that the number of objects in $U_s$ with decision value $v_i$ is $n_i$. Then, according to (2), the class label of $s$ is a fuzzy subset of $V_m$ with the membership function

$$\mu_s(v_i) = \frac{n_i}{n_s} = p_i, \quad 1 \le i \le k.$$

Note that $\sum_{i=1}^{k} p_i = 1$ holds in this case.

5. computation of degrees of concentration: we use the sim defined in (10) and compute the gdc according to (5). Then

$$\mathrm{gdc}_s = \sum_{i=1}^{k} p_i \cdot \frac{p_i}{1 + \sum_{j \ne i} p_j} = \sum_{i=1}^{k} \frac{p_i^2}{2 - p_i}.$$

This kind of classical decision tree uses different stopping conditions and selection criteria than those based on information gains derived from entropy [13] or the Gini index [11].
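The class label and concentration degree of a crisp node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the list-of-decision-values input are hypothetical, and the expanded form is taken from the intermediate expression in item 5 rather than from the general definitions (5) and (10), which are not reproduced in this section.

```python
from collections import Counter
from fractions import Fraction


def class_label(decisions):
    """Map each decision value v_i to p_i = n_i / n_s for a crisp node.

    `decisions` lists the decision value of every object in U_s.
    (Function name and input layout are illustrative assumptions.)
    """
    n_s = len(decisions)
    if n_s == 0:
        raise ValueError("a node must contain at least one object")
    return {v: Fraction(n_i, n_s) for v, n_i in Counter(decisions).items()}


def gdc_expanded(p):
    """Sum over i of p_i * p_i / (1 + sum of p_j for j != i)."""
    total = sum(p.values())
    return sum(p_i * p_i / (1 + (total - p_i)) for p_i in p.values())


def gdc_closed_form(p):
    """Sum over i of p_i^2 / (2 - p_i); valid only when the p_i sum to 1."""
    return sum(p_i ** 2 / (2 - p_i) for p_i in p.values())


# A node with three "yes" objects and one "no" object.
label = class_label(["yes", "yes", "yes", "no"])  # p_yes = 3/4, p_no = 1/4
print(gdc_expanded(label), gdc_closed_form(label))

# A pure node, where every object has the same decision value.
print(gdc_closed_form(class_label(["yes"] * 4)))
```

For the mixed node both forms give 17/35, and for the pure node the value is 1. Decision values that do not occur in the node have p_i = 0 and contribute nothing to the sum, so omitting them from the dictionary does not change the result.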

5.2 Multi-Valued Decision Trees

An instance of multi-valued decision trees is characterized as follows:

1. data format in the FDT: $f_i(x)$ is a crisp non-empty subset of $V_i$ for all $f_i \in A$ and $x \in U$.

2. rule form in the FDL: we restrict $L_i = V_i$ for all $f_i \in A$.

3. interpretation of wffs: we use the conjunctive interpretation, and still choose the t-norm $\otimes = \min$. Again, $E(x, \varphi) \in \{0, 1\}$ holds for each $x \in U$ and wff $\varphi$. In particular, $E(x, (a_i, v)) = 1$ iff $v \in f_i(x)$. Therefore, $U_s$ is also a crisp subset of $U$ for each node $s$ of the decision tree.

4. assignment of class labels to decision tree nodes: we use the average support. Let $|U_s| = n_s$ and $V_m = \{v_1, \ldots, v_k\}$, and assume that the number of objects in $U_s$ whose decision values contain $v_i$ is $n_i$. Then, according to (2), the class label of $s$ is a fuzzy subset of $V_m$ with the membership function

$$\mu_s(v_i) = \frac{n_i}{n_s} = p_i, \quad 1 \le i \le k.$$

Note that $\sum_{i=1}^{k} p_i = 1$ need not hold in this case, because an object whose decision value contains several classes is counted once for each of them.

5. computation of degrees of concentration: we use the sim defined in (10) and compute the ldc according to (7). Then $\mathrm{ldc}_s$ is equal to the set-similarity function defined in [2].
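The class label assignment for set-valued decisions can be illustrated with a short Python sketch. The function name and the representation of each object's decision value as a Python set are assumptions made for the example. The ldc of (7) is not reproduced in this section, so the sketch covers only the membership function.

```python
from fractions import Fraction


def multi_valued_label(decision_sets, classes):
    """Return {v_i: p_i} with p_i = n_i / n_s for a multi-valued node.

    `decision_sets` holds one non-empty set of class values per object
    in U_s; n_i counts the objects whose set contains v_i.
    (Function name and input layout are illustrative assumptions.)
    """
    n_s = len(decision_sets)
    if n_s == 0:
        raise ValueError("a node must contain at least one object")
    return {
        v: Fraction(sum(1 for d in decision_sets if v in d), n_s)
        for v in classes
    }


# Four objects; two of them carry more than one decision value.
node = [{"a"}, {"a", "b"}, {"b", "c"}, {"a"}]
label = multi_valued_label(node, ["a", "b", "c"])
print(label)
print(sum(label.values()))  # exceeds 1 because objects are counted repeatedly
```

Here the memberships are 3/4, 1/2, and 1/4, which sum to 3/2 rather than 1.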

This kind of multi-valued decision tree is very similar to that in [2]. There are, however, two subtle differences. One is the assignment of class labels. In our approach, we assign a fuzzy subset of $V_m$ as the class label of a node, whereas in [2] this subset is further defuzzified into a crisp subset of $V_m$. The other difference is the stopping condition. In our approach, the stopping condition is based on $\mathrm{ldc}_s$, whereas in [2] a criterion based purely on $\mu_s$ is given. In [2], with a user-specified parameter $\sigma$, the set $V_m$ is partitioned into $large_s$ and $small_s$ in a node $s$, where $large_s = \{v \in V_m \mid \mu_s(v) \ge \sigma\}$ and $small_s = V_m - large_s$. A node $s$ is called clear if $\min\{\mu_s(v) \mid v \in large_s\} - \max\{\mu_s(v) \mid v \in small_s\} > \delta$, where $\delta$ is again a user-specified parameter. Roughly speaking, if a node $s$ is clear, then no further expansion is needed and the class label assigned to the node is $large_s$.
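The clearness test attributed to [2] can be sketched as follows. The function names are hypothetical, and the text does not say what should happen when either side of the partition is empty, so this sketch raises an error in that case rather than guessing.

```python
def split_large_small(mu, sigma):
    """Partition the classes into (large_s, small_s) by the threshold sigma.

    `mu` maps each class value v to its membership degree mu_s(v).
    """
    large = {v for v, degree in mu.items() if degree >= sigma}
    small = set(mu) - large
    return large, small


def is_clear(mu, sigma, delta):
    """Return True if min over large_s minus max over small_s exceeds delta."""
    large, small = split_large_small(mu, sigma)
    if not large or not small:
        # Assumption: this edge case is not specified in the text.
        raise ValueError("large_s and small_s must both be non-empty")
    gap = min(mu[v] for v in large) - max(mu[v] for v in small)
    return gap > delta


mu_s = {"a": 0.75, "b": 0.5, "c": 0.25}
print(split_large_small(mu_s, sigma=0.5))
print(is_clear(mu_s, sigma=0.5, delta=0.2))
print(is_clear(mu_s, sigma=0.5, delta=0.3))
```

With sigma = 0.5, the classes a and b are large and c is small, so the gap is 0.5 - 0.25 = 0.25. The node is clear for delta = 0.2 but not for delta = 0.3.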

5.3 Fuzzy Decision Trees

For an instance of fuzzy decision trees, we use the following parameter setting:

1. data format in the FDT: $f_i(x)$ is a singleton subset of $V_i$ for all $f_i \in A$ and $x \in U$.

2. rule form in the FDL: we only require that $L_i$ is finite for all $f_i \in A$.

3. interpretation of wffs: we use the disjunctive interpretation, and still choose the t-norm $\otimes = \min$. However, $E(x, (a_i, l)) = \mu_{ct(l)}(f_i(x))$ is now a real number in $[0, 1]$. Therefore, $U_s$ is a fuzzy subset of $U$ for each node $s$ of the decision tree.

4. assignment of class labels to decision tree nodes: we use the average support, and still assume $V_m = \{v_1, \ldots, v_k\}$. Let $SC$ denote the sigma count of $U_s$ and let $r_i$ denote $\sum_{x : f_m(x) = v_i} \mu_{U_s}(x)$ for $1 \le i \le k$. Then, the class label of $s$ is a fuzzy subset of $V_m$ with the membership function

$$\mu_s(v_i) = \frac{r_i}{SC} = p_i, \quad 1 \le i \le k.$$

Note that $\sum_{i=1}^{k} p_i = 1$ holds in this case.

5. computation of degrees of concentration: we use the sim defined in (10) and compute the gdc according to (5). Then, analogous to the case of classical decision trees,

$$\mathrm{gdc}_s = \sum_{i=1}^{k} \frac{p_i^2}{2 - p_i}.$$
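For a fuzzy node, the same computation replaces object counts with membership degrees. The sketch below assumes that the degrees of membership in U_s have already been computed for every object and are passed in as a list. The function names and input layout are hypothetical, and the sigma count is taken in its usual sense as the sum of all membership degrees.

```python
def fuzzy_label(memberships, decisions, classes):
    """Return {v_i: p_i} with p_i = r_i / SC for a fuzzy node.

    memberships[n] is the degree to which object n belongs to U_s, and
    decisions[n] is that object's (single) decision value f_m(x).
    (Function name and input layout are illustrative assumptions.)
    """
    sc = sum(memberships)  # sigma count of U_s
    if sc == 0:
        raise ValueError("the node has an empty fuzzy extension")
    r = {v: 0.0 for v in classes}
    for degree, v in zip(memberships, decisions):
        r[v] += degree  # r_i accumulates the degrees of objects with class v_i
    return {v: r[v] / sc for v in classes}


def gdc(p):
    """Sum over i of p_i^2 / (2 - p_i); assumes the p_i sum to 1."""
    return sum(p_i ** 2 / (2 - p_i) for p_i in p.values())


memberships = [0.5, 1.0, 0.25, 0.25]
decisions = ["yes", "yes", "no", "no"]
label = fuzzy_label(memberships, decisions, ["yes", "no"])
print(label)       # r_yes = 1.5, r_no = 0.5, SC = 2.0
print(gdc(label))
```

In this example the memberships give p_yes = 0.75 and p_no = 0.25, the same label as a crisp node with three "yes" objects and one "no" object, so the concentration degree is also the same. When every membership degree is 0 or 1, the computation reduces to the classical case.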

6 Conclusion

The decision tree approach is important because decision trees provide solutions to classification problems and extract rules effectively. To deal with different kinds of data, decision trees have been generalized in several directions in the past. In this paper, we propose a general framework for fuzzy decision trees. Some particular instances of this framework prove to be interesting alternatives to previous proposals. A detailed comparison of our approach with these related works is ongoing. Implementation and experimental testing of our approach are left to future research.
