General fuzzy decision trees provide a common framework for expressing different types of decision trees. An instance of general fuzzy decision trees is characterized by the following parameters:
1. data format in the FDT,
2. rule form in the FDL,
3. bipolar interpretation of data and wffs (disjunctive or conjunctive),
4. assignment of class labels to decision tree nodes,
5. computation of degrees of concentration.
In this section, we consider instances related to the classical decision tree [13], fuzzy decision tree [6], and multi-valued decision tree [2].
5.1 Classical Decision Trees Revisited
One classical instance of general fuzzy decision trees is characterized by the following parameters:
1. data format in the FDT: $f_i(x)$ is a singleton subset of $V_i$ for all $f_i \in A$ and $x \in U$.
2. rule form in the FDL: we restrict $L_i = V_i$ for all $f_i \in A$.
3. interpretation of wffs: with the restrictions on FDT and FDL, conjunctive and disjunctive interpretations collapse. We choose the t-norm $\otimes = \min$, so $\otimes$, $\oplus$, and $\rightarrow_{\otimes}$ correspond respectively to the classical Boolean operations $\wedge$, $\vee$, and $\rightarrow$. Therefore, $E(x, \varphi) \in \{0, 1\}$ holds for each $x \in U$ and wff $\varphi$. In particular, $E(x, (a_i, v)) = 1$ iff $f_i(x) = v$. Consequently, $U_s$ is a crisp subset of $U$ for each node $s$ of the decision tree.
4. assignment of class labels to decision tree nodes: we use the average support. Let $|U_s| = n_s$ and $V_m = \{v_1, \dots, v_k\}$, and assume that the number of objects in $U_s$ with decision value $v_i$ is $n_i$. Then, according to (2), the class label of $s$ is a fuzzy subset of $V_m$ with the membership function
$$\mu_s(v_i) = \frac{n_i}{n_s} = p_i, \quad 1 \le i \le k.$$
Note that $\sum_{i=1}^{k} p_i = 1$ holds in this case.
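5. computation of degrees of concentration: we use the sim defined in (10) and compute the gdc according to (5). Then
$$gdc_s = \sum_{i=1}^{k} p_i \cdot \frac{p_i}{1 + \sum_{j \ne i} p_j} = \sum_{i=1}^{k} \frac{p_i^2}{2 - p_i},$$
where the second equality uses $\sum_{j \ne i} p_j = 1 - p_i$.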
This kind of classical decision tree uses different stopping conditions and selection criteria than those based on information gains derived from entropy [13] or the Gini index [11].
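As an illustration of items 4 and 5, the following minimal Python sketch computes the average-support class label and the closed-form gdc of a crisp node. The function names `class_label` and `gdc` and the list-of-labels input are assumptions made for this example only; the sketch uses the closed form $\sum_i p_i^2 / (2 - p_i)$ stated above rather than the general definitions (5) and (10), which are not reproduced in this section.

```python
from collections import Counter


def class_label(decision_values):
    """Average-support label of a crisp node: maps v_i to p_i = n_i / n_s.

    `decision_values` is a hypothetical input: one decision value per
    object in U_s. Values absent from the node implicitly get p_i = 0.
    """
    n_s = len(decision_values)
    if n_s == 0:
        raise ValueError("a node must contain at least one object")
    counts = Counter(decision_values)
    return {v: n / n_s for v, n in counts.items()}


def gdc(label):
    """Closed form sum_i p_i^2 / (2 - p_i); assumes the p_i sum to 1."""
    return sum(p * p / (2.0 - p) for p in label.values())


pure = class_label(["yes", "yes", "yes", "yes"])   # a single class
mixed = class_label(["yes", "yes", "no", "no"])    # two equally frequent classes
print(gdc(pure), gdc(mixed))
```

Under this closed form a pure node attains the value 1, and the value decreases as the class distribution becomes more uniform, which is why it can act as a measure of concentration.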
5.2 Multi-Valued Decision Trees
An instance of multi-valued decision trees is characterized as follows:
1. data format in the FDT: $f_i(x)$ is a crisp non-empty subset of $V_i$ for all $f_i \in A$ and $x \in U$.
2. rule form in the FDL: we restrict $L_i = V_i$ for all $f_i \in A$.
3. interpretation of wffs: we use conjunctive interpretation, and still choose the t-norm $\otimes = \min$. Again, $E(x, \varphi) \in \{0, 1\}$ holds for each $x \in U$ and wff $\varphi$. In particular, $E(x, (a_i, v)) = 1$ iff $v \in f_i(x)$. Therefore, $U_s$ is also a crisp subset of $U$ for each node $s$ of the decision tree.
4. assignment of class labels to decision tree nodes: we use average support. Let $|U_s| = n_s$ and $V_m = \{v_1, \dots, v_k\}$, and assume that the number of objects in $U_s$ whose decision values contain $v_i$ is $n_i$. Then, according to (2), the class label of $s$ is a fuzzy subset of $V_m$ with the membership function
$$\mu_s(v_i) = \frac{n_i}{n_s} = p_i, \quad 1 \le i \le k.$$
Note that $\sum_{i=1}^{k} p_i = 1$ no longer holds in this case.
5. computation of degrees of concentration: we use the sim defined in (10) and compute the ldc according to (7). Then $ldc_s$ is equal to the set-similarity function defined in [2].
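To make item 4 concrete, here is a small Python sketch of the average-support label when each object carries a set of decision values. The function name `multi_valued_label` and the sample node are hypothetical. The ldc of (7) is not computed here, since its definition lies outside this section.

```python
def multi_valued_label(decision_sets, classes):
    """Maps each v_i in V_m to p_i = n_i / n_s, where n_i counts the
    objects whose (set-valued) decision contains v_i.

    `decision_sets` is a hypothetical input: one non-empty set per object.
    """
    n_s = len(decision_sets)
    if n_s == 0:
        raise ValueError("a node must contain at least one object")
    return {v: sum(1 for d in decision_sets if v in d) / n_s for v in classes}


# Illustrative node with four objects and V_m = {a, b, c}.
node = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b"}]
label = multi_valued_label(node, ["a", "b", "c"])
print(label, sum(label.values()))
```

Because an object with several decision values is counted once for each of them, the memberships in this example add up to more than 1.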
This kind of multi-valued decision tree is very similar to that in [2]. There are, however, two subtle differences. One is the assignment of class labels. In our approach, we assign a fuzzy subset of $V_m$ as the class label of a node, whereas in [2] this subset is further defuzzified into a crisp subset of $V_m$. The other difference is the stopping condition. In our approach, the stopping condition is based on $ldc_s$, whereas in [2] a criterion based purely on $\mu_s$ is given. In [2], with a user-specified parameter $\sigma$, the set $V_m$ is partitioned into $\mathit{large}_s$ and $\mathit{small}_s$ in a node $s$, where $\mathit{large}_s = \{v \in V_m \mid \mu_s(v) \ge \sigma\}$ and $\mathit{small}_s = V_m - \mathit{large}_s$. A node $s$ is called clear if $\min\{\mu_s(v) \mid v \in \mathit{large}_s\} - \max\{\mu_s(v) \mid v \in \mathit{small}_s\} > \delta$, where $\delta$ is again a user-specified parameter. Roughly speaking, if a node $s$ is clear, then no further expansion is needed and the class label assigned to the node is $\mathit{large}_s$.
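The clearness test attributed to [2] can be sketched in Python as follows. This is only an illustration of the criterion as paraphrased here, not the implementation of [2]: the function name `is_clear` is hypothetical, and treating a node as not clear when one side of the partition is empty is an assumption, since the text does not specify that case.

```python
def is_clear(label, sigma, delta):
    """Returns (clear, large) for a node with membership function `label`.

    `label` maps each v in V_m to mu_s(v); `sigma` and `delta` are the
    two user-specified parameters.
    """
    large = {v for v, mu in label.items() if mu >= sigma}
    small = set(label) - large
    if not large or not small:
        # Assumption: min or max over an empty set is undefined, so the
        # node is reported as not clear. The text leaves this case open.
        return False, large
    gap = min(label[v] for v in large) - max(label[v] for v in small)
    return gap > delta, large


label = {"a": 0.75, "b": 0.75, "c": 0.25}
print(is_clear(label, sigma=0.5, delta=0.3))
```

With these illustrative values the gap between the weakest large value and the strongest small value is 0.5, so the node is clear for $\delta = 0.3$ and would be labelled with the crisp set of large values.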
5.3 Fuzzy Decision Trees
For an instance of fuzzy decision trees, we use the following parameter setting:
1. data format in the FDT: $f_i(x)$ is a singleton subset of $V_i$ for all $f_i \in A$ and $x \in U$.
2. rule form in the FDL: we only require that $L_i$ is finite for all $f_i \in A$.
3. interpretation of wffs: we use disjunctive interpretation, and still choose the t-norm $\otimes = \min$. However, $E(x, (a_i, l)) = \mu_{ct(l)}(f_i(x))$ is now a real number in $[0, 1]$. Therefore, $U_s$ is a fuzzy subset of $U$ for each node $s$ of the decision tree.
4. assignment of class labels to decision tree nodes: we use average support, and still assume $V_m = \{v_1, \dots, v_k\}$. Let $SC$ denote the sigma count of $U_s$ and let $r_i$ denote $\sum_{x : f_m(x) = v_i} \mu_{U_s}(x)$ for $1 \le i \le k$. Then, the class label of $s$ is a fuzzy subset of $V_m$ with the membership function
$$\mu_s(v_i) = \frac{r_i}{SC} = p_i, \quad 1 \le i \le k.$$
Note that $\sum_{i=1}^{k} p_i = 1$ holds in this case.
5. computation of degrees of concentration: we use the sim defined in (10) and compute the gdc according to (5). Then, analogous to the case of classical decision trees,
$$gdc_s = \sum_{i=1}^{k} \frac{p_i^2}{2 - p_i}.$$
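The fuzzy case differs from the classical one only in that objects are weighted by their membership in $U_s$. A minimal Python sketch, assuming the node is given as two parallel lists (membership degrees and crisp decision values), could look as follows. The names `fuzzy_label` and `gdc_closed_form` are hypothetical, and the sigma count is taken as the sum of membership degrees.

```python
def fuzzy_label(memberships, decisions, classes):
    """Maps each v_i in V_m to p_i = r_i / SC.

    memberships[j] is mu_{U_s}(x_j) and decisions[j] is f_m(x_j); both
    are hypothetical inputs. Every decision must appear in `classes`.
    """
    sc = sum(memberships)  # sigma count SC of the fuzzy set U_s
    if sc <= 0:
        raise ValueError("the node must have a positive sigma count")
    r = {v: 0.0 for v in classes}
    for mu, d in zip(memberships, decisions):
        r[d] += mu  # r_i accumulates membership of objects with decision v_i
    return {v: r[v] / sc for v in classes}


def gdc_closed_form(label):
    """Closed form sum_i p_i^2 / (2 - p_i); assumes the p_i sum to 1."""
    return sum(p * p / (2.0 - p) for p in label.values())


label = fuzzy_label([0.5, 0.25, 0.25], ["yes", "no", "yes"], ["yes", "no"])
print(label, gdc_closed_form(label))
```

When every membership degree is 0 or 1, the sigma count reduces to $n_s$ and $r_i$ to $n_i$, so the sketch coincides with the classical computation.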
6 Conclusion
The decision tree approach is important because decision trees solve classification problems and extract rules effectively. To deal with different kinds of data, decision trees have been generalized in several directions in the past. In this paper, we propose a general framework for fuzzy decision trees. Some particular instances of this framework prove to be interesting alternatives to previous proposals. A detailed comparison of our approach with these related works is ongoing. Implementation and experimental testing of our approach are left for future research.