Chapter 5: A Fuzzy Modelling Approach with a Hierarchical Clustering
5.3 Data Clustering and Initial Fuzzy Model Construction
5.3.2 Initial Fuzzy Model Construction
Using the agglomerative complete-link clustering algorithm, a predefined number of clusters can be obtained from the training data. The information these clusters provide is then used to construct an initial fuzzy model. In this modelling approach, each cluster corresponds to one fuzzy rule; the centres of the membership functions are taken from the centre positions of their corresponding clusters; and the remaining membership function parameters are set so that each membership function covers all the training data contained in its corresponding cluster.
5.3.2.1 An Example of Constructing the Initial Fuzzy Model
Figure 5-8 shows an example of how to construct an initial fuzzy model from the cluster information:
1. One fuzzy rule corresponds to one cluster. For instance, for the upper-right cluster in Figure 5-8, the corresponding fuzzy rule is: IF input is A13, THEN output is A23.
2. The membership function centres correspond to the cluster centres. As shown in Figure 5-8, the green dash-dotted lines mark the cluster centres, which are used to define the membership function centres.
3. The membership function widths correspond to the cluster widths, which are shown as red dashed lines in Figure 5-8. The parameters that govern the covering range of each membership function are set according to the width of its cluster.
5.3.2.2 Fuzzy System Definition and Notation
A generic multi-input and single-output (MISO) fuzzy model is represented as a collection of fuzzy rules in the following form:
Rule $R^k$: IF $x_1$ is $A_1^k$ and $x_2$ is $A_2^k$ and $\dots$ and $x_D$ is $A_D^k$, THEN $y$ is $B^k$,
where $R^k$ is the label of the $k$th fuzzy rule; $x = [x_1, x_2, \dots, x_D]^T \in U_1 \times U_2 \times \dots \times U_D$ are the input linguistic variables; $A_l^k$ are the antecedent fuzzy sets on the universes of discourse $U_l$, where $l = 1, 2, \dots, D$; $y \in V$ is the output linguistic variable; and $B^k$ is a consequent fuzzy set on the universe of discourse $V$.
In this work, Gaussian functions are chosen as the membership functions (without any loss of generality), i.e.:
$$\mu_A(x) = \exp\left(-\frac{(x - c)^2}{\sigma^2}\right), \tag{5.4}$$
where $\mu_A(x)$ is the membership degree of $x$ in the fuzzy set $A$; the parameters $c$ and $\sigma$ represent the centre and the width of this membership function, and $\sigma$ is a positive number. In addition, the product inference engine and the centre-average defuzzification method [Wang 1997] are used in the fuzzy systems of this work.
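As an illustration of how the Gaussian membership function of Eq. (5.4) combines with product inference and centre-average defuzzification, the following is a minimal sketch rather than the implementation used in this work. The rule layout (a dictionary with keys `c`, `sigma` and `y`) and the function names are hypothetical, and each consequent fuzzy set is assumed to be summarised by its centre `y`.

```python
import math


def gaussian_mf(x, c, sigma):
    """Membership degree exp(-(x - c)^2 / sigma^2), following Eq. (5.4)."""
    return math.exp(-((x - c) ** 2) / sigma ** 2)


def fuzzy_output(x, rules):
    """Product inference with centre-average defuzzification.

    Each rule is a dict with a hypothetical layout:
      "c":     antecedent centres, one per input
      "sigma": antecedent widths, one per input (all positive)
      "y":     centre of the consequent fuzzy set (assumed representation)
    """
    num = 0.0
    den = 0.0
    for rule in rules:
        # Firing strength: product of the antecedent membership degrees.
        w = 1.0
        for xl, c, s in zip(x, rule["c"], rule["sigma"]):
            w *= gaussian_mf(xl, c, s)
        # Weighted average of the consequent centres.
        num += w * rule["y"]
        den += w
    return num / den


# Two single-input rules with illustrative parameters.
rules = [
    {"c": [0.0], "sigma": [1.0], "y": 0.0},
    {"c": [2.0], "sigma": [1.0], "y": 4.0},
]
print(fuzzy_output([1.0], rules))
```

At the input 1.0 both rules fire equally, so the output is the plain average of the two consequent centres.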
5.3.2.3 The Fuzzy Model Extraction Approach
Consider a set of input-output ($D$-input and 1-output) training data $\{p^1, p^2, \dots, p^N\}$, with $p^m = [x_1^m, x_2^m, \dots, x_D^m, y^m]^T$, where $m = 1, 2, \dots, N$ and $N$ is the number of training data.
By using the agglomerative complete-link clustering algorithm, a predefined number of clusters can be obtained from the training data. Let $C_n$ represent the $n$th cluster, with $C_n = \{p^{n1}, p^{n2}, \dots, p^{n(ND_n)}\}$, where $n = 1, 2, \dots, N_c$ and $ND_n$ is the number of data in the $n$th cluster.
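To make the clustering step concrete, the following is a naive sketch of agglomerative complete-link clustering that stops at a predefined number of clusters. It is written for readability, not efficiency, and is not the implementation used in this work; the function name `complete_link`, the Euclidean distance, and the sample data are assumptions made for illustration.

```python
import math


def complete_link(points, n_clusters):
    """Naive agglomerative complete-link clustering.

    points: list of equal-length numeric sequences.
    Returns a list of clusters, each a list of indices into points.
    """
    # Start with every datum in its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete link: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest complete-link distance.
        best = None
        for u in range(len(clusters)):
            for v in range(u + 1, len(clusters)):
                d = linkage(clusters[u], clusters[v])
                if best is None or d < best[0]:
                    best = (d, u, v)
        _, u, v = best
        # Merge cluster v into cluster u.
        clusters[u] = clusters[u] + clusters[v]
        del clusters[v]
    return clusters


# Illustrative 1-input, 1-output data (not taken from the thesis).
data = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (1.1, 0.9), (2.0, 2.1)]
groups = complete_link(data, 3)
print(groups)
```

Each returned group of indices plays the role of one cluster $C_n$, and its length is $ND_n$.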
In this modelling approach, the rule-base is composed of $N_c$ fuzzy rules. The fuzzy rule corresponding to the cluster $C_n$ can be represented as follows:
$R^n$: IF $x_1$ is $A_1^n$ and $x_2$ is $A_2^n$ and $\dots$ and $x_D$ is $A_D^n$, THEN $y$ is $B^n$,
where $n = 1, 2, \dots, N_c$; $x = [x_1, x_2, \dots, x_D]^T$ are the input linguistic variables; $A_i^n$ are the antecedent fuzzy sets, where $i = 1, 2, \dots, D$; $y$ is the output linguistic variable; and $B^n$ is a consequent fuzzy set.
Consider one fuzzy set $A_i^n$. The Gaussian membership function of $A_i^n$ has two parameters, $c_i^n$ and $\sigma_i^n$. The centre $c_i^n$ can be calculated using the following equation:
$$c_i^n = \frac{\sum_{j=1}^{ND_n} x_i^{nj}}{ND_n}. \tag{5.5}$$
It is worth noting that the membership function should cover all the training data contained in its corresponding cluster. In other words, for every datum in this cluster, the membership degree should be high enough to ensure that the datum maps into this rule. Based on this requirement, the membership parameter $\sigma_i^n$ is designed
to satisfy the following equation:
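$$\min_j \mu_{A_i^n}(x_i^{nj}) = \min_j \exp\left(-\frac{(x_i^{nj} - c_i^n)^2}{(\sigma_i^n)^2}\right) = Th, \tag{5.6}$$
where $j = 1, 2, \dots, ND_n$. This equation means that, for all the data included in the $n$th cluster, the membership degrees are no lower than a threshold $Th$. The value of $Th$ is set to 0.5 in this work without any loss of generality. Because the Gaussian function decreases as $|x_i^{nj} - c_i^n|$ grows, the minimum is attained at the datum farthest from the centre, so the previous equation can be rewritten as follows:
$$\sigma_i^n = \frac{\max_j |x_i^{nj} - c_i^n|}{\sqrt{-\ln(Th)}}, \tag{5.7}$$
where $j = 1, 2, \dots, ND_n$. Using this equation, the parameter $\sigma_i^n$ is determined.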
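The parameter extraction described by Eqs. (5.5) and (5.7) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `membership_parameters` and the sample cluster are hypothetical, and the same formulas are applied to every coordinate passed in, so whether the output coordinate is treated this way depends on what the caller supplies.

```python
import math


def membership_parameters(cluster, th=0.5):
    """Centres (Eq. 5.5) and widths (Eq. 5.7) for one cluster.

    cluster: list of data points, each a sequence of equal length.
    th:      membership threshold Th, assumed to satisfy 0 < th < 1.
    Returns (centres, widths), one entry per coordinate.
    """
    n = len(cluster)
    dims = len(cluster[0])
    scale = math.sqrt(-math.log(th))
    centres, widths = [], []
    for i in range(dims):
        values = [p[i] for p in cluster]
        # Eq. (5.5): mean of the i-th coordinate over the cluster.
        c = sum(values) / n
        # Eq. (5.7): largest deviation from the centre, scaled by the threshold.
        spread = max(abs(v - c) for v in values)
        centres.append(c)
        # Note: the width is 0 if all values coincide; handling that case
        # is not specified in the text and is left to the caller.
        widths.append(spread / scale)
    return centres, widths


# Illustrative cluster of three points (not taken from the thesis).
cluster = [(1.0, 10.0), (2.0, 12.0), (3.0, 14.0)]
centres, widths = membership_parameters(cluster)
print(centres, widths)
```

With these widths, the datum farthest from each centre has a membership degree equal to the threshold, and every other datum in the cluster has a higher one.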