
Automated Methods for Fuzzy Systems


Academic year: 2021


Gradient Method

Adriano Joaquim de Oliveira Cruz

PPGI-UFRJ


Summary

1 Introduction

2 Training Standard Fuzzy System

3 Output Membership Function Centers Update Law

4 Input Membership Function Centers Update Law

5 Input Membership Function Spreads Update Law

6 Example

1 Introduction

Bibliography

Kevin M. Passino, Stephen Yurkovich. Fuzzy Control, Chapter 5. Addison Wesley Longman, Inc., USA, 1998.

Timothy J. Ross. Fuzzy Logic with Engineering Applications. John Wiley and Sons, Inc., USA, 2010.

J. R. Jang, C. Sun, E. Mizutani. Neuro Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence.


Constructing fuzzy systems

How to construct a fuzzy system from numeric data?

Using data obtained experimentally from a system, it is possible to identify the model.

Find a model that fits the data by using fuzzy interpolation capabilities.


Introduction

We need to construct a fuzzy system $f(x|\theta)$ that approximates the function $g$ represented in the training data $G$.

There is no guarantee that it will succeed.

2 Training Standard Fuzzy System


The System

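Gaussian input membership functions with centers $c_j^i$ and spreads $\sigma_j^i$. Output membership function centers $b_i$.

Product for premise and implication. Center-average defuzzification.

It is described by

$$f(x|\theta) = \frac{\sum_{i=1}^{R} b_i \prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j - c_j^i}{\sigma_j^i}\right)^2\right]}{\sum_{i=1}^{R} \prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j - c_j^i}{\sigma_j^i}\right)^2\right]}$$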
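The formula for $f(x|\theta)$ can be evaluated directly. The following is a minimal Python sketch, not the author's code: the function names `rule_strength` and `fuzzy_output` and the list-of-lists layout (`c[i][j]` for rule $i$, input $j$) are assumptions made for illustration, and the parameter values are illustrative only.

```python
import math

def rule_strength(x, centers, spreads):
    """Premise value of one rule: product of Gaussian memberships over the inputs."""
    return math.prod(
        math.exp(-0.5 * ((xj - cj) / sj) ** 2)
        for xj, cj, sj in zip(x, centers, spreads)
    )

def fuzzy_output(x, b, c, sigma):
    """Center-average defuzzified output f(x | theta) for R rules."""
    mu = [rule_strength(x, c[i], sigma[i]) for i in range(len(b))]
    return sum(bi * mi for bi, mi in zip(b, mu)) / sum(mu)

# Illustrative two-rule, two-input system (assumed values).
b = [1.0, 5.0]                       # output centers b_i
c = [[0.0, 2.0], [2.0, 4.0]]         # input centers c_j^i
sigma = [[1.0, 1.0], [1.0, 1.0]]     # input spreads sigma_j^i
print(fuzzy_output([0.0, 2.0], b, c, sigma))
```

Because the output is a weighted average of the $b_i$, it always lies between the smallest and largest output center.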


Error

Suppose that you have the $m$th training data pair $(x^m, y^m) \in G$.

The goal of the gradient method (GM) is to minimize the error between the predicted output value $f(x^m|\theta)$ and the actual output value $y^m$.

The equation for the error surface is:

$$e_m = \frac{1}{2}\left[f(x^m|\theta) - y^m\right]^2$$

We seek to minimize $e_m$ by choosing the parameters $\theta$, which are $b_i$, $c_j^i$ and $\sigma_j^i$, $i = 1, 2, \ldots, R$, $j = 1, 2, \ldots, n$.

$R$ rules, $n$ input variables.

3 Output Membership Function Centers Update Law


$b_i$ Update Law

How do we adjust the $b_i$ to minimize $e_m$?

We will use

$$b_i(k+1) = b_i(k) - \lambda_1 \left.\frac{\partial e_m}{\partial b_i}\right|_k$$

where $i = 1, 2, \ldots, R$.


Gradient Descent

The update method moves $b_i$ along the negative gradient of the error surface.

The parameter $\lambda_1 > 0$ characterizes the step size.

If $\lambda_1$ is chosen too small, then $b_i$ is adjusted very slowly.

If $\lambda_1$ is chosen too big, then the update may step over the minimum value of $e_m$.

Some algorithms try to choose the step size adaptively.

If the error is big, increase $\lambda_1$, but if the errors are decreasing, take small steps.
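The effect of the step size can be seen on a toy problem. The sketch below is not taken from the slides: it assumes a hypothetical one-parameter error $e(b) = \frac{1}{2}(b - 3)^2$, and the function name `descend` and all numeric values are illustrative.

```python
def descend(step, b0=0.0, target=3.0, iterations=10):
    """Gradient descent on the toy error e(b) = 0.5 * (b - target)**2."""
    b = b0
    for _ in range(iterations):
        gradient = b - target      # de/db for the toy error
        b = b - step * gradient    # move along the negative gradient
    return b

# A small, a moderate, and an overly large step size.
for step in (0.01, 0.5, 2.5):
    print(step, descend(step))
```

For this quadratic, each iteration multiplies the distance to the minimum by $(1 - \lambda)$, so a tiny step barely moves, a moderate step converges quickly, and a step larger than 2 overshoots by more each time and diverges.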


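$b_i$ Update Formula I

Error: $e_m = \frac{1}{2}\left[f(x^m|\theta) - y^m\right]^2$

Chain rule:

$$\frac{\partial e_m}{\partial b_i} = \left(f(x^m|\theta) - y^m\right)\frac{\partial f(x^m|\theta)}{\partial b_i}$$

Since

$$f(x|\theta) = \frac{\sum_{i=1}^{R} b_i \prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j - c_j^i}{\sigma_j^i}\right)^2\right]}{\sum_{i=1}^{R} \prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j - c_j^i}{\sigma_j^i}\right)^2\right]}$$

then

$$\frac{\partial e_m}{\partial b_i} = \left(f(x^m|\theta) - y^m\right)\frac{\prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j^m - c_j^i}{\sigma_j^i}\right)^2\right]}{\sum_{i=1}^{R} \prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j^m - c_j^i}{\sigma_j^i}\right)^2\right]}$$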


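$b_i$ Update Formula II

Let

$$\mu_i(x^m, k) = \prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j^m - c_j^i(k)}{\sigma_j^i(k)}\right)^2\right]$$

Let

$$\epsilon_m(k) = f(x^m|\theta(k)) - y^m$$

Then

$$b_i(k+1) = b_i(k) - \lambda_1 \epsilon_m(k) \frac{\mu_i(x^m, k)}{\sum_{i=1}^{R} \mu_i(x^m, k)}$$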
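The $b_i$ update needs only the rule strengths $\mu_i$, the output error $\epsilon_m$ and the step size. A minimal sketch follows, assuming a hypothetical helper name `update_b` and illustrative numbers (two rules, target output 1, step size 1).

```python
def update_b(b, mu, eps, lam1):
    """One gradient step on the output centers b_i.

    mu  : rule strengths mu_i(x^m, k)
    eps : output error f(x^m | theta(k)) - y^m
    """
    total = sum(mu)
    return [bi - lam1 * eps * mi / total for bi, mi in zip(b, mu)]

# Illustrative values (assumed): two rules, target y^m = 1, lam1 = 1.
mu = [1.0, 0.0183156]
b = [1.0, 5.0]
f = sum(bi * mi for bi, mi in zip(b, mu)) / sum(mu)
eps = f - 1.0
b_next = update_b(b, mu, eps, lam1=1.0)
print(b_next)
```

Each center moves in proportion to its rule's share of the total firing strength, so the rule that fires most strongly for $x^m$ receives most of the correction.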

4 Input Membership Function Centers Update Law


$c_j^i$ Update Law

We will use

$$c_j^i(k+1) = c_j^i(k) - \lambda_2 \left.\frac{\partial e_m}{\partial c_j^i}\right|_k$$

where $\lambda_2 > 0$, $i = 1, 2, \ldots, R$ and $j = 1, 2, \ldots, n$.


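$c_j^i$ Update Formula I

Error: $e_m = \frac{1}{2}\left[f(x^m|\theta) - y^m\right]^2$

Chain rule:

$$\frac{\partial e_m}{\partial c_j^i} = \epsilon_m(k)\,\frac{\partial f(x^m|\theta(k))}{\partial \mu_i(x^m, k)}\,\frac{\partial \mu_i(x^m, k)}{\partial c_j^i}$$

Now

$$\frac{\partial f(x^m|\theta(k))}{\partial \mu_i(x^m, k)} = \frac{\left(\sum_{i=1}^{R} \mu_i(x^m, k)\right) b_i(k) - \left(\sum_{i=1}^{R} b_i(k)\mu_i(x^m, k)\right)(1)}{\left(\sum_{i=1}^{R} \mu_i(x^m, k)\right)^2}$$

So that

$$\frac{\partial f(x^m|\theta(k))}{\partial \mu_i(x^m, k)} = \frac{b_i(k) - f(x^m|\theta(k))}{\sum_{i=1}^{R} \mu_i(x^m, k)}$$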


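$c_j^i$ Update Formula II

Also we have

$$\frac{\partial \mu_i(x^m, k)}{\partial c_j^i} = \mu_i(x^m, k)\,\frac{x_j^m - c_j^i(k)}{\left(\sigma_j^i(k)\right)^2}$$

The update formula for $c_j^i$ is

$$c_j^i(k+1) = c_j^i(k) - \lambda_2 \epsilon_m(k) \left(\frac{b_i(k) - f(x^m|\theta(k))}{\sum_{i=1}^{R} \mu_i(x^m, k)}\right) \mu_i(x^m, k) \left(\frac{x_j^m - c_j^i(k)}{\left(\sigma_j^i(k)\right)^2}\right)$$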
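The update formula for $c_j^i$ translates term by term into code. This is a sketch under assumptions: the name `update_c`, the argument order and the `c[i][j]` layout (rule $i$, input $j$) are hypothetical, and the numbers are illustrative.

```python
def update_c(c, sigma, b, mu, f, eps, x, lam2):
    """One gradient step on the input centers c_j^i (rule i, input j)."""
    total = sum(mu)
    return [
        [
            cij - lam2 * eps * ((b[i] - f) / total) * mu[i]
            * (x[j] - cij) / sigma[i][j] ** 2
            for j, cij in enumerate(c[i])
        ]
        for i in range(len(c))
    ]

# Illustrative values (assumed): two rules, two inputs, target y^m = 1, lam2 = 1.
x = [0.0, 2.0]
b = [1.0, 5.0]
c = [[0.0, 2.0], [2.0, 4.0]]
sigma = [[1.0, 1.0], [1.0, 1.0]]
mu = [1.0, 0.0183156]
f = sum(bi * mi for bi, mi in zip(b, mu)) / sum(mu)
c_next = update_c(c, sigma, b, mu, f, f - 1.0, x, lam2=1.0)
print(c_next)
```

A center that already coincides with the input ($x_j^m = c_j^i$) does not move, because the factor $x_j^m - c_j^i(k)$ is zero.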

5 Input Membership Function Spreads Update Law


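$\sigma_j^i$ Update Law

We will use

$$\sigma_j^i(k+1) = \sigma_j^i(k) - \lambda_3 \left.\frac{\partial e_m}{\partial \sigma_j^i}\right|_k$$

where $\lambda_3 > 0$, $i = 1, 2, \ldots, R$ and $j = 1, 2, \ldots, n$.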


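$\sigma_j^i$ Update Formula I

Error: $e_m = \frac{1}{2}\left[f(x^m|\theta) - y^m\right]^2$

Chain rule:

$$\frac{\partial e_m}{\partial \sigma_j^i} = \epsilon_m(k)\,\frac{\partial f(x^m|\theta(k))}{\partial \mu_i(x^m, k)}\,\frac{\partial \mu_i(x^m, k)}{\partial \sigma_j^i}$$

We already calculated

$$\frac{\partial f(x^m|\theta(k))}{\partial \mu_i(x^m, k)} = \frac{b_i(k) - f(x^m|\theta(k))}{\sum_{i=1}^{R} \mu_i(x^m, k)}$$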


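$\sigma_j^i$ Update Formula II

Also we have

$$\frac{\partial \mu_i(x^m, k)}{\partial \sigma_j^i} = \mu_i(x^m, k)\,\frac{\left(x_j^m - c_j^i(k)\right)^2}{\left(\sigma_j^i(k)\right)^3}$$

The update formula for $\sigma_j^i$ is

$$\sigma_j^i(k+1) = \sigma_j^i(k) - \lambda_3 \epsilon_m(k) \left(\frac{b_i(k) - f(x^m|\theta(k))}{\sum_{i=1}^{R} \mu_i(x^m, k)}\right) \mu_i(x^m, k) \left(\frac{\left(x_j^m - c_j^i(k)\right)^2}{\left(\sigma_j^i(k)\right)^3}\right)$$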
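The spread update differs from the center update only in its last factor, which is squared in the numerator and cubed in the denominator. The sketch below assumes a hypothetical function name `update_sigma` and illustrative numbers.

```python
def update_sigma(sigma, c, b, mu, f, eps, x, lam3):
    """One gradient step on the input spreads sigma_j^i (rule i, input j)."""
    total = sum(mu)
    return [
        [
            sij - lam3 * eps * ((b[i] - f) / total) * mu[i]
            * (x[j] - c[i][j]) ** 2 / sij ** 3
            for j, sij in enumerate(sigma[i])
        ]
        for i in range(len(sigma))
    ]

# Illustrative values (assumed): two rules, two inputs, target y^m = 1, lam3 = 1.
x = [0.0, 2.0]
b = [1.0, 5.0]
c = [[0.0, 2.0], [2.0, 4.0]]
sigma = [[1.0, 1.0], [1.0, 1.0]]
mu = [1.0, 0.0183156]
f = sum(bi * mi for bi, mi in zip(b, mu)) / sum(mu)
sigma_next = update_sigma(sigma, c, b, mu, f, f - 1.0, x, lam3=1.0)
print(sigma_next)
```

Nothing in this update keeps a spread positive, so a practical implementation may need to clamp $\sigma_j^i$ away from zero; that safeguard is an addition and is not part of the update law itself.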

6 Example

Training Data Set

We will use the training data set in the table below to illustrate the algorithm.

        x_1   x_2   y
x^1     0     2     1
x^2     2     4     5
x^3     3     6     6


Choosing the step size

The algorithm requires that a step size $\lambda$ be specified for each of the three parameters.

Selecting a large $\lambda$ converges faster but risks overstepping the minimum.

Selecting a small step size means converging very slowly.


Choosing initial values

Initial values for the rules must be designated.

For the first rule, we choose $x_1^1, x_2^1, y^1$ as the input and output membership function centers.

For the second rule, we choose $x_1^2, x_2^2, y^2$ as the input and output membership function centers.

We select spreads equal to 1.

Rule 1: $c_1^1(0) = 0$, $c_2^1(0) = 2$, $\sigma_1^1(0) = 1$, $\sigma_2^1(0) = 1$, $b_1(0) = 1$

Rule 2: $c_1^2(0) = 2$, $c_2^2(0) = 4$, $\sigma_1^2(0) = 1$, $\sigma_2^2(0) = 1$, $b_2(0) = 5$

Plotting initial values

[Figure: initial input membership functions $\mu(x_1)$ and $\mu(x_2)$ plotted over the range 0 to 10, with the centers of the two rules marked.]


Calculating predicted outputs

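Calculate the membership values of the implication of each rule using:

$$\mu_i(x^m, k=0) = \prod_{j=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{x_j^m - c_j^i(k=0)}{\sigma_j^i(k=0)}\right)^2\right]$$

Calculate the outputs using (defuzzification):

$$f(x^m|\theta(k=0)) = \frac{\sum_{i=1}^{R} b_i(0)\,\mu_i(x^m, k=0)}{\sum_{i=1}^{R} \mu_i(x^m, k=0)}$$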

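Membership degrees rule 1

$$\mu_1(x^1, 0) = \exp\left[-\frac{1}{2}\left(\frac{0-0}{1}\right)^2\right] \times \exp\left[-\frac{1}{2}\left(\frac{2-2}{1}\right)^2\right] = 1$$

$$\mu_1(x^2, 0) = \exp\left[-\frac{1}{2}\left(\frac{2-0}{1}\right)^2\right] \times \exp\left[-\frac{1}{2}\left(\frac{4-2}{1}\right)^2\right] = 0.0183156$$

$$\mu_1(x^3, 0) = \exp\left[-\frac{1}{2}\left(\frac{3-0}{1}\right)^2\right] \times \exp\left[-\frac{1}{2}\left(\frac{6-2}{1}\right)^2\right] = 3.72665 \times 10^{-6}$$

Membership degrees rule 2

$$\mu_2(x^1, 0) = \exp\left[-\frac{1}{2}\left(\frac{0-2}{1}\right)^2\right] \times \exp\left[-\frac{1}{2}\left(\frac{2-4}{1}\right)^2\right] = 0.0183156$$

$$\mu_2(x^2, 0) = \exp\left[-\frac{1}{2}\left(\frac{2-2}{1}\right)^2\right] \times \exp\left[-\frac{1}{2}\left(\frac{4-4}{1}\right)^2\right] = 1.0$$

$$\mu_2(x^3, 0) = \exp\left[-\frac{1}{2}\left(\frac{3-2}{1}\right)^2\right] \times \exp\left[-\frac{1}{2}\left(\frac{6-4}{1}\right)^2\right] = 0.082085$$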

Defuzzification

$$f(x^1|\theta(0)) = \frac{b_1(0) \times \mu_1(x^1, 0) + b_2(0) \times \mu_2(x^1, 0)}{\mu_1(x^1, 0) + \mu_2(x^1, 0)} = \frac{1 \times 1 + 5 \times 0.0183156}{1 + 0.0183156} = 1.0719447$$

$$f(x^2|\theta(0)) = \frac{b_1(0) \times \mu_1(x^2, 0) + b_2(0) \times \mu_2(x^2, 0)}{\mu_1(x^2, 0) + \mu_2(x^2, 0)} = \frac{1 \times 0.0183156 + 5 \times 1}{0.0183156 + 1} = 4.9280550$$

$$f(x^3|\theta(0)) = \frac{b_1(0) \times \mu_1(x^3, 0) + b_2(0) \times \mu_2(x^3, 0)}{\mu_1(x^3, 0) + \mu_2(x^3, 0)} = \frac{1 \times 3.72665 \times 10^{-6} + 5 \times 0.082085}{3.72665 \times 10^{-6} + 0.082085} = 4.999818$$
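The membership degrees and predicted outputs above can be checked numerically. The sketch below uses the training data and initial rule parameters of the example; the helper names `strengths` and `predict` are assumptions made for illustration.

```python
import math

# Training pairs (x^m, y^m) and initial rule parameters from the example.
data = [([0.0, 2.0], 1.0), ([2.0, 4.0], 5.0), ([3.0, 6.0], 6.0)]
b = [1.0, 5.0]
c = [[0.0, 2.0], [2.0, 4.0]]
sigma = [[1.0, 1.0], [1.0, 1.0]]

def strengths(x):
    """mu_i(x, 0) for every rule: product of Gaussian memberships."""
    return [
        math.prod(math.exp(-0.5 * ((xj - cj) / sj) ** 2)
                  for xj, cj, sj in zip(x, c[i], sigma[i]))
        for i in range(len(b))
    ]

def predict(x):
    """Center-average output f(x | theta(0))."""
    mu = strengths(x)
    return sum(bi * mi for bi, mi in zip(b, mu)) / sum(mu)

outputs = [predict(x) for x, _ in data]
for (x, _), f in zip(data, outputs):
    print(x, strengths(x), f)
```

The printed values agree with the hand calculation to about six significant digits; small differences in the last digit come from rounding the membership degrees before dividing.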


Calculating errors

$$e_m = \frac{1}{2}\left[f(x^m|\theta(k=0)) - y^m\right]^2$$

$$e_1 = \frac{1}{2}\left[1.0719447 - 1\right]^2 = 2.58802 \times 10^{-3}$$

$$e_2 = \frac{1}{2}\left[4.9280550 - 5\right]^2 = 2.58802 \times 10^{-3}$$

$$e_3 = \frac{1}{2}\left[4.9998180 - 6\right]^2 = 0.500182$$

The first two data points are mapped better than the third. The result can be improved by cycling through the model.

The GM will update the rule-base parameters $b_i$, $c_j^i$ and $\sigma_j^i$.
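The three errors follow directly from the predicted outputs. In this short sketch the outputs are copied from the defuzzification step as rounded constants, and the variable names are illustrative.

```python
# Predicted outputs f(x^m | theta(0)) and targets y^m from the example.
predicted = [1.0719447, 4.9280550, 4.9998180]
targets = [1.0, 5.0, 6.0]

# e_m = 0.5 * (f - y)^2 for each training pair.
errors = [0.5 * (f - y) ** 2 for f, y in zip(predicted, targets)]
for m, e in enumerate(errors, start=1):
    print(f"e_{m} = {e:.6g}")
```

The third error is roughly 200 times larger than the first two because neither initial rule is centered near $x^3$, so the output there is pulled almost entirely to $b_2 = 5$.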

Updating ...

$$\epsilon_m(k=0) = f(x^m|\theta(k=0)) - y^m$$

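Updating $b_i$

$$b_i(k) = b_i(k-1) - \lambda_1\,\epsilon_k(k-1)\,\frac{\mu_i(x^k, k-1)}{\sum_{i=1}^{R} \mu_i(x^k, k-1)}$$

$$b_1(1) = b_1(0) - \lambda_1\,\epsilon_1(0)\,\frac{\mu_1(x^1, 0)}{\mu_1(x^1, 0) + \mu_2(x^1, 0)} = 1 - 1 \times 0.0719447 \times \frac{1}{1 + 0.0183156} = 0.9293493$$

$$b_2(1) = b_2(0) - \lambda_1\,\epsilon_1(0)\,\frac{\mu_2(x^1, 0)}{\mu_1(x^1, 0) + \mu_2(x^1, 0)} = 5 - 1 \times 0.0719447 \times \frac{0.0183156}{1 + 0.0183156} = 4.998706$$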

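Updating $c_j^1$

$$c_j^i(k) = c_j^i(k-1) - \lambda_2\,\epsilon_k(k-1) \left[\frac{b_i(k-1) - f(x^k|\theta(k-1))}{\sum_{i=1}^{R} \mu_i(x^k, k-1)}\right] \times \mu_i(x^k, k-1) \left(\frac{x_j^k - c_j^i(k-1)}{\left(\sigma_j^i(k-1)\right)^2}\right)$$

$$c_1^1(1) = c_1^1(0) - 1 \times \epsilon_1(0) \left[\frac{b_1(0) - f(x^1|\theta(0))}{\mu_1(x^1, 0) + \mu_2(x^1, 0)}\right] \times \mu_1(x^1, 0)\,\frac{x_1^1 - c_1^1(0)}{\left(\sigma_1^1(0)\right)^2} = 0$$

$$c_2^1(1) = c_2^1(0) - 1 \times \epsilon_1(0) \left[\frac{b_1(0) - f(x^1|\theta(0))}{\mu_1(x^1, 0) + \mu_2(x^1, 0)}\right] \times \mu_1(x^1, 0)\,\frac{x_2^1 - c_2^1(0)}{\left(\sigma_2^1(0)\right)^2} = 2$$

Neither center of rule 1 moves, because $x^1$ coincides with the rule 1 centers and the factor $x_j^1 - c_j^1(0)$ is zero.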

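Updating $c_j^2$

$$c_1^2(1) = c_1^2(0) - 1 \times \epsilon_1(0) \left[\frac{b_2(0) - f(x^1|\theta(0))}{\mu_1(x^1, 0) + \mu_2(x^1, 0)}\right] \times \mu_2(x^1, 0)\,\frac{x_1^1 - c_1^2(0)}{\left(\sigma_1^2(0)\right)^2} = 2.010166$$

$$c_2^2(1) = c_2^2(0) - 1 \times \epsilon_1(0) \left[\frac{b_2(0) - f(x^1|\theta(0))}{\mu_1(x^1, 0) + \mu_2(x^1, 0)}\right] \times \mu_2(x^1, 0)\,\frac{x_2^1 - c_2^2(0)}{\left(\sigma_2^2(0)\right)^2} = 4.010166$$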

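Updating $\sigma_j^i$

$$\sigma_j^i(k) = \sigma_j^i(k-1) - \lambda_3\,\epsilon_k(k-1) \left[\frac{b_i(k-1) - f(x^k|\theta(k-1))}{\sum_{i=1}^{R} \mu_i(x^k, k-1)}\right] \times \mu_i(x^k, k-1) \left(\frac{\left(x_j^k - c_j^i(k-1)\right)^2}{\left(\sigma_j^i(k-1)\right)^3}\right)$$

$\sigma_1^1(1) = 1$

$\sigma_2^1(1) = 1$

$\sigma_1^2(1) = 0.979668$

$\sigma_2^2(1) = 0.979668$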
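The whole first update step can be reproduced in a few lines. This is a sketch rather than the author's implementation: the variable names are illustrative, the step sizes are set to 1 as in the worked calculations, and only the first training pair $x^1$ is used.

```python
import math

lam1 = lam2 = lam3 = 1.0              # step sizes used in the worked example
x, y = [0.0, 2.0], 1.0                # first training pair (x^1, y^1)
b = [1.0, 5.0]                        # b_i(0)
c = [[0.0, 2.0], [2.0, 4.0]]          # c_j^i(0), indexed c[i][j]
sigma = [[1.0, 1.0], [1.0, 1.0]]      # sigma_j^i(0)
R, n = len(b), len(x)

# Forward pass: rule strengths, predicted output, output error.
mu = [math.prod(math.exp(-0.5 * ((x[j] - c[i][j]) / sigma[i][j]) ** 2)
                for j in range(n)) for i in range(R)]
total = sum(mu)
f = sum(b[i] * mu[i] for i in range(R)) / total
eps = f - y

# Factor shared by the c and sigma updates; every update uses the k = 0 values.
common = [eps * (b[i] - f) / total * mu[i] for i in range(R)]
b_new = [b[i] - lam1 * eps * mu[i] / total for i in range(R)]
c_new = [[c[i][j] - lam2 * common[i] * (x[j] - c[i][j]) / sigma[i][j] ** 2
          for j in range(n)] for i in range(R)]
sigma_new = [[sigma[i][j] - lam3 * common[i] * (x[j] - c[i][j]) ** 2 / sigma[i][j] ** 3
              for j in range(n)] for i in range(R)]
print(b_new, c_new, sigma_new)
```

Run with these inputs, the sketch gives approximately $b = (0.929349, 4.998706)$, rule 2 centers $(2.010166, 4.010166)$ and rule 2 spreads $(0.979668, 0.979668)$, while the centers and spreads of rule 1 stay at their initial values. Repeating the same step with $x^2$ and $x^3$, and then cycling through the data again, is how the errors would be reduced further.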
