
Physics Procedia 33 (2012) 1241–1247

1875-3892 © 2012 Published by Elsevier B.V. Selection and/or peer review under responsibility of ICMPBE International Committee. doi: 10.1016/j.phpro.2012.05.205

2012 International Conference on Medical Physics and Biomedical Engineering

Research on Prediction Model of Time Series Based on Fuzzy Theory and Genetic Algorithm

Wu Xiao-qin

Key Laboratory of Network and Intelligent Information Processing

Hefei University, Hefei 230601, China

[email protected]

Abstract

Fuzzy theory is a recently proposed self-adaptive strategy that is applied to dynamically adjust the parameters of genetic algorithms in order to enhance their performance. This paper takes financial time series analysis and forecasting as its main case study and, within the framework of soft computing, focuses on the integration of fuzzy theory and genetic algorithms (FGA). A financial time series forecasting model based on fuzzy theory and genetic algorithms is built, with the ShangZheng index cards taken as an example. The experimental results show that FGA performs much better than a BP neural network, not only in precision but also in search speed. The hybrid algorithm is feasible and superior.


Keywords: fuzzy theory, genetic algorithm, financial time series.

1. Introduction

In the financial field, time series are an important class of data, and mining financial time series is an important aspect of data mining. Current data mining research on financial time series focuses mainly on forecasting, similarity query, correlation analysis, sequence analysis, cluster analysis, anomaly detection and time series segmentation, among which financial time series prediction is an important research direction. Neural networks, fuzzy logic, genetic algorithms and rough sets, which all belong to soft computing, are increasingly widely used in data mining, and soft computing theory is used to build more intelligent and robust data mining systems. Research has shown that soft computing methods have advantages. In order to improve the performance of data mining, integrating a wide range of data mining techniques and building hybrid systems will be the development trend of future research.

Open access under CC BY-NC-ND license.

2. FGA algorithm

Optimization is the classic application of genetic algorithms. Because the genetic algorithm is simple, efficient and generally applicable, it has found many applications. Financial time series data contain a wealth of system information, and time series analysis aims to discover the hidden knowledge (relationships, rules, trends, etc.). The causal relationship is one of the most important, since it reveals market trends on the basis of the historical sequence. Fuzzy logic produces the initial division of the fuzzy domain into intervals. These initial intervals determine the division of the fuzzy functions and are used to predict future data. The optimal interval length over the historical range is designed with a genetic algorithm: an appropriate fitness function is selected to calculate the interval division with the best fitness, which yields the specific fuzzy values. The flow chart of the algorithm is shown in Fig. 1:

Fig. 1 Flow Chart of Algorithm


3.1 Encoding

Binary coding and real number coding are the two methods in common use. Real number coding describes the problem intuitively and avoids the process of encoding and decoding, which greatly improves the accuracy and speed of computation. In this paper, binary coding is adopted. The data to be predicted form a discrete set, and the length of an interval is likewise a discrete quantity. For example, with an encoding length of 22, the interval length 50 is binary-coded as follows:

0100110000000000000000

The corresponding variable value is the length of an interval.
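The encoding can be illustrated with a minimal sketch. It assumes that the chromosome is read with the least significant bit first; the paper does not state the bit order, but the 22-bit string shown decodes to 50 only under that reading. The function names are hypothetical.

```python
def decode_interval_length(chromosome: str) -> int:
    """Decode a binary chromosome into an interval length.

    Assumption: the first character is the least significant bit.
    """
    if set(chromosome) - {"0", "1"}:
        raise ValueError("chromosome must contain only '0' and '1'")
    return sum(1 << i for i, bit in enumerate(chromosome) if bit == "1")


def encode_interval_length(length: int, n_bits: int = 22) -> str:
    """Encode an interval length as an n_bits-long string, LSB first."""
    if not 0 <= length < (1 << n_bits):
        raise ValueError("length does not fit into n_bits")
    return "".join("1" if (length >> i) & 1 else "0" for i in range(n_bits))


chromosome = "0100110000000000000000"
print(decode_interval_length(chromosome))  # 50
```

Under the opposite (most significant bit first) convention the same string would decode to a much larger number, which is why the bit order is treated as an assumption here.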

3.2 The fitness function

The fitness function evaluates the pros and cons of the individuals. The greater the fitness value of an individual, the more probable it is that the individual is selected for the crossover and mutation operations, and the more probable it is that its characteristics are passed down to the next generation.
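A minimal sketch of selection in which the probability of being chosen grows with fitness. Roulette-wheel (fitness-proportional) selection is assumed here for illustration; the paper does not name its selection operator, and the function names are hypothetical.

```python
import random
from typing import List, Sequence


def selection_probabilities(fitness: Sequence[float]) -> List[float]:
    """Return the fitness-proportional selection probability of each individual."""
    total = sum(fitness)
    if total <= 0 or min(fitness) < 0:
        raise ValueError("fitness values must be non-negative with a positive sum")
    return [f / total for f in fitness]


def select_parents(population: Sequence[str], fitness: Sequence[float],
                   k: int, rng: random.Random) -> List[str]:
    """Draw k individuals (with replacement) for crossover and mutation."""
    selection_probabilities(fitness)  # validates the fitness values
    return rng.choices(population, weights=fitness, k=k)


population = ["a", "b", "c"]
fitness = [1.0, 3.0, 6.0]
print(selection_probabilities(fitness))
print(select_parents(population, fitness, k=4, rng=random.Random(0)))
```

The individual with the largest fitness gets the largest share of the wheel, so its characteristics are the most likely to reach the next generation.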

The fitness function is

$$F = f(t), \qquad f(t) = f\bigl(x(i,t)\bigr)$$

Here i denotes the interval number, i = 1, 2, 3, ..., n, and x(i,t) denotes the ith interval in the tth generation of the computation. The value of x(i,t) lies in U, where U denotes the value range of the ith interval.

X(t) is the set of intervals x(i,t).

3.3 Termination rule

We use the number of generations as the termination rule: when t = MAXGEN, the operation is terminated. Here t denotes the current generation and MAXGEN denotes the maximum number of generations.
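The termination rule can be sketched as a simple loop. The `step` callback, which stands for one round of selection, crossover and mutation, and the `dummy_step` placeholder are hypothetical; only the stopping condition t = MAXGEN comes from the text.

```python
from typing import Callable, List, Tuple

MAXGEN = 300  # maximum number of generations used in the experiments


def evolve(population: List[int],
           step: Callable[[List[int], int], List[int]],
           max_gen: int = MAXGEN) -> Tuple[List[int], int]:
    """Apply `step` once per generation and stop when t reaches max_gen."""
    t = 0
    while t < max_gen:
        population = step(population, t)
        t += 1
    return population, t


def dummy_step(population: List[int], t: int) -> List[int]:
    """Placeholder for one generation; it only increments each value."""
    return [x + 1 for x in population]


final_population, generations = evolve([0, 10], dummy_step)
print(generations)  # 300
```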

3.4 Genetic parameters

The design of the genetic parameters, especially the crossover probability, the mutation probability and the population size, has considerable influence on the performance of the entire algorithm. The population size is set to 50, and adaptive crossover and mutation probabilities are used. During evolution, as the individual fitness of the evolving population changes, the crossover probability pc and the mutation probability pm are dynamically adjusted. The formula is as follows:

$$
p_{c(t+1)} =
\begin{cases}
p_{ct}\,\dfrac{f_{\max} - f}{f_{\max} - f_{avg}}, & f \ge f_{avg} \\
p_{ct}, & f < f_{avg}
\end{cases}
\qquad
p_{m(t+1)} =
\begin{cases}
p_{mt}\,\dfrac{f_{\max} - f'}{f_{\max} - f_{avg}}, & f' \ge f_{avg} \\
p_{mt}, & f' < f_{avg}
\end{cases}
$$

Here,

$p_{ct}$: crossover probability,

$p_{mt}$: mutation probability,

$f_{\max}$: the largest fitness value in the population,

$f_{avg}$: the average fitness value in the population,

$f$: the larger fitness value of the two crossover individuals,

$f'$: the fitness value of an individual.
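A minimal sketch of the adaptive adjustment. It assumes that the current probability is scaled by (f_max - f) / (f_max - f_avg) for individuals whose fitness is at or above the average and is left unchanged otherwise, in the style of the Srinivas and Patnaik adaptive GA. The exact scaling form, the guard for a fully converged population and the function name are assumptions.

```python
def adapt_probability(p_t: float, f_ind: float, f_max: float, f_avg: float) -> float:
    """Return the probability for generation t+1 given the one for generation t.

    f_ind is the larger fitness of the two parents (crossover) or the fitness
    of the individual under consideration (mutation).
    """
    if f_ind < f_avg or f_max == f_avg:
        # Below-average individuals keep the current probability; the second
        # condition avoids division by zero when all fitness values are equal.
        return p_t
    return p_t * (f_max - f_ind) / (f_max - f_avg)


fitness = [2.0, 4.0, 6.0, 8.0]
f_max, f_avg = max(fitness), sum(fitness) / len(fitness)
print(adapt_probability(0.8, 8.0, f_max, f_avg))  # best individual
print(adapt_probability(0.8, 2.0, f_max, f_avg))  # below-average individual
```

Under this rule good individuals are disturbed less, while below-average individuals keep the full probability, which is what lets the search escape premature convergence.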

4. Prediction analysis based on fuzzy control

The ShangZheng index cards are taken as an example for description and analysis; see Table 1.

TABLE 1 The history data of the ShangZheng index cards

Date         Closing price    Date         Closing price
2008.3.19                     2008.4.3
2008.3.20                     2008.4.7
2008.3.21                     2008.4.8
2008.3.24                     2008.4.9
2008.3.25                     2008.4.10
2008.3.26                     2008.4.11
2008.3.27                     2008.4.14
2008.3.28                     2008.4.15
2008.3.31                     2008.4.16
2008.4.1                      2008.4.17
2008.4.2                      2008.4.18

The universe of discourse is U = [4400, 5500]; its optimal division into intervals, obtained with the genetic algorithm, is shown in Table 2.

TABLE 2 Division of the intervals (L: lower bound, U: upper bound)

Interval   L        U         Interval   L        U
u1         4400     4425      u12        5266.7   5283.4
u2         4425     4450      u13        5283.4   5300
u3         4450     4500      u14        5300     5350
u4         4500     4600      u15        5350     5366.7
u5         4600     4725      u16        5366.7   5383.4
u6         4725     4750      u17        5383.4   5400
u7         4750     4800      u18        5400     5425
u8         4800     4950      u19        5425     5450
u9         4950     5175      u20        5450     5466.7
u10        5175     5200      u21        5466.7   5483.4
u11        5200     5266.7    u22        5483.4   5500
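A minimal sketch of how a closing price can be mapped to its interval u1 to u22 using the boundaries of Table 2. Treating each interval as half-open, [L, U), with the upper end 5500 assigned to u22, is an assumption, as is the function name.

```python
from bisect import bisect_right

# The 23 boundaries that delimit the 22 intervals u1..u22 of Table 2.
BOUNDS = [4400, 4425, 4450, 4500, 4600, 4725, 4750, 4800, 4950, 5175, 5200,
          5266.7, 5283.4, 5300, 5350, 5366.7, 5383.4, 5400, 5425, 5450,
          5466.7, 5483.4, 5500]


def interval_index(price: float) -> int:
    """Return i such that the price falls into interval u_i (1-based)."""
    if not BOUNDS[0] <= price <= BOUNDS[-1]:
        raise ValueError("price lies outside the universe of discourse U")
    if price == BOUNDS[-1]:
        return len(BOUNDS) - 1  # the upper end of U belongs to the last interval
    return bisect_right(BOUNDS, price)


print(interval_index(4610))  # 5
print(interval_index(5270))  # 12
```

Because the boundaries are sorted, a binary search finds the interval without scanning all 22 rows.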


4.1 Definition of the universe of discourse U

U = [4400, 5500]; the genetic algorithm is used to divide U into intervals so as to achieve the optimal partition.

4.2 Definition of fuzzy sets and membership functions

Zadeh representation of the fuzzy set:

$$A = \frac{\mu_A(x_1)}{x_1} + \frac{\mu_A(x_2)}{x_2} + \cdots + \frac{\mu_A(x_n)}{x_n}$$

where $\mu_A(x)$ is the membership function and $A$ is the fuzzy subset. The intervals in Table 2 define the fuzzy sets that form the linguistic variable of the Shanghai index. The membership degree of each value is calculated by the membership function. The historical data are fuzzified, and the fuzzy sets are defined as follows:

$$A_1 = \frac{1}{u_1} + \frac{0.5}{u_2} + \frac{0}{u_3} + \frac{0}{u_4} + \cdots + \frac{0}{u_{22}}$$

$$A_2 = \frac{0.5}{u_1} + \frac{1}{u_2} + \frac{0.5}{u_3} + \frac{0}{u_4} + \cdots + \frac{0}{u_{22}}$$

$$A_{22} = \frac{0}{u_1} + \frac{0}{u_2} + \cdots + \frac{0}{u_{20}} + \frac{0.5}{u_{21}} + \frac{1}{u_{22}}$$
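A minimal sketch that generates the membership degrees of the fuzzy sets. The paper lists only the first two sets and the last one; the sketch assumes that every intermediate set follows the same pattern, with membership 1 on its own interval, 0.5 on the adjacent intervals and 0 elsewhere. The function names are hypothetical.

```python
from typing import List


def fuzzy_set(k: int, n: int = 22) -> List[float]:
    """Membership degrees of fuzzy set A_k over the intervals u_1..u_n."""
    if not 1 <= k <= n:
        raise ValueError("k must lie between 1 and n")
    return [1.0 if j == k else 0.5 if abs(j - k) == 1 else 0.0
            for j in range(1, n + 1)]


def zadeh_notation(memberships: List[float]) -> str:
    """Format membership degrees in Zadeh's mu/u notation."""
    return " + ".join(f"{m:g}/u{j}" for j, m in enumerate(memberships, start=1))


print(zadeh_notation(fuzzy_set(1, n=4)))
print(zadeh_notation(fuzzy_set(2, n=4)))
```

A closing price that falls into interval u_k is then fuzzified as A_k.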

5. Experimental results and performance analysis

In this paper, adaptive crossover and mutation probabilities are used. The maximum number of iterations is set to Maxgen = 300. The initial population is 22. The fitness values and interval values are computed as shown in Fig. 2.

Fig. 2 Fitness Values and Interval Value

In this paper, the above algorithm is implemented in Java, and a stock prediction system is designed that predicts the closing price trend of the following days from the historical data.


Taking "the ShangZheng index cards" as an example, the predicted trend of the closing price is shown in Fig. 3.

Fig. 3 Prediction Trend of "ShangZheng Index Cards"

To verify the advantage of combining the adaptive genetic algorithm with fuzzy logic for the financial time series prediction problem, stock prediction is taken as the example for performance testing. The maximum number of iterations is set to Maxgen = 300. The testing results are shown in Table 5-1, Comparison of Algorithms.

TABLE 5-1 Comparison of Algorithms

Algorithms                   Average iteration number   Run times   Success times   Average time (second)
Adaptive genetic algorithm   163                        10          10              28
Simple genetic algorithm     189                        10          10              7.5
Random algorithm             217                        10          8               14.5

We can see from Fig. 4 that the adaptive algorithm overcomes the problem of premature convergence and arrives at the best fitness early. The simple GA is unstable; it reaches the optimal solution near the 100th generation.

6. Conclusion

As the stock market is affected by many factors, prediction is very difficult, and the existing neural network training algorithms each have their own advantages and disadvantages. This paper combines two data mining techniques, genetic algorithms and fuzzy logic, to solve the financial time series prediction problem, and compares the result with other algorithms. A predictive model is established, predictive accuracy is improved, and more stable results are obtained.


