Vol 9, No 1 (2014)


IJCSBI.ORG

ISSN: 1694-2108 | Vol. 9, No. 1. JANUARY 2014 45

An Integrated Procedure for Resolving Portfolio Optimization Problems using Data Envelopment Analysis, Ant Colony Optimization and Gene Expression Programming

Chih-Ming Hsu

Minghsin University of Science and Technology 1 Hsin-Hsing Road, Hsin-Fong, Hsinchu 304, Taiwan, ROC

ABSTRACT

The portfolio optimization problem is an important issue in investment and financial decision-making and is receiving considerable attention from both researchers and practitioners. In this study, an integrated procedure using data envelopment analysis (DEA), ant colony optimization (ACO) for continuous domains and gene expression programming (GEP) is proposed. The procedure is evaluated through a case study on investing in stocks in the semiconductor sub-section of the Taiwan stock market. The potential average six-month return on investment of 13.12% from November 1, 2007 to July 8, 2011 indicates that the proposed procedure can be considered a feasible and effective tool for making investment plans. Moreover, the strategy can help investors make profits even when the overall stock market suffers a loss. The procedure helps an investor rapidly screen the stocks with the greatest profit potential and automatically determines the optimal investment proportion of each stock so as to minimize the investment risk while satisfying the target return on investment set by the investor. Furthermore, by providing a set of transaction rules, this study addresses the scarcity of discussion in the literature about the timing of buying and selling stocks.

Keywords

Portfolio optimization, Data envelopment analysis, Ant colony optimization, Gene expression programming.

1. INTRODUCTION


solution. Although quadratic programming can be used to solve the problem with a reasonably small number of different assets, it becomes much more difficult if the number of assets is increased or if additional constraints, such as cardinality constraints, bounding constraints or other real-world requirements, are introduced.

Therefore, various approaches for tackling portfolio optimization problems using heuristic techniques have been proposed. For example,

optimization solvers including LOQO and CPLEX. Woodside-Oriakhi et al. [8] applied GAs, tabu search (TS) and simulated annealing (SA) to find the efficient frontier in financial portfolio optimization, extending the Markowitz mean-variance model to consider the discrete restrictions of buy-in thresholds and cardinality constraints. The performance of their methods was tested using publicly available data sets drawn from seven major market indices. The implementation results indicated that the proposed methods could yield better solutions than previous heuristics in the literature. Chang and Shi [9] proposed a two-stage process for constructing a stock portfolio. In the first stage, the investment satisfied capability index (ISCI) was used to evaluate individual stock performance. In the second stage, a PSO algorithm was applied to find the optimal allocation of capital investment for each stock in the portfolio. The results of an experiment on investing in the Taiwan stock market from 2005 to 2007 showed that the accumulated returns on investment (ROIs) of the portfolios constructed by their proposed approach were higher than the ROIs of the Taiwan Weighted Stock Index (TWSI) portfolios. Sadjadi et al. [10] proposed a framework for formulating and solving the cardinality constrained portfolio problem with uncertain input parameters. The problem formulation was based on recent advances in robust optimization and was solved using GAs. Their proposed method was examined on several well-known benchmark data sets including the Hang Seng 31 (Hong Kong), DAX 100 (Germany), FTSE 100 (UK), S&P 100 (USA), and Nikkei 225 (Japan). The results indicated that D-norm performs better than Lp-norm with relatively lower CPU time for the

et al. [13] proposed an approach for resolving the portfolio selection problem based on quantum-behaved particle swarm optimization (QPSO). The proposed QPSO model was employed to select the best portfolio among 50 top companies on the Tehran Stock Exchange, with the aims of optimizing the rate of return, systematic and non-systematic risks, return skewness, liquidity and the Sharpe ratio. A comparison with the traditional Markowitz and genetic algorithm models revealed that the return of the portfolio obtained by the QPSO was smaller than that of Markowitz's classic model. However, the QPSO can decrease risk and provide more versatile portfolios than the other models.

The above-mentioned studies show that soft computing techniques, such as GAs, PSO and ACO, are an effective and efficient way to address portfolio optimization problems. However, the concerns and interests of investors also need to be considered. First, the total number of stocks that investors can consider for their investment portfolio is usually extremely large. Investors therefore usually focus on a few stocks, chosen according to their experience or their principles for selecting stocks with the potential to make profits. Second, most investors are interested in minimizing downside risk, since the returns of stocks may not be normally distributed. Unfortunately, research on downside risk is scarce compared with research that measures risk through the conventional variance used in the traditional Markowitz mean-variance model. Third, investors usually buy and sell their chosen stocks several times during their investment planning horizon. Here again, research on the timing of buying and selling stocks is scant.

2. PROBLEM FORMULATION

This study concentrates on the cardinality constrained portfolio optimization problem, which is a variant of the Markowitz mean-variance model where the portfolio can include at most $c$ different assets. In addition, a minimum proportion of the total investment for each asset contained in the portfolio is considered, to reflect the fact that an investor usually sets a minimum investment threshold for each asset held. Notably, the study measures the variance (risk) of an asset by using the below-mean semi variance [14], to reflect that only downside risk is relevant to an investor and that asset returns may not be normally distributed. First, some notation is defined as follows:

$N$: the total number of assets available;

$no$: the total number of periods considered;

$r_i^t$: the return of asset $i$ in period $t$ ($i = 1, 2, \ldots, N$; $t = 1, 2, \ldots, no$);

$mr_i$: the expected (mean) return of asset $i$ ($i = 1, 2, \ldots, N$);

$w_i$: the proportion of the total investment held in asset $i$ ($i = 1, 2, \ldots, N$);

$\rho_{ij}$: the correlation coefficient between assets $i$ and $j$ ($i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, N$);

$r^*$: the expected portfolio return;

$c$: the maximum number of assets in the portfolio;

$w_{\min}$: the minimum proportion of the total investment held in asset $i$, if any investment is made in asset $i$ ($i = 1, 2, \ldots, N$);

$\delta_i$: the decision variable that represents whether asset $i$ ($i = 1, 2, \ldots, N$) is held in the portfolio ($\delta_i = 1$) or not ($\delta_i = 0$).

The below-mean semi variance for asset $i$ can then be calculated as follows [14]:

$$SV_i^m = \frac{1}{no} \sum_{t=1}^{no} \left[ \max\left(0,\; mr_i - r_i^t\right) \right]^2, \quad i = 1, \ldots, N. \tag{1}$$
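As a minimal sketch of Eq. (1), the below-mean semi variance can be computed directly from a list of periodic returns. The function name and the sample returns below are hypothetical and serve only as an illustration.

```python
def below_mean_semivariance(returns):
    """Eq. (1): average squared shortfall of each return below the mean return."""
    n_periods = len(returns)
    mean_return = sum(returns) / n_periods
    shortfalls = [max(0.0, mean_return - r) for r in returns]
    return sum(d * d for d in shortfalls) / n_periods


# Hypothetical weekly returns of one asset (not data from the case study).
weekly_returns = [0.02, -0.01, 0.03, -0.04]
sv = below_mean_semivariance(weekly_returns)
```

Only the periods whose return falls below the mean contribute, so an asset with large gains but small losses is not penalized the way it would be under the ordinary variance.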

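Hence, the cardinality constrained portfolio optimization problem considered in this study is formulated as shown below:

$$\text{Minimize} \quad \sum_{i=1}^{N} \sum_{j=1}^{N} w_i \, w_j \, SV_i^m \, SV_j^m \, \rho_{ij} \tag{2}$$

subject to

$$\sum_{i=1}^{N} w_i \, mr_i \ge r^* \tag{3}$$

$$\sum_{i=1}^{N} w_i = 1 \tag{4}$$

$$w_{\min} \, \delta_i \le w_i \le \delta_i, \quad i = 1, 2, \ldots, N \tag{5}$$

$$\sum_{i=1}^{N} \delta_i \le c \tag{6}$$

$$\delta_i = 0 \text{ or } 1, \quad i = 1, 2, \ldots, N. \tag{7}$$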

Eq. (2) minimizes the volatility (variance or risk) associated with the portfolio. Eq. (3) ensures that the portfolio yields an expected return of at least $r^*$. Eq. (4) ensures that the investment proportions sum to one. Eq. (5) imposes the minimum investment threshold: the proportion $w_i$ is zero if asset $i$ is not held in the portfolio, i.e. $\delta_i = 0$, and $w_i$ cannot be less than the minimum proportion $w_{\min}$ if asset $i$ is held, i.e. $\delta_i = 1$. Eq. (6) is the cardinality constraint that ensures the total number of assets in the portfolio does not exceed the maximum allowable number $c$. Finally, Eq. (7) is the

3. METHODOLOGY ISSUES

3.1 Data Envelopment Analysis

Data envelopment analysis (DEA) is a method for measuring the relative efficiencies of a set of similar decision making units (DMUs) through an evaluation of their inputs and outputs. The two popular DEA models are the CCR model developed by Charnes et al. [15] and the BCC model proposed by Banker et al. [16]. In addition, DEA models can have an input or output orientation. In this study, the objective of applying DEA to portfolio optimization is to screen companies within a given industry on the basis of their financial performance. Since the goal is to measure the underlying financial strength of companies whose scale sizes may differ, the input-oriented CCR model is more appropriate than the output-oriented BCC model. Furthermore, it is easier to reduce the input quantities than to increase the output quantities. Hence, the input-oriented CCR model is applied here. Suppose the goal is to evaluate the efficiency of $d$ independent DMUs relative to each other based on their common $m$ inputs and $s$ outputs. The input-oriented CCR model for evaluating the performance $h_0$ of DMU$_0$ can be formulated as follows:

$$\text{Maximize} \quad h_0 = \frac{\sum_{r=1}^{s} u_r \, y_{r0}}{\sum_{i=1}^{m} v_i \, x_{i0}} \tag{8}$$

subject to

$$\frac{\sum_{r=1}^{s} u_r \, y_{rj}}{\sum_{i=1}^{m} v_i \, x_{ij}} \le 1, \quad j = 1, 2, \ldots, d \tag{9}$$

$$u_r \ge 0, \quad r = 1, 2, \ldots, s \tag{10}$$

$$v_i \ge 0, \quad i = 1, 2, \ldots, m \tag{11}$$

where $x_{ij}$ ($\ge 0$) and $y_{rj}$ ($\ge 0$) represent the $i$th input and the $r$th output of DMU$_j$, respectively; and $v_i$ and $u_r$ denote the weights given to input $i$ and output $r$, respectively.
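With several inputs and outputs, Eqs. (8) to (11) are normally solved as a linear program, one per DMU. In the special case of a single input and a single output, however, the ratio model has a closed form: constraint (9) caps the weight ratio at the best observed output/input ratio, so each score is the DMU's own ratio divided by that best ratio. The sketch below covers only this special case; the function name and data are hypothetical.

```python
def ccr_efficiency_single(inputs, outputs):
    """CCR efficiency for one input and one output per DMU.

    The score of each DMU is its output/input ratio divided by the
    largest ratio observed among all DMUs, so the best DMU scores 1.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best_ratio = max(ratios)
    return [ratio / best_ratio for ratio in ratios]


# Hypothetical DMUs: (input, output) = (10, 5), (20, 20), (40, 20).
scores = ccr_efficiency_single([10.0, 20.0, 40.0], [5.0, 20.0, 20.0])
```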

3.2 Ant Colony Optimization for Continuous Domains


approaches, the ACO approach of Socha [17] is closest to the spirit of ACO for discrete problems [18].

Suppose a population with cardinality $k$ is used to solve a continuous optimization problem with $n$ dimensions. The Gaussian function is usually used as the probability density function (PDF) to estimate the distribution of each member (ant) in the solution population. For the $i$th dimension, the $j$th Gaussian function, with mean $\mu_j^i$ and standard deviation $\sigma_j^i$, derived from the $j$th member of the population, is represented by:

$$g_j^i(x) = \frac{1}{\sigma_j^i \sqrt{2\pi}} \, e^{-\frac{(x - \mu_j^i)^2}{2 (\sigma_j^i)^2}}, \quad i = 1, \ldots, n;\; j = 1, \ldots, k;\; x \in \mathbb{R}. \tag{12}$$

Hence, an ant can choose a value for dimension $i$ by using a Gaussian kernel, which is a weighted superposition of several Gaussian functions, defined as:

$$G^i(x) = \sum_{j=1}^{k} w_j \, g_j^i(x), \quad i = 1, \ldots, n;\; x \in \mathbb{R} \tag{13}$$

where $w_j$ is the weight associated with the $j$th member of the population in the mixture [18]. All solutions in the population are first ranked based on their fitness, with rank 1 for the best solution, and the associated weight of the $j$th member of the population in the mixture is calculated by:

$$w_j = \frac{1}{q k \sqrt{2\pi}} \, e^{-\frac{(r - 1)^2}{2 q^2 k^2}}, \quad j = 1, \ldots, k \tag{14}$$

where $r$ is the rank of the $j$th member and $q$ ($> 0$) is a parameter of the algorithm [18]. Furthermore, each ant $j$ must choose one of the Gaussian functions ($g_1^1, g_2^1, \ldots, g_j^1, \ldots, g_k^1$) for the first dimension [18], i.e. the first construction step, with the probability:

$$p_j = \frac{w_j}{\sum_{l=1}^{k} w_l}, \quad j = 1, \ldots, k. \tag{15}$$
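To illustrate Eqs. (14) and (15), the following sketch computes the rank-based weights and the resulting selection probabilities. The function names and the parameter values `k = 10`, `q = 0.5` are chosen for illustration only.

```python
import math


def rank_weights(k, q):
    """Eq. (14): weight of the solution with rank r, for r = 1, ..., k."""
    scale = q * k * math.sqrt(2.0 * math.pi)
    return [math.exp(-((r - 1) ** 2) / (2.0 * q * q * k * k)) / scale
            for r in range(1, k + 1)]


def selection_probabilities(weights):
    """Eq. (15): normalize the weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


weights = rank_weights(k=10, q=0.5)
probabilities = selection_probabilities(weights)
```

A small `q` concentrates the probability on the best-ranked solutions, while a large `q` makes the choice close to uniform.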

Suppose the Gaussian function $g_{j^*}^1$ is chosen for ant $j$ in the first dimension; the Gaussian functions $g_{j^*}^2$ to $g_{j^*}^n$ are then used for the remaining $n-1$ construction steps. In addition, for the $j^*$th Gaussian function in the $i$th dimension, the mean is set by:

$$\mu_{j^*}^i = x_{j^*}^i, \quad i = 1, \ldots, n, \tag{16}$$

and the standard deviation is estimated by:

$$\sigma_{j^*}^i = \xi \sqrt{\frac{\sum_{j=1}^{k} \left(x_j^i - x_{j^*}^i\right)^2}{k - 1}}, \quad i = 1, \ldots, n \tag{17}$$

where $x_j^i$ is the value of the $i$th decision variable in solution (ant) $j$ and $\xi \in (0, 1)$ is the parameter that regulates the speed of convergence [18].

Once each ant has completed $n$ construction steps, the worst $s$ solutions in the original population are replaced by the same number of best solutions generated by the search process, thus forming a new solution population. The search process is carried out iteratively until the stopping criteria are satisfied and near-optimal solutions are obtained. The detailed execution steps of ant colony optimization for continuous domains, denoted by $\mathrm{ACO}_\mathbb{R}$, are summarized as follows:

Step 1: Randomly, or by using some principles, create an initial population consisting of $k$ solutions (ants) with $n$ dimensions.

Step 2: Calculate the fitness of each solution and rank the solutions based on their fitness, with rank 1 for the best solution.

Step 3: For each solution $j$, choose one of the Gaussian functions ($g_1^1, g_2^1, \ldots, g_k^1$) for the first dimension, denoted by $g_{j^*}^1$, based on the probability obtained through Eqs. (14) and (15).

Step 4: For each solution $j$, generate a new solution by sampling the Gaussian functions ($g_{j^*}^1, g_{j^*}^2, \ldots, g_{j^*}^n$), whose means and standard deviations are calculated using Eqs. (16) and (17).

Step 5: Replace the worst $s$ solutions in the original population by the same number of the best solutions generated in Step 4, thus forming a new solution population.

Step 6: If the termination criteria are satisfied, stop the search process and obtain the near-optimal solutions. Otherwise, execute Steps 2 to 5 iteratively.
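Steps 1 to 6 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the C++ implementation used in the case study: the standard deviation is read as the $\xi$-scaled root-mean-square distance to the guiding solution, sampled values are clamped to the search bounds, and the stopping rule is a fixed iteration count. The function name `aco_r`, its default parameters and the sphere test function are all hypothetical.

```python
import math
import random


def aco_r(fitness, bounds, k=20, q=0.5, xi=0.9, s=5, max_iter=100, seed=0):
    """Minimize `fitness` over the box `bounds`; returns (best, history)."""
    rng = random.Random(seed)
    # Step 1: random initial population of k solutions.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(k)]
    # Eq. (14) for ranks 1..k; the constant factor cancels in Eq. (15).
    weights = [math.exp(-((rank - 1) ** 2) / (2.0 * (q * k) ** 2))
               for rank in range(1, k + 1)]
    history = []
    for _ in range(max_iter):
        pop.sort(key=fitness)                      # Step 2: rank 1 comes first
        history.append(fitness(pop[0]))
        offspring = []
        for _ant in range(k):
            # Step 3: pick the guiding solution with probability p_j.
            guide = pop[rng.choices(range(k), weights=weights)[0]]
            child = []
            for i, (lo, hi) in enumerate(bounds):  # Step 4: sample each dimension
                spread = math.sqrt(
                    sum((sol[i] - guide[i]) ** 2 for sol in pop) / (k - 1))
                value = rng.gauss(guide[i], xi * spread)
                child.append(min(max(value, lo), hi))  # assumption: clamp to bounds
            offspring.append(child)
        offspring.sort(key=fitness)
        pop[k - s:] = offspring[:s]                # Step 5: replace the worst s
    pop.sort(key=fitness)
    history.append(fitness(pop[0]))
    return pop[0], history


def sphere(x):
    """Hypothetical test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, history = aco_r(sphere, [(-5.0, 5.0)] * 3, max_iter=30)
```

Because only the worst `s` members are replaced and `s` is smaller than `k`, the best solution found so far always survives, so the recorded best fitness never increases from one iteration to the next.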

3.3 Gene Expression Programming

Gene expression programming (GEP), first developed by Ferreira [19], is an evolutionary methodology, based on the principles of Darwinian natural selection and biologically inspired operations, that evolves populations of computer programs in order to solve a user-defined problem. In GEP, the genes consist of a head containing symbols that represent both functions (elements from the function set $F$) and terminals (elements from the terminal set $T$), and a tail containing only terminals. Suppose, for a problem, the number of arguments in the function with the most arguments is $n_{\max}$ and the length of the head is $h$. Then, the length of the tail $t$ is evaluated by the equation:

$$t = h \, (n_{\max} - 1) + 1. \tag{18}$$

As an example, consider a gene composed of [Q, ×, ÷, -, +, a, b], where the number of arguments in the function with the most arguments is 2. If the length of the head $h$ is set as 10, the length of the tail $t$ is 11, i.e. 10 × (2 − 1) + 1, and the length of the gene is 21, i.e. 10 + 11. One such gene is illustrated as follows:

b b a a a b a b b a b

Q a b a b Q

0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0

  

 (19)

where the tail is shown in bold and “Q” represents the square root function. The above gene (genotype) can be represented by an expression tree (phenotype) as shown in Figure 1 and decoded as follows:

]

[b a b

ab   . (20)
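The translation from gene to expression tree can be sketched as follows. The gene is read left to right and the tree is filled level by level, each function symbol taking as many children as it has arguments. The seven-symbol gene `Q+*abab` (head length 3, tail length 4) is a hypothetical example rather than the gene shown above, and `*` and `/` stand in for × and ÷.

```python
import math

# Number of arguments of each function symbol; all other symbols are terminals.
ARITY = {"Q": 1, "+": 2, "-": 2, "*": 2, "/": 2}


def tail_length(head_length, max_arity):
    """Tail length for a given head length and maximum number of arguments."""
    return head_length * (max_arity - 1) + 1


def decode(gene):
    """Build the expression tree breadth-first; a node is [symbol, children]."""
    root = [gene[0], []]
    queue = [root]
    position = 1
    while queue:
        node = queue.pop(0)
        for _ in range(ARITY.get(node[0], 0)):
            child = [gene[position], []]
            position += 1
            node[1].append(child)
            queue.append(child)
    return root


def evaluate(node, values):
    """Evaluate an expression tree for the given terminal values."""
    symbol, children = node
    if symbol not in ARITY:
        return values[symbol]
    args = [evaluate(child, values) for child in children]
    if symbol == "Q":
        return math.sqrt(args[0])
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "-":
        return args[0] - args[1]
    if symbol == "*":
        return args[0] * args[1]
    return args[0] / args[1]


tree = decode("Q+*abab")            # encodes sqrt((b * a) + a)
result = evaluate(tree, {"a": 4.0, "b": 3.0})
```

The last symbol of the tail is never read here. The tail length rule guarantees that a gene always contains enough terminals to complete the tree, whatever functions appear in the head.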

The general execution steps of GEP are presented by Ferreira [19], and are briefly summarized as follows:

Step 1: Randomly generate an initial population of chromosomes.

Step 2: Express the chromosomes and evaluate the fitness of each individual.

Step 3: Select chromosomes from the population using a random probability based on the fitness and replicate the selected chromosomes.

Step 4: Randomly apply genetic operators to the replicated chromosomes in Step 3, thus creating the next generation. The genetic operators include mutation, IS (insertion sequence) transposition, RIS (root insertion sequence) transposition, gene transposition, one-point recombination, two-point recombination and gene recombination.

Step 5: When the termination criterion is satisfied, the outcome is designated as the final result of the run. Otherwise, Steps 2 to 4 are executed iteratively.

Figure 1. An example of the expression tree in GEP

4. PROPOSED PORTFOLIO OPTIMIZATION PROCEDURE

The proposed optimization procedure comprising three stages is described in the following sub-sections.

4.1 Selection of Stocks


assets, total equity, cost of sales and operating expenses are defined as inputs in the DEA model, and two variables, net sales and net income, are defined as outputs. This is in line with previous studies [20–22]. Next, the input-oriented CCR model is applied to evaluate the underlying fundamental financial strength of companies (DMUs) by using the financial data collected from the financial reports, which consist of the four inputs and two outputs. The companies are then ranked based on their efficiency scores, with the highest score as rank 1. Companies with the same efficiency score are further ranked by their earnings per share (EPS) in descending order. The companies ranked 1 to $c$ are then selected as the essential candidate companies (stocks) in the investment portfolio, where $c$ is the maximum allowable number of assets in the portfolio, as shown in Eq. (6).
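The two-level ranking described above, efficiency score first and EPS as the tie-breaker, amounts to a sort on a composite key. The sketch below uses hypothetical company codes and figures.

```python
def select_candidates(companies, c):
    """Return the codes of the top c companies.

    Companies are ordered by efficiency score (descending), and ties are
    broken by earnings per share (descending).
    """
    ranked = sorted(companies,
                    key=lambda co: (-co["efficiency"], -co["eps"]))
    return [co["code"] for co in ranked[:c]]


# Hypothetical DEA results.
companies = [
    {"code": "A", "efficiency": 1.00, "eps": 2.5},
    {"code": "B", "efficiency": 0.85, "eps": 9.0},
    {"code": "C", "efficiency": 1.00, "eps": 7.1},
    {"code": "D", "efficiency": 0.97, "eps": 1.2},
]
candidates = select_candidates(companies, c=3)
```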

4.2 Optimization of a Portfolio

In the second stage, the ACO algorithm is applied to select the final stocks in the investment portfolio, as well as to optimize the investment proportion of each selected stock. First, the expected weekly return of stock $i$, i.e. $mr_i$ in Eq. (3), the below-mean semi variance for stock $i$, i.e. $SV_i^m$ in Eq. (2), and the correlation coefficient between stocks $i$ and $j$, i.e. $\rho_{ij}$ in Eq. (2), are calculated based on the weekly trading data in the stock market. Next, the ACO algorithm presented in Section 3.2 is used to resolve the cardinality constrained portfolio optimization problem as formulated in Eqs. (2) to (7). Since the number of companies with superior financial strength selected in the previous stage exactly equals $c$, the cardinality constraint in Eq. (6) is fulfilled. In addition, the constraint regarding the expected return in Eq. (3) is built into the objective function in Eq. (2). Hence, the objective function to be minimized in ACO is defined as follows:

$$f_{\mathrm{ACO}} = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i \, w_j \, SV_i^m \, SV_j^m \, \rho_{ij} + M \times \max\left\{ r^* - \sum_{i=1}^{N} w_i \, mr_i,\; 0 \right\} \tag{21}$$

where $M$ is a very large number that represents the penalty incurred when the portfolio cannot yield an expected return of at least the desired level $r^*$, as required by Eq. (3). In addition, the obtained $j$th solution $(x_j^1, x_j^2, \ldots, x_j^c)$, i.e. the $j$th ant in the solution population with a cardinality of $k$, from ACO is modified according to the following equation:

$$y_j^i = \begin{cases} x_j^i & \text{if } x_j^i \ge w_{\min} \\ 0 & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, c;\; j = 1, \ldots, k. \tag{22}$$

Therefore, the $j$th solution ($j = 1, \ldots, k$) in ACO can now be transformed into the investment proportions $w_i$ by:

$$w_i = \frac{y_j^i}{\sum_{l=1}^{c} y_j^l}, \quad i = 1, 2, \ldots, c. \tag{23}$$

In this manner, all $w_i$ lie between $w_{\min}$ and 1, and the sum of the $w_i$ in each solution equals one, i.e. $\sum_{i=1}^{c} w_i = 1$; thus the constraints in Eqs. (4), (5) and (7) are met.
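The penalized objective of Eq. (21) and the repair step of Eqs. (22) and (23) can be sketched as below. The function names and all numbers are hypothetical. Raising an error when every component falls below the threshold is an assumption of this sketch, since the text does not say how that case is handled.

```python
def repair(x, w_min):
    """Eqs. (22)-(23): zero out components below w_min, then normalize."""
    y = [xi if xi >= w_min else 0.0 for xi in x]
    total = sum(y)
    if total == 0.0:
        # Assumption: this case is not covered in the text.
        raise ValueError("no component reaches the minimum proportion")
    return [yi / total for yi in y]


def f_aco(w, sv, rho, mr, r_star, penalty=1000.0):
    """Eq. (21): portfolio risk plus a penalty for missing the target return."""
    n = len(w)
    risk = sum(w[i] * w[j] * sv[i] * sv[j] * rho[i][j]
               for i in range(n) for j in range(n))
    expected_return = sum(wi * mi for wi, mi in zip(w, mr))
    return risk + penalty * max(r_star - expected_return, 0.0)


# Hypothetical three-stock example.
w = repair([0.5, 0.03, 0.5], w_min=0.05)     # the second stock is dropped
sv = [0.02, 0.03, 0.04]
rho = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mr = [0.01, 0.02, 0.03]
f_feasible = f_aco(w, sv, rho, mr, r_star=0.01)    # target return is met
f_penalized = f_aco(w, sv, rho, mr, r_star=0.03)   # target return is missed
```

Folding the return constraint into the objective lets the search operate on an unconstrained continuous space, with the penalty steering the ants away from portfolios that miss the target return.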

4.3 Buying/Selling of Stocks

In the last stage, the GEP technique is utilized to forecast stock closing prices, and transaction rules are designed to determine the optimal timing for buying/selling stocks. First, fifteen technical indicators are calculated based on the historical stock trading data: (1) 10-day moving average, (2) 20-day bias, (3) moving average convergence/divergence, (4) 9-day stochastic indicator K, (5) 9-day stochastic indicator D, (6) 9-day Williams overbought/oversold index, (7) 10-day rate of change, (8) 5-day relative strength index, (9) 24-day commodity channel index, (10) 26-day volume ratio, (11) 13-day psychological line, (12) 14-day plus directional indicator, (13) 14-day minus directional indicator, (14) 26-day buying/selling momentum indicator and (15) 26-day buying/selling willingness indicator. These indicators serve as the input variables of the GEP forecasting models, in line with previous studies [23–28]. The technical indicators on the last trading day of each week, along with the closing price on the last trading day of the following week, are then randomly partitioned into training and test data based on a pre-specified proportion, e.g., 4:1. Next, the GEP algorithm is utilized to construct several forecasting models, and an optimal forecasting model, named $\mathrm{Model}_{\mathrm{GEP}}$, is determined by simultaneously minimizing the root mean squared errors (RMSEs) of the training and test data. Let $p_i$ represent the closing price on the last trading day of the current week and let $\hat{p}_i$ represent the forecasted closing price on the last trading day of the next week for stock $i$. Four transaction rules can then be designed as follows:

(1) IF (Stock $i$ is held) AND ($\hat{p}_i \ge p_i$), THEN (Do not take any action);

(2) IF (Stock $i$ is held) AND ($\hat{p}_i < p_i$), THEN (Sell stock $i$ on the next trading day);

(3) IF (Stock $i$ is not held) AND ($\hat{p}_i > p_i$), THEN (Buy stock $i$ on the next trading day);

(4) IF (Stock $i$ is not held) AND ($\hat{p}_i \le p_i$), THEN (Do not take any action).
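The four rules reduce to a small decision function. In this sketch, a stock that is held is kept unless the forecast falls below the current price, and a stock that is not held is bought only if the forecast exceeds the current price. How the case of equal prices is handled is an assumption, and the function name and action labels are hypothetical.

```python
def decide(held, price, forecast):
    """Return the action for one stock on the last trading day of the week."""
    if held:
        # Rules (1) and (2): keep the stock unless a fall is forecast.
        return "hold" if forecast >= price else "sell"
    # Rules (3) and (4): buy only if a rise is forecast.
    return "buy" if forecast > price else "wait"


actions = [
    decide(held=True, price=100.0, forecast=104.0),
    decide(held=True, price=100.0, forecast=97.0),
    decide(held=False, price=100.0, forecast=104.0),
    decide(held=False, price=100.0, forecast=97.0),
]
```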

Using these rules and the forecasted closing stock price obtained by $\mathrm{Model}_{\mathrm{GEP}}$, an investor can make buy/sell decisions for each stock on the last

5. CASE STUDY

In this section, a case study on investing in stocks in the semiconductor sub-section of Taiwan’s stock market is presented.

5.1 Selecting Potential Stocks

According to the Securities and Exchange Act of Taiwan, the third-quarterly financial report and the annual financial report of a listed company must be announced before October 31st of the current year and before April 30th of the next year, respectively. Hence, the financial data obtained from the third-quarterly financial report were used to plan the investment during the period from November 1st of the current year to April 30th of the next year, and the financial data obtained from the annual financial report were used to arrange the investment plan from May 1st to October 31st of the current year. The release time of financial reports, the types of financial reports, the corresponding investment planning horizons and the periods of collecting ROI and trading data in this study are summarized in Table 1. The seven financial variables described in Section 4.1 are first collected from the Taiwan Economic Journal (TEJ) database at each release time of the financial report listed in Table 1. Taking the fifth case in Table 1 as an example, there were 65 listed companies in the semiconductor sub-section of Taiwan's stock market on October 31, 2009. The input-oriented CCR model is then applied to the remaining 48 listed companies to evaluate their underlying fundamental financial strength by using the DEA-Solver Learning Version 3.0 (http://www.saitech-inc.com) software. The best ten companies, ranked by their efficiency scores as the first priority and their EPS as the second priority, are then selected as the essential candidate companies (stocks) in the investment portfolio, as listed in Table 2 (Case 5). By following the same procedure, the essential candidate stocks in the investment portfolios for the other cases in Table 1 are obtained, as listed in Table 2.

Table 1. Release time of financial reports, investment planning horizons and periods of data collection

| Case No. | Release time of the financial report (type of the financial report) | Investment planning horizon | Collection period for ROI and trading data |
|---|---|---|---|
| 1 | 2007/10/31 (Third-quarterly report of 2007) | 2007/11/01~2008/04/30 | 2006/11/01~2007/10/31 |

Table 2. Essential candidate stocks in the investment portfolio

| Rank | Case 1 stock code | Case 1 efficiency score | Case 1 EPS | Case 2 stock code | Case 2 efficiency score | Case 2 EPS | Case 3 stock code | Case 3 efficiency score | Case 3 EPS |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2454 | 1.00 | 26.48 | 2454 | 1.00 | 32.59 | 2454 | 1.00 | 15.31 |
| 2 | 6286 | 1.00 | 11.02 | 6286 | 1.00 | 14.98 | 3519 | 1.00 | 11.00 |
| 3 | 3034 | 1.00 | 10.45 | 3034 | 1.00 | 14.02 | 3579 | 1.00 | 10.64 |
| 4 | 6239 | 1.00 | 7.88 | 6239 | 1.00 | 11.08 | 6286 | 1.00 | 7.92 |
| 5 | 2451 | 1.00 | 7.28 | 2451 | 1.00 | 7.78 | 6239 | 1.00 | 7.81 |
| 6 | 3443 | 1.00 | 4.52 | 3532 | 1.00 | 6.70 | 3443 | 1.00 | 4.74 |
| 7 | 2441 | 1.00 | 3.71 | 3443 | 1.00 | 6.41 | 2451 | 1.00 | 4.12 |
| 8 | 8131 | 1.00 | 3.09 | 2441 | 1.00 | 5.07 | 3588 | 1.00 | 4.07 |
| 9 | 2473 | 1.00 | 2.45 | 2330 | 1.00 | 4.14 | 2330 | 1.00 | 3.36 |
| 10 | 6145 | 1.00 | 0.01 | 8131 | 1.00 | 4.11 | 2441 | 1.00 | 2.76 |

| Rank | Case 4 stock code | Case 4 efficiency score | Case 4 EPS | Case 5 stock code | Case 5 efficiency score | Case 5 EPS | Case 6 stock code | Case 6 efficiency score | Case 6 EPS |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2454 | 1.00 | 18.01 | 2454 | 1.00 | 26.04 | 2454 | 1.00 | 34.12 |
| 2 | 3579 | 1.00 | 14.16 | 6286 | 1.00 | 7.75 | 6286 | 1.00 | 10.93 |
| 3 | 6239 | 1.00 | 10.38 | 2451 | 1.00 | 7.11 | 2451 | 1.00 | 10.42 |
| 4 | 6286 | 1.00 | 10.05 | 6239 | 1.00 | 4.92 | 6239 | 1.00 | 7.44 |
| 5 | 3443 | 1.00 | 6.05 | 6145 | 1.00 | 2.84 | 2330 | 1.00 | 3.45 |
| 6 | 2451 | 1.00 | 5.72 | 3041 | 1.00 | 2.51 | 3041 | 1.00 | 3.23 |
| 7 | 3588 | 1.00 | 5.05 | 2330 | 1.00 | 2.19 | 3443 | 1.00 | 3.15 |
| 8 | 2330 | 1.00 | 3.86 | 2441 | 1.00 | 1.73 | 6145 | 1.00 | 3.13 |
| 9 | 2441 | 1.00 | 3.10 | 2473 | 1.00 | 1.29 | 3579 | 1.00 | 2.89 |
| 10 | 3532 | 1.00 | 2.54 | 3443 | 1.00 | 1.07 | 2441 | 1.00 | 2.74 |

| Rank | Case 7 stock code | Case 7 efficiency score | Case 7 EPS | Case 8 stock code | Case 8 efficiency score | Case 8 EPS |
|---|---|---|---|---|---|---|
| 1 | 2454 | 1.00 | 24.95 | 2454 | 1.00 | 28.44 |
| 2 | 6286 | 1.00 | 11.82 | 6286 | 1.00 | 14.60 |
| 3 | 6239 | 1.00 | 8.37 | 6239 | 1.00 | 10.89 |
| 4 | 2330 | 1.00 | 4.67 | 3579 | 1.00 | 9.02 |
| 5 | 5471 | 1.00 | 4.15 | 2330 | 1.00 | 6.24 |
| 6 | 3443 | 1.00 | 3.42 | 4919 | 1.00 | 4.13 |
| 7 | 2351 | 1.00 | 3.14 | 2451 | 1.00 | 3.48 |
| 8 | 6202 | 1.00 | 3.05 | 8131 | 1.00 | 3.46 |
| 9 | 2451 | 1.00 | 2.79 | 8271 | 1.00 | 2.92 |
| 10 | 8131 | 1.00 | 2.38 | 2473 | 1.00 | 2.22 |

5.2 Optimizing the Portfolio

In order to select the final stocks in the investment portfolio and optimize their investment proportions, the research first collects the weekly ROI of each essential candidate stock listed in Table 2 from the TEJ database. The collection period for the ROI data is the previous 12 months starting from the release time of the financial report (see Table 1). Following the data collection, the expected weekly return of stock $i$, i.e. $mr_i$ in Eq. (3), the below-mean semi variance for stock $i$, i.e. $SV_i^m$ in Eq. (2), and the correlation coefficient between stocks $i$ and $j$, i.e. $\rho_{ij}$ in Eq. (2), can be calculated.

Next, the ACO algorithm, coded in the C++ programming language, is used to resolve the portfolio optimization problem as formulated in Eqs. (2) to (7), where the minimum proportion of each stock held, i.e. $w_{\min}$ in Eq. (5),

is set as 10. The expected portfolio return, i.e. $r^*$ in Eq. (3), is set as the maximum of the average weekly ROI of the stock market over the last twelve months and the weekly interest rate of a six-to-nine-month fixed deposit published by the Bank of Taiwan, to reflect the activeness of investors. In addition, the objective function in $\mathrm{ACO}_\mathbb{R}$ follows Eq. (21) in Section 4.2, where the parameter $M$ is set as 1,000. To find the optimal settings of the key parameters in $\mathrm{ACO}_\mathbb{R}$, including $k$ (cardinality, i.e. the total number of ants), $q$, $\xi$, $s$ and $r_{\max}$ (the maximum allowable number of cycles for the algorithm to attempt to improve on its best solution), a preliminary experiment is conducted using a $2^{5-1}$ fractional factorial design for the seventh case in Table 1. Table 3 shows the experimental results from thirty replications of each combination of parameters, and Table 4 shows the analyzed results. The parameter $k$, the interaction $q \times r_{\max}$ and the interaction $\xi \times r_{\max}$ are automatically selected into the ANOVA model, as shown in Table 4. According to Table 4, the model is significant at $\alpha = 0.05$. From the effect plots of parameter $k$, interaction $q \times r_{\max}$ and interaction $\xi \times r_{\max}$ in Figure 2, the optimal settings of $k$, $q$, $\xi$ and $r_{\max}$ in $\mathrm{ACO}_\mathbb{R}$ are set at 100, 4, 0.9 and 200, respectively. In addition, the parameter $s$, i.e. the total number of worst solutions in the original population replaced by the best solutions generated by the search process, is set as 20. Taking the fifth case in Table 1 as an example, the weekly ROI data of the essential candidate stocks listed in Table 2 (Case 5) are collected from November 1, 2008 to October 31, 2009. The expected weekly return, the below-mean semi variance of each stock, and the correlation coefficient between each pair of stocks are calculated. The search procedure is implemented for 100 runs on a personal computer with an Intel Core 2 Quad 2.66 GHz CPU and 2 GB RAM, and Table 5 lists the optimal portfolio. The average weekly ROI in the Taiwan stock market from November 1, 2008 to October 31, 2009 is 0.88%, and the weekly interest rate of a six-to-nine-month fixed deposit published by the Bank of Taiwan on October 31, 2009 is 0.0142%. Therefore, the expected portfolio return $r^*$ is set as 0.88%. According to the experimental results of the fifth case in Table 5, the portfolio contains five stocks, with codes 2454, 6239, 6145, 2330 and 2441, and their corresponding investment proportions are 0.0857, 0.2592, 0.0868, 0.4822 and 0.0861, respectively. The investment risk (variance) of the portfolio is $1.15 \times 10^{-3}$, and the expected weekly ROI of the portfolio is $1.33 \times 10^{-2}$ (1.33%), which is

corresponding investment proportions, investment risk and expected weekly ROI and CPU time. This information is summarized in Table 5.
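The Case 5 figures quoted above can be checked with a few lines of arithmetic. The sketch below picks the target return as the larger of the market's average weekly ROI and the weekly deposit rate, then confirms that the reported proportions sum to one and that the reported expected weekly ROI meets the target. The variable names are hypothetical; the numbers are those given in the text.

```python
# Figures for Case 5 as reported in the text, expressed as fractions.
market_weekly_roi = 0.0088        # 0.88 %
deposit_weekly_rate = 0.000142    # 0.0142 %

# The target return is the larger of the two reference rates.
r_star = max(market_weekly_roi, deposit_weekly_rate)

proportions = {
    "2454": 0.0857,
    "6239": 0.2592,
    "6145": 0.0868,
    "2330": 0.4822,
    "2441": 0.0861,
}
total_proportion = sum(proportions.values())

expected_weekly_roi = 0.0133      # 1.33 %
meets_target = expected_weekly_roi >= r_star
```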

Table 3. A preliminary experiment on ACO parameters

| No. | k | q | ξ | s | r_max | Mean of f_ACO | Variance of f_ACO |
|---|---|---|---|---|---|---|---|
| 1 | 50 | 2 | 0.90 | 10 | 10 | 3.18×10^-4 | 4.66×10^-9 |
| 2 | 100 | 2 | 0.90 | 10 | 10 | 2.92×10^-4 | 3.19×10^-9 |
| 3 | 50 | 4 | 0.90 | 10 | 10 | 3.39×10^-4 | 5.18×10^-9 |
| 4 | 100 | 4 | 0.90 | 10 | 10 | 2.98×10^-4 | 4.88×10^-9 |
| 5 | 50 | 2 | 0.99 | 10 | 10 | 3.01×10^-4 | 3.47×10^-9 |
| 6 | 100 | 2 | 0.99 | 10 | 10 | 2.92×10^-4 | 3.74×10^-9 |
| 7 | 50 | 4 | 0.99 | 10 | 10 | 3.20×10^-4 | 5.33×10^-9 |
| 8 | 100 | 4 | 0.99 | 10 | 10 | 2.75×10^-4 | 2.06×10^-9 |
| 9 | 50 | 2 | 0.90 | 20 | 20 | 3.11×10^-4 | 3.46×10^-9 |
| 10 | 100 | 2 | 0.90 | 20 | 20 | 2.95×10^-4 | 3.74×10^-9 |
| 11 | 50 | 4 | 0.90 | 20 | 20 | 2.77×10^-4 | 3.93×10^-9 |
| 12 | 100 | 4 | 0.90 | 20 | 20 | 3.10×10^-4 | 3.92×10^-9 |
| 13 | 50 | 2 | 0.99 | 20 | 20 | 3.20×10^-4 | 3.72×10^-9 |
| 14 | 100 | 2 | 0.99 | 20 | 20 | 2.90×10^-4 | 4.34×10^-9 |
| 15 | 50 | 4 | 0.99 | 20 | 20 | 3.11×10^-4 | 5.12×10^-9 |
| 16 | 100 | 4 | 0.99 | 20 | 20 | 2.80×10^-4 | 3.62×10^-9 |

Table 4. ANOVA for the preliminary experiment on ACO parameters

| Source | Sum of squares | d.f. | Mean square | F value | Significance |
|---|---|---|---|---|---|
| Model | 9.13×10^-8 | 6 | 1.52×10^-8 | 3.75 | 0.0012 |
| k | 5.06×10^-8 | 1 | 5.06×10^-8 | 12.48 | 0.0005 |
| q | 8.16×10^-11 | 1 | 8.16×10^-11 | 0.02 | 0.8872 |
| ξ | 5.03×10^-9 | 1 | 5.03×10^-9 | 1.24 | 0.2661 |
| r_max | 1.62×10^-9 | 1 | 1.62×10^-9 | 0.40 | 0.5272 |
| q × r_max | 1.59×10^-8 | 1 | 1.59×10^-8 | 3.93 | 0.0479 |
| ξ × r_max | 1.81×10^-8 | 1 | 1.81×10^-8 | 4.45 | 0.0353 |
| Residual | 1.92×10^-6 | 473 | 4.05×10^-9 | | |
| Lack of Fit | 5.04×10^-8 | 9 | 5.60×10^-9 | 1.39 | 0.1892 |
| Pure Error | 1.87×10^-6 | 464 | 4.02×10^-9 | | |
| Corrected Total | 2.01×10^-6 | 479 | | | |

(A) Effect of parameter k; (B) Effect of interaction q × r_max; (C) Effect of interaction ξ × r_max

Figure 2. Effects of the parameter and interactions

Table 5. The optimal investment portfolio obtained using ACO

| Case 1 stock code | Case 1 investment proportion | Case 2 stock code | Case 2 investment proportion | Case 3 stock code | Case 3 investment proportion | Case 4 stock code | Case 4 investment proportion |
|---|---|---|---|---|---|---|---|
| 2454 | 0.3503 | 2454 | 0.0776 | 3519 | 0.1657 | 2454 | 0.1978 |
| 3034 | 0.1985 | 6239 | 0.2957 | 6286 | 0.1263 | 6286 | 0.5055 |
| 6239 | 0.1538 | 2451 | 0.2442 | 6239 | 0.0887 | 2451 | 0.2213 |
| 2451 | 0.1218 | 2330 | 0.3825 | 3443 | 0.1678 | 2330 | 0.0754 |
| 2441 | 0.1756 | - | - | 2451 | 0.1949 | - | - |

| Item | Case 1 | Case 2 | Case 3 | Case 4 |
|---|---|---|---|---|
| Investment risk (variance) | … | … | … | … |
| Expected weekly ROI | 1.00×10^-2 | 2.81×10^-3 | -8.00×10^-3 | …×10^-3 |
| Stock market weekly ROI | 6.31×10^-3 | 2.80×10^-3 | …×10^-2 | …×10^-3 |
| CPU time (sec) of 100 runs | 51.45 | 52.81 | 27.06 | 51.52 |

Investment risk (variance) values listed without a recoverable case column: 4.58×10^-4, 2.62×10^-3.

Table 5. The optimal investment portfolio obtained using ACO (Continued)

| Case 5: Stock code | Case 5: Investment proportion | Case 6: Stock code | Case 6: Investment proportion | Case 7: Stock code | Case 7: Investment proportion | Case 8: Stock code | Case 8: Investment proportion |
|---|---|---|---|---|---|---|---|
| 2454 | 0.0857 | 6286 | 0.1074 | 2330 | 0.8706 | 6286 | 0.0850 |
| 6239 | 0.2592 | 6239 | 0.2581 | 6202 | 0.1294 | 3579 | 0.1384 |
| 6145 | 0.0868 | 2330 | 0.5226 | - | - | 2330 | 0.5934 |
| 2330 | 0.4822 | 2441 | 0.1118 | - | - | 2451 | 0.0709 |
| 2441 | 0.0861 | - | - | - | - | 2473 | 0.1123 |

Investment risk (variance): 2.96×10⁻⁴
Expected weekly ROI (Cases 5 to 8, in order): 1.33×10⁻²; 7.85×10⁻³; 2.67×10⁻³; 3.05×10⁻³
Stock market weekly ROI (Cases 5 to 8, in order): 8.83×10⁻³; 6.13×10⁻³; 2.67×10⁻³; 2.59×10⁻³
CPU time (sec) of 100 runs (Cases 5 to 8, in order): 50.70; 51.22; 54.05; 51.52
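As a quick sanity check on the portfolios for Cases 5 to 8, the sketch below sums the investment proportions of each case and shows how a portfolio variance of the Markowitz type would be computed from them. It assumes that the proportions are meant to sum to one and that the investment risk is the variance w'Σw. The covariance values in the usage example are hypothetical, since the paper's covariance estimates are not reproduced in this section.

```python
# Investment proportions for Cases 5 to 8, copied from Table 5 (Continued).
PORTFOLIOS = {
    5: {"2454": 0.0857, "6239": 0.2592, "6145": 0.0868, "2330": 0.4822, "2441": 0.0861},
    6: {"6286": 0.1074, "6239": 0.2581, "2330": 0.5226, "2441": 0.1118},
    7: {"2330": 0.8706, "6202": 0.1294},
    8: {"6286": 0.0850, "3579": 0.1384, "2330": 0.5934, "2451": 0.0709, "2473": 0.1123},
}


def total_proportion(weights):
    """Sum of the investment proportions of one portfolio."""
    return sum(weights.values())


def portfolio_variance(weights, covariance):
    """Variance w' * Sigma * w, with covariance given as a nested dict."""
    return sum(
        w_i * w_j * covariance[i][j]
        for i, w_i in weights.items()
        for j, w_j in weights.items()
    )


for case, weights in PORTFOLIOS.items():
    print(f"Case {case}: proportions sum to {total_proportion(weights):.4f}")

# Hypothetical weekly covariance matrix for the two stocks of Case 7
# (illustrative numbers only, not taken from the paper).
hypothetical_cov = {
    "2330": {"2330": 3.0e-4, "6202": 1.0e-4},
    "6202": {"2330": 1.0e-4, "6202": 9.0e-4},
}
print(portfolio_variance(PORTFOLIOS[7], hypothetical_cov))
```

The proportions of each case sum to one up to the four-decimal rounding used in the table.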

5.3 Stock Buying and Selling

In this stage, the transaction rules designed in Section 4.3 are used to determine the optimal timing for buying or selling stocks with the help of stock price forecasting models constructed by the GEP technique. The fifth case in Table 1 is taken as an example. The daily trading data, including the opening price, highest price, lowest price, closing price and trade volume, of the ten essential candidate stocks shown in Table 2 are first collected from the Taiwan Stock Exchange Corporation (TWSE) for the last twelve months starting from the release time of the financial report. The fifteen technical indicators described in Section 4.3 are then calculated for the last trading day of each week. The technical indicators for the last trading day of each week, along with the closing price on the last trading day of the following week, are randomly partitioned into training and test data groups in the proportion 4:1. Next, the GEP algorithm, run in the GeneXpro Tools 4.0 software (http://www.gepsoft.com), is employed to construct stock price forecasting models, where the fitness of an individual is evaluated through the root mean squared error (RMSE) and the parameters are set to their default values. The GEP algorithm is executed five times, and the optimal GEP forecasting model, denoted Model_GEP, is selected based on the training and test RMSEs.

Next, the fifteen technical indicators for the last trading day of each week in […] model, thus obtaining the forecasted closing stock price for the last trading day of the next week. With the forecasted closing stock prices, the investor can make buy/sell decisions for each stock on the last trading day of each week based on the four transaction rules presented in Section 4.3.
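The data preparation and model scoring described above can be sketched as follows. This is a minimal illustration, not the GeneXpro Tools workflow: it only shows a random 4:1 partition and the RMSE used to compare candidate models. The function names, the fixed seed and the synthetic samples are assumptions made here for the example.

```python
import math
import random


def split_train_test(samples, train_parts=4, test_parts=1, seed=0):
    """Randomly partition samples into training and test groups (4:1 by default)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed only for reproducibility
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted closing prices."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equally long")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic weekly samples: (fifteen indicator values, next week's closing price).
samples = [([float(i)] * 15, 100.0 + i) for i in range(50)]
train, test = split_train_test(samples)
print(len(train), len(test))
```

With several candidate models, such as the five GEP runs mentioned above, one would compute `rmse` on both groups for each candidate and keep the model that performs well on both, which guards against selecting a model that merely fits the training data.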

Here, assume that the initial investment capital is one million dollars and that the total investment capital can vary at any time owing to the profit or loss arising from stock transactions made during the investment planning horizon. Further, assume that the stocks are arbitrarily divisible and can always be bought or sold at the opening price on the trading day following the buy/sell decision. In addition, all stocks held must be sold on the last trading day of the investment planning horizon.

Table 6 illustrates part of the transactions of stock 6239, which is contained in the portfolio listed as the fifth case in Table 5. The closing price on November 6, 2009 is 87.58, which is less than the forecasted closing price of 90.80 for the last trading day of the next week, i.e. November 13, 2009. Hence, based on the third transaction rule in Section 4.3, stock 6239 is bought at the opening price of 88.06 on the next trading day, November 9, 2009. On November 13, 2009, the closing price of 89.79 is again less than the forecasted closing price of 92.37 for the last trading day of the next week; thus, in keeping with the first transaction rule, no action is taken. The forecasted closing price for January 22, 2010 is 106.64, which is less than the closing price of 107.78 on January 15, 2010. Therefore, based on the second transaction rule, stock 6239 is sold at the opening price of 106.82 on January 18, 2010, which yields a profit of 18.76 (106.82 - 88.06) per share.

The four transaction rules are likewise applied to the other stocks in the portfolio for the fifth case in Table 5, i.e. stocks 2454, 6145, 2330 and 2441. The profit or loss of each stock transaction made during the investment planning horizon is thus obtained, yielding a final return on investment of 11.46%, as shown by the ROI₁ value for Case 5 in Table 7. By following the same procedure, the returns on investment for the other cases in Table 1 during the investment planning horizon can be obtained; these are the ROI₁ values in Table 7. Table 7 also summarizes the return on investment obtained when investing with only the first and second stages of the proposed portfolio optimization procedure, i.e. the Buy & Hold strategy, denoted by ROI₂, and the return on investment of the semiconductor sub-section of Taiwan's stock market, denoted by ROI₃.

Based on the ROI₁ values in Table 7, the average six-month ROI reaches a very high level of 13.12%. Even in the worst case, the ROI still reaches 0.86%, which is equivalent to a yearly ROI of 1.72%. This value is still higher than the normal yearly interest rate of a fixed deposit for six to nine months in Taiwan, which is only around 1.1%. While not every ROI₁ value exceeds the corresponding ROI₂ value in Table 7, all the […] Furthermore, the average of the ROI₁ values exceeds the average of the ROI₂ values by 11.53 percentage points. With regard to the ROI₁ and ROI₃ values in Table 7, the former are larger except in the third case, where the ROI₁ value of 23.21% is slightly smaller than the corresponding ROI₃ value of 23.67%. In addition, the average ROI₁ value of 13.12% is far superior to the average ROI₃ value of -2.39%. These results are shown in Figure 3.

Table 6. Partial transactions of stock 6239 (for Case 5 in Table 5)

| Date | Closing price | Forecasted closing price | Transaction | Transaction date | Transaction rule |
|---|---|---|---|---|---|
| 2009/11/06 | 87.58 | 90.80 | Buy at 88.06 | 2009/11/09 | Rule 3 |
| 2009/11/13 | 89.79 | 92.37 | - | - | Rule 1 |
| 2009/11/20 | 87.38 | 91.54 | - | - | Rule 1 |
| 2009/11/27 | 84.88 | 88.63 | - | - | Rule 1 |
| 2009/12/04 | 87.29 | 89.79 | - | - | Rule 1 |
| 2009/12/11 | 93.06 | 93.93 | - | - | Rule 1 |
| 2009/12/18 | 94.70 | 97.39 | - | - | Rule 1 |
| 2009/12/25 | 102.01 | 102.44 | - | - | Rule 1 |
| 2009/12/31 | 104.42 | 104.72 | - | - | Rule 1 |
| 2010/01/08 | 104.90 | 106.92 | - | - | Rule 1 |
| 2010/01/15 | 107.78 | 106.64 | Sell at 106.82 | 2010/01/18 | Rule 2 |
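The weekly decisions in Table 6 can be replayed with a small sketch. Rules 1 to 3 are encoded here as they are applied in the worked example for stock 6239. Since Section 4.3 is not reproduced in this part of the paper, two points are assumptions of this sketch: Rule 4 is taken to mean "do nothing when no stock is held and the forecast does not exceed the closing price", and a forecast equal to the closing price is treated like a lower forecast.

```python
from typing import List, Tuple

# (date, closing price, forecasted closing price for the end of next week), from Table 6.
TABLE_6 = [
    ("2009/11/06", 87.58, 90.80),
    ("2009/11/13", 89.79, 92.37),
    ("2009/11/20", 87.38, 91.54),
    ("2009/11/27", 84.88, 88.63),
    ("2009/12/04", 87.29, 89.79),
    ("2009/12/11", 93.06, 93.93),
    ("2009/12/18", 94.70, 97.39),
    ("2009/12/25", 102.01, 102.44),
    ("2009/12/31", 104.42, 104.72),
    ("2010/01/08", 104.90, 106.92),
    ("2010/01/15", 107.78, 106.64),
]


def decide(holding: bool, close: float, forecast: float) -> Tuple[str, str]:
    """Return (rule, action) for one weekly decision."""
    if holding:
        if forecast > close:
            return "Rule 1", "hold"  # price expected to rise: keep the stock
        return "Rule 2", "sell"      # price expected to fall: sell at next open
    if forecast > close:
        return "Rule 3", "buy"       # price expected to rise: buy at next open
    return "Rule 4", "wait"          # assumed behaviour, not shown in Table 6


def replay(rows, holding: bool = False) -> List[Tuple[str, str, str]]:
    """Apply the rules week by week, tracking whether the stock is held."""
    decisions = []
    for date, close, forecast in rows:
        rule, action = decide(holding, close, forecast)
        if action == "buy":
            holding = True
        elif action == "sell":
            holding = False
        decisions.append((date, rule, action))
    return decisions


decisions = replay(TABLE_6)
for date, rule, action in decisions:
    print(date, rule, action)

# Opening prices on the transaction dates, taken from Table 6.
profit_per_share = 106.82 - 88.06
print(f"profit per share: {profit_per_share:.2f}")
```

Replaying the eleven rows gives one purchase, nine weeks of holding and one sale, in line with the transaction rule column of Table 6.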

Table 7. The information for each investment portfolio in Table 5

| Case No. | Initial capital | Final capital | ROI₁ | ROI₂ | ROI₃ |
|---|---|---|---|---|---|
| 1 | 1,000,000 | 1,187,000 | 18.70% | -50.87% | -12.47% |
| 2 | 1,000,000 | 1,156,700 | 15.67% | -30.79% | -39.54% |
| 3 | 1,000,000 | 1,232,100 | 23.21% | 10.85% | 23.67% |
| 4 | 1,000,000 | 1,158,400 | 15.84% | 73.99% | 11.10% |
| 5 | 1,000,000 | 1,114,600 | 11.46% | 11.94% | 8.28% |
| 6 | 1,000,000 | 1,008,600 | 0.86% | -7.67% | -9.25% |
| 7 | 1,000,000 | 1,133,100 | 13.31% | 7.51% | 5.25% |
| 8 | 1,000,000 | 1,058,900 | 5.89% | -2.25% | -6.14% |
| Max | 1,000,000 | 1,232,100 | 23.21% | 73.99% | 23.67% |
| Min | 1,000,000 | 1,008,600 | 0.86% | -50.87% | -39.54% |
| Average | 1,000,000 | 1,131,175 | 13.12% | 1.59% | -2.39% |
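The summary rows of Table 7 can be rechecked directly from the eight case rows. The sketch below is only an arithmetic check. The variable and function names are chosen here for illustration, and the yearly figure uses simple doubling of the six-month ROI, as in the 0.86% to 1.72% conversion in the text, rather than compounding.

```python
# Six-month returns in percent for Cases 1 to 8, copied from Table 7:
# (ROI1: proposed approach, ROI2: Buy & Hold, ROI3: stock market).
ROI_PERCENT = {
    1: (18.70, -50.87, -12.47),
    2: (15.67, -30.79, -39.54),
    3: (23.21, 10.85, 23.67),
    4: (15.84, 73.99, 11.10),
    5: (11.46, 11.94, 8.28),
    6: (0.86, -7.67, -9.25),
    7: (13.31, 7.51, 5.25),
    8: (5.89, -2.25, -6.14),
}


def column(rows, index):
    """Extract one ROI column (0, 1 or 2) as a list."""
    return [values[index] for values in rows.values()]


def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def simple_yearly_roi(six_month_roi):
    """Yearly ROI by simple doubling of a six-month ROI (no compounding)."""
    return 2.0 * six_month_roi


avg_roi1 = average(column(ROI_PERCENT, 0))
avg_roi2 = average(column(ROI_PERCENT, 1))
avg_roi3 = average(column(ROI_PERCENT, 2))
gap_roi1_roi2 = avg_roi1 - avg_roi2          # difference in percentage points
worst_case_yearly = simple_yearly_roi(min(column(ROI_PERCENT, 0)))

print(f"average ROI1 = {avg_roi1:.4f}%")
print(f"average ROI2 = {avg_roi2:.4f}%")
print(f"average ROI3 = {avg_roi3:.4f}%")
print(f"ROI1 - ROI2  = {gap_roi1_roi2:.4f} percentage points")
```

The recomputed averages round to the values in the Average row, and the gap between the ROI₁ and ROI₂ averages is a difference in percentage points rather than a relative increase.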

Figure 3. Comparison of ROIs based on the proposed approach, Buy & Hold strategy and stock market

(Chart of the maximum, minimum and average ROI for the proposed approach, the Buy & Hold strategy and the stock market; the plotted values are those of the Max, Min and Average rows in Table 7.)

6. CONCLUSIONS

In this study, data envelopment analysis (DEA), ant colony optimization for continuous domains (ACOR) and gene expression programming (GEP) are utilized to develop an integrated approach to portfolio optimization problems. The feasibility and effectiveness of the proposed procedure are verified through a case study on investing in stocks in the semiconductor sub-section of the Taiwan stock market over the period from November 1, 2007 to July 8, 2011. The results show that the average six-month return on investment (ROI) reaches a very high level of 13.12%, and that the ROI in the worst case, expressed on a yearly basis, is still higher than the normal yearly interest rate of a fixed deposit for six to nine months in Taiwan. Next, the experimental results indicate that the third stage of the proposed portfolio optimization procedure does help investors determine the timing for buying and selling stocks, thus avoiding a substantial investment loss and eventually making a superior profit. Furthermore, the proposed procedure can help investors make profits even when the overall stock market suffers a loss.

The present study makes four main contributions to the literature. First, it proposes a systematic procedure for portfolio optimization using DEA, ACOR and GEP, based on data collected from financial reports and stock markets. Second, it can help an investor to rapidly screen the stocks with the most profitable potential, even when he or she lacks sufficient financial knowledge. Third, it can automatically determine the optimal investment proportion of each stock to minimize the investment risk while satisfying the target return on investment set by the investor. Fourth, it addresses the scarcity of discussion in the literature about the timing for buying and selling stocks by providing a set of transaction rules based on the actual and forecasted stock prices.


This paper may be cited as:

Hsu, C. M., 2014. An Integrated Procedure for Resolving Portfolio Optimization Problems using Data Envelopment Analysis, Ant Colony

Optimization and Gene Expression Programming. International Journal of