IJCSBI.ORG
ISSN: 1694-2108 | Vol. 9, No. 1. JANUARY 2014 45
An Integrated Procedure for Resolving
Portfolio Optimization Problems using
Data Envelopment Analysis, Ant
Colony Optimization and Gene
Expression Programming
Chih-Ming Hsu
Minghsin University of Science and Technology 1 Hsin-Hsing Road, Hsin-Fong, Hsinchu 304, Taiwan, ROC
ABSTRACT
The portfolio optimization problem is an important issue in the field of investment/financial decision-making and is currently receiving considerable attention from both researchers and practitioners. In this study, an integrated procedure using data envelopment analysis (DEA), ant colony optimization (ACO) for continuous domains and gene expression programming (GEP) is proposed. The procedure is evaluated through a case study on investing in stocks in the semiconductor sub-section of the Taiwan stock market. The potential average six-month return on investment of 13.12% from November 1, 2007 to July 8, 2011 indicates that the proposed procedure can be considered a feasible and effective tool for making outstanding investment plans. Moreover, it is a strategy that can help investors make profits even when the overall stock market suffers a loss. The proposed procedure can help an investor rapidly screen the stocks with the greatest profit potential and can automatically determine the optimal investment proportion of each stock to minimize the investment risk while satisfying the target return on investment set by the investor. Furthermore, this study addresses the scarcity of discussion in the literature about the timing for buying/selling stocks by providing a set of transaction rules.
Keywords
Portfolio optimization, Data envelopment analysis, Ant colony optimization, Gene expression programming.
1. INTRODUCTION
solution. Although quadratic programming can be used to solve the problem with a reasonably small number of different assets, it becomes much more difficult if the number of assets is increased or if additional constraints, such as cardinality constraints, bounding constraints or other real-world requirements, are introduced.
Therefore, various approaches for tackling portfolio optimization problems
using heuristic techniques have been proposed. For example,
optimization solvers including LOQO and CPLEX. Woodside-Oriakhi et al. [8] applied GAs, tabu search (TS) and simulated annealing (SA) to find the efficient frontier in financial portfolio optimization that extends the Markowitz mean-variance model to consider the discrete restrictions of buy-in thresholds and cardinality constraints. The performance of their methods was tested using publicly available data sets drawn from seven major market indices. The implementation results indicated that the proposed methods could yield better solutions than previous heuristics in the literature. Chang and Shi [9] proposed a two-stage process for constructing a stock portfolio. In the first stage, the investment satisfied capability index (ISCI) was used to evaluate individual stock performance. In the second stage, a PSO algorithm was applied to find the optimal allocation of capital investment for each stock in the portfolio. The results of an experiment on investing in the Taiwan stock market from 2005 to 2007 showed that the accumulated returns on investment (ROIs) of the portfolios constructed by their proposed approach were higher than the ROIs of the Taiwan Weighted Stock Index (TWSI) portfolios. Sadjadi et al. [10] proposed a framework for formulating and solving the cardinality constrained portfolio problem with uncertain input parameters. The problem formulation was based on recent advancements in robust optimization and was solved using GAs. Their proposed method was examined on several well-known benchmark data sets including the Hang Seng 31 (Hong Kong), DAX 100 (Germany), FTSE 100 (UK), S&P
100 (USA), and Nikkei 225 (Japan). The results indicated that D-norm
performs better than Lp-norm with relatively lower CPU time for the
et al. [13] proposed an approach for resolving the portfolio selection problem based on quantum-behaved particle swarm optimization (QPSO). The proposed QPSO model was employed to select the best portfolio among the top 50 companies on the Tehran Stock Exchange, with the aims of optimizing the rate of return, systematic and non-systematic risks, return skewness, liquidity and the Sharpe ratio. The comparison with the traditional Markowitz and genetic algorithm models revealed that the return of the portfolio obtained by the QPSO was smaller than that of Markowitz's classic model. However, the QPSO can decrease risk and provide more versatile portfolios than the other models.
The above-mentioned studies indicate that soft computing techniques, such as GAs, PSO and ACO, are an effective and efficient way to address portfolio optimization problems. However, the concerns and interests of investors also need to be considered. First, the total number of stocks that investors can consider for their investment portfolio is usually extremely large. Therefore, investors usually focus on a few stocks, chosen according to their experience or their principles for selecting stocks that have the potential to make profits. Second, most investors are interested in minimizing downside risk since the returns of stocks may not be normally distributed. Unfortunately, there is relatively little research on downside risk compared to the research that measures risk through the conventional variance used in the traditional Markowitz mean-variance model. Third, investors usually buy and sell their chosen stocks several times during their investment planning horizon. Here again, research regarding the timing of buying/selling stocks is scant.
2. PROBLEM FORMULATION
This study concentrates on the cardinality constrained portfolio optimization problem, which is a variant of the Markowitz mean-variance model where
the portfolio can include at most c different assets. In addition, the minimum
proportion of the total investment of each asset contained in the portfolio is also considered to reflect the fact that an investor usually sets a minimum investment threshold for each asset held. Notably, the study measures the variance (risk) of an asset by using the below-mean semi variance [14] to reflect that only downside risk is relevant to an investor and that asset returns may not be normally distributed. First, some notations are defined, as follows:
$N$: the total number of assets available;
$no$: the total number of periods considered;
$r_i^t$: the return of asset $i$ in period $t$ ($i = 1, 2, \ldots, N$; $t = 1, 2, \ldots, no$);
$mr_i$: the expected (mean) return of asset $i$ ($i = 1, 2, \ldots, N$);
$\rho_{ij}$: the correlation coefficient between assets $i$ and $j$ ($i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, N$);
$r^*$: the expected portfolio return;
$c$: the maximum number of assets in the portfolio;
$w_{\min}$: the minimum proportion of the total investment held in asset $i$, if any investment is made in asset $i$ ($i = 1, 2, \ldots, N$);
$\delta_i$: the decision variable that represents whether asset $i$ ($i = 1, 2, \ldots, N$) is held in the portfolio ($\delta_i = 1$) or not ($\delta_i = 0$).
The below-mean semi variance for asset $i$ can then be calculated as follows [14]:
$$SV_i^m = \frac{1}{no} \sum_{t=1}^{no} \left[ \max\left(0,\; mr_i - r_i^t\right) \right]^2, \quad i = 1, \ldots, N. \tag{1}$$
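As a minimal sketch of Eq. (1), the below-mean semi variance averages the squared shortfalls of the period returns below their mean. The function name and the sample returns are illustrative assumptions, not values from the study.

```python
def below_mean_semivariance(returns):
    """Eq. (1): mean of squared shortfalls of each period return below the mean return."""
    n_periods = len(returns)
    mean_return = sum(returns) / n_periods
    return sum(max(0.0, mean_return - r) ** 2 for r in returns) / n_periods


# Hypothetical weekly returns of one asset (illustrative values only).
weekly_returns = [0.02, -0.01, 0.03, -0.04]
sv = below_mean_semivariance(weekly_returns)
```

Only the two periods whose returns fall below the mean contribute to the result; returns above the mean are ignored, which is what distinguishes this measure from the ordinary variance.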
Hence, the cardinality constrained portfolio optimization problem considered in this study is formulated as shown below:
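$$\text{Minimize} \quad \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \, SV_i^m \, SV_j^m \, \rho_{ij} \tag{2}$$
subject to
$$\sum_{i=1}^{N} w_i \, mr_i \ge r^* \tag{3}$$
$$\sum_{i=1}^{N} w_i = 1 \tag{4}$$
$$w_{\min} \delta_i \le w_i \le \delta_i, \quad i = 1, 2, \ldots, N \tag{5}$$
$$\sum_{i=1}^{N} \delta_i \le c \tag{6}$$
$$\delta_i = 0 \text{ or } 1, \quad i = 1, 2, \ldots, N. \tag{7}$$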
Eq. (2) intends to minimize the volatility (variance or risk) associated with the portfolio. Eq. (3) ensures that the portfolio yields an expected return of at least $r^*$. Eq. (4) ensures that the investment proportions sum to one, while a minimum investment threshold is considered to restrict asset investments as shown in Eq. (5). Of particular importance is Eq. (5), which enforces that the resulting proportion $w_i$ is zero if asset $i$ is not held in the portfolio, i.e. $\delta_i = 0$, and that the investment proportion $w_i$ cannot be less than the minimum proportion $w_{\min}$ if asset $i$ is held, i.e. $\delta_i = 1$. Eq. (6) is the cardinality constraint that ensures the total number of assets in the portfolio does not exceed the maximum allowable number $c$. Finally, Eq. (7) is the
3. METHODOLOGY ISSUES
3.1 Data Envelopment Analysis
Data envelopment analysis (DEA) is a method for measuring the relative efficiencies of a set of similar decision making units (DMUs) through an evaluation of their inputs and outputs. The two popular DEA models are the CCR model developed by Charnes et al. [15] and the BCC model proposed by Banker et al. [16]. In addition, DEA models can have an input or output orientation. In this study, the objective of applying DEA to portfolio optimization is to screen companies within a given industry on the basis of their financial performance. Since the goal is to measure the underlying financial strength of companies whose scale sizes may differ, the input-oriented CCR model is more appropriate than the output-oriented BCC model. Furthermore, it is easier to reduce the input quantities than to increase the output quantities. Hence, the input-oriented CCR model is
applied here. Suppose the goal is to evaluate the efficiency of $d$ independent DMUs relative to each other based on their common $m$ inputs and $s$ outputs. The input-oriented CCR model for evaluating the performance $h_0$ of DMU$_0$ can be formulated as follows:
$$\text{Maximize} \quad h_0 = \frac{\sum_{r=1}^{s} u_r y_{r0}}{\sum_{i=1}^{m} v_i x_{i0}} \tag{8}$$
subject to
$$\frac{\sum_{r=1}^{s} u_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1, \quad j = 1, 2, \ldots, d \tag{9}$$
$$u_r \ge 0, \quad r = 1, 2, \ldots, s \tag{10}$$
$$v_i \ge 0, \quad i = 1, 2, \ldots, m \tag{11}$$
where $x_{ij}$ ($\ge 0$) and $y_{rj}$ ($\ge 0$) represent the $i$th input and the $r$th output of DMU$_j$, respectively; and $v_i$ and $u_r$ denote the weight given to input $i$ and output $r$, respectively.
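The fractional program in Eqs. (8) to (11) is commonly solved through an equivalent linear program, obtained by fixing the weighted input of DMU$_0$ to one (the Charnes-Cooper transformation). The source does not show this step, so the following is only a sketch of the usual linear form, written with the same symbols:

```latex
\begin{aligned}
\text{Maximize} \quad & h_0 = \sum_{r=1}^{s} u_r y_{r0} \\
\text{subject to} \quad & \sum_{i=1}^{m} v_i x_{i0} = 1, \\
& \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, 2, \ldots, d, \\
& u_r \ge 0, \; r = 1, 2, \ldots, s; \qquad v_i \ge 0, \; i = 1, 2, \ldots, m.
\end{aligned}
```

One such linear program is solved per DMU, each time with that DMU taking the role of DMU$_0$.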
3.2 Ant Colony Optimization for Continuous Domains
approaches, the ACO approach of Socha [17] is closest to the spirit of ACO for discrete problems [18].
Suppose a population with a cardinality of $k$ is used to solve a continuous optimization problem with $n$ dimensions. The Gaussian function is usually used as the probability density function (PDF) to estimate the distribution of each member (ant) in the solution population. For the $i$th dimension, the $j$th Gaussian function, with mean value $\mu_j^i$ and standard deviation $\sigma_j^i$, that is derived from the $j$th member of the population with a cardinality of $k$, is represented by:
$$g_j^i(x) = \frac{1}{\sigma_j^i \sqrt{2\pi}} \, e^{-\frac{(x - \mu_j^i)^2}{2 (\sigma_j^i)^2}}, \quad i = 1, \ldots, n; \; j = 1, \ldots, k; \; x \in \mathbb{R}. \tag{12}$$
Hence, an ant can choose a value for dimension $i$ by using a Gaussian kernel, which is a weighted superposition of several Gaussian functions, defined as:
$$G^i(x) = \sum_{j=1}^{k} w_j \, g_j^i(x), \quad i = 1, \ldots, n; \; x \in \mathbb{R} \tag{13}$$
where $w_j$ is the weight associated with the $j$th member of the population in the mixture [18]. All solutions in the population are first ranked based on their fitness with rank 1 for the best solution, and the associated weight of the $j$th member of the population in the mixture is calculated by:
$$w_j = \frac{1}{q k \sqrt{2\pi}} \, e^{-\frac{(r - 1)^2}{2 q^2 k^2}}, \quad j = 1, \ldots, k \tag{14}$$
where $r$ is the rank of the $j$th member and $q$ ($q > 0$) is a parameter of the algorithm [18]. Furthermore, each ant $j$ must choose one of the Gaussian functions ($g_1^1, g_2^1, \ldots, g_j^1, \ldots, g_k^1$) for the first dimension [18], i.e. the first construction step, with the probability:
$$p_j = \frac{w_j}{\sum_{l=1}^{k} w_l}, \quad j = 1, \ldots, k. \tag{15}$$
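The rank-based weights of Eq. (14) and the selection probabilities of Eq. (15) can be sketched as follows. The function names and the values of `k` and `q` are illustrative assumptions; the list index corresponds to rank minus one.

```python
import math


def rank_weights(k, q):
    """Eq. (14): Gaussian weight of the solution holding rank r = 1, ..., k."""
    return [
        math.exp(-((r - 1) ** 2) / (2.0 * q * q * k * k)) / (q * k * math.sqrt(2.0 * math.pi))
        for r in range(1, k + 1)
    ]


def selection_probabilities(weights):
    """Eq. (15): probability of choosing the Gaussian function of each member."""
    total = sum(weights)
    return [w / total for w in weights]


weights = rank_weights(k=5, q=0.5)  # hypothetical population size and q
probs = selection_probabilities(weights)
```

A small `q` concentrates the probability on the best-ranked solutions, whereas a large `q` spreads it more evenly across the population.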
Suppose the Gaussian function $g_{j^*}^1$ is chosen for the ant $j$ in the first dimension; the Gaussian functions $g_{j^*}^2$ to $g_{j^*}^n$ are then used for the remaining $n-1$ construction steps. In addition, for the $j^*$th Gaussian function in the $i$th dimension, the mean and the standard deviation are set by:
$$\mu_{j^*}^i = x_{j^*}^i, \quad i = 1, \ldots, n, \tag{16}$$
$$\sigma_{j^*}^i = \xi \sqrt{\frac{\sum_{j=1}^{k} \left( x_j^i - x_{j^*}^i \right)^2}{k - 1}}, \quad i = 1, \ldots, n \tag{17}$$
where $x_j^i$ is the value of the $i$th decision variable in solution (ant) $j$ and $\xi \in (0, 1)$ is the parameter that regulates the speed of convergence [18].
Once each ant has completed $n$ construction steps, the worst $s$ solutions in the original population are replaced by the same number of best solutions generated by the search process, thus forming a new solution population. The search process is carried out iteratively until the stopping criteria are satisfied and the near optimal solutions are obtained. The detailed execution steps of the ant colony optimization for continuous domains, denoted by ACO, are summarized as follows:
Step 1: Randomly or by using some principles, create an initial population
consisting of k solutions (ants) with n dimensions.
Step 2: Calculate the fitness of each solution and rank these solutions based on their fitness with rank 1 for the best solution.
Step 3: For each solution $j$, choose one of the Gaussian functions ($g_1^1, g_2^1, \ldots, g_k^1$) for the first dimension, denoted by $g_{j^*}^1$, based on the probability obtained through Eqs. (14) and (15).
Step 4: For each solution $j$, generate a new solution by sampling the Gaussian functions ($g_{j^*}^1, g_{j^*}^2, \ldots, g_{j^*}^n$) whose means and standard deviations are calculated using Eqs. (16) and (17).
Step 5: Replace the worst s solutions in the original population by the same
number of the best solutions generated in Step 4, thus forming a new solution population.
Step 6: If the termination criteria are satisfied, stop the search process and obtain the near optimal solutions. Otherwise, execute Steps 2 to 5 iteratively.
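Steps 1 to 6 can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in the study: the parameter defaults, the search bounds, the fixed cycle budget used as the termination criterion, the sphere test function, and the sample-standard-deviation reading of Eq. (17) are all assumptions.

```python
import math
import random


def aco_continuous(fitness, n, k=20, q=0.5, xi=0.85, s=10,
                   cycles=200, low=-5.0, high=5.0, seed=0):
    """Minimize `fitness` over n dimensions following Steps 1 to 6 (assumed defaults)."""
    rng = random.Random(seed)
    # Step 1: random initial population of k solutions (ants).
    pop = [[rng.uniform(low, high) for _ in range(n)] for _ in range(k)]
    # Eq. (14): weight for rank r; index r_minus_1 = r - 1.
    w = [math.exp(-(r_minus_1 ** 2) / (2 * q * q * k * k)) / (q * k * math.sqrt(2 * math.pi))
         for r_minus_1 in range(k)]
    for _ in range(cycles):
        # Step 2: rank the solutions, best (lowest fitness) first.
        pop.sort(key=fitness)
        new = []
        for _ in range(k):
            # Step 3: pick the guiding solution j*; choices() normalizes the
            # weights, which corresponds to the probabilities of Eq. (15).
            guide = rng.choices(pop, weights=w, k=1)[0]
            child = []
            for i in range(n):
                # Step 4: mean from Eq. (16), standard deviation from Eq. (17).
                mu = guide[i]
                sigma = xi * math.sqrt(sum((x[i] - mu) ** 2 for x in pop) / (k - 1))
                child.append(rng.gauss(mu, sigma))
            new.append(child)
        # Step 5: the s best new solutions replace the s worst old ones.
        new.sort(key=fitness)
        pop = pop[:k - s] + new[:s]
    # Step 6: the fixed number of cycles acts as the termination criterion.
    return min(pop, key=fitness)


def sphere(x):
    """Hypothetical test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best = aco_continuous(sphere, n=2)
```

Because the best `k - s` solutions always survive Step 5, the best fitness in the population never deteriorates from one cycle to the next.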
3.3 Gene Expression Programming
Gene expression programming (GEP) first developed by Ferreira [19] is an evolutionary methodology, based on the principles of Darwinian natural selection and biologically inspired operations, to evolve populations of computer programs in order to solve a user-defined problem. In GEP, the genes consist of a head containing symbols to represent both functions
(elements from the function set F) and terminals (elements from the
terminal set T), and a tail containing only terminals. Suppose, for a problem,
the number of arguments in the function with the most arguments is $n_{\max}$ and the length of the head is $h$. Then, the length of the tail $t$ is evaluated by the equation:
$$t = h \, (n_{\max} - 1) + 1. \tag{18}$$
As an example, consider a gene composed of [Q, ×, ÷, -, +, a, b] where the number of arguments in the function with the most arguments is 2. If the length of the head $h$ is set as 10, the length of the tail $t$ can be obtained as 11, i.e. $10 \times (2 - 1) + 1$, and the length of the gene is 21, i.e. $10 + 11$. One such gene is illustrated as follows:
b b a a a b a b b a b
Q a b a b Q
0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
(19)
where the tail is shown in bold and “Q” represents the square root function. The above gene (genotype) can be represented by an expression tree (phenotype) as shown in Figure 1 and decoded as follows:
]
[b a b
ab . (20)
The general execution steps of GEP are presented by Ferreira [19], and are briefly summarized as follows:
Step 1: Randomly generate an initial population of chromosomes.
Step 2: Express the chromosomes and evaluate the fitness of each individual.
Step 3: Select chromosomes from the population using a random probability based on the fitness and replicate the selected chromosomes.
Step 4: Randomly apply genetic operators to the replicated chromosomes in Step 3, thus creating the next generation. The genetic operators include mutation, IS (insertion sequence) transposition, RIS (root insertion sequence) transposition, gene transposition, one-point recombination, two-point recombination and gene recombination.
Step 5: When the termination criterion is satisfied, the outcome is designated as the final result of the run. Otherwise, Steps 2 to 4 are executed iteratively.
Figure 1. An example of the expression tree in GEP
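The mapping from a gene (genotype) to an expression tree (phenotype) can be sketched as a breadth-first decoding: the first symbol is the root, and each function takes its arguments from the next unused positions. The short gene below is a hypothetical example with a head of length 3, not the gene illustrated above; ASCII `*` and `/` stand in for × and ÷, and the names are assumptions.

```python
import math

# Hypothetical function set: symbol -> number of arguments (Q is the square root).
ARITY = {"Q": 1, "+": 2, "-": 2, "*": 2, "/": 2}


def tail_length(head_length, max_arity):
    """Tail length: head length times (largest arity - 1), plus 1."""
    return head_length * (max_arity - 1) + 1


def evaluate_gene(gene, env):
    """Decode a gene breadth-first into an expression tree and evaluate it."""
    # Breadth-first pass: record the child positions of every symbol in use.
    children, next_free, pos = [], 1, 0
    while pos < next_free:
        arity = ARITY.get(gene[pos], 0)  # terminals take no arguments
        children.append(list(range(next_free, next_free + arity)))
        next_free += arity
        pos += 1

    def value(p):
        sym = gene[p]
        args = [value(child) for child in children[p]]
        if sym == "Q":
            return math.sqrt(args[0])
        if sym == "+":
            return args[0] + args[1]
        if sym == "-":
            return args[0] - args[1]
        if sym == "*":
            return args[0] * args[1]
        if sym == "/":
            return args[0] / args[1]
        return env[sym]  # terminal: look up its value

    return value(0)


gene = list("+Q*abab")  # head "+Q*" (h = 3), tail "abab" (t = 4)
result = evaluate_gene(gene, {"a": 4.0, "b": 3.0})  # sqrt(a) + b * a
```

Symbols at the end of the gene that are never reached by the decoding are simply unused, which is why the tail length rule guarantees that every gene decodes to a valid expression.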
4. PROPOSED PORTFOLIO OPTIMIZATION PROCEDURE
The proposed optimization procedure comprising three stages is described in the following sub-sections.
4.1 Selection of Stocks
assets, total equity, cost of sales and operating expenses are defined as inputs in the DEA model and two variables including net sales and net
income are defined as outputs. This is in line with previous studies [20–22].
Next, the input-oriented CCR model is applied to evaluate the underlying fundamental financial strength of companies (DMUs) by using the financial data collected from the financial reports, which consist of the four inputs and two outputs. The companies are then ranked based on their efficiency scores with the highest score as rank 1. In addition, companies with the same efficiency score are further ranked based on their earnings per share (EPS) in descending order. Hence, the companies ranked 1 through $c$ are selected as the essential candidate companies (stocks) for the investment portfolio, where $c$ is the maximum allowable number of assets in the portfolio, as shown in Eq. (6).
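The ranking rule (efficiency score first, EPS as the tie-breaker, both in descending order) can be sketched as a two-key sort. The company codes, scores and EPS values below are hypothetical, as are the function and field names.

```python
def select_candidates(companies, c):
    """Rank by efficiency score, break ties by EPS (both descending), keep the top c."""
    ranked = sorted(companies, key=lambda co: (-co["efficiency"], -co["eps"]))
    return [co["code"] for co in ranked[:c]]


# Hypothetical companies (illustrative values only).
companies = [
    {"code": "A", "efficiency": 1.00, "eps": 2.5},
    {"code": "B", "efficiency": 0.92, "eps": 9.0},
    {"code": "C", "efficiency": 1.00, "eps": 7.1},
    {"code": "D", "efficiency": 0.85, "eps": 3.3},
]
selected = select_candidates(companies, c=3)
```

Here C outranks A because both are fully efficient and C has the higher EPS, while B is ranked third despite having the highest EPS because its efficiency score is lower.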
4.2 Optimization of a Portfolio
In the second stage, the ACO algorithm is applied to select the final stocks in the investment portfolio, as well as to optimize the investment proportion of each selected stock. First, the expected weekly return of stock $i$, i.e. $mr_i$ in Eq. (3), the below-mean semi variance for stock $i$, i.e. $SV_i^m$ in Eq. (2), and the correlation coefficient between stocks $i$ and $j$, i.e. $\rho_{ij}$ in Eq. (2), are calculated based on the weekly trading data in the stock market. Next, the ACO algorithm presented in Section 3.2 is used to resolve the cardinality constrained portfolio optimization problem as formulated in Eqs. (2) to (7). Since the number of companies with superior financial strength selected in the previous stage exactly equals $c$, the cardinality constraint in Eq. (6) is fulfilled. In addition, the constraint regarding the expected return in Eq. (3) is incorporated into the objective function in Eq. (2) as a penalty term. Hence, the objective function to be minimized in ACO is defined as follows:
$$f_{ACO} = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \, SV_i^m \, SV_j^m \, \rho_{ij} + M \times \max\left\{ r^* - \sum_{i=1}^{N} w_i \, mr_i, \; 0 \right\} \tag{21}$$
where $M$ is a very large number that represents the penalty applied when the portfolio cannot yield an expected return of at least the desired level $r^*$, as shown in Eq. (3). In addition, the $j$th solution $(x_j^1, x_j^2, \ldots, x_j^c)$ obtained from ACO, i.e. the $j$th ant in the solution population with a cardinality of $k$, is modified according to the following equation:
$$y_j^i = \begin{cases} x_j^i, & \text{if } x_j^i \ge w_{\min} \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, c; \; j = 1, \ldots, k. \tag{22}$$
Therefore, the $j$th solution ($j = 1, \ldots, k$) in ACO can now be transformed
$$w_i = \frac{y_j^i}{\sum_{l=1}^{c} y_j^l}, \quad i = 1, 2, \ldots, c. \tag{23}$$
In this manner, all $w_i$ lie between $w_{\min}$ and 1, and the sum of the $w_i$ in each solution equals one, i.e. $\sum_{i=1}^{c} w_i = 1$; thus the constraints in Eqs. (4), (5) and (7) are met.
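The thresholding and normalization of Eqs. (22) and (23) and the penalized objective of Eq. (21) can be sketched as below. The function names, the error raised when no entry reaches the threshold, and all numerical inputs are illustrative assumptions rather than details from the study.

```python
def repair(x, w_min):
    """Eqs. (22)-(23): zero the entries below w_min, then normalize to sum to one."""
    y = [xi if xi >= w_min else 0.0 for xi in x]
    total = sum(y)
    if total == 0.0:
        # Assumption: the source does not say how an all-zero solution is handled.
        raise ValueError("no asset reaches the minimum proportion")
    return [yi / total for yi in y]


def f_aco(w, sv, rho, mr, r_star, big_m):
    """Eq. (21): portfolio risk plus a penalty for falling short of the target return."""
    n = len(w)
    risk = sum(w[i] * w[j] * sv[i] * sv[j] * rho[i][j]
               for i in range(n) for j in range(n))
    shortfall = max(r_star - sum(w[i] * mr[i] for i in range(n)), 0.0)
    return risk + big_m * shortfall


# Hypothetical three-asset example (illustrative values only).
w = repair([0.5, 0.05, 0.3], w_min=0.1)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
value = f_aco(w, sv=[0.01, 0.02, 0.03], rho=identity,
              mr=[0.01, 0.02, 0.0], r_star=0.01, big_m=1000.0)
```

In this example the second asset is dropped because 0.05 is below the threshold, and the penalty term dominates the objective because the repaired portfolio returns less than the target.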
4.3 Buying/Selling of Stocks
In the last stage, the GEP technique is utilized to forecast stock closing prices and transaction rules are designed to determine the optimal timing for buying/selling stocks. First, fifteen technical indicators including (1) 10-day
moving average, (2) 20-day bias, (3) moving average
convergence/divergence, (4) 9-day stochastic indicator K, (5) 9-day stochastic indicator D, (6) 9-day Williams overbought/oversold index, (7) 10-day rate of change, (8) 5-day relative strength index, (9) 24-day commodity channel index, (10) 26-day volume ratio, (11) 13-day psychological line, (12) 14-day plus directional indicator, (13) 14-day minus directional indicator, (14) 26-day buying/selling momentum indicator and (15) 26-day buying/selling willingness indicator are calculated based on the historical stock trading data. These indicators serve as the input variables of the GEP forecasting models, which is in line with previous studies [23–28]. The technical indicators on the last trading day of each week, along with the closing price on the last trading day of the following week, are then randomly partitioned into training and test data based on a pre-specified proportion, e.g., 4:1. Next, the GEP algorithm is utilized to construct several forecasting models and an optimal forecasting model is determined based on simultaneously minimizing the root mean squared errors (RMSEs) of the
training and test data, named ModelGEP. Let $p_i$ represent the closing price on the last trading day of the current week and let $\hat{p}_i$ represent the forecasted closing price on the last trading day of the next week for stock $i$. Four transaction rules can then be designed as follows:
(1) IF (Stock $i$ is held) AND ($\hat{p}_i > p_i$), THEN (Do not take any action);
(2) IF (Stock $i$ is held) AND ($\hat{p}_i \le p_i$), THEN (Sell stock $i$ on the next trading day);
(3) IF (Stock $i$ is not held) AND ($\hat{p}_i > p_i$), THEN (Buy stock $i$ on the next trading day);
(4) IF (Stock $i$ is not held) AND ($\hat{p}_i \le p_i$), THEN (Do not take any action).
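The four rules reduce to a small decision function on the holding status and the direction of the forecast. In this sketch the function name and return labels are hypothetical, and treating a forecast equal to the current price as "not rising" is an assumption.

```python
def decide(held, forecast_price, current_price):
    """Return "buy", "sell" or "hold" according to the four transaction rules."""
    rising = forecast_price > current_price  # assumption: a tie counts as not rising
    if held and not rising:
        return "sell"  # rule (2): sell on the next trading day
    if not held and rising:
        return "buy"   # rule (3): buy on the next trading day
    return "hold"      # rules (1) and (4): do not take any action


# Hypothetical prices (illustrative values only).
action_when_held = decide(held=True, forecast_price=9.0, current_price=10.0)
action_when_not_held = decide(held=False, forecast_price=11.0, current_price=10.0)
```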
Using these rules and the forecasted closing stock price obtained by the
ModelGEP, an investor can make buy/sell decisions for each stock on the last
5. CASE STUDY
In this section, a case study on investing in stocks in the semiconductor sub-section of Taiwan’s stock market is presented.
5.1 Selecting Potential Stocks
According to the Securities and Exchange Act of Taiwan, the third-quarterly financial report and the annual financial report of a listed company must be announced before October 31st of the current year and before April 30th of the next year, respectively. Hence, the financial data obtained from the third-quarterly financial report are used to plan the investment during the period from November 1st of the current year to April 30th of the next year, and the financial data obtained from the annual financial report are used to arrange the investment plan from May 1st to October 31st of the current year. The release time of financial reports, the types of financial reports, the corresponding investment planning horizons and the periods of collecting ROI and trading data in this study are summarized in Table 1. Seven financial variables described in Section 4.1 are first collected from the Taiwan Economic Journal (TEJ) database at each release time of the financial report as listed in Table 1. Taking the fifth case in Table 1 as an example, there were 65 listed companies in the semiconductor sub-section of Taiwan’s stock market on October 31, 2009. The input-oriented CCR model is then applied to the remaining 48 listed companies to evaluate their underlying fundamental financial strength by using DEA-Solver Learning Version 3.0 (http://www.saitech-inc.com) software. Therefore, the best ten companies, ranked by using their efficiency scores as the first priority and their EPS as the second priority, are selected as the essential candidate companies (stocks) in the investment portfolio as listed in Table 2 (Case 5). By following the above procedure, the essential candidate stocks in the investment portfolios for the other cases in Table 1 are obtained, as listed in Table 2.
Table 1. Release time of financial reports, investment planning horizons and periods of data collection
Case No. | Release time of the financial report (type of the financial report) | Investment planning horizon | Collection period for ROI and trading data
1 | 2007/10/31 (Third-quarterly report of 2007) | 2007/11/01~2008/04/30 | 2006/11/01~2007/10/31
Table 2. Essential candidate stocks in the investment portfolio
Each cell lists: stock code, efficiency score, EPS.

Rank | Case 1 | Case 2 | Case 3
1 | 2454, 1.00, 26.48 | 2454, 1.00, 32.59 | 2454, 1.00, 15.31
2 | 6286, 1.00, 11.02 | 6286, 1.00, 14.98 | 3519, 1.00, 11.00
3 | 3034, 1.00, 10.45 | 3034, 1.00, 14.02 | 3579, 1.00, 10.64
4 | 6239, 1.00, 7.88 | 6239, 1.00, 11.08 | 6286, 1.00, 7.92
5 | 2451, 1.00, 7.28 | 2451, 1.00, 7.78 | 6239, 1.00, 7.81
6 | 3443, 1.00, 4.52 | 3532, 1.00, 6.70 | 3443, 1.00, 4.74
7 | 2441, 1.00, 3.71 | 3443, 1.00, 6.41 | 2451, 1.00, 4.12
8 | 8131, 1.00, 3.09 | 2441, 1.00, 5.07 | 3588, 1.00, 4.07
9 | 2473, 1.00, 2.45 | 2330, 1.00, 4.14 | 2330, 1.00, 3.36
10 | 6145, 1.00, 0.01 | 8131, 1.00, 4.11 | 2441, 1.00, 2.76

Rank | Case 4 | Case 5 | Case 6
1 | 2454, 1.00, 18.01 | 2454, 1.00, 26.04 | 2454, 1.00, 34.12
2 | 3579, 1.00, 14.16 | 6286, 1.00, 7.75 | 6286, 1.00, 10.93
3 | 6239, 1.00, 10.38 | 2451, 1.00, 7.11 | 2451, 1.00, 10.42
4 | 6286, 1.00, 10.05 | 6239, 1.00, 4.92 | 6239, 1.00, 7.44
5 | 3443, 1.00, 6.05 | 6145, 1.00, 2.84 | 2330, 1.00, 3.45
6 | 2451, 1.00, 5.72 | 3041, 1.00, 2.51 | 3041, 1.00, 3.23
7 | 3588, 1.00, 5.05 | 2330, 1.00, 2.19 | 3443, 1.00, 3.15
8 | 2330, 1.00, 3.86 | 2441, 1.00, 1.73 | 6145, 1.00, 3.13
9 | 2441, 1.00, 3.10 | 2473, 1.00, 1.29 | 3579, 1.00, 2.89
10 | 3532, 1.00, 2.54 | 3443, 1.00, 1.07 | 2441, 1.00, 2.74

Rank | Case 7 | Case 8
1 | 2454, 1.00, 24.95 | 2454, 1.00, 28.44
2 | 6286, 1.00, 11.82 | 6286, 1.00, 14.60
3 | 6239, 1.00, 8.37 | 6239, 1.00, 10.89
4 | 2330, 1.00, 4.67 | 3579, 1.00, 9.02
5 | 5471, 1.00, 4.15 | 2330, 1.00, 6.24
6 | 3443, 1.00, 3.42 | 4919, 1.00, 4.13
7 | 2351, 1.00, 3.14 | 2451, 1.00, 3.48
8 | 6202, 1.00, 3.05 | 8131, 1.00, 3.46
9 | 2451, 1.00, 2.79 | 8271, 1.00, 2.92
10 | 8131, 1.00, 2.38 | 2473, 1.00, 2.22
5.2 Optimizing the Portfolio
In order to select the final stocks in the investment portfolio and optimize their investment proportions, the research first collects the weekly ROI of each essential candidate stock listed in Table 2 from the TEJ database. The collection period for the ROI data is the previous 12 months starting from the release time of the financial report (see Table 1). Following the data
collection, the expected weekly return of stock $i$, i.e. $mr_i$ in Eq. (3), the below-mean semi variance for stock $i$, i.e. $SV_i^m$ in Eq. (2), and the correlation coefficient between stocks $i$ and $j$, i.e. $\rho_{ij}$ in Eq. (2), can be calculated.
Next, the ACO algorithm coded in the C++ programming language is used to resolve the portfolio optimization problem as formulated in Eqs. (2) to (7) where the minimum proportion of each stock held, i.e. $w_{\min}$ in Eq. (5),
is set as 10. The expected portfolio return, i.e. $r^*$ in Eq. (3), is set as the maximum of the average weekly ROI of the stock market over the last twelve months and the weekly interest rate of a fixed deposit for six to nine months announced by the Bank of Taiwan, to reflect the activeness of investors. In addition, the objective function in ACO is given by Eq. (21) in Section 4.2, where the parameter $M$ is set as 1,000. To find the optimal settings of the key parameters in ACO, including $k$ (cardinality, i.e. the total number of ants), $q$, $\xi$, $s$ and $r_{\max}$ (the maximum allowable number of cycles for the ACO algorithm to attempt to improve on its best solution), a preliminary experiment is conducted using a $2^{5-1}$ fractional factorial design for the seventh case in Table 1. Table 3 shows the experimental results obtained by carrying out thirty replications for each combination of parameters, and Table 4 shows the analyzed results. The parameter $k$, the interaction $q \times r_{\max}$ and the interaction $\xi \times r_{\max}$ are automatically selected into the model in ANOVA, as shown in Table 4. According to Table 4, the model is significant at $\alpha = 0.05$. From the effect plots of the parameter $k$, the interaction $q \times r_{\max}$ and the interaction $\xi \times r_{\max}$ graphed in Figure 2, the optimal settings of $k$, $q$, $\xi$ and $r_{\max}$ in ACO are set at 100, 4, 0.9 and 200, respectively. In addition, the parameter $s$, i.e., the total number of worst solutions in the original population replaced by the best solutions generated by the ACO search process, is set as 20. Taking the fifth case in Table 1 as an example, the weekly ROI data of the essential candidate stocks listed in Table 2 (Case 5) are collected from November 1, 2008 to October 31, 2009. The expected weekly return, the below-mean semi variance of each stock, and the correlation coefficient between each pair of stocks are calculated. The ACO search procedure is implemented for 100 runs on a personal computer with an Intel Core 2 Quad 2.66GHz CPU and 2GB RAM, and Table 5 lists the optimal portfolio. The average weekly ROI in the Taiwan stock market from November 1, 2008 to October 31, 2009 is 0.88%, and the weekly interest rate of a fixed deposit for six to nine months announced by the Bank of Taiwan on October 31, 2009 is 0.0142%. Therefore, the expected portfolio return $r^*$ is set as 0.88%. According to the experimental results of the fifth case in Table 5, the portfolio contains five stocks, with codes 2454, 6239, 6145, 2330 and 2441, and their corresponding investment proportions are 0.0857, 0.2592, 0.0868, 0.4822 and 0.0861, respectively. The investment risk (variance) of the portfolio is $1.15 \times 10^{-3}$, and the expected weekly ROI of the portfolio is $1.33 \times 10^{-2}$ (1.33%), which is
corresponding investment proportions, investment risk and expected weekly ROI and CPU time. This information is summarized in Table 5.
Table 3. A preliminary experiment on ACO parameters
No. | k | q | ξ | s | rmax | Mean of fACO | Variance of fACO
1 | 50 | 2 | 0.90 | 10 | 10 | 3.18×10^-4 | 4.66×10^-9
2 | 100 | 2 | 0.90 | 10 | 10 | 2.92×10^-4 | 3.19×10^-9
3 | 50 | 4 | 0.90 | 10 | 10 | 3.39×10^-4 | 5.18×10^-9
4 | 100 | 4 | 0.90 | 10 | 10 | 2.98×10^-4 | 4.88×10^-9
5 | 50 | 2 | 0.99 | 10 | 10 | 3.01×10^-4 | 3.47×10^-9
6 | 100 | 2 | 0.99 | 10 | 10 | 2.92×10^-4 | 3.74×10^-9
7 | 50 | 4 | 0.99 | 10 | 10 | 3.20×10^-4 | 5.33×10^-9
8 | 100 | 4 | 0.99 | 10 | 10 | 2.75×10^-4 | 2.06×10^-9
9 | 50 | 2 | 0.90 | 20 | 20 | 3.11×10^-4 | 3.46×10^-9
10 | 100 | 2 | 0.90 | 20 | 20 | 2.95×10^-4 | 3.74×10^-9
11 | 50 | 4 | 0.90 | 20 | 20 | 2.77×10^-4 | 3.93×10^-9
12 | 100 | 4 | 0.90 | 20 | 20 | 3.10×10^-4 | 3.92×10^-9
13 | 50 | 2 | 0.99 | 20 | 20 | 3.20×10^-4 | 3.72×10^-9
14 | 100 | 2 | 0.99 | 20 | 20 | 2.90×10^-4 | 4.34×10^-9
15 | 50 | 4 | 0.99 | 20 | 20 | 3.11×10^-4 | 5.12×10^-9
16 | 100 | 4 | 0.99 | 20 | 20 | 2.80×10^-4 | 3.62×10^-9
Table 4. ANOVA for the preliminary experiment on ACO parameters
Source | Sum of squares | d.f. | Mean square | F value | Significance
Model | 9.13×10^-8 | 6 | 1.52×10^-8 | 3.75 | 0.0012
k | 5.06×10^-8 | 1 | 5.06×10^-8 | 12.48 | 0.0005
q | 8.16×10^-11 | 1 | 8.16×10^-11 | 0.02 | 0.8872
ξ | 5.03×10^-9 | 1 | 5.03×10^-9 | 1.24 | 0.2661
rmax | 1.62×10^-9 | 1 | 1.62×10^-9 | 0.40 | 0.5272
q × rmax | 1.59×10^-8 | 1 | 1.59×10^-8 | 3.93 | 0.0479
ξ × rmax | 1.81×10^-8 | 1 | 1.81×10^-8 | 4.45 | 0.0353
Residual | 1.92×10^-6 | 473 | 4.05×10^-9 | |
Lack of Fit | 5.04×10^-8 | 9 | 5.60×10^-9 | 1.39 | 0.1892
Pure Error | 1.87×10^-6 | 464 | 4.02×10^-9 | |
Corrected Total | 2.01×10^-6 | 479 | | |
Figure 2. Effects of the parameter and interactions: (A) effect of parameter k; (B) effect of interaction q × rmax; (C) effect of interaction ξ × rmax
Table 5. The optimal investment portfolio obtained using ACO
Stock cells list: stock code (investment proportion).

Item | Case 1 | Case 2 | Case 3 | Case 4
Stock | 2454 (0.3503) | 2454 (0.0776) | 3519 (0.1657) | 2454 (0.1978)
Stock | 3034 (0.1985) | 6239 (0.2957) | 6286 (0.1263) | 6286 (0.5055)
Stock | 6239 (0.1538) | 2451 (0.2442) | 6239 (0.0887) | 2451 (0.2213)
Stock | 2451 (0.1218) | 2330 (0.3825) | 3443 (0.1678) | 2330 (0.0754)
Stock | 2441 (0.1756) | - | 2451 (0.1949) | -
Investment risk (variance) | 4.58×10^-4 | 7.15×10^-4 | 1.57×10^-3 | 2.62×10^-3
Expected weekly ROI | 1.00×10^-2 | 2.81×10^-3 | -8.00×10^-3 | -1.02×10^-3
Stock market weekly ROI | 6.31×10^-3 | 2.80×10^-3 | -1.20×10^-2 | -6.62×10^-3
CPU time (sec) of 100 runs | 51.45 | 52.81 | 27.06 | 51.52
Table 5. The optimal investment portfolio obtained using ACO (Continued)

| Item | Case 5 | Case 6 | Case 7 | Case 8 |
|---|---|---|---|---|
| Stock code (investment proportion) | 2454 (0.0857) | 6286 (0.1074) | 2330 (0.8706) | 6286 (0.0850) |
| | 6239 (0.2592) | 6239 (0.2581) | 6202 (0.1294) | 3579 (0.1384) |
| | 6145 (0.0868) | 2330 (0.5226) | - | 2330 (0.5934) |
| | 2330 (0.4822) | 2441 (0.1118) | - | 2451 (0.0709) |
| | 2441 (0.0861) | - | - | 2473 (0.1123) |
| Investment risk (variance) | 1.15×10^-3 | 3.82×10^-4 | 2.86×10^-4 | 2.96×10^-4 |
| Expected weekly ROI | 1.33×10^-2 | 7.85×10^-3 | 2.67×10^-3 | 3.05×10^-3 |
| Stock market weekly ROI | 8.83×10^-3 | 6.13×10^-3 | 2.67×10^-3 | 2.59×10^-3 |
| CPU time (sec) of 100 runs | 50.70 | 51.22 | 54.05 | 51.52 |
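To make the link between the investment proportions and the reported risk and return figures concrete, the following minimal Python sketch evaluates a portfolio variance and expected return under the usual mean-variance definitions. Only the Case 7 weights are taken from Table 5; the covariance matrix, the mean returns and the function names are hypothetical placeholders introduced for illustration, so the printed numbers do not reproduce the values in the table.

```python
# Case 7 investment proportions from Table 5 (stocks 2330 and 6202).
weights = [0.8706, 0.1294]

# Hypothetical inputs, NOT taken from the study: expected weekly ROI of each
# stock and the covariance matrix of their weekly returns.
mean_weekly_roi = [0.003, 0.001]
covariance = [
    [4.0e-4, 1.0e-4],
    [1.0e-4, 9.0e-4],
]


def portfolio_return(w, mu):
    """Expected portfolio return: sum of w_i * mu_i."""
    return sum(wi * mi for wi, mi in zip(w, mu))


def portfolio_variance(w, sigma):
    """Portfolio variance: sum over i, j of w_i * sigma_ij * w_j."""
    n = len(w)
    return sum(w[i] * sigma[i][j] * w[j] for i in range(n) for j in range(n))


# The proportions of a fully invested portfolio should sum to one.
assert abs(sum(weights) - 1.0) < 1e-9

expected_roi = portfolio_return(weights, mean_weekly_roi)
risk = portfolio_variance(weights, covariance)
print(f"expected weekly ROI: {expected_roi:.6f}")
print(f"investment risk (variance): {risk:.3e}")
```

With the covariance matrix and mean returns estimated from the actual weekly data, the same two functions would give the risk and expected weekly ROI rows of the table.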
5.3 Stock Buying and Selling
In this stage, the transaction rules designed in Section 4.3 are used to determine the timing for buying or selling stocks, with the help of stock price forecasting models constructed by the GEP technique. The fifth case in Table 1 is taken as an example. The daily trading data, including the opening price, highest price, lowest price, closing price and trade volume, of the ten essential candidate stocks shown in Table 2 are first collected from the Taiwan Stock Exchange Corporation (TWSE) for the last twelve months starting from the release time of the financial report. The fifteen technical indicators described in Section 4.3 are then calculated for the last trading day of each week. These indicators, paired with the closing price on the last trading day of the following week, are randomly partitioned into training and test data groups in the proportion 4:1. Next, the GEP algorithm, as implemented in the GeneXpro Tools 4.0 software (http://www.gepsoft.com), is employed to construct stock price forecasting models; the fitness of an individual is evaluated through the root mean squared error (RMSE) and the parameters are left at their default values. The GEP algorithm is executed five times, and the best GEP forecasting model, selected on the basis of the training and test RMSEs, is denoted ModelGEP. Next, the fifteen technical indicators for the last trading day of each week in [...] model, thus obtaining the forecasted closing stock price for the last trading day of the next week. With the forecasted closing stock prices, the investor can make buy/sell decisions for each stock on the last trading day of each week based on the four transaction rules presented in Section 4.3.
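The data preparation described above, a random 4:1 partition into training and test groups followed by an RMSE-based assessment, can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `split_train_test` and `rmse`, the fixed random seed and the placeholder samples are hypothetical, and the GEP model fitting itself (performed with GeneXpro Tools in the study) is not reproduced.

```python
import math
import random


def split_train_test(samples, train_ratio=0.8, seed=0):
    """Randomly partition samples into training and test groups (4:1 by default)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted closing prices."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Placeholder weekly samples: each would hold the fifteen technical indicators
# of one week and the closing price of the following week.
weekly_samples = list(range(50))
train, test = split_train_test(weekly_samples)
print(len(train), len(test))

# RMSE on a toy pair of series (not data from the study).
print(rmse([3.0, 4.0], [0.0, 0.0]))
```

A candidate forecasting model would be fitted on `train` only, and `rmse` would then be evaluated on both groups to compare the five GEP runs.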
Here, the initial investment capital is assumed to be one million dollars, and the total investment capital can vary at any time owing to the profit or loss arising from stock transactions made during the investment planning horizon. The stocks are further assumed to be arbitrarily divisible and to be bought or sold exactly at the opening price on the next trading day after the buy/sell decision is made. In addition, all stocks held must be sold on the last trading day of the investment planning horizon. Table 6 illustrates some of the transactions of stock 6239, which is contained in the portfolio listed as the fifth case in Table 5. The closing price on November 6, 2009 is 87.58, which is less than the forecasted closing price of 90.80 for the last trading day of the next week, i.e. November 13, 2009. Hence, based on the third transaction rule in Section 4.3, stock 6239 is bought at the opening price of 88.06 on the next trading day, November 9, 2009. On November 13, 2009, the closing price of 89.79 is again less than the forecasted closing price of 92.37 for the last trading day of the next week; thus no action is taken, in keeping with the first transaction rule. The forecasted closing price for January 22, 2010 is 106.64, which is less than the closing price of 107.78 on January 15, 2010. Therefore, based on the second transaction rule, stock 6239 is sold at the opening price of 106.82 on January 18, 2010, which yields a profit of 18.76 (106.82 - 88.06) per share. The four transaction rules are likewise applied to the other stocks in the portfolio for the fifth case in Table 5, i.e. stocks 2454, 6145, 2330 and 2441. The profit or loss of each stock transaction made during the investment planning horizon is thus obtained, yielding a final return on investment of 11.46%, shown as the ROI1 value for Case 5 in Table 7. By following the same procedure, the returns on investment for the other cases in Table 1 during the investment planning horizon can be obtained; these are the ROI1 values in Table 7. Table 7 also summarizes the return on investment obtained when investing in stocks using only the first and second stages of the proposed portfolio optimization procedure, i.e. the Buy & Hold strategy, denoted by ROI2, and the return on investment of the semiconductor sub-section of Taiwan's stock market, denoted by ROI3.
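The worked example for stock 6239 can be replayed with a short Python sketch. The rule logic below is an assumption inferred from the example rather than a restatement of Section 4.3: Rule 1 holds a position while the forecast exceeds the current close, Rule 2 sells otherwise, Rule 3 buys when no position is held and the forecast exceeds the close, and Rule 4 (assumed) stays out of the market. The names `choose_rule` and `simulate` are hypothetical, and it is assumed that no position is held before the first listed week.

```python
# Weekly observations for stock 6239 from Table 6:
# (decision date, closing price, forecasted closing price for the next week).
WEEKS = [
    ("2009/11/06", 87.58, 90.80),
    ("2009/11/13", 89.79, 92.37),
    ("2009/11/20", 87.38, 91.54),
    ("2009/11/27", 84.88, 88.63),
    ("2009/12/04", 87.29, 89.79),
    ("2009/12/11", 93.06, 93.93),
    ("2009/12/18", 94.70, 97.39),
    ("2009/12/25", 102.01, 102.44),
    ("2009/12/31", 104.42, 104.72),
    ("2010/01/08", 104.90, 106.92),
    ("2010/01/15", 107.78, 106.64),
]

# Opening prices on the next trading day; Table 6 reports them only for the
# two weeks in which a transaction takes place.
NEXT_OPEN = {"2009/11/06": 88.06, "2010/01/15": 106.82}


def choose_rule(holding, close, forecast):
    """Return the rule number under the assumed reading of the four rules."""
    expected_to_rise = forecast > close
    if holding:
        return 1 if expected_to_rise else 2  # Rule 1: hold, Rule 2: sell
    return 3 if expected_to_rise else 4      # Rule 3: buy, Rule 4 (assumed): stay out


def simulate(weeks, next_open):
    """Apply the rules week by week and collect the profit per share of each round trip."""
    holding = False  # assumed: no position before the first listed week
    buy_price = None
    log, profits = [], []
    for date, close, forecast in weeks:
        rule = choose_rule(holding, close, forecast)
        if rule == 3:
            buy_price = next_open[date]
            holding = True
        elif rule == 2:
            profits.append(round(next_open[date] - buy_price, 2))
            holding = False
        log.append((date, rule))
    return log, profits


log, profits = simulate(WEEKS, NEXT_OPEN)
for date, rule in log:
    print(date, f"Rule {rule}")
print("profit per share:", profits)
```

Under these assumptions the sequence of rules matches the last column of Table 6, and the single round trip gives the per-share profit of 18.76 quoted in the text.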
Based on the ROI1 values in Table 7, the average six-month ROI reaches 13.12%. Even in the worst case, the ROI still reaches 0.86%, which is equivalent to a yearly ROI of about 1.72%. This value is still higher than the normal yearly interest rate of a fixed deposit for six to nine months in Taiwan, which is only around 1.1%. While not every ROI1 value exceeds the corresponding ROI2 value in Table 7, all the [...] Furthermore, the average of the ROI1 values exceeds the average of the ROI2 values by 11.53 percentage points. With regard to the ROI1 and ROI3 values in Table 7, the former are larger except in the third case, where the ROI1 value of 23.21% is slightly smaller than the corresponding ROI3 value of 23.67%. In addition, the average ROI1 value of 13.12% is far superior to the average ROI3 value of -2.39%. These results are shown in Figure 3.
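The summary figures quoted in this discussion can be recomputed from the per-case values in Table 7. The short Python sketch below does so; the variable names are illustrative, and the compounded yearly figure is included only for comparison with the simple doubling used in the text.

```python
# Six-month ROI values (in percent) for Cases 1 to 8, taken from Table 7.
roi1 = [18.70, 15.67, 23.21, 15.84, 11.46, 0.86, 13.31, 5.89]        # proposed approach
roi2 = [-50.87, -30.79, 10.85, 73.99, 11.94, -7.67, 7.51, -2.25]     # Buy & Hold strategy
roi3 = [-12.47, -39.54, 23.67, 11.10, 8.28, -9.25, 5.25, -6.14]      # stock market


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


avg1, avg2, avg3 = mean(roi1), mean(roi2), mean(roi3)
print(f"average ROI1 = {avg1:.4f}%, ROI2 = {avg2:.4f}%, ROI3 = {avg3:.4f}%")
print(f"ROI1 average minus ROI2 average = {avg1 - avg2:.4f} percentage points")

# Worst-case six-month ROI converted to a yearly figure.
worst = min(roi1)
simple_yearly = 2 * worst                                  # doubling, as in the text
compound_yearly = ((1 + worst / 100) ** 2 - 1) * 100       # two compounded half-years
print(f"worst case: {worst}% per six months, "
      f"{simple_yearly:.2f}% simple yearly, {compound_yearly:.4f}% compounded yearly")

# Cases in which the proposed approach beats each benchmark.
beats_buy_and_hold = [i + 1 for i, (a, b) in enumerate(zip(roi1, roi2)) if a > b]
beats_market = [i + 1 for i, (a, b) in enumerate(zip(roi1, roi3)) if a > b]
print("ROI1 > ROI2 in cases:", beats_buy_and_hold)
print("ROI1 > ROI3 in cases:", beats_market)
```

Both yearly conversions of the worst case stay above the fixed-deposit rate of around 1.1% mentioned above.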
Table 6. Partial transactions of stock 6239 (for Case 5 in Table 5)

| Date | Closing price | Forecasted closing price | Transaction | Transaction date | Transaction rule |
|---|---|---|---|---|---|
| 2009/11/06 | 87.58 | 90.80 | Buy at 88.06 | 2009/11/09 | Rule 3 |
| 2009/11/13 | 89.79 | 92.37 | - | - | Rule 1 |
| 2009/11/20 | 87.38 | 91.54 | - | - | Rule 1 |
| 2009/11/27 | 84.88 | 88.63 | - | - | Rule 1 |
| 2009/12/04 | 87.29 | 89.79 | - | - | Rule 1 |
| 2009/12/11 | 93.06 | 93.93 | - | - | Rule 1 |
| 2009/12/18 | 94.70 | 97.39 | - | - | Rule 1 |
| 2009/12/25 | 102.01 | 102.44 | - | - | Rule 1 |
| 2009/12/31 | 104.42 | 104.72 | - | - | Rule 1 |
| 2010/01/08 | 104.90 | 106.92 | - | - | Rule 1 |
| 2010/01/15 | 107.78 | 106.64 | Sell at 106.82 | 2010/01/18 | Rule 2 |
Table 7. The information for each investment portfolio in Table 5

| Case No. | Initial capital | Final capital | ROI1 | ROI2 | ROI3 |
|---|---|---|---|---|---|
| 1 | 1,000,000 | 1,187,000 | 18.70% | -50.87% | -12.47% |
| 2 | 1,000,000 | 1,156,700 | 15.67% | -30.79% | -39.54% |
| 3 | 1,000,000 | 1,232,100 | 23.21% | 10.85% | 23.67% |
| 4 | 1,000,000 | 1,158,400 | 15.84% | 73.99% | 11.10% |
| 5 | 1,000,000 | 1,114,600 | 11.46% | 11.94% | 8.28% |
| 6 | 1,000,000 | 1,008,600 | 0.86% | -7.67% | -9.25% |
| 7 | 1,000,000 | 1,133,100 | 13.31% | 7.51% | 5.25% |
| 8 | 1,000,000 | 1,058,900 | 5.89% | -2.25% | -6.14% |
| Max | 1,000,000 | 1,232,100 | 23.21% | 73.99% | 23.67% |
| Min | 1,000,000 | 1,008,600 | 0.86% | -50.87% | -39.54% |
| Average | 1,000,000 | 1,131,175 | 13.12% | 1.59% | -2.39% |
Figure 3. Comparison of ROIs based on the proposed approach, Buy & Hold strategy and stock market
[Chart of the maximum, minimum and average ROI for the proposed approach (23.21%, 0.86%, 13.12%), the Buy & Hold strategy (73.99%, -50.87%, 1.59%) and the stock market (23.67%, -39.54%, -2.39%); ROI axis from -60% to 80%]
6. CONCLUSIONS
In this study, data envelopment analysis (DEA), ant colony optimization for continuous domains (ACOR) and gene expression programming (GEP) are utilized to develop an integrated approach to portfolio optimization problems. The feasibility and effectiveness of the proposed procedure are verified through a case study on investing in stocks in the semiconductor sub-section of the Taiwan stock market over the period from November 1, 2007 to July 8, 2011. The results show that the average six-month return on investment (ROI) reaches 13.12%, and that the worst-case ROI, converted to a yearly value of 1.72%, is still higher than the normal yearly interest rate of a fixed deposit for six to nine months in Taiwan. The experimental results also indicate that the third stage of the proposed portfolio optimization procedure helps investors determine the timing for buying/selling stocks, thus avoiding a substantial investment loss and eventually making a superior profit. Furthermore, the proposed procedure can help investors make profits even when the overall stock market suffers a loss. The present study makes four main contributions to the literature. First, it proposes a systematic procedure for portfolio optimization based on DEA, ACOR and GEP, using data collected from financial reports and stock markets. Second, it can help an investor rapidly screen the stocks with the most profitable potential, even when he or she lacks sufficient financial knowledge. Third, it can automatically determine the optimal investment proportion of each stock to minimize the investment risk while satisfying the target return on investment set by an investor. Fourth, it addresses the scarcity of discussion in the literature about the timing for buying/selling stocks by providing a set of transaction rules based on the actual and forecasted stock prices.
This paper may be cited as:
Hsu, C. M., 2014. An Integrated Procedure for Resolving Portfolio Optimization Problems using Data Envelopment Analysis, Ant Colony
Optimization and Gene Expression Programming. International Journal of