The SEMINAL Workshop: Reformulating Software Engineering as a Metaheuristic Search Problem


ACM SIGSOFT Software Engineering Notes vol 26 no 6 November 2001 Page 62


Mark Harman
Brunel University, Uxbridge, Middlesex UB8 3PH, United Kingdom
mark.harman@brunel.ac.uk

and

Bryan F. Jones
School of Computing, University of Glamorgan, Pontypridd CF37 1DL, United Kingdom
bfjones@glam.ac.uk

Abstract

This paper reports on the first international Workshop on Software Engineering using Metaheuristic INnovative ALgorithms.

The aim of the workshop was to bring together researchers in search-based metaheuristic techniques with researchers and practitioners in Software Engineering. The workshop sought to support and develop the embryonic community which straddles these two communities and which is working on the application of metaheuristic search-based techniques to problems in Software Engineering.

The paper describes the nature of the nascent field of Search-Based Software Engineering, and briefly outlines the papers presented at the workshop and the discussions which took place.

Introduction

The first international Workshop on Software Engineering using Metaheuristic INnovative ALgorithms (SEMINAL) was held as a one day workshop on Monday 14th May 2001, co-located with the IEEE International Conference on Software Engineering.

The motivation for the workshop was founded upon the following observation:

Search techniques, such as genetic algorithms, simulated annealing and tabu search have been widely applied in other engineering disciplines but less so in Software Engineering. This seems odd, because many features of Software Engineering make it ideal for the application of these search-based techniques.

Prior to the workshop, initial work on metaheuristics for Software Engineering had been conducted in the areas of testing [JSE96, JES98, TCM98, PHP99, WGG+96, WSJE97] and cost estimation [Dol00, Dol01]. The workshop aimed to push metaheuristic techniques into other areas of Software Engineering research and practice.

The workshop included two sessions: a specific session on the application of metaheuristics to software testing and a general session on the application of metaheuristics to the wider Software Engineering field.

The workshop attracted 26 participants from five countries, with representation from academia and industry and with work presented on theory, practice and evaluation of the application of metaheuristic search-based techniques to Software Engineering problems. There were two keynote talks, five paper presentations and a lively discussion of the issues. Papers presented at the workshop are available on the workshop website at http://www.brunel.ac.uk/~csstmmh2/seminal2001/.

Details of future workshops and conferences on the application of metaheuristic search techniques to Software Engineering problems can be found on the SEMINAL website (see footnote 3) at

http://www.discbrunel.org.uk/seminal/seminalhome.html

The rest of this paper is organised as follows. The next section briefly outlines the field of Search-Based Software Engineering, and the section after it gives the motivation for applying search-based techniques to Software Engineering problems. A brief overview of the papers presented at the workshop follows, then an account of the discussion at the workshop, and finally some suggested directions for future work on Search-Based Software Engineering.

What is Search-Based Software Engineering?

Search-Based Software Engineering is a reformulation of Software Engineering as a search problem, in which the solution to a problem is found by sampling a large search space of possible solutions. The sampling is guided by metaheuristic techniques, such as genetic algorithms, tabu search and simulated annealing, which optimise the search.

It is widely known that no metaheuristic search technique is suitable for all forms of search problem [WM97]. However, work on Search-Based Software Engineering has hitherto centred on the use of Genetic Algorithms.

The remainder of this section briefly surveys the Genetic Algorithm search technique to give a flavour of the search-based approach to Software Engineering to the reader who is familiar with Software Engineering, but unfamiliar with metaheuristic search techniques.

Genetic Algorithms

Genetic algorithms (GAs) search for optimal solutions by sampling the search space at random and creating a set of candidate solutions called a 'population' [Hol75, Bäc96]. These candidates are combined and mutated to evolve into a new generation of solutions which may or may not be fitter, that is, closer to the desired optimum. Recombination is fundamental to the GA and provides a mechanism for mixing genetic material within the population. Mutation is vital in introducing new genetic material, thereby preventing the search from stagnating. The next population of solutions is chosen from the parent and offspring generations in accordance with a survival strategy that normally favours fit individuals but nevertheless does not preclude the survival of the less fit. In this way, a diverse pool of genetic material is preserved for the purpose of breeding yet fitter individuals. This terminology strongly suggests that GAs are based on the principles of Darwinian evolution. This analogy should not be taken too literally since Darwinian evolution is assumed blind and improvements happen by chance rather than by aspiring to a goal; improved individuals are those with a better chance of survival and thus they are able to pass their genes to their offspring.

Footnote 3: SEMINAL is the EPSRC-funded Network on Software Engineering using Metaheuristic INnovative ALgorithms (GR/M87083).

Chance plays an important role in GAs, though their success in locating an optimum (see footnote 4) strongly depends on a judicious choice of a fitness function. The fitness function must be designed carefully to reflect the nature of the optimum and to direct the search along promising pathways.

GAs operate on a population of individuals (often called chromosomes) each of which has an assigned fitness. The population size should be sufficiently great to allow a substantial pool of genetic material, but not so large that the search degenerates into a random search. Those individuals that either undergo recombination or survive are chosen with a probability which depends on fitness in some way. There are many different selection mechanisms: in the so-called roulette wheel, the population's cumulative fitness is normalised to give a set of probabilities for each individual; in the N-tournament, N individuals are selected at random from a population and the fittest chosen.
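The two selection mechanisms named above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the workshop: the function names, the toy population and the seeded random generator are assumptions, and the roulette wheel assumes non-negative fitness values.

```python
import random

def roulette_wheel_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    """
    total = sum(fitnesses)
    spin = rng.uniform(0, total)          # where the wheel stops
    cumulative = 0.0
    for individual, fitness in zip(population, fitnesses):
        cumulative += fitness
        if spin < cumulative:
            return individual
    return population[-1]                 # guard against floating-point round-off

def tournament_select(population, fitnesses, n, rng):
    """Draw n distinct individuals at random and return the fittest of them."""
    contestants = rng.sample(range(len(population)), n)
    best = max(contestants, key=lambda i: fitnesses[i])
    return population[best]

# Hypothetical toy population: individual "d" holds 4/10 of the total fitness.
rng = random.Random(0)
population = ["a", "b", "c", "d"]
fitnesses = [1.0, 2.0, 3.0, 4.0]
picks = [roulette_wheel_select(population, fitnesses, rng) for _ in range(10000)]
share_of_d = picks.count("d") / len(picks)    # should be close to 0.4
winner = tournament_select(population, fitnesses, 4, rng)  # n = population size
```

With the tournament size equal to the population size the fittest individual always wins; smaller values of N weaken the selection pressure and give less fit individuals a chance of survival.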

A generic evolutionary algorithm is presented in Figure 1. The key ingredients of the genetic algorithm are:

1. A representation for candidate solutions

2. A fitness function to determine when one candidate solution is superior to another

3. The choice of recombination, mutation, and selection algorithms.

Typically a suitable representation is not too hard to find for Software Engineering candidate solutions, as many of the problems software engineers face are ultimately stored on a computer in some representation or other. Also fitness often presents few difficulties, as there is often a plethora of available metrics from which to choose. Finding suitable operators for mutation and cross-over is also often relatively straightforward.

Footnote 4: or sub-optimum; in this section, the term optimum is taken to subsume sub-optimum.

    Set generation number, m := 0
    Choose the initial population of candidate solutions, P(0)
    Evaluate the fitness for each individual of P(0), F(P_i(0))
    loop
        Recombine: P'(m)  := R(P(m))
        Mutate:    P''(m) := M(P'(m))
        Evaluate:  F(P''(m))
        Select:    P(m+1) := S(P''(m))
        m := m + 1
        exit when goal or stopping condition is satisfied
    end loop;

Figure 1: A Generic Evolutionary Algorithm
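The loop of Figure 1 can be made concrete with a small runnable sketch. Everything specific here is an assumption chosen for illustration: the bit-string representation, the toy "one-max" fitness function, one-point crossover, bit-flip mutation, and a selection step that runs binary tournaments over parents and offspring together while always keeping the best individual found so far.

```python
import random

def one_max(bits):
    """Toy fitness: the number of 1s in the bit string (optimum is all ones)."""
    return sum(bits)

def recombine(parents, rng):
    """R: one-point crossover applied to randomly paired parents."""
    offspring = []
    for _ in range(len(parents)):
        mum, dad = rng.sample(parents, 2)
        cut = rng.randint(1, len(mum) - 1)
        offspring.append(mum[:cut] + dad[cut:])
    return offspring

def mutate(individuals, rate, rng):
    """M: flip each bit independently with a small probability."""
    return [[1 - bit if rng.random() < rate else bit for bit in ind]
            for ind in individuals]

def select(parents, offspring, fitness, size, rng):
    """S: binary tournaments over parents and offspring, so fit individuals
    are favoured but the less fit are not excluded. Keeping the single best
    individual (elitism) is an extra assumption of this sketch."""
    pool = parents + offspring
    survivors = [max(pool, key=fitness)]
    while len(survivors) < size:
        a, b = rng.sample(pool, 2)
        survivors.append(max(a, b, key=fitness))
    return survivors

def evolve(fitness, length, goal, pop_size=30, max_generations=200,
           rate=0.02, seed=0):
    rng = random.Random(seed)
    # P(0): random bit strings
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for m in range(max_generations):
        best = max(population, key=fitness)
        if fitness(best) >= goal:            # exit when the goal is satisfied
            return best, m
        children = mutate(recombine(population, rng), rate, rng)  # P''(m)
        population = select(population, children, fitness, pop_size, rng)
    return max(population, key=fitness), max_generations

best, generations_used = evolve(one_max, length=20, goal=20)
```

Swapping in a different representation, fitness function or set of operators changes the problem being solved without changing the loop itself, which is the sense in which the algorithm is generic.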

Simulated annealing [MRR+53] is a method of local searching. Using simulated annealing, the search is able to consider mutated variants of the current population which represent decreases in fitness. The likelihood of accepting these 'inferior' solutions is decreased as time progresses. The idea is to simulate the annealing process in the formation of crystals as temperature decreases.
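A minimal sketch of this acceptance rule follows. It assumes a single current solution rather than a population, a geometric cooling schedule, and a toy one-dimensional fitness function; the parameter names and values are illustrative only.

```python
import math
import random

def simulated_annealing(fitness, neighbour, start, t_start=10.0,
                        cooling=0.95, steps=500, seed=0):
    rng = random.Random(seed)
    current = best = start
    temperature = t_start
    for _ in range(steps):
        candidate = neighbour(current, rng)
        delta = fitness(candidate) - fitness(current)
        # Improvements are always accepted. A decrease in fitness is accepted
        # with probability exp(delta / T), which shrinks as T falls.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
            if fitness(current) > fitness(best):
                best = current
        temperature *= cooling               # cool down as time progresses
    return best

def fitness(x):
    """Hypothetical fitness with a single peak at x = 7."""
    return -(x - 7) ** 2

def neighbour(x, rng):
    """Mutated variant: move one step left or right."""
    return x + rng.choice([-1, 1])

result = simulated_annealing(fitness, neighbour, start=-20)
```

Early on, while the temperature is high, the search wanders freely; once the temperature is close to zero it behaves like a plain hill climber.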

Tabu search [Glo90] is an iterative procedure for solving discrete combinatorial optimisation problems. The space of local possible solutions is searched in a sequence of moves from one possible solution to the best available alternative. In order to prevent being stuck at a sub-optimal solution and to avoid drifting away from the global optimum, some moves are classified as forbidden or tabu (taboo). The list of tabu moves is formed using both short-term and long-term memory of previous unpromising moves. A move may also be regarded as 'unpromising' simply because it was recent. On occasions, a tabu move may be allowed. This is an aspiration criterion whereby the tabu move might lead to the best solution obtained so far.
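The following sketch illustrates the short-term memory and the aspiration criterion on a toy selection problem. It is an assumption-laden illustration: the bit-tuple representation, the single-bit-flip moves, the tabu tenure of three moves and the item values and weights are all hypothetical, and long-term memory is omitted.

```python
from collections import deque

def tabu_search(fitness, start, tenure=3, iterations=50):
    """Maximise fitness over bit tuples using single-bit-flip moves."""
    current = best = tuple(start)
    tabu = deque(maxlen=tenure)     # short-term memory of recently flipped bits
    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:i] + (1 - current[i],) + current[i + 1:]
            # Aspiration criterion: a tabu move is allowed if it would give
            # the best solution obtained so far.
            if i not in tabu or fitness(neighbour) > fitness(best):
                candidates.append((fitness(neighbour), i, neighbour))
        if not candidates:
            break
        # Move to the best available alternative, even if it is worse.
        _, move, current = max(candidates)
        tabu.append(move)
        if fitness(current) > fitness(best):
            best = current
    return best

# Hypothetical problem: choose items to maximise value within a weight limit.
values = [6, 5, 4, 3, 2]
weights = [5, 4, 3, 2, 1]
capacity = 7

def fitness(bits):
    weight = sum(w for w, b in zip(weights, bits) if b)
    value = sum(v for v, b in zip(values, bits) if b)
    return value if weight <= capacity else -1   # penalise infeasible choices

best = tabu_search(fitness, start=(0, 0, 0, 0, 0))
```

A greedy hill climber started from the empty selection stops at a selection of value 9; forbidding the most recent moves forces the search through worse selections and lets it reach the selection of value 10.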

Why Search-Based Software Engineering?

Software engineers often face problems associated with the balancing of competing constraints, trade-offs between concerns and requirement imprecision. Perfect solutions are often either impossible or impractical and the nature of the problems often makes the definition of analytical algorithms problematic.

Like other engineering disciplines, Software Engineering is typically concerned with near optimal solutions or those which fall within a specified acceptable tolerance. It is precisely these factors which make robust metaheuristic search-based optimization techniques readily applicable.

Software Engineering problems often lead to observations such as those below (indeed, many similar remarks were to be overheard in discussions at the 23rd ICSE conference in Toronto):

"We need to balance competing constraints."

"We have to cope with inconsistency."

"Unfortunately there are many potential solutions."

"There is no perfect answer, but I could recognise a good one from a bad one."

"Sadly, there are no precise rules for computing the best solution, even though we know what properties we want good solutions to have."

These observations are precisely those which make search-based techniques applicable in other fields of engineering [LT92, Bir92, KSHH93, PCV95, Bab98, BW96, BKAK99]. However, the discipline of Software Engineering appears to be unique with regard to the application of these techniques; metaheuristic algorithms have received comparatively little attention from software engineers, relative to that which they have received from researchers and practitioners in the more established fields of engineering.

In order to reformulate Software Engineering as a search problem, it will be necessary to define three things:

• a representation of the problem which is amenable to symbolic manipulation;

• a fitness function, defined in terms of this representation and

• a set of manipulation operators.

Fortunately, these three requirements are often relatively easy to satisfy for Software Engineering problems.
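As an illustration of how the three requirements might be met, consider searching for test data that drives execution down a particular branch. In the sketch below the branch condition `a == 2 * b + 3`, the pair-of-integers representation, the distance-based fitness and the unit-step operators are all hypothetical choices, and a simple hill climber stands in for a metaheuristic.

```python
def branch_distance(inputs):
    """Fitness: 0 when the target branch `a == 2 * b + 3` is taken,
    increasingly negative the further the inputs are from taking it."""
    a, b = inputs
    return -abs(a - (2 * b + 3))

def neighbours(inputs):
    """Manipulation operators: nudge one input up or down by one."""
    a, b = inputs
    return [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]

def hill_climb(fitness, neighbours, start, max_steps=1000):
    """Repeatedly move to the best neighbour until none is an improvement."""
    current = start
    for _ in range(max_steps):
        best_neighbour = max(neighbours(current), key=fitness)
        if fitness(best_neighbour) <= fitness(current):
            break                       # no neighbour improves on current
        current = best_neighbour
    return current

# Representation: a candidate solution is a pair of integer inputs (a, b).
test_input = hill_climb(branch_distance, neighbours, start=(50, -10))
a, b = test_input
```

The fitness function here has no local optima, so hill climbing suffices; less well-behaved fitness landscapes are where genetic algorithms, simulated annealing or tabu search would be substituted.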

Papers Presented

This section briefly overviews the papers presented at the workshop. Postscript copies of the papers are available on the workshop website at:

http://www.brunel.ac.uk/~csstmmh2/seminal2001/

Keynote: An Overview of Genetic Algorithms

This talk was given by Darrell Whitley, Computer Science Department, Colorado State University, USA.

Dr. Whitley's talk provided an overview of Genetic Algorithms, indicating many useful and important theoretical results of which software engineers seeking to exploit these techniques need to be aware. He highlighted the issues involved in representation and selection operators, indicating that results which show binary encoding a poor second to real-number encoding can be misleading; Gray coding should always be used in place of pure binary encoding.
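One commonly cited reason for preferring Gray coding is that adjacent integers always differ in exactly one bit, whereas in pure binary a step such as 7 to 8 changes four bits at once. The sketch below uses the standard binary-reflected Gray code; the helper names are illustrative and the example is not taken from the talk.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def from_gray(g):
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

binary_step = hamming(7, 8)                    # 0111 -> 1000
gray_step = hamming(to_gray(7), to_gray(8))    # 0100 -> 1100
```

Under Gray coding a single-bit mutation can therefore always move a numeric parameter to a neighbouring value, which is not guaranteed under pure binary encoding.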

Dr. Whitley produced an invited paper for the special issue of the journal Information and Software Technology on Software Engineering using Metaheuristic Algorithms [Whi01]. This paper provides a printed account of some of the insights into the theory and practice of genetic algorithms presented by Dr. Whitley at the workshop.

Keynote: An Overview on Evolutionary Testing

This talk was given by Joachim Wegener, DaimlerChrysler, Germany.

Dr. Wegener provided a detailed and thorough overview of software testing using genetic algorithms and simulated annealing. The talk covered structural testing (using coverage analysis), mutation testing, worst- and best-case execution time testing and issues of boundary value analysis and partition testing.

The talk also included a survey of empirical results and experience gained over seven years of the use of evolutionary testing techniques at DaimlerChrysler.

The slides of Dr. Wegener's talk are available on the workshop website.

A Skeleton for the Tabu Search Metaheuristic with Applications to Problems in Software Engineering

This talk was given by Maria Blesa, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Spain.

The talk presented a template Tabu search engine, which can be instantiated to address problems in Software Engineering (and other fields). Ms. Blesa illustrated the use of the template by instantiating it to handle project management scheduling and provided a brief overview of the tabu search technique.

Most work on Search-Based Software Engineering focuses on the use of Genetic Algorithms, so the talk also contributed to this emergent research field by presenting an alternative form of search. The provision of tools is a valuable resource for the community.

Ms. Blesa's work is part of the MALLBA project. The system proposed by Ms. Blesa is available at the MALLBA project website at http://www.lsi.upc.es/~mallba.

Seizing the Opportunity: How Hardware Advances and Genetic Programming may Ease the Software Skills Shortage

This talk was given by John Hart, School of Design, Engineering & Computing, Bournemouth University, UK.


Mr. Hart presented an argument that software development be moved 'upstream' from coding to design. Rather than coding programs, programmers could code fitness functions which could be used to code programs. Programming could thus become a question of fitness function design. Mr. Hart pointed out that this approach might help to ameliorate the growing programming skills gap.

Mr. Hart accepted that this approach would not be suitable for all programs (most notably, it would be inappropriate for safety critical systems). However, he cited a number of examples where the approach could yield additional insight into the perceived requirements captured by the fitness function.

A Prediction System for Dynamic Optimisation-Based Execution Time Analysis

This talk was given by Gerdi Gross, Fraunhofer Institute for Experimental Software Engineering, Germany.

The talk addressed the problem of defining and evaluating metrics which determine how successful evolutionary testing is likely to be. The particular emphasis of the work was on test data generation for worst and best case execution time. The metrics provided a guide to the testability of the program under consideration.

Dr. Gross reported experience with the use of these metrics, suggesting that metrics needed to be combined to achieve a good correlation between predicted testability and actual testability.

The definition of metrics to predict aspects of evolutionary test data selection is a sign of the growing maturity of this sub-area of Search-Based Software Engineering. Initial work on the feasibility of applying evolutionary search algorithms to test data selection was successful. This work is now being developed to address issues such as predicting and improving the testability of software using these techniques.

Improving Heuristic Software Analysis with Semantic Information

This talk was given by Christoph Michael, Cigital, USA.

The talk described an approach used by Cigital to identify influencing input domains. The technique, known as abduction, identifies the conditions under which a portion of code will be executed. This work is aimed at helping to guide the choice of test data through partial analysis of program semantics. It thus supports evolutionary testing by narrowing the search space.

The approach represents the combination of existing (but partial) analytic algorithms with metaheuristic search.

A Genetic Algorithm Fitness Function for Mutation Testing

This talk was given by Leonardo Bottaci, Department of Computer Science, University of Hull, UK.

The talk presented a fitness function for use by a genetic algorithm to search for test data for mutation testing. The fitness function incorporates the necessity and sufficiency conditions for a test to kill a mutant in addition to the usual reachability measure. For programs written in a procedural style language, the system under development generates mutants that are instrumented for reachability, necessity and sufficiency.

Issues Raised in Discussion

There was a detailed and extensive discussion of genetic programming [Koz92]. Software engineers, and in particular those with extensive experience of software maintenance issues, were concerned that the code created by this technique would present significant comprehension and maintenance challenges. Eventually a consensus emerged: these techniques should never be relied upon for safety critical systems, but their ability to give insight into the problem under consideration might be useful in addition to any possible benefit from the programs they produce. In essence, it seemed that perhaps the process of using genetic programming might be at least as valuable as the products it produces.

Joachim Wegener described the evolutionary testing approach adopted by DaimlerChrysler [WGG+96, WSJE97], indicating that its results outperformed random testing on every execution and for every test object. The difficulties associated with continuous data were discussed. Such data typically involve a larger search space, but initial results from the DaimlerChrysler evolutionary testing system were encouraging.

Dr. Harman suggested the use of slicing [Tip95, Wei84] to reduce search space sizes and the use of imperative-style transformation [Bax99, War94] to make software more amenable to the definition of suitable fitness functions.

Generic issues involved in applying metaheuristics to Software Engineering problems were considered. The central issue here is that the results produced are not 'explained' other than by the fitness function itself. This suggests that the fitness function will play a central role in any exploitation of search-based techniques within Software Engineering.

The general feeling was that the nature of Software Engineering problems made the application of metaheuristic search techniques very attractive. Many authors and practitioners reported proposed projects involving the application of metaheuristic techniques to application areas right through the software development process, from requirements analysis to maintenance and (software) evolution.

The workshop ended with a general discussion of the advantages of the application of metaheuristic search to problems in Software Engineering. In particular the robust nature of these techniques and their ability to cope with fuzzy, partially defined and possibly inconsistent constraints was felt to be significant. The generic nature of the techniques was also thought likely to lead to portability and scalability.

Future Growth of Search-Based Software Engineering

Metaheuristic search techniques are currently being experimented with for many aspects of Software Engineering. It is likely that the next five years will see these techniques applied to, at least, the following areas of Software Engineering:

1. Requirements prioritisation

2. Finding good designs

3. Test data selection

4. Reverse and Re-engineering through transformation and re-factoring

5. Development of Software Measurement

6. User-based fitness evaluation for aesthetic aspects of Software Engineering

Acknowledgements

In addition to the authors who presented papers at the workshop, the authors have also benefitted from extensive discussion under the auspices of the EPSRC-funded SEMINAL network (GR/M78083; see footnote 5). In particular, the authors would like to acknowledge the contribution made by Len Bottaci, Colin Burgess, John Clark, Jose Dolado, Gerdi Gross, Rob Hierons, Martin Lefley, Rudi Lutz, Mark Proctor, Vic Rayward-Smith, Marc Roper, Nick Sharpies, Martin Shepperd, Harmen Sthamer, Nigel Tracey, Joachim Wegener and Darrell Whitley. Of course, all errors and omissions remain the sole responsibility of the authors.

Footnote 5: Software Engineering using Metaheuristic INnovative ALgorithms.

References

[Bab98] Vladan Babovic. Mining sediment transport data with genetic programming. In Proceedings of the First International Conference on New Information Technologies for Decision Making in Civil Engineering, pages 875-886, Montreal, Canada, 11-13 October 1998.

[Bäc96] T. Bäck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, 1996.

[Bax99] I. D. Baxter. Transformation systems: Domain-oriented component and implementation knowledge. In Proceedings of the Ninth Workshop on Institutionalizing Software Reuse, Austin, TX, USA, January 1999.

[Bir92] Robert R. Birge. Protein-based optical computing and memories. Computer, 25(11):56-67, November 1992.

[BKAK99] Forrest H Bennett III, Martin A. Keane, David Andre, and John R. Koza. Automatic synthesis of the topology and sizing for analog electrical circuits using genetic programming. In Kaisa Miettinen, Marko M. Mäkelä, Pekka Neittaanmäki, and Jacques Periaux, editors, Evolutionary Algorithms in Engineering and Computer Science, pages 199-229, Jyväskylä, Finland, 30 May - 3 June 1999. John Wiley & Sons.

[BW96] Peter J. Bentley and Jonathan P. Wakefield. Generic representation of solid geometry for genetic search. Microcomputers in Civil Engineering, 11(3):153-161, 1996.

[Dol00] José Javier Dolado. A validation of the component-based method for software size estimation. IEEE Transactions on Software Engineering, 26(10):1006-1021, 2000.

[Dol01] José Javier Dolado. On the problem of the software cost function. Information and Software Technology, 43:61-72, 2001.

[Glo90] F. Glover. Tabu search: A tutorial. Interfaces, 20:74-94, 1990.

[Hol75] John H. Holland. Adaptation in Natural and Artificial Systems. MIT Press, Ann Arbor, 1975.

[JES98] Bryan F. Jones, David E. Eyres, and Harmen H. Sthamer. A strategy for using genetic algorithms to automate branch and fault-based testing. The Computer Journal, 41(2):98-107, 1998.

[JSE96] B.F. Jones, H.-H. Sthamer, and D.E. Eyres. Automatic structural testing using genetic algorithms. The Software Engineering Journal, 11:299-306, 1996.

[Koz92] J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, 1992.

[KSHH93] C. L. Karr, S. K. Sharma, W. J. Hatcher, and T. R. Harper. Fuzzy control of an exothermic chemical reaction using genetic algorithms. Engineering Applications of Artificial Intelligence 6, 6:575-582, 1993.

[LT92] J. E. Labussiere and N. Turrkan. On the optimization of the tensor polynomial failure theory with a genetic algorithm. Transactions of the Canadian Society for Mechanical Engineering, 16(3-4):251-265, 1992.

[MRR+53] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller. Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21:1087-1092, 1953.

[PCV95] R. Poli, S. Cagnoni, and G. Valli. Genetic design of optimum linear and nonlinear QRS detectors. IEEE Transactions on Biomedical Engineering, 42(11):1137-41, November 1995.

[PHP99] R. P. Pargas, M. J. Harrold, and R. R. Peck. Test-data generation using genetic algorithms. The Journal of Software Testing, Verification and Reliability, 9:263-282, 1999.

[TCM98] N. Tracey, J. Clark, and K. Mander. Automated program flaw finding using simulated annealing. In International Symposium on Software Testing and Analysis, pages 73-81. ACM/SIGSOFT, March 1998.

[Tip95] Frank Tip. A survey of program slicing techniques. Journal of Programming Languages, 3(3):121-189, September 1995.

[War94] Martin Ward. Reverse engineering through formal transformation. The Computer Journal, 37(5), 1994.

[Wei84] Mark Weiser. Program slicing. IEEE Transactions on Software Engineering, 10(4):352-357, 1984.

[WGG+96] J Wegener, K Grimm, M Grochtmann, H Sthamer, and B F Jones. Systematic testing of real-time systems. In 4th International Conference on Software Testing Analysis and Review (EuroSTAR 96), 1996.

[Whi01] Darrell Whitley. An overview of evolutionary algorithms: Practical issues and common pitfalls. Information and Software Technology Special Issue on Software Engineering using Metaheuristic Innovative Algorithms, 2001. To appear.

[WM97] David H. Wolpert and William G. Macready. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1):67-82, April 1997.

[WSJE97] J Wegener, H Sthamer, B F Jones, and D E Eyres. Testing real-time systems using genetic algorithms. Software Quality, 6:127-135, 1997.
