Evaluating Emerging Software Development Technologies: Lessons Learned from Assessing Aspect-Oriented Programming

(1)

Murphy, Walker, Baniassad

IEEE Transactions on Software Engineering Vol. 25, No. 4, July/August 1999

(2)

Context: Aspect-Oriented Programming

 separation of concerns (e.g. concurrency)

 claimed improvements: code that is easier to

 reason about

 develop

 maintain

 AspectJ
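To make "separation of concerns (e.g. concurrency)" concrete, here is a minimal sketch in Python rather than AspectJ. It is only an analogy: the `synchronized` class decorator, the `Counter` class, and the `run_workers` helper are hypothetical names introduced for illustration and are not part of the study or of AspectJ. The point is that the locking policy lives in one place, while the class itself contains no synchronization code.

```python
import functools
import threading


def synchronized(cls):
    """Hypothetical helper: run every public method under one per-instance lock."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def init(self, *args, **kwargs):
        # The lock is created here, outside the class's own code.
        self._lock = threading.RLock()
        original_init(self, *args, **kwargs)

    def guard(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            with self._lock:
                return method(self, *args, **kwargs)
        return wrapper

    cls.__init__ = init
    for name, attr in list(vars(cls).items()):
        # Only public methods are wrapped; names starting with "_" are skipped.
        if callable(attr) and not name.startswith("_"):
            setattr(cls, name, guard(attr))
    return cls


@synchronized
class Counter:
    """Core logic only: no locking code appears in this class."""

    def __init__(self):
        self.value = 0

    def increment(self):
        current = self.value
        self.value = current + 1


def run_workers(counter, workers=4, increments=1000):
    """Call increment() from several threads and return the final value."""
    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Changing the synchronization policy (for example, one lock per method) would mean editing only `synchronized`, which is the kind of locality that the claims on this slide refer to.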

(3)

Approaches

 3-month case study

 Is it easier to write and change certain kinds of programs?

 What effect does it have on design activities?

 4 experiments

 Does aspect-oriented programming show any promise of easing programming tasks (creating, debugging, changing)?

(4)

Outline

 Case Study

 Experiments

 Lessons learned

 Generalization for early assessment

 Conclusion

(5)

Case study (at Xerox PARC)

 1st phase

 4 interns (3 graduate and 1 undergraduate)

 Develop a distributed game using AspectJ

 2nd phase

 a)

• 2 interns

• Game reimplementation in Java

 b)

• 2 interns

• Distributed library application using AspectJ

(6)

(7)

Study 1

 3 well-identified Game deliverables (with deadlines)

 Single user on single machine

 Multiple users on single machine

 Multiple users on multiple machines

 This study was planned, while Study 2 was arranged on the fly (since Study 1 finished 2 weeks earlier than estimated)

(8)

“External” involvement

 Xerox researchers

 responding to problem reports about AspectJ

 supervising

 helping to analyze the gathered information

 UBC observers

 three on-site meetings (beginning, middle, end)

 weekly 1-2 hour video conference

 monitoring of artifacts

(9)

Gathered information

 Email

 Weekly video-conference meetings

 Informal interactions

 Documents (interns)

 Documents (researchers)

 Problem reports

 Source code

 Whiteboard drawings

 Survey results (interns)

(10)

(11)

Evaluation

 Exploratory case study model (Yin)

 however, the goal was broader than to “develop pertinent hypotheses and propositions for further inquiry”

• e.g. how can AOP help in development tasks?

 Useful techniques

 On-site interns, outside observers, deadlines, video conferencing

 Less useful techniques

(12)

Experiments

 Carried out at UBC (within ~ 8 months)

 Comparing with OO

1. ease of creating

2. ease of debugging

3. ease of changing (also comparing with DSL)

4. combination of the above

 Participants were students and were paid for

(13)

Difficulties

 Pool of potential participants was small

 The amount of time each one could devote was short

 Cost of running and analyzing the experiments was high

 Some precision was lost in favor of

(14)

The 4 experiments

 Pilot Study (ease of creation) - 6 programmers; 3 on Java, 3 on AspectJ; 3 hours for Java, 4 hours for AspectJ

 Debugging - 6 pairs of programmers; 3 on Java, 3 on AspectJ; 4 hours

 Change - 6 programmers; 3 on Java, 3 on AspectJ; 4 hours

(15)

Steps of a session (1)

1. Introduce the goals and overall format of the experiment

2. 30 minutes to review webpages on synchronization (questions were asked at the end)

3. 30 minutes to review programming material

(16)

Steps of a session (2)

5. 30 minutes to play with a small example program (break)

6. Files were given, along with a webpage containing 3 bugs to find and remove (90 minutes)

- participants were asked to talk aloud

- at 30-minute intervals (or after each bug), they had to answer questions:

- What have you done until now? And now?

- Any significant problems? What’s the plan from now?

(17)

Costs

(18)

Evaluation

 Focus: does aspect-oriented programming show any promise of easing programming tasks?

 Useful techniques

 Timed interviews

 Less useful techniques

(19)

Lessons learned (Case study)

 Things to keep

 defined period of time for the case study

 separation of the observers from the day-to-day activities

 surveys

 Things to change

 set up the case study to enable more effective comparison

 to have only one case study per set of users

(20)

When to conduct a case study?

 Case studies are useful for early assessment...

 However, it is difficult to determine whether the technology is sufficiently stable for conducting a case study

 Only small changes should occur during

(21)

Lessons learned (Experiments)

 These did not go so well...

 Things to improve:

 (since participants were students) to have more participants from other universities

 to spend more time and money performing the experiments

 more extensive preparation

 additional data gathering mechanisms

• file access, build commands, ...

(22)

Generalization for early assessment (Method)

 For early assessment, the case study approach is likely to be more effective

 quick identification of usability issues

• e.g. understandability of error messages

 Experiments into usefulness cannot ignore usability

 usefulness and usability are closely intertwined for new technologies

(23)

Generalization for early assessment (Stability)

 The experiments required more labor in preparation

 This is only acceptable with a stable technology

 With a case study, there is more opportunity to overcome problems with the technology

 e.g. the version of the AspectJ environment

(24)

Generalization for early assessment (Cost)

 Case study

 predominant cost was the labor of the participants (interns)

 Experiments

 predominant cost was the labor required to

(25)

Generalization for early assessment (Evaluation)

 An appropriate balance of construct validity, internal validity, external validity, and reliability may not be possible for new technologies

 Idea:

 Early stages: concentrate on the broad features (when still possible to change)

 Later on: more statistically valid studies

(26)

Generalization for early assessment (Realism)

 Given problem

 Since there is limited time, the problem must be representative of a problem arising in a larger software development situation

 Environment

 Let the participants work as much as possible as they commonly do (no tool restriction)

 Participants

 Pick participants who are representative of the skill set of the target users of the software technology

(27)

Generalization for early assessment (Study Design)

 There is a need for a body of knowledge on relating different data sources

 The matching of observations was carried out in an ad hoc way

 Techniques that would provide more rigor to the analysis would help to improve the validity of the results

(28)

Conclusion

 A researcher must typically trade off three factors: validity, realism, and cost

 Since the technique was in its infancy, the case study and experiments were largely exploratory

 qualitative insights and directions for research

• usefulness, concrete features for improvement