Evaluating Emerging Software Development Technologies: Lessons Learned from Assessing Aspect-Oriented Programming
Murphy, Walker, Baniassad
IEEE Transactions on Software Engineering Vol. 25, No. 4, July/August 1999
Context: Aspect-Oriented Programming
separation of concerns (e.g. concurrency)
improvement claims: easier to reason about, develop, and maintain
AspectJ
Approaches
3-month case study
Is it easier to write and change certain kinds of programs?
What effect does it have on design activities?
4 experiments
Does aspect-oriented programming show any promise of easing programming tasks (creating, debugging, changing)?
Outline
Case Study
Experiments
Lessons learned
Generalization for early assessment
Conclusion
Case study (at Xerox PARC)
1st phase
4 interns (3 graduate and 1 undergraduate)
Develop a distributed game using AspectJ
2nd phase
a)
• 2 interns
• Game reimplementation in Java
b)
• 2 interns
• Distributed library application using AspectJ
Study 1
3 well-identified game deliverables (with deadlines)
Single user on single machine
Multiple users on single machine
Multiple users on multiple machines
This study was planned, while Study 2 was arranged on the fly (since Study 1 finished 2 weeks earlier than estimated)
“External” involvement
Xerox researchers
responding to problem reports with AspectJ
supervising
helping to analyze the gathered information
UBC observers
three on-site meetings (beginning, middle, end)
weekly 1-2 hour video conference
monitoring of artifacts
Gathered information
Weekly video-conference meetings
Informal interactions
Documents (interns)
Documents (researchers)
Problem reports
Source code
Whiteboard drawings
Survey results (interns)
Evaluation
Exploratory case study model (Yin)
however, the goal was broader than “develop pertinent hypotheses and propositions for further inquiry”
• e.g. how can AOP help in development tasks?
Useful techniques
On-site interns, outside observers, deadlines, video conferencing
Less useful techniques
Experiments
Carried out at UBC (within ~8 months)
Comparing with OO
1. ease of creating
2. ease of debugging
3. ease of changing (also comparing with DSL)
4. combination of the above
Participants were students and were paid for
Difficulties
Pool of potential participants was small
The amount of time each one could devote was short
Cost of running and analyzing the experiments was high
Some precision was lost in favor of
The 4 experiments
Pilot Study (ease of creation) - 6 programmers; 3 on Java, 3 on AspectJ; 3 hours for Java, 4 hours for AspectJ
Debugging - 6 pairs of programmers; 3 on Java, 3 on AspectJ; 4 hours
Change - 6 programmers; 3 on Java, 3 on AspectJ; 4 hours
Steps of a session (1)
1. Introduce the goals and overall format of the experiment
2. 30 minutes to review webpages on synchronization (questions were asked at the end)
3. 30 minutes to review programming material
Steps of a session (2)
5. 30 minutes to play with a small example program (break)
6. Files were given, along with a webpage describing 3 bugs to find and remove (90 minutes)
- participants were asked to talk aloud
- at 30-minute intervals (or after each bug), they had to answer questions:
- What have you done until now? And now?
- Any significant problems? What’s the plan from now?
Costs
Evaluation
Focus: does aspect-oriented programming show any promise of easing programming tasks?
Useful techniques
Timed interviews
Less useful techniques
Lessons learned (Case study)
Things to keep
defined period of time for the case study
separation of the observers from the day-to-day activities
surveys
Things to change
set up the case study to enable more effective comparison
run only one case study per set of users
When to conduct a case study?
Case studies are useful for early assessment...
However, it is difficult to determine whether the technology is sufficiently stable for conducting a case study
Only small changes should occur during
Lessons learned (Experiments)
These did not go so well...
Things to improve:
recruit more participants, including from other universities (since participants were students)
spend more time and money performing the experiments
more extensive preparation
additional data gathering mechanisms
• file access, build commands, ...
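One way to picture the "additional data gathering mechanisms" item is a small session log that timestamps file accesses and build commands. The sketch below is hypothetical and is not a tool used in the study: the `SessionLog` and `Event` names, the event labels, and the file names are all assumptions made for illustration.

```python
import json
import time
from collections import Counter
from dataclasses import asdict, dataclass, field


@dataclass
class Event:
    timestamp: float  # seconds since the session started (or epoch time)
    kind: str         # hypothetical labels: "file_access", "build_command"
    detail: str       # e.g. the file opened or the command that was run


@dataclass
class SessionLog:
    participant: str
    events: list = field(default_factory=list)

    def record(self, kind, detail, timestamp=None):
        # Use the wall clock unless the caller supplies a timestamp.
        ts = time.time() if timestamp is None else timestamp
        self.events.append(Event(ts, kind, detail))

    def counts(self):
        # Number of recorded events per kind.
        return Counter(e.kind for e in self.events)

    def to_json(self):
        # Serialize the whole session for later analysis.
        return json.dumps(asdict(self))


# Hypothetical session: participant and file names are made up.
log = SessionLog(participant="P1")
log.record("file_access", "Buffer.java", timestamp=0.0)
log.record("build_command", "javac Buffer.java", timestamp=42.0)
log.record("file_access", "Reader.java", timestamp=60.0)
print(log.counts())
```

Logs like this could be lined up against the timed interview answers, though how to relate such data sources rigorously is left open here.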
Generalization for early assessment (Method)
For early assessment, the case study approach is likely to be more effective
quick identification of usability issues
• e.g. understandability of error messages
Experiments into usefulness cannot ignore usability
usefulness and usability are closely intertwined for new technologies
Generalization for early assessment (Stability)
The experiments required more labor cost in preparation
This is only acceptable with a stable technology
With a case study, there is more opportunity to overcome problems with the technology
e.g. the version of the AspectJ environment
Generalization for early assessment (Cost)
Case study
predominant cost was the labor of the participants (interns)
Experiments
predominant cost was the labor required to
Generalization for early assessment (Evaluation)
An appropriate balance of construct validity, internal validity, external validity, and reliability may not be possible for new technologies
Idea:
Early stages: concentrate on the broad features (when still possible to change)
Later on: more statistically valid studies
Generalization for early assessment (Realism)
Given problem
Since time is limited, the problem must be representative of a problem arising in a larger software development situation
Environment
Let the participants work as much as possible as they normally do (no tool restrictions)
Participants
Pick participants who are representative of the skill set of the target users of the software technology
Generalization for early assessment (Study Design)
There is a need for a body of knowledge on relating different data sources
The matching of observations was carried out in an ad hoc way
Techniques that bring more rigor to the analysis would help to improve the validity of the results
Conclusion
A researcher must typically trade off three factors: validity, realism, and cost
Since the technique was in its infancy, the case study and experiments were largely exploratory
qualitative insights and directions for research
• usefulness, concrete features for improvement