Improving the productivity of software developers
Lecture 1 - What recommenders can be built?
Gail C. Murphy
University of British Columbia, Tasktop Technologies
@gail_murphy
Laser Summer School 2014 - Lecture 1
a recommendation system for SE is …
“a software application that provides information items estimated to be valuable for a software engineering task in a given context”
Robillard, Walker and Zimmermann, Recommendation systems for software engineering, IEEE Software 27(4), 80-86, 2010.
various ways to think about the space of SE recommenders
information spaces
• source code
• reusable software components (e.g., APIs)
• project history (e.g., version control)
• software process information (e.g., issues)
• interaction information
• web information
• …
intent
• consider the range of recommenders to aid software development tasks that have been investigated
• consider characteristics of recommenders and their use that will underlie the remaining lectures
• the examples are representative, not indicative of all SE recommenders that have been built!
• For more examples, see:
• Recommendation Systems in Software Engineering edited by Robillard, Maalej, Walker and Zimmermann, Springer, 2014.
source code
examples of recommenders based on source code
• quick fixes
• code completion
• refactoring (e.g., jDeodorant)
• program transformation
Image from: http://www.opensourceforu.com/2012/04/foss-is-fun-the-fifth-freedom-part-2/
some questions to consider for each:
1. how are recommendations generated?
2. how accurate are the recommendations?
3. how hard is it to determine which recommendation to take?
4. how easy is it to back out of a wrong choice?
quick fix
code completion
refactoring (e.g., jDeodorant)
• identify Feature Envy smells in Java code
• generate Move Method refactorings to reduce the smell
[Fokaefs et al 2007]
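To make the idea of a Feature Envy smell concrete, here is a minimal sketch. It is not jDeodorant's actual algorithm: it assumes a simple counting heuristic (a method that touches more members of another class than of its own is a candidate to move), and the function name and input shape are hypothetical.

```python
from collections import Counter


def suggest_move_method(method_name, own_class, accessed_members):
    """Suggest a Move Method refactoring under a simple counting heuristic.

    accessed_members: list of (class_name, member_name) pairs that the
    method reads or calls. This input format is an assumption.
    """
    counts = Counter(cls for cls, _ in accessed_members)
    if not counts:
        return None
    # Class whose members the method touches most often.
    target, target_count = counts.most_common(1)[0]
    # Only suggest a move when another class is used strictly more than the owner.
    if target != own_class and target_count > counts.get(own_class, 0):
        return f"move {own_class}.{method_name} to {target}"
    return None


envious = [("Customer", "name"), ("Customer", "address"),
           ("Customer", "phone"), ("Order", "id")]
balanced = [("Order", "items"), ("Order", "tax"), ("Customer", "name")]

print(suggest_move_method("formatLabel", "Order", envious))
print(suggest_move_method("total", "Order", balanced))
```

The first call yields a suggestion because the method uses Customer three times and Order once; the second yields none.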
program transformation
LASE approach and tool
1. provide examples of similar changes made to source
2. generate edit script (programming by demonstration)
3. recommend similar code
4. apply edit script
[Meng et al 2013]
code recommendation summary
recommender | generation | accuracy | selecting | undoing
quick fix | heuristic | high but not quantified | expertise required | limits of undo
code completion | type information | 100% often | easy | easy
jDeodorant (refactoring) | heuristic metric | ?? | easy | difficult
LASE (transformation) | AST | precision often 100% | expert | difficult
disclaimer: accuracy figures are as reported, but we aren’t giving the experimental context here, so tread carefully.
Selecting and undoing ratings are subjective.
reusable software components
examples of recommenders based on reusable software components
• FrUiT
• Strathcona
• CodeBroker
Image from: EEWeb.com
FrUiT
support the usage of frameworks
1. extract structural relations from applications using an API
2. use association rule mining to identify structural relations that are commonly used
3. recommend all rules that mention any of the source code entities in the current file in the editor
[Bruch et al 2006]
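The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not FrUiT's implementation: structural relations are represented as plain strings such as "extends:Wizard", rules are limited to single-antecedent pairs, and the thresholds and function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_rules(contexts, min_support=2, min_confidence=0.8):
    """Mine rules 'a => b' from sets of structural facts, one set per application class."""
    item_counts = Counter()
    pair_counts = Counter()
    for facts in contexts:
        facts = set(facts)
        item_counts.update(facts)
        # Count every ordered pair of facts seen together in one context.
        pair_counts.update(permutations(sorted(facts), 2))
    rules = []
    for (a, b), n in pair_counts.items():
        confidence = n / item_counts[a]
        if n >= min_support and confidence >= min_confidence:
            rules.append((a, b, confidence))
    return rules


def recommend(rules, current_facts):
    """Return consequents of rules triggered by the facts in the current file."""
    current = set(current_facts)
    return sorted({b for a, b, _ in rules if a in current and b not in current})


# Made-up structural facts extracted from four classes that use a framework.
contexts = [
    {"extends:Wizard", "overrides:addPages", "calls:setWindowTitle"},
    {"extends:Wizard", "overrides:addPages"},
    {"extends:Wizard", "overrides:addPages", "overrides:performFinish"},
    {"extends:Dialog", "calls:setWindowTitle"},
]
rules = mine_rules(contexts)
print(recommend(rules, {"extends:Wizard"}))
```

With this data, a class that extends Wizard but does not yet override addPages would be pointed at that override.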
Strathcona
find examples of API use
1. build a db of structural relations from example code
2. if developer needs help, extract context from developer’s existing code
3. query db with context
4. use heuristics to match examples (similarity of structural relations)
5. return examples as UML class diagram fragments
[Holmes & Murphy, 2005]
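A rough sketch of steps 1 to 4, assuming the structural context is reduced to a set of string facts and that the matching heuristic is a plain overlap count. Strathcona combines several heuristics and returns diagram fragments; the names, data, and scoring here are hypothetical simplifications.

```python
def extract_context(parent, calls, uses):
    """Flatten a class's structural relations into a set of comparable facts."""
    return ({f"extends:{parent}"}
            | {f"calls:{c}" for c in calls}
            | {f"uses:{u}" for u in uses})


def find_examples(example_db, context, top_n=2):
    """Rank stored examples by how many structural facts they share with the query."""
    scored = []
    for name, facts in example_db.items():
        overlap = len(facts & context)
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Step 1: a tiny, made-up database of example classes.
example_db = {
    "StatusView": extract_context("ViewPart", ["getStatusLineManager", "setMessage"],
                                  ["IStatusLineManager"]),
    "LogView": extract_context("ViewPart", ["createPartControl"], ["TableViewer"]),
    "SaveAction": extract_context("Action", ["run"], ["IEditorPart"]),
}

# Steps 2 and 3: context taken from the developer's half-written class.
query = extract_context("ViewPart", ["setMessage"], ["IStatusLineManager"])
print(find_examples(example_db, query))
```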
CodeBroker
help develop new methods based on similar existing ones
1. monitor methods being written and extract words used in comments, signature and types
2. use LSA to match to existing methods modelled similarly
3. filter matches based on similarity of types
4. recommend most similar
[Ye & Fischer 2005]
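The pipeline above can be approximated in a few lines. Note the substitution: this sketch uses bag-of-words cosine similarity as a stand-in for LSA, and exact signature equality as a stand-in for type similarity. The repository entries and function names are made up for illustration.

```python
import math
import re
from collections import Counter


def words(text):
    """Split comments and camelCase identifiers into lowercase word counts."""
    return Counter(p.lower() for p in re.findall(r"[A-Za-z][a-z]*", text))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def recommend(query_text, query_types, repository, top_n=1):
    """Rank repository methods by text similarity, keeping only matching signatures."""
    query = words(query_text)
    matches = []
    for name, (doc, types) in repository.items():
        if types != query_types:  # crude type filter (assumption: exact match)
            continue
        score = cosine(query, words(doc))
        if score > 0:
            matches.append((score, name))
    matches.sort(key=lambda m: (-m[0], m[1]))
    return [name for _, name in matches[:top_n]]


# Hypothetical repository: name -> (documentation, (parameter types..., return type)).
repository = {
    "rand_below": ("return a random integer below the given bound", ("int", "int")),
    "max_of": ("return the greater of two integers", ("int", "int", "int")),
    "rand_fraction": ("return a random number between zero and one", ("double",)),
}

print(recommend("pickRandom: a random number below a bound", ("int", "int"), repository))
```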
API recommendation summary
recommender | generation | accuracy | selecting | undoing
FrUiT | association rule mining | ~85% | expertise required | limits of undo
Strathcona | heuristic | case study eval | difficult | limits of undo
CodeBroker | LSA | high of ~35% | easy | difficult
disclaimer: accuracy figures are as reported, but we aren’t giving the experimental context here, so tread carefully.
Selecting and undoing ratings are subjective.
project history
examples of recommenders based on project history
• eROSE
• expertise recommender
Image from: EEWeb.com
eROSE
recommend program elements likely to change together
1. mine the version control system to form rules of which elements commonly change together
2. when a developer changes element e, find all rules with e and suggest any other elements that usually change with e
[Zimmermann et al 2004]
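A minimal sketch of the two steps, assuming each commit is available as a set of changed element names. The support and confidence thresholds, function names, and sample history are assumptions for illustration, not eROSE's actual settings.

```python
from collections import Counter
from itertools import permutations


def mine_cochange(transactions):
    """Count how often each element changes, and how often ordered pairs change together."""
    changed = Counter()
    together = Counter()
    for tx in transactions:
        items = set(tx)
        changed.update(items)
        together.update(permutations(items, 2))
    return changed, together


def suggest(element, changed, together, min_support=2, min_confidence=0.5):
    """Suggest elements that usually changed in the same commit as `element`."""
    out = []
    for (a, b), n in together.items():
        if a == element and n >= min_support:
            confidence = n / changed[a]
            if confidence >= min_confidence:
                out.append((b, confidence))
    return sorted(out, key=lambda s: (-s[1], s[0]))


# Made-up commit history: each set is one commit.
transactions = [
    {"Parser.parse", "Lexer.next"},
    {"Parser.parse", "Lexer.next", "README"},
    {"Parser.parse", "Ast.visit"},
    {"Lexer.next"},
]
changed, together = mine_cochange(transactions)
print(suggest("Parser.parse", changed, together))
```

Here Lexer.next is suggested with confidence 2/3, since it changed in two of the three commits that touched Parser.parse.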
expertise recommender
use the record of who changed a file to determine who is an expert in that file
[McDonald & Ackerman 2000]
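One simple way to turn change history into an expertise ranking is sketched below. The ranking rule (most changes first, then most recent change) is an assumption chosen for illustration, not the heuristic from the cited paper, and the log format is hypothetical.

```python
from collections import defaultdict


def rank_experts(change_log, path):
    """Rank authors for `path` by number of changes, then by recency.

    change_log: iterable of (timestamp, author, path) tuples in any order,
    with positive numeric timestamps (an assumed format).
    """
    stats = defaultdict(lambda: [0, 0])  # author -> [change count, latest timestamp]
    for timestamp, author, changed_path in change_log:
        if changed_path == path:
            stats[author][0] += 1
            stats[author][1] = max(stats[author][1], timestamp)
    return sorted(stats, key=lambda a: (-stats[a][0], -stats[a][1], a))


log = [
    (1, "ana", "core/io.py"),
    (2, "ben", "core/io.py"),
    (3, "ana", "core/io.py"),
    (4, "cy", "docs/index.md"),
    (5, "ben", "core/net.py"),
]
print(rank_experts(log, "core/io.py"))
```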
software process information
examples of recommenders based on software process information
• who should fix this bug
Image from www.geekcomic.com
who should fix this bug?
during bug triage, a bug needs to be assigned
use machine learning to determine which values
in the bug fields suggest particular developers to fix the bug
[Anvik & Murphy 2006]
interaction information
examples of recommenders based on interaction information
• Eclipse Mylyn
• command recommendation
Image from: EEWeb.com
interaction history is a sequential record of the commands, artifacts, etc. that a developer has interacted with
Eclipse Mylyn
track the program elements associated with each task performed to ease recall, sharing, code
recommendations, etc.
[Kersten & Murphy 2006]
command recommenders
recommend commands in a development environment that a developer is not yet using but many peers are
• requires interaction data from the crowd
• requires interaction data from a user
• many algorithms possible, several based on collaborative filtering
[Murphy-Hill et al 2012]
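As an illustration of the collaborative filtering idea, here is a small user-based sketch. It assumes usage data is reduced to the set of commands each developer has invoked and uses Jaccard similarity between developers. This is one of many possible variants and not necessarily one evaluated in the cited work; the command names and data are made up.

```python
def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_commands(target, usage, top_n=3):
    """Recommend commands the target has never used, weighted by peer similarity."""
    mine = usage[target]
    scores = {}
    for user, commands in usage.items():
        if user == target:
            continue
        similarity = jaccard(mine, commands)
        if similarity == 0:
            continue  # peers with nothing in common contribute nothing
        for command in commands - mine:
            scores[command] = scores.get(command, 0.0) + similarity
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_n]


# Made-up interaction data: developer -> commands observed in their history.
usage = {
    "dev": {"save", "find", "rename"},
    "peer1": {"save", "find", "rename", "extract_method"},
    "peer2": {"save", "find", "open_type", "extract_method"},
    "peer3": {"debug"},
}
print(recommend_commands("dev", usage))
```

Both kinds of data from the slide appear here: the crowd's histories (the peers) and the individual user's own history (the target).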
web information
examples of recommenders based on web information
• Reverb
Reverb
23% of all revisits of web pages by developers are related to code
Reverb recommends the web pages to revisit
[Sawadsky et al 2013]
summary (non-code, non-API)
recommender | generation | accuracy | selecting | undoing
eROSE | association rules | ~26% | moderate | easy
expertise recommender | heuristic & metrics | no data | easy | not applicable
who should fix this bug | machine learning (SVM) | ~40% | moderate | moderate
Eclipse Mylyn | degree-of-interest | 100% recall | easy | easy
command recommenders | collaborative filtering | ~20% | easy | easy
Reverb | heuristic | ~40% | easy | easy
disclaimer: accuracy figures are as reported, but we aren’t giving the experimental context here, so tread carefully.
Selecting and undoing ratings are subjective.
summary
• wide range of recommenders have been built, lots of room for more to be built
• techniques used to generate recommendations vary substantially
• accuracy varies greatly (can only be assessed in context of a task)
• what’s next?
• common technique overview (lectures 2 & 3)
• how to deliver recommendations (lecture 4)
• how to evaluate a recommender (lecture 5)
references
Anvik & Murphy. Who should fix this bug? ICSE 2006.
Bruch, Schäfer & Mezini. FrUiT: IDE support for framework understanding. ETX 2006.
Fokaefs, Tsantalis & Chatzigeorgiou. jDeodorant: identification and removal of feature envy bad smells. ICSM 2007.
Holmes & Murphy. Using structural context to recommend source code examples. ICSE 2005.
Kersten & Murphy. Using a task context to improve programmer productivity. FSE 2006.
Meng, Kim & McKinley. LASE: locating and applying systematic edits by learning from examples. ICSE 2013.
McDonald & Ackerman. Expertise recommender: a flexible recommendation system and architecture. CSCW 2000.
Murphy-Hill, Jiresal & Murphy. Improving software developers’ fluency by recommending development environment commands. FSE 2012.
Robillard, Walker & Zimmermann. Recommendation systems for software engineering. IEEE Software 27(4), 80-86, 2010.
Robillard, Maalej, Walker & Zimmermann (editors). Recommendation Systems in Software Engineering. Springer, 2014.
Sawadsky, Jiresal & Murphy. Reverb: Recommending code-related web pages. ICSE 2013.
Ye & Fischer. Reuse-conducive development environments. Automat. SE Int J., 2005.
Zimmermann, Weissgerber, Diehl & Zeller. Mining version histories to guide changes. ICSE 2004.