4.5 User Experiences
4.5.3 A Financial Application
Another user has applied FindBugs to a large (350K lines of code) financial application.
He notes that upon first running FindBugs on the application, it produced
300 warnings. Of those, the development team considered about 50 to be real bugs.
Many of the remaining warnings are false positives, some of which the team suppresses
using a pattern-matching filter.
Figure 4.4 shows a graph of the number of warnings produced by FindBugs,
CheckStyle [10], and PMD [55] on the application, over a period of about 5 months.
This graph illustrates the importance of the distinction between bug and style checkers. Both PMD
and CheckStyle produced more warnings than FindBugs.
Figure 4.4: Number of warnings produced by FindBugs, CheckStyle, and PMD on a large financial application (x-axis: build number; y-axis: number of warnings)
The fact that the style checkers produced a relatively high
number of warnings that were not fixed demonstrates that their role is not to find
specific bugs, but to check conformance to existing project standards and practices.
Tools such as FindBugs can make a valuable addition to style checkers, since they
find a significant number of real bugs while producing only a moderate number of
Chapter 5
Experiences Analyzing Student Programming Projects
In this chapter, we discuss using FindBugs to analyze programs written by students
in an introductory programming course. Student projects are an interesting target
for static analysis for two reasons.
First, by analyzing student projects we can find out what kinds of coding errors
students are likely to make. We know from experience that many students find
introductory programming courses to be a difficult experience. Without experience
to guide them, students struggling to get past a particular stumbling block can
become caught up in unproductive “Brownian motion” programming, where they
make a series of random changes in the hope of making the project work. This kind
of experience can be very discouraging, and may have a negative effect on student
retention, especially for students without significant prior programming experience.
We surmised that at least some errors typically made by students might be detectable
using static analysis, and that by targeting FindBugs to find some of these
errors, we might be able to improve students’ experience in the course.
Second, student projects are a very useful source of data for evaluating and
improving the accuracy of static analysis. Code written by professional programmers
is often complex and vaguely specified, and generally lacks an objective way to measure
its correctness. In contrast, projects assigned
in introductory programming courses tend to be simple, well-specified, and most
importantly, often have a test suite to measure how well the implementation matches
the specification. A comprehensive test suite allows us to do something that is
extremely difficult to do for arbitrary production software artifacts: determine what
bugs are not found by static analysis.
By analyzing static analysis warnings and test failures for student projects,
we can make progress toward two important objectives:
1. Finding new bug patterns, by examining test failures not predicted by any
static analysis warning
2. Improving analysis effectiveness and accuracy by identifying and addressing
the causes of false negatives and false positives
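The comparison behind these two objectives can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: each snapshot is reduced to two flags (whether any warning was reported, and whether any unit test failed), and the class and method names are hypothetical rather than part of FindBugs or of our tooling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical summary of one snapshot: did any warning fire, did any test fail? */
final class SnapshotResult {
    final String id;
    final boolean warned;
    final boolean failed;

    SnapshotResult(String id, boolean warned, boolean failed) {
        this.id = id;
        this.warned = warned;
        this.failed = failed;
    }
}

final class WarningTestComparison {
    /** Snapshots that failed a test without any warning: candidate false negatives. */
    static List<String> unpredictedFailures(List<SnapshotResult> results) {
        List<String> ids = new ArrayList<>();
        for (SnapshotResult r : results) {
            if (r.failed && !r.warned) {
                ids.add(r.id);
            }
        }
        return ids;
    }

    /** Snapshots with a warning but no failing test: candidate false positives. */
    static List<String> unconfirmedWarnings(List<SnapshotResult> results) {
        List<String> ids = new ArrayList<>();
        for (SnapshotResult r : results) {
            if (r.warned && !r.failed) {
                ids.add(r.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Made-up example data, for illustration only.
        List<SnapshotResult> results = Arrays.asList(
                new SnapshotResult("snap-1", true, true),
                new SnapshotResult("snap-2", false, true),
                new SnapshotResult("snap-3", true, false));
        System.out.println("Unpredicted failures: " + unpredictedFailures(results));
        System.out.println("Unconfirmed warnings: " + unconfirmedWarnings(results));
    }
}
```

Both lists are only starting points for manual inspection. A warning with no failing test may reflect a gap in the test suite rather than a false positive, and a failure with no warning may have a cause that no bug pattern could reasonably capture.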
5.1 The Marmoset Project
In order to better understand bugs in student code, we developed a project snapshot,
submission, and testing system called Marmoset [63, 64]. Like many similar systems,
Marmoset allows students to submit versions of their projects to a central server,
which automatically tests them against a suite of unit tests and stores the results
in a database. However, it differs from previous systems in two ways.
First, it employs a novel mechanism for providing feedback about the results of
unit tests. Students can request to see the names of passed unit tests, along with the
names of the first two failed unit tests, subject to the availability of release tokens.
Each such request consumes one release token, and a used token regenerates
after a fixed period of time. For the courses in which we have used Marmoset, we
have given students 3 release tokens, each of which regenerates 24 hours after being
used. This system encourages students to start work early (since the release tokens
can only be used a fixed number of times per day), and also provides an incentive
for students to think carefully about their work before using a token.
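The token policy described above (3 tokens, each regenerating 24 hours after use) can be modeled roughly as follows. This is a minimal sketch, not Marmoset's actual implementation: the class name `ReleaseTokens` and its methods are hypothetical, and the sketch assumes that calls arrive in non-decreasing time order.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of release tokens that regenerate a fixed time after use. */
final class ReleaseTokens {
    private final int capacity;
    private final Duration regenerationTime;
    // Times at which currently spent tokens were used, oldest first.
    private final Deque<Instant> spentAt = new ArrayDeque<>();

    ReleaseTokens(int capacity, Duration regenerationTime) {
        this.capacity = capacity;
        this.regenerationTime = regenerationTime;
    }

    /** Number of tokens usable at the given time (assumes time never goes backwards). */
    int available(Instant now) {
        // Forget tokens whose regeneration time has fully elapsed.
        while (!spentAt.isEmpty()
                && !spentAt.peekFirst().plus(regenerationTime).isAfter(now)) {
            spentAt.removeFirst();
        }
        return capacity - spentAt.size();
    }

    /** Spends one token if available; returns whether the release request is allowed. */
    boolean tryUse(Instant now) {
        if (available(now) == 0) {
            return false;
        }
        spentAt.addLast(now);
        return true;
    }

    public static void main(String[] args) {
        ReleaseTokens tokens = new ReleaseTokens(3, Duration.ofHours(24));
        Instant start = Instant.parse("2005-01-01T12:00:00Z"); // arbitrary example time
        for (int i = 0; i < 4; i++) {
            System.out.println("Request " + (i + 1) + " allowed: " + tokens.tryUse(start));
        }
        System.out.println("Available a day later: "
                + tokens.available(start.plus(Duration.ofHours(24))));
    }
}
```

Because each token regenerates independently, a student who spends all three tokens in one sitting must wait a full regeneration period before the next release request, which is the incentive to start early described above.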
The second novel feature of Marmoset is that it collects fine-grained snapshots
of student projects as the students work. Each time a student saves a file, her work is
automatically committed to a CVS [14] repository. Over time this creates a detailed
history of each student’s work. Most of the code changes captured by Marmoset are
small: 70% add or change no more than 4 lines of code. A summary of the number
of code snapshots and the sizes of code deltas from one snapshot to the next from
one semester of an introductory Java programming course is shown in Figure 5.1.
In this course, Marmoset captured 33,015 unique snapshots from 73 students over
the course of 8 programming projects. These snapshots are an ideal laboratory for
studying bugs: they record the work of many programmers using many different
programming styles and idioms, and record the code changes made during active
development, allowing us to see how bugs are introduced and fixed.
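A statistic such as "70% of changes add or change no more than 4 lines" is a simple proportion over the per-snapshot delta sizes. The sketch below shows that computation; the class name `DeltaStats` is hypothetical, and the sample sizes in `main` are invented for illustration and are not Marmoset data.

```java
/** Hypothetical helper for summarizing the sizes of snapshot-to-snapshot code deltas. */
final class DeltaStats {
    /** Fraction of deltas that touch at most maxLines lines; 0.0 for empty input. */
    static double fractionAtMost(int[] deltaSizes, int maxLines) {
        if (deltaSizes.length == 0) {
            return 0.0;
        }
        int small = 0;
        for (int size : deltaSizes) {
            if (size <= maxLines) {
                small++;
            }
        }
        return (double) small / deltaSizes.length;
    }

    public static void main(String[] args) {
        // Invented delta sizes (lines added or changed per snapshot), not real data.
        int[] sizes = {1, 2, 4, 5, 30};
        System.out.println("Fraction with at most 4 lines: " + fractionAtMost(sizes, 4));
    }
}
```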