Would you hire a carpenter who did not have and would not use a tape measure? Probably not, because the carpenter who does not use a tape measure probably will not deliver a satisfactory job. Most people recognize readily that measuring tools are necessary to make sure that a structure is laid out according to the plan, that the floors are level and the walls plumb. Yet we buy software to do important work that has been developed and validated by people who do not use any type of measurement.
— Marnie L. Hutcheson [7]
When the whole requirements specification is properly recorded along with test cases, and when the two are linked, processes and algorithms can be defined to measure the quality and validate the consistency of these specifications. The results are several measures, metrics or indicators. These terms are defined in the following section.
The goal of using metrics is to measure your project’s success, to know whether everything is running according to plan and, if it is not, what might be done to correct it [15].
2.2.1 Measures, Metrics, Indicators
The terms measures, metrics and indicators are often used interchangeably and incorrectly. For the purposes of this thesis, they are defined as follows.
When you obtain, observe or measure the value of a directly observable property of an entity, you have a measure. Each measure has a standard unit of measure (UOM), such as seconds, meters or kilograms. In software development and software testing, the most commonly used measures are:
• Number of Defects found in a system or component
• Lines of Code (LOC, kLOC)
• Number of Test Cases
A metric, in contrast, is a derived value which cannot be observed or measured directly. It is a number derived from one or more measures by a formula (or by estimation).
The best-known metrics in software development and software testing are:
• Number of defects found per kLOC, which serves as an estimate of code quality
• Productivity, i.e. Size / Effort
2. RESEARCH & LITERATURE REVIEW
• Defect Density, i.e. number of defects relative to size
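The distinction between measures and metrics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Measures` class, its field names, the choice of person-days as the unit of effort and the sample values are all hypothetical and do not come from any cited source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measures:
    """Directly observed values for one component (hypothetical sample data)."""
    defects_found: int          # unit: defects
    lines_of_code: int          # unit: LOC
    effort_person_days: float   # unit: person-days (assumed unit of effort)


def defects_per_kloc(m: Measures) -> float:
    """Metric derived from two measures: defects relative to size, per 1000 LOC."""
    return m.defects_found / (m.lines_of_code / 1000)


def productivity(m: Measures) -> float:
    """Metric derived from two measures: Size / Effort, here LOC per person-day."""
    return m.lines_of_code / m.effort_person_days


component = Measures(defects_found=12, lines_of_code=8000, effort_person_days=40.0)
print(defects_per_kloc(component))  # 1.5
print(productivity(component))      # 200.0
```

The three fields are measures because each can be counted or recorded directly; the two functions are metrics because their results exist only after applying a formula.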
There are two types of metrics: objective and subjective. Objective metrics can be quantified and are readily available; subjective metrics rely on opinions, gut feelings, personal attitudes and the like. An example of a subjective metric is CSAT (customer satisfaction). Although objective metrics are more reliable than subjective ones, the reliability of subjective metrics can be improved with checklists and guidelines. For example, a survey question needs to have its probe areas, facets and scale definitions specified before an option can be chosen.
An indicator is “a thing that indicates the state or level of something” 2, so it can be as simple as a number showing the value of a particular measure or metric. A better indicator could be a chart comparing two measures or metrics, or showing how a measure or metric developed over a period of time. A semaphore where red means bad and green means good is also a very simple indicator, which can be helpful in a particular class of situations. Indicator is thus the most general of these terms.
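A semaphore indicator of the kind described above might look like the following minimal sketch. The function name `semaphore`, the threshold of 1.5 defects per kLOC and the monthly values are assumptions made up for the example; the sketch also assumes a metric where lower values are better.

```python
def semaphore(metric_value: float, threshold: float) -> str:
    """Turn a 'lower is better' metric into a red/green indicator.

    The threshold is a hypothetical limit agreed on by the team.
    """
    return "green" if metric_value <= threshold else "red"


# Hypothetical defects-per-kLOC values over three months.
monthly_defect_density = {"Jan": 2.4, "Feb": 1.9, "Mar": 1.2}

for month, value in monthly_defect_density.items():
    print(month, value, semaphore(value, threshold=1.5))
# Jan 2.4 red
# Feb 1.9 red
# Mar 1.2 green
```

Printing the values month by month combines both kinds of indicator mentioned in the text: the colour gives the current state at a glance, and the sequence shows how the metric developed over time.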
2.2.2 Requirements on Metrics
When it is possible to measure something, it can be comprehended better and more can be known about it. Better comprehension brings a better chance of improving it.
Effective metrics must be simple, objective, measurable, meaningful and have easily accessible underlying data [9]. The most important attributes of a metric are summarized in this section.
• Simple – so that errors in computation or interpretation are avoided and the metric can be comprehended.
• Objective – based on the goals and objectives of the company.
• Measurable – based on things which can be reliably measured, not estimated or guessed.
• Meaningful – so that it helps managers to understand important aspects of their projects.
• Easy to collect – collection of the metric shall be automated and non-intrusive, i.e. it shall not interfere with other activities of the developers.
• Easy to interpret – so that it is easy to comprehend the metric and to understand the causes which affect it.
• Hard to misinterpret – even when a metric is easy to interpret, there may be cases leading to wrong conclusions. Metrics shall be designed with caution so that such cases are avoided.
2. Oxford Dictionary - "indicator" - http://oxforddictionaries.com/definition/english/indicator
• Valid – so that the metric really measures what it is supposed to measure; this prevents systematic errors.
• Reliable (consistent) – the metric must perform its required function under stated conditions for a specified period of time and provide consistent results; this prevents random deviations in measurement.
2.2.3 Efficiency and Effectiveness
These two terms are often used interchangeably and incorrectly, which causes confusion. In some languages (e.g. Czech) they are even translated as a single word, which makes it hard to differentiate between them. There are several articles about the differences between these two terms, some of them confusing as well. To read and understand this thesis, the reader needs to understand both terms properly and be able to recognize the difference [14].
Efficiency is the ratio of useful output to provided input. It is a measure of the resources (money, time, effort, etc.) that are needed to produce a desired result. Being efficient is about doing things right.
Effectiveness describes the impact of your decisions. It represents the degree to which something is successful in producing a desired result. Being effective is about doing the right things.
Efficiency and effectiveness depend on each other. Usually, when one is improved, the other worsens. Therefore it is essential to focus on both.
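The difference can be made concrete with a small sketch from software testing. The two formulas below are one possible operationalization, assumed here for illustration only: effectiveness as the share of all known defects that testing found, and efficiency as defects found per hour of testing effort. The function names, parameter names and the two campaigns are hypothetical.

```python
def effectiveness(defects_found: int, total_defects: int) -> float:
    """Degree to which testing achieved the desired result (finding the defects)."""
    return defects_found / total_defects


def efficiency(defects_found: int, effort_hours: float) -> float:
    """Useful output (defects found) per unit of input (hours of effort)."""
    return defects_found / effort_hours


# Two hypothetical test campaigns against a product with 50 known defects.
thorough = {"found": 45, "hours": 300.0}
quick = {"found": 20, "hours": 40.0}

for name, c in (("thorough", thorough), ("quick", quick)):
    print(name, effectiveness(c["found"], 50), efficiency(c["found"], c["hours"]))
# thorough 0.9 0.15
# quick 0.4 0.5
```

In this made-up data the thorough campaign is more effective but less efficient, and the quick campaign is the reverse, which is the trade-off described above.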
2.2.4 Reliance on Metrics
Metrics and indicators usually play an important role in software quality and maturity. They provide a quick overview of the project and the direction in which it is moving. With metrics it is simpler to target improvements and assess them.
However, metrics can also have a negative impact on developers and on the code itself.
The problem occurs when people forget their original goal and fall into the false proxy trap (quoted below). Usually it is not possible to measure the quality of a product directly, so we choose to measure something much easier, e.g. the number of defects found. This measure becomes an approximation of the property we originally wanted to measure.
Once you find the simple proxy and decide to make it go up, there are lots of available tactics that have nothing at all to do with improving the very thing you set out to achieve in the first place. When we fall in love with a proxy, we spend our time improving the proxy instead of focusing on our original (more important) goal instead.
— Seth Godin [5]
Gaming the system to improve metrics is not the goal people should focus on. They usually do so anyway, because it is easier for them and much easier for their managers. Metrics that are gamed are not useful or, worse, they can lower the quality of the final product. They can also falsely increase a manager’s confidence in the quality of the product.
Metrics themselves do not solve problems; used properly, they help to avoid them.
A description of several metrics follows which, when blindly followed and/or improved, may and probably will cause issues for the project:
• Code Coverage by Tests – Test cases are written to cover use cases. When a piece of code is not covered by any test case, either the code does not need to be there or there is a use case that is not recorded in the specification and not tested. It makes no sense to add a new test case only to cover a line of code if it does not cover the whole use case. On the other hand, some pieces of code are so simple that they do not need to be tested (simple constructors, getters, setters); errors in these should be found by other tools, such as heuristics in static code analysis.
• Number of Test Cases – The more critical a part of the system or a use case is, the more likely it should be that it is properly covered by test cases. When a team’s goal is to write, e.g., “ten test cases a month”, they will choose the simplest parts to test: parts so simple that they do not need to be tested at all, or parts which are not critical for the software. In such a case, it would have been better to write only two test cases that cover critical use cases and cover them properly.
These two are just simple examples. A similar situation can arise with any metric or indicator if a team starts following it blindly. Keep your goal in mind and check whether you are actually aiming for it. A metric can simplify this for you, but human judgement is still important. Educate the team about the real goals and how to pursue them. Make sure they understand these goals and will not follow the metrics blindly.
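The “ten test cases a month” scenario can be sketched as follows. This is an illustration under stated assumptions, not a recommended metric: the class `RecordedTest`, the set of critical use cases, the function names and both teams’ test lists are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordedTest:
    """One test case linked to the use case it covers (hypothetical model)."""
    name: str
    use_case: str


# Hypothetical set of use cases the project considers critical.
CRITICAL_USE_CASES = {"payment", "login"}


def count_tests(tests: list[RecordedTest]) -> int:
    """The proxy: number of test cases written."""
    return len(tests)


def critical_coverage(tests: list[RecordedTest], critical: set[str]) -> float:
    """Closer to the real goal: share of critical use cases with at least one test."""
    covered = {t.use_case for t in tests} & critical
    return len(covered) / len(critical)


# Team A hits the target of ten test cases by testing trivial accessors.
team_a = [RecordedTest(f"getter_{i}", "trivial accessor") for i in range(10)]
# Team B writes only two test cases, both for critical use cases.
team_b = [RecordedTest("pay_by_card", "payment"), RecordedTest("login_ok", "login")]

print(count_tests(team_a), critical_coverage(team_a, CRITICAL_USE_CASES))  # 10 0.0
print(count_tests(team_b), critical_coverage(team_b, CRITICAL_USE_CASES))  # 2 1.0
```

Judged by the proxy alone, team A looks five times better. Judged by coverage of critical use cases, team A has achieved nothing. The second number is itself only a proxy and could be gamed in turn, which is why human judgement remains necessary.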