4.2 Results
6.2.5 Preference towards Layout-based or Visual GUI testing tools
Of the 78 complete surveys collected, students expressed a slight preference (see figure 6.11) for the Visual testing technique over the Layout-based GUI testing technique. Specifically, 43 students (55.1% of the total sample) indicated EyeAutomate as the tool they would use again if they had to perform GUI testing, whereas 35 (44.9%) indicated Espresso.
The explanations given by the students for preferring one testing tool or the other seemed to be driven principally by individual preference rather than by actual advantages or disadvantages exhibited by the tools. For instance, the answer "it is easier to use" was frequently given by respondents for both testing tools.
Most of the respondents who selected EyeAutomate as the preferred way of testing Android apps gave the easier generation of test cases (enabled by the Capture & Replay functionality of the EyeStudio IDE) as the main explanation for their choice. Some respondents also recognized that EyeAutomate is more suitable when the tester is not the developer, and when the focus of the testing is not strictly on the functionalities of the application but principally on the actual appearance presented to the user. Other respondents praised the portability of the Visual approach to GUI testing and the possibility of working with multiple platforms.
On the other hand, some students who preferred Espresso pointed out that the tests developed with this tool felt more precise and reliable than the ones obtained
through screen captures and image recognition. Emphasis was also put on the robustness of layout-based testing tools to minimal changes in the graphics of the application. Several students highlighted as an advantage the greater control over the application that layout-based testing tools provide. The fact that Espresso is specifically designed for Android testing (instead of being a general-purpose tool for GUI applications applied to an emulated Android device on a desktop PC) was also highlighted as a reason for preferring Espresso to test an Android app.
In general, however, a significant number of participants leveraged the automated Espresso Test Recorder for the creation of test scripts, instead of manually writing down the operations for identifying views and executing operations on them. It can be assumed that many respondents’ answers about the preference towards Espresso were largely influenced by the availability of such an add-on for the Android Studio IDE.
Answer to RQ2.3: The respondents to the experiment found the development of test suites using the EyeAutomate library, in the context of the EyeStudio companion IDE, slightly easier than developing scripted Espresso test cases in the Android Studio IDE. The respondents identified the imprecision of the image recognition library and the difficulty in finding individual ids for the widgets as the most problematic aspects of, respectively, the proposed Visual and Layout-based testing tools.
Study 3: Measures of Diffusion and Evolution of Testware in OS projects
In order to give a quantitative evaluation of the issue of maintenance and fragility among open-source Android projects, measurements were performed on a set of repositories mined according to the procedure detailed in section 3.
This study made it possible to answer the third research question of the study: RQ3 - What is the adoption and typical evolution of test suites with automated GUI testing frameworks among Android open source projects?
The design and results of this study have been presented in a workshop paper at INTUITEST 2017 [29], in a conference paper at PROMISE 2017 [28], and in a journal paper published in IEEE Transactions on Reliability [32].
7.1 Study design
A set of metrics has been defined in order to (i) quantify the adoption of testing tools in sets of Android app projects, and (ii) quantify the evolution required of the test suites during the normal evolution of an Android app project.
Hence, RQ3 is split into the following subquestions:
• RQ3.1 - Adoption and Size: What is the level of adoption of a set of automated testing tools among open-source Android projects?
• RQ3.2 - Evolution: How much are GUI test classes associated with the analyzed sets of tools modified through consecutive releases of an open-source Android project?
To answer the two subquestions, a set of 12 metrics has been introduced. The metrics are subdivided into two groups, and are based on an input consisting of test classes (of a single release, or belonging to two consecutive releases of the same project), production code classes, and .txt difference files (from now on, diff files) computed between consecutive versions of the same file. The metrics have been defined for Java code, and are hence applicable to any kind of Java application provided with test code written in Java itself, not only to Android applications.
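To make the role of the diff files more concrete, the following is a minimal Python sketch of how added and removed lines could be counted between two consecutive versions of the same file. It is only an illustration: the function name `diff_line_counts` is hypothetical, and the standard-library `difflib` module is used here in place of whatever tooling actually produced the .txt diff files of the study.

```python
import difflib


def diff_line_counts(old_source: str, new_source: str) -> tuple[int, int]:
    """Return (added, removed) line counts between two versions of a file.

    Hypothetical helper: a replaced line counts as one removed line in the
    old version plus one added line in the new version.
    """
    old_lines = old_source.splitlines()
    new_lines = new_source.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)

    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines present only in the old version
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines present only in the new version
    return added, removed


# Two consecutive (made-up) versions of a small Java test class.
release_n = "class T {\n  void a() {}\n}"
release_n_plus_1 = "class T {\n  void a() { click(); }\n}\n// end"
print(diff_line_counts(release_n, release_n_plus_1))
```

Counting a modified line as one removal plus one addition is a design choice of this sketch; a different convention (for example, a separate "modified" counter) would be equally easy to implement.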
Change metrics have been proposed by several works available in the literature. A popular example is the set of metrics defined by Tang et al. [96], which were defined to describe the amount of bug-fixing change histories in source files. The metrics proposed by Tang et al. are subdivided into pure size metrics (e.g., the amount of added or removed lines of code between two releases of the same file), atomic metrics (e.g., boolean metrics that are equal to one if a test class features added methods or deleted methods), and semantic metrics (e.g., counting the number of changed dependencies inside a test class).
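As an illustration of what an atomic (boolean) metric of this kind might look like, the Python sketch below flags whether a test class gained or lost methods between two releases. The names `method_names` and `atomic_metrics` are hypothetical, and the regular expression is a deliberately naive stand-in for a real Java parser; it is not the procedure used by Tang et al. or in this work.

```python
import re

# Naive pattern for Java method declarations with an explicit visibility
# modifier. This is an assumption made for illustration only; it ignores
# package-private methods, generics with spaces, and many other cases.
_METHOD = re.compile(
    r"^\s*(?:public|protected|private)\s+(?:static\s+)?[\w<>\[\]]+\s+(\w+)\s*\(",
    re.MULTILINE,
)


def method_names(java_source: str) -> set[str]:
    """Collect the names of the methods declared in a Java source string."""
    return set(_METHOD.findall(java_source))


def atomic_metrics(old_source: str, new_source: str) -> dict[str, int]:
    """Return 0/1 flags for added and deleted methods between two versions."""
    old, new = method_names(old_source), method_names(new_source)
    return {
        "has_added_methods": int(bool(new - old)),
        "has_deleted_methods": int(bool(old - new)),
    }


OLD = """public class LoginTest {
    public void testLogin() {}
    public void testLogout() {}
}"""
NEW = """public class LoginTest {
    public void testLogin() {}
    public void testSignup() {}
}"""
print(atomic_metrics(OLD, NEW))
```

Note that a renamed method is reported here as one deletion plus one addition, since the sketch compares method names only.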
Most of the metrics defined in this work are instead relative metrics because they aim at measuring the co-evolution of test code along with the normal evolution of the project the tests are associated with. The metrics are also normalized, in order to allow the comparison between different projects, with production code and test code bases of different sizes. The normalization of the metrics makes them inapplicable to testing tools that create test scripts in languages that are different from pure Java, or to apps containing fragments of code in different languages (e.g., Kotlin for Android programming).
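The effect of normalization can be sketched with a short Python example. The function `loc_ratio` below is a hypothetical illustration of a ratio-style metric (test LOCs divided by the LOCs of the whole code base); it is an assumption made for this example and may differ from the exact formulas defined in this work.

```python
def loc_ratio(test_loc: int, production_loc: int) -> float:
    """Fraction of a code base made up of test code.

    Assumption for this sketch: production_loc counts all the code of the
    application, test code included, so the result lies between 0 and 1.
    """
    if production_loc <= 0:
        raise ValueError("production_loc must be positive")
    if not 0 <= test_loc <= production_loc:
        raise ValueError("test_loc must be between 0 and production_loc")
    return test_loc / production_loc


# Two made-up projects of very different sizes.
small_project = loc_ratio(test_loc=120, production_loc=2_400)
large_project = loc_ratio(test_loc=3_000, production_loc=60_000)

# The absolute test LOCs differ by a factor of 25, yet the normalized values
# are directly comparable.
print(small_project, large_project)
```

Dividing by the size of the code base is what allows a project with a few thousand lines to be compared with one that is an order of magnitude larger; the raw counts alone would mostly reflect project size.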
Table 7.1 summarizes all the defined metrics, the acronyms that are used in the remainder of the paper, and the research question they contribute to answering. The formulas for computing them are provided in the following. In the definition of the metrics, production code indicates all the code of the application, including both program code and test code.
Table 7.1 Defined metrics for the computation of diffusion and evolution of test suites for Layout-based GUI testing
Group                      Name    Explanation
Diffusion and size (RQ1)   TD      Tool Diffusion
                           NTR     Number of Tagged Releases
                           NTC     Number of Test Classes
                           TTL     Total Test LOCs
                           TLR     Test LOCs Ratio
Test evolution (RQ2)       MTLR    Modified Test LOCs Ratio
                           MRTL    Modified Relative Test LOCs
                           MRR     Modified Releases Ratio
                           TSV     Test Suite Volatility
                           MCR     Modified Test Classes Ratio
                           MMR     Modified Test Methods Ratio
                           MCMMR   Modified Classes with Modified Methods Ratio