7.1.2 Improving Development (and Evaluation)
One of the main problems of the field of usable and secure authentication mechanisms, within and outside of the public setting, quickly became apparent when analyzing related work. It became even more apparent during the design and evaluation of the authentication mechanisms presented in chapter 3. Even though the majority of related work describes thorough and interesting evaluations, there is no common approach for evaluating systems and, moreover, for reporting results. This makes comparing different solutions harder than necessary. It is basically impossible to say that system A is better than system B since the information needed to judge this objectively is not available or only partially available. We encountered this problem in its full gravity when we analyzed our approaches on a comparative basis.
Therefore, the second main contribution of this thesis is a set of criteria on how to design and evaluate authentication mechanisms with the goal of improving them at a very early design stage and making the outcome comparable. However, a system is not necessarily superior to others just because it fulfills more of the criteria. Depending on the context in which the system is used, some criteria can be considered more important than others. This is to be decided by the respective researcher, but it has to be kept in mind that not fulfilling specific criteria can limit the system’s security, usability or both.
With respect to a public setting as defined in chapter 1.1, we found that social factors play a very important role and can significantly influence security as well as usability. However, the analysis of related work (see chapter 2.5) revealed that there is only little work on how behavioral factors influence authentication. Not surprisingly, two of the criteria defined in this thesis are therefore directly related to social interaction and based on other people being in the immediate vicinity of the user.
The work presented in chapter 3 represented the first active step toward identifying weaknesses of the standard evaluation process of authentication mechanisms. It helped to improve and newly define criteria that could not have been found without implementing and evaluating real prototypes. For instance, the work on MobilePIN [29] showed that when designing and evaluating an authentication mechanism for public spaces, it is extremely important to measure authentication speed honestly, and to explicitly define which interaction is part of the authentication process and which is not. We could also show that practical security evaluations often reveal weaknesses that are impossible to find in a theoretical analysis. An example of a new finding is what ended up becoming criterion 5: security should not require an active user. This criterion is based on findings from the evaluation of VibraPass [34], which showed that the theoretical security of the system was negatively influenced by so-called “bad lies” by the users. We had to learn where this design flaw comes from and how to avoid it in the future. Finally, working on authentication mechanisms revealed knowledge gaps that could then be filled in the following work by performing long-term and field studies.
Based on these results, a long-term study on PIN-entry with different keypad settings is presented in chapter 4. We made the decision to use standard PIN-entry. This way, the users were dealing with a concept they were already very familiar with, and thus novelty effects and the like could be minimized. With the knowledge that time measurement has to be very precise, an authentication speed measurement method based on three phases was proposed and applied: preparation, active authentication and cleanup. We could show that this approach has multiple benefits, and that it in fact helped to reveal very important factors that would have stayed hidden otherwise. For instance, a critical disadvantage of random keypad layouts concerning the preparation phase could only be revealed by applying this advanced measurement. Additionally, the study showed that both inner and outer consistency are important factors that an authentication mechanism should fulfill to a certain degree. It gave several hints that randomization has a negative influence on the performance of an authentication mechanism and should thus be avoided if possible. This is a serious limitation since randomization is often the tool of choice to make a system more secure. These results directly influenced criteria 2 and 3.
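The phase-based time measurement described above can be illustrated with a minimal sketch. It only assumes that each authentication attempt is logged with four timestamps. The exact events that delimit the phases, as well as the names AuthTrial, phase_durations and mean_phase_durations, are hypothetical choices made for this illustration and are not taken from the study itself.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

PHASES = ("preparation", "active_authentication", "cleanup")


@dataclass
class AuthTrial:
    """Timestamps (in seconds) of one authentication attempt.

    The phase boundaries used here are assumptions for illustration only.
    """
    start: float        # user starts to prepare (e.g. orients on the keypad)
    first_input: float  # first input of the secret
    last_input: float   # last input of the secret
    end: float          # user has finished cleaning up after authentication

    def phase_durations(self) -> Dict[str, float]:
        """Split the total time of the attempt into the three phases."""
        if not (self.start <= self.first_input <= self.last_input <= self.end):
            raise ValueError("timestamps must be in chronological order")
        return {
            "preparation": self.first_input - self.start,
            "active_authentication": self.last_input - self.first_input,
            "cleanup": self.end - self.last_input,
        }


def mean_phase_durations(trials: List[AuthTrial]) -> Dict[str, float]:
    """Average each phase separately over all trials."""
    if not trials:
        raise ValueError("at least one trial is required")
    per_trial = [trial.phase_durations() for trial in trials]
    return {phase: mean(d[phase] for d in per_trial) for phase in PHASES}


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    trials = [
        AuthTrial(start=0.0, first_input=1.5, last_input=4.0, end=4.5),
        AuthTrial(start=10.0, first_input=10.5, last_input=13.0, end=14.0),
    ]
    for phase, seconds in mean_phase_durations(trials).items():
        print(f"{phase}: {seconds:.2f} s")
```

Reporting the three phases separately, instead of a single total, is what allows an effect that is confined to one phase (such as a longer preparation phase) to become visible even when the active authentication time is unchanged.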
Since the long-term study on PIN-entry focused on technical aspects of authentication, the criteria influenced by it are rather technical as well. To deal with the lack of knowledge on behavioral factors and their influence on the authentication process, we conducted a long-term field study on ATM use in combination with a follow-up field study and a series of public interviews, presented in chapter 5. Based on this, five implications were derived dealing with authentication speed, memorability issues, social factors, distraction and the users’ role in security. These findings influenced criteria 2, 4, 5, 6 and 7 respectively. For instance, we could show that in instances in which memorability was an issue, this not only reduced usability but also compromised security. Another important finding was that in public settings, diverse distractions occur that have a negative effect on security and usability and thus should be taken into account when designing authentication mechanisms.
In chapter 6, we presented a thorough description of the seven final criteria and how they apply to a standard development process. The criteria play different roles in the different phases of the development process. With respect to the contribution of the criteria, it was very important for us to find out whether fulfilling them actually plays a role in practice, beyond the theoretical benefits that each of them has. Therefore, we applied them to two authentication mechanisms presented in this thesis, VibraPass and EyePassShapes. EyePassShapes, which is the more promising approach of the two, fulfills most of the criteria, while many of them are violated by VibraPass. This way, we could show that applying the criteria not only makes the design process of authentication mechanisms for public spaces more efficient, but also leads to “better” systems.
The criteria were created, extended and improved based on a long series of work spanning four years. In their current state, they are highly refined and have proved their importance in many different contexts. For instance, the phase-based time measurement revealed problems of randomized interfaces that would have stayed hidden when measuring plain authentication only. We could also show that memorability influences not only usability but also security, since it can lead to insecure behavior. The field study showed how social factors can hinder security. Such problems can be avoided by adhering to the respective criteria.
Using and applying these criteria will help to refine concepts with respect to very important factors at very early design stages. Additionally, the criteria provide a way to evaluate authentication mechanisms, to reveal important findings and to directly compare the results to other systems. The seven criteria were created for the context of public spaces. Nevertheless, we argue that they can be applied to any authentication mechanism, with the limitation that some of them might be neglected. For instance, designing an authentication system for a private home scenario limits the validity of criterion 6, social compatibility, to family members and close friends. Criteria 1, 2, 3 and 4, however, are mostly independent of any context.