We store the data we capture in a unified format for all events, so that they can be easily transformed for export in formats used by post-processing and analysis software.
A. BELT - BEHAVIOURLOGGINGTOOL
Table A.2: Data format for keystroke events.
Sequence  Event Type  Action  Value   Time  Relation  Flag  Additional fields
n         'K'         'D'     String  ms    Seq.      Int   n/a
n         'K'         'U'     String  ms    Seq.      Int   Count
Attributes shared by all events are:
• Sequence – a session-unique Integer value assigned by order of occurrence;
• Event type – a char value indicating the type of event, i.e. 'K'/keyboard, 'M'/mouse, 'S'/software and 'H'/hardware;
• Action – a sequence of char values further detailing the event, depending on the event type;
• Value – a String value representing an instance of a detailed event type, depending on event type and action;
• Time – a timestamp when the event occurred, with millisecond precision;
• Relation – an Integer value representing a related Sequence.
In addition, some event types have special attributes:
• Flag – used for keystroke, mouse, and software events;
• Value-2 – used for hardware events;
• Count – used for keystroke events;
• Rectangle – used for mouse events;
• Element description, element ID, area, extra information, additional flag – used for software events.
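The shared attributes listed above can be pictured as a single record type. The following is a minimal sketch in Python, not BeLT's actual storage layout: the class name `BeltEvent`, the field names, the integer millisecond timestamp, the `extra` dictionary and the CSV column order are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BeltEvent:
    """One logged event; all names here are hypothetical."""
    sequence: int                # session-unique, assigned by order of occurrence
    event_type: str              # 'K', 'M', 'S' or 'H'
    action: str                  # e.g. 'D', 'U', "FC", "KEY"
    value: str                   # meaning depends on event type and action
    time_ms: int                 # timestamp in milliseconds (assumed representation)
    relation: int = 0            # Sequence of a related event (0 assumed for "none")
    flag: Optional[int] = None   # used by keystroke, mouse and software events
    extra: Dict[str, str] = field(default_factory=dict)  # Count, Rectangle, ...

    def to_row(self):
        """Flatten the shared attributes into one row for export."""
        flag = "" if self.flag is None else self.flag
        return [self.sequence, self.event_type, self.action, self.value,
                self.time_ms, self.relation, flag]


def export_csv(events):
    """Return the shared attributes of all events as CSV text (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["sequence", "event_type", "action", "value",
                     "time_ms", "relation", "flag"])
    for event in events:
        writer.writerow(event.to_row())
    return buffer.getvalue()
```

Keeping the shared attributes in fixed positions and the type-specific ones in a separate mapping is one way to keep a single export routine for all four event types.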
A.4.1 Keystroke events
The data format for keystroke events is shown in Table A.2. The event type is always 'K'. Keystroke events have only two types of actions, key press ('D'/down) and key release ('U'/up). The UTF-8-encoded Value field states which key was pressed or released; it is either the typed character or a string with a descriptive key name, enclosed by "||". An ISO 8601-compliant timestamp of when the event occurred is recorded in milliseconds [62]. The relation attribute contains the seq. number of a corresponding previous event and is further documented in Section A.4.5. Flag is an Integer indicating which alternate/system keys were active. Bit 0 (LSB) is set when the Alt key was down, bit 1 is set for Ctrl, bit 2 for Shift, bit 3 for Win, bit 4 for Caps Lock, bit 5 for Num Lock, and bit 6 for Scroll Lock. For example, if Alt+Shift was pressed, Flag would be set to 5 (bit 0 + bit 2 = 1 + 4). In case of a key up event, an additional field Count states how often a key was repeated. Permanent use of Num Lock may indicate that the user operates a numeric keypad.
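The bit layout of the keystroke Flag can be illustrated with a short sketch. The bit positions follow the description above; the function and constant names are hypothetical and not part of BeLT.

```python
from typing import Iterable, List

# Bit positions as described in the text: bit 0 (LSB) = Alt ... bit 6 = Scroll Lock.
MODIFIER_BITS = ("Alt", "Ctrl", "Shift", "Win",
                 "Caps Lock", "Num Lock", "Scroll Lock")


def decode_key_flag(flag: int) -> List[str]:
    """Return the names of all keys whose bit is set in Flag."""
    return [name for bit, name in enumerate(MODIFIER_BITS) if flag & (1 << bit)]


def encode_key_flag(active: Iterable[str]) -> int:
    """Build a Flag value from the names of the active keys."""
    flag = 0
    for name in active:
        flag |= 1 << MODIFIER_BITS.index(name)
    return flag
```

With this layout, `decode_key_flag(5)` yields Alt and Shift, matching the Alt+Shift example in the text.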
A.4.2 Mouse events
A.4 DATAFORMAT
Table A.3: Data format for mouse events.
Sequence  Event Type  Action  Value  Time  Relation  Flag  Additional fields
n         'M'         'M'     x_y    ms    Seq.      n/a   n/a
n         'M'         'U'     x_y    ms    Seq.      Int.  Rectangle
n         'M'         'D'     x_y    ms    Seq.      Int.  Rectangle
n         'M'         'W'     Delta  ms    Seq.      n/a   n/a
Table A.4: Data format for software events.
Sequence  Event Type  Action  Value     Time  Relation  Flag      Additional fields
n         'S'         "FC"    s/w name  ms    Seq.      El. type  Desc., ID, area
n         'S'         "EI"    s/w name  ms    Seq.      El. type  Desc., ID, area
n         'S'         "MO"    s/w name  ms    Seq.      El. type  Desc., ID
n         'S'         "MMS"   s/w name  ms    Seq.      El. type  Desc., ID
n         'S'         "WO"    s/w name  ms    Seq.      El. type  Desc., ID
n         'S'         "TC"    s/w name  ms    Seq.      El. type  Desc., ID, extra inf.
n         'S'         "VC"    s/w name  ms    Seq.      El. type  Desc., ID, add. flag
n         'S'         "OCS"   s/w name  ms    Seq.      State     Desc., ID, area
The data format for mouse events is shown in Table A.3. The event type is always 'M'. Mouse events can have four types of actions: mouse move ('M'/move), mouse wheel use ('W'/wheel), mouse button press ('D'/down), and mouse button release ('U'/up). The Value field contains the x and y mouse pointer coordinates joined by an underscore character ('_'), i.e. x_y. In case of mouse wheel use, Value is the corresponding delta value indicating how much the wheel was scrolled; positive values are upward scrolls, negative values are downward scrolls. An ISO 8601-compliant timestamp of when the event occurred is recorded in milliseconds [62]. The relation attribute contains the seq. number of a corresponding previous event and is further documented in Section A.4.5. Flag is an Integer indicating which mouse button was pressed/released: 1 = left, 2 = middle, 3 = right. If several buttons are pressed, several events are generated. Mouse move and mouse wheel events do not use the Flag value. In case of a mouse button press/release, an additional field Rectangle stores the active area, i.e. the client area of the UI element, where the mouse click happened.
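As an illustration of the mouse event format of Section A.4.2, the Value field can be interpreted according to the Action: pointer coordinates joined by an underscore for moves and button events, and a signed delta for wheel events. This is a minimal sketch; the function name, the returned dictionary keys and the handling of a zero delta are assumptions.

```python
from typing import Dict, Union

# Flag values for mouse button events as given in the text.
MOUSE_BUTTONS = {1: "left", 2: "middle", 3: "right"}


def parse_mouse_value(action: str, value: str) -> Dict[str, Union[int, str]]:
    """Interpret the Value field of a mouse event according to its Action."""
    if action == "W":
        delta = int(value)
        if delta > 0:
            direction = "up"      # positive values are upward scrolls
        elif delta < 0:
            direction = "down"    # negative values are downward scrolls
        else:
            direction = "none"    # assumed handling, not specified in the text
        return {"delta": delta, "direction": direction}
    # 'M', 'D' and 'U' events carry the pointer position as x_y.
    x_text, y_text = value.split("_")
    return {"x": int(x_text), "y": int(y_text)}
```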
A.4.3 Software events
The data format for software events is shown in Table A.4. The event type is always 'S'. Software events can have eight types of actions as described in Table A.1. The Value field contains the executable file name of the process displaying the user interface. An ISO 8601-compliant timestamp of when the event occurred is recorded in milliseconds [62]. The relation attribute contains the seq. number of a corresponding previous event and is further documented in Section A.4.5. Flag is an Integer indicating the type of element according to the UIA control type identifiers⁵, or the state of the UI element (0 = unpressed, 1 = pressed) in case of the OCS action. Flag is stored as an offset to the UIA_ButtonControlTypeId base constant, i.e. the one with the lowest value. Additional fields comprise a UI element description and an element ID, i.e. the UIA_NamePropertyId (description) and the UIA_AutomationIdPropertyId (ID)⁶. An additional flag is recorded for Visual Change ("VC") events, with values covering 1 = Restored, 2 = Maximized, 3 = Minimized. Area logs the rectangular area on the desktop that the element occupies, formatted as left, top, right, bottom. Extra information contains the resulting text of the UI element when a Text Change ("TC") event occurs.
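The software event fields described above can be decoded as in the following sketch. The value 50000 for UIA_ButtonControlTypeId comes from the Windows SDK; the function names and the comma as separator of the Area field are assumptions made for illustration.

```python
from typing import Optional, Tuple

# Value of UIA_ButtonControlTypeId in the Windows SDK, the lowest control type ID.
UIA_BUTTON_CONTROL_TYPE_ID = 50000

VISUAL_CHANGE_FLAGS = {1: "Restored", 2: "Maximized", 3: "Minimized"}
OCS_STATES = {0: "unpressed", 1: "pressed"}


def control_type_id(action: str, flag: int) -> Optional[int]:
    """Recover the UIA control type ID from Flag; OCS events carry a state instead."""
    if action == "OCS":
        return None
    return UIA_BUTTON_CONTROL_TYPE_ID + flag


def parse_area(text: str) -> Tuple[int, int, int, int]:
    """Parse 'left, top, right, bottom'; the comma separator is an assumption."""
    left, top, right, bottom = (int(part) for part in text.split(","))
    return left, top, right, bottom
```

Storing only the offset keeps Flag small, and adding the base constant restores the identifier that can be looked up in the UIA documentation.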
⁵ http://msdn.microsoft.com/en-us/library/windows/desktop/ee671198.aspx
⁶ http://msdn.microsoft.com/en-us/library/windows/desktop/ee684017.aspx
A.4.4 Hardware events
The data format for hardware events is shown in Table A.5. The event type is always 'H'. Hardware events can have five types of actions: KEY (keyboard type and language), SCR Info (information on screen resolution per physical screen whenever one is detected during the session; a change of screen resolution also triggers a SCR Info action), SCR (change of the screen used for user input, recorded on a focus change event when more than one screen is used), RES (CPU and RAM resource use, recorded every 10 minutes), and DEV (insertion/removal of USB-connected devices; HID-class devices could alter input behaviour). The Value field depends on the Action. For KEY, the value is the language and sub-language. For SCR Info and SCR, value contains the screen resolution as top, left, right, bottom. For RES events, value records the percentage of CPU use. For DEV, value is 1 for device insertion and 2 for device removal. An ISO 8601-compliant timestamp of when the event occurred is recorded in milliseconds. The relation and Flag attributes are not used. Additional fields comprise the keyboard type for KEY events, the screen ID for SCR Info events, and RAM for RES events, with the value stating the percentage of RAM used.
Hardware events are not a direct capture of user behaviour; they give a more complete picture of the user environment for later analysis.
Table A.5: Data format for hardware events.
Sequence  Event Type  Action      Value          Time  Relation  Flag  Additional fields
n         'H'         "KEY"       Language       ms    n/a       n/a   Keyboard type
n         'H'         "SCR Info"  Resolution     ms    n/a       n/a   ID
n         'H'         "SCR"       ID             ms    n/a       n/a   n/a
n         'H'         "RES"       CPU            ms    n/a       n/a   RAM
n         'H'         "DEV"       Insert/Remove  ms    n/a       n/a   n/a
Table A.6: Events and their relationships
Event Type  Action                 Relation to previous event
Keystroke   Key pressed            Software event
Keystroke   Key released           Key pressed
Mouse       Mouse moved            0 (for beginning of sequence of mouse moves)
                                   Latest mouse button down event (if button still down, e.g. drag & drop)
                                   Previous mouse move sequence
Mouse       Mouse button pressed   Software event
Mouse       Mouse button released  Mouse button pressed
Mouse       Mouse wheel used       Software event
Software    all                    Latest keystroke/mouse released event
A.4.5 Relation between events
Many events are related to other events. This concerns, e.g., pairs of key presses/releases and mouse button presses/releases, as well as software events triggered by keyboard/mouse input and sequences of mouse movements and software events. A full overview of events and their relationships is given in Table A.6.
Keystroke/mouse events that are related to previous software events, i.e. key presses, mouse button presses and mouse wheel use, store the seq. number of the latest event pertaining to the active window. Key releases and mouse button releases store the seq. number of the preceding corresponding key press or mouse button press for the same key/button. Mouse move events are not related to previous events, i.e. store a seq. number of 0, when they are the beginning of a sequence of mouse moves. They store the seq. number of the latest mouse button down event if the button is still down at the time of the mouse move. If several buttons are pressed, the relation is stored to the latest button down event. A relation to a button down event is stored to indicate that the user is likely to drag a logical object with the mouse. Mouse move events store the seq. number of the previous mouse move event in all other cases. Software events always store the seq. number of the latest keystroke/mouse event as their likely cause.
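The relation rules of this section can be sketched as a single pass over the event stream. This is an illustrative reading of the rules, not BeLT's implementation. The `Event` class and its `key` field are hypothetical. Two points are assumptions: a mouse move counts as the beginning of a sequence when the immediately preceding event was not a mouse move, and a held button takes precedence over that rule.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Event:
    seq: int
    etype: str        # 'K', 'M', 'S' or 'H'
    action: str       # 'D', 'U', 'M', 'W' or a software/hardware action
    key: str = ""     # key name or mouse button (hypothetical helper field)
    relation: int = 0


def assign_relations(events: Iterable[Event]) -> List[Event]:
    """Fill in the Relation attribute following the rules of Section A.4.5."""
    last_software = 0                      # seq of the latest software event
    last_input = 0                         # seq of the latest keystroke/mouse event
    last_move = 0                          # seq of the latest mouse move event
    prev_was_move = False
    held: Dict[Tuple[str, str], int] = {}  # (type, key/button) -> seq of its press
    result = []
    for ev in events:
        is_move = ev.etype == "M" and ev.action == "M"
        if ev.etype == "S":
            ev.relation = last_input       # likely cause of the software event
            last_software = ev.seq
        elif ev.etype in ("K", "M"):
            if is_move:
                buttons = [s for (t, _), s in held.items() if t == "M"]
                if buttons:
                    ev.relation = max(buttons)   # latest button still down (drag)
                elif prev_was_move:
                    ev.relation = last_move      # continue the move sequence
                else:
                    ev.relation = 0              # beginning of a move sequence
                last_move = ev.seq
            elif ev.action == "D":
                ev.relation = last_software
                held[(ev.etype, ev.key)] = ev.seq
            elif ev.action == "U":
                ev.relation = held.pop((ev.etype, ev.key), 0)
            else:                                # mouse wheel use
                ev.relation = last_software
            last_input = ev.seq
        # Hardware events do not use the relation attribute and are left as is.
        prev_was_move = is_move
        result.append(ev)
    return result
```

Tracking the currently held keys and buttons in one dictionary lets a release find its matching press directly and lets a mouse move check whether a drag is in progress.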
A.5 Summary
Our logging tool BeLT has the following functional and non-functional properties, which give it advantages over existing logging tools [5, 50, 53, 77]:
1. Continuous collection of keystroke, mouse, software interaction and hardware events;
2. Security of the users' sensitive behaviour by exclusion of typed passwords;
3. Unobtrusive and stable data capture;
4. Efficient compression of mouse movement data;
5. Pseudonymity of the users to maintain their privacy;
6. Transmission of recorded behaviour to a server through a secure channel or storing information on the local system depending on the user's choice;
7. Implemented filter drivers to log the keystroke and mouse data with high precision;
8. Export options including raw data, CSV, SQL query/dump, XML;
9. Industry-grade application quality as required by Microsoft⁷.
Our BeLT logging tool can be used in various types of experiments. In this research, we have performed an experiment on continuous authentication that makes full use of the capabilities of this tool.
Appendix B
Complexity Measurement of a Password for Keystroke Dynamics: Preliminary Study¹
Abstract
This paper discusses the complexity measurement of a password in relation to the performance of a keystroke dynamics system. The performance of any biometric system depends on the stability of the biometric data provided by the user. We first present a new way to calculate the complexity related to the typing of a password. This complexity metric is then validated with the keystroke dynamics data collected in an experiment, as well as the user's experience during the experiment. Next, we show that the performance of the keystroke dynamics biometric system depends on the complexity of the password, and in particular that the performance of the system decreases with increasing complexity. This leads to the conclusion that random passwords, although harder for an attacker to guess, might not be the most suitable choice for keystroke dynamics.
B.1 Introduction
Passwords are still the most widely used method for authenticating a person. This method is easy to implement and users are very much used to it. Password systems do have well-known weaknesses, however, because people choose simple passwords that are easy to guess. Many breaches of password databases have shown this.
It is known that the quality of chosen passwords is insufficient when the choice is left to the user: users tend to select passwords that are too simple and easy to guess. Most likely this will never improve, but we can use biometric keystroke dynamics to improve the quality of the authentication mechanism, even for simple passwords.
Websites that require passwords increasingly indicate the strength of the password chosen by the user, so that at the very least the user is aware of the quality of his or her password. The strength of a password depends on its length, unpredictability, and complexity. In this case, complexity refers to using a combination of letters, numbers, capitals, and special characters. The complexity of a password increases when a combination of these is used instead of just a single type of character. Sometimes the complexity of a password is defined as its entropy.
In this paper, we look slightly differently at the complexity of passwords, namely in relation to the performance of Keystroke Dynamics (KD). In fact, when defining our complexity we restrict ourselves to passwords that contain only letters. We will look into combining KD [11, 70, 108] with password systems, i.e. not only should the user type the correct password, he or she should also type it in the correct manner. In this case, complicated words containing capitals and special characters might hinder the performance of the system. A password like 'px7(W1x,L*' will be difficult to type, even for the genuine user. In [73], an experiment was conducted with a random password and the best performance reported in the paper is 9.6% Equal Error Rate (EER). With a simple password like 'password', the user will be much more consistent in his or her way of typing. In [66], the authors considered identification of users based on free typed text. They concluded that short English words (for example 'the', 'and', 'a' or 'in') are not very well suited for identification of people. This might result from the fact that these words are very short and that most people are proficient in typing those words.
¹ This chapter is based on the paper published in: [105] MONDAL, S., BOURS, P., AND IDRUS, S. Z. S. Complexity measurement of a password for keystroke dynamics: Preliminary study. In 6th Int. Conf. on Security of Information and Networks (SIN'13) (2013), ACM, pp. 301–305.
Our contributions in this paper are as follows:
• A new way to measure the complexity of a password for KD.
• Validation of the complexity metric with data from 110 users.
• Performance evaluation in relation to the complexity of the password for authentication.
In the remainder of this paper we will present a new complexity metric in Section B.2. We will validate this metric in Section B.3 and show that the performance of a biometric KD system depends on the complexity of a password in Section B.4. Finally, we will draw conclusions and present future work in Section B.5.