A Study on Anomaly Behavior Analysis using Bayesian Inference in
BYOD Environment
Dongwan Kang*, Taeeun Kim, Jooyoung Kim, Hwankuk Kim
Security Industry Technology Division
KISA(Korea Internet&Security Agency)
IT Venture Tower, Songpa, Seoul
Republic of Korea
[email protected]*
Abstract: - Recent breakthroughs in communication technologies and smart devices are spurring a shift in the communication environment from a mostly wired one to one that integrates a vast array of wired and wireless technologies. It was only a short while ago that business organizations began to adopt a business environment utilizing mobile devices to increase productivity. This meant abandoning their closed business environment to migrate to an open structure. Such a shift in the business environment is epitomized by the emergence of the BYOD (Bring Your Own Device) policy. BYOD refers to an environment that permits employees to utilize their own ICT devices for corporate purposes, thus enabling business organizations to increase productivity and even save on device purchases. However, many businesses are reluctant to adopt the policy out of concern about the frequent loss of personally-owned devices, low security, etc. Conventional network-based security equipment is limited in the ability to detect abnormal (unauthorized, unwarranted, etc.) activities that can cause system damage or data leakage if consideration is not given to all of the various devices and connection environments. Hence, demands are rising for solutions that can detect abnormal activities by analyzing various user behaviors free from the conventional perspective of network management. This paper proposes and verifies a method that can quantify various connection environments and detect abnormal activities by analyzing behavioral patterns.
Key-Words: -BYOD, Security, Anomaly Behavior Detection, Pattern, Bayesian
1 Introduction
Recent developments in communication technologies are encouraging both people and businesses to migrate from a mostly wired environment to a wireless one, and even to an integrated wired and wireless one. In addition, mobile devices such as laptops, smartphones and tablets are no longer just a means of personal communication; they are finding applications in more and more domains of society, including business environments.
It was only a short while ago that business organizations began to bring mobile devices into their business environment in a bid to increase productivity, so their mostly closed corporate environments began to gradually give way to a more open structure. The recent BYOD (Bring Your Own Device) trend in business environments improves operational efficiency by allowing employees to use their own personalized smart devices and also reduces device purchase costs. Business organizations usually provide smart devices to their employees for business purposes, but the potential loss of or tampering with the smart devices and the cost implications of device purchases are still serious considerations for such a business decision.
Therefore, what is required is a multi-dimensional data analysis that gives insight into how devices are used for business operations, unwarranted connections to business networks, leakage of corporate data, etc., rather than into device connections alone. A security framework based on situational awareness, such as [5], has been suggested in that regard.
This paper proposes a method to identify behavioral patterns of users in a BYOD environment characterized by a variety of environmental factors, and to analyze abnormal activities based on such patterns.
2 BYOD Environment
BYOD (Bring Your Own Device) refers to an environment in which individuals are allowed to bring their personally owned smart devices such as smartphones, laptops and tablets and connect them to corporate ICT resources, such as the corporate database and applications, to conduct business operations. The trend requires adoption and adaptation of the requisite technologies, concepts and policies in general. The emergence of BYOD has triggered migrations from closed corporate internal infrastructures to ones that are open by nature. In other words, the business and service servers that previously could only be reached on the intranet of a business organization, and were not accessible via the Internet by personally-owned smart devices, are now being transformed into networks that can be accessed by such devices. Corporate data that was processed by and stored in enterprise-owned devices in a closed environment can now be processed by and stored in personally-owned devices operating in an open environment, and ownership, control and initiative over devices are shifted from corporate IT departments to individual users. According to a survey of 600 enterprises conducted by Cisco in 2012, 95% of them had already permitted individuals to bring their own devices to their workplaces and had witnessed gains in the productivity of their staff.
Figure. 1 BYOD Environment
3 Identified Challenges
A new IT environment or policy such as BYOD not only improves convenience but also increases security threats to business organizations [2]. One cause of such security threats is the challenge of managing the heterogeneity of operating systems among smart devices. At present, many manufacturers are producing smart devices based on a variety of operating systems. According to a survey conducted by OpenSignal in 2012, there are no fewer than 3,997 different models of Android-based smart devices, with about 70% of them having adopted Android OS variations adapted by their manufacturers. Apart from these constraints inherent in the devices, negligence in device management leads to leakage of critical corporate data, and frequent device changes compromise efficient control of personally-owned devices [1][3]. In the previously closed enterprise environment, it was possible to identify devices connected to the enterprise network by managing IPs and MACs statically, as such devices were fixed. In addition, it was relatively easy to install and upgrade the data control programs in the enterprise-owned devices as required by business organizations.
However, in a BYOD environment, a variety of personally-owned devices are connected to heterogeneous network connection arrangements. For mobile devices epitomized by iOS and Android, there are numerous fragmented operating systems available in different versions and customized by the smart device manufacturers. Connection locations can be extended to locations within/out of the company including overseas locations. Furthermore, it is hard to determine whether the security of personally-owned devices is maintained at a predefined level. Risks leading to leakage of data within/out of a company increase ever more as the pathways of data access become more diverse, along with risks of data leakage arising from abuse of authority or malicious practices by internal employees.
4 Proposed Solution
To resolve the above issues, a security framework based on “situational information” is proposed as below. [5] defines situational information that, unlike the conventional TCP/IP attributes and user groups, is more intuitive and usable for defining behaviors and setting up policies. It identifies patterns in the connecting practices of users through such situational information and sets up policies based on a combination of various environmental factors. This paper proposes a situational information approach and pattern analysis based on the above framework, and tests it by setting up a virtual business and then analyzing data gathered through a realistic business simulation.
4.1 BYOD Security Framework
The underlying framework converts situational awareness and behavior data into standardized situational information and actively manages the behavior pattern and security policy based on that situational information. Situational information is expressed as numerical and categorical data and converted into patterns per individual by big data analytics, etc. This data pattern is used as the criterion by which abnormal behavior of individuals is determined, which supports customizable detection of individual behavior. Then, a final decision is made in reference to the analytics, and control is exercised through an interface with security equipment such as a firewall.
4.2 Behavior Analysis Approach
The behavior analysis approach in this paper used data on the past behavior of users as criteria for abnormal behavior. As connecting behaviors in various devices and environments can be diverse in accordance with the characteristics of individuals, uniform detection criteria cannot be applied to all users indiscriminately.
Therefore, the behavior analysis proceeds in three steps. In the first step, behavior data for a specified period is collected. In the second step, a significant behavioral pattern is extracted from the collected behavior data. Finally, abnormal behavior of users is determined in reference to the analyzed behavior pattern. Subsequently, the behavior pattern is updated as the usage characteristics of users change.
4.3 Behavior Modelling
User behaviors are modeled as categorical data as below. The characteristics and scope of each category are defined in advance to specify the characteristics of each behavior, and it is assumed that there is no relation of subordination between behaviors (e.g. browser used per device, network used per location, etc.). Of course, user behaviors can depend on the behavioral environment to some degree, but behavior elements that discriminate well between behaviors are selected so that the behavior characteristics are as independent as possible.
Each behavior consists of a set of elements and each user has one of the applicable elements for each behavior. For example, if behavior ‘A’ is ‘connection time’, the elements can be set as {a1: AM, a2: PM} or {a1: 0H~6H, a2: 6H~18H, a3: 18H~24H}, etc. Then, if a set of user or device behaviors is defined as behavior $A = \{a_1, a_2, \ldots, a_i\}$, behavior $B = \{b_1, b_2, \ldots, b_j\}$, ..., behavior $N = \{n_1, n_2, \ldots, n_k\}$, current user behavior can be modeled as $\{a_x, b_y, \ldots, n_z\}$:
\[ \text{User behavior} = \{a_x, b_y, \ldots, n_z\} \]
Table 1. Connection Behavior Information
Behavior  Contents
A         a1: Mobile, a2: Tablet, a3: Desktop, a4: etc
B         b1: During the week, b2: weekend, b3: holiday
C         c1: 0000~0600, c2: 0600~0900, c3: 0900~1200, c4: 1200~1800, c5: 1800~2400
D         d1: iOS, d2: MacOS, d3: Android, d4: Windows, d5: Linux, d6: etc
E         e1: Chrome, e2: Safari, e3: IE, e4: Firefox, e5: etc
F         f1: inside, f2: local, f3: overseas
G         g1: inter-wire, g2: inter-wifi, g3: broadband, g4: internet
H         h1: 0~5, h2: 6~10, h3: 11~15, h4: 16~ (sec)
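As a minimal sketch of this model, the categories of Table 1 can be held in a lookup table and a single connection mapped to its element codes. The names BEHAVIORS and encode, and the dictionary layout, are hypothetical and not taken from the paper; element spellings are normalized.

```python
# Behavior categories transcribed from Table 1 (letters and element order
# follow the table; the dictionary layout itself is an assumption).
BEHAVIORS = {
    "A": ["Mobile", "Tablet", "Desktop", "etc"],
    "B": ["During the week", "weekend", "holiday"],
    "C": ["0000~0600", "0600~0900", "0900~1200", "1200~1800", "1800~2400"],
    "D": ["iOS", "MacOS", "Android", "Windows", "Linux", "etc"],
    "E": ["Chrome", "Safari", "IE", "Firefox", "etc"],
    "F": ["inside", "local", "overseas"],
    "G": ["inter-wire", "inter-wifi", "broadband", "internet"],
    "H": ["0~5", "6~10", "11~15", "16~"],
}


def encode(observation):
    """Map {'A': 'Mobile', ...} to element codes such as ('a1', 'b1', ...)."""
    codes = []
    for key in sorted(BEHAVIORS):
        # list.index raises ValueError if the value is not a known element.
        index = BEHAVIORS[key].index(observation[key])
        codes.append(f"{key.lower()}{index + 1}")
    return tuple(codes)


# One illustrative connection (values chosen for the example only).
example = {
    "A": "Mobile", "B": "During the week", "C": "0900~1200", "D": "Android",
    "E": "Chrome", "F": "inside", "G": "inter-wifi", "H": "0~5",
}
user_behavior = encode(example)
print(user_behavior)
```

Each connection then reduces to one tuple of the form {ax, by, ..., nz}, which is the unit the later frequency analysis counts.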
4.4 Behavioral Pattern Analysis
To analyze a behavioral pattern, this paper used Bayesian inference. As there are no predefined criteria for abnormal behavior, the actions a user takes most frequently are treated as the criteria for normal behavior, and abnormal behavior is determined relative to them [4].
Figure 2. Concept of Behavior Analysis
To that end, the occurrence frequency of each behavior element is analyzed and a behavior matrix is configured to analyze the occurrence probability of each element based on such frequency.
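A frequency-based behavior matrix of this kind might be built as follows. This is a sketch under assumptions: the function name behavior_matrix, the tuple-per-connection input, and the sample history are all illustrative and not from the paper.

```python
from collections import Counter, defaultdict


def behavior_matrix(history):
    """Return, per behavior position, the relative frequency of each element."""
    counts = defaultdict(Counter)
    for record in history:
        for position, element in enumerate(record):
            counts[position][element] += 1
    total = len(history)
    return {
        position: {element: n / total for element, n in counter.items()}
        for position, counter in counts.items()
    }


# Four illustrative connections over three behaviors (A, B, C).
history = [
    ("a1", "b1", "c3"),
    ("a1", "b1", "c4"),
    ("a3", "b1", "c3"),
    ("a1", "b2", "c5"),
]
matrix = behavior_matrix(history)
print(matrix[0])  # relative frequencies of the elements of behavior A
```

Elements that never occur in the history are simply absent from the matrix here; handling them is the role of the smoothing step.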
For the Bayesian inference, the following computation is made for each behavior: the probability of the element to be inferred is computed from the remaining behavioral elements.
\[ P(A_1 \mid [b_2, c_3]) = \frac{P(a_1)\{P(b_2 \mid a_1)\,P(c_3 \mid a_1)\}}{P(a_1)\{P(b_2 \mid a_1)\,P(c_3 \mid a_1)\} + P(a_2)\{P(b_2 \mid a_2)\,P(c_3 \mid a_2)\}} \]
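To make the computation concrete, the sketch below evaluates it for a behavior A with two elements, assuming the standard naive Bayes form (prior times the conditional probabilities of the observed elements, normalized over the candidates). The function name posterior and all probability values are made up for illustration.

```python
def posterior(priors, likelihoods):
    """priors: {a: P(a)}; likelihoods: {a: [P(b|a), P(c|a), ...]}.

    Returns the normalized probability of each candidate element a.
    """
    joint = {}
    for a, prior in priors.items():
        p = prior
        for likelihood in likelihoods[a]:
            p *= likelihood
        joint[a] = p
    evidence = sum(joint.values())
    return {a: p / evidence for a, p in joint.items()}


# Illustrative numbers only: P(a1)=0.6, P(b2|a1)=0.5, P(c3|a1)=0.2, etc.
priors = {"a1": 0.6, "a2": 0.4}
likelihoods = {"a1": [0.5, 0.2], "a2": [0.1, 0.5]}
result = posterior(priors, likelihoods)
print(result)
```

With these values the numerator for a1 is 0.6 x 0.5 x 0.2 = 0.06 and for a2 it is 0.4 x 0.1 x 0.5 = 0.02, so a1 receives 0.06 / 0.08 = 0.75 of the probability mass.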
When an element whose occurrence frequency is zero (0) is to be inferred, Laplace smoothing is applied so that the product of probabilities does not collapse to zero (0) and cause a calculation error.
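The paper does not state which form of Laplace smoothing it uses. A common choice is additive smoothing, where a constant alpha is added to every count; the sketch below assumes that form, and the function name laplace and the sample counts are hypothetical.

```python
def laplace(count, total, num_elements, alpha=1.0):
    """Additive (Laplace) smoothing: (count + alpha) / (total + alpha * K)."""
    return (count + alpha) / (total + alpha * num_elements)


# Illustrative counts for three elements; c1 was never observed.
counts = {"c1": 0, "c2": 3, "c3": 7}
total = sum(counts.values())
smoothed = {
    element: laplace(n, total, len(counts)) for element, n in counts.items()
}
print(smoothed)
```

The unseen element c1 receives the small non-zero probability 1/13 rather than 0, and the smoothed values still sum to 1.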
The inference result represents the weight of each element that is likely to occur given the other behavioral elements, so the element that is most likely to occur can be inferred. Once the occurrence probability of each element is computed, the probabilities are averaged to estimate the occurrence probability of the entire behavior.
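One possible reading of this procedure is sketched below: each observed element is inferred in turn from the remaining elements, and the resulting probabilities are averaged. This is an assumption about the details, not the authors' implementation; the function name score, the additive smoothing with alpha, and the sample history are all illustrative.

```python
from collections import Counter


def score(history, observed, alpha=1.0):
    """Average inferred probability of each observed element given the others."""
    n_pos = len(observed)
    # Candidate elements per position: those seen in history plus the observation.
    domains = [
        sorted({r[i] for r in history} | {observed[i]}) for i in range(n_pos)
    ]
    single = [Counter(r[i] for r in history) for i in range(n_pos)]
    pair = Counter()
    for r in history:
        for i in range(n_pos):
            for j in range(n_pos):
                if i != j:
                    pair[(i, r[i], j, r[j])] += 1
    total = len(history)

    def prior(i, a):
        # Smoothed P(a) for element a at position i.
        return (single[i][a] + alpha) / (total + alpha * len(domains[i]))

    def cond(j, b, i, a):
        # Smoothed P(b at position j | a at position i).
        return (pair[(i, a, j, b)] + alpha) / (
            single[i][a] + alpha * len(domains[j])
        )

    probabilities = []
    for i in range(n_pos):
        joint = {}
        for a in domains[i]:
            p = prior(i, a)
            for j in range(n_pos):
                if j != i:
                    p *= cond(j, observed[j], i, a)
            joint[a] = p
        probabilities.append(joint[observed[i]] / sum(joint.values()))
    return sum(probabilities) / n_pos


# Illustrative history: nine habitual connections and one unusual one.
history = [("a1", "b1", "c3")] * 9 + [("a3", "b2", "c5")]
usual = score(history, ("a1", "b1", "c3"))
mixed = score(history, ("a3", "b1", "c3"))
print(usual, mixed)
```

In this toy history the habitual combination scores higher than one in which a rarely used device appears alongside otherwise habitual elements. Note that alpha must be positive, otherwise unseen combinations can produce a division by zero.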
5 Result
For this experiment, a virtual enterprise business system was set up and users enrolled to use it as they would normally with their mobile devices for about one month. Based on the actual data collected and a pattern analysis of the users’ use, the data was expanded and modeled on the assumption that the system would be used for three months.
5.1 Data Model
The data model is based on situational information as shown in Figure 4.
Overall, it is found that User 2 and User 4 had distinctive behavior elements whereas it was hard to find any significant characteristics for User 5.
5.2 Test
Pattern data per user was learned by entering modeled user data (5 users, 300 connections). After identifying patterns in the behavior of each user, connection by a 3rd party in the above scenario was attempted at random and occurrence probability was analyzed based on the behavior pattern of each user.
Figure 4. Characteristics of User
5.3 Result Analysis
The 20 most frequent and 10 least frequent behaviors were selected for each user. These behaviors were then analyzed in relation to the behavior patterns of the other users. The test confirmed that the results varied significantly with each user's behavior pattern. The criteria for determining individual behavior varied with the pattern level and occurrence frequency of each user and, if the behavior pattern was consistent, the occurrence probability for a 3rd user’s behavior was extremely low.
The above graph shows the results of test input behaviors based on the behavioral pattern of each user. Behaviors are configured as below.
Table 2. Input Action Configuration
For behaviors found to have a 10% pattern, the occurrence probability was inferred to be 70% or higher for Users 1, 2, 3 and 4 whereas about 40% probability was inferred for the other users. However, if the behavior pattern level was under 5% as in the case of User 5, it was effectively difficult to conduct a behavior-based analysis.
Table 3. Behavior Determination Criteria per User
Actor   Self-Pattern Probability Rate (lower)   Other-Pattern Probability Rate (quadrant)
        Major Act (P1)   Minor Act (P2)         Major Act (P3)   Minor Act (P4)
USER 1  76.800           37.760                 41.607           37.860
USER 2  70.801           57.226                 42.044           33.669
USER 3  71.735           54.600                 42.866           36.581
USER 4  73.454           44.196                 41.308           32.419
USER 5  45.367           45.280                 29.440           32.653
Figure 6. Probability Rate by Pattern
In the above table, as User 4 is similar in part to User 1 in terms of behavior pattern, the probability analyzed based on other behavior patterns increased. In the final analysis, the patterns of Users 2 and 3 are well identified and therefore the possibility for mis-detecting their behavior is the lowest.
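One way to read Table 3 is to compare, for each user, the gap between the self-pattern and other-pattern probability rates. The sketch below computes that gap for major acts (P1 - P3) and minor acts (P2 - P4) from the table values. Using the smaller of the two gaps as a ranking key is an assumption made here for illustration, not a criterion stated in the paper.

```python
# Values transcribed from Table 3: (P1, P2, P3, P4) per user.
table3 = {
    "USER 1": (76.800, 37.760, 41.607, 37.860),
    "USER 2": (70.801, 57.226, 42.044, 33.669),
    "USER 3": (71.735, 54.600, 42.866, 36.581),
    "USER 4": (73.454, 44.196, 41.308, 32.419),
    "USER 5": (45.367, 45.280, 29.440, 32.653),
}

# Gap between self-pattern and other-pattern rates: (major, minor).
margins = {}
for user, (p1, p2, p3, p4) in table3.items():
    margins[user] = (round(p1 - p3, 3), round(p2 - p4, 3))

# Rank users by their smaller gap (hypothetical ranking key).
ranked = sorted(margins, key=lambda u: min(margins[u]), reverse=True)
for user in ranked:
    print(user, margins[user])
```

Under this reading, Users 2 and 3 keep the widest gap on both major and minor acts, while the minor-act gap of User 1 is slightly negative (37.760 against 37.860).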
6 Conclusion
Approaches to analyzing user behavior vary significantly as they are subject to their intents and purposes. This research has proposed and tested a method for detecting abnormal behavior without uniform criteria but subject to the characteristics of individual behaviors. However, as human behaviors are hard to quantify and vary somewhat according to individual characteristics, the proposed method is limited in reflecting individual characteristics. Further studies will follow to improve the accuracy and viability of the method of determination by analyzing the pattern level of each user and its correlation to collective behaviors based on the behavioral data of a large number of users.
Acknowledgments. This work was supported by the IT R&D program of MSIP/KEIT(Ministry of Science, ICT and Future Planning/Korea Evaluation Institute Of Industrial Technology). [R0101-15-0026, The Development of Context-Awareness based Dynamic Access Control Technology for BYOD, Smart work Environment]
References:
[1] Johnson. K, Mobility/BYOD Security Survey. SANS Institute , 2012.
[2] Symantec, Smartphone Honey Stick Project, http://www.symantec.com, 2012.
[3] Miller. K.W., Voas. J., Hurlburt. G.F., BYOD: Security and Privacy Considerations, IT Professional 14(5), pp. 53–55, 2012.
[4] Dongwan K, Joohyoung O, Chetae I, A Study on Behavior Patternize in BYOD Environment Using Bayesian Theory, CSCC (International Conference on Circuits, Systems, Communications and Computers), 2014.
[5] Dongwan K, Joohyoung O, Chetae I, Context Based Smart Access Control on BYOD Environments, 15th International Workshop on Information Security Applications, LNCS Vol. 8909, August 2014.
Act No.   Contents                         Avg. Act Count
1~20      (USER 1) Most Action Pattern     9
21~30     (USER 1) Least Action Pattern    1
31~50     (USER 2) Most Action Pattern     7
51~60     (USER 2) Least Action Pattern    1
61~80     (USER 3) Most Action Pattern     6
81~90     (USER 3) Least Action Pattern    1
91~110    (USER 4) Most Action Pattern     10
111~120   (USER 4) Least Action Pattern    1
121~140   (USER 5) Most Action Pattern     2
141~150   (USER 5) Least Action Pattern    1