Improving the Capabilities of Distributed Collaborative Intrusion Detection Systems using Machine Learning

[Diagram labels: Network, Sensors, Intrusion Detection, Response; data items: Network Traffic, Traffic Features, Alarms, Actions]

Figure 2.3: The core architectural components of an NIDS and the key data items the components transfer.

SNIDSs and ANIDSs have different advantages and disadvantages. SNIDSs do not suffer from many of the difficulties discussed in the previous section (Section 2.2.1). In theory, these systems have low false positive rates and are highly effective in recognizing known attacks. To obtain signatures, they need intrusion examples rather than difficult-to-obtain datasets. Finally, countermeasures can be tailored to specific attacks. For these and other reasons, SNIDSs are the most widespread NIDSs (e.g., Bro [Paxson, 1999], Snort [Roesch, 1999], and Suricata³). Creating signatures is a major challenge of SNIDSs, and many researchers have created automated mechanisms, often based on ML, to derive signatures (e.g., [Kim et al., 2004; Kreibich et al., 2004]). We have also worked in this direction [Vasilomanolakis, Srinivasa, Cordero, et al., 2016]; however, this thesis concentrates on ANIDSs as, we argue, the effectiveness of SNIDSs is in decline. SNIDSs need accurate signature databases to be effective. However, creating and maintaining these databases is becoming daunting due to the constant appearance of novel threats, the popularization of encrypted communication mediums, and the growth of attack surfaces. Under these conditions, even large and constantly updated signature datasets are not sufficient to protect networks.

³ https://www.openinfosecfoundation.org

b a c k g r o u n d a n d r e l at e d w o r k

Despite some of their problematic aspects, ANIDSs have qualities that make them more appropriate than SNIDSs in many circumstances: ANIDSs can detect attacks that have not been seen before. For this reason, the field of NIDSs is still actively developing anomaly detection methodologies [Pimentel et al., 2014]. A popular and effective technique to identify anomalies in network traffic is the subspace method [Lakhina, Crovella, and Diot, 2004]. This method consists of splitting network traffic into disjoint normal and abnormal subspaces using techniques such as PCA [Ringberg et al., 2007]. Many researchers, however, have criticized PCA, stating that the mechanism is not robust enough (e.g., [X. Li et al., 2006]). On the other hand, recently proposed modifications to PCA have made it more robust within networks (e.g., [Chen et al., 2016]). In Chapter 4, we propose a robust isomorphic alternative to PCA that detects network traffic anomalies.
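To make the subspace idea concrete, the following is a minimal sketch of a PCA-based split into a normal and a residual (abnormal) subspace. It is not the method of any of the cited works: the synthetic data, the function names `fit_normal_subspace` and `residual_score`, the choice of k = 1, and the threshold rule (the largest residual seen during training) are all assumptions made for illustration.

```python
import numpy as np

def fit_normal_subspace(X, k):
    """Learn the mean and the top-k principal axes of training traffic X.

    Rows of X are time bins, columns are traffic measurements.
    """
    mean = X.mean(axis=0)
    # The right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T  # basis of the normal subspace, shape (n_features, k)

def residual_score(x, mean, P):
    """Squared norm of the part of x that lies outside the normal subspace."""
    centered = x - mean
    residual = centered - P @ (P.T @ centered)
    return float(residual @ residual)

# Hypothetical data: four link measurements driven by one latent factor plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
mixing = np.array([[1.0, 0.8, 0.6, 0.4]])
X_train = latent @ mixing + 0.01 * rng.normal(size=(500, 4))

mean, P = fit_normal_subspace(X_train, k=1)
# Illustrative threshold only: the largest residual observed in training.
threshold = max(residual_score(x, mean, P) for x in X_train)

normal_point = 2.0 * mixing[0]                                   # follows the usual pattern
anomalous_point = normal_point + np.array([0.0, 0.0, 0.0, 3.0])  # spike on one link

print(residual_score(normal_point, mean, P) > threshold)
print(residual_score(anomalous_point, mean, P) > threshold)
```

In this sketch, traffic that follows the dominant correlation pattern projects almost entirely onto the normal subspace, while the spike leaves a large residual. Published variants typically choose k and the threshold with statistical tests rather than the simple rule assumed here.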

2.2.3 Anomaly-based Network Intrusion Detection

ANIDSs are based on the suspicious hypothesis. The suspicious hypothesis states that anomalous events are deemed suspicious from a security point of view [Estevez-Tapiador et al., 2004]. In the context of network traffic, an anomalous event refers to traffic that does not conform to the expected behavior of a network. Anomalous events are found using an anomaly detection system (M, D), as defined in Section 2.1.4. In the context of an ANIDS, the model of normality M finds representations of normal network traffic while the similarity measure D finds the distance between arbitrary network traffic and M. If, according to D, the distance between the network traffic and M is above a predetermined threshold, the traffic is considered anomalous. The model of normality M can be represented in many ways. Mahoney and Chan [2003] use conditional rules to model normal behavior. Autoencoders, a type of neural network, are successfully used in the literature to model normality (e.g., [Dau et al., 2014]). Even models borrowed from physics, such as wavelets and Fourier transforms, are used to create models of normality [Jiang et al., 2014].

Anomaly detection uses two independent components to construct normality models and detect intrusions. The diagram in Figure 2.4 shows a simplified example of how information flows when constructing normality models and detecting intrusions. Networks generate traffic and one or many sensors collect the traffic to extract features. Features are aggregated and then distributed to the modeling component. This component is responsible for learning a model M to represent the normality of the features it received. The learned model is shared with the detection component which, in turn, uses it to detect intrusions. To detect intrusions, the detection component compares the features it receives against the normality model M and, if the distance of the features from M exceeds a threshold according to the similarity measure D, the traffic is labeled as anomalous. Consequently, ANIDSs need to be trained before they can detect intrusions. Note that the training process (i.e., creating a normality model) does not involve labeled network traffic and assumes that only normal network traffic is considered.

2.2 n e t w o r k i n t r u s i o n d e t e c t i o n s y s t e m s

[Diagram labels: Network, Sensors, Modeling Component, Detection Component; data items: Traffic, Features, Model]

Figure 2.4: Information flow of an anomaly detection system. Network traffic is monitored by sensors that extract features. Sensors send features to a modeling component that is responsible for creating a normality model. The detection component uses the normality model to determine if some features are abnormal or not.
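The information flow in Figure 2.4 can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the class names, the two features per time window, the choice of M as a per-feature mean and standard deviation, the choice of D as the largest absolute z-score, and the threshold of 3.0 are all hypothetical and are not taken from the thesis.

```python
import numpy as np

class ModelingComponent:
    """Learns a normality model M from unlabeled features assumed to be normal."""

    def fit(self, features):
        features = np.asarray(features, dtype=float)
        # Assumption: M is the per-feature mean and standard deviation.
        return {"mean": features.mean(axis=0), "std": features.std(axis=0) + 1e-9}

class DetectionComponent:
    """Compares incoming features against a shared model M using a measure D."""

    def __init__(self, model, threshold):
        self.model = model
        self.threshold = threshold

    def distance(self, x):
        # Assumption: D is the largest absolute z-score over all features.
        z = (np.asarray(x, dtype=float) - self.model["mean"]) / self.model["std"]
        return float(np.max(np.abs(z)))

    def is_anomalous(self, x):
        return self.distance(x) > self.threshold

def aggregate(sensor_batches):
    """Stack the feature vectors reported by several sensors into one training set."""
    return np.vstack(sensor_batches)

# Hypothetical features per time window: [packets per second, distinct destination ports]
sensor_a = [[100, 10], [110, 12], [90, 9]]
sensor_b = [[105, 11], [95, 10], [100, 8]]

# Training phase: only (assumed) normal traffic, no labels.
model = ModelingComponent().fit(aggregate([sensor_a, sensor_b]))

# Detection phase: the learned model is handed to the detection component.
detector = DetectionComponent(model, threshold=3.0)
print(detector.is_anomalous([102, 10]))   # window close to the training data
print(detector.is_anomalous([100, 400]))  # window with an unusual number of ports
```

Keeping the two components separate mirrors the diagram: the modeling component never sees the threshold, and the detection component never sees the training data, only the model it was given.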

In principle, normality models can be constructed using arbitrary selections of features. In large networks, however, modern anomaly detection techniques use the entropy of IP header fields as features. Entropy is a metric that efficiently summarizes the dispersion and concentration of a distribution [Ringberg et al., 2007]. Most network-wide intrusions affect the dispersion and concentration of IP header fields. Therefore, entropy is a suitable metric to learn normality models that represent large networks. Lakhina, Crovella, and Diot [2004] provided most of the analysis that made the (Shannon) entropy of IP header fields a default feature in most other works. Many other researchers have also experimented with and successfully demonstrated the usefulness of other types of entropies as features (e.g., the nonextensive or Tsallis entropy [Tellenbach et al., 2011; Ziviani et al., 2007]). Beyond using the entropies of IP header fields, researchers have created improved anomaly detectors by combining them with the entropy of other feature types. A notable example is proposed by Nychis et al. [2008]. In their work, they improve anomaly detection by using the entropy of the behavior of flows (i.e., the in- and out-degree distributions of hosts).
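As an illustration of why entropy captures dispersion and concentration, the sketch below computes the Shannon entropy and the Tsallis entropy of header fields over one observation window. The packet tuples, field names, and helper functions are hypothetical. The formulas are the standard definitions, H = sum of p_i log2(1/p_i) and S_q = (1 - sum of p_i^q) / (q - 1) for q different from 1.

```python
import math
from collections import Counter

def _probabilities(values):
    """Empirical probability of each distinct value in the window."""
    counts = Counter(values)
    total = sum(counts.values())
    return [c / total for c in counts.values()]

def shannon_entropy(values):
    """Shannon entropy in bits of the empirical distribution of values."""
    return sum(p * math.log2(1.0 / p) for p in _probabilities(values))

def tsallis_entropy(values, q):
    """Tsallis (nonextensive) entropy for a parameter q != 1."""
    return (1.0 - sum(p ** q for p in _probabilities(values))) / (q - 1.0)

def entropy_features(packets):
    """One entropy value per header field; packets are (src_ip, dst_ip, dst_port)."""
    fields = {"src_ip": 0, "dst_ip": 1, "dst_port": 2}
    return {name: shannon_entropy([p[i] for p in packets]) for name, i in fields.items()}

# Hypothetical windows of eight packets each.
normal_window = [("10.0.0.%d" % (i % 4), "10.0.1.%d" % (i % 2), (80, 443)[i % 2])
                 for i in range(8)]
# A port scan: one source probes eight different ports on one destination.
scan_window = [("10.0.0.9", "10.0.1.1", port) for port in range(1, 9)]

print(entropy_features(normal_window))
print(entropy_features(scan_window))
print(tsallis_entropy([p[2] for p in scan_window], q=2.0))
```

In the hypothetical scan window, the destination port distribution disperses (its entropy rises) while the source and destination address distributions concentrate (their entropy falls to zero). This joint shift is the kind of signature that a normality model built on entropy features can pick up.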

The diagram in Figure 2.4 assumes that one entity is responsible for building one single model of normality. Likewise, the diagram implies that only one entity is responsible for using the normality model in the detection component. These two assumptions limit the scalability of the system. To cope with this limitation, researchers propose groups of collaborative NIDSs, or Collaborative Intrusion Detection Systems (CIDSs). These systems are the topic of discussion in the next section.

2.3 Collaborative Intrusion Detection Systems

The necessity to detect collaborative attacks has brought forth collaborative defenses. Collaborative Intrusion Detection Systems (CIDSs) are collections of NIDSs that collaborate to detect widespread intrusions. Computer networks can reach monumental sizes, creating an environment where attackers can easily conceal their activities. The goal of a CIDS is to detect those undesired activities that would otherwise be overlooked by individual NIDSs. A CIDS is composed of multiple sensors, communication channels, and one or more analysis units. As in an NIDS, sensors are responsible for monitoring local network traffic. Analysis units, in contrast to an NIDS, can be plentiful and have different roles depending on whether the collaboration level of a CIDS is at the detection or alarm level (see Section 2.3.2). Analysis units share the responsibilities of the modeling and detection components of NIDSs (see Figure 2.4).

CIDSs are complex systems that can be organized differently according to different criteria. In the coming two sections, we expand upon two different organizational paradigms of CIDSs. The first paradigm considers the communication overlay of a CIDS. The second paradigm takes into account the collaboration level at which a CIDS operates. Regardless of how they are organized, CIDSs are made up of the same components. The architectural components of CIDSs are presented in Section 2.3.3. This architecture plays an important role in this thesis as it is used as the foundation by which this thesis' contributions are organized.

2.3.1 CIDS Communication Overlays

According to the communication overlay they use, CIDSs can be organized into three different classes. Figure 2.5 shows the three communication overlays by which related work can be organized [Vasilomanolakis, Karuppayah, et al., 2015]. Centralized CIDSs tend towards the best detection accuracy given that the data of all sensors is analyzed by one single analyzer. Their obvious deficiency is that they do not scale well to large networks. Hierarchical CIDSs alleviate the scalability issue by creating hierarchies of analyzers. At each level of the hierarchy, an analyzer processes the data of a limited number of sensors. Analyzers may collect (and aggregate) the results of other analyzers to create meta-models. The lower an analyzer is in the hierarchy, the narrower its view is and, theoretically, the less accurate its network-wide detection capabilities are. Both centralized and hierarchical overlays have
