
Vol 8, No 2 (2018)


Research Article

February 2018

Computer Science and Software Engineering

ISSN: 2277-128X (Volume-8, Issue-2)

Intrusion Detection System using Artificial Immune Systems: A Case Study

Ojasvini, Nitesh, Piyush, Narina Thakur, Arvind Rehalia

CSE Department, Bharati Vidyapeeth's College of Engineering, New Delhi, India

Email- [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract: Networks are operating at peak efficiency and growing in size every second, and the threats that emerge alongside them hinder both that growth and the privacy of users. Malicious nodes make a network vulnerable to security breaches, and intrusion detection systems aim to remove this vulnerability. This paper investigates intrusion detection mechanisms for large-scale dynamic networks. An artificial immune system protects a network in the way the immune systems of vertebrates work in nature. The paper also illustrates the artificial immune system, the integration of bio-inspired algorithms, and how they function within computer networks.

Keywords: Intrusion Detection, Artificial Immune System, Negative Selection Algorithm, r-Contiguous Bits, Network Security

I. INTRODUCTION

Humans built computers to automate routine tasks. Computer systems have relieved us of a great deal of work, so it is only fitting that we protect them as we would protect ourselves. Over the last few decades, considerable effort has gone into the operational welfare of computer systems. The aim here is an intrusion detection system that works like the human immune system: it protects the system from intrusions and removes anomalies. This paper describes existing intrusion detection system models. An intrusion detection system (IDS) monitors network traffic and the devices that access the network for suspicious activity, and informs the responsible administrator of any anomalies. There are network-based (NIDS) and host-based (HIDS) intrusion detection systems. Some IDS only detect and report anomalies, while others take action in response to a detected threat. An IDS can be a valuable tool for proactively monitoring a network and shielding it from malicious activity, but, just like our immune systems, it is also prone to false alarms.

Models of artificial immune systems that can serve as efficient intrusion detection mechanisms for computer networks are therefore worth studying.

The main issue reviewed here is the identification of detectors, which are an integral unit of an Artificial Immune System (AIS). We take a constructive approach to understanding how detectors are formed and identified and how they function, drawing on concepts from machine learning and data analysis. Because these mechanisms are still crude compared with their biological counterparts, it has so far been possible only to analyze the functional behavior of the AIS model and to evaluate the value of the process.

The goal of this paper is to make the biological connections more concrete and to emphasize the adaptive systems framework that is the basis of our implementation. Section II gives a basic description of the Artificial Immune System. Section III summarizes the working of the human immune system: the immune response in sub-section A and the structure of the immune system in sub-section B. The core issue is addressed in section IV. Section V details the detection mechanisms, with the negative selection algorithm in sub-section A and r-contiguous bits in sub-section B. The life phases of a detector string are discussed in section VI, memory-based detection in section VII, and the application scenario, LISYS, in sections VIII and IX.

II. ARTIFICIAL IMMUNE SYSTEM


ISSN(E): 2277-128X, ISSN(P): 2277-6451, pp. 19-25

attributes of the immune system such as its diverse, distributed, and dynamic nature, its ability to adapt to the environment, its sensitivity to individuality, and its tolerance of faults [1].

The information-processing capability of an immune system is extremely powerful thanks to further features such as feature extraction, pattern recognition, learning, memory, and robustness, which make it a compelling paradigm for its artificial equivalent [2].

AIS can therefore be regarded as an attempt to protect a system against internal faults and external attacks, directly inspired by the functioning of natural immune systems.

Aspects of the biological immune system such as Ag-Ab (antigen-antibody) binding, the immune network, distributed control, self/non-self discrimination, clonal selection, and affinity maturation have found applications in fields such as fraud detection, computer security, chemical spectrum recognition, data analysis, clustering, pattern matching, optimization, and hardware fault detection and tolerance [5].

III. HUMAN IMMUNE SYSTEM

The human immune system, like the immune system of vertebrates in general, protects the body from harmful foreign elements called pathogens. Immune cells use the blood and lymph pathways to patrol the body, and the system essentially works to determine whether each item it encounters is 'self' or 'non-self'.

The word 'self' here represents the individuality of a body, which is biologically marked by the chemical composition of the individual's DNA.

The immune system works on the principle that all elements recognized as 'self' should remain untouched, while everything else, that is, every 'non-self' element, should be destroyed. The functioning of such a complex network of cells, which form tissues that in turn form organs, is a biological marvel in itself.

A. Immune System

The immune system begins by recognizing those bodies that carry self-marker molecules. If the system detects any anomaly in the marker molecules, it immediately launches an attack on that foreign element. Anything that triggers an immune response is called an antigen.

An intricate yet dynamic communication network among the immune cells produces timely changes and powerful substances that allow cells to regulate their growth and behavior and to mount a defense response.

A problem arises when the immune system makes a mistake in discriminating between self and non-self elements.

In rare or abnormal conditions, 'self' elements are mistaken for 'non-self' and attacked, or foreign elements are not detected at all. The former leads to various autoimmune disorders, such as some forms of diabetes or arthritis.

B. Structure of Immune System in Humans

The organs of the immune system are positioned throughout the host body. They are known as lymphoid organs because they house lymphocytes, the small white blood cells that are the most significant component of the immune system. Immune cells originate in the bone marrow, which is the source of all blood cells. The spleen serves as the meeting ground for the army of immune cells: it contains special compartments where the immune cells gather and work out how to confront the antigens. There are three kinds of immune cells: T-cells, B-cells, and phagocytes. Of these, T-cells and B-cells are lymphocytes. All are produced in the bone marrow as undeveloped stem cells. Whenever an intrusion appears, the detectors work out a counterattack and the stem cells multiply to form an army of immune cells.

The thymus, found behind the breastbone, matures white blood cells into the lymphocytes known as T-cells.

The immune system uses the blood vessels and the lymphatic vessels, which run parallel to the blood vessels, as its transport medium. The lymphatic vessels carry lymph, a fluid that bathes the body's tissues.

Lymphocytes travel in the bloodstream to patrol the body and return to the lymph nodes. These lymph nodes are strung along the lymphatic vessels and occur in clusters at the neck, armpits, abdomen, and groin.

In addition, the immune system has a line of defense made of clumps of lymphoid tissue in the areas most susceptible to attack, such as the digestive tract and the airways. These are found in the tonsils, adenoids, and appendix.


few structural varieties, known as isotypes. A single B-cell can clone multiple B-cells, each with a distinct isotype, while the receptor variable regions remain constant. This is called isotype switching, and it enables the immune system to choose between numerous effector functions via chemical binding [1]. These cells follow a specific process:

Figure 1: The operation of B-cells, called affinity maturation.

T-cells detect antigen fragments on the surface of infected or cancerous cells and work in two major ways: they either initiate and regulate immune responses, such as allergic reactions or the rejection of transplanted organs, or they attack infected cells directly.

In some cases, T-cells recognize an antigen only if it is carried on the surface of a cell by one of the body's own major histocompatibility complex (MHC) molecules. MHC molecules are proteins recognized by T-cells when distinguishing between self and non-self. A self MHC molecule provides a recognizable scaffold that presents a foreign antigen to the T-cell.

Autoimmune disorders are normally prevented by a process known as negative selection, which allows only those T-cells that do not recognize self-cells to survive.

IV. ADDRESSING THE PROBLEM

The problem of detecting pathogens within the host is usually described as discriminating "self" from "non-self" (the elements of the body and the pathogens, respectively) [1]. However, many non-self elements are not harmful, and an immune response to eliminate them might harm the body. Such a response is called an allergy, and the elements that provoke it are called allergens. In such cases it would be healthier not to respond, so it is more precise to say that the real problem faced by the immune system is distinguishing harmful non-self elements from everything else [3,4].

The immune system can make two kinds of discrimination errors, as described in [1]:

● False positive: An element is identified as non-self by the system but is actually self, so the system responds to something harmless.

● False negative: An element is identified as self by the system but is actually non-self, so there is no response from the host system.
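The two error types can be made concrete with a small counting sketch. The label strings and the function name `count_errors` are illustrative assumptions and do not come from [1].

```python
from collections import Counter

def count_errors(truth, predicted):
    """Count detection outcomes.

    truth and predicted are equal-length sequences of the labels
    "self" and "non-self" (label names are assumed for illustration).
    """
    counts = Counter()
    for actual, guess in zip(truth, predicted):
        if actual == "self" and guess == "non-self":
            counts["false_positive"] += 1   # harmless element attacked
        elif actual == "non-self" and guess == "self":
            counts["false_negative"] += 1   # intruder goes unnoticed
        else:
            counts["correct"] += 1
    return counts

truth = ["self", "self", "non-self", "non-self"]
predicted = ["self", "non-self", "self", "non-self"]
print(count_errors(truth, predicted))
```

In an intrusion detection setting, the first count corresponds to unnecessary alarms and the second to missed attacks.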

When real-world computational problems are mapped onto this framework, it is necessary to choose the characteristics of the immune system that suit the application domain.

V. DETECTION MECHANISM

In the system under study, the process is simplified by treating lymphocytes as a single kind of detector that combines the properties of T-cells, B-cells, and phagocytes.

Figure 1 (flowchart steps): (1) B-cells identify triggering antigens; (2) a large number of plasma cells are produced; (3) plasma cells produce antibodies for that antigen; (4) the antigen interlocks with the antibodies.


Lymphocytes are termed monoclonal if they carry numerous identical receptors on their surface. These receptors bind to regions on pathogens called epitopes, or marker molecules. Binding depends on the chemical structure of the components, so a receptor is likely to bind to only a few similar forms of epitope.

The more likely a bond is to form, the higher the affinity between receptor and epitope. In the model, each epitope and each receptor is treated as a binary string of fixed length l, and chemical binding between them is treated as approximate string matching. Every detector is associated with a binary string that represents its receptors.

During affinity maturation, stimulated antibodies undergo somatic mutation at high rates (somatic hypermutation). The amount of mutation that a B-cell undergoes is inversely related to how well it matches the antigenic pattern: the higher the affinity (match), the lower the mutation, and vice versa [6].
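The relationship between affinity and mutation can be sketched as follows. This is only an illustration under stated assumptions: affinity is taken to be the fraction of agreeing bits, and the mutation rate is assumed to fall linearly with affinity. Neither choice, nor the names `affinity`, `mutation_rate`, and `hypermutate`, comes from [6].

```python
import random

def affinity(receptor: str, antigen: str) -> float:
    """Fraction of positions where the bit strings agree (assumed measure)."""
    if len(receptor) != len(antigen):
        raise ValueError("strings must have the same length")
    same = sum(a == b for a, b in zip(receptor, antigen))
    return same / len(receptor)

def mutation_rate(aff: float, max_rate: float = 0.5) -> float:
    """Per-bit flip probability; assumed to decrease linearly with affinity."""
    return max_rate * (1.0 - aff)

def hypermutate(receptor: str, antigen: str, rng: random.Random,
                max_rate: float = 0.5) -> str:
    """Flip each receptor bit with a probability set by the affinity."""
    rate = mutation_rate(affinity(receptor, antigen), max_rate)
    out = []
    for bit in receptor:
        if rng.random() < rate:
            out.append("1" if bit == "0" else "0")  # mutate this bit
        else:
            out.append(bit)
    return "".join(out)

rng = random.Random(0)
print(hypermutate("1010101", "1010101", rng))  # perfect match: unchanged
print(hypermutate("1010101", "0101010", rng))  # poor match: heavily mutated
```

A receptor that already matches the antigen perfectly is left untouched, while a poorly matching one is mutated at up to `max_rate` per bit.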

A. Negative Selection Algorithm

The negative selection algorithm is inspired by the maturation of T-cells in the thymus [7]. It was first used by Forrest et al. as a method of detecting unknown or illegal strings for virus detection in computer systems and networks.

To implement it, a collection of detectors is created to behave like T-cells. These detectors are simple, fixed-length binary strings. A simple rule is used to compare the bits in two such strings and decide whether a match has occurred. Such a match is equivalent to a match between a lymphocyte and an antigen [8].

Every randomly generated candidate detector is compared with every pattern in the self-set. The self-set is a set of patterns akin to the self-proteins presented in the thymus: it contains samples of self, against which all candidates are tested.

Any candidate that matches a pattern in the self-set is discarded. A candidate that matches none of them (the "negative" of the selection) is kept as a detector.
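The generation step described above can be sketched as follows. The matching rule used here is the r-contiguous rule that the paper introduces later; the function names, the `max_tries` cap, and the toy self-set are assumptions made for illustration.

```python
import random

def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """True if a and b agree in at least r contiguous positions."""
    if r <= 0:
        return True
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

def censor(self_set, n_detectors, length, r, rng, max_tries=100_000):
    """Generate random candidates and keep those that match no self string."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = "".join(rng.choice("01") for _ in range(length))
        if any(r_contiguous_match(candidate, s, r) for s in self_set):
            continue  # candidate matches self, so it is rejected
        detectors.append(candidate)
    return detectors

self_set = ["0000000", "1111111"]          # toy self-set (assumed)
detectors = censor(self_set, 5, 7, 4, random.Random(1))
print(detectors)
```

The `max_tries` cap is a practical guard: when the self-set is large or r is small, almost every candidate matches self and generation can take very long.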

The time taken by the detector-generating algorithm depends on two factors: the time to generate the candidate detectors and the time to compare each of them with the self-data.

Further patterns can then be gathered from the system being monitored, converted into the appropriate binary form, and compared with the detector set.

If a tested pattern matches any detector in the detector set, the pattern cannot belong to the recorded self-set, so it is treated as non-self and action can be taken accordingly [8].
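The monitoring check can be sketched in a few lines. The detector and the incoming patterns below are made-up values, the function name `monitor` is hypothetical, and the r-contiguous rule from later in the paper is assumed as the matching rule.

```python
def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """True if a and b agree in at least r contiguous positions."""
    if r <= 0:
        return True
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

def monitor(patterns, detectors, r):
    """Return the patterns that are matched by at least one detector."""
    return [p for p in patterns
            if any(r_contiguous_match(p, d, r) for d in detectors)]

detectors = ["1111111"]                       # assumed detector set
incoming = ["0000000", "0011110", "1010101"]  # assumed observed patterns
print(monitor(incoming, detectors, 4))        # ['0011110']
```

Only the second pattern shares four contiguous bits with the detector, so it is the only one flagged.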

As described in [7], the algorithm works in two steps. The first is the censoring step, which generates the detector set; the second is the monitoring step, which checks incoming patterns against that set.

If the matching rule required an exact match, the detector set would need to contain a detector for every illegal string that could possibly occur. This would lead to a huge computational overhead and an impractical algorithm. Instead, detection is probabilistic: only r contiguous bits need to be identical for a match to occur. The value of r is known as the matching threshold.
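How much a partial match widens the reach of one detector can be estimated. For two random binary strings of length l, the approximation usually quoted from the analyses in [7] and [9] for the probability of a match in at least r contiguous positions is shown below. The symbol P_M is notation introduced here, and the approximation is intended for the case where 2^{-r} is small.

```latex
P_M \approx 2^{-r}\left[\frac{l - r}{2} + 1\right]
```

For l = 7 and r = 4 this gives P_M = 0.15625, that is, 20 of the 128 possible strings, whereas an exact-match rule (r = l) matches only 1 string in 128.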

Process (Sentential Format):

1. The elements of the self-string set are divided into equal-size segments for detection and comparison.

2. The equal-size segments are assumed not to change over time.

3. The rule for comparison and detection is that two strings must match in r contiguous places, regardless of where the matching substring occurs in the string.

Advantages:

1. Template matching (a lock-and-key mechanism) is used to detect changes between the incoming strings and the learned detector set.

2. Each copy of the detector set is unique to the system on which it is generated.

3. Because every site generates its own detector set, breaking the scheme at one site does not compromise the other sites. The discrimination is probabilistic, which keeps the computational overhead at each site manageable.

4. The algorithm detects any intrusive activity, rather than looking for specific intruders, by using the r-contiguous rule to check the running tasks. However, because the discrimination is probabilistic, the cost of generating the detector set is high.

[Figure: flowchart of detector generation, with the labels "Detector set", "Do they match?", "Reject", "NO", "YES", and "Set of generated detectors".]


B. R Contiguous Bits Mechanism

Approximate matching rules in common use include Hamming distance and edit (Levenshtein) distance [10], but a more immunologically plausible rule is adopted here, called r-contiguous bits [9]: two strings are said to match if they are identical in at least r contiguous positions.
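A minimal sketch of the rule is given below. The function name is an assumption. The test strings are the ones printed with Figure 2 (length 7, r = 4); pairing them column-wise is also an assumption, since the figure layout is not recoverable from the text.

```python
def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """Return True if a and b agree in at least r contiguous positions."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    if r <= 0:
        return True  # a threshold of zero matches everything
    run = 0  # length of the current run of agreeing positions
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

# Strings printed with Figure 2; the pairing is assumed.
print(r_contiguous_match("0101011", "1101010", 4))  # True: bits 2 to 6 agree
print(r_contiguous_match("1010101", "1101010", 4))  # False: only bit 1 agrees
```

The scan keeps a running count of agreeing positions and resets it on the first disagreement, so it runs in time linear in the string length.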

The value of r acts as a threshold and determines the specificity of the detector, which indicates the size of the subset of strings that a single detector can match.

Note the boundary conditions of the r-contiguous rule. If r equals the length of the string, matching is completely specific: the detector matches only a string identical to itself. If r is zero, matching is completely general: the detector matches every string of the specified length.
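The boundary conditions can be checked by brute force for a short string. The sketch below enumerates every 4-bit string and counts how many a single detector matches for each value of r; the detector value `1011` and the function names are arbitrary choices made for illustration.

```python
from itertools import product

def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """True if a and b agree in at least r contiguous positions."""
    if r <= 0:
        return True
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

def matched_strings(detector: str, r: int):
    """All bit strings of the detector's length that the detector matches."""
    candidates = ("".join(bits) for bits in product("01", repeat=len(detector)))
    return [s for s in candidates if r_contiguous_match(detector, s, r)]

detector = "1011"  # arbitrary example detector
for r in range(len(detector) + 1):
    print(r, len(matched_strings(detector, r)))
# r = 0 matches all 16 strings; r = 4 matches only "1011" itself.
```

The counts fall from 16 at r = 0 to 1 at r = 4, which is the range between fully general and fully specific matching described above.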

Figure 2: Matching under the r-contiguous bits rule.

A consequence of a partial matching rule with a threshold, such as r-contiguous bits, is that there is a trade-off between the number of detectors used and their specificity: as the specificity of the detectors increases, the number of detectors required to achieve a given level of detection also increases. Highly specific detectors may also over-fit the training set. The optimal value of r is the one that minimizes the number of detectors needed while still giving proper discrimination [1].

VI. LIFE OF A DETECTOR STRING

Randomly created: Detector strings are created as binary strings by an entirely random process.

Immature: A lymphocyte is said to be activated only when its receptors bind to epitopes, which leads to the elimination of antigens. Until this happens, the lymphocyte is said to be immature. Because lymphocytes are trained to bind to non-self, the immune system responds at the time of activation as if a foreign element has been detected.

Figure 3 describes the essential life cycle of a detector, which any matching rule used for this purpose (such as the r-contiguous rule) would follow. The description of the phases follows [1].

[Figure 2 example. Given: length of string = 7 bits, r = 4. Strings shown: 0101011, 1010101, 1101010, 1101010.]


To avoid this, lymphocytes go through a simpler learning process called tolerisation, in which they are trained to be tolerant of self-elements. In the human body this process takes place largely in the thymus. Cells from the thymus are said to be immature if they have not crossed the threshold values for activation.

Mature and Naïve: Lymphocytes reach this phase after passing through an asynchronous and distributed tolerisation process [1]. Each detector remains immature for a certain duration, called the tolerisation period, during which it is exposed to its surroundings. If it matches any of the binary strings in the self-set during this period, it is eliminated; if it matches nothing, it becomes a mature detector. The detector string is now part of the detector set, but it is naïve: it has not yet been activated by any pattern.
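The immature-to-mature transition can be sketched as a small state machine. The class, the state names, and the step function are hypothetical; the r-contiguous rule is assumed as the matching rule, and every string observed during tolerisation is assumed to be self.

```python
from dataclasses import dataclass

def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """True if a and b agree in at least r contiguous positions."""
    if r <= 0:
        return True
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

@dataclass
class Detector:
    bits: str
    age: int = 0
    state: str = "immature"  # becomes "mature" or "dead"

def tolerise_step(detector, observed, r, tolerisation_period):
    """Advance an immature detector by one time step.

    observed holds the strings seen in this step (assumed to be self).
    """
    if detector.state != "immature":
        return detector
    if any(r_contiguous_match(detector.bits, s, r) for s in observed):
        detector.state = "dead"      # matched self during tolerisation
        return detector
    detector.age += 1
    if detector.age >= tolerisation_period:
        detector.state = "mature"    # survived the whole period
    return detector

d = Detector("1111111")
for _ in range(2):
    tolerise_step(d, ["0000000"], r=4, tolerisation_period=2)
print(d.state)  # mature
```

A detector that matches any observed string before the period ends is marked dead and would be replaced by a fresh random candidate.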

The natural immune system is known to follow a memory-based detection mechanism. Its adaptive response remembers the molecular structures that characterize each pathogen it encounters, so that it can recognize those structures in future. This is what makes the immune system dynamic.

VII. WORKING OF MEMORY BASED DETECTION SYSTEM

The memory-based detection system is primarily trained on a subset of non-self so that it can detect the specific elements of that subset [1].

Like most of the processes described earlier, this is a two-step process. The first is the primary response, during which the immune system learns to recognize previously unseen substrings. When the system encounters the same pattern again, it launches the secondary response, which indicates that the pattern has been seen before. This ensures rapid activation of the lymphocytes.

Besides activation, lymphocytes must go through another step before entering memory: T-cells need to be costimulated in order to be activated. After binding to a molecular structure on a pathogen (called signal one), the cell must receive a second signal as well. This second signal is usually a chemical signal produced when the host is damaged by the pathogen in some way. If the second signal is not received within the costimulation delay, the lymphocyte dies off through apoptosis (programmed cell death). This is the entire memory-based detection process.
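The costimulation rule amounts to a timeout on the second signal. The sketch below models it with discrete time steps; the function name, the return labels, and the use of integer time steps are assumptions made for illustration.

```python
def costimulation_outcome(match_time, second_signal_time, delay):
    """Fate of a detector that received signal one at match_time.

    second_signal_time is the step at which signal two arrived,
    or None if it never arrived. delay is the costimulation delay.
    """
    if second_signal_time is None:
        return "dies"
    if match_time <= second_signal_time <= match_time + delay:
        return "survives"   # costimulated in time
    return "dies"           # signal two came too late (or too early)

print(costimulation_outcome(10, 12, 5))    # survives
print(costimulation_outcome(10, None, 5))  # dies
print(costimulation_outcome(10, 20, 5))    # dies
```

In an intrusion detection setting, the second signal could come from a human operator confirming that a raised alarm was genuine.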

VIII. APPLICATION SCENARIO OF THE AIS

The artificial immune system finds applications in various fields. One of them is network security, a natural domain for adaptive systems, and one such system is LISYS, the Lightweight Intrusion Detection System.

LISYS is a prototype AIS specialized for the problem of network intrusion detection while maintaining low error rates. It was developed and enhanced by Steven Hofmeyr. LISYS is a complex system in which each node of the protected system is a networked computer with a local collection of receptors (detectors) and a local sensitivity level. The antigens monitored by the detectors are strings that contain information about the network traffic affecting the protected nodes. The detection of an anomalous string results in an alarm being raised for a human operator.

LISYS uses non-self recognition, which means that valid detectors are those that fail to match the connections normally occurring in the network. Detectors are generated randomly, and candidate detectors that match connections observed in the network during the tolerisation period are eliminated. In addition, detectors have a fixed probability of dying randomly at each time step. The finite lifetime of detectors, combined with detector regeneration and tolerisation, results in rolling coverage of the self-set. For the r-contiguous bits matching rule and fixed self-sets that do not change over time, randomly generating detectors is inefficient. The results reported in [Glickman et al. 2005] for LISYS are encouraging and indicate, in particular, that all of the implemented mechanisms contribute to system performance.
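The random death and regeneration of detectors can be sketched as one update per time step. This is not LISYS code: the function name, the `new_detector` callback, and the parameter values are assumptions made for illustration.

```python
import random

def age_population(detectors, p_death, rng, new_detector):
    """One time step: each detector dies with probability p_death and is
    replaced by a fresh candidate produced by new_detector()."""
    next_generation = []
    for d in detectors:
        if rng.random() < p_death:
            next_generation.append(new_detector())  # regenerate
        else:
            next_generation.append(d)               # survives this step
    return next_generation

rng = random.Random(42)

def random_detector(length=7):
    return "".join(rng.choice("01") for _ in range(length))

population = [random_detector() for _ in range(10)]
for _ in range(100):
    population = age_population(population, 0.05, rng, random_detector)
print(population)
```

With a fixed death probability p per step, a detector's lifetime is geometrically distributed with mean 1/p steps, so the population is gradually replaced. In the scheme described above, each replacement would have to pass tolerisation again before becoming mature.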

IX. WORKING OF LISYS

LISYS is a partial specialization of AIS to the problem of network intrusion detection ([1]; Glickman et al. 2005). In LISYS, the monitored nodes are computers that must be protected against unauthorized access. The activation of a detector corresponds to the detection of suspect traffic that could be part of an attack. False negatives correspond to undetected attacks, and false positives correspond to unnecessary alarms sent to the human operator. The monitored strings summarize the connections that involve the nodes. Every string includes the identification (IP addresses) of the connected nodes and the type of service requested. For simplicity, it is assumed that information about the traffic of the whole network is available at each node. The system was tested with data collected from real computer networks that contained known intrusions. It detected all the intrusion attempts, apart from very short ones, with a low rate of false positives.
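A connection string of the kind described above could be built as follows. The 80-bit layout (32-bit source address, 32-bit destination address, 16-bit port standing in for the requested service) is an assumption chosen for clarity; it is not claimed to be the encoding LISYS actually uses, and the function name is hypothetical.

```python
import ipaddress

def encode_connection(src_ip: str, dst_ip: str, port: int) -> str:
    """Pack two IPv4 addresses and a port into an 80-character bit string.

    Assumed layout: 32 bits source, 32 bits destination, 16 bits port.
    """
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return f"{src:032b}{dst:032b}{port:016b}"

bits = encode_connection("10.0.0.1", "192.168.1.20", 80)
print(len(bits))  # 80
print(bits)
```

Strings built this way have a fixed length, so they can be compared with detectors of the same length under a rule such as r-contiguous bits.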


X. CONCLUSIONS

In this paper, a model of the artificial immune system, which seeks to solve computational problems by behaving analogously to the natural immune system of vertebrates, has been reviewed. The expertise required to build an immune system for computers exists in fragments and requires a considerable amount of experimentation to produce an efficient working model. Since AIS is an abstract concept, it finds applications in various architectures that require anomaly detection and monitoring. The result is an analysis of how an iteratively constructed model originates and of how flexibly it can accommodate further research. Such a review simplifies the analysis of the theoretical concepts before actual data, collected at network terminals where malicious entities have intervened, is analyzed.

REFERENCES

[1] Hofmeyr, Steven A., and Stephanie Forrest. "Architecture for an artificial immune system." Evolutionary Computation 8.4 (2000): 443-473.

[2] Deaton, R., et al. "A DNA based artificial immune system for self-nonself discrimination." Systems, Man, and Cybernetics, 1997. Computational Cybernetics and Simulation., 1997 IEEE International Conference on. Vol. 1. IEEE, 1997.

[3] P. Matzinger. Tolerance, danger and the extended family. Annual Review of Immunology, 12:991–1045, 1994.

[4] P. Matzinger. An innate sense of danger. Seminars in Immunology, 10:399–415, 1998.

[5] Dasgupta, Dipankar, Zhou Ji, and Fabio Gonzalez. "Artificial immune system (AIS) research in the last five years." Evolutionary Computation, 2003. CEC'03. The 2003 Congress on. Vol. 1. IEEE, 2003.

[6] Ayara, Modupe, et al. "Negative selection: How to generate detectors." Proceedings of the 1st International Conference on Artificial Immune Systems (ICARIS). Vol. 1. Canterbury, UK:[sn], 2002.

[7] Forrest, Stephanie, et al. "Self-nonself discrimination in a computer." Research in Security and Privacy, 1994. Proceedings., 1994 IEEE Computer Society Symposium on. IEEE, 1994.

[8] Taylor, Dan, and David Corne. "An investigation of the negative selection algorithm for fault detection in refrigeration systems." Artificial Immune Systems (2003): 34-45.

[9] J. K. Percus, O. E. Percus, and A. S. Perelson. Predicting the size of the antibody-combining region from consideration of efficient self/non-self discrimination. In Proceedings of the National Academy of Science 90, pages 1691–1695, 1993.
