
A Predict Deterministic Finite Automaton for Practical Deep Packet Inspection



Procedia Engineering 29 (2012) 2156–2161. 1877-7058 © 2011 Published by Elsevier Ltd. doi:10.1016/j.proeng.2012.01.279




2012 International Workshop on Information and Electronics Engineering (IWIEE 2012)


Qiang Wei*, Yunzhao Li, Yanjie Chu

Science and Technology on Blind Signal Processing Laboratory, Chengdu, China


Deep packet inspection has become extremely important for network security. In deep packet inspection, the packet payload is compared against a set of patterns specified as regular expressions. Regular expressions are often implemented as deterministic finite automata (DFAs) so that matching runs in linear time at high network link rates.

We propose a predict DFA, which accelerates DFA processing. A predict DFA uses additional information to predict the next several transitions. We tested our proposal on the Layer7 rule set and validated it on real traffic traces. Experiments show that our approach offers a significant performance improvement, with acceleration rates from 1.6 to 2.8 over the original DFA.


Keywords: Deep packet inspection; Regular expression; DFA

1. Introduction

Packet content scanning is crucial to network security and network monitoring applications. In these applications, the payload of packets in a traffic stream is matched against a given set of patterns to identify specific classes of applications, viruses, protocol definitions, etc.

Owing to their expressive power, simplicity and flexibility, regular expressions are replacing explicit string patterns in deep packet inspection. Most popular software tools, including Snort [1] and Bro [2], and networking devices, including the Cisco family of Security Appliances [3] and the Citrix Application Firewall [4], use regular expressions as their pattern language.

* Corresponding author. Tel.: 086- 028-8659-5396; fax: 086-028-8659-5365.

E-mail address: weiqianglg@gmail.com.

Open access under CC BY-NC-ND license.


Finite automata are typically employed to implement regular expressions. There are two types of finite automaton: the non-deterministic finite automaton (NFA) and the deterministic finite automaton (DFA). Unlike an NFA, a DFA has a predictable memory bandwidth requirement: processing an input string involves exactly one DFA state traversal per character. Consequently, the DFA is the preferred method at high network link rates. However, a DFA corresponding to a large set of complicated regular expressions requires a prohibitive amount of memory.
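The one-traversal-per-character property can be illustrated with a minimal table-driven matcher. This is only a sketch: the three-state DFA for the substring "ab" and the names `build_ab_dfa` and `run_dfa` are hypothetical and are not taken from the rule sets discussed in this paper.

```python
def build_ab_dfa():
    """Hypothetical 3-state DFA over bytes that accepts once "ab" has been seen."""
    delta = [[0] * 256 for _ in range(3)]
    delta[0][ord("a")] = 1      # saw 'a'
    delta[1][ord("a")] = 1      # a repeated 'a' keeps us one step from a match
    delta[1][ord("b")] = 2      # saw "ab"
    delta[2] = [2] * 256        # the accepting state is absorbing
    return delta, 0, {2}


def run_dfa(delta, start, accepting, payload):
    """Return (matched, lookups): exactly one table lookup per input byte."""
    state, lookups = start, 0
    for byte in payload:
        state = delta[state][byte]
        lookups += 1
    return state in accepting, lookups


delta, start, accepting = build_ab_dfa()
matched, lookups = run_dfa(delta, start, accepting, b"xxabyy")
```

The lookup count always equals the payload length, which is what makes the memory bandwidth of a DFA predictable.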

When a DFA cannot reach the required matching speed, a stride-k DFA can be considered [5]. A stride-k DFA consumes k characters per state transition rather than just one, thus yielding a k-fold performance increase. The basic problem in building a DFA of stride k, or k-DFA, lies in the memory requirement: given a DFA defined on an alphabet Σ, the non-compressed k-DFA has |Σ|^k outgoing transitions per state.
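To make the growth concrete, the short sketch below evaluates |Σ|^k for a byte alphabet (|Σ| = 256). The function name `stride_k_fanout` and the chosen values of k are illustrative assumptions.

```python
def stride_k_fanout(alphabet_size: int, k: int) -> int:
    """Outgoing transitions per state of a non-compressed stride-k DFA."""
    return alphabet_size ** k


# Fan-out per state for a byte alphabet at strides 1, 2 and 4.
fanouts = {k: stride_k_fanout(256, k) for k in (1, 2, 4)}
```

Already at k = 2 each state needs 65,536 transitions, and at k = 4 more than four billion, which is why a non-compressed k-DFA quickly becomes impractical.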

In this paper we propose a predict DFA, a solution that accelerates DFA processing. A predict DFA is constructed by adding some additional information to the DFA. Compared with a k-DFA, a predict DFA is nearly the same size as the original DFA, yet it can process more than one character at a time.

2. Related work

To cope with the prohibitive memory requirement of DFAs, several algorithmic techniques have been proposed.

Yu et al. [6] proposed segregating rules into multiple groups and evaluating the corresponding DFAs concurrently. This solution decreases the memory space requirement but increases memory bandwidth linearly with the number of active DFAs.

D2FA [7] exploits the redundancy present in the transitions between states and saves 90% of the space. D2FA leads to a trade-off between the size of the DFA representation and the memory bandwidth required to traverse it. Following D2FA, CD2FA [8] uses recursive content labels to reduce the diameter bound of D2FA to 2 with one 64-bit wide memory access. In [9], Becchi limited the number of default paths in D2FA and improved its worst case.

Hybrid FA [10] combines the benefits of DFA and NFA. Any nodes that would contribute to state explosion retain an NFA encoding, while the rest are transformed into DFA nodes. A hybrid FA is nearly as small as an NFA while keeping the low memory bandwidth of a DFA.

In [11], XFA was proposed, which uses finite automata plus scratch memory to solve the problem. The auxiliary memory (data domain in their terminology) is used to keep track of Kleene closure and length restrictions. However, XFA requires manually generating an EIDD (Efficient Implementable Data Domain) for each regex, which can be very time-consuming and error-prone if the operators are not experienced.

In addition, alphabet reduction is a basic technique that maps the set of symbols in an alphabet to a smaller set by grouping characters that label the same transitions everywhere in the automaton [5][9].
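Alphabet reduction can be sketched as grouping the columns of the transition table. The code below is an illustration only, not the procedure of [5] or [9]; the four-state toy DFA and the names `build_toy_delta` and `reduce_alphabet` are hypothetical.

```python
def build_toy_delta():
    """Hypothetical 4-state DFA over bytes, used only for illustration."""
    delta = [[3] * 256 for _ in range(4)]
    for b in range(ord("a"), ord("z") + 1):
        delta[0][b] = 1
    delta[0][ord("r")] = 2
    delta[3][ord("\n")] = 0
    return delta


def reduce_alphabet(delta):
    """Group bytes whose transition column is identical in every state."""
    classes = {}            # column of next states -> class id
    byte_to_class = []
    for b in range(len(delta[0])):
        column = tuple(row[b] for row in delta)
        byte_to_class.append(classes.setdefault(column, len(classes)))
    reduced = [[0] * len(classes) for _ in delta]
    for column, cls in classes.items():
        for state, target in enumerate(column):
            reduced[state][cls] = target
    return byte_to_class, reduced


delta = build_toy_delta()
byte_to_class, reduced = reduce_alphabet(delta)
```

In this toy case the 256 byte values collapse into 4 symbol classes, and a lookup becomes `reduced[state][byte_to_class[byte]]`.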

Majumder et al. designed a high-performance regular expression matching system that employs DFA state caching [12]. In [12], high-frequency and densely connected DFA portions were extracted and stored in a static cache to accelerate DFA matching.

3. Predict DFA

Our approach is designed not to compress the DFA but to accelerate it. A predict DFA is orthogonal to other methods such as D2FA, hybrid FA and state caching.


3.1. Motivating example

We introduce our approach using an example. Figure 1(a) shows part of a standard DFA, defined on the ASCII alphabet, that recognizes the rlogin and RTSP protocols from the Linux layer 7 filter [13]: p1 = ^[a-z][a-z0-9][a-z0-9]+/[1-9][0-9]?[0-9]?[0-9]?00 and p2 = rtsp/1.0 200 ok. In this DFA, state 0 is the initial state.

Fig.1 (a) Part of the DFA which recognizes the expressions ^[a-z][a-z0-9][a-z0-9]+/[1-9][0-9]?[0-9]?[0-9]?00, rtsp/1.0 200 ok; (b) Predict DFA table for Fig.1(a)

Figure 1(a) shows that the transitions from a state to its next states are not evenly distributed. Assuming characters are evenly distributed in packet payloads, when the current state is 0 the next state is state 2 with a much smaller probability (1/256) than state 3 (230/256). So we can predict that the next state after state 0 will be state 3, and that the state two steps after state 0 will also be state 3 with probability 0.89 (230/256 × 255/256).
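The arithmetic behind the 0.89 figure can be checked with exact fractions. The two factors are read directly from the product quoted in the text; the variable names are illustrative.

```python
from fractions import Fraction

p_first = Fraction(230, 256)    # state 0 -> state 3, as given in the text
p_second = Fraction(255, 256)   # second factor of the product in the text
p_both = p_first * p_second     # both steps land in state 3

approx = round(float(p_both), 2)
```

The product is 58650/65536, which rounds to the 0.89 quoted above.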

Figure 1(b) shows the 2-step predict table of the above DFA. All the tables are implemented in hardware (FPGA or ASIC). When processing the i-th character in Str, the (i+1)-th and (i+2)-th characters are read in parallel. Consider the operation on the input string AB@#. Str[0]=A takes the initial state 0 to state 3. At the same time, the 1-step predict table predicts state 3 for state 0, and Str[1]=B takes state 3 to itself. The 2-step predict table also predicts state 3 for state 0, and state 3 again goes to itself on Str[2]=@. In this situation both predict tables give correct predictions, so the predict DFA can jump directly to processing Str[3]=# in state 3, and the 4 characters cost only 2 clock cycles.

Note that the predict DFA is a combination of a DFA and predict tables. The (i+1)-step predict table relies on the i-step one: if the i-step prediction is wrong, there is no need to check the (i+1)-step prediction. If the predict tables are computed properly, we can expect most input packets to be processed much faster than with the original DFA.
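The cycle counting in this example can be mimicked in software. The following is a minimal sketch under stated assumptions: the four-state DFA is a hypothetical stand-in for Figure 1(a) (only the behaviour of states 0 and 3 on the string AB@# follows the text), and `run_predict_dfa` models one hardware clock cycle as one iteration of the outer loop.

```python
def build_toy_delta():
    """Hypothetical 4-state DFA; not the automaton of Figure 1(a)."""
    delta = [[3] * 256 for _ in range(4)]
    for b in range(ord("a"), ord("z") + 1):
        delta[0][b] = 1
    delta[0][ord("r")] = 2
    delta[3][ord("\n")] = 0
    return delta


def run_predict_dfa(delta, tables, start, data):
    """Return (final_state, cycles). tables[j] is the (j+1)-step predict table."""
    state, pos, cycles = start, 0, 0
    while pos < len(data):
        cycles += 1
        origin = state
        state = delta[origin][data[pos]]    # the ordinary DFA transition
        pos += 1
        for table in tables:
            # A later table is only useful if every earlier prediction was right.
            if pos >= len(data) or table.get(origin) != state:
                break
            state = delta[state][data[pos]]  # speculative lookup was correct
            pos += 1
    return state, cycles


delta = build_toy_delta()
tables = [{0: 3, 3: 3}, {0: 3, 3: 3}]       # 1-step and 2-step predict tables
result = run_predict_dfa(delta, tables, 0, b"AB@#")
baseline = run_predict_dfa(delta, [], 0, b"AB@#")
```

With both tables the four characters take 2 cycles, matching the example above, while the plain DFA (no tables) takes 4. In hardware the speculative lookups happen in parallel; the sequential loop here only reproduces the resulting cycle count.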

3.2. Computing the predict tables

3.2.1. Convert DFA to Markov chain

All the predict tables can be computed easily if characters are evenly distributed and independent. We first compute the transition probabilities of the original DFA. The transition probability between state m and state n equals the number of transitions between them divided by the alphabet size (typically 256). In this way, the original DFA is converted to a homogeneous Markov chain. Figure 2 shows the Markov chain corresponding to figure 1.
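The conversion just described amounts to counting, for each pair of states, how many symbols lead from one to the other. A minimal sketch follows; the toy DFA and the names `build_toy_delta` and `dfa_to_markov` are hypothetical, and uniform, independent input characters are assumed as in the text.

```python
from fractions import Fraction


def build_toy_delta():
    """Hypothetical 4-state DFA over bytes, used only for illustration."""
    delta = [[3] * 256 for _ in range(4)]
    for b in range(ord("a"), ord("z") + 1):
        delta[0][b] = 1
    delta[0][ord("r")] = 2
    delta[3][ord("\n")] = 0
    return delta


def dfa_to_markov(delta):
    """P[m][n] = (number of symbols taking m to n) / alphabet size."""
    n_states = len(delta)
    P = [[Fraction(0)] * n_states for _ in range(n_states)]
    for m, row in enumerate(delta):
        for target in row:
            P[m][target] += Fraction(1, len(row))
    return P


P = dfa_to_markov(build_toy_delta())
```

Each row of `P` sums to 1, and in this toy automaton `P[0][3]` is 230/256 because 230 of the 256 byte values take state 0 to state 3.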


It is well known that the steady-state distribution $\pi$ of an irreducible ergodic Markov chain can be calculated by

$$\begin{cases} \pi P = \pi \\ \pi\,[1,1,\dots,1]^{T} = 1 \end{cases} \tag{1}$$

where $P$ is the transition matrix. Generally, we can estimate $\pi$ by

$$\pi = \pi^{(0)} P^{l} \tag{2}$$

where $\pi^{(0)} = [1,0,\dots,0]$ is the initial distribution and $l$ should be big enough. By ordering $\pi$, we can pick the most frequently visited states into the set $S_1$ for prediction. For each picked state $m$ in $S_1$, the 1-step predict state $n$ is decided by

$$p(m,n) = \max_{k \in S} p(m,k) \tag{3}$$

where $S$ is the state set and $p(m,k)$ is the transition probability between state $m$ and state $k$. If $p(m,n) < 0.5$, state $m$ is removed from $S_1$. The i-step popular set $S_i$ is initialized by $S_{i-1}$, and the i-step predict state $n$ for state $m$ is decided by

$$p_i(m,n) = \max_{k \in S} p_i(m,k) \tag{4}$$

where $p_i(m,k)$ is the i-step transition probability, which can be calculated by the Chapman-Kolmogorov equation. If $p_i(m,n) < 0.5$, state $m$ is also removed from $S_i$.
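Equations (2) to (4) can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the authors' implementation: the 4-state transition matrix `P` is a hypothetical example, `l = 100` and the number of popular states are arbitrary choices, and the i-step probabilities are taken as the entries of the matrix power P^i.

```python
def step(pi, P):
    """One application of pi <- pi P (row-vector convention)."""
    n = len(P)
    return [sum(pi[m] * P[m][k] for m in range(n)) for k in range(n)]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def steady_state(P, l=100):
    """Estimate pi as pi0 * P^l with pi0 = [1, 0, ..., 0], as in equation (2)."""
    pi = [1.0] + [0.0] * (len(P) - 1)
    for _ in range(l):
        pi = step(pi, P)
    return pi


def predict_tables(P, popular, steps, threshold=0.5):
    """Build the 1..steps-step predict tables following equations (3) and (4)."""
    tables, Pi, S = [], P, list(popular)
    for _ in range(steps):
        table = {}
        for m in S:
            n = max(range(len(P)), key=lambda k: Pi[m][k])
            if Pi[m][n] >= threshold:     # states below 0.5 are dropped
                table[m] = n
        tables.append(table)
        S = list(table)                   # S_{i+1} starts from the survivors
        Pi = mat_mul(Pi, P)               # Chapman-Kolmogorov: P^(i+1) = P^i P
    return tables


# Hypothetical 4-state chain (rows sum to 1).
P = [
    [0.0, 25 / 256, 1 / 256, 230 / 256],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [1 / 256, 0.0, 0.0, 255 / 256],
]
pi = steady_state(P)
popular = sorted(range(len(P)), key=lambda s: -pi[s])[:2]
tables = predict_tables(P, popular, steps=2)
```

For this chain the two most visited states are 3 and 0, and both the 1-step and 2-step tables predict state 3 for each of them.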

3.2.2. Learning from real data streams

In real network data streams, characters are correlated and not evenly distributed, so the Markov chain model is not appropriate. We therefore introduce a training mechanism to compute the predict tables. The pseudo-code of the algorithm is provided below.


input: dfa: the original DFA; Ni (1 ≤ i ≤ n): size limit of the i-step predict table; training_data: the training network data stream.

output: Ti (1 ≤ i ≤ n): the i-step predict table.

use dfa to match training_data, pick the N1 most frequently visited states into S0;
initialize round r = 1;
repeat
    clear Count, tempT; Sr = Sr−1;
    cur_state = initial state of dfa;
    for character c in training_data {
        next_one_state = dfa(cur_state, c);
        if cur_state ∈ Sr and all Ti (1 ≤ i < r) predict right {
            next_rth_state = the next r-th state from cur_state;
            Count[cur_state->next_rth_state]++;
        }
        cur_state = next_one_state;
    }
    for s ∈ Sr
        pick the next state t where Count[s->t] is maximum for state s, add s->t to tempT;
    order tempT by the value in Count, pick the top Nr ones, add them to Tr;
    remove items from Sr which cannot be found in Tr;
    r = r + 1;
until r > n

In this algorithm, we first pick the most frequently visited states. Based on these, we compute the predict tables one by one, each building on the previous ones. The sequence of size limits Ni (1 ≤ i ≤ n) is an important factor for performance.
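The training procedure can be sketched in Python as follows. This is one possible reading of the pseudo-code, not the authors' implementation: it records the full state trace once instead of re-running the DFA each round, and the toy DFA, the training string and the names `build_toy_delta` and `learn_predict_tables` are hypothetical.

```python
from collections import Counter


def build_toy_delta():
    """Hypothetical 4-state DFA over bytes, used only for illustration."""
    delta = [[3] * 256 for _ in range(4)]
    for b in range(ord("a"), ord("z") + 1):
        delta[0][b] = 1
    delta[0][ord("r")] = 2
    delta[3][ord("\n")] = 0
    return delta


def learn_predict_tables(delta, start, training_data, size_limits):
    """Return [T1, ..., Tn]; size_limits[r-1] is the size limit Nr."""
    # trace[j] is the DFA state just before consuming training_data[j].
    trace = [start]
    for byte in training_data:
        trace.append(delta[trace[-1]][byte])
    visits = Counter(trace[:-1])
    popular = {s for s, _ in visits.most_common(size_limits[0])}   # S0
    tables = []
    for r, limit in enumerate(size_limits, start=1):
        count = Counter()
        for j in range(len(trace) - r):
            cur = trace[j]
            if cur not in popular:
                continue
            # Count only positions where T1..T(r-1) all predicted correctly.
            if all(tables[i - 1].get(cur) == trace[j + i] for i in range(1, r)):
                count[(cur, trace[j + r])] += 1
        best = {}                       # state -> (most frequent r-th successor, count)
        for (s, t), c in count.items():
            if s not in best or c > best[s][1]:
                best[s] = (t, c)
        top = sorted(best.items(), key=lambda item: item[1][1], reverse=True)[:limit]
        table = {s: t for s, (t, _) in top}
        tables.append(table)
        popular = {s for s in popular if s in table}
    return tables


tables = learn_predict_tables(build_toy_delta(), 0, b"ABCDE\n" * 3, [2, 2])
```

On this tiny training string both tables map states 0 and 3 to state 3. With `size_limits=[1, 1]` only the most visited state (state 3) keeps an entry.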

4. Experimental results

To focus on regular expressions commonly used in networking applications, we selected the Linux layer 7 filter [13], which contains 70 patterns for detecting different protocols. We segregated them into 4 groups using Yu’s algorithm [6] and then built the corresponding predict DFA for each group.

In our experiments, we used the real-world TCP and UDP network packet trace obtained from the MIT Lincoln Laboratory [14]. This data set contains payload captured during a particular week. We used Monday’s data for training and the rest for testing. Because of the small number of steps (≤8) and the small predict table size (≤5), the Markov chain and the learning algorithm gave the same results in our experiments.

Becchi’s C++ code was used to convert regular expressions into DFAs and match the network data stream for simulation.

Figure 3 shows the results for the 1-step predict DFA. In figure 3(a), the original DFA, which has 3719 states, is from group 1. Most of the time the 1-step predict table gives the right prediction, so the predict DFA can process 2 characters in a single clock cycle. In figure 3(b), we built 4 predict DFAs for the 4 DFAs and tested them on Tuesday’s data set. The results show that the 1-step predict DFA offers a significant performance improvement, with an acceleration rate over 1.6.

Figure 4(a) shows the results for the multi-step predict DFA. In this experiment, we set Nn = Nn-1 = … = N1 for convenience. The original DFA, which has 3829 states, is from group 2, and the test data set is Tuesday’s. As the number of predict tables increases, the acceleration rate increases too, but there are two limitations. The first is the number of predicting steps: as it increases, the probability of a right prediction decreases, so the growth of the acceleration rate slows down. Figure 4(b) illustrates this situation, with Nn = Nn-1 = … = N1 = 5. The second limitation is resources: too many predict tables cannot be stored in hardware.

[Figure 3. Panel (a): x-axis, size limit of 1-step predict table (1 to 5); y-axis, acceleration rate; series: Tuesday, Wednesday, Thursday, Friday. Panel (b): x-axis, size limit of 1-step predict table (1 to 5); y-axis, acceleration rate; series: group 1 DFA (3719 states), group 2 DFA (3829 states), group 3 DFA (3124 states), group 4 DFA (1530 states).]

Fig.3 (a) Acceleration rate for different days on the DFA from group 1; (b) acceleration rate for different DFAs on Tuesday’s data set


[Figure 4. Panel (a): x-axis, size limit of multi-step predict table (1 to 5); y-axis, acceleration rate; series: 1-step, 2-step, 3-step and 4-step predict DFA. Panel (b): x-axis, steps of prediction (1 to 8); y-axis, acceleration rate.]

Fig.4 (a) Acceleration rate for the multi-step predict DFA on Tuesday’s data set for the DFA from group 2; (b) acceleration rate for different numbers of prediction steps on Tuesday’s data set for the DFA from group 2

5. Conclusions

In this paper, we presented a predict DFA for accelerating the original DFA on hardware devices. To compute the predict tables, we proposed two methods: the Markov chain for simple situations and the learning algorithm for practical networks. The predict DFA is orthogonal to other methods such as D2FA and hybrid FA, so they can be used together to improve performance and mitigate space explosion. Experimental results show that our predict DFA implementation is 1.6 to 2.8 times faster than the original DFA.


References

[1] Snort network intrusion detection system. http://www.snort.org.

[2] Bro intrusion detection system. http://bro-ids.org.

[3] Cisco systems. Cisco ASA 5505 adaptive security appliance. http://www.cisco.com.

[4] Citrix systems. Citrix application firewall. http://www.citrix.com.

[5] Brodie BC, Cytron RK, Taylor DE. A scalable architecture for high-throughput regular-expression pattern matching. ISCA 2006.

[6] Yu F, Chen Z, Diao Y, et al. Fast and memory-efficient regular expression matching for deep packet inspection. Proceedings of the 2006 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS). 2006. 93-102.

[7] Kumar S, Dharmapurikar S, Yu F, et al. Algorithms to accelerate multiple regular expressions matching for deep packet inspection. Proceedings of the 2006 conference on Applications, Technologies, Architectures, and Protocols for Computer Communications. 2006. 339-350.

[8] Kumar S, Turner J, Williams J. Advanced algorithms for fast and scalable deep packet inspection. Proceedings of the 2006 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS). 2006. 81-92.

[9] Becchi M, Crowley P. An improved algorithm to accelerate regular expression evaluation. ANCS. 2007. 145-154.

[10] Becchi M, Crowley P. A hybrid finite automaton for practical deep packet inspection. Proceedings of the International Conference on emerging Networking EXperiments and Technologies (CoNEXT). 2007. 1-12.

[11] Smith R, Estan C, Jha S. XFA: Faster signature matching with extended automata. IEEE Symposium on Security and Privacy. 2008. 187-201.

[12] Majumder A, Rastogi R, Vanama S. Scalable regular expression matching on data streams. SIGMOD. 2008. 161-172.

[13] Levandoski J, Sommer E, Strait M. Application layer packet classifier for Linux. http://l7-filter.sourceforge.net/.

