2017 2nd International Conference on Software, Multimedia and Communication Engineering (SMCE 2017) ISBN: 978-1-60595-458-5
RSNC-based Mechanism against Content Pollution
Attack Method in NDN
Tao FENG and Wu-dian KOU
School of Computer and Communications, Lanzhou University of Technology, Lanzhou 730050, China
Keywords: Named data networking, Content pollution attack, Request-specified network coding, Homomorphic signature.
Abstract. Network coding facilitates content distribution in named data networking (NDN): it improves the cache hit ratio, decreases the average download time and reduces the amount of transmitted data. However, NDN is vulnerable to content pollution attacks because of its caching mechanism and the use of network coding. Request-specified network coding (RSNC) is a new class of network coding mechanism. By introducing a homomorphic hash mechanism into RSNC, this paper proposes a new method against content pollution attacks in NDN. Firstly, the new method ensures that consumers receive linearly independent coded blocks by using RSNC. Secondly, the content server adopts a content-chunk-based transmission mechanism and binds name and content by homomorphically signing the transferred content blocks together with their names. Thirdly, using a dynamic public key technique, our scheme authenticates each generation without updating the initial secret key. Finally, the homomorphic hash is used to verify the signatures received by intermediate nodes or destination nodes. Security analysis and proof show that the new method resists content pollution attacks in NDN while preserving network performance, and that it resists Intra/Inter-GPAs at the same time.
Introduction
Named Data Networking (NDN) is one of the proposed future Internet architectures, in which content names are used for routing. NDN routers can cache content, so that data transfer is faster and content retrieval is more efficient.
However, caching entire data blocks in the router makes the cache inefficient. To improve cache efficiency, Liu Yan et al. [1] proposed Request-Specified Network Coding (RSNC), a novel scheme built on a chunk-based transmission mechanism. The scheme improves the cache hit ratio, decreases the average download time and reduces the amount of transmitted data. However, relying on the cache allows an attacker to carry out content pollution attacks (CPA) [2]. In addition, introducing network coding into the transmission enables pollution attacks by malicious nodes, with the result that the destination node cannot decode correctly.
For CPA, NDN usually uses a signature verification mechanism [3]. NDN names content, and for named content the name and the content are bound together for certification [4]. The signature formula is $M(CN, P, C) = (CN, C, \mathrm{Sign}_P(CN, C))$. In the formula, $CN$ represents the name of content $C$, $P$ represents the content provider, and $M$ represents the transmitted message. This formula addresses the problem that earlier methods cannot fully ensure the three content attributes of data validity, relevance and provenance.
... of Intra-GPAs, but the problem of Inter-GPAs is not discussed. For Inter-GPAs, GuangJun Liu et al. [8] proposed a new homomorphic signature scheme using dynamic public key technology, which resists not only Intra-GPAs but also Inter-GPAs, and authenticates each generation without renewing the initial private key. Feng et al. [9] proposed a secure network coding scheme against CPA in NDN. Firstly, dynamic public key technology is used to authenticate each generation without renewing the private key. Secondly, the message received by an intermediate node or the destination node is verified by using the homomorphism of the hash function. This scheme resists Intra-GPAs / Inter-GPAs and also prevents CPA in NDN.
On further study, the scheme of [9] resists Intra-GPAs / Inter-GPAs. However, it does not show how an Interest packet requests coded blocks. In addition, its signature covers only the content, so it cannot resist name tampering and cannot ensure the relevance between the content and its name. The scheme of [1] employs RSNC to improve network performance, but it does not include signatures and cannot resist the CPA caused by caching and network coding.
In NDN, the intermediate nodes need to verify the content they pass on, which calls for a scheme with small overhead. Inspired by the RSNC scheme of [1], we propose a new RSNC-based method against CPA for NDN, introducing a homomorphic hash function into the new network coding mechanism and taking content relevance into account. Firstly, the new scheme ensures that consumers receive linearly independent coded blocks by using RSNC. Secondly, the content server adopts a chunk-based transmission mechanism and, following the NDN signature formula $M(CN, P, C) = (CN, C, \mathrm{Sign}_P(CN, C))$, binds name and content by a homomorphic signature over the transferred content blocks and names. Thirdly, using a dynamic public key technique, our scheme authenticates each generation without updating the initial secret key. Finally, the homomorphic hash is used to verify the signatures received by intermediate nodes or destination nodes. Our scheme is suitable for NDN: it resists not only CPA but also Intra-GPAs / Inter-GPAs. To resist CPA, the scheme needs to guarantee the three content attributes of data validity, relevance and provenance with low communication overhead and computation cost.
RSNC Method for NDN
Liu Yan et al. [1] proposed RSNC, a network-coding-based multisource content delivery mechanism that enhances network performance. RSNC is implemented on top of chunk-based and hash-based routing schemes [10].
When RSNC is introduced into NDN, each Interest requests a set of chunks $U = \{C_1, C_2, \ldots, C_N\}$, which may be stored in different places in the network. Therefore, the Interest is divided into several sub-Interests before arriving at the destinations, and every sub-Interest requests a subset $U_i$ of $U$. Suppose consumer 1 requests a set of original blocks $U_1$ and consumer 2 requests $U_2$. Then $\langle U, n \rangle$ can satisfy both consumers if $U = U_1 \cup U_2$ is the set of original blocks used for encoding $n = \max\{|U_1|, |U_2|\}$ linearly independent coded blocks. So the node needs to request only $n$ coded blocks from its upstream nodes. When $n < |U|$, the amount of content required from upstream is reduced. Therefore, we define the following aggregation operation to combine two Interests, $\langle U_1, n_1 \rangle$ and $\langle U_2, n_2 \rangle$, which must come from at least two different consumers requesting blocks of the same content. The aggregation operation is defined as follows:

$$\langle U_1, n_1 \rangle \oplus \langle U_2, n_2 \rangle \overset{\mathrm{def}}{=} \langle U, n \rangle, \quad \text{where } U = U_1 \cup U_2 \text{ and } n = \max\{n_1, n_2\}. \tag{1}$$
The separation operation, which divides an Interest into two sub-Interests, is defined as follows:

$$\mathrm{Div}(\langle U, n \rangle) \overset{\mathrm{def}}{=} \{\langle U_3, n_3 \rangle, \langle U_4, n_4 \rangle\}, \tag{2}$$

where $\langle U_3, n_3 \rangle$ and $\langle U_4, n_4 \rangle$ are the sub-Interests of $\langle U, n \rangle$, $U_3, U_4 \subseteq U$, $U = U_3 \cup U_4$, $n_3 = \min\{|U_3|, n\}$ and $n_4 = \min\{|U_4|, n\}$.

Based on the idea of the separation operation, the increment Interest can be determined. Suppose Interest $\langle U_2, n_2 \rangle$ arrives after Interest $\langle U_1, n_1 \rangle$ has been sent out, and the current aggregated pending Interest is $\langle U, n \rangle = \langle U_1, n_1 \rangle \oplus \langle U_2, n_2 \rangle$. Since $\langle U_2, n_2 \rangle$ may contain some blocks that $\langle U_1, n_1 \rangle$ has already requested, those blocks need to be removed from it. So the increment Interest is denoted as follows:

$$\langle U_2, n_2 \rangle \setminus \langle U_1, n_1 \rangle \overset{\mathrm{def}}{=} \langle U_2', n_2' \rangle, \tag{3}$$

where $U_1' = U_1 \cap U_2$, $n_1' = \min\{|U_1'|, n_2\}$, $U_2' = U_2 \setminus U_1$ and $n_2' = \min\{|U_2'|, n_2\}$.

In the way described above, an Interest $\langle U, n \rangle$ is divided several times before arriving at its destinations, and the sub-Interests form a multicast tree. Suppose the tree has the form shown in Figure 1, where $W$, $W_1$ and $W_2$ are the sets of original blocks cached in the nodes. Since $U_1 \subseteq W_1$ and $U_2 \subseteq W_2$, the leaf nodes of the tree do not need to send Interests to other nodes. The coded blocks are sent back from the leaf nodes to the branch node. In the branch node, the coded blocks of $\langle U_1, n_1 \rangle$ and $\langle U_2, n_2 \rangle$ are used to generate the new coded blocks of $\langle U', n' \rangle$. Then, employing the stored original block set $W$ and the coded blocks of $\langle U', n' \rangle$, the coded blocks of $\langle U, n \rangle$ can be generated. However, if $|U_x| = n_x$ is satisfied, network coding is not performed.
Figure 1. Multicast tree.
Since the same coded blocks cached in different nodes might be returned in response to Interests multicast by the same consumer, received coded blocks cannot be cached in the branch node to satisfy future Interests. Therefore, each node caches or stores only original blocks. Received coded blocks are used only to respond to pending Interests and are collected locally until enough coded blocks have been gathered to decode more original blocks.
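The three Interest operations of Eqs. (1) to (3) can be illustrated with a minimal Python sketch. It assumes an Interest is modelled as a pair of a block-index set and a count; the function names `aggregate`, `divide` and `increment` are hypothetical and are not taken from [1].

```python
from typing import FrozenSet, Tuple

# Assumption: an Interest <U, n> is modelled as (set of block indices, number of coded blocks).
Interest = Tuple[FrozenSet[int], int]


def aggregate(a: Interest, b: Interest) -> Interest:
    """Eq. (1): U = U1 | U2 and n = max(n1, n2)."""
    (u1, n1), (u2, n2) = a, b
    return (u1 | u2, max(n1, n2))


def divide(interest: Interest, u3: FrozenSet[int], u4: FrozenSet[int]) -> Tuple[Interest, Interest]:
    """Eq. (2): split <U, n> over U3 and U4 with n3 = min(|U3|, n), n4 = min(|U4|, n)."""
    u, n = interest
    if u3 | u4 != u:
        raise ValueError("U3 and U4 must cover U")
    return (u3, min(len(u3), n)), (u4, min(len(u4), n))


def increment(new: Interest, sent: Interest) -> Interest:
    """Eq. (3): keep only the blocks of the new Interest that were not requested yet."""
    (u2, n2), (u1, _n1) = new, sent
    rest = u2 - u1
    return (rest, min(len(rest), n2))


first = (frozenset({1, 2, 3}), 3)
second = (frozenset({2, 3, 4}), 3)
pending = aggregate(first, second)   # four original blocks, but only three coded blocks requested
extra = increment(second, first)     # only block 4 still has to be requested upstream
```

In this example the aggregated Interest covers four original blocks but asks upstream for only three coded blocks, which is the saving described above for the case $n < |U|$.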
RSNC-based Network Model for NDN
combinations and generation number. The Content carries the original/coded content block. For convenience, the major information of the Interest packet and the Data packet is shown as follows:
(1) $\mathrm{Interest}\langle O : V, m \rangle$: an Interest packet for content $O$, where $V$ is the set of original block indices and $m \le |V|$ is the required number of coded blocks that are encoded from the original blocks specified by the set $V$.
(2) $\mathrm{Data}\langle O : Y, block, \sigma \rangle$: a Data packet of content $O$ that contains $block$ and the signature $\sigma$, where $block$ is an original block or a linear combination of the original blocks specified by the set $Y$.
In the model, every node maintains three data structures: FIB (Forwarding Information Base), CS (Content Store) and PIT (Pending Interest Table). Each CS entry $\mathrm{CS}\langle O : W \rangle$ contains the following information: the content name $O$ and the index set $W$ of the stored content blocks.
Unlike traditional NDN, in this model every node has multiple PITs, one for each interface of the node. Each PIT records the Interests sent and received on the corresponding interface.
(1) $\mathrm{PIT_{OUT}}\langle out : i : O : V, m \rangle$: it denotes that $\langle O : V, m \rangle$ is the cumulative request contained in one or more Interests sent out from interface $i$, and the node has not yet received the response to the cumulative request from interface $i$.
(2) $\mathrm{PIT_{IN}}\langle in : i : O : V, m \rangle$: it denotes that $\langle O : V, m \rangle$ is the cumulative request contained in one or more Interests received from interface $i$, and the node has not responded to it yet.
CPA for NDN
This section describes the scenarios of CPA for NDN.
Content pollution is an attack in which an attacker injects fake content into the cache of a router. Specifically, we divide this attack into two types. In the first type, the attacker attempts to compromise the integrity of the content by injecting or modifying the content block of a data packet transmitted in NDN, so that the consumer receives an invalid data packet. In the second type, the attacker modifies the name of a data packet in the network, while the content block of the data packet is correct and unmodified. In this situation the attacker either prevents consumers from retrieving the required content, which amounts to a denial of service attack, or makes consumers receive irrelevant data packets, because the name and the content do not match. The purpose of the attack is that legitimate nodes cannot identify the faked content in data packets. Due to the propagation mode of NDN, these polluted content packets will pollute the information flow of the whole network.
In this paper, we consider this kind of active CPA. For example, an attacker targets a specific router R. The attacker sends polluted content to R, and the content is stored in the cache of R. When R later receives a genuine Interest packet for that content, it returns the polluted content, which leads to the pollution of the entire network.
RSNC-based Signature Scheme in NDN
Figure 2. RSNC-based signature scheme in NDN.
Setup The original content server serves as the source node that generates content $O$. It divides $O$ into chunks and groups the content chunks into $h$ generations. Each generation contains $N$ chunks and has a generation identifier $id$. The original content server generates the content and initializes our scheme. The process is as follows:
(1) Given the system parameters $(1^{\lambda}, m, n, p, q, g)$, where $p$ and $q$ are prime, $2^{\lambda} < q$, and $o(g) = q$. Here $o(g)$ denotes the multiplicative order of $g$ modulo $p$, so exponents of $g$ are reduced modulo $q$ and lie in the prime field $F_q$.
(2) Choose $n+1$ random numbers $u_i$ ($i = 0, 1, \ldots, n$) from $F_q$ to form the private key of the system, $sk = \{u_i \in F_q \mid i = 0, 1, 2, \ldots, n\}$, and generate the public key of the system:

$$pk = \{p_i = g^{u_i} \mid i = 0, 1, 2, \ldots, n\}.$$
Sign In NDN, the consumer sends an Interest to the network, and the Interest is separated into sub-Interests according to Eq. (2). To facilitate multisource parallel transmission, a hash-routing scheme is employed. In hash-routing, a content request is forwarded not to the original content server but to the responsible cache, which is found by matching the hash of the content identifier to one of the cache identifiers. In this process, Interests are combined and separated according to Eq. (1) and Eq. (3), and each sub-Interest finally reaches the routing node specified by the hash route. If such a sub-Interest is requested for the first time, the specified routing node does not hold the requested content blocks and sends the Interest to the original content server. The original content server receives several sub-Interests from several interfaces. Suppose $\mathrm{Interest}\langle O : U_x, n_x \rangle$ comes from interface $k$, where $U_x = \{V_1, V_2, \ldots, V_{n_x}\}$ and $|U_x| = n_x$. The original content server holds all the required blocks. It returns the $n_x$ required content blocks and the corresponding signatures in turn through the receiving interface $k$. These original blocks can be cached in any router they pass through in order to respond to future Interests. The content server signs as follows:
(1) Randomize $sk$ according to the generation identifier $id$ and the content name $cn$, such that $S = (s_0, s_1, s_2, \ldots, s_n) \in F_q^{n+1}$, where $s_0 = u_0\, cn^{-1}$ and $s_i = u_i \cdot id \cdot u_0 \cdot cn^{-1}$ for $i = 1, \ldots, n$.
(2) Compute the signature of $V_i$:

$$\sigma_i = \frac{cn^{-1} - \sum_{j=1}^{n} s_j\, cn\, v_{ij}}{s_0\, cn}.$$

Combine The intermediate node receives the data packet $\mathrm{Data}\langle O : Y, block, \sigma \rangle$ from interface $k$. The data packet is first checked with the Verify algorithm. If it passes, the intermediate node looks up its $\mathrm{PIT_{IN}}$ entries. Suppose the intermediate node currently holds $n$ original/coded blocks $b_1, \ldots, b_n$, and the index set of block $b_k$ is $B_k$. Let $U = \bigcup_k B_k$, and suppose that an entry $\mathrm{PIT_{IN}}\langle in : j : O : U_x, n_x \rangle$ exists for interface $j$ with $Y \subseteq U_x$, $U \supseteq U_x$ and $n \ge n_x$. The node picks the $n'$ blocks related to $\mathrm{PIT_{IN}}\langle in : j : O : U_x, n_x \rangle$. If the condition on $n'$ and $r$ is met, the node needs to generate $n_x$ coded blocks. Firstly, the node generates $n_x$ groups of linearly independent coding coefficients $(c_{i1}, c_{i2}, \ldots, c_{il}) \in F_q^{l}$ using the random network coding technique, where $i = 1, \ldots, n_x$. Secondly, the node takes as input a group of signatures $\sigma_1, \ldots, \sigma_l$, the corresponding blocks $V_1, V_2, \ldots, V_l \in F_q^{n}$ and the $n_x$ groups of coefficients. Finally, the node outputs $n_x$ pairs

$$(W_i, \sigma_{W_i}) = \Big(\sum_{j=1}^{l} c_{ij} V_j,\ \sum_{j=1}^{l} c_{ij} \sigma_j\Big), \quad i = 1, \ldots, n_x,$$

each forming a coded data packet. The entry $\mathrm{PIT_{IN}}\langle in : j : O : U_x, n_x \rangle$ is then deleted.
The intermediate node sends the $n_x$ coded data packets to the network through interface $j$. Coded data packets cannot be cached locally by the routers they pass through to respond to future Interests.
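The Combine step can be sketched in Python as follows. This is only an illustration under stated assumptions: blocks and signatures are lists of integers modulo a small prime `q`, and the helper names `rank_mod_q`, `draw_coefficients` and `combine` are hypothetical. Linear independence of the coefficient vectors is checked by Gaussian elimination over $F_q$.

```python
import random


def rank_mod_q(rows, q):
    """Rank of a matrix over F_q (q prime), computed by Gaussian elimination."""
    m = [[x % q for x in r] for r in rows]
    rank = 0
    cols = len(m[0]) if m else 0
    for col in range(cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, q)           # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % q for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(x - f * y) % q for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def draw_coefficients(n_x, l, q, rng):
    """Draw n_x linearly independent coefficient vectors of length l (requires n_x <= l)."""
    if n_x > l:
        raise ValueError("cannot draw more than l independent vectors of length l")
    while True:
        coeffs = [[rng.randrange(q) for _ in range(l)] for _ in range(n_x)]
        if rank_mod_q(coeffs, q) == n_x:
            return coeffs


def combine(blocks, sigs, coeffs, q):
    """For each coefficient vector c, output (sum_j c_j V_j, sum_j c_j sigma_j) modulo q."""
    width = len(blocks[0])
    out = []
    for c in coeffs:
        w = [sum(cj * v[t] for cj, v in zip(c, blocks)) % q for t in range(width)]
        s = sum(cj * sj for cj, sj in zip(c, sigs)) % q
        out.append((w, s))
    return out


coded = combine([[1, 2], [3, 4]], [5, 6], [[1, 0], [2, 3]], 11)
```

Because the signature is combined with the same coefficients as the block, a verifier that knows the coefficients can check the coded packet without access to the original blocks.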
Verify Suppose that the intermediate node or destination node receives the data packet $\mathrm{Data}\langle O : Y, block, \sigma \rangle$ through interface $k$. The node looks up the $\mathrm{PIT_{OUT}}$ entries of interface $k$. If there is an entry $\mathrm{PIT_{OUT}}\langle out : k : O : V, m \rangle$ for content $O$ with $Y \subseteq V$, the data packet was requested by the node. If the data packet passes the verification, a data packet carrying an original block is added to the CS, and a data packet carrying a coded block is added to a temporary cache. The $\mathrm{PIT_{OUT}}$ entry is updated or deleted. The node then checks whether more original blocks can be obtained by decoding the blocks in the CS and the cache. If the data packet fails the verification, the node discards it. Then the data packet is forwarded according to Algorithm 3 described in [1]. Finally, the consumer (destination node) caches linearly independent data packets of $O$ in order to recover the original content. The verification algorithm is as follows.
The intermediate node or destination node computes

$$H = g_0^{\sigma_W} H(W), \qquad H' = \prod_{i=1}^{l} g^{c_i\, cn^{-1}} \bmod p,$$

where $H(V_i) = \prod_{j=1}^{n} g_j^{v_{ij}} \bmod p$, $g_0 = p_0$, $g_i = p_i^{\,p_0 \cdot id}$ for $i = 1, 2, \ldots, n$, and checks whether $H = H'$.
Correctness and Security
Homomorphism of Hash Function $H(\cdot)$ Currently, most signature algorithms are based on homomorphic hash functions, and Krohn et al. [11] use a homomorphic hash function to construct their verification algorithm. We now prove the homomorphism of the hash function used in our scheme.
A trusted party globally generates a set of hash parameters $G = (p, q, g)$, where $p$ and $q$ are two large random primes such that $q \mid (p - 1)$. The hash parameter $g$ is a $1 \times n$ row vector composed of random elements of $F_p$, all of order $q$. That is, $g = (g_1, g_2, \ldots, g_n)$, where $g_i^{q} \equiv 1 \bmod p$, $g_i \in F_p$, $1 \le i \le n$.
Let $h_1, h_2, \ldots, h_m$ denote the hash values of the content blocks $V_1, V_2, \ldots, V_m$. They are generated as follows:

$$h_i = H(V_i) = \prod_{j=1}^{n} g_j^{v_{ij}} \bmod p, \quad 1 \le i \le m.$$

Proof: Suppose that $Y = (y_1, y_2, \ldots, y_n) \in F_q^{n}$ and $Z = (z_1, z_2, \ldots, z_n) \in F_q^{n}$. From the definition of the function $H(\cdot)$, we get:

$$H(Y + Z) = \prod_{i=1}^{n} g_i^{y_i + z_i} = \prod_{i=1}^{n} g_i^{y_i} g_i^{z_i} = \prod_{i=1}^{n} g_i^{y_i} \prod_{i=1}^{n} g_i^{z_i} = H(Y)\, H(Z).$$

Therefore, the hash function $H(\cdot)$ is an additively homomorphic function for arbitrary $Y, Z \in F_q^{n}$. For an arbitrary coded content block $W = (w_1, \ldots, w_n) = \sum_{i=1}^{m} c_i V_i$, we have:

$$H(W) = \prod_{j=1}^{n} g_j^{w_j} \bmod p = \prod_{j=1}^{n} g_j^{\sum_{i=1}^{m} c_i v_{ij}} \bmod p = \prod_{j=1}^{n} \prod_{i=1}^{m} g_j^{c_i v_{ij}} \bmod p = \prod_{i=1}^{m} \Big(\prod_{j=1}^{n} g_j^{v_{ij}} \bmod p\Big)^{c_i} \bmod p = \prod_{i=1}^{m} h_i^{c_i} \bmod p.$$
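The homomorphic property can be checked numerically with a short Python sketch. The toy parameters $p = 23$, $q = 11$ and the helper names `make_bases`, `hash_block` and `combine_blocks` are assumptions chosen for illustration; a real deployment would use large primes.

```python
def make_bases(p, q, n):
    """Return n elements of order q modulo p (assumes p, q prime and q | p - 1)."""
    if (p - 1) % q != 0:
        raise ValueError("q must divide p - 1")
    bases = []
    for a in range(2, p):
        g = pow(a, (p - 1) // q, p)   # lands in the subgroup of order q
        if g != 1:
            bases.append(g)
        if len(bases) == n:
            return bases
    raise ValueError("not enough elements of order q")


def hash_block(v, bases, p):
    """H(V) = prod_j g_j^{v_j} mod p."""
    h = 1
    for g, x in zip(bases, v):
        h = h * pow(g, x, p) % p
    return h


def combine_blocks(blocks, coeffs, q):
    """W = sum_i c_i V_i, computed coordinate-wise modulo q."""
    width = len(blocks[0])
    return [sum(c * v[j] for c, v in zip(coeffs, blocks)) % q for j in range(width)]


p, q = 23, 11
g = make_bases(p, q, 3)

# Additive homomorphism: H(Y + Z) equals H(Y) * H(Z) mod p.
Y, Z = [3, 7, 10], [9, 5, 4]
YZ = [(y + z) % q for y, z in zip(Y, Z)]
additive_ok = hash_block(YZ, g, p) == hash_block(Y, g, p) * hash_block(Z, g, p) % p

# Coded block: H(sum_i c_i V_i) equals prod_i h_i^{c_i} mod p.
V, c = [[1, 2, 3], [4, 5, 6]], [2, 7]
expected = 1
for v_i, c_i in zip(V, c):
    expected = expected * pow(hash_block(v_i, g, p), c_i, p) % p
coded_ok = hash_block(combine_blocks(V, c, q), g, p) == expected
```

Reducing exponents modulo $q$ is safe here because every base has order $q$, which is why the hash parameters are required to have that order.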
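The Correctness of the Scheme Theorem 1 Let $W = (w_1, \ldots, w_n)$ be a combined content block and $\sigma_W$ the signature received from a node, where $W = \sum_{i=1}^{l} c_i V_i$ and $\sigma_W = \sum_{i=1}^{l} c_i \sigma_i$. Let $H = g_0^{\sigma_W} H(W)$ and $H' = \prod_{i=1}^{l} g^{c_i\, cn^{-1}} \bmod p$. If $H = H'$, then $W$ is not polluted.
Proof: Write the bases as powers of $g$, that is, $g_0 = g^{s_0 cn}$ and $g_j = g^{s_j cn}$, and recall that $\sigma_i = (cn^{-1} - \sum_{j=1}^{n} s_j\, cn\, v_{ij}) / (s_0\, cn)$. For an original content block $V_i$, we have:

$$\begin{aligned}
g_0^{\sigma_i} H(V_i) &= g_0^{\sigma_i} \prod_{j=1}^{n} g_j^{v_{ij}} \bmod p = g^{s_0 cn\, \sigma_i} \prod_{j=1}^{n} g^{s_j cn\, v_{ij}} \bmod p \\
&= g^{cn^{-1} - \sum_{j=1}^{n} s_j cn\, v_{ij}} \cdot g^{\sum_{j=1}^{n} s_j cn\, v_{ij}} \bmod p = g^{cn^{-1}} \bmod p.
\end{aligned}$$

For the combined content block $W = \sum_{i=1}^{l} c_i V_i$, we have:

$$\begin{aligned}
H = g_0^{\sigma_W} H(W) &= g_0^{\sum_{i=1}^{l} c_i \sigma_i} \prod_{j=1}^{n} g_j^{\sum_{i=1}^{l} c_i v_{ij}} \bmod p \\
&= \prod_{i=1}^{l} \Big(g_0^{\sigma_i} \prod_{j=1}^{n} g_j^{v_{ij}}\Big)^{c_i} \bmod p = \prod_{i=1}^{l} g^{c_i\, cn^{-1}} \bmod p.
\end{aligned}$$

And by the known public parameters, we have $H' = \prod_{i=1}^{l} g^{c_i\, cn^{-1}} \bmod p$, so $H = H'$.
The analysis of security In the NDN network, the original content server sends the data packet (including the signature) to the network, and no other node in the network holds the private key. In addition, it is infeasible for an attacking node to compute the private key of the $id$-th generation from the public key.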
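The correctness argument of Theorem 1 can be checked numerically with the Python sketch below. It rests on several assumptions that are not confirmed by the source: toy parameters $p = 23$, $q = 11$, $g = 2$; a content name already mapped to a nonzero integer `cn` modulo $q$; a signature of the form $\sigma_i = (cn^{-1} - \sum_j s_j\, cn\, v_{ij}) / (s_0\, cn)$; and verification bases $g_0 = g^{s_0 cn}$, $g_j = g^{s_j cn}$, which the sketch derives directly from the secret exponents purely to test the algebra. All function names are hypothetical.

```python
P, Q, G = 23, 11, 2   # toy parameters: Q divides P - 1 and G has order Q modulo P


def randomize(u, gen_id, cn):
    """Per-generation key: s_0 = u_0 / cn and s_i = u_i * id * u_0 / cn (all modulo Q)."""
    cn_inv = pow(cn, -1, Q)
    return [u[0] * cn_inv % Q] + [ui * gen_id * u[0] * cn_inv % Q for ui in u[1:]]


def bases(s, cn):
    """Assumed verification bases g_j = G^(s_j * cn) mod P for j = 0..n."""
    return [pow(G, sj * cn % Q, P) for sj in s]


def sign(s, cn, v):
    """sigma = (cn^-1 - sum_j s_j * cn * v_j) / (s_0 * cn) modulo Q."""
    cn_inv = pow(cn, -1, Q)
    num = (cn_inv - sum(sj * cn * vj for sj, vj in zip(s[1:], v))) % Q
    return num * pow(s[0] * cn % Q, -1, Q) % Q


def verify(g, cn, w, sigma, coeffs):
    """Check g_0^sigma * prod_j g_j^{w_j} == prod_i G^(c_i / cn) modulo P."""
    h = pow(g[0], sigma, P)
    for gj, wj in zip(g[1:], w):
        h = h * pow(gj, wj, P) % P
    cn_inv = pow(cn, -1, Q)
    h_prime = 1
    for c in coeffs:
        h_prime = h_prime * pow(G, c * cn_inv % Q, P) % P
    return h == h_prime


u, cn, gen_id = [3, 5, 7], 4, 2        # secret exponents, name value, generation identifier
s = randomize(u, gen_id, cn)
g = bases(s, cn)
V = [[1, 2], [4, 9]]
sigmas = [sign(s, cn, v) for v in V]

# Combine the two blocks and their signatures with coefficients c = (2, 5).
c = [2, 5]
W = [sum(ci * v[j] for ci, v in zip(c, V)) % Q for j in range(2)]
sigma_W = sum(ci * si for ci, si in zip(c, sigmas)) % Q

original_ok = verify(g, cn, V[0], sigmas[0], [1])
combined_ok = verify(g, cn, W, sigma_W, c)
tampered_ok = verify(g, cn, [W[0], (W[1] + 1) % Q], sigma_W, c)
```

Under these assumptions an original block and a combined block both verify, while a block changed in a single coordinate is rejected, because every base $g_j$ has order $q$ and the change multiplies $H$ by a factor different from 1.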
Theorem 2 For the given public key $pk = \{p_i \mid i = 0, 1, 2, \ldots, n\}$, it is infeasible to calculate the private key $sk = \{u_i \in F_q \mid i = 0, 1, 2, \ldots, n\}$ such that $g^{u_i} = p_i$, supposing that the CDHP is hard.
Proof: If there were an algorithm $f$ that takes $pk$ as input and outputs $sk$, then one could use $f$ to compute $u_i = f(g, g^{u_i})$ from $g$ and $g^{u_i}$, which contradicts the hypothesis.
We have thus shown that $sk$ cannot be derived from $pk$. In our scheme, $sk$ is a vector selected at random by the content server $S$ in $F_q$, so the probability of guessing $sk$ is $1/q$. This probability can be ignored when $q$ is big enough, so the secret key of the $id$-th generation cannot be calculated. In the case of multi-generation transmission, the combination operation of intermediate nodes may lead to Intra-GPAs/Inter-GPAs. First of all, polluted content is defined as follows:
Definition 1 If $W$ is a fake content in the $id$-th generation, this message does not belong to the linear subspace of the $id$-th generation, i.e., $W \notin \mathrm{span}(V_1^{id}, \ldots, V_m^{id})$, where $V_1^{id}, \ldots, V_m^{id}$ denote all the source messages of the $id$-th generation and $\mathrm{span}(V_1^{id}, \ldots, V_m^{id})$ denotes the linear subspace they generate. Next, we prove the security of this algorithm in two aspects.
The security of intra-GPAs Suppose all messages in the following proof are from the $id$-th generation.
Theorem 3 Let $W = (w_1, \ldots, w_n)$ be a coded block, i.e., $W = \sum_{i=1}^{l} c_i V_i$. Compute $H = g_0^{\sigma_W} H(W)$ and $H' = \prod_{i=1}^{l} g^{c_i\, cn^{-1}}$. Then $W$ is not polluted if and only if $H = H'$.
Proof: If $W$ is not polluted, then according to Definition 1, $W$ is a linear combination of the source messages, so $H = H'$ according to Theorem 1. Therefore, we mainly prove the converse: if $H = H'$, then $W$ is not polluted.
Using proof by contradiction, we assume that the attacker A aims at faking content $W$ such that $H = H'$, and we discuss the possibility of each such situation.
(1) Given the fake content $W = (w_1, \ldots, w_n)$, finding $\sigma_W$ such that $H = H'$ requires:

$$g_0^{\sigma_W} \prod_{j=1}^{n} g_j^{w_j} = \prod_{j=1}^{l} g^{c_j\, cn^{-1}}.$$

This is equivalent to solving the discrete logarithm problem.
(2) Given $\sigma_W$, the attacker computes $W \notin \mathrm{span}(V_1, \ldots, V_m)$ such that $H = H'$. Then we have:

$$g_0^{\sigma_W} \prod_{j=1}^{n} g_j^{w_j} = \prod_{j=1}^{l} g^{c_j\, cn^{-1}}.$$

We assume that the first $n-1$ elements of $W = (w_1, \ldots, w_n)$ have been fixed, so the last element $w_n$ is computed from:

$$g_n^{w_n} = \frac{\prod_{j=1}^{l} g^{c_j\, cn^{-1}}}{g_0^{\sigma_W} \prod_{j=1}^{n-1} g_j^{w_j}}.$$

Obviously, to compute $w_n$ we still need to solve the discrete logarithm problem.
(3) Given a fake $cn$ such that $H = H'$, we have:

$$g_0^{\sigma_W} \prod_{j=1}^{n} g_j^{w_j} = \prod_{j=1}^{l} g^{c_j\, cn^{-1}}.$$

This is again equivalent to solving the discrete logarithm problem.
From the above, the attacker cannot fake a message that satisfies $g_0^{\sigma_W} \prod_{j=1}^{n} g_j^{w_j} = \prod_{j=1}^{l} g^{c_j\, cn^{-1}}$. So, if this equation holds, we can judge that $W$ is not polluted.
The security for Inter-GPAs The attacker can take advantage of a valid message $W$ of the $k$-th generation and pass it off as a message of the $(k+i)$-th generation. Because the signature of this message was computed by the source, the message could pass the verification, and the attacker could thereby forge a message of the $(k+i)$-th generation. We prove that our scheme resists Inter-GPAs.
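We consider a content block $W = \sum_{t=1}^{l} c_t V_t$ of the $k$-th generation, where $\sigma_W^{(k)}$ denotes the signature of the content block $W$ in the $k$-th generation. According to the Sign algorithm, the private key group of the $k$-th generation is $S^{(k)} = (s_0^{(k)}, s_1^{(k)}, \ldots, s_n^{(k)})$, where $s_0^{(k)} = u_0\, cn^{-1}$ and $s_j^{(k)} = u_j\, k\, u_0\, cn^{-1}$ for $j = 1, 2, \ldots, n$, that is:

$$S^{(k)} = (u_0\, cn^{-1},\ u_1 k u_0\, cn^{-1},\ \ldots,\ u_n k u_0\, cn^{-1}).$$

The corresponding signature is:

$$\sigma_W^{(k)} = \frac{cn^{-1} \sum_{t=1}^{l} c_t - \sum_{j=1}^{n} s_j^{(k)}\, cn\, w_j}{s_0^{(k)}\, cn}.$$

If an adversary employs content $W$ of the $k$-th generation to forge a content of the $(k+i)$-th generation ($i \neq 0$), the private key of the $(k+i)$-th generation is:

$$S^{(k+i)} = (s_0^{(k+i)}, s_1^{(k+i)}, \ldots, s_n^{(k+i)}) = (u_0\, cn^{-1},\ u_1 (k+i) u_0\, cn^{-1},\ \ldots,\ u_n (k+i) u_0\, cn^{-1}).$$

Obviously, $S^{(k)} \neq S^{(k+i)}$. Therefore, when the intermediate nodes verify this message with the bases $g_0 = g^{s_0^{(k+i)} cn}$ and $g_j = g^{s_j^{(k+i)} cn}$ of the $(k+i)$-th generation, they obtain:

$$\begin{aligned}
g_0^{\sigma_W^{(k)}} \prod_{j=1}^{n} g_j^{w_j} &= g^{s_0^{(k)} cn\, \sigma_W^{(k)}} \cdot g^{\sum_{j=1}^{n} s_j^{(k+i)} cn\, w_j} \\
&= g^{cn^{-1} \sum_{t=1}^{l} c_t - \sum_{j=1}^{n} s_j^{(k)} cn\, w_j + \sum_{j=1}^{n} s_j^{(k+i)} cn\, w_j} \\
&= g^{cn^{-1} \sum_{t=1}^{l} c_t + i\, u_0 \sum_{j=1}^{n} u_j w_j}.
\end{aligned}$$

For the given $k+i$, by the above proof, the probability that $g_0^{\sigma_W^{(k)}} \prod_{j=1}^{n} g_j^{w_j} = \prod_{t=1}^{l} g^{c_t\, cn^{-1}}$ holds can be ignored. So we conclude:

$$g^{cn^{-1} \sum_{t=1}^{l} c_t + i\, u_0 \sum_{j=1}^{n} u_j w_j} \neq \prod_{t=1}^{l} g^{c_t\, cn^{-1}}.$$

To sum up, our signature scheme ensures the relevance and the validity of the content, so our scheme can resist CPA in NDN. In the case of multi-generation transmission, the algorithm is secure against both Intra-GPAs and Inter-GPAs.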
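The inter-generation argument can be illustrated with a small Python sketch in which a block signed for generation $k$ is replayed against the key of generation $k+1$. As before, this is an assumption-laden toy: parameters $p = 23$, $q = 11$, $g = 2$, a name value `cn` in $F_q$, the signature form $\sigma = (cn^{-1} - \sum_j s_j\, cn\, w_j)/(s_0\, cn)$, bases $g_j = g^{s_j cn}$ derived from the secret exponents, and hypothetical function names.

```python
P, Q, G = 23, 11, 2   # toy parameters: Q divides P - 1 and G has order Q modulo P


def generation_key(u, gen_id, cn):
    """S^(id) = (u_0 / cn, u_1 * id * u_0 / cn, ..., u_n * id * u_0 / cn) modulo Q."""
    cn_inv = pow(cn, -1, Q)
    return [u[0] * cn_inv % Q] + [ui * gen_id * u[0] * cn_inv % Q for ui in u[1:]]


def sign(s, cn, w):
    """Signature of a single original block w under the generation key s."""
    cn_inv = pow(cn, -1, Q)
    num = (cn_inv - sum(sj * cn * wj for sj, wj in zip(s[1:], w))) % Q
    return num * pow(s[0] * cn % Q, -1, Q) % Q


def accepts(s, cn, w, sigma):
    """Verify (w, sigma) with the bases g_j = G^(s_j * cn) of the generation owning s."""
    h = pow(G, s[0] * cn * sigma % Q, P)
    for sj, wj in zip(s[1:], w):
        h = h * pow(G, sj * cn * wj % Q, P) % P
    return h == pow(G, pow(cn, -1, Q), P)


u, cn, k = [3, 5, 7], 4, 2
w = [1, 2]
key_k = generation_key(u, k, cn)
key_next = generation_key(u, k + 1, cn)
sigma_k = sign(key_k, cn, w)

same_generation = accepts(key_k, cn, w, sigma_k)       # honest use within generation k
replayed = accepts(key_next, cn, w, sigma_k)           # replay into generation k + 1
shift = (1 * u[0] * sum(uj * wj for uj, wj in zip(u[1:], w))) % Q   # i * u_0 * sum_j u_j w_j with i = 1
```

In this instance the exponent shift $i\, u_0 \sum_j u_j w_j$ is nonzero modulo $q$, so the replayed signature fails. In a field as small as $F_{11}$ the shift vanishes for some blocks, which is why the argument above relies on $q$ being large.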
Characteristics Comparison
Building on the RSNC of [1], our scheme introduces a homomorphic hash function and employs the dynamic public key technique together with the NDN signature formula, forming an RSNC-based homomorphic signature scheme that resists CPA in NDN as well as Intra-GPAs / Inter-GPAs in multi-generation transmission. As shown in Table 1, we compare the functions of this paper with those in [5], [8] and [9]. Among the compared schemes, ours is the only RSNC-based homomorphic signature scheme that improves the hit ratio, and the only homomorphic signature scheme that ensures content relevance.
Table 1. Characteristics comparison in [5], [8], [9] and our scheme.
Scheme      Relevance  Validity  Provenance  Hit Ratio    Resist Inter-GPAs
[5]         ×          √         √           ---          ×
[8]         ×          √         √           ---          √
[9]         ×          √         √           ---          √
Our Scheme  √          √         √           Improvement  √
Summary
In this paper we propose a new RSNC-based mechanism against CPA for NDN. The RSNC-based homomorphic signature scheme preserves network performance and at the same time enhances the ability to resist CPA in NDN. However, the application of network coding in NDN is still at the stage of theoretical proof and experimental simulation and has not yet been deployed in a real environment, so the conclusions about our scheme have limitations.
Acknowledgments
Project supported by the National Natural Science Foundation of China (No. 61462060).
References
[1] Liu Yan, Yu S Z. Network coding-based multisource content delivery in Content Centric Networking [J]. Journal of Network & Computer Applications, 2016, 64(C):167-175.
[2] Ghali C, Tsudik G, Uzun E. Needle in a Haystack: Mitigating Content Poisoning in Named-Data Networking [C]. The Workshop on Security of Emerging Networking Technologies, 2014:68-73.
[3] Gasti P, Tsudik G, Uzun E, et al. DoS DDoS in Named-Data Networking [J]. ACM SIGCOMM Computer Communication Review, 2012, 44(3):66-73.
[4] Smetters D, Jacobson V. Securing network content [J]. 2009.
[5] Yu Z, Wei Y, Ramkumar B, et al. An Efficient Signature-Based Scheme for Securing Network Coding Against Pollution Attacks [C]. INFOCOM 2008, the Conference on Computer Communications. IEEE, 2008.
[7] Chou P A, Wu Y, Jain K. Practical network coding [C]. Proceedings of the Annual Allerton Conference on Communication Control and Computing, 2003.
[8] Liu G, Wang B. Secure Network Coding Against Intra/Inter-Generation Pollution Attacks [J]. China Communications, 2013, 10(8):100-110.
[9] Tao Feng, Xiaomei Ma, Xian Guo, and Jing Wang. Secure Network Coding against Content Pollution Attacks in Named Data Network. Journal of Advances in Computer Networks, Vol. 3, No. 4, December 2015.
[10] Saino L, Psaras I, Pavlou G. Hash-routing schemes for information centric networking [C]. ACM SIGCOMM Workshop on Information-Centric Networking, 2013:27-32.