
arXiv:2010.09326v1 [cs.IT] 19 Oct 2020

Symmetric Private Polynomial Computation From Lagrange Encoding

Jinbao Zhu, Qifa Yan and Xiaohu Tang

Abstract

The problem of X-secure T-colluding symmetric Private Polynomial Computation (PPC) from a coded storage system with B Byzantine and U unresponsive servers is studied in this paper. Specifically, a dataset consisting of M files is stored across N distributed servers according to (N, K + X) Maximum Distance Separable (MDS) codes such that any group of up to X colluding servers cannot learn anything about the data files. A user wishes to privately evaluate one out of a set of candidate polynomial functions over the M files from the system, while guaranteeing that any T colluding servers cannot learn anything about the identity of the desired function and that the user cannot learn anything about the M data files beyond the desired polynomial function, in the presence of B Byzantine servers that can send arbitrary responses maliciously to confuse the user and U unresponsive servers that do not respond at all. Two novel symmetric PPC schemes using Lagrange encoding are proposed. Both schemes achieve the same PPC rate $1 - \frac{G(K+X-1)+T+2B}{N-U}$, secrecy rate $\frac{G(K+X-1)+T}{N-(G(K+X-1)+T+2B+U)}$, finite field size and decoding complexity, where G is the maximum degree over all the candidate polynomial functions. In particular, the first scheme addresses the general case in which the candidate functions are arbitrary polynomials, while the second scheme restricts the candidate functions to a finite-dimensional vector space (or subspace) of polynomials over $\mathbb{F}_p$ but requires lower upload cost, query complexity and server computation complexity. Remarkably, the PPC setup studied in this paper generalizes all previous MDS coded PPC setups, and the two degraded schemes strictly outperform the best known schemes in terms of (asymptotic) PPC rate, which is the main concern of PPC schemes.

Index Terms

Private information retrieval, symmetric private polynomial computation, Lagrange encoding, computation complexity.

I. INTRODUCTION

With the rapid evolution of big data, machine learning and distributed computing, there arise substantial concerns about protecting the computing privacy of a user from public servers. This problem is referred to as Private Computation (PC), which seeks efficient solutions for the user to compute a function of files stored at distributed servers, without disclosing the identity of the desired function to the servers. The PC problem was first introduced in [12], [20] and has attracted remarkable attention in the past few years within the information-theoretic community [9], [13], [15], [16]. In the classical PC setup, the user wishes to compute one out of ξ candidate functions over M files from N non-colluding servers, each of which stores all the M files, while preventing any individual server from obtaining information about which function is being computed.

To this end, the user sends N query strings, one to each server. After receiving the query, each server responds truthfully with an answer string based on the information it stores. Finally, the user is able to recover the desired function from the collected answer strings.

A trivial strategy is to download all the files from the servers and then compute the desired function locally, or to request the servers to compute all the functions and then download all the evaluations; either way incurs significant communication cost and is therefore highly impractical. It was proved that this naive strategy is the only feasible solution in the sense of information-theoretic privacy if the files are stored at a single server [3]. To alleviate this inefficiency, information-theoretic PC with low communication cost can be achieved by replicating the files at multiple non-colluding servers [3]. In such systems, the most important measure of communication efficiency is the PC rate, defined as the number of bits of the desired function that can be privately retrieved per downloaded bit from all servers. The supremum of PC rates over all achievable schemes is referred to as the capacity. Indeed, private computation is a generalization of the Private Information Retrieval (PIR) problem, wherein the user wishes to privately retrieve one out of the M files from the N servers while hiding the identity of the desired file from the servers; see [4], [7], [28] for details on PIR.

J. Zhu and X. Tang are with the Information Security and National Computing Grid Laboratory, Southwest Jiaotong University, Chengdu 611756, China (email: [email protected], [email protected]).

Q. Yan is with the Department of Electrical and Computer Engineering, University of Illinois at Chicago, IL, 60607, USA (email: [email protected]).

In a recent influential work by Sun and Jafar [20], the exact capacity of the classical Private Linear Computation (PLC) problem, where the user wants to privately compute a linear combination of the M files, was characterized as $\left(1 + \frac{1}{N} + \ldots + \frac{1}{N^{M-1}}\right)^{-1}$. Soon afterwards, the problem of PLC over Maximum Distance Separable (MDS) coded storage (MDS-PLC for short), where the files are distributed across the N servers according to (N, K) MDS codes, was considered by Obead et al. in [13], and its capacity was subsequently characterized in [14] to be $\left(1 + \frac{K}{N} + \ldots + \frac{K^{M-1}}{N^{M-1}}\right)^{-1}$. Moreover, they further constructed PLC schemes on arbitrary linear storage codes [15] and showed that the capacity of MDS-PLC can be achieved for a large class of linear codes.

In particular, [9], [15], [16] focused on the PC setup in which the candidate functions are polynomials of maximum degree G in M variables (files) over a finite field $\mathbb{F}_p$, called Private Polynomial Computation (PPC). Very recently, Obead et al. [15] presented two novel non-colluding PPC schemes from systematic and nonsystematic Reed-Solomon coded servers for an arbitrary number of candidate polynomial functions, referred to as systematic MDS-PPC and nonsystematic MDS-PPC, respectively. In [9], [16], the ξ candidate functions were restricted to a finite-dimensional vector space (or subspace) of polynomials over $\mathbb{F}_p$. Accordingly, Karpuk [9] investigated PPC with T colluding and systematically MDS coded servers (systematic MDS-TPPC), where any T out of the N servers can collude to deduce the identity of the function of interest, and proposed an MDS-TPPC scheme achieving the rate $\frac{\min\{N-(G(K-1)+T),\,K\}}{N}$ by generalizing the star-product PIR codes [5]. Later in [16], the security setup was further generalized by Raviv and Karpuk to the scenario of X-secure data storage, B Byzantine servers and U unresponsive servers (U-B-MDS-XTPPC), where data security is guaranteed against up to X colluding servers, while any group of up to B servers may return arbitrary responses maliciously to confuse the user and any disjoint group of up to U servers does not respond at all. As a result, they constructed a U-B-MDS-XTPPC scheme achieving the rate $\frac{N-(G(K+X-1)+T+2B+U)}{N-U} \cdot \frac{K}{G(K+X-1)+1}$ [16] by leveraging ideas from Lagrange coded computation [29] and the successive decoding with interference cancellation strategy [8], [21].

The problem of secure multi-party computation, first introduced by Yao in [27], focuses on jointly computing an arbitrary polynomial function of private datasets distributed among the users (parties), under the constraint that no user gains any additional information about the datasets beyond the function of interest. Naturally, in PC it is also desirable to keep the data files private from the user beyond the desired function. For example, if a user wishes to privately compute a feature function from massive medical datasets, the user should be prevented from learning anything about the medical records beyond the desired function results, besides keeping the feature function private from the servers. This new constraint is called server-privacy and the corresponding problem is called symmetric PC. To protect server-privacy, all the servers are allowed to share common randomness that is independent of the files and unavailable to the user. Consequently, the secrecy rate, defined as the ratio of the amount of common randomness shared by the servers to the number of desired function evaluations, becomes another metric for the efficiency of symmetric PC schemes.

In this paper, we consider the general problem of U-B-MDS-XTSPPC, i.e., symmetric private polynomial computation from (N, K + X) MDS coded storage with X-secure data storage, T-colluding privacy, B Byzantine servers and U unresponsive servers; see Table I for a comparison with PPC schemes in previous setups. In PPC, the computation complexities, consisting of generating queries at the user, computing answers at the servers and decoding at the user, should be considered to further measure the efficiency of PPC schemes. The upload cost (the total length of the query strings) and the size of the finite field $\mathbb{F}_p$ over which the PC schemes operate are two other important practical design factors. For these reasons, the objective of this paper is to design U-B-MDS-XTSPPC schemes with PPC rate as high as possible, while keeping the secrecy rate, upload cost, finite field size, query complexity, server computation complexity and decoding complexity as low as possible.

We propose two novel U-B-MDS-XTSPPC schemes using Lagrange encoding [29], named U-B-MDS-XTSPPC Scheme-1 and U-B-MDS-XTSPPC Scheme-2, respectively. Both schemes have the same PPC rate $1 - \frac{G(K+X-1)+T+2B}{N-U}$, secrecy rate $\frac{G(K+X-1)+T}{N-(G(K+X-1)+T+2B+U)}$, finite field size and decoding complexity. Specifically, U-B-MDS-XTSPPC Scheme-1 addresses the general case of an arbitrary candidate polynomial function set. Similar to [9], [16], U-B-MDS-XTSPPC Scheme-2 restricts the candidate functions to a finite-dimensional vector space (or subspace) of polynomials over $\mathbb{F}_p$, but achieves a better upload cost, query complexity and server computation complexity. Notably, in terms of PPC rate, our degraded PPC schemes are strictly superior to the previous best known schemes for U-B-MDS-XTPPC [16], MDS-TPPC [9], and asymptotic MDS-PPC (i.e., the number of files M → ∞); see Section V for details. In addition, the U-B-MDS-XTPPC scheme in [16] and the PIR schemes in [21], [8] require the queries, answers and decoding to take place over multiple rounds, with the user employing successive decoding with interference cancellation, i.e., the user cancels the interference using the information decoded in previous rounds. In contrast, the rounds of our schemes can be carried out independently and concurrently, which improves the efficiency of retrieving the desired information.

Scheme | MDS Coded Storage | T-Colluding Privacy | X-Security | Byzantine and Unresponsiveness | Server-Privacy | Candidate Polynomial Functions
Nonsystematic MDS-PPC Scheme [15] | Yes | No | No | No | No | Arbitrary
Systematic MDS-PPC Scheme [15] | Yes | No | No | No | No | Arbitrary
Systematic MDS-TPPC Scheme [9] | Yes | Yes | No | No | No | Vector Space
U-B-MDS-XTPPC Scheme [16] | Yes | Yes | Yes | Yes | No | Vector Space
U-B-MDS-XTSPPC Scheme-1 | Yes | Yes | Yes | Yes | Yes | Arbitrary
U-B-MDS-XTSPPC Scheme-2 | Yes | Yes | Yes | Yes | Yes | Vector Space

TABLE I: Comparison for PPC schemes in previous setups. The U-B-MDS-XTSPPC setup studied in this paper generalizes the previous setups of MDS-PPC [15], MDS-TPPC [9], and U-B-MDS-XTPPC [16].

The rest of this paper is organized as follows. In Section II, the problem of U-B-MDS-XTSPPC is formally formulated. In Section III, U-B-MDS-XTSPPC Scheme-1 with Lagrange encoding is constructed for any candidate function set. In Section IV, U-B-MDS-XTSPPC Scheme-2 with low upload cost, query complexity and server computation complexity is presented when the candidate functions to be computed are restricted to be a finite-dimensional vector space (or sub-space) of polynomials.

In Section V, the proposed schemes are compared with known results for the degraded problems of MDS-PPC, MDS-TPPC, and U-B-MDS-XTPPC. Finally, the paper is concluded in Section VI.

The following notations are used throughout this paper.

Let boldface capital and lower-case letters represent matrices and vectors, respectively, e.g., W and q, while their entries are denoted by the corresponding small Roman letters, e.g., w and q;

For any positive integers m, n such that m ≤ n, [n] and [m : n] denote the set {1, 2, . . . , n} and {m, m + 1, . . . , n}, respectively;

Define $A_\Gamma$ as $\{A_{\gamma_1}, \ldots, A_{\gamma_m}\}$ for any index set $\Gamma = \{\gamma_1, \ldots, \gamma_m\} \subseteq [n]$;

For a finite set X , |X | denotes its cardinality.

II. SYSTEM MODEL

Consider a dataset that is stored at a distributed system with N servers. The dataset comprises M independent files, $\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)}$, where each file $\mathbf{W}^{(m)}$ is a random matrix of dimension λ × K whose entries are chosen independently and uniformly from the finite field $\mathbb{F}_p$ for some prime power p, i.e.,

$$\mathbf{W}^{(m)} = \begin{bmatrix} w_{1,1}^{(m)} & \ldots & w_{1,K}^{(m)} \\ \vdots & \ddots & \vdots \\ w_{\lambda,1}^{(m)} & \ldots & w_{\lambda,K}^{(m)} \end{bmatrix}, \quad \forall\, m \in [M]. \tag{1}$$

Let $L \triangleq \lambda K$ be the number of symbols contained in each file.¹ The independence between all the files can be formalized as

$$H(\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)}) = \sum_{m=1}^{M} H(\mathbf{W}^{(m)}) = ML,$$

where the entropy function H(·) is measured with logarithm base p.

¹ As is typical in information theory, the file size is unbounded and the coding schemes may freely choose λ to maximize their effectiveness.

The dataset is stored at the distributed system by using MDS codes over $\mathbb{F}_p$ and kept secure from any group of up to X colluding servers. In analogy to [8], [16], security and the MDS property are guaranteed by employing (N, K + X) MDS codes.

Denote the information stored at server n by $\mathbf{y}_n$ for any n ∈ [N]. Specifically, the storage system needs to satisfy the following.

MDS Property: The dataset can be reconstructed by connecting to at least K + X servers, so as to tolerate up to N − K − X server failures, i.e.,

$$H(\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)} \mid \mathbf{y}_\Gamma) = 0, \quad \forall\, \Gamma \subseteq [N],\ |\Gamma| \geq K + X.$$

The storage at each server is constrained as $ML/K$, which is reduced by a factor of $1/K$ compared to repetition coding storage, i.e.,

$$H(\mathbf{y}_n) = M\lambda, \quad \forall\, n \in [N].$$

X-Security: Any X servers remain perfectly oblivious to the dataset even if they collude, i.e.,

$$I(\mathbf{y}_{\mathcal{X}}; \mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)}) = 0, \quad \forall\, \mathcal{X} \subseteq [N],\ |\mathcal{X}| = X. \tag{2}$$

Obviously, the storage system degrades to the classical (N, K) MDS coded setup when X = 0.

Let $\varphi^{(u)}(x_1, \ldots, x_M) \in \mathbb{F}_p[x_1, \ldots, x_M]$, u ∈ [ξ], be ξ candidate multivariate polynomial functions and G be their maximum degree, i.e.,

$$G = \max\{\deg(\varphi^{(u)}) : u \in [\xi]\}.$$

In Private Polynomial Computation (PPC), a user privately and uniformly selects an index θ ∈ [ξ] and wishes to evaluate the polynomial $\varphi^{(\theta)}$ over the M files $\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)}$ from the N servers, while keeping the index θ private from any colluding subset of up to T out of the N servers. Here, the privacy of the user is restricted to [ξ]. That is, every server knows the set of candidate polynomial functions $\{\varphi^{(u)}\}_{u \in [\xi]}$, but any T colluding servers cannot learn any information about which function is being computed other than that it belongs to $\{\varphi^{(u)}\}_{u \in [\xi]}$. Let $\mathbf{V}^{(\theta)} \triangleq \varphi^{(\theta)}(\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)})$ be the desired function evaluations of the user, where $\mathbf{V}^{(\theta)}$ is a λ × K random matrix of the form

$$\mathbf{V}^{(\theta)} = \begin{bmatrix} v_{1,1}^{(\theta)} & \ldots & v_{1,K}^{(\theta)} \\ \vdots & \ddots & \vdots \\ v_{\lambda,1}^{(\theta)} & \ldots & v_{\lambda,K}^{(\theta)} \end{bmatrix}, \quad \forall\, \theta \in [\xi] \tag{3}$$

with

$$v_{i,j}^{(\theta)} = \varphi^{(\theta)}(w_{i,j}^{(1)}, \ldots, w_{i,j}^{(M)}), \quad \forall\, i \in [\lambda],\ j \in [K]. \tag{4}$$

For this purpose, the user sends S queries to each server, which accordingly responds to the user with S answers according to the information available. Consequently, the user is able to decode the desired evaluations from the answers of the servers. In addition, it is required that the user must not gain any information about the data files $\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)}$ beyond the desired evaluations; this setting is called symmetric PPC. This is guaranteed by the assumption that all the servers share common randomness $\mathcal{F}$, which is independent of all the stored information $\mathbf{y}_{[N]}$ but unavailable to the user.

For convenience, we divide the queries, answers and decoding into S rounds or iterations. During each round s ∈ [S], we assume the presence of a set of servers $\mathcal{B}^s$ of size at most B that send arbitrary answers to confuse the user, known as Byzantine servers, and another disjoint set of servers $\mathcal{U}^s$ of size at most U that do not respond at all, known as unresponsive servers, where the identities of the Byzantine and unresponsive servers may change from round to round. Note that the user has no prior knowledge of the identities of the Byzantine servers $\mathcal{B}^s$ and unresponsive servers $\mathcal{U}^s$, other than knowing the values of B and U.

Formally, an X-secure T-colluding Symmetric PPC scheme from an MDS coded storage system with B Byzantine and U unresponsive servers, also referred to as a U-B-MDS-XTSPPC scheme, is composed of the queries, answers and decoding of S rounds, where each round s ∈ [S] is described as follows.

1) Query Phase: The user generates N queries $\mathbf{q}^s_{[N]}$ and sends $\mathbf{q}^s_n$ to server n for all n ∈ [N].

2) Answer Phase: Upon receiving the query $\mathbf{q}^s_n$, a Byzantine server $n \in \mathcal{B}^s$ overwrites its answer maliciously and sends an arbitrary response $A^s_n$ to confuse the user, where $\mathcal{B}^s \subseteq [N]$, $|\mathcal{B}^s| \leq B$. An unresponsive server in $\mathcal{U}^s$ does not respond at all, where $\mathcal{U}^s \subseteq [N]$, $\mathcal{U}^s \cap \mathcal{B}^s = \emptyset$, $|\mathcal{U}^s| \leq U$. The remaining servers in $[N] \setminus (\mathcal{B}^s \cup \mathcal{U}^s)$, known as authentic servers, respond truthfully with answers that are deterministic functions of the received queries and the stored information, i.e.,

$$H(A^s_n \mid \mathbf{q}^s_n, \mathbf{y}_n, \mathcal{F}) = 0, \quad \forall\, n \in [N] \setminus (\mathcal{B}^s \cup \mathcal{U}^s).$$

3) Decoding Phase: The user decodes some data of interest $\mathbf{V}^s$ from the information available to it in round s, i.e.,

$$H\big(\mathbf{V}^s \,\big|\, \{A^{s'}_{[N] \setminus \mathcal{U}^{s'}}, \mathbf{q}^{s'}_{[N]}\}_{s' \in [s]}\big) = 0.$$

The following conditions must hold for an U-B-MDS-XTSPPC scheme.

Correctness: The desired function evaluations $\mathbf{V}^{(\theta)}$ must be obtained by combining the decoded data over the S rounds, i.e.,

$$H(\mathbf{V}^{(\theta)} \mid \{\mathbf{V}^s\}_{s \in [S]}) = 0, \quad \forall\, \theta \in [\xi]. \tag{5}$$

User-Privacy: The desired function index θ must be hidden from all the queries sent to any T colluding servers, i.e.,

$$I(\{\mathbf{q}^s_{\mathcal{T}}\}_{s \in [S]}; \theta) = 0, \quad \forall\, \mathcal{T} \subseteq [N],\ |\mathcal{T}| = T. \tag{6}$$

Server-Privacy: The user must not gain any additional information about the data files beyond the desired polynomial function evaluations, i.e.,

$$I\big(\{A^s_{[N] \setminus \mathcal{U}^s}, \mathbf{q}^s_{[N]}\}_{s \in [S]}; \mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}\big) = 0, \quad \forall\, \theta \in [\xi]. \tag{7}$$

The performance of a U-B-MDS-XTSPPC scheme can be measured by the following five quantities:

1. The PPC rate, which is the ratio of the number of desired function evaluations to the total number of downloaded symbols, defined as

$$R \triangleq \frac{L}{D}, \tag{8}$$

where $D = \sum_{s \in [S]} \sum_{n \in [N] \setminus \mathcal{U}^s} H(A^s_n)$ is the average download cost from the responsive servers over all rounds.

2. The secrecy rate, which is defined as the ratio of the amount of common randomness shared by the servers to the number of desired function evaluations, i.e.,

$$\rho \triangleq \frac{H(\mathcal{F})}{L}. \tag{9}$$

3. The upload cost, which is the number of symbols required to send the queries to the servers, i.e.,

$$\eta \triangleq \sum_{s \in [S]} \sum_{n \in [N]} H(\mathbf{q}^s_n).$$

4. The finite field size p, which ensures the achievability of the storage codes and coded PPC schemes.

5. The system complexity, which includes the complexities of queries, server computation and decoding. Define the query complexity $C_q$ at the user as the order of the number of arithmetic operations required to generate all the queries $\{\mathbf{q}^s_{[N]}\}_{s \in [S]}$. Similarly, define the server computation complexity $C_s$ to be the order of the number of arithmetic operations required to generate the responses $\{A^s_n\}_{s \in [S]}$, maximized over n ∈ [N]. Finally, define the decoding complexity $C_d$ at the user to be the order of the number of arithmetic operations required to decode the desired function evaluations $\mathbf{V}^{(\theta)}$ from the answers of the responsive servers.

In principle, the PPC rate is preferred to be high, while the secrecy rate, upload cost, finite field size and system complexity are preferred to be small/low.

Remark 1. Different from all the previous MDS-PPC works [9], [15], [16], we consider the generic U-B-MDS-XTSPPC problem. When server-privacy is not considered (i.e., the constraint (7) is removed, so the common randomness $\mathcal{F}$ is not necessary and the secrecy rate ρ can be set to 0), our problem reduces directly to the problems of MDS-PPC [15] by setting U = B = X = 0 and T = 1, MDS-TPPC [9] by setting U = B = X = 0, and U-B-MDS-XTPPC [16]. In general, U-B-MDS-XTSPPC is an integration and generalization of previous studies on PPC extensions.

For clarity, the parameters used in our U-B-MDS-XTSPPC system are listed in Table II.

TABLE II: Parameters Used in U-B-MDS-XTSPPC

Symbol | Meaning
N | number of servers
M | number of files
λ | number of rows of each data file
K | number of columns of each data file
X | number of colluding data-curious servers
T | number of colluding function-curious servers
B | number of Byzantine servers
U | number of unresponsive servers
ξ | number of candidate polynomial functions
G | maximum degree over candidate polynomial functions
p | finite field size
V(θ) | desired polynomial function evaluations
F | common randomness across servers
S | number of rounds
R | PPC rate
ρ | secrecy rate
η | upload cost
Cq | query complexity at user
Cs | server computation complexity
Cd | decoding complexity at user

III. U-B-MDS-XTSPPC SCHEME-1 BASED ON LAGRANGE ENCODING

In this section, we present an U-B-MDS-XTSPPC scheme based on Lagrange encoding, which works for the general case of any ξ candidate polynomial functions, also referred to as U-B-MDS-XTSPPC Scheme-1.

Before that, we first introduce three useful lemmas, which will be employed by the later U-B-MDS-XTSPPC schemes to preserve/resist X-security, Byzantine and unresponsiveness, and user-privacy.

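Lemma 1 ([17]). Given any positive integers N, K, X such that N ≥ K + X, let $w_1, \ldots, w_K \in \mathbb{F}_p$ be K secrets and $z_1, \ldots, z_X$ be X random variables chosen independently and uniformly from $\mathbb{F}_p$. Let $\alpha_1, \ldots, \alpha_N$ be N distinct elements of $\mathbb{F}_p$. Denote

$$\phi(\alpha) = w_1 h_1(\alpha) + \ldots + w_K h_K(\alpha) + z_1 c_1(\alpha) + \ldots + z_X c_X(\alpha),$$

where $h_1(\alpha), \ldots, h_K(\alpha), c_1(\alpha), \ldots, c_X(\alpha)$ are deterministic functions of α. If the matrix

$$\mathbf{C} = \begin{bmatrix} c_1(\alpha_{n_1}) & c_2(\alpha_{n_1}) & \ldots & c_X(\alpha_{n_1}) \\ c_1(\alpha_{n_2}) & c_2(\alpha_{n_2}) & \ldots & c_X(\alpha_{n_2}) \\ \vdots & \vdots & \ddots & \vdots \\ c_1(\alpha_{n_X}) & c_2(\alpha_{n_X}) & \ldots & c_X(\alpha_{n_X}) \end{bmatrix}_{X \times X}$$

is non-singular over $\mathbb{F}_p$ for any $\mathcal{X} = \{n_1, \ldots, n_X\} \subseteq [N]$ with $|\mathcal{X}| = X$, then the X values $\{\phi(\alpha_{n_1}), \ldots, \phi(\alpha_{n_X})\}$ reveal no information about the K secrets $w_1, \ldots, w_K$, i.e.,

$$I(\phi(\alpha_{n_1}), \ldots, \phi(\alpha_{n_X}); w_1, \ldots, w_K) = 0, \quad \forall\, \mathcal{X} = \{n_1, \ldots, n_X\} \subseteq [N],\ |\mathcal{X}| = X.$$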
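As a sanity check on Lemma 1, the following minimal Python sketch enumerates a toy instance with K = 1 secret and X = 1 noise symbol over $\mathbb{F}_5$. The field size, the evaluation points and the choices $h_1(\alpha) = 1$, $c_1(\alpha) = \alpha$ are illustrative assumptions, not taken from the paper; since $c_1(\alpha_n) \neq 0$, the 1 × 1 matrix C is non-singular, and the share seen by any single server should have the same distribution for every secret.

```python
from collections import Counter

P = 5                      # toy prime field size (assumption)
ALPHAS = [1, 2, 3, 4]      # distinct nonzero points, so c_1(alpha) = alpha != 0

def share(w, z, alpha):
    """phi(alpha) = w*h_1(alpha) + z*c_1(alpha) with h_1 = 1 and c_1(alpha) = alpha."""
    return (w + z * alpha) % P

def share_distribution(w, alpha):
    """Distribution of one server's share when the noise z is uniform over F_P."""
    return Counter(share(w, z, alpha) for z in range(P))

def leaks_nothing():
    """True if every server's share distribution is the same for all secrets."""
    return all(
        share_distribution(w, a) == share_distribution(0, a)
        for a in ALPHAS
        for w in range(P)
    )
```

Calling `leaks_nothing()` compares the share distributions exhaustively, which is feasible only for such tiny parameters; it illustrates the statement of the lemma rather than proving it.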

Lemma 2 ([10]). An (n, k) maximum distance separable code with dimension k and length n is capable of resisting b Byzantine errors and u unresponsive errors if $d_{\min} = n - k + 1 \geq 2b + u + 1$.

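Lemma 3 (Generalized Cauchy Matrix [10]). Let $\alpha_1, \ldots, \alpha_k$ and $\beta_1, \ldots, \beta_k$ be elements from $\mathbb{F}_p$ such that $\alpha_i \neq \alpha_j$, $\beta_i \neq \beta_j$ for any i, j ∈ [k] with i ≠ j, and let $v_1, \ldots, v_k$ be k nonzero elements from $\mathbb{F}_p$. Denote by $f_i(\alpha)$ the polynomial of degree k − 1

$$f_i(\alpha) = \prod_{\ell \in [k] \setminus \{i\}} \frac{\alpha - \beta_\ell}{\beta_i - \beta_\ell}, \quad \forall\, i \in [k].$$

Then the following generalized Cauchy matrix $\mathbf{F}_c$ is invertible over $\mathbb{F}_p$:

$$\mathbf{F}_c = \begin{bmatrix} v_1 f_1(\alpha_1) & v_1 f_2(\alpha_1) & \ldots & v_1 f_k(\alpha_1) \\ v_2 f_1(\alpha_2) & v_2 f_2(\alpha_2) & \ldots & v_2 f_k(\alpha_2) \\ \vdots & \vdots & \ddots & \vdots \\ v_k f_1(\alpha_k) & v_k f_2(\alpha_k) & \ldots & v_k f_k(\alpha_k) \end{bmatrix}.$$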

In order to efficiently resist the Byzantine errors and unresponsive errors such that the user can retrieve as many desired function evaluations as possible from the answers, it is desirable that the responses of all the servers form a codeword of some MDS code, because MDS codes attain the maximum minimum Hamming distance. In each round, our schemes allow each server, except for the unresponsive ones, to respond with just one symbol. Intuitively, among the N dimensions of the server responses in each round, our schemes exploit T dimensions to preserve user-privacy and G(K + X − 1) dimensions to absorb the randomness incurred by the composition of the (N, K + X) MDS coded data storage and the polynomial functions of degree G. In addition, 2B + U dimensions are used to correct the B Byzantine errors and U unresponsive errors. Accordingly, the remaining N − (G(K + X − 1) + T + 2B + U) dimensions are left to retrieve the desired function evaluations.

Let E denote the number of desired function evaluations that the user can retrieve in each round, and set

$$E \triangleq N - (G(K + X - 1) + T + 2B + U) \tag{10}$$

with N > G(K + X − 1) + T + 2B + U. Recall from (3) that the user needs to compute L = λK polynomial function evaluations. The parameter λ and the number of rounds S should satisfy

$$ES = \lambda K. \tag{11}$$

Here, we choose the smallest integers satisfying (11), i.e.,

$$\lambda = \frac{E}{\Delta}, \quad S = \frac{K}{\Delta}, \tag{12}$$

where $\Delta \triangleq \gcd(K, E)$.
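The parameter choices in (10)-(12) can be computed mechanically. The following Python sketch does so and also evaluates the PPC rate and secrecy rate expressions quoted in the abstract, written here as E/(N − U) and (G(K + X − 1) + T)/E; the function name and dictionary keys are illustrative assumptions.

```python
from fractions import Fraction
from math import gcd

def scheme_parameters(N, K, X, G, T, B, U):
    """Parameters from (10)-(12), plus the rate expressions from the abstract."""
    used = G * (K + X - 1) + T + 2 * B + U
    if N <= used:
        raise ValueError("need N > G(K+X-1) + T + 2B + U")
    E = N - used                          # evaluations retrieved per round, (10)
    delta = gcd(K, E)
    return {
        "E": E,
        "delta": delta,
        "lambda": E // delta,             # rows per file, (12)
        "S": K // delta,                  # number of rounds, (12)
        "ppc_rate": Fraction(E, N - U),   # equals 1 - (G(K+X-1)+T+2B)/(N-U)
        "secrecy_rate": Fraction(G * (K + X - 1) + T, E),
    }
```

For example, `scheme_parameters(21, 4, 2, 2, 2, 1, 1)` yields E = 6, Δ = 2, λ = 3 and S = 2, so that ES = λK = 12 as required by (11).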

A. Public Elements

To construct U-B-MDS-XTSPPC schemes, we first need to generate elements $\{\beta_{i,\ell} : i \in [\lambda], \ell \in [K + X]\}$ and $\{\alpha_1, \ldots, \alpha_N\}$, which are made public to the user and the servers in advance.

Arrange $\{\beta_{i,\ell} : i \in [\lambda], \ell \in [K + X]\}$ as a matrix β of dimension λ × (K + X), i.e.,

$$\boldsymbol{\beta} \triangleq \begin{bmatrix} \beta_{1,1} & \ldots & \beta_{1,K} & \beta_{1,K+1} & \ldots & \beta_{1,K+X} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ \beta_{\lambda,1} & \ldots & \beta_{\lambda,K} & \beta_{\lambda,K+1} & \ldots & \beta_{\lambda,K+X} \end{bmatrix}.$$

Throughout this paper, let $\{\beta_{i,\ell}, \alpha_n : i \in [\lambda], \ell \in [K + X], n \in [N]\} \subseteq \mathbb{F}_p$ satisfy

P1. The entries in each row of the matrix β are distinct, i.e., for each given i ∈ [λ], $\beta_{i,j} \neq \beta_{i,k}$ for all j, k ∈ [K + X] with j ≠ k;

P2. For any given s ∈ [S], all the entries in columns [(s − 1)∆ + 1 : s∆] of the matrix β are distinct, i.e., $\beta_{i,j} \neq \beta_{k,l}$ for any (i, j) ≠ (k, l) such that i, k ∈ [λ] and j, l ∈ [(s − 1)∆ + 1 : s∆];

P3. The elements $\alpha_1, \ldots, \alpha_N$ are distinct, i.e., $\alpha_i \neq \alpha_j$ for all i, j ∈ [N] with i ≠ j;

P4. The elements $\alpha_1, \ldots, \alpha_N$ are distinct from the ones in columns [K] of the matrix β, i.e., $\{\alpha_n : n \in [N]\} \cap \{\beta_{i,\ell} : i \in [\lambda], \ell \in [K]\} = \emptyset$.

The following lemma states a sufficient condition on the finite field size for finding such elements; it is proved in the Appendix.

Lemma 4. There exists a group of elements $\{\beta_{i,\ell}, \alpha_n : i \in [\lambda], \ell \in [K + X], n \in [N]\} \subseteq \mathbb{F}_p$ satisfying P1-P4 if p ≥ N + max{K, E}.

In addition, the candidate polynomial functions $\{\varphi^{(u)}\}_{u \in [\xi]}$ are also made public in advance.
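Lemma 4 only asserts existence. One possible way to pick such elements, sketched below in Python, is an assumption of this example rather than the construction used in the paper's proof: it takes p prime, represents field elements as integers modulo p, uses 0, ..., N − 1 as the α's, draws the first K columns of β cyclically from a disjoint pool of max{K, E} elements, and reuses the first X of the α's for the last X columns (which P4 does not forbid). The function names are hypothetical.

```python
from math import gcd

def public_elements(N, K, X, E, p):
    """One illustrative choice of (beta, alpha) over a prime field F_p."""
    delta = gcd(K, E)
    lam = E // delta
    pool_size = max(K, E)
    if p < N + pool_size:
        raise ValueError("need p >= N + max(K, E)")
    alpha = list(range(N))                          # alpha_1..alpha_N
    pool = [N + k for k in range(pool_size)]        # disjoint from alpha
    beta = []
    for i in range(lam):
        data_cols = [pool[(i * delta + j) % pool_size] for j in range(K)]
        beta.append(data_cols + alpha[:X])          # last X columns reuse alphas
    return beta, alpha

def satisfies_p1_to_p4(beta, alpha, K, E):
    """Check properties P1-P4 directly from their definitions."""
    delta = gcd(K, E)
    p1 = all(len(set(row)) == len(row) for row in beta)
    p2 = True
    for s in range(K // delta):
        block = [row[c] for row in beta for c in range(s * delta, (s + 1) * delta)]
        p2 = p2 and len(set(block)) == len(block)
    p3 = len(set(alpha)) == len(alpha)
    p4 = not (set(alpha) & {row[c] for row in beta for c in range(K)})
    return p1 and p2 and p3 and p4
```

With N = 21, K = 4, X = 2 and E = 6, any prime p ≥ 27 (for instance p = 29) is large enough for this choice, and `satisfies_p1_to_p4` confirms the four properties.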

B. Secure Lagrange Storage Codes

In this subsection, we describe the data encoding procedure, which uses Lagrange interpolation polynomials to encode each row of each file separately. For any m ∈ [M], i ∈ [λ], let $z_{i,K+1}^{(m)}, z_{i,K+2}^{(m)}, \ldots, z_{i,K+X}^{(m)}$ be X random variables distributed independently and uniformly on $\mathbb{F}_p$. Choose a polynomial $\phi_i^{(m)}(\alpha)$ of degree at most K + X − 1 for every m ∈ [M], i ∈ [λ] such that

$$\phi_i^{(m)}(\beta_{i,\ell}) = \begin{cases} w_{i,\ell}^{(m)}, & \forall\, \ell \in [K] \\ z_{i,\ell}^{(m)}, & \forall\, \ell \in [K + 1 : K + X] \end{cases}, \tag{13}$$

where $w_{i,\ell}^{(m)}$ is the i-th element in the ℓ-th column of the data file $\mathbf{W}^{(m)}$ defined in (1).

By P1, the Lagrange interpolation rule and the degree restriction guarantee the existence and uniqueness of $\phi_i^{(m)}(\alpha)$, which is expressed as

$$\phi_i^{(m)}(\alpha) = \sum_{l=1}^{K} w_{i,l}^{(m)} \cdot \prod_{k \in [K+X] \setminus \{l\}} \frac{\alpha - \beta_{i,k}}{\beta_{i,l} - \beta_{i,k}} + \sum_{l=K+1}^{K+X} z_{i,l}^{(m)} \cdot \prod_{k \in [K+X] \setminus \{l\}} \frac{\alpha - \beta_{i,k}}{\beta_{i,l} - \beta_{i,k}}. \tag{14}$$

Then the evaluations of $\phi_i^{(m)}(\alpha)$ (m ∈ [M], i ∈ [λ]) at the point $\alpha = \alpha_n$ are stored at the n-th server for any n ∈ [N], i.e.,

$$\mathbf{y}_n = \big(\phi_1^{(1)}(\alpha_n), \ldots, \phi_1^{(M)}(\alpha_n), \ldots, \phi_\lambda^{(1)}(\alpha_n), \ldots, \phi_\lambda^{(M)}(\alpha_n)\big). \tag{15}$$

Notice that such Lagrange encoding is equivalent to the (N, K + X) Reed-Solomon (RS) code [16] with a class of specific basis polynomials $\sigma_{i,1}(z), \sigma_{i,2}(z), \ldots, \sigma_{i,K+X}(z)$ for any i ∈ [λ], where

$$\sigma_{i,l}(z) = \prod_{k \in [K+X] \setminus \{l\}} \frac{z - \beta_{i,k}}{\beta_{i,l} - \beta_{i,k}}, \quad \forall\, l \in [K + X].$$

Hence, $\big(\phi_i^{(m)}(\alpha_1), \ldots, \phi_i^{(m)}(\alpha_N)\big)$ is an (N, K + X) RS codeword over $\mathbb{F}_p$ for any m ∈ [M], i ∈ [λ], and the storage encoding has the (N, K + X) MDS property.
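The encoding of a single file row in (13)-(15) can be sketched in a few lines of Python. This is a minimal illustration assuming a prime field size p with integers as field elements; the helper names `lagrange_eval`, `encode_row` and `recover_symbol` are hypothetical. It also shows the MDS property in action: any K + X stored symbols determine the polynomial and hence the data symbols.

```python
def lagrange_eval(xs, ys, x, p):
    """Value at x of the unique polynomial of degree < len(xs) with f(xs[k]) = ys[k], over F_p."""
    total = 0
    for l, (xl, yl) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for k, xk in enumerate(xs):
            if k != l:
                num = num * (x - xk) % p
                den = den * (xl - xk) % p
        total = (total + yl * num * pow(den, -1, p)) % p
    return total

def encode_row(data, noise, beta_row, alphas, p):
    """Symbols phi(alpha_n) of the polynomial fixed by (13): data on the first K betas, noise on the rest."""
    return [lagrange_eval(beta_row, data + noise, a, p) for a in alphas]

def recover_symbol(alphas, shares, beta, p):
    """Re-interpolate from K + X stored symbols and evaluate at a beta point."""
    return lagrange_eval(alphas, shares, beta, p)
```

For instance, with p = 29, `beta_row = [21, 22, 23, 24, 0, 1]` and α's equal to 0, ..., 20, encoding K = 4 data symbols with X = 2 noise symbols produces 21 stored symbols, and `recover_symbol` applied to any 6 of them with their α's returns the data symbol at the requested β.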

C. Construction of U-B-MDS-XTSPPC Scheme-1

Recall that the user wishes to privately compute the function evaluations $\mathbf{V}^{(\theta)}$ in (3), where θ ∈ [ξ]. To this end, the queries, answers and decoding of the S rounds operate as follows.

Since the candidate polynomial functions $\{\varphi^{(u)}\}_{u \in [\xi]}$ are completely public to all the servers, based on the stored data in (15), server n first computes the λξ polynomial evaluations

$$\mathbf{g}_n = \big(g_{n,1}^{(1)}, \ldots, g_{n,\lambda}^{(1)}, \ldots, g_{n,1}^{(\theta)}, \ldots, g_{n,\lambda}^{(\theta)}, \ldots, g_{n,1}^{(\xi)}, \ldots, g_{n,\lambda}^{(\xi)}\big), \quad \forall\, n \in [N], \tag{16}$$

where

$$g_{n,j}^{(u)} = \varphi^{(u)}\big(\phi_j^{(1)}(\alpha_n), \ldots, \phi_j^{(M)}(\alpha_n)\big), \quad \forall\, j \in [\lambda],\ u \in [\xi].$$

During round s ∈ [S], the user independently and uniformly generates ξλT random variables $\{z_{j,t}^{(u),s}\}_{u \in [\xi], j \in [\lambda], t \in [T]}$ from $\mathbb{F}_p$. Then, given any u ∈ [ξ], j ∈ [λ], construct the query polynomial $q_j^{(u),s}(\alpha)$ of degree E + T − 1 = λ∆ + T − 1 such that

$$q_j^{(u),s}(\beta_{i,\ell}) = \begin{cases} 1, & \text{if } u = \theta,\ j = i \\ 0, & \text{otherwise} \end{cases}, \quad \forall\, i \in [\lambda],\ \ell \in [(s-1)\Delta + 1 : s\Delta], \tag{17}$$

$$q_j^{(u),s}(\alpha_t) = z_{j,t}^{(u),s}, \quad \forall\, t \in [T].$$

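By P2-P4, the λ∆ + T elements $\{\beta_{i,\ell} : i \in [\lambda], \ell \in [(s-1)\Delta + 1 : s\Delta]\} \cup \{\alpha_t : t \in [T]\}$ are distinct for any s ∈ [S]. Then, the polynomial $q_j^{(u),s}(\alpha)$ can be written explicitly as

$$q_j^{(u),s}(\alpha) = \sum_{l \in [T]} z_{j,l}^{(u),s} \cdot \Bigg(\prod_{\substack{k \in [\lambda] \\ r \in [(s-1)\Delta+1 : s\Delta]}} \frac{\alpha - \beta_{k,r}}{\alpha_l - \beta_{k,r}}\Bigg) \Bigg(\prod_{v \in [T] \setminus \{l\}} \frac{\alpha - \alpha_v}{\alpha_l - \alpha_v}\Bigg) + \begin{cases} \displaystyle\sum_{l \in [(s-1)\Delta+1 : s\Delta]} \Bigg(\prod_{\substack{k \in [\lambda],\ r \in [(s-1)\Delta+1 : s\Delta] \\ (k,r) \neq (j,l)}} \frac{\alpha - \beta_{k,r}}{\beta_{j,l} - \beta_{k,r}}\Bigg) \Bigg(\prod_{v \in [T]} \frac{\alpha - \alpha_v}{\beta_{j,l} - \alpha_v}\Bigg), & \text{if } u = \theta \\ 0, & \text{otherwise} \end{cases}. \tag{18}$$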

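Next, the user evaluates the λξ query polynomials $\{q_j^{(u),s} : u \in [\xi], j \in [\lambda]\}$ at $\alpha = \alpha_n$ and sends them to server n, i.e.,

$$\mathbf{q}_n^s = \big(q_1^{(1),s}(\alpha_n), \ldots, q_\lambda^{(1),s}(\alpha_n), \ldots, q_1^{(\theta),s}(\alpha_n), \ldots, q_\lambda^{(\theta),s}(\alpha_n), \ldots, q_1^{(\xi),s}(\alpha_n), \ldots, q_\lambda^{(\xi),s}(\alpha_n)\big), \quad \forall\, n \in [N]. \tag{19}$$

Let

$$\mathcal{F}^s = \{z_i^s : i \in [G(K + X - 1) + T]\} \tag{20}$$

be G(K + X − 1) + T random variables distributed independently and uniformly over $\mathbb{F}_p$, which are shared by all the servers but unknown to the user. Define an interpolation polynomial $\psi^s(\alpha)$ of degree E + G(K + X − 1) + T − 1 such that

$$\psi^s(\beta_{i,\ell}) = 0, \quad \forall\, i \in [\lambda],\ \ell \in [(s-1)\Delta + 1 : s\Delta], \tag{21}$$

$$\psi^s(\alpha_i) = z_i^s, \quad \forall\, i \in [G(K + X - 1) + T]. \tag{22}$$

Note from P2-P4 again that $\{\beta_{i,\ell} : i \in [\lambda], \ell \in [(s-1)\Delta + 1 : s\Delta]\} \cup \{\alpha_i : i \in [G(K + X - 1) + T]\}$ are E + G(K + X − 1) + T distinct elements from $\mathbb{F}_p$, since G(K + X − 1) + T < N. Thus, the polynomial $\psi^s(\alpha)$ has the form

$$\psi^s(\alpha) \triangleq \sum_{l \in [G(K+X-1)+T]} z_l^s \cdot \Bigg(\prod_{\substack{k \in [\lambda] \\ r \in [(s-1)\Delta+1 : s\Delta]}} \frac{\alpha - \beta_{k,r}}{\alpha_l - \beta_{k,r}}\Bigg) \Bigg(\prod_{v \in [G(K+X-1)+T] \setminus \{l\}} \frac{\alpha - \alpha_v}{\alpha_l - \alpha_v}\Bigg). \tag{23}$$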

Then, server n computes a response by taking the inner product of the received query vector $\mathbf{q}_n^s$ and the locally computed evaluations $\mathbf{g}_n$, and adding an evaluation of $\psi^s(\alpha)$ at $\alpha = \alpha_n$, i.e.,

$$A_n^s = \langle \mathbf{q}_n^s, \mathbf{g}_n \rangle + \psi^s(\alpha_n). \tag{24}$$

Note that there are at most B Byzantine servers, each of which instead generates an arbitrary element from $\mathbb{F}_p$ to confuse the user. Meanwhile, there are at most U unresponsive servers that do not respond at all.

Define the answer polynomial $A^s(\alpha)$ by

$$A^s(\alpha) = \sum_{u=1}^{\xi} \sum_{j=1}^{\lambda} q_j^{(u),s}(\alpha) \cdot \varphi^{(u)}\big(\phi_j^{(1)}(\alpha), \ldots, \phi_j^{(M)}(\alpha)\big) + \psi^s(\alpha). \tag{25}$$

Obviously, the answer $A_n^s$ equals the evaluation of $A^s(\alpha)$ at $\alpha = \alpha_n$ for any authentic server $n \in [N] \setminus (\mathcal{B}^s \cup \mathcal{U}^s)$ and round s ∈ [S]. Since $\varphi^{(u)}$ is a polynomial in M variables with degree at most G for any u ∈ [ξ] and the degree of the polynomial $\phi_j^{(m)}(\alpha)$ is at most K + X − 1 for any m ∈ [M], j ∈ [λ] by (14), the composite polynomial $\varphi^{(u)}\big(\phi_j^{(1)}(\alpha), \ldots, \phi_j^{(M)}(\alpha)\big)$ has degree at most G(K + X − 1). We know $\deg(q_j^{(u),s}(\alpha)) = E + T - 1$ for any u ∈ [ξ], j ∈ [λ] from (18), and $\psi^s(\alpha)$ has degree E + G(K + X − 1) + T − 1 by construction. Thus, $A^s(\alpha)$ can be viewed as a polynomial in the single variable α with degree at most G(K + X − 1) + E + T − 1. Recall from P3 that $\{\alpha_n\}_{n \in [N]}$ are distinct elements from $\mathbb{F}_p$. So, $(A^s(\alpha_1), \ldots, A^s(\alpha_N))$ forms an (N, G(K + X − 1) + E + T) RS codeword, whose minimum distance N − (G(K + X − 1) + E + T) + 1 = 2B + U + 1 provides robustness against B random errors and U erasures at the same time by (10) and Lemma 2. Then, the user can decode the polynomial $A^s(\alpha)$ from the answers $(A_1^s, \ldots, A_N^s)$ by using RS decoding algorithms [10], [6] even if there exist B Byzantine servers and U unresponsive servers.

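Finally, for any $i\in[\lambda]$, $\ell\in[(s-1)\Delta+1:s\Delta]$, the user evaluates $A^s(\alpha)$ at $\alpha=\beta_{i,\ell}$ to obtain

$$\begin{aligned}
A^s(\beta_{i,\ell}) &= \sum_{u=1}^{\xi}\sum_{j=1}^{\lambda} q_j^{(u),s}(\beta_{i,\ell})\cdot\varphi^{(u)}\big(\phi_j^{(1)}(\beta_{i,\ell}),\ldots,\phi_j^{(M)}(\beta_{i,\ell})\big)+\psi^s(\beta_{i,\ell})\\
&\overset{(a)}{=} \varphi^{(\theta)}\big(\phi_i^{(1)}(\beta_{i,\ell}),\ldots,\phi_i^{(M)}(\beta_{i,\ell})\big)+\psi^s(\beta_{i,\ell})\\
&\overset{(b)}{=} \varphi^{(\theta)}\big(\phi_i^{(1)}(\beta_{i,\ell}),\ldots,\phi_i^{(M)}(\beta_{i,\ell})\big)\\
&\overset{(c)}{=} \varphi^{(\theta)}\big(w_{i,\ell}^{(1)},\ldots,w_{i,\ell}^{(M)}\big)\\
&\overset{(d)}{=} v_{i,\ell}^{(\theta)},
\end{aligned} \tag{26}$$

where (a) holds because $q_j^{(u),s}(\beta_{i,\ell}) = 1$ if $u=\theta$, $j=i$ and $q_j^{(u),s}(\beta_{i,\ell}) = 0$ otherwise for any $i\in[\lambda]$, $\ell\in[(s-1)\Delta+1:s\Delta]$ by (17); (b) holds because $\psi^s(\beta_{i,\ell}) = 0$ by (21); (c) holds because $\phi_i^{(m)}(\beta_{i,\ell}) = w_{i,\ell}^{(m)}$ for any $m\in[M]$ by (13); and (d) follows from (4).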

Therefore, in round $s$, the desired evaluations in columns $[(s-1)\Delta+1:s\Delta]$ of $\mathbf{V}^{(\theta)}$ in (3) can be obtained by evaluating $A^s(\alpha)$ at $\beta_{i,\ell}$ for all $i\in[\lambda]$, $\ell\in[(s-1)\Delta+1:s\Delta]$. As a result, the user can decode $\mathbf{V}^{(\theta)}$ correctly after traversing all rounds $s\in[S]$, where $S=K/\Delta$ by (12).
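To make the round structure concrete, the following minimal sketch simulates the scheme end to end for toy parameters with $B = U = 0$, so that decoding reduces to plain Lagrange interpolation. Everything specific here is an assumption chosen for illustration: the prime `P = 101`, the parameters, the two candidate functions in `CANDIDATES`, and the evaluation points, which are simply taken distinct with the $\beta$'s disjoint from the $\alpha$'s rather than derived from P1-P4.

```python
import random

P = 101  # assumed small prime field, for illustration only
K, X, G, M, T, DELTA, LAM = 2, 1, 2, 2, 1, 1, 2  # toy parameters with B = U = 0
E, S = LAM * DELTA, K // DELTA                   # E points per round, S rounds
D = G * (K + X - 1) + T                          # number of shared-noise points of psi
N = D + E                                        # deg(A^s) + 1 servers suffice here

# Hypothetical candidate functions of degree at most G in M = 2 variables.
CANDIDATES = [lambda x, y: x * y % P, lambda x, y: (x * x + y) % P]
# Assumed evaluation points: all distinct, betas disjoint from alphas.
BETA = {(i, l): 1 + i * (K + X) + l for i in range(LAM) for l in range(K + X)}
ALPHA = [1 + LAM * (K + X) + n for n in range(N)]


def lagrange_eval(xs, ys, x, p=P):
    """Evaluate at x the unique polynomial of degree < len(xs) through (xs, ys) over F_p."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def run_scheme(theta, seed=0):
    rng = random.Random(seed)
    # W[m][i][l]: file m arranged as a LAM x K matrix.
    W = [[[rng.randrange(P) for _ in range(K)] for _ in range(LAM)] for _ in range(M)]
    # Lagrange encoding: data symbols at the first K betas, X random pads at the rest.
    enc_x = [[BETA[i, l] for l in range(K + X)] for i in range(LAM)]
    enc_y = [[W[m][i] + [rng.randrange(P) for _ in range(X)] for i in range(LAM)]
             for m in range(M)]
    # store[n][i][m] is the encoded symbol of file m, row i, held by server n.
    store = [[[lagrange_eval(enc_x[i], enc_y[m][i], a) for m in range(M)]
              for i in range(LAM)] for a in ALPHA]
    V = [[None] * K for _ in range(LAM)]
    for s in range(S):
        round_pts = [(i, l) for i in range(LAM) for l in range(s * DELTA, (s + 1) * DELTA)]
        bx = [BETA[pt] for pt in round_pts]
        # Query polynomials: indicator values at the round betas, T random pads at alphas.
        qx = bx + ALPHA[:T]
        qy = {(u, j): [int(u == theta and j == i) for i, _ in round_pts]
                      + [rng.randrange(P) for _ in range(T)]
              for u in range(len(CANDIDATES)) for j in range(LAM)}
        # Shared-noise polynomial psi: zero at the round betas, random at D alphas.
        px, py = bx + ALPHA[:D], [0] * E + [rng.randrange(P) for _ in range(D)]
        answers = []
        for n, a in enumerate(ALPHA):
            ans = lagrange_eval(px, py, a)
            for (u, j), ys in qy.items():
                ans += lagrange_eval(qx, ys, a) * CANDIDATES[u](*store[n][j])
            answers.append(ans % P)
        # User side: interpolate A^s from all N answers and read it off at the betas.
        for i, l in round_pts:
            V[i][l] = lagrange_eval(ALPHA, answers, BETA[i, l])
    expected = [[CANDIDATES[theta](*(W[m][i][l] for m in range(M))) for l in range(K)]
                for i in range(LAM)]
    return V, expected
```

Calling `run_scheme(theta)` returns the decoded matrix together with the directly computed evaluations of the chosen candidate function, so the two can be compared for either value of `theta`.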

Remark 2. Evidently, U-B-MDS-XTSPPC Scheme-1 allows the decoding of the $S$ rounds to be carried out independently and concurrently, which makes retrieving the desired function evaluations efficient.

D. Illustrative Example for U-B-MDS-XTSPPC Scheme-1

In this subsection, we present an explicit example to illustrate the main ideas of the proposed U-B-MDS-XTSPPC Scheme-1 for the parameters $N = 21$, $K = 4$, $X = 2$, $G = 2$, $M = 2$, $T = 2$, $B = 1$, $U = 1$, where $E = 6$, $\Delta = 2$, $\lambda = 3$ and $S = 2$.

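Lagrange Data Encoding: The data encoding operates as follows. Let $\{\beta_{i,\ell}, \alpha_n : i\in[3], \ell\in[6], n\in[21]\}\subseteq\mathbb{F}_p$ be a group of elements satisfying P1-P4. For any $m\in[2]$ and $i\in[3]$, choose $X=2$ random variables $z_{i,5}^{(m)}, z_{i,6}^{(m)}$ independently and uniformly from $\mathbb{F}_p$ and design the Lagrange interpolation polynomial $\phi_i^{(m)}(\alpha)$ of degree $K+X-1=5$ such that

$$\phi_i^{(m)}(\beta_{i,1}) = w_{i,1}^{(m)},\quad \phi_i^{(m)}(\beta_{i,2}) = w_{i,2}^{(m)},\quad \phi_i^{(m)}(\beta_{i,3}) = w_{i,3}^{(m)}, \tag{27}$$

$$\phi_i^{(m)}(\beta_{i,4}) = w_{i,4}^{(m)},\quad \phi_i^{(m)}(\beta_{i,5}) = z_{i,5}^{(m)},\quad \phi_i^{(m)}(\beta_{i,6}) = z_{i,6}^{(m)}. \tag{28}$$

The data stored at server $n\in[21]$ is

$$\mathbf{y}_n = \big(\phi_1^{(1)}(\alpha_n), \phi_1^{(2)}(\alpha_n), \phi_2^{(1)}(\alpha_n), \phi_2^{(2)}(\alpha_n), \phi_3^{(1)}(\alpha_n), \phi_3^{(2)}(\alpha_n)\big).$$

By (3), the user wishes to compute the following polynomial evaluations from the system:

$$v_{i,j}^{(\theta)} = \varphi^{(\theta)}\big(w_{i,j}^{(1)}, w_{i,j}^{(2)}\big), \quad \forall\, i\in[3],\ j\in[4]. \tag{29}$$

U-B-MDS-XTSPPC Scheme-1: For this purpose, each server $n\in[21]$ first evaluates all the candidate polynomial functions $\varphi^{(1)},\ldots,\varphi^{(\xi)}$ at its stored data, as follows:

$$\mathbf{g}_n = \big(g_{n,1}^{(1)}, g_{n,2}^{(1)}, g_{n,3}^{(1)}, \ldots, g_{n,1}^{(\theta)}, g_{n,2}^{(\theta)}, g_{n,3}^{(\theta)}, \ldots, g_{n,1}^{(\xi)}, g_{n,2}^{(\xi)}, g_{n,3}^{(\xi)}\big),$$

where

$$g_{n,j}^{(u)} = \varphi^{(u)}\big(\phi_j^{(1)}(\alpha_n), \phi_j^{(2)}(\alpha_n)\big), \quad \forall\, j\in[3],\ u\in[\xi].$$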

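During round $s\in[2]$, the user privately generates $\xi\lambda T = 6\xi$ random variables $\{z_{j,t}^{(u),s}\}_{u\in[\xi],j\in[3],t\in[2]}$ distributed independently and uniformly on $\mathbb{F}_p$, and then sends the query vector $\mathbf{q}_n^s$ to server $n$:

$$\mathbf{q}_n^s = \big(q_1^{(1),s}(\alpha_n), q_2^{(1),s}(\alpha_n), q_3^{(1),s}(\alpha_n), \ldots, q_1^{(\theta),s}(\alpha_n), q_2^{(\theta),s}(\alpha_n), q_3^{(\theta),s}(\alpha_n), \ldots, q_1^{(\xi),s}(\alpha_n), q_2^{(\xi),s}(\alpha_n), q_3^{(\xi),s}(\alpha_n)\big),$$

where $q_j^{(u),s}(\alpha)$ is the query polynomial of degree $E+T-1=7$ for any given $u\in[\xi]$, $j\in[3]$ such that

$$q_j^{(u),s}(\beta_{i,\ell}) = \begin{cases} 1, & \text{if } u=\theta,\ j=i\\ 0, & \text{otherwise} \end{cases}, \quad \forall\, i\in[3],\ \ell\in[2s-1:2s], \tag{30}$$

$$q_j^{(u),s}(\alpha_t) = z_{j,t}^{(u),s}, \quad \forall\, t\in[2]. \tag{31}$$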

To ensure server-privacy, define an interpolation polynomial $\psi^s(\alpha)$ of degree $G(K+X-1)+E+T-1=17$ such that

$$\psi^s(\beta_{i,\ell}) = 0, \quad \forall\, i\in[3],\ \ell\in[2s-1:2s], \tag{32}$$

$$\psi^s(\alpha_i) = z_i^s, \quad \forall\, i\in[12], \tag{33}$$

where $z_1^s,\ldots,z_{12}^s$ are the random variables shared among the servers.

Let $A^s(\alpha)$ be the response polynomial of degree $\deg(A^s(\alpha)) = 17$ in round $s$:

$$A^s(\alpha) = \sum_{u=1}^{\xi}\sum_{j=1}^{3} q_j^{(u),s}(\alpha)\cdot\varphi^{(u)}\big(\phi_j^{(1)}(\alpha), \phi_j^{(2)}(\alpha)\big) + \psi^s(\alpha).$$

Then, any authentic server $n$ responds to the user with $A_n^s = \langle \mathbf{q}_n^s, \mathbf{g}_n\rangle + \psi^s(\alpha_n)$, which is equivalent to evaluating $A^s(\alpha)$ at $\alpha=\alpha_n$. Remarkably, $(A^s(\alpha_1),\ldots,A^s(\alpha_{21}))$ is a $(21, 18)$ RS codeword, which is robust against any $B=1$ Byzantine error and $U=1$ erasure. Hence, the user can decode $A^s(\alpha)$ from the answers of the responsive servers by using RS decoding algorithms. Then, by (27)-(33), evaluating the polynomial at $\alpha = \beta_{1,2s-1}, \beta_{2,2s-1}, \beta_{3,2s-1}, \beta_{1,2s}, \beta_{2,2s}, \beta_{3,2s}$ yields

$$A^s(\beta_{i,2s-1}) = v_{i,2s-1}^{(\theta)}, \quad A^s(\beta_{i,2s}) = v_{i,2s}^{(\theta)}, \quad \forall\, i\in[3].$$

Finally, the user can recover the desired function evaluations in (29) after two rounds. The scheme achieves the PPC rate $R = \frac{3}{10}$ and secrecy rate $\rho = 2$.
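The numbers in this example can be re-derived with a short calculation. The sketch below uses the rate and secrecy-rate expressions stated in the abstract; the relation $E = N - (G(K+X-1)+T+2B+U)$ is an assumption inferred from those expressions and from $E = 6$ above, and the function and key names are hypothetical.

```python
from fractions import Fraction


def scheme1_parameters(N, K, X, G, T, B, U):
    noise_points = G * (K + X - 1) + T    # G(K+X-1)+T shared-noise points of psi
    E = N - (noise_points + 2 * B + U)    # assumed definition, matches E = 6 in the example
    dimension = noise_points + E          # deg(A^s) + 1
    return {
        "E": E,
        "rs_dimension": dimension,
        "min_distance": N - dimension + 1,  # RS codes are MDS
        "ppc_rate": 1 - Fraction(noise_points + 2 * B, N - U),
        "secrecy_rate": Fraction(noise_points, E),
    }


params = scheme1_parameters(N=21, K=4, X=2, G=2, T=2, B=1, U=1)
```

For these parameters the minimum distance is $4 > 2B + U = 3$, which is the condition required by the RS decoding step.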

E. Feasibility and Performance of U-B-MDS-XTSPPC Scheme-1

In this section, we show the feasibility of U-B-MDS-XTSPPC Scheme-1 and analyse its performance. Before that, some lemmas are presented, which will be used for analysing the arithmetic complexities of PPC schemes.

Lemma 5 (Polynomial Evaluation and Interpolation [22]). The evaluation of a $k$-th degree polynomial at $k+1$ arbitrary points can be done in $O(k(\log k)^2\log\log k)$ arithmetic operations, and consequently, its dual problem, interpolation of a $k$-th degree polynomial from $k+1$ arbitrary points, can be performed in the same number of arithmetic operations, $O(k(\log k)^2\log\log k)$.

Lemma 6 (Multivariate Polynomial Evaluation [1], [11]). The evaluation of a multivariate polynomial of degree $n$ in $k$ variables can be done in $O(n^k)$ arithmetic operations.

Lemma 7 (Decoding Reed-Solomon Codes [10], [6]). Decoding Reed-Solomon codes of dimension $n$ with $b$ errors and $u$ erasures over arbitrary finite fields can be done in $O(n(\log n)^2\log\log n)$ arithmetic operations by utilizing fast polynomial multiplications [22] if its minimum distance satisfies $d > 2b + u$.

Theorem 1. The proposed PPC scheme in Section III-C is robust against $B$ Byzantine and $U$ unresponsive servers, and guarantees $X$-secure data storage, $T$-colluding user-privacy, and server-privacy.

Proof: It is sufficient to prove that U-B-MDS-XTSPPC Scheme-1 satisfies the constraints of (2), (5), (6), and (7).

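X-Security: For any subset $\{n_1,\ldots,n_X\}\subseteq[N]$ of size $X$, let

$$\mathbf{C}_i = \begin{bmatrix}
c_{i,1}(\alpha_{n_1}) & c_{i,2}(\alpha_{n_1}) & \cdots & c_{i,X}(\alpha_{n_1})\\
c_{i,1}(\alpha_{n_2}) & c_{i,2}(\alpha_{n_2}) & \cdots & c_{i,X}(\alpha_{n_2})\\
\vdots & \vdots & \ddots & \vdots\\
c_{i,1}(\alpha_{n_X}) & c_{i,2}(\alpha_{n_X}) & \cdots & c_{i,X}(\alpha_{n_X})
\end{bmatrix}, \quad \forall\, i\in[\lambda],$$

where

$$c_{i,l}(\alpha) = \Bigg(\prod_{k\in[K]} \frac{\alpha-\beta_{i,k}}{\beta_{i,K+l}-\beta_{i,k}}\Bigg)\cdot\prod_{k\in[K+1:K+X]\setminus\{K+l\}} \frac{\alpha-\beta_{i,k}}{\beta_{i,K+l}-\beta_{i,k}}, \quad \forall\, l\in[X].$$

Thus, by applying P1-P4 to Lemma 3, $\mathbf{C}_i$ is a generalized Cauchy matrix and is non-singular over $\mathbb{F}_p$ for any $i\in[\lambda]$. Then, according to (14) and Lemma 1,

$$I\big(\phi_i^{(m)}(\alpha_{n_1}),\ldots,\phi_i^{(m)}(\alpha_{n_X});\, w_{i,1}^{(m)},\ldots,w_{i,K}^{(m)}\big) = 0, \quad \forall\, m\in[M],\ i\in[\lambda].$$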

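Therefore,

$$\begin{aligned}
I\big(\mathbf{y}_{n_1},\ldots,\mathbf{y}_{n_X};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)}\big)
&\overset{(a)}{=} I\big(\{\phi_i^{(m)}(\alpha_{n_1}),\ldots,\phi_i^{(m)}(\alpha_{n_X})\}_{i\in[\lambda],m\in[M]};\, \{w_{i,1}^{(m)},\ldots,w_{i,K}^{(m)}\}_{i\in[\lambda],m\in[M]}\big)\\
&\overset{(b)}{=} \sum_{m\in[M]}\sum_{i\in[\lambda]} I\big(\phi_i^{(m)}(\alpha_{n_1}),\ldots,\phi_i^{(m)}(\alpha_{n_X});\, w_{i,1}^{(m)},\ldots,w_{i,K}^{(m)}\big)\\
&= 0,
\end{aligned}$$

where (a) is due to (1) and (15); (b) follows from the fact that all the symbols $\{w_{i,1}^{(m)},\ldots,w_{i,K}^{(m)}, z_{i,K+1}^{(m)},\ldots,z_{i,K+X}^{(m)}\}_{i\in[\lambda],m\in[M]}$ are generated independently and uniformly from $\mathbb{F}_p$, and thus the sets of variables $\{\phi_i^{(m)}(\alpha_{n_1}),\ldots,\phi_i^{(m)}(\alpha_{n_X}), w_{i,1}^{(m)},\ldots,w_{i,K}^{(m)}\}$ are independent across all $m\in[M]$, $i\in[\lambda]$. Therefore, the $X$-security constraint (2) on the data files holds.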
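The non-singularity used in the X-security argument can be checked numerically on a toy instance. The sketch below builds the matrix whose entries are the Lagrange basis polynomials of the $X$ padding nodes evaluated at the colluding servers' points, and verifies that its determinant is nonzero over $\mathbb{F}_p$ for every choice of $X$ servers. The prime `P = 97`, the points, and the helper names are assumptions; the points are simply chosen distinct with the $\beta$'s disjoint from the $\alpha$'s rather than derived from P1-P4.

```python
from itertools import combinations

P = 97                         # assumed small prime field
K, X = 2, 2                    # toy storage parameters
BETA_ROW = [1, 2, 3, 4]        # beta_{i,1..K+X} for one fixed row i (assumed points)
ALPHAS = [5, 6, 7, 8, 9, 10]   # alpha_1..alpha_N, disjoint from the betas


def basis(nodes, idx, x, p=P):
    """Lagrange basis polynomial attached to nodes[idx], evaluated at x over F_p."""
    val = 1
    for k, b in enumerate(nodes):
        if k != idx:
            val = val * (x - b) * pow((nodes[idx] - b) % p, -1, p) % p
    return val


def det_mod(mat, p=P):
    """Determinant over F_p by Gaussian elimination."""
    a = [[v % p for v in row] for row in mat]
    n, det = len(a), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c]), None)
        if piv is None:
            return 0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            det = -det
        det = det * a[c][c] % p
        inv = pow(a[c][c], -1, p)
        for r in range(c + 1, n):
            f = a[r][c] * inv % p
            for k in range(c, n):
                a[r][k] = (a[r][k] - f * a[c][k]) % p
    return det % p


def security_matrix(servers):
    """Rows: colluding servers; columns: basis polynomials of the X padding nodes."""
    return [[basis(BETA_ROW, K + l, a) for l in range(X)] for a in servers]


dets = [det_mod(security_matrix(pair)) for pair in combinations(ALPHAS, X)]
all_nonsingular = all(d != 0 for d in dets)
```

Because the matrix is invertible, the $X$ uniform padding symbols act as a one-time pad on the shares seen by any $X$ servers, which is the mechanism behind the zero mutual information above.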

Byzantine and Unresponsiveness: By (26), the user can recover all the desired evaluations correctly. Thus, the scheme can resist any B Byzantine servers and U unresponsive servers even if their identities change from round to round. Accordingly, (5) follows.

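User-Privacy: Let $\mathcal{T} = \{n_1, n_2, \ldots, n_T\}\subseteq[N]$ be any $T$ indices of the $N$ servers. By (18) and (19), the query elements $q_j^{(u),s}(\alpha_{n_1}), q_j^{(u),s}(\alpha_{n_2}), \ldots, q_j^{(u),s}(\alpha_{n_T})$ sent to the servers in $\mathcal{T}$ are protected by $T$ independent and uniform random noises for any $u\in[\xi]$, $j\in[\lambda]$, $s\in[S]$, as shown below.

$$\begin{bmatrix} q_j^{(u),s}(\alpha_{n_1})\\ q_j^{(u),s}(\alpha_{n_2})\\ \vdots\\ q_j^{(u),s}(\alpha_{n_T}) \end{bmatrix}
= \underbrace{\begin{bmatrix} c_j^{(u),s}(\alpha_{n_1})\\ c_j^{(u),s}(\alpha_{n_2})\\ \vdots\\ c_j^{(u),s}(\alpha_{n_T}) \end{bmatrix}}_{=\,\mathbf{c}_j^{(u),s}}
+ \underbrace{\begin{bmatrix}
f_1(\alpha_{n_1}) & f_2(\alpha_{n_1}) & \cdots & f_T(\alpha_{n_1})\\
f_1(\alpha_{n_2}) & f_2(\alpha_{n_2}) & \cdots & f_T(\alpha_{n_2})\\
\vdots & \vdots & \ddots & \vdots\\
f_1(\alpha_{n_T}) & f_2(\alpha_{n_T}) & \cdots & f_T(\alpha_{n_T})
\end{bmatrix}}_{\triangleq\,\mathbf{F}^s}
\underbrace{\begin{bmatrix} z_{j,1}^{(u),s}\\ z_{j,2}^{(u),s}\\ \vdots\\ z_{j,T}^{(u),s} \end{bmatrix}}_{=\,\mathbf{z}_j^{(u),s}}, \tag{34}$$

where

$$c_j^{(u),s}(\alpha) = \begin{cases}
\displaystyle\sum_{l\in[(s-1)\Delta+1:s\Delta]} \Bigg(\prod_{\substack{k\in[\lambda],\, r\in[(s-1)\Delta+1:s\Delta]\\ (k,r)\neq(j,l)}} \frac{\alpha-\beta_{k,r}}{\beta_{j,l}-\beta_{k,r}}\Bigg)\Bigg(\prod_{v\in[T]} \frac{\alpha-\alpha_v}{\beta_{j,l}-\alpha_v}\Bigg), & \text{if } u=\theta\\
0, & \text{if } u\neq\theta
\end{cases}$$

and

$$f_l(\alpha) = \Bigg(\prod_{k\in[\lambda],\, r\in[(s-1)\Delta+1:s\Delta]} \frac{\alpha-\beta_{k,r}}{\alpha_l-\beta_{k,r}}\Bigg)\cdot\prod_{v\in[T]\setminus\{l\}} \frac{\alpha-\alpha_v}{\alpha_l-\alpha_v}, \quad \forall\, l\in[T].$$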

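According to P3 and P4 again, the elements $\alpha_1,\ldots,\alpha_N$ are distinct and $\{\alpha_n : n\in[N]\}\cap\{\beta_{k,r} : k\in[\lambda], r\in[(s-1)\Delta+1:s\Delta]\} = \emptyset$ for any $s\in[S]$. Hence, $\mathbf{F}^s$ is invertible by Lemma 3, and its inverse is denoted by $(\mathbf{F}^s)^{-1}$. Then,

\begin{align}
& I\big(\{q_j^{(u),s}(\alpha_{n_1}),\ldots,q_j^{(u),s}(\alpha_{n_T})\}_{u\in[\xi],j\in[\lambda],s\in[S]};\, \theta\big) \notag\\
&= I\big(\{\mathbf{c}_j^{(u),s} + \mathbf{F}^s\cdot\mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]};\, \theta\big) \tag{35}\\
&= I\big(\{(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s} + \mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]};\, \theta\big) \tag{36}\\
&= H\big(\{(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s} + \mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]}\big) - H\big(\{(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s} + \mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]} \,\big|\, \theta\big) \tag{37}\\
&\overset{(a)}{=} H\big(\{(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s} + \mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]}\big) - H\big(\{(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s} + \mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]}\big) \tag{38}\\
&= 0, \tag{39}
\end{align}

where (a) is due to the fact that $\mathbf{z}_j^{(u),s}$ is generated uniformly and independently of both $(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s}$ and $\theta$ for all $u\in[\xi]$, $j\in[\lambda]$, $s\in[S]$, and thus

$$H\big(\{(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s} + \mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]} \,\big|\, \theta\big) = H\big(\{(\mathbf{F}^s)^{-1}\cdot\mathbf{c}_j^{(u),s} + \mathbf{z}_j^{(u),s}\}_{u\in[\xi],j\in[\lambda],s\in[S]}\big).$$

Further, by (19),

$$I\big(\{\mathbf{q}_{n_1}^s,\ldots,\mathbf{q}_{n_T}^s\}_{s\in[S]};\, \theta\big) = I\big(\{q_j^{(u),s}(\alpha_{n_1}),\ldots,q_j^{(u),s}(\alpha_{n_T})\}_{u\in[\xi],j\in[\lambda],s\in[S]};\, \theta\big) = 0.$$

Thus, (6) is proved.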
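The masking argument can be made tangible by exhaustive enumeration over a tiny field. For every set of $T$ colluding servers, the sketch below enumerates all noise vectors and checks that the query values they observe form the same uniform distribution over $\mathbb{F}_p^T$, whether the query polynomial carries the indicator of the desired function or is the all-zero indicator of an undesired one. The prime `P = 11`, the points, and the names are assumptions for illustration; the points are chosen distinct with the $\beta$'s disjoint from the $\alpha$'s.

```python
from collections import Counter
from itertools import combinations, product

P = 11                    # assumed tiny prime so that F_p^T can be enumerated
T = 2                     # number of colluding servers
ROUND_BETA = [1, 2]       # beta_{1,l}, beta_{2,l} of one round (lambda = 2, Delta = 1)
ALPHA = [3, 4, 5, 6, 7]   # alpha_1..alpha_N, disjoint from the betas
NODES = ROUND_BETA + ALPHA[:T]   # interpolation nodes of a query polynomial
# Values at the round betas: (1,0) and (0,1) for u = theta, (0,0) for u != theta.
INDICATORS = [(1, 0), (0, 1), (0, 0)]


def lagrange_eval(xs, ys, x, p=P):
    """Evaluate at x the unique polynomial of degree < len(xs) through (xs, ys) over F_p."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def colluder_view(indicator, colluders):
    """Count each tuple of query values seen by the colluders over all noise choices."""
    seen = Counter()
    for noise in product(range(P), repeat=T):
        ys = list(indicator) + list(noise)
        seen[tuple(lagrange_eval(NODES, ys, ALPHA[n]) for n in colluders)] += 1
    return seen


def is_private(colluders):
    views = [colluder_view(ind, colluders) for ind in INDICATORS]
    uniform = len(views[0]) == P ** T and set(views[0].values()) == {1}
    return uniform and all(v == views[0] for v in views)


all_private = all(is_private(c) for c in combinations(range(len(ALPHA)), T))
```

Each observed tuple occurs exactly once per indicator, which is the bijection $\mathbf{z} \mapsto \mathbf{c} + \mathbf{F}^s\mathbf{z}$ induced by the invertibility of $\mathbf{F}^s$.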

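Server-Privacy: For any $\mathcal{B}^s\subseteq[N]$, $\mathcal{U}^s\subseteq[N]$ with $|\mathcal{B}^s|\leq B$, $|\mathcal{U}^s|\leq U$ and $\mathcal{B}^s\cap\mathcal{U}^s=\emptyset$ for each $s\in[S]$, the identities of which are unknown to the user, we have

$$\begin{aligned}
0 &\leq I\big(\{A^s_{[N]\setminus\mathcal{U}^s}, \mathbf{q}^s_{[N]}\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}\big)\\
&= I\big(\{\mathbf{q}^s_{[N]}\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}\big) + I\big(\{A^s_{[N]\setminus\mathcal{U}^s}\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}\big)\\
&\overset{(a)}{=} I\big(\{A^s_{[N]\setminus\mathcal{U}^s}\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}\big)\\
&\overset{(b)}{=} I\big(\{A^s_{[N]\setminus\mathcal{U}^s}, A^s(\alpha_1),\ldots,A^s(\alpha_N)\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}\big)\\
&= I\big(\{A^s(\alpha_1),\ldots,A^s(\alpha_N)\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}\big)\\
&\quad + I\big(\{A^s_{[N]\setminus\mathcal{U}^s}\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}, \{A^s(\alpha_1),\ldots,A^s(\alpha_N)\}_{s\in[S]}\big)\\
&\overset{(c)}{=} I\big(\{A^s(\alpha_1),\ldots,A^s(\alpha_N)\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}\big)\\
&\quad + I\big(\{A^s_{\mathcal{B}^s}\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}, \{A^s(\alpha_1),\ldots,A^s(\alpha_N)\}_{s\in[S]}\big)\\
&\overset{(d)}{=} I\big(\{A^s(\alpha_1),\ldots,A^s(\alpha_N)\}_{s\in[S]};\, \mathbf{W}^{(1)},\ldots,\mathbf{W}^{(M)} \,\big|\, \mathbf{V}^{(\theta)}, \{\mathbf{q}^s_{[N]}\}_{s\in[S]}\big)
\end{aligned}$$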
