Performance measurements - Actively Secure Two-Party Computation: Efficient Beaver Triple Gener

The tests were executed on the Sharemind cluster, where each miner ran on a different machine and the miners communicated over a LAN. Each cluster machine had 48 GB of RAM and two Intel Xeon X5670 CPUs, and the machines were connected with a 1 GB/s LAN connection.

All the given results are average running times of the operations over at least ten repeated tests; more repetitions were used for faster operations. The column Length denotes the length of the input and output vectors, and the other columns in the following tables denote the implemented protocols.

All the experiments were executed using a SecreC script. We recorded the running time of each independent execution of each operation. These results are recorded at the miner level, which gives us separate measurements from both miners. This matters mostly for the online protocols of the asymmetric protocol set. It is important to note that the precomputations run in parallel with the online operations during the measurements of the online phase. This mostly affects the multiplication operation, because it uses up many precomputed triples that need to be replaced.

7.3.1 Online protocols

This section analyses the time requirements of the online phase of the asymmetric and symmetric protocol sets. We use the asymmetric setting with 2048-bit key and give the symmetric setting for 2048-bit prime as a comparison to that, as they represent similar data types. In addition, we compare the efficiency of the two computing parties in the asymmetric setting and give a 65-bit version of the symmetric PD.

Tables 7.1 and 7.2 illustrate the time requirements of the two computing parties in the asymmetric setting. Theoretical analysis in Section 4.6 indicated that this setup results in an unbalanced workload for the two computing parties, and our measurements also reflect this. Local protocols of CP1 are two to three times faster than the same protocols for CP2, who also has to compute with ciphertexts. There is less difference in Publish and Multiply, where CP1 has to wait until CP2 finishes some computations and answers on the network before the parties can continue. Time requirements of both miners grow linearly as the test inputs increase, illustrating that we do not gain much from vectorisation and that the computations are more likely to be CPU-bound than network-bound.

| Length | Publish | Add | Subtract | ConstAdd | ConstMult | Multiply |
|---|---|---|---|---|---|---|
| 1 | 21.28 | 0.03 | 0.06 | 0.002 | 0.15 | 218.57 |
| 10 | 197.67 | 0.10 | 0.41 | 0.008 | 1.37 | 572.57 |
| 100 | 1974.02 | 0.62 | 3.93 | 0.037 | 13.49 | 4135.15 |
| 1000 | 19732.16 | 6.27 | 38.87 | 0.170 | 134.19 | 39866.97 |
| 10000 | 197276.02 | 72.75 | 400.92 | 3.652 | 1343.81 | 392461.09 |

Table 7.1: Time requirements of asymmetric computation protocols for party CP1 in Sharemind (milliseconds)

| Length | Publish | Add | Subtract | ConstAdd | ConstMult | Multiply |
|---|---|---|---|---|---|---|
| 1 | 24.76 | 0.02 | 0.11 | 0.003 | 0.47 | 222.54 |
| 10 | 210.76 | 0.15 | 0.98 | 0.004 | 4.65 | 599.92 |
| 100 | 2103.54 | 1.38 | 9.64 | 0.025 | 46.19 | 4399.50 |
| 1000 | 20919.80 | 13.92 | 96.69 | 0.227 | 461.83 | 42510.33 |
| 10000 | 209190.81 | 172.94 | 989.36 | 5.749 | 4613.70 | 418776.28 |

Table 7.2: Time requirements of asymmetric computation protocols for party CP2 in Sharemind (milliseconds)
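The linear growth can be checked directly from the measurements. The following is a minimal sketch that divides the CP1 running times of Table 7.1 by the vector length; the function name `per_element` is only illustrative. A roughly constant per-element cost means that vectorisation brings little gain.

```python
# Running times for party CP1 copied from Table 7.1 (milliseconds).
lengths = [1, 10, 100, 1000, 10000]
publish_ms = [21.28, 197.67, 1974.02, 19732.16, 197276.02]
multiply_ms = [218.57, 572.57, 4135.15, 39866.97, 392461.09]


def per_element(times_ms, lengths):
    """Average time per vector element, in milliseconds."""
    return [t / n for t, n in zip(times_ms, lengths)]


publish_per_elem = per_element(publish_ms, lengths)
multiply_per_elem = per_element(multiply_ms, lengths)

for n, p, m in zip(lengths, publish_per_elem, multiply_per_elem):
    print(f"length {n:>5}: Publish {p:6.2f} ms/elem, Multiply {m:7.2f} ms/elem")
```

In these numbers Publish costs about 20 ms per element at every length, while Multiply has a visible fixed overhead for a single element and then settles at roughly 40 ms per element.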

The asymmetric setting can be compared to the symmetric setting with a 2048-bit modulus. Comparing the asymmetric results in Tables 7.1 and 7.2 to those of the symmetric protocols in Table 7.3 reveals that the gain from the symmetric protocol is significant. The declassifying and, thus, also multiplication protocols have gained most as there are no more encryption operations involved in the symmetric setting.

| Length | Publish | Add | Subtract | ConstAdd | ConstMult | Multiply |
|---|---|---|---|---|---|---|
| 1 | 10.28 | 0.01 | 0.01 | 0.003 | 0.01 | 110.48 |
| 10 | 10.56 | 0.03 | 0.05 | 0.004 | 0.03 | 112.36 |
| 100 | 9.94 | 0.26 | 0.39 | 0.024 | 0.24 | 127.89 |
| 1000 | 11.27 | 2.71 | 3.89 | 0.175 | 1.41 | 223.09 |
| 10000 | 22.65 | 34.83 | 48.63 | 2.534 | 12.27 | 1147.97 |

Table 7.3: Time requirements of symmetric computation protocols for 2048-bit modulus in Sharemind (milliseconds)

A new trend in the symmetric setting is that the times to declassify a value or multiply shares do not increase linearly as the input size grows, at least for small input sizes. This probably indicates that these protocols depend more on network speed than on computation power. The sudden growth in multiplication cost for length 10000 can be explained by the fact that it has to perform several Publish operations, so the network capacity may become a bottleneck. In addition, multiplication requires as many triples as the input length and, thus, there is continuous precomputation in the background to replace those triples. These trends are especially visible in Table 7.4, which also includes longer input lengths.
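The contrast between the two settings can be made concrete with a small calculation. This sketch compares how much the Publish time grows between input lengths 1 and 1000, using the CP1 values from Table 7.1 and the symmetric values from Table 7.3; the helper `growth` is an illustrative name.

```python
# Publish running times in milliseconds for vector lengths 1 and 1000.
asym_cp1_publish = {1: 21.28, 1000: 19732.16}  # Table 7.1
sym_2048_publish = {1: 10.28, 1000: 11.27}     # Table 7.3


def growth(times, small=1, large=1000):
    """Factor by which the running time grows between two input lengths."""
    return times[large] / times[small]


asym_growth = growth(asym_cp1_publish)
sym_growth = growth(sym_2048_publish)
print(f"asymmetric Publish grows {asym_growth:.0f}x, symmetric {sym_growth:.2f}x")
```

A thousandfold longer input makes the asymmetric Publish over 900 times slower, whereas the symmetric Publish slows down by only about a tenth, which is consistent with a cost dominated by network round trips.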

| Length | Publish | Add | Subtract | ConstAdd | ConstMult | Multiply |
|---|---|---|---|---|---|---|
| 1 | 10.51 | 0.02 | 0.01 | 0.005 | 0.01 | 55.79 |
| 10 | 10.27 | 0.04 | 0.02 | 0.007 | 0.01 | 56.76 |
| 100 | 10.16 | 0.23 | 0.19 | 0.023 | 0.05 | 54.33 |
| 1000 | 11.01 | 1.37 | 1.75 | 0.188 | 0.62 | 65.84 |
| 10000 | 24.77 | 13.56 | 17.84 | 0.886 | 4.49 | 203.48 |
| 100000 | 102.27 | 146.48 | 185.64 | 10.462 | 46.20 | 1880.76 |
| 1000000 | 846.05 | 1467.25 | 1682.50 | 97.189 | 460.60 | 14084.73 |

Table 7.4: Time requirements of symmetric computation protocols for 65-bit modulus in Sharemind (milliseconds)

The comparison of Table 7.3 to Table 7.4 shows that the considerable difference in data type size affects the running time less than we might expect. According to Tables 7.3 and 7.4, computation with a 65-bit modulus is only two to three times faster than computing with a 2048-bit modulus. The difference between using 65-bit and 33-bit moduli illustrated the same trend, where the 33-bit modulus is only slightly faster than the 65-bit one. The surprising result that ConstMult is faster than Add results from the specifics of our setup, where the public value is a uniformly random 32-bit element, which is small compared to the general tested values. Measuring the symmetric setup with a 33-bit prime gives a better estimate, where ConstMult is approximately three times slower than Add.
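The speed-up between the two modulus sizes can be read off the tables. The sketch below divides the 2048-bit running times of Table 7.3 by the 65-bit running times of Table 7.4 for vectors of length 10000; the grouping into `local_ops` is our own labelling of the operations that need no communication.

```python
operations = ["Publish", "Add", "Subtract", "ConstAdd", "ConstMult", "Multiply"]
t_2048 = [22.65, 34.83, 48.63, 2.534, 12.27, 1147.97]  # Table 7.3, length 10000
t_65 = [24.77, 13.56, 17.84, 0.886, 4.49, 203.48]      # Table 7.4, length 10000

# Ratio above 1 means the 65-bit modulus is faster.
ratios = {op: a / b for op, a, b in zip(operations, t_2048, t_65)}
local_ops = ["Add", "Subtract", "ConstAdd", "ConstMult"]

for op in operations:
    print(f"{op:>9}: {ratios[op]:.2f}x")
```

At this length the local operations all fall between two and three, Publish is essentially unchanged, and Multiply shows a larger gap.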

These results clearly show that the symmetric setting can be more efficient than the asymmetric one, as expected. However, the symmetric PD can only be made usable if there also exists a reasonably efficient precomputation phase. In conclusion, the protocol set for the symmetric setup is a reasonable focus for future developments.

For a simple comparison, in the traditional three-miner Sharemind PD, multiplication of vectors of length 10000 took less than 100 milliseconds for 32-bit secrets and was close to that also for shorter input lengths [12]. Our asymmetric protocol set is significantly slower than that, but our symmetric protocol set can show similar speeds for 65-bit or 33-bit moduli. The main difference here is of course that [12] does not do precomputations. Covertly secure SPDZ [23] for two parties reports doing 64-bit multiplications of input length 10000 in about 76 milliseconds for one thread and vectorised inputs. Our symmetric protocol set is currently slightly slower than that, but seems to be a good step forward from the asymmetric version.

7.3.2 Precomputation protocols

This section analyses the behaviour of our precomputation protocols. Table 7.5 gives the time requirements of the precomputation of the asymmetric protection domain. The precomputation phase of the asymmetric protocol set is clearly less efficient than the online phase. In addition, the measured results also indicate that the zero-knowledge proofs are the most expensive part of these protocols, as also noted in Section 4.6. The proofs take approximately 4/5 of the time in the singles protocol and 3/4 of the total time in the triples protocol. We need approximately 1.6 seconds for one 2048-bit triple, whereas SPDZ [23] can prepare one 128-bit triple in 0.4 seconds.
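The share of the zero-knowledge proofs can be estimated from Table 7.5. This is a rough sketch that assumes the cost of the proofs equals the difference between the column with ZK and the column without it; that decomposition is our assumption and ignores any interaction between the two parts.

```python
# Values from Table 7.5 (milliseconds), for lengths 1, 10, 100, 1000, 10000.
lengths = [1, 10, 100, 1000, 10000]
singles_zk = [315, 2335, 22492, 226923, 2233351]
triples_zk = [1852, 15786, 154853, 1544571, 15464414]
singles = [42, 402, 4014, 40257, 402658]
triples = [529, 4699, 46487, 464853, 4678799]


def zk_share(with_zk, without_zk):
    """Fraction of the running time attributed to the proofs (assumed additive)."""
    return [(w - wo) / w for w, wo in zip(with_zk, without_zk)]


singles_share = zk_share(singles_zk, singles)
triples_share = zk_share(triples_zk, triples)
seconds_per_triple = triples_zk[-1] / lengths[-1] / 1000

for n, s, t in zip(lengths, singles_share, triples_share):
    print(f"length {n:>5}: singles {s:.2f}, triples {t:.2f}")
print(f"one triple with ZK: {seconds_per_triple:.2f} s")
```

Under this assumption the proofs account for a little over 0.8 of the singles protocol and about 0.7 of the triples protocol, in line with the rounded fractions given in the text.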

Protocol B-Triples is used exactly as given in its protocol description in Algorithm 15, as packing smaller data types was native to this algorithm. The ShareConv-Triples protocol from Algorithm 17 is benchmarked using the packing idea based on the Chinese remainder theorem. We only consider packings where all packed moduli are of equal bit length for simpler exposition and comparison. We chose 65-bit and 33-bit moduli as they are sufficient to keep traditional 32-bit or 64-bit integers in them.

| Length | Singles with ZK | Triples with ZK | Singles | Triples |
|---|---|---|---|---|
| 1 | 315 | 1852 | 42 | 529 |
| 10 | 2335 | 15786 | 402 | 4699 |
| 100 | 22492 | 154853 | 4014 | 46487 |
| 1000 | 226923 | 1544571 | 40257 | 464853 |
| 10000 | 2233351 | 15464414 | 402658 | 4678799 |

Table 7.5: Time requirements of asymmetric precomputation protocols in Sharemind (milliseconds)

The CRT packing enables us to pack 15 elements of 65 bits or 31 elements of 33 bits into one ciphertext for a 2048-bit modulus. This also explains the phenomenon in Table 7.6 that lengths 1 and 10 take the same time for Algorithm 17: in both cases the inputs are packed into one ciphertext and the main algorithm has the same workload. The difference in packing efficiency results in the approximately twofold difference between the efficiency of the 33-bit and 65-bit versions of these algorithms. Theoretical analysis in Section 5.4 showed that ShareConv-Triples is the most efficient of our proposals, and the measurements clearly illustrate this. ShareConv-Triples can prepare about 186 packed 65-bit triples in a second, which corresponds to approximately 12 triple generation operations. In comparison, this means that ShareConv-Triples can prepare a semi-honestly secure 65-bit triple in 0.005 seconds, and SPDZ can prepare an actively secure 64-bit triple in 0.027 seconds [23].
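The idea behind CRT packing can be illustrated with a toy example. The sketch below packs several residues into one integer and shows that a single multiplication of the packed values multiplies every slot independently. The three small primes are hypothetical stand-ins for the 33-bit or 65-bit moduli, and the code works on plain integers rather than on Paillier ciphertexts or secret shares, so it is not Algorithm 17 itself.

```python
from math import prod


def crt_pack(values, moduli):
    """Return the unique x in [0, M) with x = values[i] (mod moduli[i])."""
    M = prod(moduli)
    x = 0
    for v, m in zip(values, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m (Python 3.8+).
        x += v * Mi * pow(Mi, -1, m)
    return x % M


def crt_unpack(x, moduli):
    """Recover the individual residues from the packed value."""
    return [x % m for m in moduli]


moduli = [101, 103, 107]  # hypothetical small pairwise coprime moduli
a = [5, 17, 42]
b = [7, 99, 3]

M = prod(moduli)
packed_a = crt_pack(a, moduli)
packed_b = crt_pack(b, moduli)

# One multiplication modulo M yields the slot-wise products modulo each m_i.
packed_c = (packed_a * packed_b) % M
c = crt_unpack(packed_c, moduli)
print(c)
```

The packed moduli must be pairwise coprime, which is the kind of restriction on the packed types that linear packing does not have.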

| Length | B-Triples 33-bit | B-Triples 65-bit | ShareConv-Triples 33-bit | ShareConv-Triples 65-bit |
|---|---|---|---|---|
| 1 | 63 | 64 | 152 | 155 |
| 10 | 287 | 311 | 153 | 153 |
| 100 | 2617 | 2767 | 398 | 661 |
| 1000 | 25686 | 27199 | 2789 | 5458 |
| 10000 | 256775 | 270903 | 26948 | 53674 |

Table 7.6: Time requirements of Beaver triple protocols with packing in Sharemind (milliseconds)
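The throughput figures quoted for ShareConv-Triples follow from the last row of Table 7.6 and the packing counts of 15 and 31 elements per ciphertext. A short calculation, with variable names chosen only for illustration:

```python
shareconv_65_ms = 53674  # Table 7.6, ShareConv-Triples, 65-bit, length 10000
shareconv_33_ms = 26948  # Table 7.6, ShareConv-Triples, 33-bit, length 10000
n_triples = 10000
pack_65 = 15             # 65-bit elements per ciphertext
pack_33 = 31             # 33-bit elements per ciphertext

triples_per_second = n_triples / (shareconv_65_ms / 1000)
packed_ops_per_second = triples_per_second / pack_65
seconds_per_triple = shareconv_65_ms / 1000 / n_triples

# Compare the measured speed ratio with the ratio of packing counts.
speed_ratio = shareconv_65_ms / shareconv_33_ms
packing_ratio = pack_33 / pack_65

print(f"{triples_per_second:.0f} triples/s, {packed_ops_per_second:.1f} packed operations/s")
print(f"{seconds_per_triple:.4f} s per 65-bit triple")
print(f"speed ratio {speed_ratio:.2f}, packing ratio {packing_ratio:.2f}")
```

The measured speed ratio between the 33-bit and 65-bit versions is close to the ratio of the packing counts, which supports the explanation that packing efficiency drives the difference.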

For linear packing in B-Triples, we use a security constant σ = 112, which enabled us to pack 11 elements of 33 bits or 8 elements of 65 bits into 2048 bits of plaintext space. Both this packing inefficiency and the considerably higher requirements on the network made this less efficient than ShareConv-Triples. These packing counts also explain the relatively small difference in running times for the 33-bit and 65-bit cases. For both of these moduli, CP1 has to encrypt all Length elements, and the gain of packing only comes from the shorter result it gets back from CP2, which also reduces the number of decryptions. Hence, the effect the packing has on the overall performance is substantially smaller than for packing with CRT, as the latter gains most from reducing the number of necessary encryption and decryption operations.
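The packing counts can be reproduced with a simple slot calculation. The sketch below assumes that every slot needs 2k + σ bits (room for a product of two k-bit values plus a σ-bit mask). This slot width is our assumption and is not stated in this section; it is used here only because it matches the counts 11 and 8 given above. The helpers `linear_pack` and `linear_unpack` are likewise illustrative and operate on plain integers.

```python
def linear_slots(plaintext_bits, k, sigma):
    """Number of slots under the ASSUMED slot width 2*k + sigma bits."""
    return plaintext_bits // (2 * k + sigma)


def linear_pack(values, slot_bits):
    """Place each value at its own bit offset; values must fit in slot_bits."""
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * slot_bits)
    return packed


def linear_unpack(packed, slot_bits, count):
    """Read the slots back by shifting and masking."""
    mask = (1 << slot_bits) - 1
    return [(packed >> (i * slot_bits)) & mask for i in range(count)]


slots_33 = linear_slots(2048, 33, 112)
slots_65 = linear_slots(2048, 65, 112)
print(slots_33, slots_65)

# Round trip with a hypothetical 178-bit slot (2 * 33 + 112).
values = [123456789, 42, 2**33 - 1]
restored = linear_unpack(linear_pack(values, 178), 178, len(values))
```

Unlike CRT packing, the slots here place no condition on the moduli of the packed elements, at the price of fewer elements per plaintext.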

In conclusion, it seems realistic to combine one of our Beaver triple protocols with CRT packing and share conversion to use it as full precomputation in the symmetric setting. The main open issue is defining efficient general share conversion that applies to additive shares and protection mechanisms.

Chapter 8

Conclusions

Secure multi-party computation is a general solution for privacy-preserving data processing tasks. This thesis explores the subcase of SMC for two computing parties with the additional benefit that the parties can detect faults in the computation results. The main tools used to achieve this are an additively homomorphic cryptosystem, additive secret sharing and message authentication codes. We introduced a popular computation model that divides the work into a preprocessing phase and an online phase. The former is used to prepare randomness that helps to speed up the online phase, which performs all desired computations.

The goal of this thesis is to propose and implement new protocols for secure two-party computation for both the online and the precomputation phase. We concentrate mostly on common operations such as sharing and publishing secret data, as well as addition and multiplication. Multiplication is commonly implemented using Beaver triples, which are prepared in the offline phase. One of the important goals of our protocol sets is to define efficient generation of Beaver triples using an additively homomorphic cryptosystem.
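For readers unfamiliar with the technique, the following minimal sketch shows how a precomputed Beaver triple turns a multiplication of additively shared values into two openings and local arithmetic. It is the standard passively secure version without message authentication codes; the prime `P`, the trusted `make_triple` dealer standing in for the precomputation phase, and all function names are assumptions made for illustration, not the protocols of this thesis.

```python
import secrets

P = 2**61 - 1  # hypothetical prime modulus for the shares


def share(x):
    """Split x into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return (r, (x - r) % P)


def reconstruct(shares):
    """Open a shared value by adding the shares."""
    return sum(shares) % P


def make_triple():
    """Stand-in for precomputation: shares of random a, b and c = a * b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)


def beaver_multiply(x_sh, y_sh, triple):
    a_sh, b_sh, c_sh = triple
    # The parties open d = x - a and e = y - b; a and b mask x and y.
    d = reconstruct([(x - a) % P for x, a in zip(x_sh, a_sh)])
    e = reconstruct([(y - b) % P for y, b in zip(y_sh, b_sh)])
    # Each party computes its share locally; only party 0 adds d * e.
    z_sh = []
    for i in range(2):
        z = (c_sh[i] + d * b_sh[i] + e * a_sh[i]) % P
        if i == 0:
            z = (z + d * e) % P
        z_sh.append(z)
    return tuple(z_sh)


product = reconstruct(beaver_multiply(share(6), share(7), make_triple()))
print(product)
```

Each multiplication consumes one triple, which is why the online Multiply measurements depend on the precomputation keeping up.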

The main result of this thesis is the introduction of three different flavours of setup for secure two-party computation, including asymmetric, symmetric and shared key setup. Their theoretical differences are stressed by the exact initialisation and implementation of the first two. For our initialisation, the symmetric setup is both more efficient and more flexible than the asymmetric setting. The shared key setup is presumably more efficient than the symmetric one, but adds additional complexity to verify the correctness of both computing parties.

The main goal of the Beaver triple generation protocols is to maximise the total bit length of the triples we can obtain from one multiplication using the Paillier cryptosystem. The main difficulties are coming up with a good way to pack smaller elements into the plaintext space of the Paillier cryptosystem and modifying the multiplication with the Paillier cryptosystem to give correct results for moduli other than the Paillier modulus. Two possibilities for packing smaller values into the plaintext space are linear packing and packing using the Chinese remainder theorem. The former is useful because it imposes no limits on the packed types, but the latter can be more efficient. We can also correct the results of the Paillier multiplication by analysing the potential outcomes of the protocol and collaboratively deciding which of those happened.
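The role of the additively homomorphic cryptosystem can be sketched with a toy Paillier instance. In the example CP1 encrypts its shares, CP2 computes the masked cross terms on the ciphertexts, and both parties end up with additive shares of c = a · b. Everything here is an illustrative assumption: the primes are far too small to be secure, the mask `r` is too short to hide anything, the values are chosen so that nothing wraps around the Paillier modulus, and the code is not Algorithm 15 or Algorithm 17.

```python
import secrets
from math import gcd


def keygen(p, q):
    """Toy Paillier key with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)
    return n, (lam, mu)


def encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


MOD = 2**8                    # hypothetical modulus of the shared values
n, sk = keygen(1009, 1013)    # insecure toy primes
n2 = n * n

a1, b1 = 200, 17              # shares held by CP1
a2, b2 = 99, 250              # shares held by CP2

# CP1 sends encryptions of its shares to CP2.
enc_a1, enc_b1 = encrypt(n, a1), encrypt(n, b1)

# CP2 computes Enc(a1*b2 + b1*a2 + r) without seeing a1 or b1.
r = secrets.randbelow(2**16)
masked = pow(enc_a1, b2, n2) * pow(enc_b1, a2, n2) % n2 * encrypt(n, r) % n2

# CP1 decrypts the masked cross terms; each party forms its share of c.
v = decrypt(n, sk, masked)
c1 = (a1 * b1 + v) % MOD
c2 = (a2 * b2 - r) % MOD
print((c1 + c2) % MOD == (a1 + a2) * (b1 + b2) % MOD)
```

If the masked sum were allowed to exceed the Paillier modulus, the decrypted value would be reduced modulo n and the shares would no longer add up modulo `MOD`, which is the kind of error the correction step has to detect.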

Current results show that actively secure multi-party computation is significantly slower than passively secure versions. However, our results indicate that a fully implemented symmetric protocol set could be close to the performance of the SPDZ framework, which is the current leader among actively secure multi-party computation frameworks. In addition, achieving security against malicious adversaries can be very important for data mining tasks that have important economic or societal outcomes. Therefore, in many cases the extra time consumption is a reasonable trade-off for the additional layer of security.

Future work should extend the symmetric setup to include a full precomputation phase and add new operations to both introduced protocol sets. In addition, an implementation of the shared key setup using precomputation with the Paillier cryptosystem would provide an interesting comparison to the existing asymmetric and symmetric setups. Furthermore, the protocols for collecting inputs or returning outputs should be implemented to allow us to use these protection domains in real-world applications. Likewise, it would be important to fully specify the universal composability of each protocol as well as define protocols for setting up the necessary keys of the protection domains.

Secure Two-Party Computation: Efficient Beaver Triple Generation

Master's thesis

Pille Pullonen

Summary

Secure multi-party computation makes it possible to evaluate functions on secret inputs and thereby to solve many data processing tasks securely. Passively secure computation ensures that if all parties follow the protocol, the inputs remain secret and the outputs are correct. The active security model guarantees privacy even when the parties do not behave honestly, and makes it possible to check the correctness of the obtained results. This thesis studies the special case of secure multi-party computation with two computing parties. In addition to them, there may be third parties who provide inputs to the computation or wish to receive results. The main goal of the thesis is to describe two-party protocol sets that are secure in the active model and to implement them in the secure multi-party computation framework Sharemind. Our protocols are divided into two parts: precomputation and the online phase. To achieve efficient precomputation, we separately consider how to generate Beaver triples, which enable fast multiplication in the online phase.

There are at least three different ways to set up two-party computation: the asymmetric, symmetric and shared configurations. This thesis focuses on the first two and defines a concrete example protocol set for each of them. The third is present in the SPDZ protocol set, which significantly influenced our work. Our main tool for achieving security in the active model is a message authentication code, which is used to check the correctness of secret-shared values. In the asymmetric protocol set we additionally use commitment schemes and zero-knowledge proofs. Both protocol sets are based on additive secret sharing. According to both our theoretical analysis and our implementation, the symmetric protocol set is more efficient and more flexible than the asymmetric one. Above all, the symmetric protocol set is more practical because it allows data types of different sizes to be defined with little effort.

In the precomputation part we focused primarily on generating Beaver triples, that is, random multiplicative triples (a, b, c), where a and b are random and c = a · b. For this we use the additively homomorphic Paillier cryptosystem and the classical algorithm for multiplying additively shared data using the Paillier cryptosystem. The main challenge is adapting this algorithm to different data types independently of the modulus defined for the cryptosystem. In particular, we look at how to guarantee that the multiplication protocol gives correct results regardless of the modulus. It turns out that the errors that may occur are well defined and that the computing parties can securely check whether an error occurred or not.

To increase efficiency, we also analyse different ways of packing smaller data types into a Paillier plaintext so that in the end we obtain a correct triple for each packed element. Elements can be packed either linearly or using the Chinese remainder theorem. According to our results, the latter is more efficient in terms of packing, but places additional restrictions on the moduli of the packed elements. In practice this means that, in addition to packing based on the Chinese remainder theorem, we may also need algorithms for changing the modulus of shared data.

We implemented the online phase of both the asymmetric and the symmetric protocol set, and the precomputation phase of the asymmetric protocol set. In addition, we implemented one Beaver triple generation protocol with linear packing and one with packing based on the Chinese remainder theorem. Experiments show that the online phase of the actively secure symmetric protocol set is more than twice as time-consuming as the traditional passively secure three-party Sharemind protocol set. At the same time, the performance gap is small enough for the symmetric protocol set to be usable in practice. Moreover, for many critical data processing tasks the stronger security guarantees may outweigh the loss in performance.

For precomputation, it is clear that the asymmetric protocol set falls significantly behind the precomputation of the SPDZ protocol set, which is based on a fully homomorphic cryptosystem. At the same time, our packing method based on the Chinese remainder theorem, together with a suitable triple generation method, is efficient enough that a precomputation phase for the symmetric protocol set could be defined on its basis.

Bibliography

[1] Aiello, W., Ishai, Y., and Reingold, O. Priced oblivious transfer: How to sell digital goods. In Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques: Advances in Cryptology (London, UK, 2001), EUROCRYPT ’01, Springer-Verlag, pp. 119–135.

[2] Barak, B., Canetti, R., Nielsen, J. B., and Pass, R. Universally composable protocols with relaxed set-up assumptions. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science (Washington, DC, USA, 2004), FOCS ’04, IEEE Computer Society, pp. 186–195.

[3] Barker, E., Barker, W., Burr, W., Polk, W., and Smid, M. Recommendation for key management – part 1: General (revision 3). Tech. rep., National Institute of Standards and Technology, 2012. NIST Special Publication 800-57.

[4] Beaver, D. Efficient multiparty protocols using circuit randomization. In Proceedings of the 11th Annual International Cryptology Conference. CRYPTO ’91 (1991), J. Feigenbaum, Ed., vol. 576 of Lecture Notes in Computer Science, Springer, pp. 420–432.

[5] Beaver, D., Micali, S., and Rogaway, P. The round complexity of secure protocols. In Proceedings of the twenty-second annual ACM symposium on Theory of computing (New York, NY, USA, 1990), STOC ’90, ACM, pp. 503–513.

[6] Ben-Or, M., Goldwasser, S., and Wigderson, A. Completeness theorems for non-cryptographic fault-tolerant distributed computation. In Proceedings of the twentieth annual ACM symposium on Theory of computing (New York, NY, USA, 1988), STOC ’88, ACM, pp. 1–10.

[7] Bendlin, R., Damgård, I., Orlandi, C., and Zakarias, S. Semi-homomorphic encryption and multiparty computation. In Proceedings of the 30th Annual international conference on Theory and applications of cryptographic techniques: advances in cryptology (Berlin, Heidelberg, 2011), EUROCRYPT’11, Springer-Verlag, pp. 169–188.

[8] Blakley, G. R. Safeguarding cryptographic keys. In Proceedings of the 1979 AFIPS National Computer Conference (1979), vol. 48, pp. 313–317.

[9] Bogdanov, D. Sharemind: programmable secure computations with practical applications. PhD thesis, University of Tartu, 2013. http://hdl.handle.net/10062/29041.

[10] Bogdanov, D., Laud, P., and Randmets, J. Domain-polymorphic programming of privacy-preserving applications.

[11] Bogdanov, D., Laur, S., and Willemson, J. Sharemind: A framework for fast privacy-preserving computations. In Proceedings of the 13th European Symposium on Research in Computer Security - ESORICS’08 (2008), S. Jajodia and J. Lopez, Eds., vol. 5283 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, pp. 192–206.

[12] Bogdanov, D., Niitsoo, M., Toft, T., and Willemson, J. High-performance secure multi-party computation for data mining applications. Int. J. Inf. Sec. 11, 6 (2012), 403–418.

[13] Boost - C++ libraries. http://www.boost.org/. Last accessed 2013-04-02.

[14] Brakerski, Z., and Vaikuntanathan, V. Fully homomorphic encryption from ring-lwe and security for key dependent messages. In Proceedings of the 31st annual conference on Advances in cryptology (Berlin, Heidelberg, 2011), CRYPTO’11, Springer-Verlag, pp. 505–524.

[15] Brier, E., and Joye, M. Weierstraß elliptic curves and side-channel attacks.

In document Actively Secure Two-Party Computation: Efficient Beaver Triple Generation (Page 76-89)