A Fair Protocol for Data Trading Based on Bitcoin Transactions


Sergi Delgado-Segura, Cristina Pérez-Solà, Guillermo Navarro-Arribas, Jordi Herrera-Joancomartí

Department of Information Engineering and Communications, Universitat Autònoma de Barcelona

Abstract

On-line commercial transactions involve an inherent mistrust between the participating parties since, sometimes, no previous relation exists between them. Such mistrust may lead to a deadlock in a trade where the buyer does not want to pay until the seller sends the goods and the seller does not want to send them until the buyer pays for the purchase. In this paper we present a fair protocol for data trading where the commercial deal, in terms of delivering the data and performing the payment, is atomic, since the seller cannot redeem the payment unless the buyer obtains the data and the buyer cannot obtain the data without performing the payment. The protocol is based on the Bitcoin scripting language and the fairness of the protocol can be probabilistically enforced.

1 Introduction

Since their first uses, computer networks have served as a base for commercial transactions. Obviously, on-line interactions impose some restrictions on how such transactions can be performed, since there is no physical contact between the parties. One of the first problems that on-line economic transactions face is the distrust that the parties may have of each other. When a buyer is in a regular shop buying some groceries, whether the seller hands over the groceries first or the buyer pays first is irrelevant, since the on-site transaction reduces the distrust between the parties. However, if the economic transaction is on-line, such a situation implies a disadvantage for the party that performs the first step of the protocol. If the buyer pays before receiving the purchase, the seller could act maliciously by not sending the goods and, conversely, if the seller delivers the goods before getting paid, the buyer could disappear with the product without paying. In real transactions this situation is solved by creating some trust relationship between buyer and seller. Such trust is based on the fact that the buyer is willing to pay before the product is delivered because standard on-line payment systems, like credit cards or bank transfers, can be reversed. So the buyer has some confidence that if something goes wrong, for instance, the product is not delivered, he could prove it, the system could reverse the payment and he would not be at a disadvantage.

∗ This manuscript is a preprint of the paper published as: Sergi Delgado-Segura, Cristina

However, the situation is different when blockchain based cryptocurrencies are the payment system for purchases. Blockchain based cryptocurrencies avoid the double-spending problem of digital cash systems by maintaining a general ledger in which all transactions are stored. The main property of such ledger is its immutability which makes payments final once a transaction is deep enough in the blockchain. Once the payment is firmly included in the blockchain, it is impossible to reverse such payment, unless the payee of such transaction unilaterally agrees to return the money. Such mechanism leaves the buyer in a weaker position when the payment is performed before obtaining the goods.

The best approach to solving this imbalance between the parties in a purchase is to make the whole transaction atomic, in the sense that the payment and the delivery of the goods take place at the same time. With this approach, if one of the parties does not complete his part, no exchange takes place (neither delivery nor payment). In fact, such a situation is the best emulation of the on-site shop scenario, where the payment and the delivery take place at the same time, almost atomically. Of course, translating such an approach to a virtual environment implies that the goods traded are restricted to digital data.

The contribution of this paper is the following. We propose a fair protocol for data trading. Our protocol is fair since neither participant has an advantageous position in the execution of the protocol. The protocol is atomic in the sense that either it is fully executed, leaving the buyer with the data and the seller with the payment, or no party incurs any loss. Our proposal is a practical one, based on the Bitcoin scripting language, and can be deployed using existing technology, in contrast to other theoretical approaches that are reviewed in Section 2.

The rest of the paper is organized as follows. In Section 2 we review the state of the art in fair exchange protocols. Section 3 provides some background on Bitcoin transactions, its scripting language and the main relevant transactions used in our protocol. In Section 4 the fair protocol for data trading is presented and its main properties analyzed. Finally, Section 5 concludes the paper.

2 Related work


carried. The signature is produced so that no party can gain advantage over the other. We rely on Bitcoin as a way to actually implement the exchange contract. The contract does not need to be enforced by other parties (even if the signature of the contract is fairly produced by both parties) because Bitcoin itself executes the contract in the form of a transaction script.

Fair exchange protocols are usually divided into two-party protocols and protocols requiring a trusted third party (TTP). Two-party protocols provide a gradual exchange of messages or information between the two parties, to gradually decrease uncertainty and increase fairness in the transaction, without the need for a TTP. First proposed in [7], the idea is for the two parties to exchange secrets bit by bit, allowing them to verify the correctness of the received bits. This idea was also proposed in other approaches [9], more specifically for the signature of contracts. Probabilistic protocols for fair exchange were introduced in [4], where the goal is for the parties to end up with a given probability on the fairness (commitment to the contract by the two parties) at a given time (or step).

Regarding the use of a TTP, we usually distinguish between online and offline TTPs. An online TTP acts as an intermediary between the two parties, ensuring the fairness of the exchange [20, 10]. An offline TTP, on the other hand, only acts in case of dispute and does not participate in the protocol if all parties act honestly; this is also called an optimistic fair exchange [2, 21, 3].

Authors refer to the notion of perfect fairness (also called strong fair exchange) when a party cannot leave the protocol with even a small advantage over the opponent [16]. Perfect fair exchange usually requires the use of TTP-based protocols, although there are several alternatives to implement it. Some of them rely on some penalty mechanism to be applied to the misbehaving parties [19]. However, it is important to note that, from a practical perspective, this advantage could be small enough to be tolerated by both parties.

In [11] Bitcoin is used for the payment in an optimistic fair exchange (with a TTP) with anonymity. Most notably, Bitcoin has been proposed for fair exchange as a means of implementing a penalty mechanism [5]. The idea is that if a party leaves the protocol with more knowledge than the rest, the honest parties are compensated. The same idea is applied to multiparty computation in [1].

In our proposal Bitcoin is used as the main mechanism to implement and enforce the fair exchange. We do not rely on a TTP and, instead of fairly signing a contract for the exchange of data, the contract is explicitly executed as a Bitcoin smart contract. This has the advantage that the exchange, that is, the buyer getting the data and the seller the money, happens either completely or not at all. In this sense we can say that the exchange is atomic. Although our approach does not achieve strong fairness, as we will show, the advantage that a party (in our case the buyer) takes by leaving the protocol can be bounded by the other party.


proof is used (externally to Bitcoin) to prove the validity of the encrypted data and the secret key, thus providing some sort of strong fair exchange. This is only feasible if a zero knowledge proof exists and is feasible for the specific case (data). A more generic solution is outlined in [18, 17] where a symmetric key is used to encrypt chunks of data such that a subset can be revealed as a proof. As different keys are used for each chunk, revealing a subset does not ensure that the key of the other chunks is correct. In our proposal, since the key revealed is a private key, it is easy to verify its correctness with the corresponding public key. This ensures that the key is valid for all data chunks by just verifying one of them.

3 Bitcoin transactions background

Bitcoin transactions are usually seen as a transfer of bitcoins from a source to a destination address, in which the former can prove the ownership of such bitcoins, and thus spend them, by providing a digital signature. However, Bitcoin transactions are far more complex, and allow the creation of richer conditions that have to be met to redeem the funds [14]. Transactions may be seen, in a more general way, as a collection of inputs, containing references to previous transaction outputs, and a collection of outputs with the corresponding conditions under which they can be redeemed. Each input of a transaction refers to an output of a previous one, where unlocking conditions have been established. Hence, each input has to contain the proof of fulfillment of the conditions established by the output it tries to redeem.

Both unlocking conditions and proofs are coded in transactions using Script, a stack-based, non-Turing-complete scripting language with no loops. In order to check the correctness of a transaction, the full script is executed by concatenating both locking and unlocking scripts, leaving a final True on the top of the stack if and only if the proof satisfies the unlocking conditions. Transactions of this kind, including complex unlocking scripts, are also known as smart contracts. However, not every condition can be coded or checked using Script. A limited number of operations (opcodes) are defined in Bitcoin, bounding the variety of scripts that can be encoded using the language. Furthermore, its use is even more restricted, since not all the defined operations can actually be used. For instance, the most common script within Bitcoin, the standard Pay-To-Public-Key-Hash, where a digital signature is needed to redeem a transaction, is provided next:

ScriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

ScriptSig: <sig> <pubKey>


are described.

3.1 Time locked transactions

Time locked transaction outputs are outputs that require a certain time in the future to be reached in order to be redeemed. Depending on whether the future time is absolute, or relative to the transaction publishing time, two types of time-locks can be found. On the one hand, absolute time-locks, those based on the CheckLockTimeVerify opcode, establish a fixed date in the future from when the transaction can be redeemed. An example of such a time-lock (locked until 2022/12/13), along with a standard signature, is provided below:

ScriptPubKey: <2022/12/13> OP_CHECKLOCKTIMEVERIFY OP_DROP <pubKey> OP_CHECKSIG

ScriptSig: <sig>

On the other hand, relative time-locks, those based on the CheckSequenceVerify opcode, establish an amount of time, starting from the transaction publishing time, that has to elapse before the output can be unlocked. An example of such a time-lock (locked for 25 days), together with a traditional signature, is as follows:

ScriptPubKey: <25d> OP_CHECKSEQUENCEVERIFY OP_DROP <pubKey> OP_CHECKSIG

ScriptSig: <sig>

Notice that in both examples the ScriptSig, included in the transaction that will spend the output, does not contain any time reference, since the time-locks are checked against time fields of the spending transaction itself (nLockTime for absolute time-locks and the nSequence of the input for relative ones). Moreover, both examples include a traditional signature lock. The reason behind this second lock is to restrict the redeemer to a single person; otherwise anyone would be able to spend the output once the requested time has been reached. Figure 1 depicts a general time locked transaction, and can be seen as a representation of any of the two introduced types.
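As a rough illustration of how the two lock values above could be encoded and checked, the following Python sketch mirrors the comparison rules that BIP 65 specifies for OP_CHECKLOCKTIMEVERIFY and the time-based encoding that BIP 68 specifies for relative locks. The helper names (absolute_lock, relative_lock, cltv_passes) are hypothetical and do not belong to any Bitcoin library; signature checking and script number encoding are left out.

```python
from datetime import datetime, timezone
from math import ceil

LOCKTIME_THRESHOLD = 500_000_000  # below: block height; at or above: Unix time
SEQUENCE_FINAL = 0xFFFFFFFF       # an input with this nSequence disables nLockTime
SEQUENCE_TYPE_FLAG = 1 << 22      # BIP 68: the value counts 512-second units
SEQUENCE_GRANULARITY = 512


def absolute_lock(year: int, month: int, day: int) -> int:
    """Unix timestamp (UTC midnight) to push before OP_CHECKLOCKTIMEVERIFY."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def relative_lock(days: int) -> int:
    """Time-based relative lock value to push before OP_CHECKSEQUENCEVERIFY."""
    units = ceil(days * 86400 / SEQUENCE_GRANULARITY)
    if units > 0xFFFF:
        raise ValueError("relative lock does not fit in 16 bits")
    return SEQUENCE_TYPE_FLAG | units


def cltv_passes(script_lock: int, tx_locktime: int, input_sequence: int) -> bool:
    """Time-related part of the OP_CHECKLOCKTIMEVERIFY check."""
    if script_lock < 0:
        return False
    # Both values must be of the same kind (block height or timestamp).
    if (script_lock < LOCKTIME_THRESHOLD) != (tx_locktime < LOCKTIME_THRESHOLD):
        return False
    # The spending transaction must declare a lock time that is late enough.
    if script_lock > tx_locktime:
        return False
    return input_sequence != SEQUENCE_FINAL


if __name__ == "__main__":
    lock = absolute_lock(2022, 12, 13)
    print(cltv_passes(lock, lock, 0xFFFFFFFE), hex(relative_lock(25)))
```

The check only inspects fields of the spending transaction; the network separately refuses to include a transaction in a block before its declared lock time has been reached.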

[Figure 1: General time locked transaction. A first transaction pays 1 BTC from someone to Alice; its output requires Alice's signature and a wait until a given time to be unlocked. Alice then spends it in a second transaction that pays 1 BTC to someone else.]

3.2 Private key locked transactions

Private key locked transactions [8] are another special case of Bitcoin transactions in which the transaction output can be redeemed by anyone who provides the private key corresponding to a predefined public key.

Two different approaches can be used to implement private key locked transactions: defining a new Bitcoin opcode or taking advantage of a well-known vulnerability of the ECDSA algorithm.

3.2.1 New Bitcoin opcode

One possibility to implement private key locked transactions is through the implementation of a new crypto opcode that performs precisely a matching validation between the public key and the corresponding private key: OP_CHECKKEYPAIRVERIFY. The OP_CHECKKEYPAIRVERIFY opcode would check whether the top two items of the stack, pubKey and privKey (corresponding, respectively, to a public key and a private key), match.

With the usage of this new opcode, a transaction output could be constructed such that, in order to be redeemed, the private key matching the specified public key has to be revealed. An example¹ of the scriptPubKey of such an output, together with the scriptSig needed to spend it, would be:

ScriptPubKey: <pubKey> OP_CHECKKEYPAIRVERIFY OP_NIP OP_CHECKSIG

ScriptSig: <sig> <privKey>

The script will first check that the public and private keys belong to the same key pair. Note that, if the validation is successful, the stack values will remain untouched. Therefore, before checking the validity of the signature with OP_CHECKSIG, the privKey value has to be removed from the stack (since it is not needed for signature validation). The execution of OP_NIP removes privKey from the stack. Finally, OP_CHECKSIG validates the signature with the public key. If the signature is correct, the script terminates successfully.

Note that the execution of OP_CHECKKEYPAIRVERIFY would fail if the validation is unsuccessful and would leave the stack as it was before if the validation is successful. This ensures that the new opcode can be implemented as a soft fork modification of the Bitcoin core protocol by reusing one of the currently unused OP_NOPx opcodes, in a similar way to what has been done in the past with OP_CHECKLOCKTIMEVERIFY (OP_NOP2) and OP_CHECKSEQUENCEVERIFY (OP_NOP3).
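The stack behaviour described above can be sketched in Python as follows. This is only an illustration of a hypothetical opcode: we assume that the key pair check amounts to verifying that privKey·G equals pubKey on secp256k1, and the signature check is passed in as a callback (check_sig) rather than implemented. All function names are ours.

```python
# secp256k1 domain parameters (curve y^2 = x^3 + 7 over the prime field P).
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


class ScriptError(Exception):
    """Raised when a *VERIFY operation fails."""


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def scalar_mult(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def op_checkkeypairverify(stack):
    """Fail on mismatch; leave the stack untouched on success."""
    pub_key, priv_key = stack[-1], stack[-2]
    if scalar_mult(priv_key, G) != pub_key:
        raise ScriptError("private key does not match public key")


def op_nip(stack):
    """Remove the second-to-top stack item (here, privKey)."""
    del stack[-2]


def run_private_key_locked_script(sig, priv_key, pub_key, check_sig):
    stack = [sig, priv_key]       # ScriptSig: <sig> <privKey>
    stack.append(pub_key)         # ScriptPubKey: <pubKey>
    op_checkkeypairverify(stack)  # stack stays [sig, privKey, pubKey]
    op_nip(stack)                 # stack becomes [sig, pubKey]
    top_pub, top_sig = stack.pop(), stack.pop()
    return check_sig(top_sig, top_pub)  # stands in for OP_CHECKSIG
```

Because a successful check leaves the stack unchanged, a node that does not know the opcode and treats it as a no-op reaches the same stack state, which is what makes the soft fork deployment plausible.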

3.2.2 ECDSA vulnerability

Since the OP_CHECKKEYPAIRVERIFY opcode described in the previous section is not yet available, another approach can be taken to build transaction outputs that require the disclosure of a specific private key to be redeemed, using a vulnerability in the ECDSA signature scheme.

¹ The provided script includes a digital signature condition, following the structure of the

ECDSA (Elliptic Curve Digital Signature Algorithm) is the cryptographic algorithm used by Bitcoin to create and validate digital signatures. The ECDSA signature scheme is probabilistic in the sense that there exist many different valid signatures made with the same private key for the same message. Such a feature is based on the selection of a specific random value k during the signature process. There exists a well-known ECDSA signature vulnerability² by which an attacker that observes two signatures of different messages made with the same private key is able to extract the private key if the signer reuses the same k during the signature process. Therefore, the selection of k is critical to the security of the system.
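The algebra behind this vulnerability is short enough to sketch. With s = k⁻¹(z + r·SK) mod n for each signature, two signatures sharing k (and hence r) give k = (z₁ − z₂)/(s₁ − s₂) and SK = (s·k − z)/r. The Python code below is a minimal, self-contained illustration on secp256k1; it hashes messages with a single SHA-256 and applies no low-s normalization, both of which are simplifying assumptions of ours, and the function names are hypothetical.

```python
import hashlib

P = 2**256 - 2**32 - 977  # field prime of secp256k1
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def scalar_mult(k, point):
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def msg_hash(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(sk: int, message: bytes, k: int):
    """ECDSA signature with an explicitly chosen (here: reused) nonce k."""
    r = scalar_mult(k, G)[0] % N
    s = pow(k, -1, N) * (msg_hash(message) + r * sk) % N
    return r, s


def verify(pk, message: bytes, sig) -> bool:
    r, s = sig
    w = pow(s, -1, N)
    point = point_add(scalar_mult(msg_hash(message) * w % N, G),
                      scalar_mult(r * w % N, pk))
    return point is not None and point[0] % N == r


def recover_private_key(m1: bytes, sig1, m2: bytes, sig2) -> int:
    """Recover SK from two signatures that share the same k (same r)."""
    (r1, s1), (r2, s2) = sig1, sig2
    if r1 != r2:
        raise ValueError("signatures were not made with the same k")
    z1, z2 = msg_hash(m1), msg_hash(m2)
    k = (z1 - z2) * pow((s1 - s2) % N, -1, N) % N
    return (s1 * k - z1) * pow(r1, -1, N) % N
```

In the setting of this paper the first message would be the off-chain nonce message and the second the hash of the spending transaction; if a signer normalized s to its low form, the recovery would also have to try N − s.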

To implement private key locked transactions, we can make use of the aforementioned ECDSA vulnerability to perform targeted private key disclosure within Bitcoin. The private key disclosure mechanism is performed by constructing transaction outputs that need to reveal a private key in order to be redeemed, in such a way that we ensure the revealed private key is the counterpart of a certain public key.

Let {PK, SK} be an ECDSA key pair belonging to Bob (with Addr(PK) the Bitcoin address associated to it) and sig_prev an existing signature made with SK. Alice (who is interested in acquiring Bob's private key) needs to know the value of the previous signature sig_prev, in order to be able to request, afterwards, a second signature made with the same k. In contrast with the approach followed in [8], where the previous signature appears in the blockchain as the input script of an existing transaction, in this paper the existing signature sig_prev does not appear in the blockchain but is sent to Alice by an off-chain exchange of values. In this case, the previous signature sig_prev may be transmitted confidentially (and thus only Alice and Bob know its value). Following this approach, the signed message m does not need to correspond to a Bitcoin transaction hash.

Once an existing previous signature sig_prev is known by Alice, she creates a transaction with an output that requires a second signature sig to be spent. However, instead of using the classical pay-to-pubkey-hash script, she uses a special script that forces Bob (the redeemer) not only to prove he has the private key SK associated to the given address Addr(PK) by creating a valid signature, but also to deliver a signature that has exactly the same k value that was used when creating sig_prev.

Doing so accomplishes two purposes: on the one hand, Bob proves he knows the private key associated to the public key by generating a signature that correctly validates with that public key; on the other hand, Bob is implicitly revealing the private key associated to the same public key. Note that Bob does not directly provide the private key, but provides information from which the private key can be derived.

² The vulnerability is also present in the non-elliptic curve signature scheme of

Figure 2 shows a scheme of the Bitcoin transactions involved in the construction of a private key locked output. Once Alice knows the previous signature, she can construct the transaction Tx2, which transfers some of her bitcoins to Bob only if Bob provides a valid signature that has the same k as the previous signature sig_prev that Bob previously sent to Alice. Moreover, the output has an additional condition with a time lock allowing Alice to get a refund of her bitcoins if Bob decides not to collaborate and does not redeem Tx2's output.

[Figure 2 diagram: Bob sends SigPrev to Alice. Tx1 pays 1 BTC from someone to Alice. Tx2 pays 1 BTC from Alice to Bob | Alice, and can be unlocked either by Bob's signature (revealing SK) or by Alice's signature.]

Figure 2: Transactions involved in the scheme.

The ScriptPubKey of the output (and its corresponding ScriptSig) that implement the mechanism described above are the following:

ScriptPubKey: OP_DUP <pubKey> OP_CHECKSIGVERIFY OP_SIZE <0x47> OP_EQUALVERIFY <sigmask> OP_AND <kprev> OP_EQUAL

ScriptSig: <sig>

First, the script validates the signature against the specified public key. Then, the length of the signature is checked. Finally, a bitwise AND between the new signature and sigmask³ is computed, and the result is compared with the k value of the previous signature. If both values are equal, the script terminates successfully; otherwise, the script terminates with a False value on the stack, making it fail.

Note that the only way to ensure that the script succeeds is by providing a valid signature that has exactly the same k as the previous signature. Therefore, although the ScriptSig that spends the output does not include the value of the private key directly, it implicitly leaks its value through the ECDSA vulnerability.

³ sigmask: a byte array that has 1s on selected positions and 0s in the rest of positions in

Also note that the ScriptSig needed to spend the output only requires one value: the new signature.
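To make the size and mask checks more concrete, the following Python sketch emulates the tail of the script on byte strings. It rests on assumptions of ours: the signature is DER encoded with 32-byte r and s components plus one sighash byte (which gives exactly 0x47 bytes), the mask selects the bytes of the r component (the part of the signature determined by k), and the value pushed for the comparison is the masked previous signature. The signature validity check is omitted and all names are hypothetical.

```python
SIG_LEN = 0x47  # 2 header bytes + (2 + 32) for r + (2 + 32) for s + 1 sighash byte
R_OFFSET = 4    # 0x30 <len> 0x02 <rlen> come before the r bytes
R_LEN = 32


def encode_sig(r: int, s: int, sighash: int = 0x01) -> bytes:
    """DER-encode (r, s), assuming both need exactly 32 bytes without padding."""
    parts = b""
    for value in (r, s):
        raw = value.to_bytes(32, "big")
        if not 0x01 <= raw[0] <= 0x7F:
            raise ValueError("component would not be encoded in exactly 32 bytes")
        parts += b"\x02\x20" + raw
    return b"\x30" + bytes([len(parts)]) + parts + bytes([sighash])


# 1s on the positions of r, 0s everywhere else.
SIG_MASK = bytes(0xFF if R_OFFSET <= i < R_OFFSET + R_LEN else 0x00
                 for i in range(SIG_LEN))


def bitwise_and(a: bytes, b: bytes) -> bytes:
    return bytes(x & y for x, y in zip(a, b))


def mask_check_passes(sig: bytes, expected: bytes) -> bool:
    """Emulates: OP_SIZE <0x47> OP_EQUALVERIFY <sigmask> OP_AND <expected> OP_EQUAL."""
    return len(sig) == SIG_LEN and bitwise_and(sig, SIG_MASK) == expected


if __name__ == "__main__":
    r = int.from_bytes(bytes([0x11]) * 32, "big")
    sig_prev = encode_sig(r, int.from_bytes(bytes([0x22]) * 32, "big"))
    sig_new = encode_sig(r, int.from_bytes(bytes([0x33]) * 32, "big"))
    expected = bitwise_and(sig_prev, SIG_MASK)
    print(mask_check_passes(sig_new, expected))
```

Fixing the length is what makes a positional mask meaningful: with a variable-length encoding, the bytes of r would not sit at predictable offsets.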

4 The data trading protocol

As we have already stated, the main property of our data trading protocol is its atomicity, which makes it fair for the participants. The proposed protocol is run by two parties, the buyer, B, and the seller, S, and no additional party, like a TTP, is needed. Notice that, contrary to other fair exchange protocols [2, 21, 3], some of which also need a TTP, our proposal does not define a dispute mechanism, thanks to the atomicity of the protocol. Furthermore, since our protocol is based on bitcoins, both parties need to be connected to the Bitcoin network to send transactions to, and receive them from, the blockchain.

In our scenario, the buyer, B, wants to buy some data D from the seller, S, and he is willing to pay x bitcoins for such data. In order to minimize any possible advantage of one of the parties, we consider that the data being sold can be divided in n different parts and each of those parts may have a meaning by itself. Notice that this scenario is not as restrictive as it may appear, since many kinds of data fall into this category. For instance, movies or songs can be sliced and each slice may be recognized as part of the whole performance. Likewise, when dealing with sensor data, some sensing values may provide evidence that the sensing is correct, while the whole sensing data could be needed for specific purposes.

4.1 Protocol description

The full protocol, depicted in Figure 3, can be divided in three main parts: the Data correctness proof, in which a cut & choose protocol between B and S is performed in order to convince B that the acquired data is correct; the Signature commitment, used for B to obtain a previous signature performed by S with the private key of the key pair used to encrypt the data; and the Private Key Exchange, used to exchange, atomically, the private key that allows decrypting the sold information for the agreed amount of bitcoins.

In the following paragraphs, we describe in detail each subprotocol. We denote by {PK, SK} a public key pair and by E_PK(·) the encryption function using the public key.

[Figure 3: message flow of the protocol between the buyer B and the seller S (Steps 1 to 12), grouped into the Data correctness proof, Signature commitment and Private key exchange subprotocols. Messages and operations shown: getData(cond); PK, E_PK(D), x, with E_PK(D) = {E_PK(D_0), E_PK(D_1), ..., E_PK(D_{n−1})}; getProof(i_j ∀j ∈ [0, ..., m−1] and i_j ∈ [0, ..., n−1]); proof = D_{i_j} ∀i_j; Verify(D_{i_j}, cond) ∀i_j; CheckEqual(Encrypt(D_{i_j}, PK), E_PK(D_{i_j})) ∀i_j; getSigPrev(nonce, PK); sig_prev; Validate(message, sig_prev, PK); Tx1 (from B, x BTC, to S | B, unlocked by S's signature revealing SK or by B's signature); Tx2 (from S, x BTC, to someone); ExtractSK(Tx2); Decrypt(D, SK).]

complete data being sold but also for each of the sliced parts of the data.⁴ Upon reception, S generates a new key pair {PK, SK} and sends B the following information (see Step 2 in Figure 3): the public key PK, the requested data D encrypted using PK, and the data price x. In order to allow B to check the data correctness, S does not send D as a single block of encrypted data, but split in n chunks which are encrypted individually (as shown in Figure 4), that is: E_PK(D) = {E_PK(D_0), E_PK(D_1), ..., E_PK(D_{n−1})}.

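[Figure 4 diagram: the data is split into chunks D_0, D_1, ..., D_{n−1}; each chunk is encrypted with E_PK, yielding E_PK(D_0), E_PK(D_1), ..., E_PK(D_{n−1}), which together form E_PK(Data).]

Figure 4: Split and encrypt procedure.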

When B receives all the encrypted data, he requests from S a correctness proof consisting of a random subset of non-encrypted data from D. To that end, B selects the subset by randomly choosing a set of m pieces from the encrypted dataset, that is, i_j ∀j ∈ [0, m−1], i_j ∈ [0, n−1]. B sends this information, and S builds the correctness proof by choosing the unencrypted pieces of data that match the received indexes, that is, proof = {D_{i_j} ∀j ∈ [0, m−1], i_j ∈ [0, n−1]}. S sends such correctness proof to B.

Once B has received the proof, he verifies the correctness of D by checking that the proof satisfies the conditions. Furthermore, B validates that the received data also matches the corresponding subset of received encrypted data, by recreating the data encryption using PK. Therefore, since the subset has been randomly chosen by B, the correctness of the full dataset can be proved with a given probability. Section 4.3 analyzes in depth the impact of the parameters of the scheme on such probability.
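The cut-and-choose exchange just described can be sketched in Python as shown below. This is a simplified illustration under assumptions of ours: each chunk is encrypted with its own symmetric key, which S reveals together with the chunk so that B can recompute the ciphertext (in the spirit of the digital envelopes of Section 4.2.2), and the cipher is a toy SHA-256 keystream used only to keep the example free of external dependencies. Encrypting the symmetric keys under PK is omitted, and all function names are hypothetical.

```python
import hashlib
import secrets


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 keystream XOR); a stand-in, not for real use."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, stream))


def seller_commit(chunks):
    """S encrypts every chunk under a fresh key and sends the ciphertexts to B."""
    keys = [secrets.token_bytes(32) for _ in chunks]
    committed = [toy_encrypt(k, d) for k, d in zip(keys, chunks)]
    return keys, committed


def buyer_challenge(n: int, m: int):
    """B picks m distinct chunk indexes at random."""
    return secrets.SystemRandom().sample(range(n), m)


def seller_reveal(indexes, chunks, keys):
    """S reveals the requested plaintext chunks with their keys."""
    return {i: (chunks[i], keys[i]) for i in indexes}


def buyer_verify(proof, committed, cond) -> bool:
    """B checks the conditions and that re-encryption matches the commitment."""
    return all(cond(data) and toy_encrypt(key, data) == committed[i]
               for i, (data, key) in proof.items())


if __name__ == "__main__":
    chunks = [b"reading-%d" % i for i in range(20)]
    keys, committed = seller_commit(chunks)
    indexes = buyer_challenge(len(chunks), 5)
    proof = seller_reveal(indexes, chunks, keys)
    print(buyer_verify(proof, committed, lambda d: d.startswith(b"reading-")))
```

The essential point is that S commits to all ciphertexts before learning which indexes B will ask for, so a corrupted chunk is caught whenever it falls inside the challenged subset.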

Once the data correctness has been proven, the Signature commitment subprotocol is performed. B requests a signature sig_prev over a nonce message, performed with the private key SK generated by S. S sends sig_prev and B validates that the signature is correct, using the public key PK that he received in Step 2 of the Data Correctness Proof subprotocol.

Finally, the Private Key Exchange subprotocol is performed. In such subprotocol, B builds a private key locked transaction, Tx1, to perform the atomic exchange between the private key, SK, and the price of x bitcoins. Such private key locked transaction is built using the technique described in Section 3.2, adding a time constraint condition following the details of Section 3.1. Such time constraint is used for B to recover the amount of x bitcoins in case S decides not to reveal the private key by not spending the received transaction. B broadcasts the transaction Tx1 to the Bitcoin P2P network. Once Tx1 is included in a block, S can spend the output of such transaction with an input of a new transaction Tx2 in which S will provide the second signature with the same k as sig_prev. Once Tx2 appears on the blockchain, B will be able to recover the private key SK (following the details of Section 3.2) and decrypt the data E_PK(D) he received in Step 2 to retrieve the purchased data.

⁴ Notice that such conditions will be verified by a validation mechanism. Whether such

4.2 Implementation details

In the protocol description provided in the previous section some implementation details have been deliberately omitted to allow a better understanding of the general protocol. In this section, we provide some comments regarding such specific details.

4.2.1 Privacy protection

First of all, sensitive information exchanged in the protocol should be protected from third parties. For instance, if an attacker could retrieve the information transmitted in Step 2 and in Step 7, later on, with the knowledge of Tx2, which is publicly available in the blockchain, he could decrypt the information and retrieve the original data D. To avoid such a situation, the information transmitted in Steps 2 and 7 could be encrypted using the public key of B, which could be sent to S in Step 1. Furthermore, in Step 4, some part of the data is transmitted in clear for validation purposes. In this case, an external attacker could also obtain such information. Again, such a situation can be avoided by encrypting the information of Step 4 in the same way we just described for Steps 2 and 7.

4.2.2 Data encryption mechanism

As is well known, public key cryptography is not suitable for encrypting large files due to its poor performance. Since the size of the data chunks that are encrypted and transmitted in Step 2 can vary depending on the traded data, we suggest using digital envelopes to encrypt D. Digital envelopes [13] protect the message by using a two-layer encryption in which the data itself is encrypted using symmetric encryption, and then the symmetric key is encrypted using public-key cryptography. Following such an approach, for each chunk D_i of data created from D, a symmetric key k_i is also generated. D_i is encrypted using k_i, that is, C_i = E_{k_i}(D_i), and k_i is then encrypted using PK, that is, c_i = E_PK(k_i). Thus, each encrypted chunk of data D_i sent by S to B during Step 2 should be replaced by {C_i, c_i}, that is, E_PK(D_i) → {C_i, c_i}. Furthermore, when sending the correctness proof, S will include the corresponding symmetric encryption keys k_i ∀i ∈ 0, ..., m−1. Finally, B will

4.2.3 Script building

The private key locked transaction used in the Private Key Exchange subprotocol, Tx1, also includes a time lock condition to allow B to get a refund of his x bitcoins in case S decides not to correctly follow the last step of the protocol. The details on how the ScriptPubKey of such transaction can be built are provided next⁵:

ScriptPubKey: IF

OP_DUP <S pubKey> OP_CHECKSIGVERIFY OP_SIZE <0x47> OP_EQUALVERIFY <sigmask> OP_AND <rprev> OP_EQUAL

ELSE

<expiring time> OP_CHECKLOCKTIMEVERIFY OP_DROP <B pubKey> OP_CHECKSIG

ENDIF

4.3 Protocol fairness discussion

The main objective of the proposed protocol is to achieve fairness in the sense that neither B nor S would have any advantage in the protocol. By advantage we mean that B cannot obtain the data without paying x bitcoins and S cannot obtain the bitcoins without revealing the data. The Data Correctness Proof subprotocol ensures that S cannot sell fake data. Without B verifying parts of the encrypted data, S could encrypt fake data and, when S obtains the bitcoins in Tx2 and B the decryption key, B will learn that he was cheated but it will be too late since S already has the bitcoins.

In the following paragraphs, we will show how the buyer B is probabilistically protected against deception by using a cut-and-choose mechanism. Furthermore, the level of protection may be adjusted by fixing the ratio of chunks revealed on Step 2.

The main steps of the Data Correctness Proof subprotocol are:

1. S encrypts each of the n chunks of data with a public key PK and commits to the ciphered chunks by sending them to B.

2. B chooses a subset of m chunks and asks S to reveal the original data corresponding to those chunks.

3. B validates the received chunks by checking both that the original data meets the specified conditions and that the encryption of the original data is equal to the committed values.

⁵ An example of such a transaction can be found in

We say that a seller S successfully deceives a buyer B if the seller is able to include b corrupted chunks of data within the n traded chunks without the buyer noticing it after having validated the m revealed chunks (that is, after finishing the Data Correctness Proof subprotocol). The probability Ω of S successfully deceiving B is given by the following equation:

$$\Omega(m, n, b) = 1 - \sum_{i=1}^{\min\{b,m\}} \frac{\binom{b}{i}\binom{n-b}{m-i}}{\binom{n}{m}}$$

Indeed, $\binom{n}{m}$ counts the number of ways of choosing m elements from a set of n elements. We are interested in knowing how many of those ways include at least one corrupted chunk. We compute this value by counting the number of ways of selecting m elements with exactly i of them being corrupt and summing them up for all possible i values. $\binom{b}{i}\binom{n-b}{m-i}$ computes the number of ways of selecting exactly i corrupted chunks, that is, the number of ways of selecting i bad chunks from the set of b corrupted chunks, $\binom{b}{i}$, multiplied by the number of ways of selecting the remaining m − i elements from the n − b non-corrupted chunks, $\binom{n-b}{m-i}$. The summation gives the probability of selecting at least one corrupted chunk within the m revealed ones, that is, the probability of detecting a fraud. Therefore, the probability of deception is the complement.

Figure 5 shows the probability of deception Ω for different ratios of chunks revealed, m/n, and different numbers of corrupted chunks included by the seller, b, for n = 1 000. Note that, even when the ratio of checked chunks is low, the probability of successfully deceiving a buyer is low whenever b is over a certain threshold. For instance, when 20% of the chunks are checked, the probability of deception is 0.8 if the seller includes just 0.1% of corrupted chunks (b = 1, red dot on Fig. 5). However, if the seller includes 1% of corrupted chunks (b = 10), the probability of successfully deceiving the buyer decreases to 0.106 (green dot on Fig. 5).

When using the proposed data selling protocol, the two parties (buyer and seller) agree on the value of the parameter m. Therefore, the buyer can decide beforehand whether or not to buy a given dataset depending on the deception risk he is willing to assume. Buyers will be interested in using high m values, since these offer higher levels of protection. Conversely, even honest sellers will prefer low m values, since if the buyer does not finally purchase the data, m data chunks end up being revealed for free.
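To illustrate how a buyer might choose m before negotiating, the sketch below searches for the smallest number of revealed chunks that keeps the deception probability under a chosen risk level. It relies on the fact that, by Vandermonde's identity, Ω(m, n, b) simplifies to $\binom{n-b}{m}/\binom{n}{m}$, the probability that none of the m revealed chunks is corrupted. The function names and the risk threshold are assumptions made for this example, not part of the protocol.

```python
from math import comb


def deception_probability(m: int, n: int, b: int) -> float:
    """Closed form of Omega: no corrupted chunk among the m revealed ones."""
    return comb(n - b, m) / comb(n, m)


def min_reveal(n: int, b: int, max_risk: float):
    """Smallest m whose deception probability is at most max_risk.

    Returns None if no such m exists (for example b == 0 with max_risk < 1,
    where there is nothing to detect).
    """
    for m in range(n + 1):
        if deception_probability(m, n, b) <= max_risk:
            return m
    return None


# Hypothetical negotiation: n = 1 000 chunks, buyer tolerates b = 50
# corrupted chunks and wants a deception risk of at most 1%.
print(min_reveal(1000, 50, 0.01))
```

The seller can run the same computation to see how many chunks would have to be revealed for free in order to satisfy a given buyer.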

As an example, let us consider a certain seller S that sells a dataset divided into n = 1 000 chunks. The buyer is willing to accept, at most, 5% of corrupted chunks in the purchased data, and the seller does not want to reveal more than 5% of the chunks in Step 2. With this configuration, if we analyze the probability Ω for b = 50 (5% of the 1 000 chunks),

$$
\Omega(50, 1\,000, 50) \approx 0.072
$$


Figure 5: Probability of deception Ω. Grey lines highlight the values mentioned as numerical examples.

2. With these settings, the buyer knows that the probability of the seller cheating undetected is almost negligible, less than 0.004.

5 Conclusion

In this paper we have introduced a fair data trading protocol based on Bitcoin transactions. The protocol uses a new type of transaction, the private key locked transaction, which provides an atomic way of exchanging a private key for bitcoins. This key is used to encrypt all the traded data, and is traded, as part of a Bitcoin smart contract, only when the two parties agree. The correctness of the data sold using the protocol is verifiable by the buyer before performing the transaction by checking a small random subset of the data. By using such a cut-and-choose technique, deception is avoided with high probability while only a small part of the information is learned by the buyer.

The protocol can be implemented by using the recently proposed private key locked transaction and exchanging a few messages between the parties involved in the process, making it easy to deploy. Moreover, it relies on the security measures Bitcoin provides, without introducing more complexity, and it is bounded by the computational capabilities of the Bitcoin scripting language.


currency for trading with non-physical goods. The application of such a protocol covers a wide range of topics, from general interest data, such as songs, pictures or movies, to specific purpose data, such as data sensing readings. The integration of such a data trading protocol in data sensing scenarios could provide a secure way of verifying data correctness, reducing the ratio of misbehaving users and optimizing the rewarding system.

Acknowledgements

This work is partially supported by the Spanish ministry under grant number TIN2014-55243-P and the Catalan AGAUR grant 2014SGR-691. The authors would like to thank Aaron van Wirdum from Bitcoin Magazine for his illustrations of Bitcoin transactions, on which the figures provided throughout the paper are based. The authors would also like to thank their colleague Roger Ten-Valls for his help in the discussion of the deception probability equation involved in the protocol.

References

[1] Marcin Andrychowicz, Stefan Dziembowski, Daniel Malinowski, and Lukasz Mazurek. Secure multiparty computations on bitcoin. Commun. ACM, 59(4):76–84, March 2016.

[2] N. Asokan, V. Shoup, and M. Waidner. Optimistic fair exchange of digital signatures.IEEE Journal on Selected Areas in Communications, 18(4):593– 610, April 2000.

[3] Feng Bao, R. H. Deng, and Wenbo Mao. Efficient and practical fair ex-change protocols with off-line ttp. InProceedings. 1998 IEEE Symposium on Security and Privacy (Cat. No.98CB36186), pages 77–85, 1998.

[4] M. Ben-Or, O. Goldreich, S. Micali, and R. L. Rivest. A fair protocol for signing contracts.IEEE Transactions on Information Theory, 36(1):40–46, January 1990.

[5] Iddo Bentov and Ranjit Kumaresan. How to use bitcoin to design fair protocols. In Juan A. Garay and Rosario Gennaro, editors, Advances in Cryptology – CRYPTO 2014: 34th Annual Cryptology Conference, Santa Barbara, CA, USA, August 17-21, 2014, Proceedings, Part II, pages 421– 439. Springer Berlin Heidelberg, 2014.

[6] Bitcoin Wiki. Zero knowledge contingent payment. https://en.bitcoin.

it/wiki/Zero_Knowledge_Contingent_Payment, February 2016.

(17)

[8] Sergi Delgado-Segura, Cristina P´erez-Sol`a, Jordi Herrera-Joancomart´ı, and Guillermo Navarro-Arribas. Bitcoin private key locked transactions (short paper). Cryptology ePrint Archive, Report 2016/1184, 2016. http://

eprint.iacr.org/2016/1184.

[9] Shimon Even, Oded Goldreich, and Abraham Lempel. A randomized pro-tocol for signing contracts. Commun. ACM, 28(6):637–647, June 1985.

[10] Matthew K. Franklin and Michael K. Reiter. Fair exchange with a semi-trusted third party (extended abstract). In Proceedings of the 4th ACM Conference on Computer and Communications Security, CCS ’97, pages 1–5, 1997.

[11] D. Jayasinghe, K. Markantonakis, and K. Mayes. Optimistic fair-exchange with anonymity for bitcoin users. In 2014 IEEE 11th International Con-ference on e-Business Engineering, pages 44–51, 2014.

[12] Gregory Maxwell. The first successful zero-knowledge contingent pay-ment. Bitcoin Core Blog, https://bitcoincore.org/en/2016/02/26/

zero-knowledge-contingent-payments-announcement/, February 2016.

[13] Alfred J Menezes, Paul C Van Oorschot, and Scott A Vanstone. Handbook of applied cryptography. CRC press, 1996.

[14] Arvind Narayanan, Joseph Bonneau, Edward Felten, Andrew Miller, and Steven Goldfeder. Bitcoin and cryptocurrency technologies. Princeton Uni-versity Pres, 2016.

[15] Christof Paar and Jan Pelzl. Understanding cryptography: a textbook for students and practitioners. Springer Science & Business Media, 2009.

[16] Indrajit Ray and Indrakshi Ray. Fair Exchange in E-commerce. SIGecom Exch., 3(2):9–17, March 2002.

[17] Peter Todd. OP CHECKLOCKTIMEVERIFY. BIP 65,

CHECKLOCKTIMEVERIFY, 2014.

[18] Peter Todd and Amir Taaki. Paypub: Trustless payments for information publishing on bitcoin. Github Project, https://github.com/unsystem/ paypub, October 2004.

[19] T.W. Sandholm and V.R. Lesser. Advantages of a leveled commitment contracting. In Proceedings of the Thirteenth National Conference on Ar-tificial Intelligence and Eighth Innovative Applications of ArAr-tificial Intelli-gence Conference, AAAI 96, IAAI 96,, volume 1, pages 126–133, 1996.

(18)