PhD Research Proposal: Incentive Elicitation and Behavior Induction on Anonymous Communication


Zhongliu Xie ([email protected]) 24th November 2011

1 Introduction

Anonymity technology conceals users’ personal information during web browsing, online chat, email and similar activities, and thereby provides secure network communication. Major modern anonymous communication systems such as Tor [3], Babel [8], Mixmaster [13], Tarzan [5] and Morphmix [14] are based on the mix network paradigm originally proposed by David Chaum [2]. The fundamental principle is to encrypt messages or packets in layers and pass them through a sequence of relays (called mixes); each relay removes one layer of encryption and forwards the result to its successor until the destination is reached. According to [3], relay-based anonymity designs fall into two categories: high-latency and low-latency. The former are more resistant to attacks but suffer from significant lag in interaction; the latter, although less secure, are more attractive, particularly for interactive tasks such as web browsing and online chat.

However, anonymous communication is still far from being widely used. There are a number of reasons, of which the most critical is probably the lack of incentive for users to adopt it. For instance, many people find it unnecessary to anonymize scanning the news on the BBC website or playing online games. A promising research focus is applying anonymity to more specific domains to tackle more specific problems. One example is hiding personal information when using location-based services on mobile devices. Longitude [4] is a typical technology (though not based on mix networks) that enables location sharing between friends within social networks while at the same time preserving location privacy against the service provider. In fact, anonymity in mobile computing has much wider applicability. Another example is using anonymity technology to evade Internet censorship, such as that imposed by the “Great Firewall of China”. This is one of the popular uses of Tor.

However, such anonymity systems generally rely on computers supplied by volunteers to provide a large-scale anonymity service, and malicious users or software may abuse these resources, for example by downloading large but trivial files or by spreading spam widely. Such misbehavior significantly discourages volunteers from contributing to the anonymity system.

To better elicit users’ incentive to use anonymous communication technology, and to manage resources more efficiently against abuse and other misbehavior, introducing economic elements to develop incentive-driven and behavior-induction mechanisms could provide an effective solution. Such an economic approach could center on building a market-based mechanism for the use of anonymity technology that rewards contributions and penalizes misbehavior. For example, each user in an anonymity network is associated with a virtual bank account, and use of the service carries a price that may change dynamically with demand and supply. Providing resources to the network earns the volunteer some virtual currency, whereas misbehaving incurs fines. There should also be a policy that refills a free rider’s account slowly (but not too slowly), so that more users are motivated to use the anonymity technology. In that sense, all users of the anonymity network collectively form a community that shares resources.
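The account mechanics described above can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: the class name `Account`, its method names, and the numeric rates, prices and caps are all hypothetical and are not part of any existing anonymity system.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Virtual bank account of one user (all names here are illustrative)."""
    balance: float = 0.0

    def earn(self, units_provided: float, price: float) -> None:
        # A volunteer is paid for the resources relayed on behalf of others.
        self.balance += units_provided * price

    def pay(self, units_used: float, price: float) -> bool:
        # Service is refused when the user cannot afford it.
        cost = units_used * price
        if cost > self.balance:
            return False
        self.balance -= cost
        return True

    def fine(self, amount: float) -> None:
        # Misbehavior is penalized; the balance is floored at zero (assumption).
        self.balance = max(0.0, self.balance - amount)

    def refill(self, rate: float, cap: float) -> None:
        # Slow top-up for free riders, never above the cap (assumption).
        if self.balance < cap:
            self.balance = min(cap, self.balance + rate)


volunteer, free_rider = Account(), Account()
volunteer.earn(units_provided=10, price=2.0)
volunteer.fine(5.0)
free_rider.refill(rate=1.0, cap=3.0)
print(volunteer.balance, free_rider.balance)
```

The refill cap is one possible way to express "slowly, but not too slowly": a free rider can always afford a small amount of service but cannot accumulate wealth without contributing.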


2 Related Work

2.1 Mix Networks

According to [2], the mix network paradigm is based on public key cryptography: to encrypt and decrypt a message $X$, one uses a pair of keys $K$ and $K^{-1}$, where the former is made public (i.e. known to other users) while the latter is kept private. The encryption of $X$ with key $K$ is denoted $K(X)$, and the keys satisfy the following condition:

$$K^{-1}(K(X)) = K(K^{-1}(X)) = X$$

In that sense, when a message $X$ is encrypted with the public key $K$, only the holder of the private key $K^{-1}$ can decrypt it and access the information. Conversely, one can encrypt a message with one’s private key so that the public key holders can discover the content. Furthermore, to make it harder to verify a guess $X = Y$ by checking whether $K(X) = K(Y)$, some random bits $R$ can be attached to $X$, denoted $K(R, X)$. $R$ could also be replaced by a signature $S$, so that when others decrypt the sealed message $K^{-1}(S, X)$, they can check whether it has been signed by the private key holder.
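The key-pair condition can be checked numerically with a toy example. The sketch below uses textbook RSA with tiny primes purely as a stand-in: [2] does not prescribe a particular cryptosystem, the parameters are far too small to be secure, and the function names `K` and `K_inv` are chosen only to mirror the notation above.

```python
# Toy RSA key pair with tiny textbook primes; insecure, for illustration only.
p, q = 61, 53
n = p * q                  # modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent, plays the role of K
d = pow(e, -1, phi)        # private exponent, plays the role of K^{-1} (Python 3.8+)


def K(x: int) -> int:
    """Apply the public key."""
    return pow(x, e, n)


def K_inv(x: int) -> int:
    """Apply the private key."""
    return pow(x, d, n)


X = 65
assert K_inv(K(X)) == X    # sealed with the public key, opened with the private key
assert K(K_inv(X)) == X    # sealed with the private key, opened with the public key
```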

Moreover, although message content may remain hidden, correspondences among users can still be tracked, which is known as the traffic analysis problem. A mix is an untraceable message delivery system associated with a public key $K_0$ and a private key $K_0^{-1}$. If a user wants to deliver a message $X$ to address $A$ (sealed with $K_A$), she composes another message that contains the sealed message, the address and fresh random bits $R_0$, encrypts it with $K_0$ and sends it to the mix. The mix then decrypts the message and discards the random bits, as shown below:

$$K_0(R_0, K_A(R, X), A) \longrightarrow R_0, K_A(R, X), A \longrightarrow K_A(R, X), A$$

after which the mix sends $K_A(R, X)$ to address $A$. The order of message arrival can be hidden by periodically outputting batches in lexicographic order (a high-latency mechanism). In that sense, an attacker will not be able to trace the messages and determine whether a conversation between target users has occurred.
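The behavior of a single mix can be sketched symbolically in Python. No real cryptography is involved: `Sealed` is a hypothetical stand-in for "encrypted under the named key", and the key names, addresses and random-bit strings are made up for the example.

```python
from typing import NamedTuple


class Sealed(NamedTuple):
    """Symbolic stand-in for K(payload): only the named key can open it."""
    key: str
    payload: tuple


def seal(key: str, *payload) -> Sealed:
    return Sealed(key, payload)


def mix_batch(mix_key: str, inputs: list) -> list:
    """Open each input, drop the outer random bits, emit a sorted batch."""
    outputs = []
    for item in inputs:
        if item.key != mix_key:
            raise ValueError("item is not sealed for this mix")
        _random_bits, inner, address = item.payload   # discard the random bits
        outputs.append((address, inner))
    # Lexicographic ordering hides the order of arrival.
    return sorted(outputs, key=repr)


batch = [
    seal("K0", "r2", seal("KB", "r", "hello Bob"), "B"),
    seal("K0", "r1", seal("KA", "r", "hello Alice"), "A"),
]
for address, inner in mix_batch("K0", batch):
    print(address, inner.key)
```

Because the output is sorted, the batch looks the same to an observer whatever order the inputs arrived in.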

However, a problem remains: all privacy-preserving communication relies on the intermediate mix, which has to be provided by a trusted authority. Once the mix is compromised, all correspondences are exposed. To overcome this, Chaum suggests using a cascade of mixes instead of a single mix.

Suppose there is a cascade of N mixes. A message sent from one person to another is encrypted with the N public keys one by one, in the reverse order of the sequence. The message is then passed through the cascade of mixes one after another; each mix decrypts one layer and passes the result to the next mix, and the last mix delivers the message to its destination. In that sense, each node knows nothing but its predecessor and successor, and unless all the mixes are compromised, the correspondence should (ideally) remain secret. More details, such as return addresses and digital pseudonyms, can be found in [2].
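The layering order is easy to get wrong, so a small symbolic sketch may help. Encryption is modeled as nested tuples rather than real ciphertext, and the key names `K1` to `K3`, the destination `A` and the helper names `wrap` and `peel` are hypothetical.

```python
def wrap(message: str, destination: str, mix_keys: list) -> tuple:
    """Seal for the last mix first, so the first mix's layer ends up outermost."""
    packet = ("deliver", destination, message)
    for key in reversed(mix_keys):
        packet = ("sealed", key, packet)
    return packet


def peel(packet: tuple, key: str) -> tuple:
    """One mix removes exactly one layer; a wrong key reveals nothing."""
    tag, layer_key, inner = packet
    if tag != "sealed" or layer_key != key:
        raise ValueError("this mix cannot open the packet")
    return inner


keys = ["K1", "K2", "K3"]            # hypothetical keys of a 3-mix cascade
packet = wrap("hello", "A", keys)
for key in keys:                     # the packet visits the mixes in order
    packet = peel(packet, key)
print(packet)                        # ('deliver', 'A', 'hello')
```

Each call to `peel` sees only the next layer, which mirrors the property that a mix learns nothing beyond its predecessor and successor.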

2.2 Onion Routing and Tor

Onion routing is an anonymous routing approach based on mix networks, formalized by the authors of [6]. Traditional routing requires the data source and destination to be publicly known so that routers can construct a correct path for the data to pass through. This is subject to the traffic analysis problem described earlier. In the onion routing architecture, each intermediate router is called an onion router (originally called a proxy/routing node) and plays a role like that of a mix. The initiating onion router holds all the information and constructs a routing path (termed a circuit), along which a packet is encrypted layer by layer (like an onion) and transmitted through the other onion routers one after another, in exactly the same way as in the mix cascade described in Section 2.1.

The original onion routing system was left at a proof-of-concept level, with many limitations and problems unaddressed. In [3], the authors designed and built the second-generation onion routing system, Tor. Tor is well developed, with a number of advanced features including perfect forward secrecy, congestion control, leaky-pipe circuit topology, variable exit policies and so forth. Its design has solved a range of widely recognized anonymity problems, and it has therefore become one of the most influential anonymous communication platforms.


However, Tor still has a number of important limitations. According to [3], typical limitations concern scalability, bandwidth heterogeneity, cover traffic, multisystem interoperability, directory distribution and so forth. Among them, the most significant weakness is probably the difficulty of defending against end-to-end timing attacks. This problem could be largely relieved if there were a very large number of users. Another key feature of Tor is that it is overlaid on nodes provided by volunteers, and more nodes and users inherently imply better anonymity. However, node providers are frequently troubled by abuse, which significantly discourages them from contributing. This raises the problems of incentive elicitation and behavior induction.

There are, of course, other valuable anonymity systems such as Babel, Mixmaster, Tarzan and Morphmix, each with its own advantages and weaknesses. These technologies may reveal incentive and misbehavior problems from different angles.

2.3 Market-based Mechanism on Shared Computing Resource

Introducing economic elements into computer science research and developing hybrid approaches to address technical problems were strongly advocated in my previous paper [20]. Using market-based mechanisms to distribute shared resources could significantly increase the efficiency of resource allocation. In fact, numerous attempts have been made to simulate a market-based environment. The earliest research on maximizing computing utility using economic incentives dates back to 1966, when Greenberger discussed ways of achieving fairness in queuing systems [7]. More frequently referenced is the auction system developed by Sutherland in 1968 [17]. In this auction system, users bid for exclusive access to the PDP-1, the first computer created by Digital Equipment Corporation, in 1960. Different users hold different budgets, which are refilled every day; residuals cannot be carried forward to the next day. All users are placed on an equal footing and the auction is based purely on bid price.
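As a rough illustration of the rules just listed (daily budgets, no carry-over, highest bid wins), consider the following Python sketch. It models only what is stated in the paragraph above and is not a reconstruction of the actual system in [17]; the function name `run_day` and the single-slot simplification are assumptions.

```python
def run_day(daily_budgets: dict, bids: dict):
    """One day of a single-slot auction, using only the rules stated above.

    daily_budgets: user -> allowance granted afresh each day
    bids:          user -> amount bid for exclusive access
    Returns (winner, price_paid), or (None, 0.0) if no bid is affordable.
    """
    # Budgets are reset every day, so yesterday's residual is simply ignored.
    wallets = dict(daily_budgets)
    # A bid is valid only if the bidder can cover it today.
    valid = {user: bid for user, bid in bids.items()
             if 0 < bid <= wallets.get(user, 0.0)}
    if not valid:
        return None, 0.0
    # All users are treated equally: the highest bid wins.
    winner = max(valid, key=valid.get)
    return winner, valid[winner]


budgets = {"alice": 10.0, "bob": 6.0}
print(run_day(budgets, {"alice": 4.0, "bob": 6.0}))
print(run_day(budgets, {"alice": 4.0, "bob": 9.0}))
```

In the second call the larger bid exceeds its bidder's daily budget, so the smaller but affordable bid wins.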

Following that, the authors of [1] presented two auction-based resource allocation environments, Bellagio and Mirage. They discussed the high-level design and their deployment experiences on the worldwide PlanetLab research network and on a shared sensor network testbed. The design centers on a resource allocation mechanism and a virtual currency system, the latter being the core of incentive elicitation. Users in these systems are allocated virtual currency, and resource use carries a price. A specially developed currency refilling policy penalizes or rewards users based on their usage, or lack of usage, during peak time.

Another good example is the market-oriented P2P backup system developed in [15]. In this system, users trade resources such as disk space and bandwidth in exchange for a reliable backup service provided by other users in a local P2P network. However, the system has no explicit currency mechanism. Instead, it uses an implicit policy in which each type of resource is associated with a hidden price, and each cost or earning is translated directly into the level of backup service available before it is presented to the user. This conforms to the Hidden Market Design suggested in [16], where the authors propose a “Hidden Market User Interface” that wraps around an actual market to hide its existence from unsophisticated users, avoiding “decision overload” and reducing cognitive cost. This is similar to the Adapter design pattern in object-oriented programming.

2.4 Pricing Models for Computing Resource

In a market for computing resources, a seller exchanges his/her resources for money, which can be used to purchase services in the future. However, a transaction can only take place when a seller and a buyer agree on a price. In that sense, pricing the resource properly is the key, and pricing should consider both demand and supply. For the Grid computing market, the authors of [9, 10] have developed a number of advanced models using a multi-agent simulation environment to analyze buyer and seller behaviors. In that system, each buyer/seller is associated with a bid/ask price ($P_b$/$P_a$), and when a buyer and a seller meet, a transaction is carried out if $P_b > P_a$. The pricing models developed include a Plain Vanilla Model and a Middleware for Activating the Global Open Grid (MaGoG) model, each containing several sub-models, in which the price can be either static or dynamic. Later, they extended the models to simulate a Markovian futures market for computing utility and demonstrated the potential of computing utility as a financial derivative [12]. Another notable aspect of their work is the use of a multi-agent simulation environment.
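The trading rule $P_b > P_a$ can be illustrated with a very small simulation. This is a sketch under stated assumptions and not the model of [9, 10]: prices are drawn uniformly at random, each buyer meets exactly one seller, and the settlement price (the midpoint) is an arbitrary choice made for the example.

```python
import random


def simulate(num_pairs: int, seed: int = 0):
    """Randomly pair buyers with sellers and trade whenever P_b > P_a."""
    rng = random.Random(seed)
    trades = []
    for _ in range(num_pairs):
        p_bid = rng.uniform(0.0, 1.0)   # buyer's bid price P_b (assumed uniform)
        p_ask = rng.uniform(0.0, 1.0)   # seller's ask price P_a (assumed uniform)
        if p_bid > p_ask:
            # The settlement price is an assumption: split the difference.
            trades.append((p_bid, p_ask, (p_bid + p_ask) / 2))
    return trades


trades = simulate(1000)
print(f"{len(trades)} of 1000 meetings ended in a trade")
```

A dynamic-price variant would let agents adjust their bid or ask after each failed meeting, which is where agent-based simulation becomes useful.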

An agent-based simulation can be used to model distributed computing as a process of interaction, in which agents cooperate and compete with other agents according to their own “economic purposes” [18]. It usually involves a large number of interacting decision-making processes conducted by agents, which are difficult to model otherwise.

Furthermore, the authors of [11] have presented a pricing model for the cloud computing market that uses genetic algorithms: these take some simple pricing functions as input and generate new pricing functions, which are hopefully better in the sense of offering suitable prices. Here “suitable” means more stable and quicker at matching sellers and buyers to settle a deal. Moreover, in my master’s thesis [19], I proposed a systematic theory to guide cloud providers in developing pricing strategies that maximize profit under various circumstances, including monopoly or oligopoly, constrained or unconstrained total resources, single or multiple products, and the possibility and different types of outsourcing and brokering. I also developed a number of mathematical models to simulate cloud market situations, including the Timely Optimized Static Regression, Time Series Regression, Parametric Random Walk Model and Geometric Brownian Motion. The theory draws on a wide range of fields including Cloud Computing, Utility Computing, Economics, Econometrics, Mathematical Modeling, Operations Research, Mathematical Finance, Game Theory and Industrial Organization, and it may also be applied in domains other than the cloud industry.
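Of the models named above, Geometric Brownian Motion is the most standard, and a generic textbook discretization is sketched below for orientation. It is not the specific formulation used in [19]; the function name `gbm_path` and the parameter values are illustrative assumptions.

```python
import math
import random


def gbm_path(s0: float, mu: float, sigma: float, steps: int,
             dt: float = 1.0, seed: int = 0) -> list:
    """Simulate S_{t+dt} = S_t * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z)."""
    rng = random.Random(seed)
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)                      # standard normal shock
        drift = (mu - 0.5 * sigma ** 2) * dt
        path.append(path[-1] * math.exp(drift + sigma * math.sqrt(dt) * z))
    return path


prices = gbm_path(s0=1.0, mu=0.01, sigma=0.2, steps=30)
print(min(prices), max(prices))
```

Because each step multiplies by an exponential, simulated prices stay strictly positive, which is one reason the process is a common choice for price modeling.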

3 Objectives and Approaches

Objective 1: Identify the incentive factors and possible misbehaviors

To develop incentive elicitation and behavior induction mechanisms, a clear understanding of the incentive factors for using anonymity technologies and of potential misbehaviors is a precondition. The examples described above are merely the tip of the iceberg. Exploring demand in more specific domains, discovering real privacy concerns and investigating how far users would accept a market for anonymity resources are essential to the research; whether the market should be explicit or hidden is also an open question. A typical starting point is to conduct a survey and/or arrange interviews with current and potential users of various anonymous communication systems, including Tor, Morphmix, Tarzan, Mixmaster and so forth, regarding privacy concerns and observed malicious behaviors. Another approach is to read the archives of existing systems and look for recorded problems or weaknesses caused by undesirable user behaviors.

Incentive factors and misbehaviors should be classified by significance or other criteria. For example, there is a stronger incentive to anonymize the use of mobile services that may reveal location information, whereas there is a weaker incentive to anonymize browsing of websites with non-sensitive content.

Objective 2: Determine levels of resource use involved in anonymity technologies

The resources used to realize anonymous communication must be clearly identified before a market can be established, because these resources are the goods to be purchased or exchanged. Different technologies may involve different resources: in some systems resources are provided by dedicated service providers (e.g. Longitude), whereas in others they are supplied by volunteers (e.g. Tor). Low-latency technologies usually involve CPU, memory, bandwidth and data traffic, while high-latency ones normally also involve disk space to buffer messages or data. Furthermore, different technologies may be associated with different levels of resource use. Classifying existing anonymity systems by feature and examining them type by type may be a good approach.

Objective 3: Design virtual currency policies and pricing mechanisms

Once incentive factors and possible misbehaviors are identified and resource usage information is determined, the introduction of market-oriented mechanisms can begin. A typical market-based mechanism should contain a series of virtual currency policies and pricing mechanisms. The former consist of an initial currency distribution policy, a free rider’s auto-generation policy, a contributor’s earning policy and a misbehavior penalty policy, whereas the latter define how the price of a particular resource is determined and how it changes with respect to demand and supply. Moreover, resources could be bundled instead of being treated separately. In fact, the contributor’s earning policy and the pricing mechanisms may be better analyzed together. Incentive elicitation and behavior induction center on stimulating volunteers’ contributions with benefits and limiting misbehavior with penalties. The design of pricing mechanisms would probably involve mathematical modeling of demand and supply as well as relevant work from operations research (to maximize total profit or utility). My master’s thesis should be very helpful here. In addition, material on algorithmic game theory and multi-agent systems should be very relevant.
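One simple way a price could respond to demand and supply is a proportional adjustment rule, sketched below. This is only one candidate among many and is not a design decision of this proposal; the function name `update_price`, the sensitivity parameter and the price floor are all hypothetical.

```python
def update_price(price: float, demand: float, supply: float,
                 sensitivity: float = 0.1, floor: float = 0.01) -> float:
    """Raise the price under excess demand and lower it under excess supply."""
    if supply <= 0:
        raise ValueError("supply must be positive")
    excess = (demand - supply) / supply          # relative excess demand
    return max(floor, price * (1.0 + sensitivity * excess))


price = 1.0
for demand, supply in [(120, 100), (150, 100), (60, 100)]:
    price = update_price(price, demand, supply)
    print(demand, supply, round(price, 4))
```

The floor keeps the price positive so that contributing resources always earns something, which ties the pricing mechanism back to the contributor’s earning policy.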

Objective 4: Develop market-oriented systems to assess practical values

The final market-based system should contain an operating module, a currency policy module and a pricing module. The operating module consists of a number of monitors that collect resource usage statistics, and possibly also a central controller that makes sure the entire system works properly. A central controller could make currency policies and pricing mechanisms easier to execute, but it would introduce the problem of global synchronization, because partial updates of market information may jeopardize effective monetary policies and pricing mechanisms. Decentralized policy governance, on the other hand, solves the synchronization problem but may make execution more difficult. Both options need to be analyzed carefully on a case-by-case basis, where each case could be one of the current anonymity technologies. A significant number of experiments should be conducted, both in simulated environments and on real anonymous communication systems. The design of currency policies and pricing mechanisms may also be revisited and optimized as needed.

Objective 5: Investigate alternative incentive-driven approaches

Besides the market-oriented approach to inducing user behaviors, there may also be a number of alternatives. The example given in the Introduction was investigating incentives for using anonymous communication in more specific domains. These alternative approaches could help shape the research on incentive elicitation and behavior induction, probably from a broader perspective. A starting point could be a detailed analysis of the incentive factors that have been identified but not applied in market-based mechanisms. For instance, a market-based solution may be difficult to apply to privacy preservation through a dedicated anonymity technology provider, as users do not give anything in return.

4 Potential Threats

Threat 1: Sabotage by government and/or other parties

Governments may try to prevent the use of anonymity technologies such as Tor in order to, for example, better enforce an Internet censorship policy. Other organizations may also try to limit anonymous communication in some local domain for various reasons. Governments and other powerful organizations usually possess large numbers of computers and/or other computing devices. If they deploy their resources in a market-based anonymous network, they may become a major shareholder of the total resources. In that case, they may be able to sabotage the economic system with their dominant power. In fact, this is a key challenge for the virtual currency system and the pricing system, and it requires careful analysis during the design of the market mechanisms.

Threat 2: Potential of undesirable strategic behaviors

In a market-based anonymous network, strategic trading behaviors may always occur, and some of them may be undesirable. A positive strategic behavior, for instance, could be contributing to the network when resources are abundant in order to secure enough virtual currency to purchase large-scale anonymity service during a resource shortage. An example of undesirable strategic behavior is buying up a large amount of service for a particular time interval and letting the remaining users approach the buyer at a higher price.

In [1], the authors identified four types of undesirable strategic behavior in auction-based resource allocation systems. Although not all of them may be relevant here, they point to the risks of such behaviors in a market-based system. Threat 2 differs from Threat 1 in that the troublemaker does not intend to obstruct anonymity technology but merely wants to “make money” to improve his/her own experience. In that sense, everyone may have an incentive to do so. This will be another challenge for the virtual currency system and pricing mechanisms.

5 Plans for the First Year

The first year will begin with a careful reading of the literature on existing anonymity technologies and user incentives, as well as potential and existing misbehaviors in each anonymity system. A careful analysis of resource use for each technology will also be carried out. Meanwhile, I will read more material on hybrid approaches that combine computer science and economics to tackle real problems. At this stage I cannot claim to know this material thoroughly, but I am familiar enough with these lines of work to know where to start and in which directions to continue. My previous work on economic and financial aspects of cloud computing could be very useful. I expect many similarities in the modeling of demand and supply as well as in pricing algorithms. Furthermore, I have working knowledge of economics, econometrics, operations research and mathematical finance, and the pricing system developed in my master’s project has been adopted and is now running on top of the commercial Imperial College Cloud Computing Platform (IC-Cloud), which demonstrates the practical value of my pricing theory. In addition, I remain in frequent contact with Prof. Yike Guo and Dr. Daniel Kuhn, both of whom are experts in economic and financial aspects of computing science at Imperial College London. They could provide ideas to help with my research.

References

[1] Alvin AuYoung, Philip Buonadonna, Brent N. Chun, Chaki Ng, David C. Parkes, Jeffrey Shneidman, Alex C. Snoeren, and Amin Vahdat. Two auction-based resource allocation environments: Design and experience. In Rajkumar Buyya and Kris Bubendorfer, editors, Market Oriented Grid and Utility Computing, chapter 23. Wiley, 2009.

[2] David Chaum. Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM, 24(2), February 1981.

[3] Roger Dingledine, Nick Mathewson, and Paul Syverson. Tor: The second-generation onion router. In Proceedings of the 13th USENIX Security Symposium, August 2004.

[4] Changyu Dong and Naranker Dulay. Longitude: A privacy-preserving location sharing protocol for mobile applications. In Ian Wakeman, Ehud Gudes, Christian Damsgaard Jensen, and Jason Crampton, editors, IFIPTM, volume 358 of IFIP Publications, pages 133–148. Springer, 2011.

[5] Michael J. Freedman and Robert Morris. Tarzan: A peer-to-peer anonymizing network layer. In Proceedings of the 9th ACM Conference on Computer and Communications Security (CCS 2002), Washington, DC, November 2002.

[6] David M. Goldschlag, Michael G. Reed, and Paul F. Syverson. Hiding Routing Information. In R. Anderson, editor, Proceedings of Information Hiding: First International Workshop, pages 137–150. Springer-Verlag, LNCS 1174, May 1996.

[7] M. Greenberger. The priority problem and computer time sharing. Management Science, 12(November):888–906, 1966.

[8] Ceki Gülcü and Gene Tsudik. Mixing E-mail with Babel. In Proceedings of the Network and Distributed Security Symposium - NDSS ’96, pages 2–16. IEEE, February 1996.

[9] Uli Harder and Fernando Martinez Ortuno. Simulation of a peer to peer market for Grid Computing. In The 15th International Conference on Analytical and Stochastic Modelling Techniques and Applications, ASMTA 2008, volume 5055 of Lecture Notes in Computer Science, pages 234–248, 2008.

[10] Uli Harder and Fernando Martinez Ortuno. A more realistic Peer-to-Peer Grid Market Model. In EPEW’09, 6th European Performance Engineering Workshop Imperial College London, 9-10 July, volume 5652 of Lecture Notes in Computer Science, July 2009.

[11] M. Macías and J. Guitart. A genetic model for pricing in cloud computing markets. In 26th Symposium on Applied Computing, 2011.

[12] Fernando Martinez Ortuno, Peter G. Harrison, and Uli Harder. A Markovian Futures Market for Computing Power. In WOSP/SIPEW, January 2010.

[13] Ulf Möller, Lance Cottrell, Peter Palfrader, and Len Sassaman. Mixmaster Protocol — Version 2. IETF Internet Draft, July 2003.

[14] Marc Rennhard and Bernhard Plattner. Practical anonymity for the masses with morphmix. In Ari Juels, editor, Proceedings of Financial Cryptography (FC ’04), pages 233–250. Springer-Verlag, LNCS 3110, February 2004.

[15] Sven Seuken, Denis Charles, Max Chickering, and Sidd Puri. Market Design and Analysis for a P2P Backup System. In Proceedings of the ACM Conference on Electronic Commerce (EC’10), 2010.

[16] Sven Seuken, Kamal Jain, and David C. Parkes. Hidden Market Design. In Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI’10), pages 1498–1503, 2010.

[17] I. E. Sutherland. A futures market in computer time. Commun. ACM, 11:449–451, June 1968.

[18] M. Wooldridge. Introduction to Multi-agent systems. Wiley, 2nd edition, 2009.

[19] Z. Xie. Cloud pricing models (distinction award). Master’s thesis, Imperial College London, 2011.

[20] Z. Xie. Economic perspectives of cloud computing. In Proceedings of the 4th IEEE International Conference on Utility and Cloud Computing (UCC 2011), 2011.
