We have implemented CAPE as a Facebook application. It is now available at http:
//cape-facebook.herokuapp.com4.
CAPE is hosted on Heroku5, a cloud application platform. It consists of one PHP
application to deploy user interface and connect Facebook API6, as well as one back-
bone Python application with Flask7 to realize the functionality of our mediation
engine. Currently, we adopt Method OO in this implementation of CAPE.
4We should stress that because of the frequent changes in Facebook privacy policy, CAPE might
need to slightly change the settings or migrate to new versions in the future.
5https://www.heroku.com/
6https://developers.facebook.com/docs/graph-api 7http://flask.pocoo.org/
Figure 6-4: CAPE–Login
Figure 6-5: CAPE–PEScores
CAPE requests its user to give the permissions to access the user’s basic infor- mation, his photos and photos shared with him on Facebook. Besides, CAPE also requests at least some of the user’s Facebook friends are also using CAPE and can see the posts he has been tagged. It is because CAPE needs his friends’ participation in the procedure, too.
Briefly, CAPE consists of the following steps. In the first step, the originator needs to login with his Facebook account. Figure 6-4 shows the web interface of CAPE to explain the detailed permission issues and request users to login to Facebook.
After login, CAPE lists all the originator’s friends using CAPE, and asks the user to update the PEScores, as shown in Figure 6-5.
Next, CAPE loads the photos with tagged friends who are also using CAPE. The originator now can configure the I-Scores for each photo. A screenshot of this step is taken as shown in Figure 6-6.
Figure 6-6: CAPE–IScores
After all the players’ configurations are collected, CAPE will present the originator the mediation outcome it derives. Figure 6-7 shows such an example. We should stress that this is an asynchronous procedure. Hence the results may not likely to be available immediately. The user may collect the results next time when he logs in after all the settings have been collected.
6.8 Summary
In this chapter, we have revisited the problem of protecting user privacy in online social networks (OSNs). In particular, we have investigated the design of access con- trol mechanisms for protecting shared content where co-owners may have differing and conflicting privacy preferences. A novel collaborative access control mechanism has been designed. Our key insight is that peer effects should be a key contributing factor to be considered in resolving conflicting preferences. Our proposed framework, CAPE, is based on graph theoretic model, and is able to lead to consensus that is acceptable to the co-owners. Our CAPE framework can be applied to both distance- based and circle-based networks. We have also looked how the peer effects scores should be set to ensure equilibrium. Moreover, we have also discussed how to handle the scenario when a player may not be satisfied with the outcome.
Chapter 7
Conclusion and Future Directions
The goal of research on privacy is to develop mechanisms to protect an individual’s privacy and to prevent unauthorized access or leakage of sensitive data. Effective methods will be able to tame public fears of hidden privacy leakage and bring back the trust over the Internet in this digital era. In recent years, large scale integration between e-commerce tech giants and OSNs is clearly on the upswing. The prevalence of OSN apps in app eco-systems also yields an increasing demand to access users’ data in OSNs. As such, there is a trend to fuse and integrate data. It is hence very urgent to develop faithful and yet efficient privacy-preserving techniques for OSNs, and to do it rapidly.
This thesis is intended to investigate practical techniques to protect OSN users’ privacy. As practitioners, we’ve covered two topics of privacy-preserving practices, one from the enterprise’s point of view and another from that of the individual. In this chapter, we recap the major advances and our contributions on each topic, see how the topics are related in the cutting edge research arena, and point out the main challenges that are emerging in new directions.
7.1 Towards Faithful & Practical Privacy-Preserving OSN
data publishing
In our first two works, we’ve covered two privacy-preserving mechanisms for OSN data publishing, one employinganonymity and another using differential privacy(DP). We also give a coherent view of the overall development in defining information pri-
vacy, that is, how our understanding in information privacy have changed and matured over the last decade.
Recall that anonymity(including randomization, k-anonymity, l -diversity, etc.)
was the first mainstream privacy model adopted by many works. Our first work LORA also falls into this category. LORA considers just to publish simple undirected graphs. However, in real-world scenarios, it is not uncommon to see graphs often
contain other additional information. For example, in [SKX+12], we investigate
the release of networking data where the edges are labeled. Our method adopts l -
diversity as the privacy model. There are also works on graphs that contain weights
and directions on edge[SMG+12; DEA12].
As DP now has become the emerging standard for data publishing, many works employ it for answering summary statistics of the underlying data. For example, we
have looked at publishing counting summaries on streaming binary data in[CXG+13].
There are also numerous works on publishing histograms, trajectories and frequent items counting problems. However, there are so far limited progress for synthetic data approximation, which is particularly obvious for network data. This in part is because of current DP mechanisms’ limitations. But another major reason, we think, is the missing of links between statistics and graph theory. It is still not clear now which summary statistics really capture the entire function of a network.
In our second work, we tend to view the network itself as statistical data, a sample drawn from an underlying distribution. This is particularly meaningful in the real world, since the formation of real-world networks has some elements of randomness. We’ve shown that our method has significant improved accuracy under the same DP level, compared to other state-of-the-art approaches. The intuition behind our ap- proach is that, by mapping a graph to another statistical model space and sampling in the calibrated statistical distribution, we can effectively control the influence caused by the change in the input. Specifically, we can limit the influence only on one parameter in the model while leaving the rest intact. One interpretation is that, even though the network itself is essentially high dimensional, its intrinsic dimension can be very low in most real-world scenarios. The parameters of the model in a high dimension space are often interlocking, the independent components can be more clearly seen once the graph has been transformed into a low dimensional space. However, the
global sensitivity in DP, if not through careful design, can be easily affected by the high extrinsic dimension/network size(akin to the curse of dimension appearing in machine learning). Hence, it is crucial to first reduce the dimension via sampling, approximating, or mapping graph to other feature domains, in order to lay the ground for constructing the low sensitivity.
As such, we hope our methodology can call out further development of methods in this line of work. It will be interesting to see how more existing sampling or approximation methods on graph can naturally fulfill DP, to avoid directly injecting noises into each part of the feature model. In this way, the impact of the previously “prohibitive” sensitivity that result in poor data utility can be diluted through these sampling or approximation processes.
7.2 Integrating data-access policies with differential pri-
vacy
In our third work, we’ve demonstrated a collaborative access control strategy. The main observation is that, in the case where a collective data-access policy is needed, it is common that some OSN users’ decision would be greatly influenced as they consider their peers’ privacy needs. Many works in this line assume, in such scenario, OSN users’ benefits shall be competing with each other. That is, each user tends to selfishly maximize his own gain. However, in contrast, we point out that it is more suitable to assume ONS users tend to be considerate about their friends’ emotional needs. This is more reasonable since OSN users are typically friends. To this end, we’ve designed a framework to simulate emotional negotiation, in which OSN users can adjust their data-access policies regarding such peer effects. We wish our design can function as a knot, providing more flexibility for OSN users in support of constructing a positive, collaborative atmosphere for collective decision-making.
It’s also worth to point out another key feature of our design. That is, our mech- anism is also a data-driven model. The final collective decision depends on how each OSN user perceives his friends, in terms of peer effect scores. Clearly, as OSN users become the data creators, many users’ privacy preferences are data-driven and context- aware. Hence, it is also pressing to enable policies to support this change. More
recently, some researchers propose a few works devoted to bridge such data-access
policy-making strategies with differential privacy [KM12; HMD14]. The authors
advocate to integrate differential privacy with policy-making procedures, by allowing the users to specifysecrets and constraints. The line of works is poised to lead to further development in data-driven access control strategies. We believe it is equally important to develop works in similar spirit for OSN data.