Lessons from this framework could be applied to a parallel line of future work on “interpretable and explanatory machine learning and data mining”. In recent times there has been growing concern over algorithmic decision making. Machine learning based systems, such as online recommendation systems, act like black boxes: they often make decisions on behalf of users without informing them, leaving users unaware of the personalization employed as well as of the various choices available. Recently, many online platforms have attempted to explain their decision making in human terms. For example, to explain its content-based recommendations, Netflix offers “because you watched”. To explain its collaborative filtering, Amazon offers “customers who bought xxx, also bought yyy”. While these are steps in the right direction, gaps remain unaddressed. For instance, both of these examples focus on explaining what was filtered IN by the algorithm. However, it is equally, or in fact more, important that the user knows what was filtered OUT. We believe the case studies presented in chapter 6 will motivate the reader to extend the model in this direction.
Further, the system discussed in chapter 6 can be readily extended to build futuristic social networks that allow users to visually explore the content related to a topic and self-adjust the diversity of their content consumption. We believe such systems will be of high value in designing social media platforms that encourage discussion and debate between users of diverse viewpoints. We call such social networks Next Generation Social Networks: “social networks which help reduce the ideological segregation of users instead of reinforcing it”.
Lastly, I believe this research is a small step towards broader issues in the ethics of big data that are on the rise and will continue to rise in the next few years. To give an example, the European Union’s General Data Protection Regulation (GDPR 2018) [1] covers two key aspects of data and algorithms: “algorithmic decision-making” and the “right to explanation”. The GDPR policy on the “right to explanation” would require algorithms to provide users with an explanation for algorithmic decisions made for them. While this policy poses large challenges for the industry, it highlights the gap between legal aspirations and technical realities. At present, there is a large disparity between the complexity of algorithms and the desired legal frameworks. Perhaps some of the results of this research could be used to reduce this gap, as well as to guide technology and industry, for example, in adapting to the GDPR policy if and when it becomes a reality in 2018.
REFERENCES
[1] Parliament and Council of the European Union (2016). General Data Protection Regulation.
[2] Blue Feed, Red Feed. accessed 25-June-2017, http://graphics.wsj.com/blue-feed-red-feed/, 2012.
[3] Escape Your Bubble. accessed 25-June-2017, https://www.escapeyourbubble.com/, 2012.
[4] so-is-mediaite-conservative-or-libera. accessed 25-June-2017, http://www.dan-abrams.com/so-is-mediaite-conservative-or-liberal/, 2012.
[5] L. Akoglu. Quantifying political polarity based on bipartite opinion networks. In ICWSM, 2014.
[6] J. An, D. Quercia, and J. Crowcroft. Partisan sharing: facebook evidence and societal consequences. In Proceedings of the second ACM conference on Online social networks, pages 13–24. ACM, 2014.
[7] E. Bakshy, S. Messing, and L. Adamic. Exposure to diverse information on facebook. Facebook Research Blog, 2015.
[8] P. Barberá, J. T. Jost, J. Nagler, J. A. Tucker, and R. Bonneau. Tweeting from left to right: Is online political communication more than an echo chamber? Psychological science, 26(10):1531–1542, 2015.
[9] D. Cai, X. He, X. Wu, and J. Han. Non-negative matrix factorization on manifold. In Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on, pages 63–72. IEEE, 2008.
[10] M. Conover, J. Ratkiewicz, M. R. Francisco, B. Gonçalves, F. Menczer, and A. Flammini. Political polarization on twitter. ICWSM, 133:89–96, 2011.
[11] C. Ding, X. He, and H. D. Simon. On the equivalence of nonnegative matrix factorization and spectral clustering. In Proceedings of the 2005 SIAM International Conference on Data Mining, pages 606–610. SIAM, 2005.
[12] C. Ding, T. Li, W. Peng, and H. Park. Orthogonal nonnegative matrix t-factorizations for clustering. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 126–135. ACM, 2006.
[13] S. Dori-Hacohen, M. Jang, and J. Allan. Is climate change controversial? modeling controversy as contention within populations. arXiv preprint arXiv:1703.10111, 2017.
[14] S. Flaxman, S. Goel, and J. M. Rao. Filter bubbles, echo chambers, and online news consumption. Public Opinion Quarterly, 80(S1):298–320, 2016.
[15] K. Garimella, G. De Francisci Morales, A. Gionis, and M. Mathioudakis. Quantifying controversy in social media. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pages 33–42. ACM, 2016.
[16] K. Garimella, G. De Francisci Morales, A. Gionis, and M. Mathioudakis. Mary, mary, quite contrary: Exposing twitter users to contrarian news. In Proceedings of the 26th International Conference on World Wide Web Companion, pages 201–205. International World Wide Web Conferences Steering Committee, 2017.
[17] K. Garimella, G. De Francisci Morales, A. Gionis, and M. Mathioudakis. Reducing controversy by connecting opposing views. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pages 81–90. ACM, 2017.
[18] M. Gentzkow and J. M. Shapiro. Ideological segregation online and offline. The Quarterly Journal of Economics, 126(4):1799–1839, 2011.
[19] E. Graells-Garrido, M. Lalmas, and D. Quercia. Data portraits: connecting people of opposing views. arXiv preprint arXiv:1311.4658, 2013.
[20] Q. Gu and J. Zhou. Co-clustering on manifolds. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 359–368. ACM, 2009.
[21] K. Hosanagar, D. Fleder, D. Lee, and A. Buja. Will the global village fracture into tribes? recommender systems and their effects on consumer fragmentation. Management Science, 60(4):805–823, 2013.
[22] G. Karypis and V. Kumar. Unstructured graph partitioning and sparse matrix ordering system, version 2.0, 1995.
[23] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788–791, 1999.
[24] T. Li and C. Ding. The relationships among various nonnegative matrix factorization methods for clustering. In Data Mining, 2006. ICDM’06. Sixth International Conference on, pages 362–371. IEEE, 2006.
[25] Q. V. Liao and W.-T. Fu. Can you hear me now?: mitigating the echo chamber effect by source position indicators. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing, pages 184–196. ACM, 2014.
[26] H. Lu, J. Caverlee, and W. Niu. Biaswatch: A lightweight system for discovering and tracking topic-sensitive opinion bias in social media. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pages 213–222. ACM, 2015.
[27] A. Morales, J. Borondo, J. C. Losada, and R. M. Benito. Measuring political polarization: Twitter shows the two sides of venezuela. Chaos: An Interdisciplinary Journal of Nonlinear Science, 25(3):033114, 2015.
[28] S. A. Munson, S. Y. Lee, and P. Resnick. Encouraging reading of diverse political viewpoints with a browser widget. In ICWSM, 2013.
[29] E. Pariser. The filter bubble: What the Internet is hiding from you. Penguin UK, 2011.
[30] S. Park, S. Lee, and J. Song. Aspect-level news browsing: understanding news events from multiple viewpoints. In Proceedings of the 15th international conference on Intelligent user interfaces, pages 41–50. ACM, 2010.
[31] M. Prior. Post-broadcast democracy: How media choice increases inequality in political involvement and polarizes elections. Cambridge University Press, 2007.
[32] W. M. Rand. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical association, 66(336):846–850, 1971.
[33] P. Resnick, J. Konstan, and A. Jameson. Panel on the filter bubble. In The 5th ACM conference on Recommender systems, 2011.
[34] P. Resnick, R. K. Garrett, T. Kriplean, S. A. Munson, and N. J. Stroud. Bursting your (filter) bubble: strategies for promoting diverse exposure. In Proceedings of the 2013 conference on Computer supported cooperative work companion, pages 95–100. ACM, 2013.
[35] C. R. Sunstein. Republic.com 2.0. Princeton University Press, 2009.
[36] N. X. Vinh, J. Epps, and J. Bailey. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11(Oct):2837–2854, 2010.
[37] D. Wang, T. Li, S. Zhu, and C. Ding. Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 307–314. ACM, 2008.
[38] C. Ware. Information visualization: perception for design. Elsevier, 2012.
[39] F. M. F. Wong, C. W. Tan, S. Sen, and M. Chiang. Quantifying political leaning from tweets and retweets. ICWSM, 13:640–649, 2013.
[40] J. Yoo and S. Choi. Orthogonal nonnegative matrix tri-factorization for co-clustering: Multiplicative updates on stiefel manifolds. Information processing & management, 46(5):559–570, 2010.
[41] Y. Zhao and G. Karypis. Criterion functions for document clustering: Experiments and analysis. Technical report, 2001.
[42] S. Zhu, K. Yu, Y. Chi, and Y. Gong. Combining content and link for classification using matrix factorization. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 487–494. ACM, 2007.