7. Conclusion
7.2. Contributions
This section briefly revisits the contributions of this thesis, which were first presented in Section 1.4.
The research of this thesis yielded one major contribution and one minor contribution. The major contribution is a set of three user modelling approaches that infer a user’s expertise by exploiting the social content they produce on SNSs: (1) the Language and Social Relation-based Factor Graph Model (LSR-FGM), which exploits the user’s social profiles to infer their language expertise; (2) the Sentiment-weighted and Topic Relation-regularized Learning (SeTRL) model, which exploits the user’s posted tweets on Twitter to infer their topical expertise; and (3) the multi-Data and Topic relatedness Combined (DnTCom) learning model, which exploits multiple types of user data on Twitter to better infer the user’s topical expertise.
LSR-FGM is a learning model that predicts the languages that cold start users comprehend from their static social profiles. LSR-FGM is novel in that it is the first attempt to acquire online users’ language information via an analysis of their social experiences rather than their writing or reading history. It advances the state of the art because, in addition to the textual attributes of social profiles, it incorporates two structural factors, namely dependency relations between languages and social relations between users, to improve language prediction accuracy. LSR-FGM offers the community an effective approach to language prediction for cold start users that exploits the limited information available about them, i.e. their static social profiles. Experiments conducted with LSR-FGM demonstrate a correlation between a user’s language expertise and the proposed structural factors. This provides a basis for the development of more advanced approaches in this area.
SeTRL is a learning model that predicts the topical expertise of active Twitter users from their previously posted tweets. SeTRL advances the state of the art because it uses a novel, tweet-sentiment-analysis-based approach to evaluate the importance of user features (extracted from their tweets) in expertise inference, which has been shown to be superior to traditional word-frequency-based approaches. SeTRL also incorporates prior knowledge of the relations between expertise topics into the inference process in order to deliver better prediction performance. SeTRL offers the community an effective approach to inferring Twitter users’ topical expertise from their posted tweets. The observed impact of tweet sentiment and topic relations on expertise inference provides a research foundation for future study in this area.
Building on SeTRL, the multi-Data and Topic relatedness Combined (DnTCom) learning model predicts the topical expertise of Twitter users through the collective use of their various activities on Twitter (including previously posted tweets). DnTCom advances the state of the art (including SeTRL) because it can deliver effective inference as long as some type of user data is available. It takes multiple types of user data as input for expertise inference and thereby overcomes the problem that many Twitter users rarely conduct certain types of activity, e.g. they never post any tweets, which causes approaches that rely on a single type of user activity to fail. In addition, DnTCom takes into consideration the consistency of inference across different types of user data in order to improve inference performance. DnTCom offers the community an effective approach to inferring the topical expertise of a larger number of Twitter users than SeTRL can cover: inference can be performed as long as the user has conducted some type of activity on Twitter.
The minor contribution is the design and implementation of experiments that investigate the usefulness of the proposed expertise modelling approaches in a real-world application. This research takes the popular CQA site Quora as the target application and aims to improve answerer finding on the platform through the use of the proposed expertise modelling approaches. Experimental data is harvested to simulate the application scenario, in which traditional answering-history-based approaches do not consider cold start users when finding potential answerers for new questions, even though many of these users have linked their Quora accounts to their social media accounts. An answerer finding approach is proposed that models a cold start user’s expertise based on their Twitter activities (using the modelling approaches proposed in this thesis) and then matches the user with newly posted questions on Quora. Evaluation methods are designed to test the performance of the new answerer finding approach, and the results show that it outperforms the answering-history-based approach when fewer than approximately 40 answered questions are considered for user expertise modelling. This means that the proposed user expertise modelling approaches are useful for modelling the expertise of cold start users on CQA sites and enable their inclusion in the pool of potential answerers. The evaluation experiments not only further show the significance of the proposed user modelling approaches, but also provide a concrete example of how they can be used in further application scenarios.