6.4 Qualitative feedback
6.4.3 Coverage vs Rating
In this section we investigate whether queries with high rating scores also have high coverage scores, and whether queries with low ratings are associated with low coverage scores. To do so, the coverage scores are compared with the ratings that users provided for the queries. The underlying assumption is that, for queries that are easy to understand, it should be easier to retrieve documents that have good coverage of relevant topics, provided that such documents are contained in the collection.
To enable a comparison between coverage and ratings, we take the average of the rating scores for the top 3 query-recommendation pairs for each query, to generate a single rating score per query. This is necessary because the coverage scores are given for individual queries. Figure 6.14 is a heat map that compares the coverage scores given to queries with their rating scores. The first column contains the coverage scores for the queries. The following columns contain the rating scores for the CONCEPTBASED-QR, HYBRID and BOW methods respectively.
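The averaging step described above can be sketched as follows. This is a minimal illustration rather than the code used in the evaluation: the function name `ratings_per_query`, the tuple layout of the input rows and the example rating values are all assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean


def ratings_per_query(rows, top_k=3):
    """Collapse (q, r) pair ratings into one rating per (query, method).

    rows is an iterable of (query_id, method, rank, rating) tuples, where
    rank is the position of the recommendation in the method's result list.
    This layout is hypothetical and chosen only for illustration.
    """
    grouped = defaultdict(list)
    for query_id, method, rank, rating in rows:
        grouped[(query_id, method)].append((rank, rating))

    averaged = {}
    for key, pairs in grouped.items():
        # Keep the top_k best-ranked pairs, then average their ratings.
        top = [rating for _, rating in sorted(pairs)[:top_k]]
        averaged[key] = mean(top)
    return averaged


# Made-up ratings for one query and one method.
example = [("q1", "BOW", 1, 4), ("q1", "BOW", 2, 5), ("q1", "BOW", 3, 3)]
print(ratings_per_query(example))
```

Averaging over a fixed number of pairs keeps the rating scores comparable across queries, since each query then contributes exactly one value per method to the comparison with its coverage score.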
In plotting the heat map, the coverage scores were sorted in descending order. Each row represents the entry for a single query. The columns for the 3 methods are aligned by query ID, so each row captures the same query across the coverage score and each of the 3 methods. This allows us to gain insights from specific queries. Figure 6.14 highlights the cases where HYBRID gets it right or wrong when deciding to use either BOW or CONCEPTBASED-QR to refine a query. This is because HYBRID takes on the colour of either BOW or CONCEPTBASED-QR, which allows us to see which method HYBRID has been generated from. The heat map also allows us to see when HYBRID is better than BOW or CONCEPTBASED-QR.
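The row layout of the heat map can be sketched as below, assuming the coverage scores and the per-query rating scores are held in dictionaries keyed by query ID. The names `heatmap_rows` and `hybrid_matches` and both dictionary layouts are hypothetical, introduced only for this sketch.

```python
METHODS = ("CONCEPTBASED-QR", "HYBRID", "BOW")


def heatmap_rows(coverage, ratings):
    """Build heat map rows sorted by coverage score in descending order.

    coverage maps query_id -> coverage score.
    ratings maps (query_id, method) -> averaged rating score.
    Each row is (query_id, [coverage, rating per method in METHODS order]),
    so every row describes the same query across all columns.
    """
    ordered = sorted(coverage, key=coverage.get, reverse=True)
    return [
        (q, [coverage[q]] + [ratings[(q, m)] for m in METHODS])
        for q in ordered
    ]


def hybrid_matches(ratings, query_id):
    """List the base methods whose rating equals HYBRID's for this query."""
    hybrid = ratings[(query_id, "HYBRID")]
    return [
        m for m in ("CONCEPTBASED-QR", "BOW")
        if ratings[(query_id, m)] == hybrid
    ]
```

The resulting list of rows could then be handed to any plotting library that renders a matrix as a colour grid. The second helper mirrors how the figure is read: a HYBRID cell sharing its value with a base method suggests that HYBRID's refinement came from that method for the query.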
Figure 6.14: Visualisation of the coverage scores per query for each method
high coverage also have high rating scores for all the methods. This result is as one would expect: for queries that are easy to understand, it is also easy to retrieve good documents, provided that such documents are contained in the collection. Hence the documents retrieved for such queries are rated highly by users. At the point marked a, all the methods have the same rating score of 3 for the documents retrieved by that query. Although the rating score given is 3, the users reported that the documents shown for this query had complete coverage of topics relevant to the query. This indicates that users were satisfied with the documents that were recommended for this query.
The standard BOW approach has less consistent performance overall when coverage and ratings are compared. BOW has some queries with bad retrievals, where all 3 (q, r) pairs have low ratings, hence the full red colour on some of its queries, as seen at the points marked b, c, and d. The ratings for these queries are low even though the queries are in the pale green segment, which contains queries with good coverage. On such queries CONCEPTBASED-QR is not red, so the documents produced by CONCEPTBASED-QR may have influenced the coverage score for these queries.
At the bottom of the heat map, in the yellow-orange and red segments, queries which have low ratings also tend to have low coverage. Such queries may either be difficult to understand or lack relevant documents in the collection. However, at the points marked e, f and g, CONCEPTBASED-QR produces better documents with higher ratings compared to HYBRID and BOW. The CONCEPTBASED-QR method takes advantage of its extra domain knowledge to refine queries that are complex and still find relevant documents for such queries. So, CONCEPTBASED-QR is able to perform well on queries that have low coverage scores. These low coverage scores may be due to the documents produced by BOW and HYBRID. Overall, queries with high ratings generally have a high coverage score, while queries with low ratings tend to have low coverage scores, for the CONCEPTBASED-QR and HYBRID methods.
6.5 Summary
A user evaluation of the HYBRID, CONCEPTBASED-QR, and BOW query refinement methods has been presented in this chapter. The evaluation is not a typical user trial; instead, it is a relevance judgement exercise that employed knowledgeable users. The design of the user evaluation, the evaluation metrics used and the results have also been discussed. The e-Learning recommender system developed in Chapter 5 was employed for the user evaluation task. The evaluation performed was two-fold: first, an evaluation of the relevance of recommendations made by the three methods; second, an evaluation of the coverage of relevant topics across the documents rated.
A collection of queries and a dataset of Machine Learning and Data Mining documents were used for the evaluation. The evaluation system was hosted online for 8 weeks. There were 22 users who evaluated 105 queries and provided ratings for 521 query-recommendation (q, r) pairs. Users completed a questionnaire at the start of the evaluation, which provided data about their expertise and experience in the Machine Learning and Data Mining domain. All the users had at least an MSc degree. The questionnaire data allowed us to gain useful insights from our results.
The evaluation was designed to allow users to provide unbiased judgements on the methods. The order of documents shown to users was randomized to prevent the bias of earlier documents being regarded as more relevant than those shown further down the list. The evaluation design also prevented a potential bias towards a method, because each user evaluated the (q, r) pairs for a query for all methods without knowing which method was being evaluated. The nature of the HYBRID method meant its rating scores could be generated from the ratings of either the CONCEPTBASED-QR or BOW query refinement methods.
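One way such a randomized, method-blind presentation could be arranged is sketched below. It is an illustrative assumption and not a description of the evaluation system itself: the names `blinded_pool` and `ratings_by_method` are hypothetical, and pooling duplicate documents across methods is a design choice made for this sketch.

```python
import random


def blinded_pool(recs_by_method, seed=None):
    """Pool the recommendations of all methods for one query and shuffle them.

    recs_by_method maps method name -> list of document IDs.
    Returns the shuffled list shown to the user (without method labels)
    and a lookup from document ID to the set of methods that produced it.
    """
    sources = {}
    for method, docs in recs_by_method.items():
        for doc in docs:
            sources.setdefault(doc, set()).add(method)
    shown = list(sources)
    random.Random(seed).shuffle(shown)  # display order carries no rank cue
    return shown, sources


def ratings_by_method(user_ratings, sources):
    """Map each user rating back to every method that recommended the document."""
    per_method = {}
    for doc, rating in user_ratings.items():
        for method in sources[doc]:
            per_method.setdefault(method, {})[doc] = rating
    return per_method
```

Under this arrangement a document recommended by HYBRID is rated once, and the rating is credited both to HYBRID and to whichever base method produced the same recommendation, which is consistent with HYBRID's scores being derived from those of CONCEPTBASED-QR or BOW.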
Results for the relevance of recommendations showed that HYBRID did best in producing high quality documents. It was observed that the CONCEPTBASED-QR method is particularly good at preventing documents with potentially low ratings from being retrieved. However, the standard BOW method struggled to prevent documents with low ratings from being retrieved, so BOW had the highest number of poor retrievals. Overall, the CONCEPTBASED-QR method had the best performance, producing many good retrievals and the fewest poor retrievals. There was a good level of consensus on the judgements provided for each method by users with different levels of expertise. Results from experts, competent users and beginners all showed that searching with queries refined by the CONCEPTBASED-QR and HYBRID methods produced documents that were consistently more relevant to learners than when the standard BOW method was used.
Evaluation results for the coverage of relevant topics across the documents evaluated showed that most of the documents recommended covered topics that were relevant to the query. Of the entries, 50% stated that documents had good coverage of topics for the query. In addition, 19% of entries stated that documents had complete coverage, while 21% stated partial coverage and only 10% stated limited coverage.
A close examination of some queries that had low ratings as well as low coverage scores revealed that some learner queries can be difficult to understand, as they may not be well written. This poses a challenge for query refinement. One way of addressing this challenge is to explore more query features when designing a HYBRID method. HYBRID uses the features of a query to make a dynamic choice of which method to use for refining that query. The results show that HYBRID performs better than the standard approach, thus highlighting the advantage of exploiting query features to determine a suitable query refinement approach for a query.
A comparison of the relevance and coverage scores generally showed that (q, r) pairs that had high rating scores also had good coverage scores. The user evaluation results demonstrate the effectiveness of using the CONCEPTBASED-QR and HYBRID methods for identifying relevant learning concepts that are employed in refining queries, to help learners find relevant learning materials.
Chapter 7
Conclusions and Future Work
The research presented in this thesis investigates knowledge-driven approaches for supporting e-Learning recommendation to enable learners to find relevant learning materials. This chapter discusses the contributions made and how this research can be applied to other domains. The achievement of the research objectives is presented, and some ways that this research can be taken forward are discussed. The chapter ends with a reminder of the key insights of this research.
7.1 Contributions
This research has developed techniques that support e-Learning recommendation by helping learners to ask effective queries and find relevant documents from the large amounts of learning materials that are currently available. Two key issues that make e-Learning recommendation challenging were identified. The first issue was the need to provide a shared vocabulary for teaching experts and learners, in order to support the representation of learning materials and enable the materials to be more accessible during recommendation. Tackling this issue enabled us to address the semantic gap that learners face. The second issue was the need to help learners to identify relevant learning topics and craft effective queries when trying to find relevant learning materials. Solving this issue enabled us to address the intent gap faced by learners.
A key contribution of this research is the creation of background knowledge which contains important information that can be employed for general understanding and problem-solving in a given domain. The background knowledge creation process generates a set of domain concepts containing concept labels and their respective concept descriptions. The background knowledge captures important domain concepts as highlighted by teaching experts, thus providing a shared vocabulary for teaching experts and learners.
The domain concepts are used to underpin the representation of learning materials. Using learning concepts for the representation of learning materials allows the retrieval to focus on documents that contain relevant concepts. The method for representing documents using domain concepts was presented in §4.1. In this approach, the domain concepts are used to underpin the similarity between documents. The evaluation results show that employing domain concepts to represent learning materials supports e-Learning recommendation by enabling relevant materials to be more accessible for retrieval.
The concept vocabulary is also employed in the development of suitable methods for the refinement of learners’ queries. By using domain concepts, we are able to help learners to identify relevant learning concepts and use the concept vocabulary to inject knowledge of intent when refining learners’ queries. The query refinement method that uses domain concepts is presented in §5.2. The result of applying this method is the creation of effective queries that can be used to find and retrieve relevant materials for learners. A HYBRID query refinement method is developed to cater for queries that can be specific or generic. The HYBRID method automatically determines which method to use for refining a query based on the features of the query. This method is presented in §5.5. The evaluation results show that harnessing the knowledge of domain concepts for refining queries helps learners to seek relevant documents.
The learning domain used to test the methods developed in this research has been Machine Learning and Data Mining, an area in which the author is knowledgeable. However, the methods can easily be applied for e-Learning recommendation in other domains. Given a collection of learning materials, background knowledge for the new domain would be created using data sources such as: TOCs of eBooks, for generating concepts; a domain lexicon, for verifying concepts; and an encyclopedic source, such as Wikipedia or DBpedia, for generating concept descriptions. The new background knowledge would then be embedded in the e-Learning recommender system described in §5.1, and used for recommendation in the new domain.