Clustering is known as unsupervised learning, the concept of clustering has been used for decades in many
fields, such as medical image analysis [268,269], clustering gene expression data [270,271], investigating
and analysing air pollution data [272, 273], power consumption analysis [274, 275], and many more fields
Table 2.9: Summary of recommendation techniques in e-learning systems
Used Recommendation Techniques
Study Cold-Start Issues CF CBF KB HF
Predict relevant learning materials to every learner based on
collaboration with other learners [42].
D No past
preferences D
Recommend the most suitable learning materials for each learner based on the rating similarity with other learners [256].
D No past preferences or No high rated LO in past preferences. D
Finding learning objects suitable for learners’ preferences
(knowledge level and learning style) [257].
D No past preferences D
Recommend course learning objects
based on neighbourhood rating [258]. D
No high rated LO in past preferences D Constructed a course ontology and
retrieved the course according to learner’s learning styles [259].
D D
(+ontology) D
Recommend learning materials [260]. D D D
(+ontology) D Proposed a hybrid recommender
system to recommend learning items in users learning processes [261].
D No past preferences or No high rated LO in past preferences. D D (+SPM algorithm)
Proposed a recommender system to store and share research papers and glossary terms among university students and industry practitioners [262].
D No past
preferences D
Recommend learning contents to users based on similarity between user profiles [263].
D No past
preferences D Course recommendation system
based on ontology and context aware e-learning [264].
D D
(+ontology) Framework for recommending learning
resources based on the learner’s recent navigation history and by comparing similarities and differences among different learner’s
preferences and instructional content available in the e-learning system [265] .
D No past preferences or No high rated LO in past preferences. D D
Recommend learning Materials based
on multidimensional attributes [266]. D
No high rated LO
in past preferences D D Proposed approach for selecting and
sequencing the most appropriate learning objects [267].
D No high rated LO
in past preferences D
D(+association pattern analysis)
hierarchical-based methods. Hierarchical methods found in algorithms such as single link, complete link
and average link, are utilised to generate a nested succession of partitions in two modes: the agglomerative
mode which initiates with a distinct cluster pattern and subsequently merges nearly all identical cluster
pairs resulting in the formation of a cluster hierarchy; or the divisive mode which is initiated with a
single cluster pattern and is followed by splitting every cluster into smaller clusters till the stopping
criteria is encountered. Partitional methods made use of in algorithms including k-means and expectation
maximisation methods detect clusters all at once as a partition and do not inflict any clustering hierarchy.
Clustering algorithms are normally developed based on proximity, i.e. similarity or distance. The
process of clustering can be described as grouping a selection of objects into clusters with respect to
dissimilarities between them. It is important to note that data objects in one cluster are similar to each
other and dissimilar from objects in different clusters.
Several recommendation methods related to partition-based clustering algorithms have been pro-
posed, as shown in Table 2.10. In this thesis, the k-means clustering algorithm has been utilised to
enhance computational efficiency specifically in terms of the quality and preciseness of recommenda-
tions.
As discussed above, the k-means [280] clustering algorithm is based on partition clustering methods
and has seen several applications specifically within the field of personalised e-learning [280], the AHA!
e-learning system being an illustration of such an application [281]. The AHA! e-learning system devel-
ops clusters corresponding to an individual student’s model and cumulates relevant data in an XML file,
where the centroid of each cluster is collected. In other words, a typical user is represented by a centroid
within a cluster.
The k-means approach has found various applications, some of which are described in this section. [282]
utilised k-means to examine students’ behaviours based on an annotation dataset of 40 students. The
k-means algorithm found another application related to students when [283] administered a study to
comprehend the influence of human characteristics on students’ performance while listening to music.
Furthermore, [284] applied the k-means algorithm to aid in building the neighbourhood. In this case,
the neighbourhood was not restricted to the cluster of the user, but the user’s distance to various cluster
centres was used as a pre-selection step for neighbours. Successful results of this effortless and efficient
technology related to clustering were achieved and it was interpreted that the results surpassed that of
KNN-based CF to a higher standard. The k-means clustering algorithm is one of the simplest and most
Table 2.10: An overview of recent researches on adaptive e-learning recommender systems using clustering algorithms
Recommendation Techniques
Study CBF CF HF Method Result
[276] D D D Classification & association rule (Apriori algorithm) & clustering (K-means) K-means clustering, ADTree classification
& Apriori association rule algorithm is the best combination compared with ADTree & Apriori Rule.
[48] D
Clustering (K-means) & association rule
(Apriori algorithm)
K-means clustering and
Apriori association have the best results and improve the accuracy of the course recommendation.
[277] D Clustering (K-means) &
Knowledge-based (ontology)
Improve the LO’s recommendation taking the advantage of K-means clustering according to the student’s prior knowledge.
[278] D
Clustering (K-means) , cosine similarity metrics & Pearson correlation
Improving in learning object recommender accuracy.
[279] D
K-means Bi-Directional based frequent closed sequence mining
The result shows higher
LOs recommendation accurately according to the real-time up dated contextual information.
[152] D Clustering Machine learning Learners in the experimental group complete a course in less time than learners
in the control group.
Standard k-clustering methods which assign objects to the closest cluster are initialised by randomly
choosing k cluster centres, where k is the cluster number. Next, distance methods such as Euclidean
distance (Section 2.7.1) are used to determine the similarity between the data objects and cluster centres.
Each object is then assigned to one of the cluster groups with the closest centre followed by redefining the
cluster centres by finding the mean vector of all objects belonging to each cluster group. Finally, the ob-
jects are reassigned according to their distance to these new cluster centres. This iterative process repeats
until there are no changes in the assignment of objects to cluster groups. The traditional method aims to
by making use of other distance functions excluding the Euclidean function, the algorithm could be pre-
vented from converging. The algorithm is implemented in the following steps.
Step 1: Select K random points from the dataset as initial cluster centroids.
Step 2: Create K clusters by associating each data point with their closest cluster centroid, making
use of the Euclidean distance defined Eq. (2.1).
Step 3: Recalculate the centroid of every cluster as the mean of all data points in that cluster.
Step 4: Repeat steps 2 and 3 till there is no modification in the centroids.
Enhancing the efficiency of recommender systems, clustering offers an offline feature to partition
the dataset in a way that similar data is placed within the same cluster while dissimilar data is placed in
various clusters, followed by which the algorithm is applied only with the highest similarity to the user.
Some benefits of this algorithm include its simplicity as well as its time complexity (can be used to cluster
large data sets).
The next section presents the similarity metrics often utilised in e-learning recommender systems