
CHAPTER 3 – Method

3.2. Ask opinion from the students

A question that asks students for their opinion about a concept receives varied answers, because each student thinks differently. Even a question built around general or particular concepts can be answered in different styles. This kind of question is therefore difficult to grade, whether manually or automatically.

The previous work in automated grading closest to this type of question is on automated essay grading systems (Bin, Jun, Jian-Min, & Qiao-Ming, 2008; Perera, Perera, & Weerasinghe, 2016; Phandi, Chai, & Ng, 2015; Shehab et al., 2017; Wonowidjojo, Hartono, Frendy, Suhartono, & Asmani, 2016). These systems mark an essay with classification techniques that use features of the essay, such as grammar, word vectors, or word count. Most of them relied on supervised machine learning techniques and on more than 50 samples of data. However, not all courses in higher education have more than 50 students, so these methods are not suitable for small classes.

Therefore, this study explored a different approach, clustering, to assess this type of question. Clustering does not need training data during the learning process. It groups items that share particular characteristics and are similar to one another, and places dissimilar items in different groups. The features used in this research are the sentiment polarity, the length of the answer, and the TF-IDF word vectors. Figure 10 below shows the process for this question type.

Figure 10 Main Process of Question Type 2

The process receives a folder and the answer files inside it. Next, sentiment analysis is performed to gather information about each student's opinion on the topic asked in the question. Sentiment analysis is a text mining application that determines the feeling expressed in a text. The feeling is classified as positive, negative, or neutral (Justicia De La Torre, Sánchez, Blanco, & Martín-Bautista, 2018; Kent, 2014). The analysis also returns a polarity confidence value. The opinion and the polarity confidence value are retained for the clustering. After the sentiment analysis, the length of each answer is calculated so that it can be included in the clustering process.
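As a minimal sketch of how these per-answer values could be collected for clustering: the sentiment tool is treated as an external step whose output (a polarity label and a confidence) is already available. The numeric encoding of the polarity, the word-based length, and the names `SentimentResult` and `answer_features` are assumptions made for illustration, not details taken from the study.

```python
from dataclasses import dataclass
from typing import List

# Assumed numeric encoding of the three sentiment classes (illustrative only).
POLARITY_CODE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


@dataclass
class SentimentResult:
    """Output of an external sentiment analysis step (hypothetical container)."""
    polarity: str      # "positive", "negative", or "neutral"
    confidence: float  # polarity confidence, assumed to lie in [0, 1]


def answer_features(text: str, sentiment: SentimentResult) -> List[float]:
    """Return [polarity code, polarity confidence, answer length]."""
    length = len(text.split())  # assumption: length is counted in words
    return [POLARITY_CODE[sentiment.polarity], sentiment.confidence, float(length)]


features = answer_features(
    "I think recursion makes the code easier to read.",
    SentimentResult(polarity="positive", confidence=0.87),
)
print(features)
```

Because the length is on a much larger scale than the other two values, some form of normalization would probably be needed before clustering; the source does not say whether one was applied.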

Then, the word vectors are created. In this phase, the answers are pre-processed using NLP techniques, as displayed in Figure 11. The result of this subprocess is a term-document matrix (TDM) based on TF-IDF.

Figure 11 Inside Create Word Vectors Process

• “Change to lowercase” transforms all letters into lowercase because the TF-IDF calculation is case-sensitive. Lowercasing merges words that differ only in case and so reduces the number of distinct words.

• “Tokenize” splits the text into a list of words, using non-letter characters as the delimiter. As a result, all numbers, symbols, and whitespace are removed.

• The tokens are filtered by length. In this study, a token must have three or more characters to be considered meaningful information for the clustering.

• Stop words are also removed so that only relevant words enter the calculation.

• Some words share the same root. Therefore, stemming is applied to reduce the number of distinct words. This study used the Porter algorithm for stemming.

• Because a combination of two words might carry more useful information than a single word, both unigrams and bigrams are generated and included in the calculation.
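The pre-processing steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the tool chain used in the study: the stop-word list is a tiny placeholder, the text is assumed to be English, the TF-IDF weighting is the plain tf × log(N/df) variant, and stemming is left as a pluggable `stem` function (the identity by default) into which a Porter stemmer, for example `PorterStemmer().stem` from NLTK, could be passed.

```python
import math
import re
from collections import Counter

# Tiny placeholder stop-word list; the list used in the study is not given.
STOP_WORDS = {"the", "and", "are", "for", "that", "this", "with", "because"}


def preprocess(text, stem=lambda word: word, min_len=3):
    """Lowercase, tokenize on non-letters, filter, stem, and add bigrams."""
    tokens = re.split(r"[^a-z]+", text.lower())         # non-letter delimiter
    tokens = [t for t in tokens if len(t) >= min_len]    # also drops empty strings
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop-word removal
    tokens = [stem(t) for t in tokens]                   # stemming (pluggable)
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams                              # unigrams + bigrams


def tf_idf(documents, stem=lambda word: word):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    term_lists = [preprocess(doc, stem=stem) for doc in documents]
    n_docs = len(term_lists)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty answers
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


vectors = tf_idf(["Loops are fast", "Loops are slow"])
print(vectors[0])
```

With this weighting, a term that occurs in every answer (here `loops`) gets weight 0, so it does not influence the clustering.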

Determining the optimal number of clusters is difficult, especially when no exact number is specified beforehand. The Davies-Bouldin (DB) index is one option for finding it. The DB index is a function of the ratio of the sum of within-cluster scatter to between-cluster separation (Bandyopadhyay & Maulik, 2001). A smaller DB index value is preferred because it indicates more compact and well-separated clusters (Visvanathan, Srinivas, Lushington, & Smith, 2009). Therefore, the k value with the smallest DB index value is chosen as the optimal number of clusters.
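The DB index described above can be sketched as follows, using the common formulation in which within-cluster scatter is the mean Euclidean distance of a cluster's points to its centroid and separation is the distance between centroids. The function names, the toy points, and the two candidate labelings are illustrative assumptions. The sketch needs at least two clusters and distinct centroids.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    return [sum(column) / len(points) for column in zip(*points)]


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a clustering; smaller values are better."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centers = {label: centroid(pts) for label, pts in clusters.items()}
    # Scatter: mean distance of each cluster's points to its centroid.
    scatter = {
        label: sum(math.dist(p, centers[label]) for p in pts) / len(pts)
        for label, pts in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst (largest) scatter-to-separation ratio against any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters if j != i
        )
    return total / len(clusters)


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # mixes the two distant pairs
best = min([good, bad], key=lambda labels: davies_bouldin(points, labels))
print(davies_bouldin(points, good), davies_bouldin(points, bad))
```

The same `min` pattern applies to choosing k: cluster the answers once for each candidate k and keep the labeling with the smallest index.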

Finally, the clustering is executed, and a human grader analyses the result. This study used the X-means algorithm to cluster the answers. X-means was proposed by Dan Pelleg and Andrew Moore in 2000 to overcome limitations of the k-means algorithm (Pelleg & Moore, 2000). It is reported to be faster than k-means and determines the number of clusters dynamically within the lower and upper bounds supplied by the user (Kumar & Wasan, 2010). The algorithm efficiently searches the space of cluster locations and numbers of clusters by optimizing the Bayesian Information Criterion (BIC) (Alickovic & Babic) or the Akaike Information Criterion (AIC) measure (Kumar & Wasan, 2010).
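For reference, the two criteria can be written in their standard textbook forms, where $\hat{L}$ is the maximized likelihood of the model, $p$ is its number of free parameters, and $n$ is the number of data points. These symbols are introduced here for illustration and do not appear in the source.

```latex
\mathrm{BIC} = p \ln n - 2 \ln \hat{L},
\qquad
\mathrm{AIC} = 2p - 2 \ln \hat{L}
```

In this form a smaller value indicates a better trade-off between fit and model complexity. Some presentations use a negated and rescaled form that is maximized instead, so the direction of optimization depends on the convention used by a given implementation.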

This study selects the Silhouette index as the measurement metric. The Silhouette index is a validity index that examines the quality of the clusters through cohesion and separation (Errecalde, Cagnina, & Rosso, 2015; Pérez-Delgado, Escuadra, & Antón, 2010). Cohesion indicates how similar the objects within a cluster are, while separation indicates how different a cluster is from the others (Errecalde et al., 2015). A value close to -1 indicates that the object has been assigned to the wrong cluster; a value near 0 means that the object lies on the border of its cluster, and it is not clear whether it really belongs there or should be placed in the neighboring cluster; and a value close to 1, the maximum, denotes a good cluster (Shanie, Suprijadi, & Zulhanif, 2017; Visvanathan et al., 2009).
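The per-object silhouette value can be sketched as follows, using the usual definition s = (b - a) / max(a, b), where a is the mean distance from the object to the other members of its own cluster (cohesion) and b is the smallest mean distance to the members of any other cluster (separation). The function name, the Euclidean distance, the convention of 0 for single-member clusters, and the toy data are assumptions for illustration. The sketch needs at least two clusters and points that are not all identical.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette value of every point, each in [-1, 1]."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # common convention for single-member clusters
            continue
        # Cohesion: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # Separation: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        values.append((b - a) / max(a, b))
    return values


points = [(0,), (1,), (10,), (11,)]
good_values = silhouette_values(points, [0, 0, 1, 1])
bad_values = silhouette_values(points, [0, 1, 0, 1])
print(sum(good_values) / len(good_values), sum(bad_values) / len(bad_values))
```

In this toy example the sensible grouping yields values near 1 for every point, while the mixed grouping yields negative values, which matches the interpretation given above.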
