Comparison of Image Thresholding and Clustering
Segmentation Methods for Understanding Nutritional Content of Food Images
Yuita Arum Sari
Faculty of Computer Science University of Brawijaya
Malang, Indonesia [email protected]
Vriza Wahyu Saputra
Faculty of Computer Science University of Brawijaya
Malang, Indonesia [email protected]
Andini Agustina
Faculty of Computer Science University of Brawijaya
Malang, Indonesia [email protected]
Yudi Arimba Wani
Nutrition Department, Medical Faculty University of Brawijaya
Malang, Indonesia [email protected]
Yusuf Gladiensyah Bihanda
Faculty of Computer Science University of Brawijaya
Malang, Indonesia [email protected]
ABSTRACT
In a hospital, nutritionists and dietitians have to pay attention to how much food patients consume, since leftover meals reduce the patients' nutritional intake. Leftover food is usually measured by visual estimation, which may yield different values when different observers evaluate the same leftovers. We propose an automatic estimation of the nutritional content of food that focuses on segmenting food images. Two algorithms are compared: image thresholding and K-means++ clustering, both using the HSV color transformation. The evaluation shows that segmentation with two-step thresholding performs better than K-means++, with accuracies of 95% and 53.44%, respectively. We conclude that the improved image thresholding method robustly identifies the food area that represents the nutritional content.
CCS CONCEPTS
• Computing methodologies~Artificial intelligence~Computer vision~Computer vision problems~Image segmentation
KEYWORDS
Image segmentation, food image segmentation, food images
ACM Reference format:
Yuita Arum Sari, Vriza Wahyu Saputra, Andini Agustina, Yudi Arimba Wani and Yusuf Gladiensyah Bihanda. 2020. Comparison of Image Thresholding and Clustering Segmentation Methods for Understanding Nutritional Content of Food Images. In Proceedings of SIET ’20: 5th International Conference on Sustainable Information Engineering and Technology, November 16–17, 2020, Malang, INDONESIA. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3427423.3427441
1 Introduction
Nowadays, many applications for measuring the nutritional content of food from images are being developed. One of them addresses the problem of monitoring patients' diets [1]. Nutritionists evaluate the food left over by patients so that they can further analyze the patients' condition [2]. The earlier method is weighing with digital scales, but it is inefficient because each item has to be measured one by one on a large scale. Thus, another approach was proposed, the Comstock method [3], which rates the remaining food on scales of three to seven levels. However, it suffers from subjective perception when different observers make the measurement [4], [5]. Besides, the evaluation must be done quickly, and it cannot be repeated later because the food scraps have been discarded and are not documented. The Comstock method was therefore refined by using digital imaging [6]. In the digital imaging method, observers estimate the leftover nutritional content manually from food images. The evaluation can be repeated, since the observer can assess the same image two or more times when needed.
However, this method has the same drawback as the Comstock method, since different observers may perceive the amount of leftover food differently. To reduce this drawback, researchers train the observers to obtain good interobserver reliability, but an innovation is still needed to improve efficiency.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
SIET '20, November 16–17, 2020, Malang, Indonesia
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7605-1/20/09…$15.00
https://doi.org/10.1145/3427423.3427441
This paper aims to minimize the subjective perception by analyzing food images with an image segmentation approach in order to understand the nutritional content of a food image. In previous research, image segmentation has proven challenging because it is sensitive to the lighting environment. In [4], the segmentation quality was affected by glare from the tray box, which made it hard to obtain a leftover prediction close to the original measurement. Then [5] proposed enhancing the segmentation quality through CLAHE equalization on the color channels. The segmentation result coped slightly better with the glare problem.
However, both previous methods are still weak at handling the shadows around object edges that appear when the food on the plate is captured, as in the dataset of this paper. A color transformation is needed to handle that problem [7]. The HSV color space is known to help remove shadows during image segmentation [8]. As in previous studies on image segmentation, the segmented image is an important source of the further information that we would like to retrieve [9], [10]. In this paper, we use the HSV color space and compare two segmentation methods: two-step image thresholding and K-means++ clustering.
Image thresholding is a basic image segmentation method; we propose two thresholding phases so that noise is removed twice. K-means++ clustering segmentation is restricted to two clusters, so that the image is separated into foreground and background. K-means++ handles centroid initialization better than K-means, and initialization can affect the clustering result [11]. For instance, several initial centroids may fall within a single cluster, which leads to a poor clustering result. Whereas the previous study used K-means++ for classification, in this paper we use it for segmentation.
2 Research Methodology
The overall stages of determining the nutritional content of food images are presented in Figure 1. First, image segmentation is applied to locate the food area on the plate. In this paper, we compare two methods of image segmentation: image thresholding and K-means++ clustering.
After the segmented area is obtained, the food area is calculated relative to the corresponding input image of the food before being eaten, and the nutritional content of the leftover food is then derived automatically.
Figure 1: The general phases of the proposed method
The primary dataset was collected by capturing food images in a public hospital in Malang, Indonesia, with a DSLR camera. The camera was mounted on a tripod at a 45° angle. The plate was centered on a grid board that served as the base for taking the picture. The food was laid out on a white plate. In this paper, the images are limited to solid food, as shown in Figure 2.
2.2 Two-Step Image Thresholding for Segmentation
Image thresholding is needed to analyze the food image in detail [12]. In this paper, the first stage of thresholding-based segmentation is filtering to remove noise among pixels, for which a Gaussian filter with a 5x5 kernel is applied. After filtering, the next stage transforms the default RGB color space to HSV. HSV is chosen because it represents color in a way close to human perception [13]. We then apply a two-step thresholding method. First, a threshold of 80 is applied; the result is then converted to a grayscale image, and finally a threshold of 110 is applied. In each step, values below the threshold (80 or 110) are replaced with 0, and all other values are replaced with 255. The second thresholding phase separates the foreground and background a second time to improve the segmentation performance. As stated in previous research, performing segmentation in several steps can yield a good result [14]. After the second phase, the original colors of the segmented region are retrieved, as depicted in Figure 3.
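The two thresholding passes can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the text does not say which HSV channel is thresholded or how the intermediate image is converted to grayscale, so the sketch assumes that the first threshold (80) is applied to every HSV channel and that the grayscale value is the unweighted mean of the three channels. The Gaussian filter is omitted, and the function names `rgb_to_hsv_image`, `binary_threshold`, and `two_step_threshold` are hypothetical.

```python
import colorsys


def rgb_to_hsv_image(rgb):
    """Convert rows of (r, g, b) pixels in 0..255 to (h, s, v) pixels in 0..255."""
    hsv = []
    for row in rgb:
        hsv_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hsv_row.append((round(h * 255), round(s * 255), round(v * 255)))
        hsv.append(hsv_row)
    return hsv


def binary_threshold(value, t):
    """Values below t become 0; all other values become 255."""
    return 0 if value < t else 255


def two_step_threshold(rgb, t1=80, t2=110):
    """Return a 0/255 mask after two thresholding passes (80, then 110)."""
    mask = []
    for row in rgb_to_hsv_image(rgb):
        mask_row = []
        for pixel in row:
            # Pass 1 (assumption): threshold each HSV channel at t1.
            first = [binary_threshold(c, t1) for c in pixel]
            # Grayscale conversion (assumption): unweighted channel mean.
            gray = sum(first) / len(first)
            # Pass 2: threshold the grayscale value at t2.
            mask_row.append(binary_threshold(gray, t2))
        mask.append(mask_row)
    return mask


# Toy 2x2 image: white plate, saturated orange food, dark board, food again.
image = [[(255, 255, 255), (230, 120, 30)],
         [(30, 30, 30), (230, 120, 30)]]
mask = two_step_threshold(image)
```

Under these assumptions the unsaturated white and dark pixels end up as background (0) and the saturated food-colored pixels as foreground (255); a different channel choice or grayscale weighting would change where the boundary falls.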
2.3 K-means++ Clustering Segmentation
Besides image thresholding, we also evaluate K-means++ clustering, since K-means++ does not require tuning a threshold parameter [15]. In this case, clustering is based on color. After the RGB color space is converted to HSV, the clustering method is applied. We use 2-means clustering, since we separate food images into two parts: foreground and background. The foreground is the food item on the plate, while the background comprises the plate and the grid table, as shown in Figure 4. After clustering, the image is converted to grayscale and then to a binary image. Morphology is also applied, as in the thresholding phase, so that the segmentation results can be compared.
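The clustering step can be illustrated with a dependency-free sketch of K-means++ seeding followed by standard Lloyd iterations with k = 2. It is an illustration only: the function names, the fixed seed, and the iteration cap are assumptions, the pixels are taken to be already converted to HSV tuples, and the grayscale, binarization, and morphology steps are left out. The text does not state how the food cluster is told apart from the background cluster, so the sketch returns only the cluster labels.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng):
    """K-means++ seeding: each further center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        r = rng.random() * sum(d2)
        acc = 0.0
        chosen = points[-1]  # fallback if all distances are zero
        for p, d in zip(points, d2):
            acc += d
            if acc > r:
                chosen = p
                break
        centers.append(chosen)
    return centers


def kmeans_pp(points, k=2, iters=20, seed=0):
    """Cluster color tuples; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center for every pixel.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(ch) / len(members)
                                     for ch in zip(*members)))
            else:
                updated.append(centers[j])
        if updated == centers:
            break
        centers = updated
    return labels, centers


# Toy HSV pixels: two low-saturation plate pixels, two saturated food pixels.
pixels = [(0, 0, 250), (2, 3, 248), (20, 220, 230), (22, 225, 228)]
labels, centers = kmeans_pp(pixels, k=2, seed=0)
```

The label values themselves are arbitrary (0 or 1), so a further rule is needed to map one of the two clusters to the foreground.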
Figure 2: Example of the dataset
Figure 3: Image segmentation using image thresholding
Figure 4: K-means++ clustering
Figure 5: Example of segmentation results for food images before and after being eaten
2.4 Calculating Food Area
Segmentation is the central part of recognizing nutritional values from an input image. After the segmentation result is obtained, the food area is computed.
We still need the weight of the food before being eaten to predict the leftover food after the patients have eaten, as described in Figure 5. The pseudocode is given in Algorithm 1.
Algorithm 1. Measuring the area of food after being eaten
Input: the binary images resulting from image segmentation
Process:
1. Count the pixels with value 1 (white pixels) in the before and after food images.
2. Calculate the area by using Equation (1).
Output: area of identified food after being eaten
$$FA = \frac{A_{\mathrm{after}}}{A_{\mathrm{before}}} \times 100 \qquad (1)$$
where $FA$ represents the food area of the food image after being eaten, $A_{\mathrm{after}}$ is the number of pixels with value 1 in the segmentation of the food image after being eaten, and $A_{\mathrm{before}}$ is the number of white pixels in the food image before being eaten, with $A_{\mathrm{before}}$ greater than 0.
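Algorithm 1 and Equation (1) can be sketched as below. This is a minimal illustration that assumes the masks are nested lists in which any nonzero value (1 or 255) marks a white pixel; the function names `count_white` and `food_area_percent` are hypothetical.

```python
def count_white(mask):
    """Count the foreground (nonzero) pixels of a binary mask."""
    return sum(1 for row in mask for p in row if p)


def food_area_percent(mask_before, mask_after):
    """Equation (1): leftover food area as a percentage of the original area."""
    a_before = count_white(mask_before)
    if a_before <= 0:
        # Equation (1) requires a nonzero area before eating.
        raise ValueError("the 'before' mask contains no food pixels")
    return count_white(mask_after) / a_before * 100


# Toy masks: 4 food pixels before eating, 1 food pixel left afterwards.
before = [[0, 255, 255],
          [0, 255, 255]]
after = [[0, 255, 0],
         [0, 0, 0]]
fa = food_area_percent(before, after)  # 1 / 4 * 100 = 25.0
```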
2.5 Nutritional Content Measurement
To determine the nutritional content from an input image of leftover food, we define a table containing the nutritional values of each food commonly served to patients in a public hospital in Indonesia according to their diets.
Figure 6 shows a sample of the database, which consists of foods and their nutritional content. There are 21 nutrition attributes.
In this application, the food is identified manually, which is a limitation. The label of the input food image is matched with the database entry for its nutrition. Equation (2) gives the calculation of the leftover nutritional content. This equation was obtained through observation together with several nutritionists.
$$N_r = \frac{val}{100} \times FA \qquad (2)$$
where $N_r$ is the nutritional content of the leftover food and $val$ is the value of a nutrition item in the nutritional table, as shown in Figure 6.
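As a worked illustration of Equation (2), the sketch below scales the table values of one food by the leftover food area. The values for "Nasi" are taken from the table in Figure 6 (whose units are not stated there); the dictionary layout, the function name `leftover_nutrients`, and the example food area of 25 are assumptions made for illustration.

```python
def leftover_nutrients(values, food_area):
    """Equation (2): N_r = val / 100 * FA for every nutrition item.

    values    -- mapping from nutrient name to its table value (val)
    food_area -- FA from Equation (1), a percentage between 0 and 100
    """
    return {name: val / 100 * food_area for name, val in values.items()}


# Table values for "Nasi" as listed in Figure 6.
nasi = {"energy": 180.00, "protein": 3.00, "fat": 0.30, "carbohydrate": 39.80}

# Hypothetical leftover food area of 25 percent.
leftover = leftover_nutrients(nasi, 25)
# energy: 180.00 / 100 * 25 = 45, protein: 3.00 / 100 * 25 = 0.75
```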
2.6 Evaluation
An evaluation measure is essential to understand how well the image segmentation covers the food area across the whole dataset.
We prepare the ground truth by segmenting the images manually and then compare it with the segmentation results of image thresholding and K-means++ clustering. The evaluation is based on the confusion matrix shown in Table 1.
Table 1. Confusion matrix
                                Foreground (actual values)   Background (actual values)
Foreground (predicted values)   TP                           FP
Background (predicted values)   FN                           TN
From the confusion matrix, we compute precision, recall, and accuracy to evaluate the segmentation results.
$$precision = \frac{TP}{TP + FP} \qquad (3)$$
$$recall = \frac{TP}{TP + FN} \qquad (4)$$
$$accuracy = \frac{TP + TN}{TP + TN + FP + FN} \qquad (5)$$
where TP is True Positive, FN is False Negative, FP is False Positive, and TN is True Negative.
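Equations (3) to (5) can be computed pixel by pixel from a predicted mask and a ground-truth mask, as in the sketch below. It assumes both masks are nested lists of equal size in which nonzero means foreground; the function names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not one stated in the text.

```python
def confusion_counts(predicted, truth):
    """Count TP, FP, FN, TN over two binary masks of equal size."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(predicted, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # foreground predicted, foreground in truth
            elif p and not t:
                fp += 1      # foreground predicted, background in truth
            elif t:
                fn += 1      # background predicted, foreground in truth
            else:
                tn += 1      # background predicted, background in truth
    return tp, fp, fn, tn


def segmentation_metrics(tp, fp, fn, tn):
    """Equations (3)-(5); a zero denominator yields 0.0 (assumed convention)."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Toy masks with one pixel of each error type.
truth = [[0, 255, 255],
         [0, 0, 0]]
predicted = [[255, 255, 0],
             [0, 0, 0]]
counts = confusion_counts(predicted, truth)   # TP=1, FP=1, FN=1, TN=3
precision, recall, accuracy = segmentation_metrics(*counts)
```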
Figure 6: Nutrition Table
This evaluation determines the accuracy of the segmentation result for each image. The accuracy is obtained by comparing the automatic segmentations, produced by the image thresholding and K-means++ clustering methods, with the ground truth. Each image is analyzed using the confusion matrix.
Table 2. The average evaluation measure
Evaluation measure   Image thresholding   K-means++ clustering
Average Precision    85.25%               22.98%
Average Recall       83.68%               56.38%
Average Accuracy     92.35%               53.44%
As Table 2 shows, the thresholding method produces the highest average accuracy, 92.35%, compared with 53.44% for the K-means++ clustering method. This is because the clustering method produces segmentations that do not match the ground truth. Figure 6 shows some results of manual segmentation, thresholding segmentation, and K-means++ clustering segmentation.
Figure 6: Example segmentation results of K-means++ clustering, image thresholding, and ground truth
From the figure above, it can be seen that thresholding segmentation yields images that correspond to the manual segmentation. The precision, recall, and accuracy of each image are given in Table 3 to Table 6.
In Table 3, the precision is 32.38%, the recall is 99.84%, and the accuracy is 90.89%. The precision is low because many pixels whose actual value is 0 are still predicted as 1. In Table 4, the precision is 99.22%, the recall is 99.32%, and
                                             Nutritional content
Code  Food type                              Water   Energy  Protein      Fat    Carb.    Fiber      Ash
0     Asam asam buncis                       89.60    34.00     2.40     0.30     7.20     1.90     0.50
1     Ayam bumbu Srundeng                    54.52  1022.70    15.87   107.75     1.00     0.00     0.87
2     Ayam suwir                             54.30   314.74    17.68    27.14     0.00     0.00     0.87
3     Ayam bumbu bistik                      54.30   314.74    17.68    27.14     0.00     0.00     0.87
4     Ayam goreng Laos                       52.17   337.07    16.99    30.00     0.00     0.00     0.84
5     Bali Putih telur                       75.26   169.14     9.26    14.29     0.69     0.00     0.51
6     Bali Tahu                              77.36   127.29    10.26    10.31     0.75     0.09     1.32
7     Bali Telur                             68.58   210.15    11.45    17.66     0.65     0.00     0.74
8     Bening Bayam Labu Siam                 93.40    23.00     0.75     0.25     4.80     0.35     0.80
9     Bening kangkung labu Siam              91.35    30.00     1.90     0.40     5.70     0.90     0.65
10    Bobor Bayam                            87.90    67.33     1.45     6.05     3.35     0.58     1.25
11    Bubur                                  12.00   357.00     8.40     1.70    77.10     0.20     0.80
12    Dadar jagung goreng                    27.52   339.60     9.73    13.57    47.33     1.21     1.63
13    Ikan acar kuning                       60.11   205.64    15.24    10.00    14.04     0.05     0.98
14    Kare wortel labu siam                  85.07    81.50     1.37     6.01     7.02     0.42     0.54
15    Kroket Daging                          68.09   195.11    12.41    14.59     3.83     0.14     0.98
16    Lodeh labu Siam + wortel               81.66   113.60     1.31     9.77     6.74     0.40     0.52
17    Nasi                                   56.70   180.00     3.00     0.30    39.80     0.20     0.20
18    Opor Ayam                              52.68   333.44    15.63    30.20     0.62     0.00     0.86
19    Opor Telur                             66.76   225.33    10.48    19.88     1.31     0.00     0.77
20    Orek Tempe                             52.05   241.18    19.58    14.16    12.71     1.32     1.51
21    Oseng-oseng bakso                      50.92   277.91    15.54    18.22    14.25     0.06     1.07
22    Rendang Ayam                           52.17   337.07    16.99    30.00     0.00     0.00     0.84
22    Oseng sosis                            34.18   487.64    13.18    47.55     2.09     0.00     3.00
23    Oseng Tahu                             77.36   127.29    10.26    10.31     0.75     0.09     1.32
24    Perkedel Kentang                       75.58   139.38     3.52     9.51    10.49     0.38     0.74
26    Sambel goreng kentang                  80.39    91.71     2.02     3.81    13.01     0.48     0.77
27    Sayur Asam Labu Siam kacang Panja      92.05    30.50     1.45     0.10     6.00     1.35     0.40
28    Sayur buncis soun kacang merah         78.02    79.23     3.90     0.58    16.16     1.78     0.66
29    Sop buncis soun                        78.64    78.86     2.73     0.27    17.90     1.63     0.46
30    Sop kembang tahu Wortel                82.71    67.27     5.35     1.80     9.30     0.91     0.84
32    Soto Soun Kentang                       8.14   481.71     2.89    25.06    61.80     2.17     2.11
33    Tahu bb kecap                          82.20    80.00    10.90     4.70     0.80     0.10     1.40
34    Tahu bulat                             73.07   169.33     9.69    15.29     0.71     0.09     1.24
35    Tahu goreng                            73.07   169.33     9.69    15.29     0.71     0.09     1.24
36    Telur mata sapi                        68.58   210.15    11.45    17.66     0.65     0.00     0.74
37    Telur Orak Arik                        68.58   210.15    11.45    17.66     0.65     0.00     0.74
38    Tempe bb Kuning                        55.30   201.00    20.80     8.80    13.50     1.40     1.60
39    Tempe bumbu bacem                      55.30   201.00    20.80     8.80    13.50     1.40     1.60
40    Tempe goreng                           49.16   276.89    18.49    18.93    12.00     1.24     1.42
41    Tim                                    71.00   120.00     2.40     0.40    26.00     0.00     0.20
42    Urap urap                              56.63   219.42    21.50    10.11    14.25     1.13     1.68
43    Bola bola daging                       50.92   277.91    15.54    18.22    14.25     0.06     1.07
44    Dadar jagung kukus                     29.49   300.71    10.43     7.40    50.71     1.30     1.74
45    Ikan bumbu asam manis                  63.10   192.00    12.70    10.10    12.70     0.00     2.40
46    Sambel goreng rempelo ati              27.35   296.64    28.06    18.16     5.41     0.00     1.55
47    sop gambas wortel soun                 86.10    52.15     1.19     0.38    11.85     0.46     0.48
48    sop wortel labu siam                   91.27    32.57     0.77     0.31     7.21     0.43     0.43
49    Tumis Bakso                            65.96   207.04    18.03    14.97     0.00     0.00     1.13
50    tumis soun                             11.60   402.20     4.22    10.20    73.80     0.00     0.18
51    Tumis Tahu                             73.89   161.30     9.80    14.34     0.72     0.09     1.26