Indonesian Street Food Calorie Estimation Using Mask R-CNN and Multiple Linear Regression
Authors: Nadya Aditama, R. Munir
Published in: 2022 Second International Conference on Power, Control and Computing Technologies (ICPC2T)
Publication date: 2022-03-01
DOI: 10.1109/ICPC2T53885.2022.9776804 (https://doi.org/10.1109/ICPC2T53885.2022.9776804)
Citations: 2
Abstract
Consumers in Indonesia need calorie information for street food, and image-based calorie estimation is one effective way to provide it. However, current approaches have two limitations: food items are often occluded, and linear regression models that use area as the only feature yield low R-squared values. This research proposes a Mask R-CNN model for amodal instance segmentation, which recovers the complete shape of each object, and a multiple linear regression model that uses area, perimeter, length, and width to predict food weight. It also introduces an Indonesian street food dataset with six classes. The dataset contains 1646 images, with 644 bakwan, 812 bolu, 918 cireng, 679 serabi, 711 tahu, and 766 tempe instances in total. The multiple linear regression model is fitted on 230 bakwan, 200 bolu, 250 cireng, 240 serabi, 230 tahu, and 230 tempe data points. The proposed multiple linear regression model achieves the highest R-squared score in every class, with an average R-squared of 0.80425. For amodal instance segmentation, Mask R-CNN with a ResNeXt-101-FPN backbone achieves the best F1 score: 0.821 at an IoU threshold of 0.85 in the occluded scenario and 0.994 at an IoU threshold of 0.9 in the non-occluded scenario. Despite the high F1 scores, some false detections and poorly segmented instances remain. In calorie prediction, the proposed model does not reduce the mean absolute error (MAE) in some classes, owing to segmentation quality and food characteristics.
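The weight-prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each food instance is available as a non-empty binary mask, measures the four features in pixel units, approximates length and width by the axis-aligned bounding box, and fits one ordinary least-squares model per class. The function names (mask_features, fit_weight_model, predict_weight, r_squared) are hypothetical, and the demo values are synthetic, not data from the paper.

```python
import numpy as np


def mask_features(mask):
    """Return [area, perimeter, length, width] in pixel units.

    Assumes a non-empty 2-D binary mask. Length and width are taken
    from the axis-aligned bounding box (an assumption of this sketch).
    """
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())

    # Perimeter: count exposed pixel edges after zero padding.
    padded = np.pad(mask, 1).astype(np.int8)
    perimeter = int(
        np.abs(np.diff(padded, axis=0)).sum()
        + np.abs(np.diff(padded, axis=1)).sum()
    )

    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    height = int(rows[-1] - rows[0] + 1)
    width_px = int(cols[-1] - cols[0] + 1)
    length, width = max(height, width_px), min(height, width_px)
    return np.array([area, perimeter, length, width], dtype=float)


def fit_weight_model(X, y):
    """Least-squares fit of y = b0 + b1*area + b2*perimeter + b3*length + b4*width."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_weight(coef, X):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic demo only: random features and noisy linear weights.
    rng = np.random.default_rng(0)
    X_demo = rng.uniform(1.0, 10.0, size=(50, 4))
    y_demo = 5.0 + X_demo @ np.array([0.8, 0.1, 0.4, 0.3]) + rng.normal(0, 0.5, 50)
    coef_demo = fit_weight_model(X_demo, y_demo)
    print("R squared:", r_squared(y_demo, predict_weight(coef_demo, X_demo)))
```

In practice the masks would come from the segmentation model, and pixel measurements would need a scale reference before being related to physical weight; how the paper handles that calibration is not stated in the abstract.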