Fusion of machine learning and explainable AI for enhanced rice classification: a case study on Cammeo and Osmancik species
Authors: Ahmet Çifci, İsmail Kırbaş
Journal: European Food Research and Technology, vol. 251, no. 1, pp. 69-86
Publication date: 2024-11-13
DOI: 10.1007/s00217-024-04614-9
URL: https://link.springer.com/article/10.1007/s00217-024-04614-9
PDF: https://link.springer.com/content/pdf/10.1007/s00217-024-04614-9.pdf
Impact factor: 3.0; JCR: Q2 (Food Science & Technology)
Citation count: 0
Abstract
The accurate identification and classification of rice species is critical for increasing crop productivity, quality, and diversity. Traditional rice classification methods involving manual inspection can be time-consuming, costly, and error-prone. This study addresses this challenge by exploring the potential of machine learning (ML) models for automated and accurate rice classification. The paper has three key objectives. First, the study evaluates the discriminative power of various morphological features extracted from rice grain images using feature selection methods. Second, it compares the performance of several ML models, including Artificial Neural Network (ANN), Categorical Boosting (CatBoost), Gradient Boosting (GBoost), k-Nearest Neighbours (k-NN), Logistic Regression (LR), Naïve Bayes (NB), Random Forest (RF), Stochastic Gradient Descent (SGD), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost), in classifying two rice species (Cammeo and Osmancik). Third, the study applies explainable artificial intelligence (XAI) techniques, namely SHapley Additive exPlanation (SHAP) and Individual Conditional Expectation (ICE) plots, to make the inner workings and decision-making processes of the ML models transparent and interpretable. The findings indicate that the LR model achieved the highest classification accuracy, at 93.1%. Feature analysis identified Major Axis Length, Perimeter, Convex Area, and Area as the most influential features in distinguishing between the rice species. This study highlights the successful application of advanced ML techniques in automating industrial rice classification, facilitating automated packaging and quality control processes without the need for human intervention. By improving the efficiency of rice classification and reducing reliance on manual labour, this approach offers significant benefits to both the agricultural industry and food production sectors.
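As a rough illustration of the kind of model the abstract reports as best-performing, the sketch below trains a plain logistic regression on two morphological features and reads the learned weights as a crude indication of feature influence. It is not the authors' pipeline: the data are synthetic, the class means and spreads are invented placeholders, the function names are hypothetical, and the weight inspection is only a simple stand-in for the SHAP and ICE analyses mentioned above.

```python
import math
import random


def make_synthetic_grains(n_per_class, seed=0):
    """Return (rows, labels) of made-up [major_axis_length, perimeter] values.

    The means and standard deviations are invented for illustration only;
    they are NOT taken from the paper or from any real rice dataset.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for label, (mu_len, mu_per) in enumerate([(205.0, 485.0), (176.0, 430.0)]):
        for _ in range(n_per_class):
            rows.append([rng.gauss(mu_len, 10.0), rng.gauss(mu_per, 20.0)])
            labels.append(label)
    return rows, labels


def fit_scaler(rows):
    """Compute per-feature mean and standard deviation on the training rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows with previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, w, b):
    """Probability that sample x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(X, y, w, b):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((1 if predict_proba(xi, w, b) >= 0.5 else 0) == yi
               for xi, yi in zip(X, y))
    return hits / len(y)


def run_demo(seed=0):
    """Train on 80% of the synthetic data and score on the held-out 20%."""
    rows, labels = make_synthetic_grains(200, seed=seed)
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(0.8 * len(idx))
    train, test = idx[:cut], idx[cut:]
    means, stds = fit_scaler([rows[i] for i in train])
    X_train = apply_scaler([rows[i] for i in train], means, stds)
    X_test = apply_scaler([rows[i] for i in test], means, stds)
    w, b = train_logistic_regression(X_train, [labels[i] for i in train])
    return w, b, accuracy(X_test, [labels[i] for i in test], w, b)


if __name__ == "__main__":
    weights, bias, acc = run_demo()
    print("weights (major_axis_length, perimeter):", weights)
    print("held-out accuracy:", acc)
```

Because the features are standardized before training, the magnitudes of the learned weights are roughly comparable across features, which is what makes them usable as a first, coarse look at influence. A study like the one described would replace the synthetic generator with measured grain features and use dedicated attribution tools rather than raw coefficients.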
About the journal:
The journal European Food Research and Technology publishes state-of-the-art research papers and review articles on fundamental and applied food research. The journal's mission is the fast publication of high-quality papers on front-line research, the newest techniques, and developing trends in the following sections:
- chemistry and biochemistry
- technology and molecular biotechnology
- nutritional chemistry and toxicology
- analytical and sensory methodologies
- food physics.
Out of the scope of the journal are:
- contributions which are not of international interest or do not have a substantial impact on food sciences,
- submissions which comprise merely data collections, based on the use of routine analytical or bacteriological methods,
- contributions reporting biological or functional effects without profound chemical and/or physical structure characterization of the compound(s) under research.