Rapid analysis of hydrogen cyanide in fresh cassava roots using NIRS and machine learning algorithms: Meeting end user demand for low cyanogenic cassava.
Michael Kanaabi, Fatumah B Namakula, Ephraim Nuwamanya, Ismail S Kayondo, Nicholas Muhumuza, Enoch Wembabazi, Paula Iragaba, Leah Nandudu, Ann Ritah Nanyonjo, Julius Baguma, Williams Esuma, Alfred Ozimati, Mukasa Settumba, Titus Alicai, Angele Ibanda, Robert S Kawuki
Plant Genome, article e20403. Published 2024-06-01 (Epub 2023-11-08). DOI: 10.1002/tpg2.20403 (https://doi.org/10.1002/tpg2.20403)
Abstract
This study addresses end-users' demand for cassava (Manihot esculenta Crantz) varieties with low cyanogenic potential (hydrogen cyanide potential [HCN]) by using near-infrared spectrometry (NIRS), a technology that determines sample constituents quickly, accurately, and reliably with minimal sample preparation. The study evaluates how well three machine learning (ML) algorithms, logistic regression (LR), support vector machine (SVM), and partial least squares discriminant analysis (PLS-DA), distinguish between low and high HCN accessions. On a 1-9 categorical scale, low HCN accessions had mean scores of 1-5.9 and high HCN accessions had mean scores of 6-9. The researchers used 1164 root samples to test different NIRS prediction models and six spectral pretreatments. The wavelengths 961, 1165, 1403-1505, 1913-1981, and 2491 nm were influential in discriminating low from high HCN accessions. Using only these selected wavelengths, LR achieved 100% classification accuracy and PLS-DA achieved 99%. Using the full spectrum, the best model was PLS-DA combined with standard normal variate and second-derivative pretreatment, which gave an accuracy of 99.6%. In the same full-spectrum comparison, SVM and LR reached only moderate classification accuracies of 75% and 74%, respectively. The study demonstrates that NIRS coupled with ML algorithms can identify low and high HCN accessions, which can help cassava breeding programs select for low HCN.
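The class definition and the best-performing pretreatment named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, the cutoff treats any mean score below 6 as low (the abstract does not say how means between 5.9 and 6 are handled), and a plain central second difference stands in for whatever derivative filter the study actually used. Classifier fitting (PLS-DA, LR, SVM) is left out.

```python
from __future__ import annotations

from statistics import mean, stdev

# Assumption: mean scores below 6 count as "low"; the abstract gives 1-5.9 vs 6-9.
LOW_HIGH_CUTOFF = 6.0


def hcn_class(mean_score: float) -> str:
    """Map an accession's mean score on the 1-9 scale to a class label."""
    if not 1.0 <= mean_score <= 9.0:
        raise ValueError("score must lie on the 1-9 scale")
    return "low" if mean_score < LOW_HIGH_CUTOFF else "high"


def snv(spectrum: list[float]) -> list[float]:
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


def second_difference(spectrum: list[float]) -> list[float]:
    """Crude second derivative by central differences.

    Assumes evenly spaced wavelengths; the result is two points shorter
    than the input because the end points have no neighbour on one side.
    """
    return [
        spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]
        for i in range(1, len(spectrum) - 1)
    ]


def pretreat(spectrum: list[float]) -> list[float]:
    """SNV followed by the second difference, applied to a single spectrum."""
    return second_difference(snv(spectrum))
```

SNV removes per-spectrum offset and scale, so two spectra that differ only by a multiplicative factor pretreat to the same values; the second derivative then suppresses slowly varying baseline and sharpens overlapping absorption bands before a classifier is fitted.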
About the journal:
The Plant Genome publishes original research investigating all aspects of plant genomics. Technical breakthroughs reporting improvements in the efficiency and speed of acquiring and interpreting plant genomics data are welcome. The editorial board gives preference to novel reports that use innovative genomic applications that advance our understanding of plant biology that may have applications to crop improvement. The journal also publishes invited review articles and perspectives that offer insight and commentary on recent advances in genomics and their potential for agronomic improvement.