Classical and machine learning tools for identifying yellow-seeded Brassica napus by fusion of hyperspectral features
Fan Liu, Fang Wang, Zaiqi Zhang, Liang Cao, Jinran Wu, You-Gan Wang
Frontiers in Genetics, vol. 15, article 1518205. Published 2025-01-15. DOI: https://doi.org/10.3389/fgene.2024.1518205
Abstract
Introduction: Because of its favorable traits, such as lower lignin content, higher oil concentration, and increased protein levels, yellow-seeded rapeseed has attracted more attention in genetic improvement than other rapeseed color variants. Traditionally, yellow-seeded rapeseed has been identified visually, but the complex variability in the seed coat color of Brassica napus makes manual identification challenging and often inaccurate. Another frequently used method relies on the RGB color system, but it is sensitive to photographic conditions, including lighting and camera settings.
Methods: We present four data-driven models to identify yellow-seeded B. napus using hyperspectral features combined with classical and machine learning techniques. One model employs partial least squares regression (PLSR) to predict the R, G, and B color channels and thereby distinguish yellow-seeded varieties from others according to globally accepted yellow-seed classification protocols. Another model uses logistic regression (Logit-R) to produce a probability-based assessment of yellow-seeded status. Additionally, we implement two machine learning models, random forest and support vector classifier, to evaluate features selected through lasso-penalized logistic regression.
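To make the feature-selection step concrete, the following is a minimal sketch of lasso-penalized logistic regression fitted by proximal gradient descent. It is not the authors' pipeline: the synthetic "bands", the choice of informative bands (1 and 4), the penalty `lam`, the step size `lr`, and all function names are assumptions made for illustration, and real hyperspectral bands are typically far more correlated than these independent draws. The bands with nonzero weights would be the ones passed on to a downstream classifier such as a random forest or support vector classifier, which is not shown here.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict_proba(X, w, b):
    """Probability that each sample belongs to the positive (yellow) class."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=300):
    """Fit lasso-penalized logistic regression by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errors = [pi - yi for pi, yi in zip(predict_proba(X, w, b), y)]
        grad_b = sum(errors) / n
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n
                  for j in range(p)]
        b -= lr * grad_b  # the intercept is not penalized
        # Gradient step on the log-loss, then L1 shrinkage of each weight.
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic stand-in for hyperspectral data (assumption): 6 bands,
# of which only bands 1 and 4 carry information about the class label.
random.seed(0)
N_SAMPLES, N_BANDS = 200, 6
X = [[random.gauss(0.0, 1.0) for _ in range(N_BANDS)]
     for _ in range(N_SAMPLES)]
y = [1 if 3.0 * row[1] - 3.0 * row[4] + random.gauss(0.0, 0.5) > 0 else 0
     for row in X]

w, b = fit_l1_logistic(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]  # bands kept by the lasso
probs = predict_proba(X, w, b)  # probability-based assessment per sample
accuracy = sum((p > 0.5) == bool(t) for p, t in zip(probs, y)) / N_SAMPLES

print("selected bands:", selected)
print("training accuracy:", round(accuracy, 3))
```

The soft-thresholding step is what drives uninformative weights to exactly zero, so the fitted model doubles as a feature selector. In practice a library implementation, for example scikit-learn's `LogisticRegression` with `penalty="l1"`, would usually replace the hand-written loop.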
Results and discussion: Our findings indicate recognition accuracies of 96.55% and 98% for the PLSR and Logit-R models, respectively, closely matching the accuracy of previous methods. This high recognition accuracy demonstrates the practical applicability of these models, and the approach represents a meaningful advance in identifying yellow-seeded rapeseed.