Hao Wu, Rui Han, Liang Zhao, Mengyao Liu, Hong Chen, Weifu Li, Lin Li
Title: AutoGP: An intelligent breeding platform for enhancing maize genomic selection
Journal: Plant Communications, article 101240 (impact factor 9.4; JCR Q1, Biochemistry & Molecular Biology)
Published: 2025-01-08
DOI: 10.1016/j.xplc.2025.101240 (https://doi.org/10.1016/j.xplc.2025.101240)
Citations: 0
Abstract
In the face of climate change and a growing global population, there is an urgent need to accelerate the development of high-yielding crop varieties. To this end, vast amounts of genotype-to-phenotype data have been collected, and many machine learning (ML) models have been developed to predict phenotype from a given genotype. However, the requirement for high densities of single-nucleotide polymorphisms (SNPs) and the labor-intensive collection of phenotypic data are hampering the use of these models to advance breeding. Furthermore, recently developed genomic selection (GS) models, such as those based on deep learning (DL), are complicated and inconvenient for breeders to navigate and optimize within their breeding programs. Here, we present AutoGP (http://autogp.hzau.edu.cn), an intelligent breeding platform that integrates genotype extraction, phenotype extraction, and GS modeling of genotype-to-phenotype data within a user-friendly web interface. AutoGP has three main advantages over previously developed platforms: 1) an efficient sequencing chip to identify high-quality, high-confidence SNPs throughout gene-regulatory networks; 2) a complete workflow for extraction of plant phenotypes (such as plant height and leaf count) from smartphone-captured video; and 3) a broad model pool, enabling users to select from five ML models (support vector machine, extreme gradient boosting, gradient-boosted decision tree, multilayer perceptron, and random forest) and four commonly used DL models (deep learning genomic selection, deep learning genome-wide association study, deep neural network for genomic prediction, and SoyDNGP). For the convenience of breeders, we use datasets from a maize (Zea mays) complete-diallel design plus an unbalanced breeding-like inter-cross population as a case study to demonstrate the usefulness of AutoGP. We show that our genotype chips can effectively extract high-quality SNPs associated with days to tasseling and plant height. The models show reliable predictive accuracy on different populations and can provide effective guidance for breeders. Overall, AutoGP offers a practical solution to streamline the breeding process, enabling breeders to achieve more efficient and accurate genomic selection.
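The abstract does not say how predictive accuracy is measured. A common convention in genomic selection is the Pearson correlation between observed and predicted phenotypes on held-out lines, and the minimal sketch below assumes that convention. The additive marker-effect predictor, the function names, and all numbers are hypothetical illustrations. They are not AutoGP's models or data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def predict_additive(genotypes, effects, intercept=0.0):
    """Predict phenotypes from SNP dosages (0/1/2) with an additive model.

    Stand-in for any trained genotype-to-phenotype model (assumption).
    """
    return [intercept + sum(e * g for e, g in zip(effects, row))
            for row in genotypes]


# Hypothetical held-out lines: one row per line, one column per SNP dosage.
test_genotypes = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
    [2, 2, 2],
]
# Hypothetical marker effects, as if estimated on a separate training set.
marker_effects = [1.5, -0.5, 2.0]
# Hypothetical observed plant heights (cm) for the same held-out lines.
observed = [203.0, 205.5, 200.5, 206.5]

predicted = predict_additive(test_genotypes, marker_effects, intercept=200.0)
accuracy = pearson(observed, predicted)
print(f"predictive accuracy (Pearson r): {accuracy:.3f}")
```

Because the metric depends only on the two phenotype vectors, the same `pearson` call could compare any of the listed ML or DL models on the same held-out set.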
About the journal:
Plant Communications is an open access publishing platform that supports the global plant science community. It publishes original research, review articles, technical advances, and research resources in various areas of plant sciences. The scope of topics includes evolution, ecology, physiology, biochemistry, development, reproduction, metabolism, molecular and cellular biology, genetics, genomics, environmental interactions, biotechnology, breeding of higher and lower plants, and their interactions with other organisms. The goal of Plant Communications is to provide a high-quality platform for the dissemination of plant science research.