Using machine learning algorithms to cluster and classify stone pine (Pinus pinea L.) populations based on seed and seedling characteristics

IF 2.6 | Zone 2 (Agricultural and Forestry Sciences) | Q1 FORESTRY | European Journal of Forest Research | Pub Date: 2024-06-29 | DOI: 10.1007/s10342-024-01716-7
Servet Caliskan, Elif Kartal, Safa Balekoglu, Fatma Çalışkan
Citations: 0

Abstract

The phenotype of a woody plant represents its unique morphological properties. Population discrimination and individual classification are crucial for breeding populations and conserving genetic diversity. Machine learning (ML) algorithms are gaining traction as powerful tools for predicting phenotypes. The present study focuses on classifying and clustering stone pine seeds and seedlings by their morphological characteristics using ML algorithms. In addition, the k-means algorithm is used to determine the ideal number of clusters. The results obtained from the k-means algorithm were then compared with reality. For the seed traits, the best classification performance was achieved by the Random Forest algorithm, with an accuracy of 0.648 and an F1-score of 0.658. For the seedling traits, the best classification performance was observed for the k-Nearest Neighbors algorithm (k = 18), with an accuracy of 0.571 and an F1-score of 0.582. The best clustering performance was achieved with k = 2 for both the seed (average Silhouette index = 0.48) and seedling (average Silhouette index = 0.51) traits. According to the principal component analysis, two dimensions accounted for 97% and 63% of the variance in the seed and seedling traits, respectively. The most important features among the seed and seedling traits were cone weight and bud set, respectively. This study will provide a foundation and motivation for future efforts in forest management practices, particularly regarding reforestation, yield optimization, and breeding programs.
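
The abstract reports choosing the number of k-means clusters by the average Silhouette index. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The data are two synthetic, well-separated point clouds rather than the study's seed or seedling measurements, and the function names `kmeans` and `average_silhouette` are hypothetical helpers introduced here for illustration.

```python
import math
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels


def average_silhouette(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the Silhouette index needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # s(i) is defined as 0 for singleton clusters
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for name, other in clusters.items() if name != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic stand-in data (an assumption, not the study's measurements).
rng = random.Random(42)
blob_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
blob_b = [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(30)]
points = blob_a + blob_b

# Score each candidate k and keep the one with the highest average Silhouette.
scores = {k: average_silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On real trait data the features would normally be standardized before computing distances, since traits such as cone weight and seed length are measured on different scales. Values near 0.5, like those reported in the abstract, indicate moderately separated clusters rather than clearly distinct ones.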

Source journal
CiteScore: 5.10
Self-citation rate: 3.60%
Articles published: 77
Review time: 6-16 weeks
Journal description: The European Journal of Forest Research focuses on publishing innovative results of empirical or model-oriented studies which contribute to the development of broad principles underlying forest ecosystems, their functions and services. Papers which exclusively report methods, models, techniques or case studies are beyond the scope of the journal, while papers on studies at the molecular or cellular level will be considered where they address the relevance of their results to the understanding of ecosystem structure and function. Papers relating to forest operations and forest engineering will be considered if they are tailored within a forest ecosystem context.