Multiscale feature fusion and enhancement in a transformer for the fine-grained visual classification of tree species
Yanqi Dong, Zhibin Ma, Jiali Zi, Fu Xu, Feixiang Chen
Ecological Informatics, volume 86, Article 103029 (journal article). Published 2025-01-17. DOI: 10.1016/j.ecoinf.2025.103029
https://www.sciencedirect.com/science/article/pii/S157495412500038X
Abstract
Accurate and rapid fine-grained visual classification (FGVC) of tree species within the same family can provide technical support for tree surveys, research, and conservation. However, FGVC faces challenges such as large intraclass differences and small interclass differences. Recognizing tree species within the same family requires focusing on and correlating the overall and multiorgan features of the trees while mitigating the influence of complex natural backgrounds, occlusion effects, and other factors. To address these challenges, we propose multiscale feature fusion (MFF) and enhancement in transformers to improve recognition performance. The method consists of a Swin transformer backbone, an MFF module, a discriminative feature enhancement (DFE) module, and a texture feature enhancement (TFE) module. The MFF module aims to strike a balance between global and local feature extraction. The DFE module is employed to mitigate the impact of background noise, whereas the TFE module is used to enhance the extraction of features associated with complex textures and spatial patterns. We conducted experiments on a constructed dataset of tree species from the same family, achieving a top-1 accuracy of 90.3% and a top-3 accuracy of 96.8%. In addition, the method performed well on three popular FGVC datasets, namely, the Flavia, Oxford Flowers, and PlantCLEF 2015 datasets, with top-1 accuracies of 100%, 99.2%, and 81.4%, respectively. The ablation experiments and module visualizations also yielded satisfactory results. Thus, this work provides a solution for improving performance on the FGVC task.
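The abstract reports results as top-1 and top-3 accuracy. As a reminder of what these metrics measure, the following is a minimal sketch in plain Python, assuming the classifier outputs one score per class for each image. The function name `top_k_accuracy` and the toy scores are illustrative assumptions; they are not taken from the paper or its code.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices by descending score and keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up scores for 4 images over 4 hypothetical classes (not data from the paper).
scores = [
    [0.70, 0.10, 0.15, 0.05],  # true class 0 is ranked first
    [0.20, 0.30, 0.40, 0.10],  # true class 1 is ranked second
    [0.40, 0.30, 0.20, 0.10],  # true class 3 is ranked fourth
    [0.10, 0.20, 0.60, 0.10],  # true class 2 is ranked first
]
labels = [0, 1, 3, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top3 = top_k_accuracy(scores, labels, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

Top-3 accuracy can never be lower than top-1 accuracy on the same predictions, because any sample counted as a top-1 hit is also a top-3 hit.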
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data as well as the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.