
Ecological Informatics: Latest Articles

Machine learning reveals microclimate-specific drivers of a cosmopolitan supervector's population dynamics
IF 7.3 | Zone 2 (Environmental Science and Ecology) | Q1 ECOLOGY | Ecological Informatics, vol. 94, Article 103690 | Pub Date: 2026-03-01 | Epub Date: 2026-03-04 | DOI: 10.1016/j.ecoinf.2026.103690
Arinder K. Arora, Nolan Anderson, Kiran R. Gadhave
Forecasting pest population dynamics under variable microclimates is essential for understanding and managing ecological interactions in agroecosystems. In this study, we employed machine learning to model the population dynamics of a cosmopolitan supervector across two contrasting production environments—open fields and high tunnels. Using data from 1686 weekly trap observations (standardized to 2254 modeling units) and 16 environmental predictors, we developed and compared Random Forest, Gradient Boosting Machine (GBM), and XGBoost models to identify key abiotic and biotic drivers of population fluctuations. Random Forest achieved the highest predictive accuracy in open fields (87.7%), while XGBoost performed best under high-tunnel conditions (84.9%). Parent (seed) population and temperature consistently emerged as dominant predictors, with humidity and wind showing secondary effects. Models trained in one microclimate failed to predict populations in the other (≤44% accuracy), revealing distinct ecological processes governing pest dynamics in adjacent systems. These results demonstrate that machine learning can disentangle nonlinear interactions among environmental variables and improve predictive understanding of vector population ecology. Our framework illustrates how ecological informatics can integrate environmental sensing, population monitoring, and data-driven modeling to forecast biologically meaningful patterns across heterogeneous agroecosystems.
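The cross-microclimate result above (models trained in one environment reaching at most 44% accuracy in the other) implies a train-on-one, test-on-all evaluation. A minimal Python sketch of such a harness follows. It is not the authors' code: `MajorityClass` is a placeholder standing in for Random Forest, GBM, or XGBoost, the names `transfer_matrix` and `toy` are hypothetical, and the labels are synthetic toy values rather than data from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


class MajorityClass:
    """Placeholder model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def transfer_matrix(datasets, make_model):
    """Train on each environment and score on every environment.

    datasets maps an environment name to (X_train, y_train, X_test, y_test).
    Returns a dict keyed by (train_env, test_env) holding the accuracy.
    """
    scores = {}
    for train_env, (X_tr, y_tr, _, _) in datasets.items():
        model = make_model().fit(X_tr, y_tr)
        for test_env, (_, _, X_te, y_te) in datasets.items():
            scores[(train_env, test_env)] = accuracy(y_te, model.predict(X_te))
    return scores


# Synthetic toy labels (assumption, not study data); features are dummies.
toy = {
    "open_field": ([[0]] * 4, ["high", "high", "high", "low"],
                   [[0]] * 4, ["high", "high", "low", "high"]),
    "high_tunnel": ([[0]] * 4, ["low", "low", "low", "high"],
                    [[0]] * 4, ["low", "low", "low", "high"]),
}
scores = transfer_matrix(toy, MajorityClass)
```

With these toy labels each placeholder model scores 0.75 in its own environment and 0.25 in the other. Swapping `MajorityClass` for a real learner with `fit` and `predict` methods leaves the harness unchanged.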
Citations: 0
UAV-based remote sensing for rangeland monitoring, a generalized and transparent workflow with an Australian lead
IF 7.3 | Zone 2 (Environmental Science and Ecology) | Q1 ECOLOGY | Ecological Informatics, vol. 94, Article 103663 | Pub Date: 2026-03-01 | Epub Date: 2026-02-28 | DOI: 10.1016/j.ecoinf.2026.103663
Toufique A. Soomro, Allister Clarke, Jonathan Medway, Bin Liang, Stephen Summerhayes, Juan Pablo Guerschman, Robert de Ligt, Hugh Armitage, Clinton Ayers
Vegetation is a central indicator of rangeland condition, yet monitoring it across vast and heterogeneous dryland landscapes remains a major challenge. Although the advantages of remote sensing and unmanned aerial vehicles (UAVs) have been recognized for many years, their evolving role in rangeland vegetation assessment warrants a fresh examination, particularly in light of recent advances in sensor design, data processing, and multiscale integration. This systematic review, conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach, synthesizes contemporary developments in UAV-based monitoring of rangeland vegetation. Key progress includes improved flight configurations, enhanced photogrammetric reconstruction for structural mapping, and the increased use of machine learning for estimating vegetation cover and above ground biomass. Emerging tools such as real-time kinematic positioning, automated image processing, and cloud-based computing are accelerating the transition toward transparent, repeatable, and scalable workflows. The integration of UAV products with satellite observations further strengthens regional vegetation assessment and supports broader ecosystem monitoring frameworks. Together, these advances highlight the growing capacity of UAV-based methods to deliver consistent, high-resolution vegetation information for sustainable rangeland management in Australia and globally.
Citations: 0
Suitability networks for restoration: Hydrobiogeochemical flows imprinting habitat-forming species
IF 7.3 | Zone 2 (Environmental Science and Ecology) | Q1 ECOLOGY | Ecological Informatics, vol. 94, Article 103665 | Pub Date: 2026-03-01 | Epub Date: 2026-02-24 | DOI: 10.1016/j.ecoinf.2026.103665
Yuhan Wu, Matteo Convertino
Coastal habitats, such as oyster reefs, are critical land–sea ecotones that support biodiversity, ecosystem services, and ecosystem resilience. However, restoration of oyster reefs in river deltas and other coastal ecosystems remains challenged by the lack of scalable tools capable of quantifying how environmental heterogeneity shapes connectivity and reef structure across fragmented ecotones.
Here, we introduce a generalizable Topologic Systemic Ecograph (TIE) model that integrates three-dimensional hydrodynamic and biogeochemical fields with habitat occurrence data to infer ecological flows, flow-defined connectivity, and basins derived from predicted habitat suitability. By adapting hydro-inspired flow-routing algorithms and network-theoretic analysis, we construct the Oyster Flow Graph (OFG) and delineate Oyster Connectivity Basins (OCBs), i.e., the ecograph and ecosheds, providing spatially explicit ecological patterns, including eco-environmental feedbacks that support biogenic structures across ecosystem scales.
Application of the TIE framework to biogenic structures such as oyster reefs in the Pearl River Delta, Greater Bay Area, reveals pronounced regional differentiation, with central delta basins functioning as connectivity hubs and peripheral basins acting as flow bottlenecks. Stable, high-suitability zones emerge along sheltered deltaic and estuarine habitats, indicating conditions favorable for reef establishment and persistence. The inferred ecological flow topology is physically consistent with regional hydrodynamic patterns for 66.34% of flow directions, and Random Forest modeling highlights key hydro-biogeochemical drivers shaping network connectivity. At the delta scale, nitrate concentration, latitudinal (North–South) and vertical velocities at intermediate and deep depths, as well as chlorophyll-a, emerge as the predominant factors. Under high-flow conditions, vertical and longitudinal/cross-delta (East–West) velocities become the most important features. These flow interactions reflect the predominance of deltaic hydrology in governing nutrient transport and residence within reef habitats, thereby influencing reef morphology, ecological fitness, and cascading ecosystem services. Temperature and salinity emerge as second-order factors, given their relatively weak interactions with other environmental variables in defining ecological flows.
Overall, the proposed TIE model and framework advance precision restoration design by explicitly linking inferred eco-environmental flows derived from habitat suitability to ecological connectivity expressed as topology, and are transferable to other coastal and marine habitats where eco-environmental pressures can be structured to trigger ecological self-emergence.
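The abstract describes adapting hydro-inspired flow routing to build a flow graph and delineate connectivity basins. One widely used hydrological scheme is D8 routing, in which each grid cell drains to its steepest-descent neighbour and a basin is the set of cells that reach the same sink. The Python sketch below applies D8 to a generic potential surface. Whether the TIE model uses D8, and how habitat suitability maps onto the routed surface, is not stated in the abstract, so `potential`, `d8_receivers`, and `basins` are hypothetical names and the grid values are synthetic.

```python
def d8_receivers(grid):
    """Map each cell to its steepest-descent neighbour (itself if it is a sink)."""
    rows, cols = len(grid), len(grid[0])
    recv = {}
    for r in range(rows):
        for c in range(cols):
            best, best_slope = (r, c), 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                        # Diagonal neighbours are sqrt(2) cells away.
                        dist = (dr * dr + dc * dc) ** 0.5
                        slope = (grid[r][c] - grid[nr][nc]) / dist
                        if slope > best_slope:
                            best, best_slope = (nr, nc), slope
            recv[(r, c)] = best
    return recv


def basins(recv):
    """Label every cell with the sink it eventually drains to."""
    labels = {}
    for cell in recv:
        path, cur = [], cell
        # Follow the flow until we hit a sink or an already labelled cell.
        while cur not in labels and recv[cur] != cur:
            path.append(cur)
            cur = recv[cur]
        sink = labels.get(cur, cur)
        labels[cur] = sink
        for visited in path:
            labels[visited] = sink
    return labels


# Synthetic surface with two local minima, at (1, 1) and (1, 3).
potential = [
    [5, 4, 5, 6],
    [4, 1, 5, 2],
    [5, 4, 6, 3],
]
flow_graph = d8_receivers(potential)   # directed edges cell -> receiver
basin_of = basins(flow_graph)          # cell -> sink that defines its basin
```

The `flow_graph` dictionary plays the role of a directed flow graph, and grouping cells by their value in `basin_of` gives the basins. On this toy surface nine cells drain to (1, 1) and three to (1, 3).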
Citations: 0
Unveiling the river spatial topology in robust water quality prediction: A LSTM-based evaluation framework
IF 7.3 | Zone 2 (Environmental Science and Ecology) | Q1 ECOLOGY | Ecological Informatics, vol. 94, Article 103659 | Pub Date: 2026-03-01 | Epub Date: 2026-02-17 | DOI: 10.1016/j.ecoinf.2026.103659
Shuai Wang, Ying Xing, Jiahui Zhu, Yuxian Li, Feifei Dong
Time-series forecasting faces a major challenge when input data are missing. Additionally, standard multi-site water quality models often fail to capture the spatial connections among monitoring stations. This study proposes a GAT-enhanced LSTM model (GAT-LSTM) that integrates Graph Attention Networks (GAT) with Long Short-Term Memory (LSTM) to enhance prediction robustness under data incompleteness. We established a systematic evaluation framework, using MAE, MAPE, and R² as the metrics for assessing predictive performance. In addition, we defined a Comprehensive Robustness Index (CRI) to evaluate model performance under three scenarios: spatial (missing stations), temporal (missing time steps), and random (missing indicators). Using real-world data from 13 monitoring stations on the Pearl River, the third largest river in China, we compared GAT-LSTM against a standalone LSTM. Results show that the two models achieved comparable accuracy when data were complete; however, across all missing-data scenarios, GAT-LSTM consistently demonstrated superior robustness, exhibiting 1.3–1.8 times greater tolerance to data loss than the conventional LSTM. The GAT component became critical when spatial data were missing. The performance gap was most pronounced when key monitoring stations were removed first: GAT-LSTM maintained high stability (CRI: 0.98), whereas the standalone LSTM experienced a sharp decline (CRI: 0.5). These findings confirm that incorporating the GAT architecture provides powerful compensatory capability for incomplete spatial data, rendering GAT-LSTM significantly more resilient in real-world water quality prediction tasks. When monitoring networks suffer from inconsistent spatial coverage, GAT transitions from an optional enhancement to an essential core component.
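The three accuracy metrics named in the abstract have standard definitions, sketched below in plain Python. The function names and the toy series are assumptions for illustration; the abstract does not give a formula for the CRI, so it is deliberately not reproduced here.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any observed value is zero.
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic toy series (not data from the study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(mae(observed, predicted))    # 0.5
print(mape(observed, predicted))   # about 13.02
print(r2(observed, predicted))     # about 0.95
```

MAPE weights errors on small observed values more heavily than MAE does, which is one reason to report several metrics side by side when comparing models under missing-data scenarios.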
Citations: 0
A design space-based network for efficient and accurate fish recognition in aquaculture
IF 7.3 | Zone 2 (Environmental Science and Ecology) | Q1 ECOLOGY | Ecological Informatics, vol. 94, Article 103679 | Pub Date: 2026-03-01 | Epub Date: 2026-02-27 | DOI: 10.1016/j.ecoinf.2026.103679
Tuyan Luo, Xiaohu Tang, Xin Lv, Baolong Bao, Xiaohui Chen, Xiu Fang, Zhixiang Liu, Jingxiang Xu
Automated fish identification plays a pivotal role in the development of intelligent aquaculture systems by enabling more effective stock assessment and behavioral monitoring. Although contemporary convolutional neural network (CNN)-based approaches have demonstrated strong recognition performance, they frequently exhibit computational inefficiency and limited robustness under the challenging conditions characteristic of underwater environments. In this study, we introduce a novel network exploration framework, grounded in the RegNet design paradigm, for deriving task-specific architectures tailored to underwater fish recognition. Using a relatively small dataset and approximately 200K training iterations, we obtain a family of high-performing models, collectively referred to as SeekNet, spanning multiple complexity regimes. Relative to state-of-the-art baselines, SeekNet consistently achieves superior performance. On our primary dataset, SeekNet attains a rank-1 accuracy of 95.97% and a True Acceptance Rate (TAR) of 88.04% at a False Acceptance Rate (FAR) of 10⁻⁶. On a separate closed-set dataset, it reaches a rank-1 accuracy of 98.78% and a TAR of 98.71% at the same FAR threshold. These results substantiate the effectiveness of the proposed methodology and underscore its practical suitability for deployment in real-world aquaculture environments.
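TAR at a fixed FAR is a verification metric: a similarity threshold is set so that the share of impostor (different-identity) pairs accepted does not exceed the target FAR, and TAR is the share of genuine (same-identity) pairs accepted at that threshold. The Python sketch below shows one way to compute it. The function name `tar_at_far` and the scores are illustrative assumptions, not the evaluation code or data from the paper.

```python
def tar_at_far(genuine, impostor, target_far):
    """Return (TAR, threshold) with FAR no greater than target_far.

    genuine and impostor are similarity scores (higher means more similar).
    A pair is accepted when its score is strictly above the threshold.
    """
    ranked = sorted(impostor, reverse=True)
    # Number of impostor pairs that may be wrongly accepted.
    allowed = int(target_far * len(ranked))
    # Accept only scores above the (allowed + 1)-th highest impostor score.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    tar = sum(score > threshold for score in genuine) / len(genuine)
    return tar, threshold


# Synthetic toy scores (not data from the study).
genuine_scores = [0.45, 0.55, 0.80, 0.90]
impostor_scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.60, 0.70]

tar, threshold = tar_at_far(genuine_scores, impostor_scores, target_far=0.25)
print(tar, threshold)  # 0.75 0.5
```

Resolving a FAR as low as 10⁻⁶ requires at least a million impostor pairs. With fewer pairs, `allowed` is zero and the threshold falls back to the highest impostor score.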
Citations: 0
Area estimation methods in underwater video surveys: Biases, errors and their impacts on density estimates
IF 7.3 | Zone 2 (Environmental Science and Ecology) | Q1 ECOLOGY | Ecological Informatics, vol. 94, Article 103648 | Pub Date: 2026-03-01 | Epub Date: 2026-02-13 | DOI: 10.1016/j.ecoinf.2026.103648
Georgina Vickery, Fabian Zimmermann, Fletcher Thompson, Carsten Hvingel
Unmanned underwater vehicles (UUVs) are increasingly used for non-invasive data collection, with applications ranging from supporting fisheries stock assessment to biodiversity mapping. Estimating the area surveyed is crucial to calculate densities and abundances from observations. Area estimation methods range from utilising fixed transect dimensions, to advanced approaches that account for path deviations and intra-transect variability in field of view, often integrating multiple sensors or detailed bathymetric data. However, accuracy of position data from remote vehicles is limited by environmental and operational variability. Compounding these differences, researchers rarely fully document methodologies, preventing comparability between datasets.
This paper develops a transferable methodology to process UUV position data, assessing how 2 width and 12 length estimation methods affect the estimated area surveyed. This analysis uniquely includes the calculation of associated error and resulting confidence intervals for each approach. Results show species density estimates can vary by up to 13% depending on the processing applied. The two equations used to calculate transect width cause significant differences between density estimates. Significant differences in transect length also occur depending on the degree and method of smoothing applied to position data. Importantly, these differences between methodologies are encompassed by the variance calculated from the position data.
Recommendations to obtain representative transect areas are to validate width equations against laser measurements, use incremental position data, and include depth when calculating total distance travelled. Because the resulting density estimates vary across methods, it is essential to report confidence intervals and full details of pre-filtering and smoothing procedures.
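The recommendation to use incremental position data and include depth can be illustrated with a short Python sketch: transect length is the sum of distances between consecutive position fixes, area is length times width, and density is count divided by area. The function names, the constant transect width, and the coordinates are assumptions for illustration, and positions are assumed to be already projected to metres.

```python
import math


def path_length(points, include_depth=True):
    """Sum of straight-line distances between consecutive (x, y, z) fixes, in metres."""
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        dz = (z1 - z0) if include_depth else 0.0
        total += math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2 + dz ** 2)
    return total


def density(count, length_m, width_m):
    """Individuals per square metre for a strip transect of constant width."""
    return count / (length_m * width_m)


# Synthetic track: x and y are horizontal position, z is depth, all in metres.
track = [(0.0, 0.0, 0.0), (3.0, 0.0, 4.0), (3.0, 12.0, 9.0)]

length_3d = path_length(track)                       # 5 + 13 = 18.0
length_2d = path_length(track, include_depth=False)  # 3 + 12 = 15.0

print(density(36, length_3d, width_m=2.0))  # 1.0
print(density(36, length_2d, width_m=2.0))  # 1.2
```

In this toy track, ignoring depth shortens the transect from 18 m to 15 m and raises the density estimate from 1.0 to 1.2 individuals per square metre, which shows how the choice of length method feeds directly into density.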
Citations: 0
Predicting soil δ¹³C patterns in Brazil using nested datasets, feature selection, and machine learning
IF 7.3 Zone 2 Environmental Science and Ecology Q1 ECOLOGY Pub Date: 2026-03-01 Epub Date: 2026-02-16 DOI: 10.1016/j.ecoinf.2026.103647
Osmar Luiz Ferreira de Carvalho, Glauber das Neves, João Paulo Sena-Souza, Alexandre Tadeu Brunello, Vinicius Vasconcelos, Daniel Guerreiro e Silva, Maria Gabriella da Silva Araújo, Deoclecio Jardim Amorim, Luiz Antonio Martinelli, Gabriela Bielefeld Nardoto, Osmar Abílio de Carvalho Júnior
Soil δ¹³C is an integrative indicator of carbon cycling, vegetation composition, and land use dynamics. Despite increasing availability of high-resolution environmental datasets, predicting soil δ¹³C remains challenging due to collinear and scale-dependent biogeochemical processes, and few studies have systematically compared feature selection strategies or regression algorithms across spatial scales. This study introduces an innovative hierarchical framework for predicting the spatial distribution of soil δ¹³C across Brazil, systematically comparing feature selection strategies and machine learning algorithms across three nested datasets: Cerrado, extended Cerrado, and national scale. Predictors included climatic variables, topography, soil properties, and vegetation indices. Feature selection combined stepwise, recursive, and exhaustive searches, followed by variance inflation factor (VIF) filtering to reduce multicollinearity. Model benchmarking compared linear, kernel-based, and ensemble regressors under nested cross-validation, with performance assessed by coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE). Results show that model performance declined with increasing spatial extent, with best VIF-constrained R² decreasing from 0.77 (local) to 0.64 (regional) and 0.58 (national). Compact VIF-constrained subsets yielded similar accuracy to unconstrained sets, demonstrating that multicollinearity control improves parsimony without sacrificing predictive power. Ensemble regressors outperformed linear and kernel-based methods across all datasets. Feature importance shifted with spatial extent, with vegetation productivity and seasonal climate jointly structuring δ¹³C patterns rather than any single predictor dominating across scales. This framework advances δ¹³C isoscape modeling by combining predictive accuracy with interpretability, supporting applications in soil carbon monitoring, ecological research, and land-use planning.
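The VIF filtering step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vif` and `vif_filter`, the drop-the-worst-predictor loop, and the default threshold of 10 are all assumptions made for illustration. For predictor j, the sketch uses the standard definition VIF_j = 1 / (1 - R_j²), where R_j² is obtained by regressing predictor j on all remaining predictors.

```python
import numpy as np


def vif(X):
    """Return the variance inflation factor of each column of X.

    X has shape (n_samples, n_features). Columns are assumed non-constant.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all other columns.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


def vif_filter(X, names, threshold=10.0):
    """Drop the highest-VIF predictor until every VIF is <= threshold.

    The threshold and the greedy strategy are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        scores = vif(X[:, keep])
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        keep.pop(worst)
    return [names[i] for i in keep]
```

Calling `vif_filter(X, names)` on a predictor matrix returns the names of the retained columns; a predictor that is nearly a linear combination of others is removed first because its VIF is largest.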
Citations: 0
YOLO-HARVEST: A hybrid ViT architecture with locality-enhanced attention for automated wildlife species classification
IF 7.3 Zone 2 Environmental Science and Ecology Q1 ECOLOGY Pub Date: 2026-03-01 Epub Date: 2026-01-08 DOI: 10.1016/j.ecoinf.2026.103605
Anuruddha Paul, Rishi Raj, Mahendra Kumar Gourisaria, Amitkumar V. Jha, Nicu Bizon
Wildlife conservation efforts increasingly depend on automated species classification for processing large-scale camera trap data, yet existing approaches struggle with accuracy and computational efficiency in resource-constrained environments. This paper introduces HARVEST (Hierarchical Attention for Robust Vision Enhancement with Shifted Tokenization), a novel hybrid architecture integrating YOLOv8 object detection with transformer-based classification. The architecture incorporates three key innovations: Shifted Patch Tokenization (SPT) for boundary information preservation, Local Information Enhancer (LIFE) for spatial feature extraction, and Locality-Enhanced Attention (LEA) for adaptive feature integration. The model is evaluated on two datasets: a challenging 45-species Ohio State University (OSU) Small Animals dataset with extreme class imbalance (6320:1 ratio) and a balanced 6-species African wildlife dataset. HARVEST achieves 85.27% accuracy on the OSU dataset and 94.74% accuracy on the African wildlife dataset with only 13.0M parameters, an 85% reduction in parameter count compared to standard Vision Transformers while maintaining superior performance. The OSU evaluation demonstrates robust performance under highly imbalanced real-world conditions, with species sample sizes ranging from 1 to 6320 images, validating practical applicability for conservation scenarios. Qualitative analysis reveals biologically meaningful attention patterns focusing on taxonomically relevant features. The efficient architecture enables real-world deployment in conservation applications, providing a practical solution for automated wildlife monitoring and biodiversity surveillance.
Citations: 0
Reliable machine learning initialization methods for the calibration of Dynamic Energy Budget models
IF 7.3 Zone 2 Environmental Science and Ecology Q1 ECOLOGY Pub Date: 2026-03-01 Epub Date: 2026-01-30 DOI: 10.1016/j.ecoinf.2026.103624
Diogo F. Oliveira, Gonçalo M. Marques, Filipe M.P. Santos, Laure Pecquerie, João M.C. Sousa, Tiago Domingos
Dynamic Energy Budget (DEB) theory is a general theory that describes how organisms utilize the energy in food for maintenance, growth, development, and reproduction. DEB models have been widely applied in fields such as conservation biology, aquaculture and ecotoxicology, due to their ability to simulate how organisms respond to changing environmental conditions. To obtain a DEB model, the calibration problem must be solved: find the parameters that minimize the deviation between observed data and model predictions. While DEB model calibration is largely automated, the selection of initial parameters remains a key unresolved step, since the only automated method – the bijection method – often fails to produce a feasible initial parameter set. Consequently, modelers resort to trial-and-error to find parameters to seed the estimation. To bridge this gap, we propose using machine learning to initialize the calibration. We develop two models: a neural network and a 1-nearest-neighbor. Both models are built with a focus on feasibility, directly integrating parameter constraints into their structure. We train and evaluate our methods on the 5000+ DEB models in the Add-my-Pet database. Both methods generate feasible parameter sets in 99% of cases — compared to only 40% for the bijection method. The neural network initialization leads to improved DEB model calibration, achieving a calibration loss three times lower, on average, when compared to other methods. To support broader adoption, we have open-sourced our code and our models are available as initialization options within DEBtool, the primary software for parameter calibration.
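The idea of seeding a calibration from the most similar already-calibrated model can be sketched as a 1-nearest-neighbor lookup. Everything below is a hypothetical illustration and not the authors' code or the DEBtool interface: the function name `nearest_neighbor_init`, the choice of descriptive features, the standardized Euclidean distance, and the final clipping to box bounds are assumptions. The abstract states only that parameter constraints are built into the models, not how.

```python
import numpy as np


def nearest_neighbor_init(query, features, params, lower, upper):
    """Pick initial parameters from the most similar reference model.

    query:    feature vector describing the new species (hypothetical features)
    features: (n_models, n_features) features of already-calibrated models
    params:   (n_models, n_params) their calibrated parameter sets
    lower, upper: per-parameter bounds (assumed box constraints)
    """
    features = np.asarray(features, dtype=float)
    params = np.asarray(params, dtype=float)
    query = np.asarray(query, dtype=float)

    # Standardize features so that no single feature dominates the distance.
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0
    dists = np.linalg.norm((features - mean) / std - (query - mean) / std, axis=1)

    nearest = int(np.argmin(dists))
    # Clip to the bounds so the returned starting point respects them.
    return np.clip(params[nearest], lower, upper)
```

Because the returned vector is copied from an existing calibrated model and then clipped, it always lies inside the assumed bounds, which is the property a feasible starting point needs in this simplified setting.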
Citations: 0
Large language vision models for zero-shot handwriting recognition of historical herbarium labels
IF 7.3 Zone 2 Environmental Science and Ecology Q1 ECOLOGY Pub Date: 2026-03-01 Epub Date: 2026-02-12 DOI: 10.1016/j.ecoinf.2026.103656
Matthias Körschens, Solveig Franziska Bucher, Christiane M. Ritz, Sebastian Gebauer, Jens Wesenberg, Christine Römermann
Herbaria contain large numbers of preserved specimens that hold a wealth of information for biodiversity research, since they offer a record of the morphology as well as the temporal and spatial distribution of plant species worldwide. Besides the dried plant itself, herbarium specimens usually carry additional information, typically captured in printed or handwritten labels, such as the date of collection, the location and the collector’s name. While, for historical reasons, the specimens have been collected and labeled manually, considerable efforts are underway to digitize entire herbaria and thereby make the specimens available for analysis with automated methods. However, extracting information from handwritten labels is a considerable challenge, since the handwriting not only differs from one collector to another but is also often in old scripts (e.g., Sütterlin, an old German script). Therefore, the labels are often hard to decipher both manually and automatically, and barely any substantial, consistent data of this kind exists to train state-of-the-art vision models. Since the location of the labels differs depending on the record, they need to be detected before the automated analysis of the writing, which has also proved challenging in the past. In this work we show that state-of-the-art Large Language and Vision Models (LLVMs) are capable of extracting such handwriting zero-shot, i.e., completely without training or fine-tuning, with a high degree of accuracy. Additionally, we show that the results can be improved considerably by performing zero-shot detection of the labels beforehand. We evaluate our approach on two novel datasets, one containing handwritten labels and one containing printed labels, based on herbarium scans from the virtual herbarium of the flora of Germany. In our evaluations, the approaches achieve a mean similarity of 84.5% for handwritten labels and 93.1% for printed labels. We conclude that further evaluation is still needed before LLVMs can be fully applied to transcribing herbarium specimen labels, as species taxonomies and collection sites are sometimes not correctly identified. Still, these models can support the transcription process in large collections. Our code and a graphical web application are publicly available at https://github.com/Atlas8008/herbarium_label_reader.
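The abstract reports a mean similarity between transcriptions and reference labels but does not define the measure. As a rough illustration only, the sketch below assumes a character-level similarity computed with Python's standard `difflib.SequenceMatcher` after normalizing case and whitespace. The function names `similarity` and `mean_similarity` and the normalization step are assumptions, and the paper's actual metric may differ.

```python
from difflib import SequenceMatcher


def similarity(predicted: str, reference: str) -> float:
    """Similarity in [0, 1] between a transcription and its reference text."""
    # Normalize case and collapse runs of whitespace (an assumed choice).
    a = " ".join(predicted.lower().split())
    b = " ".join(reference.lower().split())
    return SequenceMatcher(None, a, b).ratio()


def mean_similarity(pairs) -> float:
    """Average similarity over (predicted, reference) pairs."""
    scores = [similarity(p, r) for p, r in pairs]
    return sum(scores) / len(scores)
```

A score of 1.0 means the normalized strings are identical and 0.0 means they share no matching characters, so averaging over a dataset yields a figure comparable in kind to the percentages quoted above.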
Citations: 0