Pub Date: 2024-12-01; DOI: 10.1016/j.ecoinf.2024.102919
Xinze Huo, Peizhen Zhang, Ziyi Feng
The rapid growth of offshore wind farms has become a global priority, with both new and total installed capacities increasing sharply. Consequently, the underwater noise generated by these developments has garnered significant attention. This study investigated the signals produced by 5.5 MW wind turbines at the Yangjiang offshore wind farm at various distances and depths. Results showed that the primary energy of the underwater noise was concentrated below 1500 Hz. At the same distance, deeper waters had lower noise levels than shallower waters. The discrete spectrum near the wind farm had a dominant frequency of 44 Hz. Measured 50 m from the turbine, the peak sound pressure levels reached 93.76 dB at a depth of 10 m and 81.55 dB at 20 m. Horizontally, the sound pressure level of the dominant frequency decreased consistently as the distance from the wind farm increased. The sound transmission loss was less than 10 dB within 1 km and reached 16.39 dB at 4 km, where noise levels approached ambient ocean noise. A segmented spectral wide-angle parabolic equation was used to simulate the spatial sound field of the underwater noise, considering seabed topography. The noise propagation and attenuation models were validated against the measured data. Understanding noise propagation and attenuation with distance is crucial for selecting suitable offshore wind farm locations. Mitigating the impact of elevated underwater noise on sound-dependent species is essential for their survival.
{"title":"Study of underwater sound propagation and attenuation characteristics at the Yangjiang offshore wind farm","authors":"Xinze Huo, Peizhen Zhang, Ziyi Feng","doi":"10.1016/j.ecoinf.2024.102919","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84","pages":"Article 102919"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142746180","RegionNum":2,"RegionCategory":"Environmental Science and Ecology","PMCID":"OA"}
Pub Date: 2024-12-01; DOI: 10.1016/j.ecoinf.2024.102911
Hyo Gyeom Kim, Chaehong Lim, Taesung Kim, Jeong-Hui Kim, Hyun-Woo Kim
Discharge regulation in rivers with cascade weirs profoundly alters natural flow dynamics, impacting habitat characteristics and leading to the homogenization of community structure and function. Identifying indicator species and their ecological thresholds is crucial for effectively monitoring and assessing the ecological conditions of regulated river systems. It is also essential for guiding conservation and management efforts. In this study, we investigated the influence of discharge patterns on zooplankton communities in the Yeongsan River, specifically focusing on the effects of flow regulation by cascade weirs. We analyzed the zooplankton community dynamics over a 12-year monitoring period and demonstrated the significance of discharge in shaping river ecosystem dynamics. Species indicator tests and gradient forest modeling were used to identify indicative genera and establish their ecological thresholds under various discharge patterns. Our findings revealed a distinct and significant discharge effect on zooplankton community composition and diversity, independent of water quality and nutrient-related factors. Monsoonal rainfall influenced the discharge patterns, which were categorized into three levels; this classification was further supported by indicator species and their responses. Despite their low abundances limiting clear responses, indicator genera, such as Rotaria and Macrothrix, were shaped by discharge levels. This study highlighted the need to incorporate discharge considerations into river management and conservation strategies to safeguard aquatic biodiversity. Our study provides valuable insights into sustainable river ecosystem management by elucidating the ecological consequences of flow regulation by cascade weirs.
{"title":"Impact of discharge regulation on zooplankton communities regarding indicator species and their thresholds in the cascade weirs of the Yeongsan River","authors":"Hyo Gyeom Kim, Chaehong Lim, Taesung Kim, Jeong-Hui Kim, Hyun-Woo Kim","doi":"10.1016/j.ecoinf.2024.102911","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84","pages":"Article 102911"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142746141","RegionNum":2,"RegionCategory":"Environmental Science and Ecology","PMCID":"OA"}
Pub Date: 2024-12-01; DOI: 10.1016/j.ecoinf.2024.102921
Elizabeth Wenk, Thomas Mesaglio, David Keith, Will Cornwell
Dynamic yet accurate species lists for protected areas are essential for conservation and biodiversity research. Even when such lists exist, changing taxonomy, ongoing species migrations and invasions, and new discoveries of historically overlooked species mean static lists can become rapidly outdated. Biodiversity databases such as the Global Biodiversity Information Facility, and citizen science platforms such as iNaturalist, offer rapidly accessible, georeferenced data, but their accuracy is rarely tested. Here we compare species lists generated for two of the world's oldest and most famous protected areas – Yosemite National Park in California, United States and Royal National Park in New South Wales, Australia – using both automated data extraction techniques and extensive manual curation steps. We show that automated list creation without manual curation offers inflated measures of species diversity. Lists generated from herbarium vouchers required more curation than lists generated from iNaturalist, with both incorrect coordinates attached to vouchers and long-outdated names inflating voucher-based species lists. In comparison, iNaturalist data had relatively few errors, in part due to continual curation by a large community, including many botanical experts, and the frequent and automatic implementation of taxonomic updates. As such, iNaturalist will become an increasingly accurate supplementary source for automated biodiversity lists over time, but it currently offers poor coverage of graminoid species and introduced species relative to showier, native taxa, and relies on continued expert contributions to identifications. At this point, researchers must manually curate lists extracted from herbarium vouchers or static park lists, and integrate these data with records from iNaturalist, to produce the most robust and taxonomically up-to-date species lists for protected areas.
{"title":"Curating protected area-level species lists in an era of diverse and dynamic data sources","authors":"Elizabeth Wenk, Thomas Mesaglio, David Keith, Will Cornwell","doi":"10.1016/j.ecoinf.2024.102921","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84","pages":"Article 102921"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142757379","RegionNum":2,"RegionCategory":"Environmental Science and Ecology","PMCID":"OA"}
Pub Date: 2024-12-01; DOI: 10.1016/j.ecoinf.2024.102925
Hetarth Raval, Jyotismita Chaki
Global food security is threatened by plant diseases, and manual detection methods are often labor-intensive and time-consuming. Deep learning offers a promising solution by enabling early and accurate detection of leaf diseases. This study presents a novel deep-learning model designed to address the challenges of real-world leaf disease identification. To enhance the model's robustness, we incorporated six datasets (LD, LD1, LD2, LD3, LD4, LD5), which include image augmentation techniques such as flipped versions (LD1) and controlled noise (LD2, LD3). Additionally, we introduced new datasets with additional noise types (LD4) and real-world scenarios (LD5). To further improve accuracy, we employed an ensemble approach, combining MobileNetV3_Small and EfficientNetV2B3 with weighted voting. Our model achieved exceptional performance, surpassing 94 % accuracy on imbalanced data (LD) and exceeding 99 % on balanced, high-quality data (LD1). Even in noisy environments (LD2, LD3, LD4, LD5), our model consistently outperformed other approaches, maintaining an accuracy rate above 90 %. To ensure transparency and interpretability, we utilized Explainable AI (LIME) to visualize the model's decision-making process. These results demonstrate the potential of our model as a reliable and accurate tool for leaf disease detection in practical agricultural settings.
{"title":"Ensemble transfer learning meets explainable AI: A deep learning approach for leaf disease detection","authors":"Hetarth Raval, Jyotismita Chaki","doi":"10.1016/j.ecoinf.2024.102925","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84","pages":"Article 102925"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142757380","RegionNum":2,"RegionCategory":"Environmental Science and Ecology","PMCID":"OA"}
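The weighted-voting step named in the abstract above can be illustrated with a minimal sketch. The abstract does not say whether the voting is soft (on probabilities) or hard (on labels), nor which weights were used, so the soft-voting form, the function name `weighted_soft_vote`, and the example weights below are all assumptions made for illustration.

```python
def weighted_soft_vote(prob_a, prob_b, w_a=0.5, w_b=0.5):
    """Combine two models' class-probability vectors with fixed weights.

    prob_a, prob_b: per-class probabilities from two models (same class order).
    w_a, w_b: hypothetical, non-negative model weights (not taken from the paper).
    Returns (index of the winning class, combined probability vector).
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same classes")
    total = w_a + w_b
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the two probability vectors, class by class.
    combined = [(w_a * pa + w_b * pb) / total for pa, pb in zip(prob_a, prob_b)]
    # The ensemble prediction is the class with the highest combined score.
    best = max(range(len(combined)), key=combined.__getitem__)
    return best, combined


# Illustrative call: the two models disagree, and the heavier weight decides.
label, probs = weighted_soft_vote([0.6, 0.3, 0.1], [0.2, 0.7, 0.1], w_a=0.4, w_b=0.6)
print(label, probs)
```

In practice the weights would typically be tuned on a validation set; equal weights reduce this to plain probability averaging.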
Pub Date: 2024-12-01; DOI: 10.1016/j.ecoinf.2024.102892
Stephan van der Westhuizen, Gerard B.M. Heuvelink, Sugnet Gardner-Lubbe, Catherine E. Clarke
In digital soil mapping, machine learning models are gradually replacing traditional statistical models because of their greater flexibility and better prediction performance. However, unlike traditional models, a notable drawback of machine learning models is that they are “black-box” in nature due to their limited ability to provide comprehensive interpretations for their predictions. Explainable machine learning (XML) methods provide visualisations that can be used to aid in understanding predictions made by machine learning models. Popular model-agnostic visualisation methods include partial dependence plots, individual conditional expectation curves, and partial dependence plots produced with Shapley values. These methods require that covariates are uncorrelated, which can be restrictive. For cases where covariates are correlated, an alternative approach is the Accumulated Local Effect plot, which, however, is limited to depicting one or two covariates at a time. Another disadvantage of the above-mentioned methods is that they offer no readily available goodness-of-fit metric. In this paper we propose the use of a principal component analysis biplot as a model-agnostic method to gain insight into machine learning predictions in digital soil mapping. A biplot is a powerful visualisation tool that is used to seek patterns in multivariate data. A biplot does not require covariates included in the visualisation to be uncorrelated, and furthermore, an analytically derived goodness-of-fit metric is provided which allows the user to evaluate the accuracy of the approximation. We present examples from a case study in South Africa in which soil organic carbon is mapped with a random forest model. Our findings show that biplots can provide meaningful interpretations for predictions, making them a worthy addition to the XML toolkit.
{"title":"Biplots for understanding machine learning predictions in digital soil mapping","authors":"Stephan van der Westhuizen, Gerard B.M. Heuvelink, Sugnet Gardner-Lubbe, Catherine E. Clarke","doi":"10.1016/j.ecoinf.2024.102892","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84","pages":"Article 102892"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142746187","RegionNum":2,"RegionCategory":"Environmental Science and Ecology","PMCID":"OA"}
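As a rough illustration of the PCA biplot idea in the abstract above, the sketch below computes two-dimensional sample coordinates, covariate axis directions, and an overall quality-of-display value from a singular value decomposition. It is a generic PCA biplot, not the authors' implementation: the function name `pca_biplot`, the toy data, and the choice of the variance ratio as the goodness-of-fit measure are assumptions, and NumPy is assumed to be installed.

```python
import numpy as np


def pca_biplot(X, n_dims=2):
    """Return sample scores, covariate axes and display quality for a PCA biplot.

    X: (n_samples, n_covariates) array; columns may be correlated.
    Quality is the share of total variance captured by the first n_dims
    principal components (1.0 means the display is exact).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # centre each covariate
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_dims] * s[:n_dims]        # sample points in the display
    axes = Vt[:n_dims].T                       # one direction per covariate
    quality = float((s[:n_dims] ** 2).sum() / (s ** 2).sum())
    return scores, axes, quality


# Toy data (hypothetical): the third covariate is the sum of the first two,
# so a two-dimensional display loses nothing.
X = [[1, 2, 3], [2, 1, 3], [3, 5, 8], [4, 3, 7], [5, 6, 11]]
scores, axes, quality = pca_biplot(X)
print(scores.shape, axes.shape, round(quality, 6))
```

To inspect model predictions in this way, the predicted values could be appended as an extra column or used to colour the sample points; either choice is an illustration rather than the paper's procedure.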
Underwater passive acoustics is used worldwide for multi-year monitoring of marine mammals. Yet, the large amount of audio recordings raises the need to automate the detection of acoustic events. For instance, the increasing number of Offshore Wind Farms (OWF) raises key environmental and societal issues relating to their impacts on wildlife. In this context, monitoring marine mammals along with information on their acoustic environment throughout the OWF life cycle is crucial. The objective of this study is to evaluate the ability of a single deep learning model to precisely detect and localize, in time and in frequency, the marine mammal sounds over a wide frequency range and classify them by species and sound types.
A broadband hydrophone, deployed at the Fécamp OWF (Normandy, France), recorded the underwater soundscape including sounds from marine mammals occurring in the area. To visualize these sounds, 15-s spectrograms were computed. From these images, dolphin (D) and porpoise (P) sounds were manually annotated, including different types of sounds: Click-Trains (DCT, PCT), Buzzes (DB, PB) and Whistles (DW). The spectrograms were then split into five-fold cross-validation datasets, each containing one half of manual annotations and one half of only background noise. A Faster R-CNN model was trained to precisely detect and classify the marine mammal sounds in the spectrograms.
Three model output configurations were used: (1) overall detection of marine mammals (presence vs. absence), (2) detection and classification of species (two classes: dolphin, porpoise) and (3) sound types (five classes: DCT, DB, DW, PCT, PB). For the simplest configuration (1), the model flagged 15.4 % of the spectrogram dataset as containing detections while missing only 6.6 % of annotated spectrograms. For the more precise configurations, (2) and (3), the mean Average Precision (mAP) values achieved were 92.3 % (2) and 84.3 % (3), and the macro-average Area Under the Curve (AUC) values were 95.7 % (2) and 94.9 % (3).
This model will help to speed up the annotation processes, by reducing the spectrogram quantity to be manually analyzed and having time-frequency boxes already drawn. Several model parameters can be adjusted to trade off missed detections and false positives which need to be carefully considered and adapted to the problem. For instance, these adjustments would be particularly relevant depending on the human resources available to manually check the model detections and the criticality of missing marine mammal sounds. These models are promising, ranging from the simple detection of marine mammal presence to precise ecological inferences over the long term.
{"title":"A deep learning model for detecting and classifying multiple marine mammal species from passive acoustic data","authors":"Quentin Hamard, Minh-Tan Pham, Dorian Cazau, Karine Heerah","doi":"10.1016/j.ecoinf.2024.102906","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84","pages":"Article 102906"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142746140","RegionNum":2,"RegionCategory":"Environmental Science and Ecology","PMCID":"OA"}
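The trade-off between missed detections and review effort described in the abstract above can be sketched with a single score threshold. This is a simplified illustration, not the paper's evaluation code: it assumes each spectrogram is summarised by the highest confidence score among its predicted boxes, and the function name and toy values are hypothetical.

```python
def review_load_and_misses(max_scores, has_annotation, threshold):
    """Summarise a presence/absence configuration at one score threshold.

    max_scores: highest detection score per spectrogram (0.0 if no box).
    has_annotation: True where a human annotated a marine mammal sound.
    Returns (fraction of spectrograms flagged for review,
             fraction of annotated spectrograms that were missed).
    """
    flagged = [score >= threshold for score in max_scores]
    flagged_fraction = sum(flagged) / len(max_scores)
    # Keep only the spectrograms that truly contain annotated sounds.
    annotated_flags = [f for f, a in zip(flagged, has_annotation) if a]
    if not annotated_flags:
        return flagged_fraction, 0.0
    missed_fraction = annotated_flags.count(False) / len(annotated_flags)
    return flagged_fraction, missed_fraction


# Toy values (hypothetical): lowering the threshold flags more spectrograms
# for manual checking but misses fewer annotated ones.
scores = [0.9, 0.2, 0.6, 0.1, 0.4]
truth = [True, False, True, False, True]
for t in (0.5, 0.3):
    print(t, review_load_and_misses(scores, truth, t))
```

A lower threshold suits cases where missing a sound is costly and reviewers are available; a higher one suits limited manual-checking capacity.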
Pub Date: 2024-12-01; DOI: 10.1016/j.ecoinf.2024.102922
Xin Tian, Weifan Cao, Shaowen Liu, Buyue Zhang, Junshuo Wei, Zheng Ma, Rui Gao, Zhongbin Su, Shoutian Dong
Accurately predicting the growth status of rice seedlings and understanding their growth rate and health in a timely manner help adjust the growth cycle and management measures. By predicting the growth status of the seedlings, the best time for transplanting can be selected, improving the survival rate and overall health of the seedlings, thereby enhancing yield and quality. Therefore, this study proposes a data-driven time-series model, the U + LSTM-F model, for predicting the growth status of Wuyou Rice 4 seedlings. First, the U-Net model is employed to segment sequentially collected images, extracting features such as leaf age and stem length of the rice seedlings. Subsequently, the collected ambient temperature and humidity data are aligned with the leaf age and stem length data. Finally, the LSTM model is used for time-series analysis, enabling the model to learn the temporal relationship between environmental and growth data and predict the growth trend of the rice seedlings. Additionally, an attention mechanism is introduced to enhance model performance, and the model's effectiveness is evaluated using multiple quantitative metrics. The proposed model achieves an RMSE of 0.032 and MAPE of 0.895 % for leaf age prediction, and an RMSE of 0.067 and MAPE of 0.814 % for stem length prediction. The experimental results show that this data-driven approach, which combines growth data with environmental data, exhibits high accuracy in predicting the leaf age and stem length of rice seedlings. This provides a more accurate tool for predicting the growth of rice seedlings, offering valuable insights for rice seedling cultivation research.
{"title":"U + LSTM-F: A data-driven growth process model of rice seedlings","authors":"Xin Tian, Weifan Cao, Shaowen Liu, Buyue Zhang, Junshuo Wei, Zheng Ma, Rui Gao, Zhongbin Su, Shoutian Dong","doi":"10.1016/j.ecoinf.2024.102922","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84","pages":"Article 102922"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142746186","RegionNum":2,"RegionCategory":"Environmental Science and Ecology","PMCID":"OA"}
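The two error metrics quoted in the abstract above, RMSE and MAPE, follow standard definitions. The sketch below uses those textbook forms; the function names and the toy numbers are illustrative and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an observed value is zero, so such inputs are rejected.
    """
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined for zero observations")
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy values (hypothetical), e.g. stem lengths in arbitrary units.
obs = [2.0, 4.0]
pred = [1.0, 5.0]
print(rmse(obs, pred), mape(obs, pred))
```

RMSE is scale-dependent, so values for leaf age and stem length are not directly comparable with each other, whereas MAPE is a relative measure.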
Runoff is pivotal in water resource management and ecological conservation. Current research predominantly emphasizes enhancing the precision of machine learning-based runoff predictions, with limited focus on their physical interpretability. This study introduces an innovative two-step hybrid runoff prediction framework tailored for the headwater region of the Yellow River Basin (YRB) to improve prediction accuracy and elucidate the runoff modeling process. The framework integrates machine learning techniques with dual signal decomposition approaches, incorporating diverse hydrometeorological and geographic indicators. Long Short-Term Memory (LSTM) and eXtreme Gradient Boosting (XGBoost) algorithms were employed to predict monthly runoff generation in sub-basins delineated by the Soil and Water Assessment Tool (SWAT), which were subsequently integrated using a Recurrent Neural Network (RNN) for monthly runoff concentration prediction. Results indicate that the proposed models delivered superior prediction performance compared to the SWAT model (R² = 0.86, NSE = 0.85), with the LSTM-based two-step hybrid model (R² = 0.90, NSE = 0.90) outperforming the XGBoost-based model (R² = 0.89, NSE = 0.88). The dual decomposition method, integrating seasonal-trend decomposition based on loess (STL) and successive variational mode decomposition (SVMD), demonstrated exceptional efficacy in addressing the complexities of hydrometeorological time series. Models decomposed by STL-SVMD exhibited the highest average R² and NSE values, as well as the lowest RMSE and MAE values in sub-basin runoff calculations. The low standard deviations of performance metrics further underscored the stability of these models across all sub-basins. This study demonstrates the efficacy of the proposed two-step hybrid model for simulating physical runoff processes in the headwater region of the YRB, providing valuable insights for regional hydrological cycle research and hydro-ecological security.
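The abstract above compares models by R² and the Nash-Sutcliffe efficiency (NSE). As a minimal sketch of how these two scores are commonly computed, and not the authors' code, the following uses only the Python standard library. The function names and the four-value runoff series are hypothetical and chosen purely for illustration; R² is taken here as the squared Pearson correlation, which is an assumption about the definition used in the paper.

```python
from statistics import mean


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (squared error) / (spread of observations)."""
    obs_mean = mean(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated series."""
    mo, ms = mean(observed), mean(simulated)
    cov = sum((o - mo) * (s - ms) for o, s in zip(observed, simulated))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_s = sum((s - ms) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


# Hypothetical monthly runoff values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 37.0]

print("NSE:", nse(observed, simulated))
print("R^2:", r_squared(observed, simulated))

# A simulation with a constant bias keeps a perfect correlation
# but is penalized by NSE.
biased = [o + 5.0 for o in observed]
print("biased R^2:", r_squared(observed, biased))
print("biased NSE:", nse(observed, biased))
```

The biased series shows why the two scores are usually reported together: a constant offset leaves the correlation-based R² at 1 while NSE drops to 0.8, because NSE measures squared error against the observations themselves.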
{"title":"Two-step hybrid model for monthly runoff prediction utilizing integrated machine learning algorithms and dual signal decompositions","authors":"Shujun Wu , Zengchuan Dong , Sandra M. Guzmán , Gregory Conde , Wenzhuo Wang , Shengnan Zhu , Yiqing Shao , Jinyu Meng","doi":"10.1016/j.ecoinf.2024.102914","DOIUrl":"10.1016/j.ecoinf.2024.102914","url":null,"abstract":"<div><div>Runoff is pivotal in water resource management and ecological conservation. Current research predominantly emphasizes enhancing the precision of machine learning-based runoff predictions, with limited focus on their physical interpretability. This study introduces an innovative two-step hybrid runoff prediction framework tailored for the headwater region of the Yellow River Basin (YRB) to improve prediction accuracy and elucidate the runoff modeling process. The framework integrates machine learning techniques with dual signal decomposition approaches, incorporating diverse hydrometeorological and geographic indicators. Long Short-Term Memory (LSTM) and eXtreme Gradient Boosting (XGBoost) algorithms were employed to predict monthly runoff generation in sub-basins delineated by the Soil and Water Assessment Tool (SWAT), which were subsequently integrated using a Recurrent Neural Network (RNN) for monthly runoff concentration prediction. Results indicate that the proposed models delivered superior prediction performance compared to the SWAT model (R<sup>2</sup> = 0.86, NSE = 0.85), with the LSTM-based two-step hybrid model (R<sup>2</sup> = 0.90, NSE = 0.90) outperforming the XGBoost-based model (R<sup>2</sup> = 0.89, NSE = 0.88). The dual decomposition method, integrating seasonal-trend decomposition based on loess (STL) and successive variational mode decomposition (SVMD), demonstrated exceptional efficacy in addressing the complexities of hydrometeorological time series. 
Models decomposed by STL-SVMD exhibited the highest average R<sup>2</sup> and NSE values, as well as the lowest RMSE and MAE values in sub-basin runoff calculations. The low standard deviations of performance metrics further underscored the stability of these models across all sub-basins. This study demonstrates the efficacy of the proposed two-step hybrid model for simulating physical runoff processes in the headwater region of the YRB, providing valuable insights for regional hydrological cycle research and hydro-ecological security.</div></div>","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84 ","pages":"Article 102914"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142746188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-12-01 DOI: 10.1016/j.ecoinf.2024.102920
Shuhan Jia, Linlin Bei, Yu Li, Quanhua Zhao
In this study, a novel multiple spatiotemporal data interpolating empirical orthogonal function (MS-DINEOF) method was employed to solve the problem of missing remote sensing data in the estimation of ocean primary productivity (OPP). The scheme was integrated with a vertically generalized productivity model (VGPM) for estimating OPP. First, a new time-scale feature was defined for effectively preserving spatiotemporal characteristics during the reconstruction of missing remote sensing data. The proposed algorithm, which integrates MS-DINEOF for reconstructing sea surface temperature, chlorophyll-a concentration, photosynthetically active radiation, and diffuse attenuation coefficient at 490 nm data, with VGPM for OPP estimation, was implemented for the Bohai Sea from 2010 to 2021. The main results are as follows: (1) The root mean square error values of the reconstructed data were all less than 0.1, and the absolute error values of the estimated OPP were even smaller. The quality of the reconstructed data using the MS-DINEOF algorithm was high, both for overall and local data. (2) The OPP in the Bohai Sea exhibited obvious seasonal fluctuations. (3) The spatial distribution of OPP exhibited regional characteristics over time. Specifically, OPP in the Bohai Sea showed a decreasing trend from the coastal sea to the distant sea during the periods 2010–2014, 2015–2019, and 2020–2021. The OPPs were higher in the coastal areas than in Bohai Bay and Laizhou Bay and gradually decreased from the coastal sea to the distant sea in July and August during 2015–2019.
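The reconstruction step described above builds on DINEOF, which fills gaps in a space-by-time data matrix by repeatedly replacing the missing entries with a truncated EOF (singular value) reconstruction. The following is a deliberately simplified sketch of that idea, not the MS-DINEOF method of the paper: it keeps a single mode, skips the mean removal and the cross-validated choice of mode count that DINEOF normally performs, and does not include the time-scale feature the authors introduce. The function names and the small matrix are hypothetical.

```python
def leading_mode(matrix, iterations=50):
    """Leading singular triplet (sigma, u, v) by power iteration on A^T A."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = sum(x * x for x in u) ** 0.5
    u = [x / sigma for x in u]
    return sigma, u, v


def fill_gaps_rank1(data, sweeps=50):
    """Fill None entries by iterating a one-mode reconstruction (simplified)."""
    rows, cols = len(data), len(data[0])
    missing = [(i, j) for i in range(rows) for j in range(cols)
               if data[i][j] is None]
    known = [x for row in data for x in row if x is not None]
    first_guess = sum(known) / len(known)  # start gaps at the observed mean
    filled = [[first_guess if x is None else x for x in row] for row in data]
    for _ in range(sweeps):
        sigma, u, v = leading_mode(filled)
        for i, j in missing:
            # Only the gaps are overwritten; observed values stay untouched.
            filled[i][j] = sigma * u[i] * v[j]
    return filled


# Hypothetical 4 x 3 field with one spatial pattern times one temporal pattern.
space = [1.0, 2.0, 3.0, 4.0]
time_pattern = [1.0, 2.0, 3.0]
field = [[a * b for b in time_pattern] for a in space]
field[1][1] = None  # simulate a cloud-masked pixel (true value 4.0)

reconstructed = fill_gaps_rank1(field)
print("filled value:", reconstructed[1][1])
```

Because the toy field is exactly one mode, the gap converges to the value that mode implies. Real satellite fields need several modes, and the number retained is normally chosen by withholding some observed pixels and minimizing the reconstruction error on them.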
{"title":"Spatiotemporal analysis of ocean primary productivity in Bohai Sea estimated using improved DINEOF reconstructed MODIS data","authors":"Shuhan Jia , Linlin Bei , Yu Li , Quanhua Zhao","doi":"10.1016/j.ecoinf.2024.102920","DOIUrl":"10.1016/j.ecoinf.2024.102920","url":null,"abstract":"<div><div>In this study, a novel multiple spatiotemporal data interpolating empirical orthogonal function (MS-DINEOF) method was employed to solve the problem of missing remote sensing data in the estimation of ocean primary productivity (OPP). The scheme was integrated with a vertically generalized productivity model (VGPM) for estimating OPP. First, a new time-scale feature was defined for effectively preserving spatiotemporal characteristics during the reconstruction of missing remote sensing data. The proposed algorithm, which integrates MS-DINEOF for reconstructing sea surface temperature, chlorophyll-a concentration, photosynthetically active radiation, and diffuse attenuation coefficient at 490 nm data, with VGPM for OPP estimation, was implemented for the Bohai Sea from 2010 to 2021. The main results are as follows: (1) The root mean square error values of the reconstructed data were all less than 0.1, and the absolute error values of the estimated OPP were even smaller. The quality of the reconstructed data using the MS-DINEOF algorithm was high, both for overall and local data. (2) The OPP in the Bohai Sea exhibited obvious seasonal fluctuations. (3) The spatial distribution of OPP exhibited regional characteristics over time. Specifically, OPP in the Bohai Sea showed a decreasing trend from the coastal sea to the distant sea during the periods 2010–2014, 2015–2019, and 2020–2021. 
The OPPs were higher in the coastal areas than in Bohai Bay and Laizhou Bay and gradually decreased from the coastal sea to the distant sea in July and August during 2015–2019.</div></div>","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84 ","pages":"Article 102920"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142746181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-12-01 DOI: 10.1016/j.ecoinf.2024.102917
Pradeep Wagle, Gopichandh Danala, Catherine Donner, Xiangming Xiao, Corey Moffet, Stacey A. Gunter, Wolfgang Jentner, David S. Ebert
The vegetation phenology of tallgrass prairie varies yearly, depending on climatic conditions, plant species composition, and location. Modeling time series of vegetation indices (VIs) using climate data can be useful for understanding and predicting how tallgrass prairie will respond to future climate scenarios and for identifying and managing areas of tallgrass prairie that are particularly susceptible to climate-induced changes. Machine or deep learning algorithms can be well-suited to model VIs for phenology studies by identifying patterns and relationships between climatic factors and VIs using historical data. This study evaluated the performance of 12 machine and deep learning algorithms, encompassing a diverse range of algorithmic families, in modeling patterns of the Moderate Resolution Imaging Spectroradiometer-derived enhanced vegetation index (EVI, greenness index) and land surface water index (LSWI) in native tallgrass prairie. The models include linear regression, Bayesian ridge, elastic net, decision tree, random forest, eXtreme Gradient Boosting (XGBoost), support vector regression (SVR), K-nearest neighbors (KNN), artificial neural network (ANN), convolutional neural network (CNN), recurrent neural network (RNN), and long short-term memory (LSTM). Air and soil temperatures showed the highest correlations with EVI (r ≥ 0.77) and LSWI (r ≥ 0.56). The low correlation (r ≤ 0.23) of EVI and LSWI with contemporaneous rainfall or soil moisture suggests vegetation's delayed response to these factors. The results indicated that ensemble methods like XGBoost and random forest performed best across all three datasets (i.e., training, testing, and validation) for modeling EVI and LSWI. Deep learning models showed varying performance across datasets, and their performance was sub-optimal compared to XGBoost and random forest. The linear regression also showed a moderate performance, while the decision tree performed the weakest overall. 
The strong performance of XGBoost and random forest highlights the intricate and nonlinear relationship of prairie vegetation with climatic factors. These models' strength lies in capturing such complexities. This study provides insights into the key climatic factors and underlying processes that control the vegetation dynamics of tallgrass prairie ecosystems. Our machine learning models can be a valuable tool for developing new strategies to manage tallgrass prairie ecosystems in the face of climate change.
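The abstract above attributes the low contemporaneous correlation between the vegetation indices and rainfall to a delayed vegetation response. One common way to probe such a delay is to correlate the driver with the response shifted by a number of time steps. The sketch below illustrates that check on synthetic data only; it is not the authors' analysis, and the function names, the two-step delay, and all values are assumptions made for the example.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def lagged_correlation(driver, response, lag):
    """Correlate driver[t] with response[t + lag], for lag >= 0 time steps."""
    if lag == 0:
        return pearson(driver, response)
    return pearson(driver[:-lag], response[lag:])


# Synthetic series: the index responds to rainfall two steps later.
rainfall = [0.0, 5.0, 0.0, 0.0, 8.0, 0.0, 0.0, 3.0, 0.0, 0.0, 6.0, 0.0]
evi = [0.2 + 0.05 * (rainfall[t - 2] if t >= 2 else 0.0)
       for t in range(len(rainfall))]

for lag in range(4):
    print("lag", lag, "r =", round(lagged_correlation(rainfall, evi, lag), 3))
```

In this constructed case the correlation is weak at lag 0 and reaches 1 at lag 2, which is the signature of a delayed response. With real data the peak would be lower and broader, and the lag giving the highest correlation is only an estimate of the delay.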
{"title":"Modeling time series of vegetation indices in tallgrass prairie using machine and deep learning algorithms","authors":"Pradeep Wagle , Gopichandh Danala , Catherine Donner , Xiangming Xiao , Corey Moffet , Stacey A. Gunter , Wolfgang Jentner , David S. Ebert","doi":"10.1016/j.ecoinf.2024.102917","DOIUrl":"10.1016/j.ecoinf.2024.102917","url":null,"abstract":"<div><div>The vegetation phenology of tallgrass prairie varies yearly, depending on climatic conditions, plant species composition, and location. Modeling time series of vegetation indices (VIs) using climate data can be useful for understanding and predicting how tallgrass prairie will respond to future climate scenarios and for identifying and managing areas of tallgrass prairie that are particularly susceptible to climate-induced changes. Machine or deep learning algorithms can be well-suited to model VIs for phenology studies by identifying patterns and relationships between climatic factors and VIs using historical data. This study evaluated the performance of 12 machine and deep learning algorithms, encompassing a diverse range of algorithmic families, in modeling patterns of the Moderate Resolution Imaging Spectroradiometer-derived enhanced vegetation index (EVI, greenness index) and land surface water index (LSWI) in native tallgrass prairie. The models include linear regression, Bayesian ridge, elastic net, decision tree, random forest, eXtreme Gradient Boosting (XGBoost), support vector regression (SVR), K-nearest neighbors (KNN), artificial neural network (ANN), convolutional neural network (CNN), recurrent neural network (RNN), and long short-term memory (LSTM). Air and soil temperatures showed the highest correlations with EVI (<em>r</em> ≥ 0.77) and LSWI (<em>r</em> ≥ 0.56). The low correlation (<em>r</em> ≤ 0.23) of EVI and LSWI with contemporaneous rainfall or soil moisture suggests vegetation's delayed response to these factors. 
The results indicated that ensemble methods like XGBoost and random forest performed best across all three datasets (i.e., training, testing, and validation) for modeling EVI and LSWI. Deep learning models showed varying performance across datasets, and their performance was sub-optimal compared to XGBoost and random forest. The linear regression also showed a moderate performance, while the decision tree performed the weakest overall. The strong performance of XGBoost and random forest highlights the intricate and nonlinear relationship of prairie vegetation with climatic factors. These models' strength lies in capturing such complexities. This study provides insights into the key climatic factors and underlying processes that control the vegetation dynamics of tallgrass prairie ecosystems. Our machine learning models can be a valuable tool for developing new strategies to manage tallgrass prairie ecosystems in the face of climate change.</div></div>","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"84 ","pages":"Article 102917"},"PeriodicalIF":5.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142746189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}