Pub Date: 2024-10-28 DOI: 10.1016/j.mlwa.2024.100595
Danxu Wang, Emma Regentova, Venkatesan Muthukumar, Markus Berli, Frederick C. Harris Jr.
The heat from wildfires volatilizes organic compounds in soil; these condense on cooler soil particles and form a waxy layer that causes the soil to repel water. Timely assessment of soil water repellency (SWR) is critical for predicting and preventing the detrimental impacts of hydrophobic soils, such as soil erosion, reduced availability of water to plants, and post-rainfall runoff that leads to floods. The Water Drop Penetration Time (WDPT), i.e., the time elapsed from a drop landing on the soil surface to its complete absorption, is commonly used to assess the SWR level. Manual WDPT measurements vary with the instruments used and with the observer's subjective judgment. The goal of this work is to design an automated system that performs standardized WDPT tests and assesses SWR levels. The system consists of an electronically controlled mechanism that releases a water drop and a video camera that records the water penetration process. The penetration process is modeled as an “action” in video, and Temporal Action Localization (TAL) analytics are used to predict the WDPT and assess the SWR level.
Title: A machine learning framework to measure Water Drop Penetration Time (WDPT) for soil water repellency analysis. Journal: Machine Learning with Applications, vol. 18, Article 100595.
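The step from a localized "action" to a WDPT value and an SWR level can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `ActionInterval` type, the frame rate, and the class boundaries (a commonly cited WDPT scale of 5, 60, 600, and 3600 seconds) are assumptions that do not come from the abstract.

```python
from dataclasses import dataclass

# Assumed class boundaries in seconds (a commonly cited WDPT scale,
# not taken from the abstract): <5, 5-60, 60-600, 600-3600, >=3600.
SWR_CLASSES = [
    (5.0, "wettable"),
    (60.0, "slightly repellent"),
    (600.0, "strongly repellent"),
    (3600.0, "severely repellent"),
]


@dataclass
class ActionInterval:
    """Hypothetical TAL output: frames where the drop lands and is fully absorbed."""
    start_frame: int
    end_frame: int


def wdpt_seconds(interval: ActionInterval, fps: float) -> float:
    """Convert a localized frame interval into a penetration time in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    if interval.end_frame < interval.start_frame:
        raise ValueError("end_frame precedes start_frame")
    return (interval.end_frame - interval.start_frame) / fps


def swr_level(wdpt: float) -> str:
    """Map a WDPT value to a repellency class using the assumed boundaries."""
    for upper_bound, label in SWR_CLASSES:
        if wdpt < upper_bound:
            return label
    return "extremely repellent"


# Example: an action spanning 2400 frames recorded at 30 frames per second.
interval = ActionInterval(start_frame=120, end_frame=2520)
penetration_time = wdpt_seconds(interval, fps=30.0)
print(penetration_time, swr_level(penetration_time))
```

Keeping the boundaries in a single table makes it easy to swap in whichever classification scale a given study actually uses.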
Pub Date: 2024-10-28 DOI: 10.1016/j.mlwa.2024.100600
Vahid Daghigh, Hamid Daghigh, Thomas E. Lacy Jr., Mohammad Naraghi
Machine learning (ML) techniques have shown promise across a broad range of topics in engineering, composite materials behavior analysis, and manufacturing. This paper reviews successful ML implementations for identifying defects and damage in composites and for tracking their progression. The focus is on predicting composites' responses under specific loads and environments and on optimizing setting and imperfection sensitivity. Discussions and recommendations on promising ML implementation practices that yield interpretable results in composites analysis are provided.
Title: Review of machine learning applications for defect detection in composite materials. Journal: Machine Learning with Applications, vol. 18, Article 100600.
Dendrobium officinale is a well-recognized functional food material. Because its therapeutic effect and price vary with geographical origin, this paper proposes an origin identification method based on Raman spectroscopy and an NNRW (neural network with random weights)-stacking ensemble model. In a case study of Dendrobium officinale samples from three geographical origins, we compare single estimators, i.e., KNN (k-nearest neighbors), MLP (multi-layer perceptron), DTC (decision tree classifier), and NNRW, with their stacking ensemble counterparts. The NNRW-stacking ensemble has the best test accuracy (96.3%) and the fastest fitting speed among all ensembles. In conclusion, the NNRW-stacking ensemble model combined with Raman spectroscopy is a promising method for identifying the geographical origin of herbs. The proposed model demonstrates the speed advantage of NNRW (no need for gradient-based iterations) and the generalization power of stacking ensembles (reduced single-estimator bias).
Title: Geographical origin identification of dendrobium officinale based on NNRW-stacking ensembles. Authors: Yinsheng Zhang, Chen Chen, Fangjie Guo, Haiyan Wang. Journal: Machine Learning with Applications, vol. 18, Article 100594. Pub Date: 2024-10-23. DOI: 10.1016/j.mlwa.2024.100594.
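To make the "no need for gradient-based iterations" point concrete, here is a minimal sketch of a single NNRW base learner in plain Python. It assumes a tanh hidden layer and ridge-regularized least squares for the output weights; the class name `NNRW`, its parameters, and the helper `solve_linear` are hypothetical, and the stacking step described in the abstract is not shown.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    b may have several columns; the result has the same shape as b.
    """
    n = len(a)
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


class NNRW:
    """Single hidden layer with fixed random weights and closed-form output weights."""

    def __init__(self, n_hidden=20, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = random.Random(seed)

    def _hidden(self, x):
        return [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
                for ws, b in zip(self.w_in, self.b_in)]

    def fit(self, X, y, n_classes):
        n_features = len(X[0])
        # Hidden weights are drawn once and never trained.
        self.w_in = [[self.rng.uniform(-1, 1) for _ in range(n_features)]
                     for _ in range(self.n_hidden)]
        self.b_in = [self.rng.uniform(-1, 1) for _ in range(self.n_hidden)]
        H = [self._hidden(x) for x in X]
        T = [[1.0 if label == c else 0.0 for c in range(n_classes)] for label in y]
        # Normal equations: (H^T H + ridge * I) W = H^T T, solved in one shot.
        m = self.n_hidden
        gram = [[sum(h[i] * h[j] for h in H) + (self.ridge if i == j else 0.0)
                 for j in range(m)] for i in range(m)]
        rhs = [[sum(h[i] * t[c] for h, t in zip(H, T)) for c in range(n_classes)]
               for i in range(m)]
        self.w_out = solve_linear(gram, rhs)
        return self

    def predict(self, X):
        predictions = []
        for x in X:
            h = self._hidden(x)
            scores = [sum(h[i] * self.w_out[i][c] for i in range(self.n_hidden))
                      for c in range(len(self.w_out[0]))]
            predictions.append(scores.index(max(scores)))
        return predictions
```

Because only the output layer is fitted, and in closed form, training cost is dominated by one linear solve. That is the property that makes such a learner cheap to reuse inside a stacking ensemble.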
Pub Date: 2024-10-19 DOI: 10.1016/j.mlwa.2024.100593
Michael Nigro, Sridhar Krishnan
Audio scene analysis involves a variety of tasks that extract information from an audio environment. Audio source counting is one such task; it has implications for many other aspects of audio analysis, yet it remains relatively unexplored. This work presents the first review of the audio source counting literature and aims to convey the significance of this task to the wider domain of audio analysis. We identify and discuss connections between audio source counting and other, more commonly studied audio analysis tasks. In addition, we review the publicly available audio datasets, highlighting the lack of datasets geared towards audio source counting. The goal of this review is to promote future research on audio source counting.
Title: Trends in audio scene source counting and analysis. Journal: Machine Learning with Applications, vol. 18, Article 100593.
Breast cancer (BC) is the most common type of cancer among women globally and is one of the leading causes of cancer-related deaths among women. In the diagnosis of BC, histopathological assessment is the gold standard, and automated tumor detection technologies play a pivotal role in it. Utilizing Convolutional Neural Networks (CNNs) for automated analysis of image patches from Whole Slide Images (WSIs) enhances detection accuracy and alleviates the workload of pathologists. However, CNNs often face limitations in handling pathological patches due to a lack of sufficient contextual information and limited feature generation capabilities. To address this, we propose a novel Multi-scale Multi-head Self-attention Ensemble Network (MMSEN), which integrates a multi-scale feature generation module, a convolutional self-attention module, and an adaptive feature integration with an output module, effectively optimizing the performance of classical CNNs. The design of MMSEN optimizes the capture of key information and the comprehensive integration of features in WSI pathological patches, significantly enhancing the precision of tumor detection. Results from a five-fold cross-validation experiment on the PatchCamelyon (PCam) dataset show that MMSEN achieves a ROC-AUC of 99.01% ± 0.02%, an F1-score of 98.00% ± 0.08%, a Balanced Accuracy (B-Acc) of 98.00% ± 0.08%, and a Matthews Correlation Coefficient (MCC) of 96.00% ± 0.16% (p < 0.05). These results demonstrate the effectiveness and potential of MMSEN in detecting tumors from pathological patches in WSIs for BC.
Title: Tumor detection in breast cancer pathology patches using a Multi-scale Multi-head Self-attention Ensemble Network on Whole Slide Images. Authors: Ruigang Ge, Guoyue Chen, Kazuki Saruta, Yuki Terata. Journal: Machine Learning with Applications, vol. 18, Article 100592. Pub Date: 2024-10-18. DOI: 10.1016/j.mlwa.2024.100592.
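For readers less familiar with the reported metrics, the sketch below computes F1-score, Balanced Accuracy, and MCC from binary confusion-matrix counts using their standard definitions. The function name and the counts in the example are hypothetical and are not data from the paper.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute F1, balanced accuracy, and MCC from confusion-matrix counts."""
    if min(tp + fp, tp + fn, tn + fp, tn + fn) == 0:
        raise ValueError("each row and column of the confusion matrix must be non-empty")
    if tp == 0:
        raise ValueError("F1 is undefined when there are no true positives")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # sensitivity, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"f1": f1, "balanced_accuracy": balanced_accuracy, "mcc": mcc}


# Hypothetical counts: 98 of 100 tumor patches and 98 of 100 normal patches correct.
print(binary_metrics(tp=98, fp=2, tn=98, fn=2))
```

With these symmetric counts, F1 and Balanced Accuracy both come out at about 0.98 while MCC is about 0.96. MCC ranges from -1 to 1 rather than 0 to 1, so the same error rate yields a lower number.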
Pub Date: 2024-10-17 DOI: 10.1016/j.mlwa.2024.100591
Ibomoiye Domor Mienye, Yanxia Sun, Emmanuel Ileberi
Artificial Intelligence (AI) techniques are transforming various sectors and hold significant potential to advance sustainable development in Africa. However, their effective integration is constrained by region-specific challenges, limiting widespread deployment. This study reviews the current state of sustainable development in Africa, highlighting the role AI can play in driving progress across key sectors, including healthcare, agriculture, education, environmental protection, and infrastructure. The paper outlines the challenges hindering AI adoption and presents strategic approaches to address these obstacles, specifically targeting Africa’s socio-economic and environmental needs. In addition, the study proposes a comprehensive framework for integrating AI into Africa’s sustainable development efforts, offering tailored AI-driven strategies that align with the continent’s unique context. This framework provides a valuable resource for AI researchers, policymakers, and practitioners working towards sustainable development in Africa.
Title: Artificial intelligence and sustainable development in Africa: A comprehensive review. Journal: Machine Learning with Applications, vol. 18, Article 100591.
Pub Date: 2024-10-11 DOI: 10.1016/j.mlwa.2024.100590
Britt van Leeuwen, Maike Nutzel
Maritime security is of tremendous importance in countering drug trafficking, particularly along sea-based routes. In this paper, we address the pressing need for effective detection methods by introducing a novel approach that uses Automatic Identification System (AIS) data. Our focus is on detecting the ‘drop-off’ method, a prevalent technique for smuggling contraband at sea. Unlike existing research, which primarily employs unsupervised methods, we propose a supervised model tailored to this illicit activity, with particular emphasis on its application to fishing vessels.
Our model reduces the number of data points that an observer must classify by 70%, thereby making the drop-off detection process more efficient. By employing a Long Short-Term Memory (LSTM) model, our approach departs from traditional methods and is better able to capture the complex temporal patterns inherent in ‘drop-off’ activities. We chose LSTM for its ability to model sequential data effectively, which is essential for detecting drug trafficking at sea, where patterns are subtle and dynamic.
Moreover, the model can potentially be integrated into real-time surveillance systems, enhancing operational capabilities for detecting and preventing drug trafficking. Its generalizability gives it considerable potential to strengthen maritime security efforts and to assist in countering drug trafficking on a global scale. Importantly, our model outperforms both baseline models, underscoring its effectiveness in addressing the specific challenges of ‘drop-off’ detection. A link to the code repository is provided in the original article.
Title: Detecting drug transfers via the drop-off method: A supervised model approach using AIS data. Journal: Machine Learning with Applications, vol. 18, Article 100590.
Pub Date: 2024-10-04 DOI: 10.1016/j.mlwa.2024.100589
M. Prisbrey, D. Pereira, J. Greenhall, E. Davis, P. Vakhlamov, C. Chavez, C. Pantea
Monitoring pressure inside hermetically sealed vessels typically relies on devices that have direct contact with the fluid inside. Gaining this access requires a hole through the wall of the vessel, which creates potential for leaks, ruptures, and complete failures. To avoid this, noninvasive solutions use external sensors that relate vessel-wall behavior to internal pressure. However, existing noninvasive techniques require permanently attaching sensors to a unique vessel and then monitoring for changes in that vessel. We present a noninvasive pressure monitoring technique based on acoustic resonance spectroscopy (ARS) and machine learning (ML) that can estimate the pressure in a vessel similar to those it was trained on and does not require sensors to be permanently attached. We train k-nearest neighbor (KNN) regressor models on experimentally gathered acoustic resonance spectra to estimate the pressure in six stainless-steel vessels. We demonstrate accurate estimation of the pressure inside the vessels both when training and testing on spectra taken exclusively from an individual vessel and when performing cross-validation between vessels. The acoustic technique presented in this paper has broad industrial applications for monitoring pressure in systems where permanent sensors are undesirable, such as complicated pneumatic systems, vacuum-sealed foods, and more.
Title: Noninvasive pressure monitoring using acoustic resonance spectroscopy and machine learning. Journal: Machine Learning with Applications, vol. 18, Article 100589.
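The idea of KNN regression with cross-validation between vessels can be sketched in plain Python as below. This is an illustrative sketch only: the sample layout `(vessel_id, spectrum, pressure)`, the Euclidean distance, the unweighted mean, and the function names are assumptions, and the abstract does not specify the authors' features or settings.

```python
import math


def knn_predict(train_X, train_y, x, k=3):
    """Average the targets of the k training spectra closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def leave_one_vessel_out(samples, k=3):
    """Hold out each vessel in turn and report its mean absolute pressure error.

    samples: list of (vessel_id, spectrum, pressure) tuples (hypothetical layout).
    """
    errors = {}
    for held_out in sorted({vessel for vessel, _, _ in samples}):
        train = [(s, p) for vessel, s, p in samples if vessel != held_out]
        test = [(s, p) for vessel, s, p in samples if vessel == held_out]
        train_X = [s for s, _ in train]
        train_y = [p for _, p in train]
        abs_errors = [abs(knn_predict(train_X, train_y, s, k) - p) for s, p in test]
        errors[held_out] = sum(abs_errors) / len(abs_errors)
    return errors
```

Holding out a whole vessel, rather than random spectra, tests whether the model transfers to a vessel it has never seen, which is the scenario the abstract emphasizes.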
Recently, visual tracking algorithms have achieved impressive results by combining dynamic templates. However, the instability of visual images and the incorrect timing of template updates lead to decreased tracking accuracy and stability in intricate scenarios. To address these issues, we propose a visual tracking algorithm through visual language fusion and a state update evaluator (VLFSE). Specifically, our approach introduces a multimodal attention mechanism that uses self-attention to mine and integrate information from diverse sources effectively. This mechanism ensures a richer, context-aware representation of the target, enabling more accurate tracking even in complex scenes. Moreover, we recognize the critical need for precise template updates to maintain tracking accuracy over time. To this end, we develop a state update evaluator, a component trained online to assess the necessity and timing of template updates accurately. This evaluator acts as a safeguard, preventing erroneous updates and ensuring the tracker adapts optimally to changes in the target’s appearance. The experimental results on challenging visual language tracking datasets demonstrate our tracker’s superior performance, showcasing its adaptability and accuracy in complex tracking scenarios.
Title: VLFSE: Enhancing visual tracking through visual language fusion and state update evaluator. Authors: Fuchao Yang, Mingkai Jiang, Qiaohong Hao, Xiaolei Zhao, Qinghe Feng. Journal: Machine Learning with Applications, vol. 18, Article 100588. Pub Date: 2024-09-30. DOI: 10.1016/j.mlwa.2024.100588.
Predicting molecular properties is crucial in drug synthesis and screening, but traditional molecular dynamics methods are time-consuming and costly. Recently, deep learning methods, particularly Graph Neural Networks (GNNs), have significantly improved efficiency by capturing the invariance of molecular structures under translation, rotation, and permutation. However, current GNN methods require complex data processing, which increases algorithmic complexity. This complexity leads to longer computation time, higher computational resource demands, and greater memory consumption. This paper introduces InvarNet, a GNN-based model trained with a composite loss function that bypasses intricate data processing while maintaining molecular property invariance. By pre-storing atomic feature attributes, InvarNet avoids repeated feature extraction during forward propagation. Experiments on three public datasets (Electronic Materials, QM9, and MD17) demonstrate that InvarNet achieves superior prediction accuracy, excellent stability, and fast convergence. It reaches state-of-the-art performance on the Electronic Materials dataset and outperforms existing models on the R² and α properties of the QM9 dataset. On the MD17 dataset, InvarNet excels at predicting the energy of benzene without using atomic forces. Additionally, on the QM9 dataset InvarNet trains 2.24 times faster per epoch than SphereNet, simplifying data processing while maintaining acceptable accuracy.
Title: InvarNet: Molecular property prediction via rotation invariant graph neural networks. Authors: Danyan Chen, Gaoxiang Duan, Dengbao Miao, Xiaoying Zheng, Yongxin Zhu. Journal: Machine Learning with Applications, vol. 18, Article 100587. Pub Date: 2024-09-23. DOI: 10.1016/j.mlwa.2024.100587.
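The invariance the abstract refers to can be illustrated with a simple geometric descriptor. The sketch below is not InvarNet's method (which, per the abstract, relies on a composite loss). It only shows that a sorted list of interatomic distances is unchanged by translation, rotation, and atom reordering. The helper names and coordinates are hypothetical, and atom types are ignored for brevity.

```python
import math


def sorted_pairwise_distances(coords):
    """Sorted interatomic distances: unchanged by translation, rotation, and reordering."""
    n = len(coords)
    return sorted(
        math.dist(coords[i], coords[j]) for i in range(n) for j in range(i + 1, n)
    )


def rotate_z(coords, angle):
    """Rotate every point about the z axis by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def translate(coords, shift):
    """Shift every point by the same displacement vector."""
    return [tuple(a + b for a, b in zip(point, shift)) for point in coords]


# Hypothetical three-atom geometry, then the same molecule reordered, rotated, and moved.
molecule = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
moved = translate(rotate_z(list(reversed(molecule)), 0.7), (1.0, -2.0, 3.0))

original = sorted_pairwise_distances(molecule)
transformed = sorted_pairwise_distances(moved)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original, transformed)))
```

A model whose inputs are built from such quantities predicts the same property no matter how the molecule is posed, which is the behavior the invariance requirement asks for.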