Pub Date : 2026-06-01Epub Date: 2026-02-05DOI: 10.1016/j.mlwa.2026.100852
Khafiizh Hastuti, Erwin Yudi Hidayat, Abu Salam, Usman Sudibyo
Fine-grained recognition of cultural artifacts remains challenging because of the scarcity of annotated data, subtle intra-class differences, and heterogeneous imaging conditions. This study addresses these issues through a domain-specific deep learning pipeline, demonstrated on Indonesian keris classification across three tasks: pamor (27 classes), dhapur (42), and tangguh (5). The pipeline integrates background homogenization, orientation normalization, and YOLOv8-based blade cropping with mask-aware augmentation restricted to the blade regions. For classification, we propose KerisRDNet, which extends InceptionResNetV2 with Inception-Residual-Dilated (IRD) blocks and squeeze-and-excitation to model the elongated geometries and subtle forging motifs. Experiments show that baseline networks collapse under fine-grained settings, with macro-F1 near zero, whereas the proposed approach achieves 0.268 (pamor), 0.276 (dhapur), and 0.635 (tangguh) with Top-3 accuracy above 0.5 and AUC up to 0.853. Across three stratified resamplings, paired non-parametric tests (Wilcoxon signed-rank) indicated directionally consistent improvements; given the small number of repetitions (n = 3), these results are interpreted conservatively. These results demonstrate that keris recognition is practically viable as a decision-support tool for cultural heritage curation, while also offering a transferable workflow for low-data fine-grained recognition tasks.
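For readers unfamiliar with the headline metrics, macro-F1 and Top-k accuracy can be computed as below. This is an illustrative stdlib sketch over generic predictions, not the authors' evaluation code:

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores (macro-F1)."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

def top_k_accuracy(y_true, class_scores, k=3):
    """Fraction of samples whose true label is among the k highest-scored classes."""
    hits = 0
    for t, scores in zip(y_true, class_scores):
        top = sorted(scores, key=scores.get, reverse=True)[:k]
        hits += t in top
    return hits / len(y_true)
```

Macro-F1 weights every class equally, which is why rare-class failures drag baselines toward zero on the 27-class pamor and 42-class dhapur tasks.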
{"title":"KerisRDNet: Mask-aware augmentation and residual dilated networks for cultural heritage blade classification","authors":"Khafiizh Hastuti, Erwin Yudi Hidayat, Abu Salam, Usman Sudibyo","doi":"10.1016/j.mlwa.2026.100852","DOIUrl":"10.1016/j.mlwa.2026.100852","url":null,"abstract":"<div><div>Fine-grained recognition of cultural artifacts remains challenging because of the scarcity of annotated data, subtle intra-class differences, and heterogeneous imaging conditions. This study addresses these issues through a domain-specific deep learning pipeline, demonstrated on Indonesian keris classification across three tasks: <em>pamor</em> (27 classes), <em>dhapur</em> (42), and <em>tangguh</em> (5). The pipeline integrates background homogenization, orientation normalization, and YOLOv8-based blade cropping with mask-aware augmentation restricted to the blade regions. For classification, we propose KerisRDNet, which extends InceptionResNetV2 with Inception-Residual-Dilated (IRD) blocks and squeeze-and-excitation to model the elongated geometries and subtle forging motifs. Experiments show that baseline networks collapse under fine-grained settings, with macro-F1 near zero, whereas the proposed approach achieves 0.268 (<em>pamor</em>), 0.276 (<em>dhapur</em>), and 0.635 (<em>tangguh</em>) with Top-3 accuracy above 0.5 and AUC up to 0.853. Across three stratified resamplings, paired non-parametric tests (Wilcoxon signed-rank) indicated directionally consistent improvements; given the small number of repetitions (<span><math><mrow><mi>n</mi><mo>=</mo><mn>3</mn></mrow></math></span>), these results are interpreted conservatively. 
These results demonstrate the feasibility of practically viable keris recognition as a decision-support tool for cultural heritage curation, while also offering a transferable workflow for low-data fine-grained recognition tasks.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"24 ","pages":"Article 100852"},"PeriodicalIF":4.9,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146161644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The rise of online rental platforms has led to an overwhelming amount of user-generated content, making it difficult for prospective consumers to discern which reviews are helpful. Existing approaches often rely on raw helpfulness votes, which are sparse, subjective, and temporally inconsistent. In addition, labeled datasets for rental-review helpfulness prediction are scarce. This paper introduces a novel dataset of apartment reviews collected from an online website and proposes an intelligent machine learning framework to predict the helpfulness of rental reviews. To address the challenge of obtaining reliable labels from sparse and subjective user votes, a scoring-based labeling strategy is developed that uses helpful vote count and timeliness. A diverse set of features, including TF–IDF vectors, sentiment polarity, rating deviation, and review length, is used to capture both textual and behavioral aspects of the reviews. Multiple classifiers, including Logistic Regression, Naive Bayes, and XGBoost, are systematically evaluated under 5-fold cross-validation, along with rule-based and deep learning models.
Experimental results show that XGBoost consistently achieves the best overall performance, with an accuracy of 0.71 and ROC-AUC of 0.75 when leveraging all features. This research makes three key contributions: (i) the first large-scale dataset for rental reviews, (ii) an automatic annotation technique that clusters reviews using a score derived from user votes and time since posting, and (iii) a comprehensive evaluation pipeline spanning rule-based, traditional, and deep learning classifiers. Together, these advances establish a foundation for intelligent rental review helpfulness estimation, with broader implications for e-commerce, hospitality, and user-generated content analysis.
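The scoring-based labeling strategy can be pictured as follows; the exponential recency decay, the half_life_days value, and the threshold are hypothetical choices for illustration, not the paper's exact formula:

```python
import math

def helpfulness_score(votes, days_since_posted, half_life_days=180.0):
    """Combine helpful-vote count with timeliness via an assumed exponential decay."""
    recency = math.exp(-math.log(2) * days_since_posted / half_life_days)
    return votes * recency

def label_reviews(reviews, threshold=1.0):
    """reviews: iterable of (votes, days_since_posted); True = labeled helpful."""
    return [helpfulness_score(v, d) >= threshold for v, d in reviews]
```

Reviews with many recent helpful votes score high; old or unvoted reviews fall below the threshold and are labeled unhelpful.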
{"title":"Towards an intelligent review helpfulness estimation: A novel dataset and machine learning framework","authors":"Rakibul Hassan, Shubhashish Kar, Jorge Fonseca Cacho, Shaikh Arifuzzaman","doi":"10.1016/j.mlwa.2026.100849","DOIUrl":"10.1016/j.mlwa.2026.100849","url":null,"abstract":"<div><div>The rise of online rental platforms has led to an overwhelming amount of user-generated content, making it difficult for prospective consumers to discern which reviews are helpful. Existing approaches often rely on raw helpfulness votes, which are sparse, subjective, and temporally inconsistent. Also, there is lack of labeled dataset in the field of rental review usefulness prediction. This paper introduces a novel dataset of apartment reviews collected from online website and proposes an intelligent machine learning framework to predict the helpfulness of rental reviews. To address the challenge of obtaining reliable labels from sparse and subjective user votes, a scoring-based labeling strategy is developed that uses helpful vote count and timeliness. A diverse set of features including TF–IDF vectors, sentiment polarity, rating deviation, and review length are used to capture both textual and behavioral aspects of the reviews. Multiple classifiers, including Logistic Regression, Naive Bayes, and XGBoost, are systematically evaluated under 5-fold cross-validation, along with a rule-based and deep learning models.</div><div>Experimental results show that XGBoost consistently achieves the best overall performance with an accuracy of 0.71 and ROC-AUC of 0.75 when leveraging all features. This research makes three key contributions: (i) the first large-scale dataset for rental review, (ii) auto annotation technique that uses clustering approach with score from user votes and time since posted, and (iii) comprehensive evaluation pipeline spanning rule-based, traditional, and deep learning classifiers. 
Together, these advances establish a foundation for intelligent rental review helpfulness estimation, with broader implications for e-commerce, hospitality, and user-generated content analysis.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100849"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146077163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-03-01Epub Date: 2026-01-19DOI: 10.1016/j.mlwa.2026.100843
Mekhla Sarkar , Yen-Chu Huang , Tsong-Hai Lee , Jiann-Der Lee , Prasan Kumar Sahoo
Intracranial arterial stenosis (ICAS) is a leading cause of cerebrovascular accidents, and accurate morphological assessment of intracranial arteries is critical for diagnosis and treatment planning. Complex vascular structures, imaging noise, and variability in time-of-flight magnetic resonance angiography (TOF-MRA) images make manual delineation challenging, motivating the use of deep learning (DL) for automatic segmentation of the intracranial arteries. DL-based automatic segmentation offers a promising solution by providing consistent and noise-reduced vessel delineation. However, selecting an optimal segmentation architecture remains challenging due to the diversity of network designs and encoder backbones. Therefore, this study presents a systematic benchmarking of five widely used DL segmentation architectures (UNet, LinkNet, Feature Pyramid Networks (FPN), Pyramid Scene Parsing Network (PSPNet), and DeepLabV3+), each combined with nine backbone networks, yielding 45 model variants, including previously unexplored configurations for intracranial artery segmentation in TOF-MRA. Models were trained and cross-validated on four datasets (in-house, CereVessMRA, IXI, and ADAM) and evaluated on a held-out independent test set. Performance metrics included Intersection over Union (IoU), Dice Similarity Coefficient (DSC), and a Stability Score that combines the coefficients of variation of IoU and DSC to quantify segmentation consistency and reproducibility. Experimental results demonstrated that the highest DSC scores were achieved with UNet–SE-ResNeXt50, LinkNet–SE-ResNeXt50, FPN–DenseNet169, and FPN–SENet154. The most stable configurations were LinkNet–EfficientNetB6, LinkNet–SENet154, UNet–DenseNet169, and UNet–EfficientNetB6. Conversely, DeepLabV3+ and PSPNet variants consistently underperformed.
These findings provide actionable guidance for selecting backbone–segmentation pairs and highlight trade-offs between accuracy, robustness, and reproducibility for complex intracranial artery TOF-MRA segmentation tasks.
{"title":"Analysis of major segmentation models for intracranial artery time-of-flight magnetic resonance angiography images","authors":"Mekhla Sarkar , Yen-Chu Huang , Tsong-Hai Lee , Jiann-Der Lee , Prasan Kumar Sahoo","doi":"10.1016/j.mlwa.2026.100843","DOIUrl":"10.1016/j.mlwa.2026.100843","url":null,"abstract":"<div><div>Intracranial arterial stenosis (ICAS) is a leading cause of cerebrovascular accidents, and accurate morphological assessment of intracranial arteries is critical for diagnosis and treatment planning. Complex vascular structures, imaging noise, and variability in time-of-flight magnetic resonance angiography (TOF-MRA) images are challenging issues for the manual delineation that motivates the use of deep learning (DL) for automatic segmentation of the intracranial arteries. DL based automatic segmentation offers a promising solution by providing consistent and noise-reduced vessel delineation. However, selecting an optimal segmentation architecture remains challenging due to the diversity of network designs and encoder backbones. Therefore, this study presents a systematic benchmarking of five widely used DL segmentation architectures, UNet, LinkNet, Feature Pyramid Networks (FPN), Pyramid Scene Parsing Network (PSPNet), and DeepLabV3+, each combined with nine backbone networks, yielding 45 model variants, including previously unexplored configurations for intracranial artery segmentation in TOF-MRA. Models were trained and cross-validated on four datasets: in-house, CereVessMRA, IXI and ADAM, and evaluated on held-out independent test set. Performance metrics included Intersection over Union (IoU), Dice Similarity Coefficient (DSC), and a Stability Score, combining the coefficient of variation of IoU and DSC to quantify segmentation consistency and reproducibility. Experimental results demonstrated highest DSC score was achieved with UNet–SE-ResNeXt50, LinkNet-SE-ResNeXt50, FPN-DenseNet169, FPN-SENet154. 
The most stable configurations were LinkNet–EfficientNetB6, LinkNet–SENet154, UNet–DenseNet169, and UNet–EfficientNetB6. Conversely, DeepLabV3+ and PSPNet variants consistently underperformed. These findings provide actionable guidance for selecting backbone–segmentation pairs and highlight trade-offs between accuracy, robustness, and reproducibility for complex intracranial artery TOF-MRA segmentation tasks.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100843"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146077164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-03-01Epub Date: 2025-12-13DOI: 10.1016/j.mlwa.2025.100812
Mohammadhossein Homaei , Mehran Tarif , Pablo García Rodríguez , Mar Ávila , Andrés Caro
Machine learning (ML) models are often used to predict demand in digital twins (DTs) of water distribution systems (WDS). However, most models do not provide uncertainty estimates, which limits risk evaluation. In this work, we introduce the first systematic framework for hierarchical uncertainty transfer in regional water networks; until now, no such method existed for DTs of regional water systems. We propose Adaptive Multi-Village Conformal Prediction (AMV-CP), a method that retains theoretical guarantees while allowing transfer of uncertainty information between villages that are similar in structure but different in operation. The main ideas are: (i) village-adaptive conformity scores that capture local patterns, (ii) a meta-learning algorithm that reduces calibration cost by 88.6%, and (iii) regime-aware calibration that maintains 94.2% coverage across seasonal changes. We use eight years of data from six villages with 6174 users in one regional network. The results show a theoretical basis for cross-village transfer and 95.1% empirical coverage (against a 95% target), with real-time throughput of 120 predictions per second. Early multi-step tests also show 93.7% coverage for 24-hour horizons, with controlled trade-offs. This framework is the first systematic method for controlled uncertainty transfer in infrastructure DTs, with theoretical guarantees under ϕ-mixing and practical deployment. Our multi-village tests demonstrate the value of meta-learning for uncertainty estimation and establish a base method that can be applied to other hierarchical infrastructure systems. The system is validated in a Mediterranean rural network, but generalization to other climates, urban settings, and cascading systems needs further empirical study.
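The coverage guarantee in conformal prediction comes from a calibration-set quantile. A minimal split-conformal sketch (the generic building block, not AMV-CP's village-adaptive scores) looks like this:

```python
import math

def conformal_radius(calib_errors, alpha=0.05):
    """Finite-sample-corrected (1 - alpha) quantile of calibration errors.
    A point forecast yhat then gets the interval [yhat - r, yhat + r]."""
    n = len(calib_errors)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(calib_errors)[min(k, n) - 1]
```

Under exchangeability this interval covers the truth with probability at least 1 − alpha; AMV-CP's contribution is making such guarantees transferable across villages under ϕ-mixing rather than i.i.d. assumptions.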
{"title":"Adaptive multi-domain uncertainty quantification for digital twin water forecasting","authors":"Mohammadhossein Homaei , Mehran Tarif , Pablo García Rodríguez , Mar Ávila , Andrés Caro","doi":"10.1016/j.mlwa.2025.100812","DOIUrl":"10.1016/j.mlwa.2025.100812","url":null,"abstract":"<div><div>Machine learning (ML) models are often used to predict demand in digital twins (DTs) of water distribution systems (WDS). However, most models do not provide uncertainty estimation, and this makes risk evaluation limited. In this work, we introduce the first systematic framework for hierarchical uncertainty transfer in regional water networks, because until now no method existed for DT of regional water systems. We propose Adaptive Multi-Village Conformal Prediction (AMV-CP), a method that keeps theoretical guarantees and also allows transfer of uncertainty information between villages that are similar in structure but different in operation. The main ideas are: (i) village-adaptive conformity scores that capture local patterns, (ii) a meta-learning algorithm that reduces calibration cost by 88.6%, and (iii) regime-aware calibration that keeps 94.2% coverage when seasons change. We use eight years of data from six villages with 6174 users in one regional network. The results show a theoretical basis for cross-village transfer and 95.1% empirical coverage (target was 95%), with real-time speed of 120 predictions per second. Early multi-step tests also show 93.7% coverage for 24-hour horizons, with controlled trade-offs. This framework is the first systematic method for controlled uncertainty transfer in infrastructure DTs, with theoretical guarantees under <span><math><mi>ϕ</mi></math></span>-mixing and practical deployment. Our multi-village tests demonstrate the value of meta-learning for uncertainty estimation and make a base method that can be used in other hierarchical infrastructure systems. 
The system is validated in a Mediterranean rural network, but generalization to other climates, urban settings, and cascading systems needs further empirical study.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100812"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-03-01Epub Date: 2025-12-16DOI: 10.1016/j.mlwa.2025.100823
Nouf AlShenaifi, Nourah Alangari
Despite the growing importance of multimodal signals on social media, Arabic stance detection has remained largely text-only, overlooking the visual context that often accompanies user posts. To bridge this gap, we present MAWQIF-MM, the first publicly available Arabic multimodal stance detection corpus of tweet–image pairs annotated with three stance labels: Favor, Against, and Neutral. Building on this resource, we propose a novel attention-based cross-modal fusion model that jointly encodes text and images. Textual content is processed using AraBERT v2, a transformer-based language model optimized for Arabic, while visual features are extracted using BLIP with a ViT-B backbone, a state-of-the-art vision-language model. These two modalities are integrated via multi-head cross-attention to capture cross-modal interactions. Experimental results demonstrate the effectiveness of our approach: on a held-out test set, the model achieves 88% accuracy, outperforming a text-only AraBERT baseline by 12 percentage points and an image-only BLIP baseline by 4 points. To further probe large vision–language models (VLMs) in low-resource settings, we benchmark Gemini 2.5 Flash and GPT-4o under zero-shot and few-shot prompting. While these models show promising generalization, they struggle with nuanced stances without fine-tuning, underscoring the value of domain-specific supervised training.
{"title":"Beyond text: Multimodal stance detection in Arabic tweets","authors":"Nouf AlShenaifi, Nourah Alangari","doi":"10.1016/j.mlwa.2025.100823","DOIUrl":"10.1016/j.mlwa.2025.100823","url":null,"abstract":"<div><div>Despite the growing importance of multimodal signals on social media, Arabic stance detection has remained largely text-only, overlooking the visual context that often accompanies user posts. To bridge this gap, we present MAWQIF-MM, the first publicly available Arabic multimodal stance detection corpus of tweet–image pairs annotated with three stance labels: Favor, Against, and Neutral. Building on this resource, we propose a novel attention-based cross-modal fusion model that jointly encodes text and images. Textual content is processed using AraBERT v2, a transformer-based language model optimized for Arabic, while visual features are extracted using BLIP with a ViT-B backbone, a state-of-the-art vision-language model. These two modalities are integrated via multi-head cross-attention to capture cross-modal interactions. Experimental results demonstrate the effectiveness of our approach: on a held-out test set, the model achieves 88% accuracy, outperforming a text-only AraBERT baseline by 12 percentage points and an image-only BLIP baseline by 4 points. To further probe large vision–language models (VLMs) in low-resource settings, we benchmark Gemini 2.5 Flash and GPT-4o under zero-shot and few-shot prompting. 
While these models show promising generalization, they struggle with nuanced stances without fine-tuning, underscoring the value of domain-specific supervised training.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100823"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-03-01Epub Date: 2026-01-18DOI: 10.1016/j.mlwa.2026.100845
Obaida AlHousrya, Aseel Bennagi, Petru A. Cotfas, Daniel T. Cotfas
Driver fatigue remains a critical factor in road accidents, particularly in long-duration or cognitively demanding driving scenarios. This study presents a comprehensive, low-cost, real-time system for monitoring driver health and electric vehicle status through physiological signal analysis. By integrating heart rate, eye movement, and breathing rate sensors, both simulated and real, this hybrid framework detects signs of fatigue using machine learning classifiers trained on publicly available datasets including OpenDriver, DriveDB, MAUS, YawDD, TinyML, and the Driver Respiration Dataset. The system architecture combines Arduino-based hardware, cloud integration via Microsoft Azure, and advanced classification and anomaly detection algorithms such as Random Forest and Isolation Forest. Evaluation across diverse datasets revealed robust fatigue detection capabilities, with OpenDriver achieving 97.6% cross-validation accuracy and 95.8% F1-score, while image- and respiration-based models complemented the electrocardiogram-based analysis. These results demonstrate the feasibility of affordable, multimodal health monitoring in EVs, offering a scalable and deployable solution for enhancing road safety.
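As a simpler stand-in for the anomaly-detection stage (the paper uses Isolation Forest), a per-signal z-score flagger conveys the idea in a few lines:

```python
def zscore_anomalies(values, threshold=3.0):
    """Flag readings more than `threshold` standard deviations from the mean."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    if std == 0:
        return [False] * len(values)
    return [abs(v - mean) / std > threshold for v in values]
```

A heart-rate spike far outside the driver's recent baseline gets flagged; Isolation Forest generalizes this to multivariate feature spaces without assuming a Gaussian baseline.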
{"title":"A hybrid machine learning and IoT system for driver fatigue monitoring in connected electric vehicles","authors":"Obaida AlHousrya, Aseel Bennagi, Petru A. Cotfas, Daniel T. Cotfas","doi":"10.1016/j.mlwa.2026.100845","DOIUrl":"10.1016/j.mlwa.2026.100845","url":null,"abstract":"<div><div>Driver fatigue remains a critical factor in road accidents, particularly in long duration or cognitively demanding driving scenarios. This study presents a comprehensive, low cost, and real time system for monitoring driver health and electric vehicle status through physiological signal analysis. By integrating heart rate, eye movement, and breathing rate sensors, both simulated and real, this hybrid framework detects signs of fatigue using machine learning classifiers trained on publicly available datasets including OpenDriver, DriveDB, MAUS, YawDD, TinyML, and the Driver Respiration Dataset. The system architecture combines Arduino based hardware, cloud integration via Microsoft Azure, and advanced classification and anomaly detection algorithms such as Random Forest and Isolation Forest. Evaluation across diverse datasets revealed robust fatigue detection capabilities, with OpenDriver achieving 97.6% cross validation accuracy and 95.8% F1-score, while image and respiration-based models complemented the electrocardiogram-based analysis. 
These results demonstrate the feasibility of affordable, multimodal health monitoring in EVs, offering a scalable and deployable solution for enhancing road safety.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100845"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146037558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seismic ground-motion simulations provide high-fidelity predictions but are computationally prohibitive for large-scale scenario analyses. Surrogate models based on Multi-Layer Perceptrons (MLPs) or Fourier Neural Operators (FNOs) have been studied, yet each has limitations: MLPs fail to capture spatial correlations, while FNOs incur high costs from repeated Fourier transforms on full-resolution grids. To overcome these issues, we propose a surrogate model based on the MLP-Mixer architecture that operates on a patch grid, enabling efficient extraction of global spatial correlations. In addition, we introduce a multi-stream design with source and geology inputs fused through a learnable element-wise multi-modal mixer, allowing period-dependent, data-driven fusion of modalities. Experiments on Nankai Trough simulations demonstrate that the proposed method, referred to as Multi-MLP-Mixer, achieves accuracy comparable to state-of-the-art surrogate models while reducing training and inference time, thereby balancing predictive performance with computational efficiency.
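The token-mixing step that lets an MLP-Mixer capture global spatial correlations across patches can be sketched in NumPy; the ReLU stands in for the GELU of the original Mixer, and the complementary channel-mixing MLP is omitted:

```python
import numpy as np

def token_mix(x, w1, w2):
    """Token-mixing MLP: acts across the patch axis, shared over channels.
    x: (patches, channels); w1: (patches, hidden); w2: (hidden, patches)."""
    h = np.maximum(0.0, w1.T @ x)   # (hidden, channels), ReLU nonlinearity
    return x + (w2.T @ h)           # residual connection, (patches, channels)
```

Because the MLP spans the entire patch axis at once, every patch can influence every other in a single layer, without the per-layer FFTs that make FNOs costly on full-resolution grids.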
{"title":"Multi-MLP-Mixer based surrogate model for seismic ground-motion with spatial source and geological parameters","authors":"Hirotaka Hachiya , Yuto Kuroki , Asako Iwaki , Takahiro Maeda , Naonori Ueda , Hiroyuki Fujiwara","doi":"10.1016/j.mlwa.2026.100855","DOIUrl":"10.1016/j.mlwa.2026.100855","url":null,"abstract":"<div><div>Seismic ground-motion simulations provide high-fidelity predictions but are computationally prohibitive for large-scale scenario analyses. Surrogate models based on Multi-Layer Perceptrons (MLPs) or Fourier Neural Operators (FNOs) have been studied, yet each has limitations: MLPs fail to capture spatial correlations, while FNOs incur high costs from repeated Fourier transforms on full-resolution grids. To overcome these issues, we propose a surrogate model based on the MLP-Mixer architecture that operates on a patch grid, enabling efficient extraction of global spatial correlations. In addition, we introduce a multi-stream design with source and geology inputs fused through a learnable element-wise multi-modal mixer, allowing period-dependent, data-driven fusion of modalities. 
Experiments on Nankai Trough simulations demonstrate that the proposed method, referred to as Multi-MLP-Mixer, achieves accuracy comparable to state-of-the-art surrogate models while reducing training and inference time, thereby balancing predictive performance with computational efficiency.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100855"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146187667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-03-01Epub Date: 2025-11-29DOI: 10.1016/j.mlwa.2025.100798
Temitope Olubanjo Kehinde , Azeez A. Oyedele , Morenikeji Kabirat Kareem , Joseph Akpan , Oludolapo A. Olanrewaju
This study presents an integrated Data Envelopment Analysis (DEA) and ensemble learning framework optimized with the Golden Jackal Optimization (GJO) algorithm to evaluate and predict the efficiency of United States information technology firms. Both Constant Returns to Scale and Variable Returns to Scale models were applied to measure firm efficiency and compute scale efficiency, providing a clearer distinction between managerial and scale-related effects. Using data from 3940 firms over the period 2013 to 2023, a robustness test introducing ±20% random noise to a 10% random sample confirmed that the CCR model achieved stronger stability, with a correlation coefficient of 0.795 compared to 0.773 for the BCC model. Consequently, the CCR results were adopted as the basis for predictive modeling. DEA efficiency scores were predicted using six ensemble learners, including XGBoost, Gradient Boosting Regressor, AdaBoost, Extra Trees Regressor, Random Forest, and LightGBM, with GJO employed for hyperparameter tuning. The Gradient Boosting Regressor optimized with GJO achieved the best predictive performance, accurately reproducing the observed efficiency scores. SHAP and feature importance analyses revealed that Total Equity, Operating Income, and Total Assets were the most influential determinants of efficiency. This research contributes a scalable and interpretable approach to efficiency prediction, offering actionable insights for managers, investors, and policymakers in volatile financial markets.
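In the degenerate single-input, single-output case, CRS (CCR) efficiency reduces to each firm's output/input ratio normalized by the sample's best ratio; the general multi-input, multi-output CCR model instead solves a linear program per firm:

```python
def crs_efficiency(inputs, outputs):
    """CRS efficiency for one input and one output per firm: each firm's
    output/input ratio relative to the best ratio observed in the sample."""
    ratios = [o / i for i, o in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]
```

A firm scoring 1.0 lies on the constant-returns frontier; a score of 0.5 means it produces half the output per unit input of the best performer.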
{"title":"Explainable DEA–ensemble approach with golden jackal optimization: efficiency evaluation and prediction for United States information technology firms","authors":"Temitope Olubanjo Kehinde , Azeez A. Oyedele , Morenikeji Kabirat Kareem , Joseph Akpan , Oludolapo A. Olanrewaju","doi":"10.1016/j.mlwa.2025.100798","DOIUrl":"10.1016/j.mlwa.2025.100798","url":null,"abstract":"<div><div>This study presents an integrated Data Envelopment Analysis (DEA) and ensemble learning framework optimized with the Golden Jackal Optimization (GJO) algorithm to evaluate and predict the efficiency of United States information technology firms. Both Constant Returns to Scale and Variable Returns to Scale models were applied to measure firm efficiency and compute scale efficiency, providing a clearer distinction between managerial and scale-related effects. Using data from 3940 firms over the period 2013 to 2023, a robustness test introducing ±20% random noise to a 10% random sample confirmed that the CCR model achieved stronger stability, with a correlation coefficient of 0.795 compared to 0.773 for the BCC model. Consequently, the CCR results were adopted as the basis for predictive modeling. DEA efficiency scores were predicted using six ensemble learners, including XGBoost, Gradient Boosting Regressor, AdaBoost, Extra Trees Regressor, Random Forest, and LightGBM, with GJO employed for hyperparameter tuning. The Gradient Boosting Regressor optimized with GJO achieved the best predictive performance, accurately reproducing the observed efficiency scores. SHAP and feature importance analyses revealed that Total Equity, Operating Income, and Total Assets were the most influential determinants of efficiency. 
This research contributes a scalable and interpretable approach to efficiency prediction, offering actionable insights for managers, investors, and policymakers in volatile financial markets.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100798"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145694191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-03-01Epub Date: 2025-11-26DOI: 10.1016/j.mlwa.2025.100803
Xin Yu Huang , Venkat Margapuri
Stress is a widespread psychological concern that often manifests alongside conditions such as anxiety and depression. Traditional self-report tools like the Perceived Stress Scale (PSS-10) may not fully capture an individual’s stress experience. This study explores whether integrating multimodal biometric data through video, audio, and transcriptions can enhance stress detection by providing a more comprehensive and interpretive point of view. Participants completed the PSS-10 while being recorded, and emotional features were extracted using machine learning models across the three biometric modalities. Results revealed weak correlations among the modalities, indicating that each captures distinct aspects of stress. Notably, the combined biometric score demonstrated greater sensitivity than the PSS-10 alone, suggesting that multimodal models may detect stress-related states that self-reports overlook. These findings support the development of more comprehensive stress assessment tools, although they are not intended to replace professional clinical evaluation.
{"title":"Exploring multimodal, non-invasive stress assessment through audio-visual and textual cues integrated with psychometric survey data","authors":"Xin Yu Huang , Venkat Margapuri","doi":"10.1016/j.mlwa.2025.100803","DOIUrl":"10.1016/j.mlwa.2025.100803","url":null,"abstract":"<div><div>Stress is a widespread psychological concern that often manifests alongside conditions such as anxiety and depression. Traditional self-report tools like the Perceived Stress Scale (PSS-10) may not fully capture an individual’s stress experience. This study explores whether integrating multimodal biometric data through video, audio, and transcriptions can enhance stress detection by providing a more comprehensive and interpretive point of view. Participants completed the PSS-10 while being recorded, and emotional features were extracted using machine learning models across the three biometric modalities. Results revealed weak correlations among the modalities, indicating that each captures distinct aspects of stress. Notably, the combined biometric score demonstrated greater sensitivity than the PSS-10 alone, suggesting that multimodal models may detect stress-related states that self-reports overlook. These findings support the development of more comprehensive stress assessment tools, although they are not intended to replace professional clinical evaluation.</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100803"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145624849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-03-01Epub Date: 2025-12-08DOI: 10.1016/j.mlwa.2025.100814
Ethar Alzaid , George Wright , Mark Eastwood , Piotr Keller , Fayyaz Minhas
Survival prediction from medical data is often constrained by scarce labels, limiting the effectiveness of fully supervised models. In addition, most existing approaches produce deterministic risk scores without conveying reliability, which hinders interpretability and clinical trustworthiness. To address these challenges, we introduce T-SURE, a transductive survival ranking and risk-stratification framework that learns jointly from labeled and unlabeled patients to reduce dependence on large annotated cohorts. It also estimates a rejection score that identifies high-uncertainty cases, enabling selective abstention when confidence is low. T-SURE generates a single risk score that enables (1) patient ranking based on survival risk, (2) automatic assignment to risk groups, and (3) optional rejection of uncertain predictions. We extensively evaluated the model on pan-cancer datasets from The Cancer Genome Atlas (TCGA), using gene expression profiles, whole slide images, pathology reports, and clinical information. The model outperformed existing approaches in both ranking and risk stratification, especially in the limited labeled-data regime. It also showed consistent improvements in performance as uncertain samples were rejected, while maintaining statistically significant stratification across datasets. T-SURE integrates as a reliable component within computational pathology pipelines by guiding risk-specific therapeutic and monitoring decisions and flagging ambiguous or rare cases via a high rejection score for further investigation. To support reproducibility, the full implementation of T-SURE is publicly available at: (Anonymized).
{"title":"Automatic discovery of robust risk groups from limited survival data across biomedical modalities","authors":"Ethar Alzaid , George Wright , Mark Eastwood , Piotr Keller , Fayyaz Minhas","doi":"10.1016/j.mlwa.2025.100814","DOIUrl":"10.1016/j.mlwa.2025.100814","url":null,"abstract":"<div><div>Survival prediction from medical data is often constrained by scarce labels, limiting the effectiveness of fully supervised models. In addition, most existing approaches produce deterministic risk scores without conveying reliability, which hinders interpretability and clinical trustworthiness. To address these challenges, we introduce T-SURE, a transductive survival ranking and risk-stratification framework that learns jointly from labeled and unlabeled patients to reduce dependence on large annotated cohorts. It also estimates a rejection score that identifies high-uncertainty cases, enabling selective abstention when confidence is low. T-SURE generates a single risk score that enables (1) patient ranking based on survival risk, (2) automatic assignment to risk groups, and (3) optional rejection of uncertain predictions. We extensively evaluated the model on pan-cancer datasets from The Cancer Genome Atlas (TCGA), using gene expression profiles, whole slide images, pathology reports, and clinical information. The model outperformed existing approaches in both ranking and risk stratification, especially in the limited labeled-data regime. It also showed consistent improvements in performance as uncertain samples were rejected, while maintaining statistically significant stratification across datasets. T-SURE integrates as a reliable component within computational pathology pipelines by guiding risk-specific therapeutic and monitoring decisions and flagging ambiguous or rare cases via a high rejection score for further investigation. 
To support reproducibility, the full implementation of T-SURE is publicly available at: (Anonymized).</div></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"23 ","pages":"Article 100814"},"PeriodicalIF":4.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145748776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}