Heterogeneous Social Networks (HSNs) represent complex structures where diverse entities, such as users, items, and interactions, coexist and interact within a unified framework. This paper offers a systematic review of HSN analysis, addressing the theoretical and practical challenges of investigating the interplay between varied node types and diverse relationships within HSNs. The paper begins by defining HSNs and outlining their characteristics, highlighting the existence of diverse entity types and a range of relationship types. It explores the significance of HSNs in modeling real‐world systems, including online social platforms, biological networks, e‐commerce networks, and recommendation systems, where diverse entities play distinct roles. The analysis of HSNs extends beyond traditional homogeneous networks, incorporating various types of nodes and edges, and introduces novel considerations for effective analysis. This work also covers the difficulties in modeling, representing, and analyzing HSNs. Several reviews of social network analysis have been published in the past, but they often focus on simple networks, not HSN analysis specifically. This paper aims to fill that gap by comprehensively reviewing different aspects of HSNs and their analysis. We start with the fundamentals of HSNs, then explore their major types, multi‐relational networks and multi‐modal networks, and their impact on popular data mining tasks. Then, we explore various applications of heterogeneous information network analysis, such as recommender systems, text mining, fraud detection, and e‐commerce. Finally, we look at recent research and suggest promising future directions in the field of HSN analysis.
{"title":"An Overview of Heterogeneous Social Network Analysis","authors":"Deepti Singh, Ankita Verma","doi":"10.1002/widm.70028","DOIUrl":"https://doi.org/10.1002/widm.70028","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"44 1"},"publicationDate":"2025-06-13","publicationTypes":"Journal Article"}
Automating vehicle damage detection is essential for automotive industry applications like insurance claims, online sales, and repair cost estimates, addressing the labor-intensive, time-consuming, and error-prone nature of current manual inspections. This systematic literature review explores the use of artificial intelligence (AI), particularly deep learning-based algorithms, to improve the accuracy and efficiency of damage detection under dynamic and challenging conditions specific to the requirements of our industry partners. The review is structured around five key research questions and includes extensive empirical evaluations to identify gaps and challenges in existing methods. Findings reveal significant potential for AI to automate and enhance the damage detection process but also highlight areas requiring further research and development. The review discusses these gaps in detail, providing a comprehensive foundation for future work in this field. Furthermore, the review findings are intended to guide both our research and the broader research community in advancing the practical application of AI for vehicle damage assessment. The insights gained from this review are crucial for developing robust AI solutions that can operate effectively in real-world scenarios, ultimately improving operational efficiency and customer experience in the automotive industry.
{"title":"Vehicle Damage Detection Using Artificial Intelligence: A Systematic Literature Review","authors":"Md Jahid Hasan, Cong Kha Nguyen, Yee Ling Boo, Hamed Jahani, Kok-Leong Ong","doi":"10.1002/widm.70027","DOIUrl":"https://doi.org/10.1002/widm.70027","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"70 1"},"publicationDate":"2025-06-07","publicationTypes":"Journal Article"}
This review paper presents a comprehensive analysis of memetic algorithms (MAs) for feature selection (FS), particularly in high‐dimensional datasets. MAs address the challenges of feature selection by combining the global exploration capabilities of evolutionary algorithms with the local optimization of local search techniques. Their hybrid nature makes them well suited for tackling the complexity, scalability, and computational demands of FS problems across various domains, including bioinformatics, image processing, and financial forecasting. This review highlights the recent advancements, customized variants, and practical applications of MA‐based FS methods while providing critical insights into their limitations, such as computational overhead and overfitting. Additionally, the paper outlines future research directions to further enhance the efficacy of MAs in feature selection, offering a balanced perspective on their contributions to the field.
{"title":"Advances in Feature Selection Using Memetic Algorithms: A Comprehensive Review","authors":"Keerthi Gabbi Reddy, Deepasikha Mishra","doi":"10.1002/widm.70026","DOIUrl":"https://doi.org/10.1002/widm.70026","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"36 1"},"publicationDate":"2025-06-03","publicationTypes":"Journal Article"}
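The memetic-algorithm abstract above describes combining the global exploration of an evolutionary algorithm with local search. A minimal sketch of that general idea follows; it is not taken from the reviewed paper. The function names, the truncation selection, the one-point crossover, and the toy fitness function (which rewards three hypothetical "informative" features and penalizes subset size) are all assumptions made for illustration. In practice the fitness would more plausibly be something like cross-validated model accuracy.

```python
import random

# Hypothetical ground truth for the toy problem: only these features are useful.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    """Reward selected informative features, penalize every selected feature."""
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


def local_search(mask, fitness):
    """Bit-flip hill climbing: keep any single flip that improves fitness."""
    best, best_fit = mask[:], fitness(mask)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = best[:]
            cand[i] = 1 - cand[i]
            cand_fit = fitness(cand)
            if cand_fit > best_fit:
                best, best_fit = cand, cand_fit
                improved = True
    return best


def memetic_feature_selection(fitness, n_features, pop_size=20,
                              generations=30, seed=0):
    """Genetic algorithm (global search) whose offspring are refined
    by hill climbing (local search) before rejoining the population."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)    # one-point crossover
            child = a[:cut] + b[cut:]
            j = rng.randrange(n_features)         # single-bit mutation
            child[j] = 1 - child[j]
            children.append(local_search(child, fitness))
        pop = parents + children
    return max(pop, key=fitness)


best = memetic_feature_selection(toy_fitness, n_features=8)
print(best)  # [1, 0, 0, 1, 0, 1, 0, 0]
```

Because the toy fitness is separable, hill climbing alone already finds the optimum here. The evolutionary layer matters on fitness landscapes with feature interactions, where local search can stall in a local optimum. The extra fitness evaluations spent in `local_search` also illustrate the computational overhead the abstract lists as a limitation.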
With the growing reliance on LLMs for a wide range of NLP tasks, optimizing the use of labeled and unlabeled data for effective context generation has become critical. This work explores the interplay between two prominent methodologies in few-shot learning: in-context learning (ICL), which utilizes labeled task-specific data, and retrieval-augmented generation (RAG), which leverages unlabeled external knowledge to augment generative models. Since each has its individual limitations, we propose a novel hybrid approach to obtain “the best of both worlds” by dynamically integrating both labeled and unlabeled data towards improving the downstream performance of LLMs. Our methodology, which we call LU-RAG (labeled and unlabeled RAG), recomputes the scores of top-k labeled instances and top-m unlabeled passages to refine context selection. Our experimental results demonstrate that LU-RAG consistently outperforms both standalone ICL and RAG across multiple benchmarks, showing significant gains in downstream performance. Furthermore, we show that LU-RAG performs better with a semantic neighborhood as compared to a lexical one, highlighting its ability to generalize effectively.
{"title":"The “Curious Case of Contexts” in Retrieval-Augmented Generation With a Combination of Labeled and Unlabeled Data","authors":"Payel Santra, Madhusudan Ghosh, Debasis Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar","doi":"10.1002/widm.70021","DOIUrl":"https://doi.org/10.1002/widm.70021","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"134 1"},"publicationDate":"2025-05-29","publicationTypes":"Journal Article"}
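The LU-RAG abstract above states only that the scores of the top-k labeled instances and top-m unlabeled passages are recomputed to refine context selection. The sketch below shows one way such a merge could look. It is an illustration under assumptions, not the authors' method: the `Candidate` type, the `select_context` function, the final cut-off `n`, and the use of Jaccard token overlap (a lexical stand-in for whatever scoring the paper actually uses) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Candidate:
    text: str
    label: Optional[str] = None  # None marks an unlabeled passage


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a lexical placeholder for a real scorer."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_context(
    query: str,
    labeled: List[Candidate],
    unlabeled: List[Candidate],
    k: int = 2,
    m: int = 2,
    n: int = 3,
    retrieve: Callable[[str, str], float] = jaccard,
    rescore: Callable[[str, str], float] = jaccard,
) -> List[Candidate]:
    """Retrieve top-k labeled and top-m unlabeled candidates separately,
    then rescore the merged pool on one scale and keep the best n."""

    def ranked(items, fn):
        return sorted(items, key=lambda c: fn(query, c.text), reverse=True)

    pool = ranked(labeled, retrieve)[:k] + ranked(unlabeled, retrieve)[:m]
    return ranked(pool, rescore)[:n]


labeled = [
    Candidate("loved this great movie", "positive"),
    Candidate("terrible boring film", "negative"),
    Candidate("great acting", "positive"),
]
unlabeled = [
    Candidate("the movie was released in 1999"),
    Candidate("critics loved it"),
    Candidate("weather report"),
]
context = select_context("great movie loved it", labeled, unlabeled)
for c in context:
    print(c.label, "|", c.text)
```

With this toy data the selected context mixes two labeled examples with one unlabeled passage. Swapping `rescore` for an embedding-based similarity would correspond to the semantic neighborhood that the abstract reports as working better than a lexical one.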
Recommendation systems predict user interests and recommend items for online platforms including e-commerce, social networks, and decision systems. However, data bias has become a significant obstacle, severely impacting the accuracy, fairness, and reliability of recommendation results. This survey examines causal inference for optimizing recommendation systems and mitigating data bias, addressing three questions: (1) Bias types and performance impacts; (2) Causal inference mitigation methods; (3) Approach advantages, limitations, and research opportunities. The motivation for this survey stems from the limitations of traditional debiasing methods, which often fail to account for causal relationships and struggle in dynamic, real-world scenarios. Causal inference provides a robust framework for identifying and addressing the underlying causes of bias, enabling more transparent and accurate recommendation systems. Therefore, we define three critical stages of bias: bias in the data stage, model selection stage, and model evaluation stage. For each stage, causal inference-based optimization methods are introduced and critically analyzed. Going beyond traditional debiasing methods, this study also analyzes data augmentation and regularization techniques as potential strategies for future research. Overall, the survey aims to highlight the ability of causal inference to uncover and control confounding factors, offering deeper insights into the mechanisms driving biases.
{"title":"A Survey on Causal Inference-Driven Data Bias Optimization in Recommendation Systems: Principles, Opportunities and Challenges","authors":"Yongkang Li, Xingyu Zhu, Yuheng Wu, Wenxu Zhao, Xiaona Xia","doi":"10.1002/widm.70020","DOIUrl":"https://doi.org/10.1002/widm.70020","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"59 1"},"publicationDate":"2025-05-24","publicationTypes":"Journal Article"}
Artificial intelligence (AI) is emerging as a transformative force in waste management practices, enabling new ways of improving efficiency and effectiveness. This survey presents methods related to waste management, categorized systematically to clarify the effectiveness of various AI-based techniques. The study undertakes a critical review of relevant research works that epitomize major advances and methodologies of AI-driven waste management. The manuscript provides an exhaustive taxonomy, dividing AI methods into Supervised Learning, Unsupervised Learning, and Reinforcement Learning, and then subdividing Supervised Learning into four broad categories: Machine Learning-based Classification, CNNs, Transfer Learning, and Hybrid or Ensemble Learning. We further evaluate the datasets used in performance benchmarking and the efficacy of the various AI models. We also discuss critical issues, such as the quality of available data, poor generalization of models, and integration of systems. Future research directions that could help surmount these challenges are also discussed. This survey aims to present a structured framework for understanding current AI applications in waste management, thereby guiding ongoing and future research in the field.
{"title":"Artificial Intelligence-Based Waste Management: A Review of Classification, Techniques, Issues, and Challenges","authors":"Dhanashree Vipul Yevle, Palvinder Singh Mann","doi":"10.1002/widm.70025","DOIUrl":"https://doi.org/10.1002/widm.70025","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"17 1"},"publicationDate":"2025-05-19","publicationTypes":"Journal Article"}
Simi Job, Xiaohui Tao, Taotao Cai, Haoran Xie, Lin Li, Qing Li, Jianming Yong
In machine learning, exploring data correlations to predict outcomes is a fundamental task. Recognizing causal relationships embedded within data is pivotal for a comprehensive understanding of system dynamics, the significance of which is paramount in data-driven decision-making processes. Beyond traditional methods, there has been a shift toward using graph neural networks (GNNs) for causal learning, given their capabilities as universal data approximators. Thus, a thorough review of the advancements in causal learning using GNNs is both relevant and timely. To structure this review, we introduce a novel taxonomy that encompasses various state-of-the-art GNN methods used in studying causality. GNNs are further categorized based on their applications in the causality domain. We further provide an exhaustive compilation of datasets integral to causal learning with GNNs to serve as a resource for practical study. This review also touches upon the application of causal learning across diverse sectors. We conclude the review with insights into potential challenges and promising avenues for future exploration in this rapidly evolving field of machine learning.
{"title":"Exploring Causal Learning Through Graph Neural Networks: An In-Depth Review","authors":"Simi Job, Xiaohui Tao, Taotao Cai, Haoran Xie, Lin Li, Qing Li, Jianming Yong","doi":"10.1002/widm.70024","DOIUrl":"https://doi.org/10.1002/widm.70024","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"97 1"},"publicationDate":"2025-05-19","publicationTypes":"Journal Article"}
Loan T. T. Nguyen, Trang T. D. Nguyen, Quang-Thinh Bui, Bay Vo
Geospatial data enhances traditional datasets by integrating spatial and temporal dimensions, facilitating advanced visualizations and comprehensive analytical insights. As a fundamental aspect of geospatial analytics, geospatial data clustering (GDC) has become a prominent area of academic research, playing a critical role in theoretical exploration and applied domains. GDC seeks to group geospatial objects based on inherent similarities, a necessity driven by modern datasets' increasing scale and complexity, particularly those within geographic information systems (GIS). This paper highlights key challenges and advancements in GDC, including spatial data clustering (SDC), clustering techniques within GIS, and algorithms designed for geospatial data clustering in network spaces (GDC in NS). Practical implementations of these methodologies encompass diverse applications such as hotspot analysis, infectious disease monitoring, transportation optimization, urban traffic management, and emergency response planning. These contributions are foundational for advancing scholarly research and addressing domain-specific challenges in this field.
{"title":"Geospatial Data Clustering in Network Space: A Survey","authors":"Loan T. T. Nguyen, Trang T. D. Nguyen, Quang-Thinh Bui, Bay Vo","doi":"10.1002/widm.70023","DOIUrl":"https://doi.org/10.1002/widm.70023","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"3 1"},"publicationDate":"2025-05-16","publicationTypes":"Journal Article"}
Abdul Aziz Noor, Awais Manzoor, Muhammad Deedahwar Mazhar Qureshi, M. Atif Qureshi, Wael Rashwan
This overview investigates the evolution and current landscape of eXplainable Artificial Intelligence (XAI) in healthcare, highlighting its implications for researchers, technology developers, and policymakers. Following the PRISMA protocol, we analyzed 89 publications from January 2000 to June 2024, spanning 19 medical domains, with a focus on Neurology and Cancer as the most studied areas. Various data types are reviewed, including tabular data, medical imaging, and clinical text, offering a comprehensive perspective on XAI applications. Key findings identify significant gaps, such as the limited availability of public datasets, suboptimal data preprocessing techniques, insufficient feature selection and engineering, and the limited utilization of multiple XAI methods. Additionally, the lack of standardized XAI evaluation metrics and practical obstacles in integrating XAI systems into clinical workflows are emphasized. We provide actionable recommendations, including the design of explainability‐centric models, the application of diverse and multiple XAI methods, and the fostering of interdisciplinary collaboration. These strategies aim to guide researchers in building robust AI models, assist technology developers in creating intuitive and user‐friendly AI tools, and inform policymakers in establishing effective regulations. Addressing these gaps will promote the development of transparent, reliable, and user‐centred AI systems in healthcare, ultimately improving decision‐making and patient outcomes.
{"title":"Unveiling Explainable AI in Healthcare: Current Trends, Challenges, and Future Directions","authors":"Abdul Aziz Noor, Awais Manzoor, Muhammad Deedahwar Mazhar Qureshi, M. Atif Qureshi, Wael Rashwan","doi":"10.1002/widm.70018","DOIUrl":"https://doi.org/10.1002/widm.70018","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"38 1"},"publicationDate":"2025-05-12","publicationTypes":"Journal Article"}
The maintenance advancements achieved in Industry 4.0 generate large amounts of data, necessitating complete, accurate, and precise labels for training datasets to align with corresponding ground truth. These labels serve as annotations for early anomaly detection. Delivering high‐quality annotations derived from weak labels and striking a balance between annotation efforts and accuracy are critical tasks. Consequently, researchers have focused their attention on Weakly Supervised Learning methods, which have shown effectiveness in handling datasets characterized by incomplete, imprecise, and erroneous labels across various maintenance applications. In this survey, the authors aim to address a gap in the existing literature by conducting a comprehensive examination of Weakly Supervised Learning for Predictive Maintenance, categorizing related works. Furthermore, the survey discusses challenges and identifies open research lines.
{"title":"Weak Supervision: A Survey on Predictive Maintenance","authors":"Antonio M. Martínez‐Heredia, Sebastián Ventura","doi":"10.1002/widm.70022","DOIUrl":"https://doi.org/10.1002/widm.70022","journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"19 1"},"publicationDate":"2025-05-12","publicationTypes":"Journal Article"}