Ali Kohan, Amir Zahedi, Roohallah Alizadehsani, Ru‐San Tan, U. Rajendra Acharya
Intracranial hemorrhage (IH) is a critical condition requiring rapid and accurate diagnosis to ensure effective treatment and reduce mortality rates. Recently, artificial intelligence (AI) models have demonstrated significant potential in automating the detection and analysis of brain injuries in IH patients. However, the “black‐box” nature of many AI systems raises concerns about transparency, reliability, and clinical applicability. Explainable AI (XAI) addresses these challenges by making AI models more interpretable, allowing healthcare professionals to understand and trust the decision‐making processes. This review paper explores various XAI techniques (such as SHapley Additive exPlanations (SHAP), Local Interpretable Model‐Agnostic Explanations (LIME), Randomized Input Sampling for Explanation (RISE), and Class Activation Mapping (CAM) and its variants) and their specific applications in IH clinical tasks. We systematically examine studies incorporating XAI in the care of IH patients, highlighting how these methods enhance model transparency and support clinical decision‐making. The Preferred Reporting Items for Systematic Reviews and Meta‐Analyses (PRISMA) methodology was employed to select the papers. Studies are categorized into those using tabular data and those using image data. The literature indicates a rapidly growing number of XAI publications in this field. SHAP is the most commonly used XAI method for tabular data, while CAM‐based methods, such as Grad‐CAM, dominate in image‐based applications. Furthermore, we discuss current limitations of XAI methods and future research directions. This review aims to provide researchers and clinicians with valuable insights into the role of XAI in improving the reliability and practical integration of AI‐driven tools for IH patient care. This article is categorized under: Application Areas > Health Care; Fundamental Concepts of Data and Knowledge > Explainable AI; Technologies > Machine Learning
Application of Explainable Artificial Intelligence (XAI) Techniques in Patients With Intracranial Hemorrhage: A Systematic Review. WIREs Data Mining and Knowledge Discovery, published 2025-06-28. DOI: 10.1002/widm.70031 (https://doi.org/10.1002/widm.70031)
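The attribution idea behind SHAP, named in the abstract above as the most common method for tabular data, can be illustrated with a minimal sketch that computes exact Shapley values by enumerating feature coalitions. This is not the `shap` library and not a method taken from the review: the function name, the `predict` callable, and the choice of a fixed `baseline` vector to stand in for absent features are all assumptions made for illustration, and the enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution for one input x.

    Assumption: a feature that is 'absent' from a coalition takes its
    value from `baseline`. Cost grows as 2^n, so keep n small.
    """
    n = len(x)

    def value(subset):
        # Model output when only the features in `subset` come from x.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                coalition = set(s)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi
```

For a linear model, each attribution should reduce to the weight times the difference between the feature and its baseline, and the attributions should sum to the gap between the prediction for `x` and the prediction for the baseline, which gives a simple sanity check.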
Mangayarkarasi Ramaiah, Prabhavathy Settu, Vinayakumar Ravi
Forecasting soil moisture is critical for keeping groundwater levels stable, monitoring droughts, and assisting agricultural productivity. Surface soil moisture has a tremendous impact on both the environment and society. Estimating soil moisture reliably requires the right tools. Gravimetric, physical, and empirical models produce reliable results, but they are generally context‐dependent and inappropriate for large‐scale investigations. Remote sensing has developed as a credible technology for estimating large‐scale soil moisture levels. However, various obstacles exist when obtaining soil moisture data using remote sensing, including the availability and precision of data sources. The spatial and temporal limits of many remote sensing sources, such as microwave and optical sensors, combined with environmental conditions, pose considerable feasibility issues. As a result, a robust model capable of accurately capturing both linear and nonlinear connections between multiple surface soil variables is critical. Recently, AI approaches have been identified as promising options for managing complicated factors in this domain. This review paper investigates the use of several AI algorithms for estimating soil moisture content (SMC). It focusses on AI‐enabled frameworks built with remote sensing satellite imagery. In addition to covering in situ observations, the study discusses the advantages of AI approaches and the issues they solve, and provides a detailed description of the integration of microwave, optical, and combination (synergistic) data sources. This paper also addresses the most common AI approaches applied with various types of remote sensing data and the results they produced. By exploring the strengths and technical problems associated with diverse data sources, this work hopes to help researchers make wise choices about data selection and model construction. 
Finally, the proposed future research directions are likely to assist emerging researchers in broadening the scope of this critical topic in a way that corresponds with future demands. This article is categorized under: Technologies > Artificial Intelligence; Technologies > Machine Learning; Technologies > Prediction
Artificial Intelligence Techniques Enabled Soil Moisture Estimation Frameworks Using Remote Sensing Satellite Images: Challenges and Future Directions‐ Review. WIREs Data Mining and Knowledge Discovery, published 2025-06-27. DOI: 10.1002/widm.70032 (https://doi.org/10.1002/widm.70032)
The success of social media platforms has facilitated the emergence of various forms of online abuse within digital communities. This abuse manifests in multiple ways, including hate speech, cyberbullying, emotional abuse, grooming, and shame sexting or sextortion. In this paper, we present a comprehensive analysis of the different forms of abuse prevalent in social media, with a particular focus on how emerging technologies, such as Language Models (LMs) and Large Language Models (LLMs), are reshaping both the detection and generation of abusive content within these networks. We delve into the mechanisms through which social media abuse is perpetuated, exploring the psychological and social impact. To achieve this, we conducted a literature review based on PRISMA methodology, deriving key insights in the field of cyber abuse detection. Additionally, we examine the dual role of advanced language models, highlighting their potential to enhance automated detection systems for abusive behavior while also acknowledging their capacity to generate harmful content. This paper contributes to the ongoing discourse on online safety and ethics by offering both theoretical and practical insights into the evolving landscape of cyber abuse, as well as the technological innovations that simultaneously mitigate and exacerbate it. The findings support platform administrators and policymakers in developing more effective moderation strategies, conducting comprehensive risk assessments, and integrating AI responsibly to create safer digital environments. This article is categorized under: Algorithmic Development > Web Mining; Technologies > Classification
A Literature Review of Textual Cyber Abuse Detection Using Cutting‐Edge Natural Language Processing Techniques: Language Models and Large Language Models. J. Angel Diaz‐Garcia, Joao Paulo Carvalho. WIREs Data Mining and Knowledge Discovery, published 2025-06-27. DOI: 10.1002/widm.70029 (https://doi.org/10.1002/widm.70029)
Heterogeneous Social Networks (HSNs) represent complex structures where diverse entities, such as users, items, and interactions, coexist and interact within a unified framework. This paper offers a systematic review of HSN analysis, addressing the theoretical and practical challenges associated with investigating the interplay between varied node types and diverse relationships within HSNs. The paper begins by defining HSNs and outlining their characteristics, highlighting the existence of diverse entity kinds and a range of relationship types. It explores the significance of HSNs in modeling real‐world systems, including online social platforms, biological networks, e‐commerce networks, and recommendation systems, where diverse entities play distinct roles. The analysis of HSNs extends beyond traditional homogeneous networks, incorporating various types of nodes and edges, and introduces novel considerations for effective analysis. The difficulties in modeling, representing, and analyzing HSNs are also covered in this work. Several reviews of social network analysis have been published in the past, but they often focus on simple networks, not HSN analysis specifically. This paper aims to fill that gap by comprehensively reviewing different aspects of HSNs and their analysis. We start with the fundamentals of HSNs, explore their major types (multi‐relational networks and multi‐modal networks), and then examine their impact on popular data mining tasks. Then, we explore various applications of heterogeneous information network analysis, like recommender systems, text mining, fraud detection, and e‐commerce. Finally, we look at recent research and suggest promising future directions in the field of HSN analysis.
An Overview of Heterogeneous Social Network Analysis. Deepti Singh, Ankita Verma. WIREs Data Mining and Knowledge Discovery, published 2025-06-13. DOI: 10.1002/widm.70028 (https://doi.org/10.1002/widm.70028)
Automating vehicle damage detection is essential for automotive industry applications like insurance claims, online sales, and repair cost estimates, addressing the labor-intensive, time-consuming, and error-prone nature of current manual inspections. This systematic literature review explores the use of artificial intelligence (AI), particularly deep learning-based algorithms, to improve the accuracy and efficiency of damage detection under dynamic and challenging conditions specific to the requirements of our industry partners. The review is structured around five key research questions and includes extensive empirical evaluations to identify gaps and challenges in existing methods. Findings reveal significant potential for AI to automate and enhance the damage detection process but also highlight areas requiring further research and development. The review discusses these gaps in detail, providing a comprehensive foundation for future work in this field. Furthermore, the review findings are intended to guide both our research and the broader research community in advancing the practical application of AI for vehicle damage assessment. The insights gained from this review are crucial for developing robust AI solutions that can operate effectively in real-world scenarios, ultimately improving operational efficiency and customer experience in the automotive industry.
Vehicle Damage Detection Using Artificial Intelligence: A Systematic Literature Review. Md Jahid Hasan, Cong Kha Nguyen, Yee Ling Boo, Hamed Jahani, Kok-Leong Ong. WIREs Data Mining and Knowledge Discovery, published 2025-06-07. DOI: 10.1002/widm.70027 (https://doi.org/10.1002/widm.70027)
This review paper presents a comprehensive analysis of the memetic algorithms (MAs) for feature selection (FS), particularly in high‐dimensional datasets. MAs effectively address the challenges of feature selection by combining the global exploration capabilities of evolutionary algorithms with the local optimization of search techniques. Their hybrid nature makes them well suited for tackling the complexity, scalability, and computational demands of FS problems across various domains, including bioinformatics, image processing, and financial forecasting. This review highlights the recent advancements, customized variants, and practical applications of MA‐based FS methods while providing critical insights into their limitations, such as computational overhead and overfitting. Additionally, the paper outlines future research directions to further enhance the efficacy of MAs in feature selection, offering a balanced perspective on their contributions to the field.
Advances in Feature Selection Using Memetic Algorithms: A Comprehensive Review. Keerthi Gabbi Reddy, Deepasikha Mishra. WIREs Data Mining and Knowledge Discovery, published 2025-06-03. DOI: 10.1002/widm.70026 (https://doi.org/10.1002/widm.70026)
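The hybrid structure described in the abstract above, evolutionary global exploration combined with local optimization, can be sketched roughly as follows. This is a toy illustration under stated assumptions rather than any specific variant from the review: the function names, the one-point crossover, the single-bit mutation, and the bit-flip hill climber are all choices made here, and the fitness callable is hypothetical (in practice it would typically be a cross-validated model score with a penalty on subset size).

```python
import random


def local_search(mask, fitness):
    """Local refinement step: first-improvement bit-flip hill climbing."""
    best = fitness(mask)
    improved = True
    while improved:
        improved = False
        for i in range(len(mask)):
            mask[i] ^= 1              # try toggling feature i
            score = fitness(mask)
            if score > best:
                best = score
                improved = True
            else:
                mask[i] ^= 1          # undo a flip that did not help
    return mask


def memetic_fs(n_features, fitness, pop_size=10, generations=5, seed=0):
    """Evolutionary search over 0/1 feature masks with local refinement.

    Assumes n_features >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Global step: keep the better half as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]              # one-point crossover
            child[rng.randrange(n_features)] ^= 1  # single-bit mutation
            # Memetic step: refine each offspring locally.
            children.append(local_search(child, fitness))
        pop = parents + children
    return max(pop, key=fitness)


# Hypothetical toy fitness: reward three 'relevant' features and
# charge a small cost for every selected feature.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)
```

Calling `memetic_fs(8, toy_fitness)` returns a 0/1 mask over eight features. Because this toy fitness is separable, the local search alone already finds the best subset; the evolutionary layer matters for fitness functions with feature interactions, which is also where the computational overhead mentioned in the abstract comes from, since every offspring triggers many extra fitness evaluations.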
With the growing reliance on LLMs for a wide range of NLP tasks, optimizing the use of labeled and unlabeled data for effective context generation has become critical. This work explores the interplay between two prominent methodologies in few-shot learning: in-context learning (ICL), which utilizes labeled task-specific data, and retrieval-augmented generation (RAG), which leverages unlabeled external knowledge to augment generative models. Since each has its individual limitations, we propose a novel hybrid approach to obtain “the best of both worlds” by dynamically integrating both labeled and unlabeled data towards improving the downstream performance of LLMs. Our methodology, which we call LU-RAG (labeled and unlabeled RAG), recomputes the scores of top-k labeled instances and top-m unlabeled passages to refine context selection. Our experimental results demonstrate that LU-RAG consistently outperforms both standalone ICL and RAG across multiple benchmarks, showing significant gains in downstream performance. Furthermore, we show that LU-RAG performs better with a semantic neighborhood as compared to a lexical one, highlighting its ability to generalize effectively.
The “Curious Case of Contexts” in Retrieval-Augmented Generation With a Combination of Labeled and Unlabeled Data. Payel Santra, Madhusudan Ghosh, Debasis Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar. WIREs Data Mining and Knowledge Discovery, published 2025-05-29. DOI: 10.1002/widm.70021 (https://doi.org/10.1002/widm.70021)
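The abstract above describes recomputing the scores of top-k labeled instances and top-m unlabeled passages to refine context selection, without giving the rescoring rule. The sketch below shows one way such a merge could look. The rule used here (min-max normalization within each pool, then a joint ranking) is an assumption for illustration and is not claimed to be the LU-RAG method; the function names, the `budget` parameter, and the use of plain-list embeddings are likewise hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_n(query, pool, n):
    """Return the n (score, text) pairs most similar to the query."""
    scored = [(cosine(query, vec), text) for text, vec in pool]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def min_max(scored):
    """Rescale scores to [0, 1] so the two pools become comparable."""
    if not scored:
        return []
    values = [s for s, _ in scored]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [(1.0, text) for _, text in scored]
    return [((s - lo) / (hi - lo), text) for s, text in scored]


def select_context(query, labeled, unlabeled, k, m, budget):
    """Merge top-k labeled and top-m unlabeled candidates into one ranking.

    Assumed rescoring rule: per-pool min-max normalization, then a joint
    sort. Ties keep labeled items first because the sort is stable.
    """
    lab = [(s, "labeled", t) for s, t in min_max(top_n(query, labeled, k))]
    unl = [(s, "unlabeled", t) for s, t in min_max(top_n(query, unlabeled, m))]
    merged = sorted(lab + unl, key=lambda item: item[0], reverse=True)
    return merged[:budget]
```

Each pool is a list of `(text, embedding)` pairs. Normalizing within each pool is one simple answer to the problem that raw similarity scores for labeled examples and for unlabeled passages may sit on different scales; a learned or weighted combination would be an equally plausible alternative.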
Recommendation systems predict user interests and recommend items for online platforms including e-commerce, social networks, and decision systems. However, data bias has become a significant obstacle, severely impacting the accuracy, fairness, and reliability of recommendation results. This survey examines causal inference for optimizing recommendation systems and mitigating data bias, addressing three questions: (1) Bias types and performance impacts; (2) Causal inference mitigation methods; (3) Approach advantages, limitations, and research opportunities. The motivation for this survey stems from the limitations of traditional debiasing methods, which often fail to account for causal relationships and struggle in dynamic, real-world scenarios. Causal inference provides a robust framework for identifying and addressing the underlying causes of bias, enabling more transparent and accurate recommendation systems. Therefore, we define three critical stages of bias: bias in the data stage, model selection stage, and model evaluation stage. For each stage, causal inference-based optimization methods are introduced and critically analyzed. Unlike traditional debiasing methods, this study analyzes data augmentation and regularization techniques as potential strategies for future research. Overall, this survey highlights the ability of causal inference to uncover and control confounding factors, offering deeper insights into the mechanisms driving biases.
A Survey on Causal Inference-Driven Data Bias Optimization in Recommendation Systems: Principles, Opportunities and Challenges. Yongkang Li, Xingyu Zhu, Yuheng Wu, Wenxu Zhao, Xiaona Xia. WIREs Data Mining and Knowledge Discovery, published 2025-05-24. DOI: 10.1002/widm.70020 (https://doi.org/10.1002/widm.70020)
Artificial intelligence (AI) is emerging as a transformative force in waste management, enabling new ways to improve efficiency and effectiveness. This survey systematically categorizes AI-based waste management methods to clarify how effective the various techniques are. The study critically reviews relevant research that represents the major advances and methodologies of AI-driven waste management. The manuscript provides an exhaustive taxonomy, dividing AI methods into Supervised Learning, Unsupervised Learning, and Reinforcement Learning, and then subdividing Supervised Learning into four broad categories: Machine Learning-based Classification, CNNs, Transfer Learning, and Hybrid or Ensemble Learning. We further evaluate the datasets used for performance benchmarking and the efficacy of the various AI models. We also discuss critical issues, such as the quality of available data, poor model generalization, and system integration. Future research directions that could help overcome these challenges are also discussed. This survey aims to present a structured framework for understanding current AI applications in waste management, thereby guiding ongoing and future research in the field.
Artificial Intelligence-Based Waste Management: A Review of Classification, Techniques, Issues, and Challenges. Dhanashree Vipul Yevle, Palvinder Singh Mann. WIREs Data Mining and Knowledge Discovery, published 2025-05-19. https://doi.org/10.1002/widm.70025
Simi Job, Xiaohui Tao, Taotao Cai, Haoran Xie, Lin Li, Qing Li, Jianming Yong
In machine learning, exploring data correlations to predict outcomes is a fundamental task. Recognizing the causal relationships embedded within data is pivotal for a comprehensive understanding of system dynamics, which in turn is essential to data-driven decision-making. Beyond traditional methods, there has been a shift toward using graph neural networks (GNNs) for causal learning, given their capabilities as universal data approximators. Thus, a thorough review of the advancements in causal learning using GNNs is both relevant and timely. To structure this review, we introduce a novel taxonomy that encompasses various state-of-the-art GNN methods used in studying causality. GNNs are further categorized based on their applications in the causality domain. We also provide an exhaustive compilation of datasets integral to causal learning with GNNs to serve as a resource for practical study. This review also touches upon the application of causal learning across diverse sectors. We conclude the review with insights into potential challenges and promising avenues for future exploration in this rapidly evolving field of machine learning.
Exploring Causal Learning Through Graph Neural Networks: An In-Depth Review. Simi Job, Xiaohui Tao, Taotao Cai, Haoran Xie, Lin Li, Qing Li, Jianming Yong. WIREs Data Mining and Knowledge Discovery, published 2025-05-19. https://doi.org/10.1002/widm.70024