
WIREs Data Mining and Knowledge Discovery: Latest Articles

An Overview of Heterogeneous Social Network Analysis
Pub Date : 2025-06-13 DOI: 10.1002/widm.70028
Deepti Singh, Ankita Verma
Heterogeneous Social Networks (HSNs) represent complex structures where diverse entities, such as users, items, and interactions, coexist and interact within a unified framework. This paper offers a systematic review of HSN analysis, addressing the theoretical and practical challenges associated with investigating the interplay between varied node types and diverse relationships within HSNs. The paper begins by defining HSNs and outlining their characteristics, highlighting the existence of diverse entity types and a range of relationship types. It explores the significance of HSNs in modeling real‐world systems, including online social platforms, biological networks, e‐commerce networks, and recommendation systems, where diverse entities play distinct roles. The analysis of HSNs extends beyond traditional homogeneous networks, incorporating various types of nodes and edges, and introduces novel considerations for effective analysis. The difficulties in modeling, representing, and analyzing HSNs are also covered. Several reviews of social network analysis have been published in the past, but they often focus on simple networks rather than HSN analysis specifically. This paper aims to fill that gap by comprehensively reviewing different aspects of HSNs and their analysis. We start with the fundamentals of HSNs, explore their major types (multi‐relational networks and multi‐modal networks), and examine their impact on popular data mining tasks. Then, we explore various applications of heterogeneous information network analysis, such as recommender systems, text mining, fraud detection, and e‐commerce. Finally, we look at recent research and suggest promising future directions in the field of HSN analysis.
Citations: 0
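The abstract above describes HSNs as networks with several node types and several relationship types. As a rough illustration only, a typed graph can be sketched as below. The class name `HeteroGraph`, its methods, and the example node and relation names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class HeteroGraph:
    """Minimal typed graph: every node has a type, every edge has a relation."""

    def __init__(self):
        self.node_type = {}            # node -> node type (e.g. "user", "item")
        self.adj = defaultdict(list)   # node -> list of (relation, neighbor)

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, src, relation, dst):
        # Stored in both directions, i.e. edges are treated as undirected here.
        self.adj[src].append((relation, dst))
        self.adj[dst].append((relation, src))

    def neighbors(self, node, relation=None, ntype=None):
        """Neighbors of `node`, optionally filtered by relation and/or node type."""
        return [
            v for r, v in self.adj[node]
            if (relation is None or r == relation)
            and (ntype is None or self.node_type.get(v) == ntype)
        ]


# Hypothetical example: two users and one item, with two relation types.
g = HeteroGraph()
g.add_node("alice", "user")
g.add_node("bob", "user")
g.add_node("book", "item")
g.add_edge("alice", "follows", "bob")
g.add_edge("alice", "buys", "book")
g.add_edge("bob", "buys", "book")
print(g.neighbors("alice", relation="buys"))
print(g.neighbors("book", ntype="user"))
```

Filtering by relation or by node type is what separates this from a homogeneous graph, where a single adjacency list per node would suffice.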
Vehicle Damage Detection Using Artificial Intelligence: A Systematic Literature Review
Pub Date : 2025-06-07 DOI: 10.1002/widm.70027
Md Jahid Hasan, Cong Kha Nguyen, Yee Ling Boo, Hamed Jahani, Kok-Leong Ong
Automating vehicle damage detection is essential for automotive industry applications like insurance claims, online sales, and repair cost estimates, addressing the labor-intensive, time-consuming, and error-prone nature of current manual inspections. This systematic literature review explores the use of artificial intelligence (AI), particularly deep learning-based algorithms, to improve the accuracy and efficiency of damage detection under dynamic and challenging conditions specific to the requirements of our industry partners. The review is structured around five key research questions and includes extensive empirical evaluations to identify gaps and challenges in existing methods. Findings reveal significant potential for AI to automate and enhance the damage detection process but also highlight areas requiring further research and development. The review discusses these gaps in detail, providing a comprehensive foundation for future work in this field. Furthermore, the review findings are intended to guide both our research and the broader research community in advancing the practical application of AI for vehicle damage assessment. The insights gained from this review are crucial for developing robust AI solutions that can operate effectively in real-world scenarios, ultimately improving operational efficiency and customer experience in the automotive industry.
Citations: 0
Advances in Feature Selection Using Memetic Algorithms: A Comprehensive Review
Pub Date : 2025-06-03 DOI: 10.1002/widm.70026
Keerthi Gabbi Reddy, Deepasikha Mishra
This review paper presents a comprehensive analysis of memetic algorithms (MAs) for feature selection (FS), particularly in high‐dimensional datasets. MAs effectively address the challenges of feature selection by combining the global exploration capabilities of evolutionary algorithms with the local optimization performed by local search techniques. Their hybrid nature makes them well suited for tackling the complexity, scalability, and computational demands of FS problems across various domains, including bioinformatics, image processing, and financial forecasting. This review highlights the recent advancements, customized variants, and practical applications of MA‐based FS methods while providing critical insights into their limitations, such as computational overhead and overfitting. Additionally, the paper outlines future research directions to further enhance the efficacy of MAs in feature selection, offering a balanced perspective on their contributions to the field.
Citations: 0
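The abstract above characterizes memetic algorithms as evolutionary (global) search combined with local search. A minimal sketch of that hybrid loop for feature selection might look like the following. Everything here is an assumption made for illustration: the function names, the bit-mask encoding, and especially the toy `fitness`, which simply rewards a known set of relevant features. A real wrapper method would score each mask with a trained classifier instead.

```python
import random


def fitness(mask, relevant, penalty=0.1):
    """Toy objective (assumption): +1 per selected relevant feature,
    minus a small penalty for every selected feature."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    return hits - penalty * sum(mask)


def local_search(mask, relevant):
    """Local refinement: flip single bits for as long as a flip improves fitness."""
    best = list(mask)
    best_fit = fitness(best, relevant)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = best[:]
            cand[i] = 1 - cand[i]
            f = fitness(cand, relevant)
            if f > best_fit:
                best, best_fit, improved = cand, f, True
    return best


def memetic_fs(n_features, relevant, pop_size=10, generations=20, seed=0):
    """Evolutionary loop (global search) with local search applied to every individual."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop = [local_search(ind, relevant) for ind in pop]        # memetic step
        pop.sort(key=lambda m: fitness(m, relevant), reverse=True)
        parents = pop[: pop_size // 2]                            # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)                    # one-point crossover
            child = a[:cut] + b[cut:]
            j = rng.randrange(n_features)
            child[j] = 1 - child[j]                               # single-bit mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, relevant))


print(memetic_fs(8, {1, 4, 6}))
```

Because the toy objective is separable, local search alone already reaches the optimum here; the evolutionary operators only start to matter when features interact, which is the usual case with a real classifier-based score.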
The “Curious Case of Contexts” in Retrieval-Augmented Generation With a Combination of Labeled and Unlabeled Data
Pub Date : 2025-05-29 DOI: 10.1002/widm.70021
Payel Santra, Madhusudan Ghosh, Debasis Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar
With the growing reliance on LLMs for a wide range of NLP tasks, optimizing the use of labeled and unlabeled data for effective context generation has become critical. This work explores the interplay between two prominent methodologies in few-shot learning: in-context learning (ICL), which utilizes labeled task-specific data, and retrieval-augmented generation (RAG), which leverages unlabeled external knowledge to augment generative models. Since each has its individual limitations, we propose a novel hybrid approach to obtain “the best of both worlds” by dynamically integrating both labeled and unlabeled data towards improving the downstream performance of LLMs. Our methodology, which we call LU-RAG (labeled and unlabeled RAG), recomputes the scores of top-k labeled instances and top-m unlabeled passages to refine context selection. Our experimental results demonstrate that LU-RAG consistently outperforms both standalone ICL and RAG across multiple benchmarks, showing significant gains in downstream performance. Furthermore, we show that LU-RAG performs better with a semantic neighborhood as compared to a lexical one, highlighting its ability to generalize effectively.
Citations: 0
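The abstract above says that LU-RAG rescores the top-k labeled instances and top-m unlabeled passages to refine context selection, but it does not say how the rescoring is done. The sketch below only illustrates the general shape of such a step: retrieve both candidate lists, put them on one common scale, and keep the best under a budget. The bag-of-words cosine score, the `budget` parameter, and all function names are assumptions, not the authors' method.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def select_context(query, labeled, unlabeled, k=2, m=2, budget=3):
    """Pool top-k labeled examples and top-m unlabeled passages, then re-rank jointly.

    labeled:   list of (text, label) pairs
    unlabeled: list of passage strings
    Returns up to `budget` tuples (source, context_text, score), best first.
    """
    q = bow(query)
    top_labeled = sorted(labeled, key=lambda ex: cosine(q, bow(ex[0])), reverse=True)[:k]
    top_unlabeled = sorted(unlabeled, key=lambda p: cosine(q, bow(p)), reverse=True)[:m]
    # Placeholder rescoring (assumption): reuse the same similarity as a common scale.
    pool = [("labeled", f"{text} -> {label}", cosine(q, bow(text)))
            for text, label in top_labeled]
    pool += [("unlabeled", p, cosine(q, bow(p))) for p in top_unlabeled]
    pool.sort(key=lambda item: item[2], reverse=True)
    return pool[:budget]


labeled = [("great movie", "positive"), ("terrible plot", "negative"),
           ("great acting", "positive")]
unlabeled = ["the movie was released in 2010", "cooking pasta takes ten minutes"]
for source, text, score in select_context("great movie", labeled, unlabeled):
    print(source, round(score, 3), text)
```

Swapping `cosine` over word counts for a dense-embedding similarity would correspond to the semantic neighborhood that the abstract reports as working better than a lexical one.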
A Survey on Causal Inference-Driven Data Bias Optimization in Recommendation Systems: Principles, Opportunities and Challenges
Pub Date : 2025-05-24 DOI: 10.1002/widm.70020
Yongkang Li, Xingyu Zhu, Yuheng Wu, Wenxu Zhao, Xiaona Xia
Recommendation systems predict user interests and recommend items for online platforms including e-commerce, social networks, and decision systems. However, data bias has become a significant obstacle, severely impacting the accuracy, fairness, and reliability of recommendation results. This survey examines causal inference for optimizing recommendation systems and mitigating data bias, addressing three questions: (1) bias types and their performance impacts; (2) causal inference-based mitigation methods; (3) the advantages and limitations of each approach, and the research opportunities they open. The motivation for this survey stems from the limitations of traditional debiasing methods, which often fail to account for causal relationships and struggle in dynamic, real-world scenarios. Causal inference provides a robust framework for identifying and addressing the underlying causes of bias, enabling more transparent and accurate recommendation systems. We therefore define three critical stages at which bias arises: the data stage, the model selection stage, and the model evaluation stage. For each stage, causal inference-based optimization methods are introduced and critically analyzed. Unlike traditional debiasing methods, this study analyzes data augmentation and regularization techniques as potential strategies for future research. Overall, the survey aims to highlight the ability of causal inference to uncover and control confounding factors, offering deeper insights into the mechanisms driving biases.
Citations: 0
Artificial Intelligence-Based Waste Management: A Review of Classification, Techniques, Issues, and Challenges
Pub Date : 2025-05-19 DOI: 10.1002/widm.70025
Dhanashree Vipul Yevle, Palvinder Singh Mann
Artificial intelligence (AI) is emerging as a transformative force in waste management practices, enabling new ways to improve efficiency and effectiveness. This survey presents methods related to waste management, categorized systematically to clarify the effectiveness of various AI-based techniques. The study undertakes a critical review of relevant research works that epitomize major advances and methodologies of AI-driven waste management. The manuscript provides an exhaustive taxonomy, dividing AI methods into Supervised Learning, Unsupervised Learning, and Reinforcement Learning, and then subdividing Supervised Learning into four broad categories: Machine Learning-based Classification, CNNs, Transfer Learning, and Hybrid or Ensemble Learning. We further evaluate the datasets used for performance benchmarking and the efficacy of the various AI models. We also discuss critical issues such as the quality of available data, poor model generalization, and system integration. Future research directions that could help surmount these challenges are also discussed. This survey aims to present a structured framework for understanding current AI applications in waste management, thereby guiding ongoing and future research in the field.
Citations: 0
Exploring Causal Learning Through Graph Neural Networks: An In-Depth Review
Pub Date : 2025-05-19 DOI: 10.1002/widm.70024
Simi Job, Xiaohui Tao, Taotao Cai, Haoran Xie, Lin Li, Qing Li, Jianming Yong
In machine learning, exploring data correlations to predict outcomes is a fundamental task. Recognizing causal relationships embedded within data is pivotal for a comprehensive understanding of system dynamics, the significance of which is paramount in data-driven decision-making processes. Beyond traditional methods, there has been a shift toward using graph neural networks (GNNs) for causal learning, given their capabilities as universal data approximators. Thus, a thorough review of the advancements in causal learning using GNNs is both relevant and timely. To structure this review, we introduce a novel taxonomy that encompasses various state-of-the-art GNN methods used in studying causality. GNNs are further categorized based on their applications in the causality domain. We further provide an exhaustive compilation of datasets integral to causal learning with GNNs to serve as a resource for practical study. This review also touches upon the application of causal learning across diverse sectors. We conclude the review with insights into potential challenges and promising avenues for future exploration in this rapidly evolving field of machine learning.
Citations: 0
Geospatial Data Clustering in Network Space: A Survey
Pub Date : 2025-05-16 DOI: 10.1002/widm.70023
Loan T. T. Nguyen, Trang T. D. Nguyen, Quang-Thinh Bui, Bay Vo
Geospatial data enhances traditional datasets by integrating spatial and temporal dimensions, facilitating advanced visualizations and comprehensive analytical insights. As a fundamental aspect of geospatial analytics, geospatial data clustering (GDC) has become a prominent area of academic research, playing a critical role in theoretical exploration and applied domains. GDC seeks to group geospatial objects based on inherent similarities, a necessity driven by modern datasets' increasing scale and complexity, particularly those within geographic information systems (GIS). This paper highlights key challenges and advancements in GDC, including spatial data clustering (SDC), clustering techniques within GIS, and algorithms designed for geospatial data clustering in network spaces (GDC in NS). Practical implementations of these methodologies encompass diverse applications such as hotspot analysis, infectious disease monitoring, transportation optimization, urban traffic management, and emergency response planning. These contributions are foundational for advancing scholarly research and addressing domain-specific challenges in this field.
Citations: 0
Unveiling Explainable AI in Healthcare: Current Trends, Challenges, and Future Directions
Pub Date : 2025-05-12 DOI: 10.1002/widm.70018
Abdul Aziz Noor, Awais Manzoor, Muhammad Deedahwar Mazhar Qureshi, M. Atif Qureshi, Wael Rashwan
This overview investigates the evolution and current landscape of eXplainable Artificial Intelligence (XAI) in healthcare, highlighting its implications for researchers, technology developers, and policymakers. Following the PRISMA protocol, we analyzed 89 publications from January 2000 to June 2024, spanning 19 medical domains, with a focus on Neurology and Cancer as the most studied areas. Various data types are reviewed, including tabular data, medical imaging, and clinical text, offering a comprehensive perspective on XAI applications. Key findings identify significant gaps, such as the limited availability of public datasets, suboptimal data preprocessing techniques, insufficient feature selection and engineering, and the limited utilization of multiple XAI methods. Additionally, the lack of standardized XAI evaluation metrics and practical obstacles in integrating XAI systems into clinical workflows are emphasized. We provide actionable recommendations, including the design of explainability‐centric models, the application of diverse and multiple XAI methods, and the fostering of interdisciplinary collaboration. These strategies aim to guide researchers in building robust AI models, assist technology developers in creating intuitive and user‐friendly AI tools, and inform policymakers in establishing effective regulations. Addressing these gaps will promote the development of transparent, reliable, and user‐centred AI systems in healthcare, ultimately improving decision‐making and patient outcomes.
Citations: 0
Weak Supervision: A Survey on Predictive Maintenance
Pub Date : 2025-05-12 DOI: 10.1002/widm.70022
Antonio M. Martínez‐Heredia, Sebastián Ventura
The maintenance advancements achieved in Industry 4.0 generate large amounts of data, necessitating complete, accurate, and precise labels for training datasets to align with corresponding ground truth. These labels serve as annotations for early anomaly detection. Delivering high‐quality annotations derived from weak labels and striking a balance between annotation efforts and accuracy are critical tasks. Consequently, researchers have focused their attention on Weakly Supervised Learning methods, which have shown effectiveness in handling datasets characterized by incomplete, imprecise, and erroneous labels across various maintenance applications. In this survey, the authors aim to address a gap in the existing literature by conducting a comprehensive examination of Weakly Supervised Learning for Predictive Maintenance, categorizing related works. Furthermore, the survey discusses challenges and identifies open research lines.
Citations: 0