Title: Editorial: From Explainable Artificial Intelligence (xAI) to Understandable Artificial Intelligence (uAI)
Hussein Abbass;Keeley Crockett;Jonathan Garibaldi;Alexander Gegov;Uzay Kaymak;Joao Miguel C. Sousa
Pub Date : 2024-09-10; DOI: 10.1109/TAI.2024.3439048
Journal: IEEE Transactions on Artificial Intelligence (Journal Article)
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10673750
Pub Date : 2024-07-12; DOI: 10.1109/tai.2024.3429050
Zihao Li, Pan Gao, Kang You, Chuan Yan, Manoranjan Paul
Previous studies have demonstrated the effectiveness of point-based neural models on point cloud analysis tasks. However, a crucial issue remains in producing efficient input embeddings for raw point coordinates. A further issue lies in the limited efficiency of neighbor aggregation, a critical component of the network stem. In this paper, we propose a Global Attention-guided Dual-domain Feature Learning network (GAD) to address these issues. We first devise the Contextual Position-enhanced Transformer (CPT) module, armed with an improved global attention mechanism, to produce a global-aware input embedding that guides subsequent aggregations. Then, the Dual-domain K-nearest neighbor Feature Fusion (DKFF) module is cascaded to conduct effective feature aggregation through novel dual-domain feature learning that exploits both local geometric relations and long-distance semantic connections. Extensive experiments on multiple point cloud analysis tasks (e.g., classification, part segmentation, and scene semantic segmentation) demonstrate the superior performance of the proposed method and the efficacy of the devised modules.
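As a rough illustration of the dual-domain idea described in this abstract — aggregating features over one neighborhood built in coordinate (geometric) space and another built in feature (semantic) space, then fusing the two — the following NumPy sketch may help. It is not the authors' implementation; the function names, max-pooling, and concatenation-based fusion are assumptions made only for illustration.

```python
import numpy as np

def knn_indices(query, points, k):
    # Pairwise squared distances, then the k nearest points per query row.
    # Note: each point is its own nearest neighbor when query is points.
    d2 = ((query[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

def dual_domain_aggregate(coords, feats, k=4):
    """Aggregate each point's features over two neighborhoods:
    one in coordinate space (local geometry), one in feature space
    (long-distance semantic similarity), then concatenate the results."""
    geo_nn = knn_indices(coords, coords, k)   # geometric neighbors
    sem_nn = knn_indices(feats, feats, k)     # semantic neighbors
    geo_agg = feats[geo_nn].max(axis=1)       # max-pool over geometric neighborhood
    sem_agg = feats[sem_nn].max(axis=1)       # max-pool over semantic neighborhood
    return np.concatenate([geo_agg, sem_agg], axis=-1)

coords = np.random.default_rng(0).random((32, 3))  # 32 points in 3-D
feats = np.random.default_rng(1).random((32, 8))   # 8-D feature per point
fused = dual_domain_aggregate(coords, feats, k=4)  # shape (32, 16)
```

The geometric neighborhood captures local shape, while the feature-space neighborhood can connect semantically similar but spatially distant points, loosely mirroring the "dual-domain" fusion the abstract describes.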
Title: Global Attention-Guided Dual-Domain Point Cloud Feature Learning for Classification and Segmentation
Journal: IEEE Transactions on Artificial Intelligence (Journal Article)
Pub Date : 2024-06-25; DOI: 10.1109/TAI.2024.3387583
Ran Cheng;Hugo Jair Escalante;Wei-Wei Tu;Jan N. Van Rijn;Shuo Wang;Yun Yang
The five papers in this special section address different aspects of automated machine learning (AutoML), from fundamental algorithms to real-world applications. Developing high-performance machine learning models is a difficult task that usually requires expertise from data scientists and knowledge from domain experts. To make machine learning more accessible and to ease the labor-intensive trial-and-error search for the most appropriate machine learning algorithm and the optimal hyperparameter setting, AutoML was developed and has become a rapidly growing area in recent years. AutoML aims at automation and efficiency of the machine learning process across domains and applications. Nowadays, data are commonly collected over time and are susceptible to change, as in Internet-of-Things (IoT) systems, mobile phone applications, and healthcare data analysis. This poses new challenges to traditional AutoML, which assumes data stationarity. Interesting research questions arise around whether, when, and how to deal effectively and efficiently with nonstationary data in AutoML.
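As a minimal illustration of the nonstationarity challenge this editorial raises, a two-window mean-shift check can flag when a data stream's distribution has drifted. This is a generic sketch, not a method from the special section; the function name, window size, and threshold are arbitrary assumptions.

```python
import numpy as np

def mean_shift_drift(stream, window=50, threshold=3.0):
    """Flag possible nonstationarity: compare the mean of the most recent
    window against an early reference window, in units of the reference
    mean's standard error."""
    ref = np.asarray(stream[:window], dtype=float)
    recent = np.asarray(stream[-window:], dtype=float)
    se = ref.std(ddof=1) / np.sqrt(window)  # standard error of the reference mean
    return bool(abs(recent.mean() - ref.mean()) > threshold * se)

# A stream whose level jumps from ~0.5 to 5.0 partway through is flagged;
# a stream that keeps the same alternating pattern is not.
drifting = [0.0, 1.0] * 75 + [5.0] * 50
steady = [0.0, 1.0] * 100
```

A production AutoML system would use a principled drift detector and retrain or re-search when drift fires, but even this toy check shows why a model selected under a stationarity assumption can silently degrade.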
Title: Guest Editorial: AutoML for Nonstationary Data
Journal: IEEE Transactions on Artificial Intelligence (Journal Article)
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10571781
Educational data mining (EDM) offers an effective solution for predicting students' course grades in the next term. Conventional grade prediction methods can be viewed as regressing an expectation of the probability distribution of the student's grade, typically called single-value grade prediction. The reliability of these methods depends on complete input information about the students. However, next-term grade prediction often faces incomplete input information, owing to the inaccessibility of future data and the privacy of data. In this scenario, single-value grade prediction struggles to assess students' academic status, since that status may not be adequately represented or assessed by a single expectation value. This limitation increases the risk of misjudgment and may lead to errors in educational decision-making. Given the difficulty of collecting complete input information, we shift from traditional single-value prediction to forecasting the explicit probability distribution of the course grade. The probability distribution can assess a student's academic status by providing probabilities for all possible grade values rather than relying solely on an expectation value, which provides a foundation to support educators' decision-making. In this article, the course grade distribution prediction (CGDP) model is proposed, aiming to estimate an explicit conditional probability distribution of course grades in the next term. The model can identify at-risk students, offering comprehensive decision-making information for educators and students. To ensure precise distribution predictions, a calibration method is also employed to improve the alignment between predicted and actual probabilities. Experimental results on real university data verify the effectiveness of the proposed model for early grade warning of undergraduates.
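The shift from single-value to distributional prediction can be illustrated with a toy example: logits over hypothetical grade bins turned into an explicit distribution, with temperature scaling shown as one simple stand-in for the calibration step. The CGDP model's actual architecture and calibration method are not specified here; all names and values below are illustrative assumptions.

```python
import numpy as np

def softmax(z, temperature=1.0):
    """Turn logits into an explicit probability distribution; temperature > 1
    softens overconfident predictions (a simple form of calibration)."""
    z = np.asarray(z, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical logits over five grade bins (F, D, C, B, A) for one student.
logits = np.array([0.1, 0.4, 1.2, 2.0, 0.8])

point_pred = int(logits.argmax())              # single-value prediction: one bin
dist_pred = softmax(logits)                    # explicit distribution over all bins
calibrated = softmax(logits, temperature=1.5)  # softened (flatter) distribution

# Distributional output supports richer decisions than a point estimate,
# e.g., probability mass on the failing/poor bins flags an at-risk student.
at_risk = dist_pred[:2].sum()
```

The point prediction alone says only "most likely B"; the distribution additionally quantifies how much probability sits on failing grades, which is the kind of decision-support signal the abstract argues for.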
Title: A Novel Grades Prediction Method for Undergraduate Students by Learning Explicit Conditional Distribution
Na Zhang;Ming Liu;Lin Wang;Shuangrong Liu;Runyuan Sun;Bo Yang;Shenghui Zhu;Chengdong Li;Cheng Yang;Yuhu Cheng
Pub Date : 2024-06-18; DOI: 10.1109/TAI.2024.3416077
Journal: IEEE Transactions on Artificial Intelligence (Journal Article)
Pancreatic cancer is a highly fatal cancer type. Patients are typically at an advanced stage at first diagnosis, mainly due to the absence of distinctive early-stage symptoms and the lack of effective early diagnostic methods. In this work, we propose an automated method for pancreatic cancer diagnosis using noncontrast computed tomography (CT), taking advantage of its widespread availability in the clinic. Currently, a primary challenge limiting the clinical value of intelligent systems is low generalization, i.e., the difficulty of achieving stable performance across datasets from different medical sources. To address this challenge, a novel causality-informed graph intervention model is developed, based on a multi-instance-learning framework integrated with a graph neural network (GNN) for the extraction of local discriminative features. Within this model, we develop a graph causal intervention scheme with three levels of intervention: on graph nodes, structures, and representations. This scheme systematically suppresses noncausal factors and thus leads to generalizable predictions. Specifically, first, a target-node perturbation strategy is designed to capture target-region features. Second, a causal-structure separation module is developed to automatically identify causal graph structures and thereby obtain stable representations of whole target regions. Third, a graph-level feature-consistency mechanism is proposed to extract invariant features. Comprehensive experiments on large-scale datasets validated the promising early-diagnosis performance of our proposed model. The model's generalizability was confirmed on three independent datasets, where classification accuracy reached 86.3%, 80.4%, and 82.2%, respectively. Overall, we provide a valuable potential tool for pancreatic cancer screening and early diagnosis.
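The multi-instance-learning backbone mentioned in this abstract can be illustrated with a generic attention-based pooling over instance embeddings (e.g., CT patches treated as instances in a bag). This is a common MIL construction, not the authors' causality-informed model; the function name and weight vector are illustrative assumptions.

```python
import numpy as np

def mil_attention_pool(instance_feats, w):
    """Attention-based multi-instance pooling: score each instance (e.g., a
    CT patch), softmax the scores into attention weights, and pool to a
    single bag-level representation for image-level prediction."""
    scores = instance_feats @ w              # one relevance score per instance
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()              # attention weights sum to 1
    bag = (alpha[:, None] * instance_feats).sum(axis=0)
    return bag, alpha

patches = np.random.default_rng(0).random((12, 16))  # 12 instance embeddings
w = np.random.default_rng(1).random(16)              # scoring vector (assumed)
bag, alpha = mil_attention_pool(patches, w)
```

In a MIL diagnosis setting the bag vector would feed a classifier, and the attention weights offer a crude localization signal: which patches most influenced the image-level call.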
Title: A Causality-Informed Graph Intervention Model for Pancreatic Cancer Early Diagnosis
Xinyue Li;Rui Guo;Hongzhang Zhu;Tao Chen;Xiaohua Qian
Pub Date : 2024-04-30; DOI: 10.1109/TAI.2024.3395586
Journal: IEEE Transactions on Artificial Intelligence (Journal Article)
Pub Date : 2024-04-30; DOI: 10.1109/TAI.2024.3395574
Yufei Liu;Jia Wu;Jie Cao
Social behavior prediction on social media is attracting significant attention from researchers. Social e-commerce focuses on engagement marketing, which emphasizes social behavior because it effectively increases brand recognition. Existing work on social behavior prediction suffers from two main problems: 1) it assumes that social influence probabilities can be learned independently of each other, and its calculations do not include influence probability estimates based on friends' behavior; and 2) negative sampling of subgraphs is usually ignored in social behavior prediction. To the best of our knowledge, introducing graph contrastive learning (GCL) to social behavior prediction is novel and interesting. In this article, we propose SBP-GCA, a framework for social behavior prediction via graph contrastive learning with attention. First, two methods are designed to extract subgraphs from the original graph, and their structural features are learned by GCL. Then, the framework models how a user's behavior is influenced by neighbors and learns influence features via graph attention networks (GATs). Finally, it combines structural, influence, and intrinsic features to predict social behavior. Extensive and systematic experiments on three datasets validate the superiority of the proposed SBP-GCA.
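The GAT step described here — learning how strongly each neighbor influences a user — can be sketched as a single-head graph-attention layer. This is a textbook-style simplification, not the SBP-GCA code; it assumes every node has a self-loop, and all parameter names are illustrative.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(H, A, W, a):
    """Single-head graph-attention aggregation: each node's new representation
    is an attention-weighted sum of its neighbors' projected features.
    Assumes every node has at least one edge (e.g., a self-loop)."""
    Z = H @ W                          # project node features
    n = Z.shape[0]
    logits = np.full((n, n), -np.inf)  # -inf masks non-edges in the softmax
    for i in range(n):
        for j in range(n):
            if A[i, j]:
                logits[i, j] = leaky_relu(np.concatenate([Z[i], Z[j]]) @ a)
    alpha = np.exp(logits - logits.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # rows sum to 1
    return alpha @ Z

H = np.ones((3, 5))                                  # 3 users, 5-D features
A = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])      # adjacency with self-loops
W = np.full((5, 4), 0.1)                             # projection (assumed)
a = np.full(8, 0.1)                                  # attention vector (assumed)
out = gat_layer(H, A, W, a)
```

The attention weights play the role of learned, non-independent influence probabilities: how much each friend's behavior contributes to a user's representation.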
Title: SBP-GCA: Social Behavior Prediction via Graph Contrastive Learning With Attention
Journal: IEEE Transactions on Artificial Intelligence (Journal Article)