Information Processing & Management: Latest Articles

Meta-path Sampling-Enhanced Course Recommendation in Heterogeneous Networks
IF 6.9; Zone 1, Management; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-11-20; DOI: 10.1016/j.ipm.2025.104482
Lei Zhang, Mengxiang Ma, Juntao Zhang, Tao Xu, Daojun Han, Linkun Fan, Yanhua Zhao
With the successful development and application of Massive Open Online Courses (MOOCs), course recommendation has received widespread attention from researchers. However, existing course recommendation methods face three challenges: (1) user-course interactions are sparse; (2) the multiple interaction semantics of user preferences are insufficiently modeled; and (3) constraints on users’ knowledge-blind-zone preferences are lacking. To address these challenges, we propose a novel method called Meta-path Sampling-Enhanced Course Recommendation in Heterogeneous Networks (MSEC-Rec), which improves the accuracy of course recommendations by integrating the multi-interaction semantic information of users. Specifically, we enhance the interactions between users and courses through meta-paths in heterogeneous information networks (HINs) to alleviate the interaction sparsity problem. Then, we design a meta-path sampling strategy to model the semantics of multiple interactions between users and courses. Next, we introduce meta-path negative sampling information in HINs and capture users’ knowledge blindness via a contrastive loss function that optimizes the score differences between positive and negative samples. Finally, we conduct experiments on the MOOCCube and XuetangX datasets and compare MSEC-Rec with multiple baselines. Compared with the SOTA method on the MOOCCube dataset, MSEC-Rec improves the evaluation metrics HR@K and NDCG@K (K = 5, 10, 20) by 0.04%, 3.35%, 5.17%, 2.61%, 4.69%, and 4.2%, respectively, demonstrating its effectiveness. The source code and data are available on GitHub: https://github.com/mmx124/MSEC-Rec.
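The abstract reports results as HR@K and NDCG@K. As a reading aid, here is a minimal Python sketch of how these two metrics are commonly computed. It assumes a leave-one-out protocol with exactly one held-out relevant course per user, which the abstract does not confirm; the function names and the toy course IDs are hypothetical and are not taken from the MSEC-Rec repository.

```python
import math


def hit_ratio_at_k(ranked_items, relevant_item, k):
    """Return 1.0 if the held-out item is in the top-k list, else 0.0."""
    return 1.0 if relevant_item in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, relevant_item, k):
    """NDCG@K for a single relevant item (assumption: one held-out item).

    With one relevant item the ideal DCG is 1, so NDCG reduces to
    1 / log2(rank + 2), where rank is the 0-based position of the hit.
    """
    for rank, item in enumerate(ranked_items[:k]):
        if item == relevant_item:
            return 1.0 / math.log2(rank + 2)
    return 0.0


def evaluate(rankings, held_out, k):
    """Average HR@K and NDCG@K over all users that have a held-out item."""
    users = list(held_out)
    hr = sum(hit_ratio_at_k(rankings[u], held_out[u], k) for u in users) / len(users)
    ndcg = sum(ndcg_at_k(rankings[u], held_out[u], k) for u in users) / len(users)
    return hr, ndcg


# Toy data (hypothetical): ranked course lists and one held-out course per user.
rankings = {"u1": ["c3", "c1", "c2"], "u2": ["c5", "c4", "c6"]}
held_out = {"u1": "c1", "u2": "c9"}
hr, ndcg = evaluate(rankings, held_out, k=2)
```

In this toy case only u1 scores a hit, at the second position, so HR@2 averages to 0.5 and NDCG@2 is half of 1/log2(3). HR@K only records whether the hit occurred, while NDCG@K also rewards placing it higher in the list.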
Information Processing & Management, Vol. 63, No. 3, Article 104482
Citations: 0
Transformer-based large language foundation models for text generation: A comprehensive literature review for different languages and application domains
IF 6.9; Zone 1, Management; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-11-20; DOI: 10.1016/j.ipm.2025.104477
Raphael Souza de Oliveira, Erick Giovani Sperandio Nascimento
Advances in transformer-based architectures have enabled Large Language Models to generate fluent, coherent, diverse, and consistent text, comparable to human-produced content. However, the main efforts in this area remain predominantly focused on English-language models, raising concerns about their inclusivity and applicability across diverse linguistic and domain-specific contexts. Thus, this study presents a comprehensive literature review aimed at addressing the research question: “What types of techniques are applied in the field of natural language processing for text generation?” Unlike earlier surveys, which typically concentrate on specific tasks or techniques, this review examines applied techniques, covered languages, data domains, and the extent to which proposed models qualify as foundation models. A total of 263 peer-reviewed studies published between 2017 and 2025 were identified and analysed. The review identified 20 types of techniques applied to automatic text generation across 26 languages and 27 application domains. Findings reveal a strong prevalence of general-purpose English-language models, whilst research on multilingual and domain-specific models remains limited. Only thirteen models were identified as foundation models, indicating a lack of broader generalisation capabilities. Key gaps include the absence of foundation models for non-English languages and insufficient exploration of specific domains such as healthcare, law, and finance. Additional challenges involve dataset availability, computational costs, and methodological transparency. Emerging trends suggest a growing focus on multilingual and domain-adapted models, transfer learning, and reinforcement learning with human feedback. Future opportunities lie in expanding research on unexplored languages and domains, addressing ethical concerns, improving evaluation metrics, and developing responsible, people-centred AI systems.
Information Processing & Management, Vol. 63, No. 2, Article 104477
Citations: 0
Price-sensitive feature analysis from online reviews using a tiered information extraction framework for product optimization
IF 6.9; Zone 1, Management; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-11-20; DOI: 10.1016/j.ipm.2025.104490
Sijie Fu, Xinran She, Ruiying Wan, Xiutong Xu, Xianqing Xiong
In price-tiered markets, consumers’ expectations for product functionality often shift with pricing; however, existing research lacks frameworks capable of dynamically capturing these priority changes. To address this gap, this study proposes a price-sensitive deep learning framework that integrates online customer reviews (OCRs) with structured feature extraction and weighting strategies to support data-driven product optimization decisions. Using smart standing desks as a representative case, we develop a hybrid framework that combines Latent Dirichlet Allocation (LDA), BERT-BiLSTM-CRF, and a novel Price-Weighted Comment Frequency-Inverse Group Frequency (PWCF-IGF) metric. Experiments conducted on a dataset of 28,003 OCRs collected from 200 products across nine distinct price segments reveal a clear shift in user attention, from service-related aspects at lower prices to aesthetic quality and interactive experience at higher tiers. Beyond descriptive evidence, a Tobit regression analysis provides statistical validation for this trend, demonstrating a significant positive correlation between price and attention to design and function, and a negative correlation with service-related aspects. By explicitly embedding pricing signals into the review mining process, this research contributes to the development of more intelligent and responsive information management systems and facilitates consumer-centered product innovation and optimization.
Information Processing & Management, Vol. 63, No. 3, Article 104490
Citations: 0
Human-like or machine-like? How anthropomorphic framing shapes older adults’ attitudes toward health AI
IF 6.9; Zone 1, Management; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-11-20; DOI: 10.1016/j.ipm.2025.104489
Tingting Jiang, Yanrun Xu
Promoting older adults’ understanding of health AI is crucial for encouraging technology acceptance and healthier choices. Despite the increasing use of anthropomorphic framing, i.e., attributing human-like qualities to non-human entities, to present AI in online health messages, direct evidence of its effectiveness among older adults is still lacking. Drawing on signaling theory, this study proposes that psychological distance and perceived risk serve as dual signaling pathways between anthropomorphic framing and changes in attitudes toward health AI. Results from two controlled experiments, involving a total of 76 senior participants, reveal that human-like framing, as compared to machine-like framing, is more effective in enhancing older adults’ attitudes toward health AI when presented with positive messages. Also, human-like framing helps mitigate the decline in their attitudes when exposed to negative messages. Regarding the mediating pathways, human-like framing is effective in positive messages by reducing psychological distance, while in negative messages, it reduces perceived risk. However, only well-educated older adults’ risk perceptions are influenced by anthropomorphic framing, with less-educated ones unaffected. Furthermore, there is a positive correlation between older adults’ attitudes toward health AI and their intentions to use it. These findings fill a gap in the AI anthropomorphism literature by examining social cues in a communicated text and inform future health communication strategies targeting older adults.
Information Processing & Management, Vol. 63, No. 3, Article 104489
Citations: 0
ExFusion: An Explainable Multi-scale Feature Fusion framework for medical image processing
IF 6.9; Zone 1, Management; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-11-19; DOI: 10.1016/j.ipm.2025.104495
Jianjin Yue, Li Luo, Mengzhuo Guo, Siqi Huang
In medical image processing, the Region of Interest (RoI) delineates the location and boundaries of lesions, helping physicians diagnose and plan prognosis. However, current methods struggle to integrate the high-resolution macroscopic features of the RoI with its low-resolution microscopic features. Additionally, they are usually designed separately for image classification or object detection, lacking the versatility to handle diverse tasks simultaneously. Moreover, these methods, often based on deep learning architectures, are frequently optimized for performance but overlook model explainability, which hinders their broader adoption in real-world healthcare applications. To this end, we propose a new framework, Explainable Multi-scale Feature Fusion (ExFusion), for medical image processing, including an image preprocessing module, an encoder–decoder module, and an explainability module. The image preprocessing module utilizes various techniques to reduce background noise and enhance the features of the RoI. The encoder–decoder module integrates a novel Enhanced Multi-scale Feature Fusion (EFF) block within a U-shaped network architecture. This enables the efficient representation of the RoI’s local and global features, thereby enhancing spatial sensitivity and improving the model’s ability to capture complex information. The final module focuses on increasing the credibility of the framework’s decision-making process by providing post-hoc visualizations that explain how predictions are made. We evaluate the proposed framework through various experiments, demonstrating its superior performance across different tasks compared to existing state-of-the-art models. We invite junior and senior radiologists to evaluate the framework’s outputs. Results show ExFusion exceeds the average performance of radiologists, confirming its potential as an effective tool in real-world healthcare.
Information Processing & Management, Vol. 63, No. 2, Article 104495
Citations: 0
Enhancing large language model for fake news video detection via cross-modal retrieval
IF 6.9; Zone 1, Management; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-11-19; DOI: 10.1016/j.ipm.2025.104471
Linfeng Han, Xiaoming Zhang, Tianbo Wang, Yun Liu, Zhiqiang Dong
Fake news video detection aims to analyze and verify the authenticity of video-based news content using multimodal data, including visual, audio, and textual cues. Fake news videos typically contain subtle, misleading alterations, often limited to specific frames, or feature genuine video coupled with fabricated narratives, complicating authenticity assessment through video content alone. Existing approaches predominantly depend on intrinsic features within the video or utilize external knowledge derived from a single modality, limiting their capability to exploit multi-source cross-modal information effectively and rendering them vulnerable to substantial content noise. To address these issues, we propose an Enhanced Method for Fake Video Detection based on Cross-modal Retrieval utilizing Large Models (FVDLM). We first collect relevant video and text news to augment external knowledge. Given the sparsity of informative features in video content, the information bottleneck theory is employed to denoise irrelevant information. Furthermore, to effectively integrate cross-modal knowledge and enrich external context, we introduce a prompt learning approach utilizing large models to generate contextual knowledge. Three specialized prompts are crafted to assess video authenticity from multiple viewpoints. Comprehensive experiments validate the effectiveness and superiority of our proposed model.
Information Processing & Management, Vol. 63, No. 2, Article 104471
Citations: 0
Multi-scale memory network with separation training for hyperspectral anomaly detection
IF 6.9; Zone 1, Management; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-11-18; DOI: 10.1016/j.ipm.2025.104494
Yu Huo, Youqiang Dong, Chenhao Wang, Min Zhang, Hai Wang
Reconstruction-based hyperspectral anomaly detection (HAD) plays a significant role in remote sensing image interpretation. However, most existing reconstruction-based methods face difficulties in achieving a trade-off between suppressing false alarms and identifying anomalies, thereby limiting their overall detection performance. Towards this end, a multi-scale memory network (MSMNet) is devised to enhance background reconstruction and suppress anomaly reconstruction for HAD. The MSMNet integrates the proposed coarse-to-fine pseudo label generation (CPLG) module, multi-scale memory with separation training (MMST) module, and consistent and discriminative feature learning (CDFL) module into a unified framework to address these challenges. Specifically, the CPLG module considers multi-scale reconstruction errors from coarse-to-fine to generate pseudo sample labels, which are used to guide the training of both the MMST and CDFL modules. The MMST module is developed to suppress the learning of anomaly features while effectively learning and storing pure background features under a multi-scale paradigm by feeding anomaly and background features into the designed separation training branches. In addition, a CDFL module is further developed to enhance background consistency and anomaly discrimination in the feature space through the designed CDFL loss constraint, thereby achieving a better trade-off between suppressing false alarms and identifying anomalies. Experimental results show that our approach achieves AUC(D,F) scores of 99.93%, 99.74%, 99.94%, 99.82%, 99.99%, and 99.60% on six hyperspectral image datasets, outperforming existing state-of-the-art HAD methods by 0.27%, 0.41%, 0.18%, 0.29%, 0.64%, and 0.07%, respectively. Therefore, the comprehensive experiments on these datasets validate the effectiveness and superiority of the proposed method.
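The abstract reports results as AUC(D,F), the area under the ROC curve of detection probability against false alarm rate. As a reading aid, the sketch below shows one standard way to compute such an AUC from per-pixel anomaly scores, using the pairwise (Mann-Whitney) formulation. It assumes that the anomaly score is a per-pixel reconstruction error and that binary ground-truth labels are available; the function name and the toy values are hypothetical and are not taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen anomaly pixel (label 1)
    scores higher than a randomly chosen background pixel (label 0), with
    ties counted as one half. This is O(P * N) and meant for clarity, not speed.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one background pixel")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): reconstruction errors for five pixels, two anomalous.
errors = [0.9, 0.8, 0.3, 0.2, 0.1]
truth = [1, 0, 1, 0, 0]
auc = roc_auc(errors, truth)
```

Here five of the six anomaly/background pairs are ordered correctly, so the AUC is 5/6. A detector that reconstructs background well and anomalies poorly pushes this value toward 1, which matches the trade-off the abstract describes between suppressing false alarms and identifying anomalies.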
Information Processing & Management, vol. 63, no. 2, Article 104494.
Citations: 0
DCGRM-Net: Dual-Channel Guided Reconstruction Mamba Network for robust multimodal sentiment analysis
IF 6.9 | Zone 1 (Management) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-16 | DOI: 10.1016/j.ipm.2025.104491
Siyuan Liu, Hongkun Zhao, Yang Chen, Fanmin Kong, Kang Li
Multimodal sentiment analysis (MSA) often suffers from missing or degraded modalities. Although Transformer-based reconstruction methods can recover missing features, their high computational cost limits scalability. To address these challenges, we propose a lightweight Dual-Channel Guided Reconstruction Mamba Network (DCGRM-Net), which can be embedded into pre-trained language models (PLMs). Leveraging Mamba’s linear-time modeling and MLP-based conditional generation, DCGRM-Net reduces model complexity while maintaining high computational efficiency. It comprises a Bidirectional Interactive Reconstruction Channel (BIRC) for cross-modal missing feature generation and a Text-Guided Modality Refinement Channel (TGMR) for multi-scale and multi-attention refinement of non-verbal modalities. The reconstructed textual features are fused with both channel outputs and fed into the PLM for sentiment prediction. Evaluations on two benchmarks (CMU-MOSI, 2,199 segments; CMU-MOSEI, 23,454 segments) show that DCGRM-Net surpasses state-of-the-art methods by 2.1%/1.1% and 6.63%/5.46%, respectively, under text-missing and text-based bimodal absence, demonstrating strong robustness and transferability. Furthermore, we conducted a preliminary exploration of the integration of Prompt-Learning methods with DCGRM-Net, revealing their potential and efficient scalability in lightweight modal reconstruction. These results indicate that DCGRM-Net provides an effective and lightweight solution for robust MSA in incomplete data scenarios.
Information Processing & Management, vol. 63, no. 2, Article 104491.
Citations: 0
An ensemble method using neighborhood granular combination entropy for software defect prediction
IF 6.9 | Zone 1 (Management) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-15 | DOI: 10.1016/j.ipm.2025.104483
Feng Jiang, Xu Yu, Qiang Hu, Jinhuan Liu, Junwei Du
Ensemble learning (EL) has become a widely used tool for software defect prediction (SDP). However, it remains a challenge to enhance the performance of an EL algorithm through manipulating the feature space. To meet this challenge, this paper proposes an EL algorithm called ENGCE, which uses neighborhood granular combination entropy (NGCE) for constructing ensembles of classifiers. In ENGCE, a new technique (called NGCE-based neighborhood approximate reduct) is utilized to manipulate the feature space, which can overcome the limitations of existing attribute reduction techniques. We also investigate the application of ENGCE in SDP and use a hybrid strategy to handle imbalanced SDP datasets. We compare ENGCE against seven baselines using 20 public datasets. When employing the KNN method to construct ensemble members, ENGCE performs best on 17, 16, and 17 datasets in terms of AUC, F1, and MCC, respectively. When employing the CART method to construct ensemble members, ENGCE performs best on 17, 18, and 17 datasets in terms of AUC, F1, and MCC, respectively. In addition, the results of three statistical tests also demonstrate the effectiveness of ENGCE.
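The abstract reports results in AUC, F1, and MCC. As a reminder of how the two threshold-based metrics are defined, here is a minimal sketch computed from confusion-matrix counts. It is not the authors' evaluation code; the function name `f1_and_mcc` and the convention of returning 0.0 when a denominator vanishes are assumptions made for illustration.

```python
import math


def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) for a binary defect predictor.

    tp/fp/fn/tn are confusion-matrix counts, with "defective" as the
    positive class. Degenerate denominators yield 0.0 (an assumed convention).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc
```

Because MCC accounts for true negatives while F1 does not, the two can disagree on heavily imbalanced defect datasets, which is one reason to report both.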
Information Processing & Management, vol. 63, no. 2, Article 104483.
Citations: 0
A synergistic approach to decoupling and knowledge distillation for long- and short-term knowledge states in knowledge tracing
IF 6.9 | Zone 1 (Management) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2025-11-15 | DOI: 10.1016/j.ipm.2025.104492
Wei Zhang, Xinyao Zeng
Knowledge tracing is a key technology in adaptive learning platforms and plays a vital role in enabling personalized learning. However, existing approaches often suffer from mutual interference when modeling both long- and short-term knowledge states. Moreover, the lack of deep collaborative modeling strategies during state integration often leads to representational weakening rather than effective synergy, thereby limiting the performance of knowledge tracing. To address these challenges, we propose a novel method for decoupling and knowledge distillation for long- and short-term knowledge states in knowledge tracing (DLSKT). Specifically, we design a position-decoupled hierarchical encoder to capture the learner’s long-term knowledge state and develop a variable-length aware attention mechanism with adaptive sequence length to model their short-term knowledge states. To enhance the distinguishability between long- and short-term representations, we introduce a mutual information minimization constraint based on representational consistency, enabling effective decoupling. Furthermore, to establish efficient collaboration on top of decoupling, we propose a dual-layer asymmetric knowledge distillation strategy that facilitates guided knowledge transfer at both the representation and prediction layers. This design encourages the collaborative modeling of long- and short-term knowledge states, thereby enhancing the model’s expressiveness and predictive accuracy. Extensive experiments on multiple datasets demonstrate that DLSKT consistently outperforms existing strong baseline models in terms of ACC, AUC, and RMSE. It can more effectively model the complex relationships between long- and short-term knowledge states, thereby improving the performance of knowledge tracing tasks.
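The abstract does not spell out the dual-layer asymmetric distillation objective, so the sketch below shows only the generic temperature-softened distillation loss (KL divergence between teacher and student output distributions) that prediction-layer distillation commonly builds on. It is a stand-in, not DLSKT's loss; the function names and the default temperature of 2.0 are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Scaled by temperature**2, the usual correction that keeps gradient
    magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

The loss is zero when the student matches the teacher exactly and grows as the softened distributions diverge. It is asymmetric in its arguments, so swapping teacher and student changes the value.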
Information Processing & Management, vol. 63, no. 2, Article 104492.
Citations: 0