Insufficient use of unlabeled data often leads to inaccurate medical image segmentation, and noise in pseudo-labels can further destabilize training. In this paper, we propose a semi-supervised model based on SAM2 combined with a bidirectional copy-paste mean teacher model (SemiBCP-SAM2). Specifically, we use a student model to generate segmentation results, which are then used as input prompts for SAM2 to generate additional pseudo-labels, providing auxiliary supervision to guide student learning. We also introduce a Masked Prompt (MP) mechanism that reduces prompt confidence to better handle uncertainty and noise, improving performance in scenarios with complex or incomplete information. Another major contribution is the model's transferability: by replacing the baseline network in the student-teacher framework, it can enhance the performance of other semi-supervised segmentation networks at low cost. We conduct comparative experiments and performance evaluations of SemiBCP-SAM2 on the ACDC (100 MRI scans) and PROMISE12 (50 MRI scans) datasets. On ACDC, with 5% and 10% labeled data, SemiBCP-SAM2 improves Dice by 0.29% and 1.16%, and Jaccard by 0.39% and 1.84%. On PROMISE12, with 5% and 20% labeled data, it improves Dice by 1.61% and 2.03%, and Jaccard by 1.99% and 2.79%. Source code is released at https://github.com/ydlam/SemiBCP-SAM2.
Title: SemiBCP-SAM2: Semi-supervised model via enhanced bidirectional copy-paste based on SAM2 for medical image segmentation
Authors: Guangqi Yang, Xiaoxin Guo, Haoran Zhang, Zhenyuan Zheng, Hongliang Dong, Songbai Xu
Information Processing & Management, 63(4), Article 104576. Pub Date: 2025-12-31. DOI: 10.1016/j.ipm.2025.104576
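The auxiliary supervision loop described above, where the student's prediction prompts SAM2 and SAM2's output in turn supervises the student, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 0.5 loss weight, the prompt-dropout form of the Masked Prompt, and the `sam2_pseudo` input standing in for a real SAM2 call are all illustrative assumptions.

```python
import random

def soft_dice_loss(pred, target, eps=1e-6):
    """1 - soft Dice over flattened probability maps."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def masked_prompt(points, drop_ratio=0.3, rng=random):
    """Masked Prompt (MP) idea, sketched as random prompt dropout: feed SAM2
    lower-confidence, incomplete prompts so it must handle the uncertainty.
    The dropout form is an assumption for illustration."""
    return [p for p in points if rng.random() >= drop_ratio]

def semi_supervised_loss(student_pred, label, sam2_pseudo, weight=0.5):
    """Supervised Dice on labeled data plus a weighted Dice term against the
    SAM2 pseudo-label, the auxiliary supervision guiding the student."""
    return (soft_dice_loss(student_pred, label)
            + weight * soft_dice_loss(student_pred, sam2_pseudo))
```

In a real pipeline the prompt points would be sampled from the student's predicted mask and passed through `masked_prompt` before querying SAM2.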
Pub Date: 2025-12-31. DOI: 10.1016/j.ipm.2025.104574
Xianglin Zhao, Yucheng Jin, Annie Yan Wang, Ming Zhang
Wearable devices provide rich quantitative data for self-reflection on physical activity. However, users often struggle to derive meaningful insights from these data, highlighting the need for enhanced support. To investigate whether Large Language Models (LLMs) can facilitate this process, we propose and evaluate a human-LLM collaborative reflective journaling paradigm. We developed PaceMind, an LLM-mediated journaling system that implements this paradigm based on a three-stage reflection framework. It can generate data-driven drafts and personalized questions to guide users in integrating exercise data with personal insights. A two-week within-subjects study (N = 21) compared the LLM-mediated system with a template-based journaling baseline. The LLM-mediated design significantly improved the perceived effectiveness of reflection support and increased users’ intention to use the system. However, perceived ease of use did not improve significantly. Users appreciated the LLM’s scaffolding for easing data sense-making, but also reported added cognitive work in verifying and personalizing the LLM-generated content. Although objective activity levels did not change significantly, the LLM-mediated condition showed a trend toward more adaptive exercise planning and sustained engagement. Our findings provide empirical evidence for a human-LLM collaborative reflection paradigm in a data-intensive exercise context. They highlight the potential to deepen user reflection and underscore the critical design challenge of balancing automation with meaningful cognitive engagement and user control.
Title: From tracking to thinking: Facilitating post-exercise reflection by a large language model-mediated journaling system
Information Processing & Management, 63(4), Article 104574
Pub Date: 2025-12-31. DOI: 10.1016/j.ipm.2025.104575
Haibo Zhang, Zhenyu Liu, Yang Wu, Jiaqian Yuan, Gang Li, Zhijie Ding, Bin Hu
Text-based automated depression detection is an active research topic. However, current research has not adequately explored the key verbal behaviors in depression detection scenarios, resulting in poor model generalization. To address this issue, we propose a depression detection method based on emotional pattern discrepancies, as such discrepancies are one of the fundamental features of depression as an affective disorder. Specifically, we propose an Emotional Pattern Discrepancy Aware Depression Detection Model (EPDAD). The EPDAD employs specially designed modules and loss functions to train the model. This approach enables the model to dynamically and comprehensively perceive the different emotional patterns reflected by depressed and healthy individuals in response to various emotional stimuli. As a result, it enhances the model’s ability to learn the essential features of depression. We evaluate the generalization performance of our model from a cross-dataset and cross-topic perspective using the MODMA (52 samples) and MIDD (520 samples) datasets. In cross-topic generalization experiments, our method improves F1 score by 10.39% and 1.77% on MODMA and MIDD, respectively, in comparison to the state-of-the-art method. In cross-dataset generalization experiments, our method improves the F1 score by a maximum of 6.37%. We also compare our model with large language models, and the results indicate it is more effective for depression detection tasks. Our research contributes to the practical application of depression detection models. Our code is available at: https://github.com/hbZhzzz/EPDAD.
Title: A text-based emotional pattern discrepancy aware model for enhanced generalization in depression detection
Information Processing & Management, 63(4), Article 104575
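The core intuition above, that depressed and healthy individuals show different emotional response patterns to the same stimuli, can be made concrete with a toy discrepancy score. This is a hedged illustration of the idea only; the function name, the dict representation, and the mean-absolute-difference measure are assumptions, not EPDAD's actual modules or losses.

```python
def pattern_discrepancy(responses, reference):
    """Mean absolute difference between matched per-stimulus emotion
    intensities. `responses` / `reference` are dicts mapping a stimulus id
    to an emotion-intensity score; a larger value means the subject's
    emotional pattern deviates more from the reference pattern."""
    shared = responses.keys() & reference.keys()
    if not shared:
        raise ValueError("no shared stimuli to compare")
    return sum(abs(responses[s] - reference[s]) for s in shared) / len(shared)
```

For example, a flattened response to positive stimuli (a common depressive marker) yields a higher discrepancy against a healthy reference pattern than the reference does against itself.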
Pub Date: 2025-12-31. DOI: 10.1016/j.ipm.2025.104571
Xinyu Su, Shihao Wang, Wei Huang, Zheng Li, Hongmei Chen, Zhong Yuan
Unsupervised outlier detection is a critical task in data mining. Two prominent paradigms, fuzzy information granulation and representation learning, have shown promise but face fundamental, opposing limitations. Fuzzy information granulation-based methods excel at modeling data uncertainty but struggle with the curse of dimensionality and noise in high-dimensional spaces. Conversely, representation learning-based methods effectively handle high-dimensional data but often neglect the uncertainty information inherent in data, such as fuzziness. To address these limitations, we propose Latent Representation-based Outlier Detection with fuzzy granule (LROD). In LROD, we utilize representation learning to address the challenges encountered by fuzzy information granulation-based methods in high-dimensional data by deriving a compact and effective representation from the original feature space. The reconstruction error of each sample serves as the first component of the outlier score. This error, derived from representation learning, effectively captures global structural abnormal information in the data. Subsequently, we introduce fuzzy information granulation on this new representation to address data uncertainty. The second component is formed by aggregating abnormal information from fuzzy information granules, which are induced by various attribute subsets. Finally, these two components are fused to produce the final outlier score. Experimental results demonstrate that LROD outperforms 20 competing methods across 15 datasets, achieving improvements of 4.5%, 10.5%, and 3.1% in AUC, AP, and G-mean metrics, respectively, compared to the second-best method, validating its superior effectiveness. This study demonstrates the significant benefits of a hybrid method, providing a new framework for fusing global structural information with local uncertainty measures to achieve state-of-the-art performance in outlier detection. 
The code is publicly available at https://github.com/Mxeron/LROD.
Title: Outlier detector fusing latent representation and fuzzy granule
Information Processing & Management, 63(4), Article 104571
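The two-component fusion described above, a reconstruction-error term capturing global structure plus a fuzzy-granule term capturing local uncertainty, can be sketched in miniature. This is an illustrative sketch under stated assumptions, not LROD itself: the linear similarity kernel, the min-max normalization, and the equal weighting of the two components are all choices made for clarity, and `recon_errors` stands in for errors produced by a trained representation-learning model.

```python
def fuzzy_similarity(a, b, delta=1.0):
    """Scalar fuzzy similarity: 1 at a == b, decaying linearly to 0."""
    return max(0.0, 1.0 - abs(a - b) / delta)

def granule_score(x, data, delta=1.0):
    """Outlier-ness from the fuzzy granule around x: a small average
    similarity to the rest of the data means x is more anomalous."""
    sims = [fuzzy_similarity(x, y, delta) for y in data]
    return 1.0 - sum(sims) / len(sims)

def fused_outlier_scores(data, recon_errors, delta=1.0):
    """Fuse normalized reconstruction errors with fuzzy-granule scores."""
    granule = [granule_score(x, data, delta) for x in data]
    def norm(values):  # min-max normalize so the components are comparable
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo + 1e-12) for v in values]
    return [r + g for r, g in zip(norm(recon_errors), norm(granule))]
```

A point that is both poorly reconstructed and far from every fuzzy granule receives the highest fused score.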
Pub Date: 2025-12-31. DOI: 10.1016/j.ipm.2025.104584
Yan Hai, Jing Wang, Zhizhong Liu, Lingqiang Meng, Ling Shang, Quan Z. Sheng
Next Point-of-Interest (POI) recommendation for random groups is challenging due to the instability of member relationships and the dynamic evolution of member preferences. To address these issues, this work proposes a novel Next POI Recommendation for Random Groups based on a Spatio-Temporal Heterogeneous Graph (NPRRG-STHG) model. Specifically, NPRRG-STHG constructs a spatio-temporal heterogeneous graph and uses HNode2Vec to learn members’ multidimensional preferences. Next, NPRRG-STHG balances preference differences among group members and generates a fitted representation of the random group. Meanwhile, NPRRG-STHG learns comprehensive POI representations from spatio-temporal enhanced POI interaction graphs and POI transfer graphs using Edge-Enhanced Bipartite Graph Neural Network (EBGNN) and Spatio-Temporal Graph Convolutional Network (STGCN) models, respectively. Finally, NPRRG-STHG recommends the next POI that best matches the random group’s overall preferences. We validated NPRRG-STHG on three public benchmark datasets (Foursquare, Gowalla, and Yelp) with 124,933 to 860,888 check-in records. Compared to advanced baselines, NPRRG-STHG achieved average improvements of about 21.4% in Precision@K and 36.7% in NDCG@K. Ablation studies further verify the effectiveness of each component. These results demonstrate that NPRRG-STHG provides an effective solution for next POI recommendations in random groups.
Title: Next POI recommendation for random group based on Spatio-Temporal heterogeneous graph
Information Processing & Management, 63(4), Article 104584
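The final matching step above, fusing member preferences into a group representation and ranking candidate POIs against it, can be sketched as follows. This is a deliberately simplified illustration: mean pooling and a dot-product score are assumptions standing in for the paper's learned fitted representation, which balances member differences rather than averaging them.

```python
def group_representation(member_vecs):
    """Fuse member preference vectors by mean pooling (illustrative stand-in
    for the learned fitted group representation)."""
    n = len(member_vecs)
    dim = len(member_vecs[0])
    return [sum(v[i] for v in member_vecs) / n for i in range(dim)]

def rank_pois(group_vec, poi_vecs):
    """Return POI ids sorted by dot-product score against the group vector,
    best match first."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sorted(poi_vecs, key=lambda p: dot(group_vec, poi_vecs[p]), reverse=True)
```

In the full model, the member and POI vectors here would come from HNode2Vec and the EBGNN/STGCN encoders, respectively.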
Pub Date: 2025-12-31. DOI: 10.1016/j.ipm.2025.104603
Haval I. Hussein, Masoud M. Hassan
Facial attribute recognition (FAR) has garnered significant attention due to its wide-ranging applications in biometrics and security. Traditional FAR methods typically learn shared feature representations across all attributes; however, they often fail to capture the unique characteristics necessary for each attribute, thereby limiting performance. Moreover, these methods frequently neglect uncertainty quantification, which is crucial for enhancing model reliability. To address these issues, we propose a novel FAR model that integrates global and label-specific feature learning with uncertainty quantification. The proposed model utilizes EfficientNetV2B0 as the backbone architecture and introduces two specialized heads: one for refining global features through shared attention and another for learning label-specific attention. These heads were trained jointly, and their predicted probabilities were averaged during inference to improve performance. Experiments conducted on the CelebA and LFWA datasets demonstrated that the proposed model outperformed both baseline and state-of-the-art models, achieving average accuracies of 92.11% and 87.46%, respectively. Moreover, the inclusion of uncertainty quantification provided valuable insights into model confidence, which was accompanied by measurable performance improvements, with average accuracy gains of 0.01% on CelebA and 0.05% on LFWA. Despite the improvement in accuracy, the model maintained a computational efficiency of 3.1 GFLOPs and a parameter count of 24.57 million. Additionally, visualization results using Grad-CAM showed that the attention modules accurately focused on relevant facial regions, thereby validating the interpretability of the model. These results highlight the potential of our approach for accurate, efficient, and interpretable FAR in real-world applications.
Title: Uncertainty-aware facial attribute recognition through joint learning of shared and label-specific attention
Information Processing & Management, 63(4), Article 104603
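The inference-time fusion described above, averaging the two heads' per-attribute probabilities, can be sketched together with a simple uncertainty proxy. Note the hedge: binary predictive entropy is used here only as an illustrative uncertainty measure; the paper's actual quantification method may differ.

```python
import math

def fuse_heads(p_shared, p_specific):
    """Average the shared-attention and label-specific heads' predicted
    per-attribute probabilities, as done at inference."""
    return [(a + b) / 2.0 for a, b in zip(p_shared, p_specific)]

def binary_entropy(p, eps=1e-12):
    """Uncertainty proxy for one attribute's fused probability; maximal at
    p = 0.5, near zero for confident predictions."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
```

An attribute whose fused probability sits near 0.5 would thus be flagged as high-uncertainty, which is where reliability-aware post-processing pays off.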
Pub Date: 2025-12-30. DOI: 10.1016/j.ipm.2025.104592
Xiaohu Zheng, Siqi Du, Zhifeng Liu, Jian Wang, Shunli Hou, Ke Wang
Yarn quality is a critical control object in the textile industry. Accurate quality prediction not only reduces costs but also provides data support for process optimization. However, the high-dimensional and noisy data in the yarn production process cause poor model predictive capability, and existing methods struggle to capture the complex interrelationships between processes. To address this, a dual-channel quality prediction method based on graph embedding is proposed. Specifically, a process-oriented heterogeneous network is constructed to represent the production process nodes and their collaborative relationships in the form of a directed heterogeneous graph. Based on this graph structure, a dynamically adjustable embedding module is designed to generate node embeddings with good interpretability for the process flow. A dual-channel quality prediction architecture is then designed for quality prediction in scenarios with noisy data. The proposed method is experimentally validated on cotton production data from an enterprise with 608 data samples. The results show that the proposed method demonstrates the best overall performance in predicting different yarn quality indicators. The mean square error, mean absolute error and root mean square error are reduced by 45.3%, 34.5% and 25.2% on average, respectively. This provides a new modeling reference for quality prediction in manufacturing scenarios with clear process flows.
Title: Graph embedding-based dual-channel quality prediction method for yarn manufacturing system
Information Processing & Management, 63(4), Article 104592
Pub Date: 2025-12-30. DOI: 10.1016/j.ipm.2025.104577
Na Tian, Qihang Jia, Wenna Liu, Xiangfu Ding, Wencang Zhao
Scene graph generation is a crucial task in visual scene understanding and reasoning, but its performance is often limited by the long-tail distribution of relationships. The high similarity of predicate semantics makes tail predicates easily covered by high-frequency head predicates, leading to semantic confusion and a decline in rare relationship recognition performance. To address this, we propose the Cognitive Alignment Network, which draws inspiration from the cognitive psychology mechanism of sensation and perception to alleviate predicate semantic confusion. It explicitly models the reasoning process by separating coarse-grained sensory capture from fine-grained perceptual reasoning. The sensory sensitive module enhances the extraction of object features from visual stimuli, while the perceptual reinforcement module improves the relationship instances between similar predicates to ensure fine-grained semantic distinctions without altering the underlying meaning. Experiments show that the mR@K of CANet outperforms the current state-of-the-art method by 2.6% on Visual Genome and its Score_wtd improves by 1.4% on Open Images V6. Visualization results also validate that incorporating cognitive mechanisms into scene graphs can effectively mitigate the long-tail problem and enhance the model’s generalization and reasoning capabilities in real-world scenes.
{"title":"Cognitive alignment network: Integrating sensory-perceptual cues for predicate similarity discrimination","authors":"Na Tian, Qihang Jia, Wenna Liu, Xiangfu Ding, Wencang Zhao","doi":"10.1016/j.ipm.2025.104577","journal":{"name":"Information Processing & Management","volume":"63 4","pages":"Article 104577"},"PeriodicalIF":6.9,"publicationDate":"2025-12-30"}
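The coarse-to-fine separation CANet's abstract describes (a sensory stage that commits to a coarse semantic group, then a perceptual stage that discriminates among similar predicates inside that group) can be illustrated with a minimal sketch. The predicate groups, weight shapes, and the `two_stage_predict` helper below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical grouping of semantically similar predicates: the coarse
# ("sensory") stage picks a group, the fine ("perceptual") stage then
# discriminates between easily confused members within it.
GROUPS = {
    "spatial": ["on", "above", "under"],
    "contact": ["holding", "carrying", "wearing"],
}

def two_stage_predict(feat, Wg, Wf):
    """Coarse-to-fine predicate prediction sketch.

    feat: (F,) relation feature vector
    Wg:   (F, n_groups) coarse-group classifier weights
    Wf:   dict mapping group name -> (F, n_members) fine classifier weights
    """
    names = list(GROUPS)
    g_probs = softmax(feat @ Wg)          # sensory stage: coarse group
    g = names[int(np.argmax(g_probs))]
    f_probs = softmax(feat @ Wf[g])       # perceptual stage: fine predicate
    return GROUPS[g][int(np.argmax(f_probs))]
```

Restricting the fine classifier to one group at a time is one simple way to keep head predicates from dominating the logits of rare, semantically close tail predicates.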
Pub Date: 2025-12-30 | DOI: 10.1016/j.ipm.2025.104598
Huilin Liu, Xiaolong Hu, Yu Jiang, Tianyue Wan, Wanqi Ma
Accurate detection of traffic accidents is challenging, and existing detection methods struggle to remain robust across diverse conditions. To address this, we propose a structured and causality-guided spatiotemporal diffusion network (SCTNet) for unsupervised traffic accident detection. The SCTNet framework integrates dual-phase patch sampling (DPPS) to mitigate sampling bias between the training and testing phases. Spatiotemporal causal graph fusion (STCGF) captures the causal dependencies among interacting agents, and a structured spatiotemporal noise (SSTN) mechanism enhances temporal sensitivity and context consistency. The diffusion-based dual-stream design fuses visual and motion information for robust spatiotemporal representation learning. Experiments on two traffic datasets show that SCTNet achieves higher detection accuracy and stronger cross-domain generalization than existing methods. More generally, our study contributes to data-driven decision making and to research on intelligent information systems in complex, dynamic transportation environments. The source code is available at https://github.com/Jasoncode0115/SCTNet.
{"title":"SCTNet: Structured and causality-guided spatiotemporal diffusion network for unsupervised traffic accident detection","authors":"Huilin Liu, Xiaolong Hu, Yu Jiang, Tianyue Wan, Wanqi Ma","doi":"10.1016/j.ipm.2025.104598","journal":{"name":"Information Processing & Management","volume":"63 4","pages":"Article 104598"},"PeriodicalIF":6.9,"publicationDate":"2025-12-30"}
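The structured spatiotemporal noise idea in SCTNet's abstract, corrupting a video clip with noise that stays correlated across frames rather than being drawn i.i.d. per frame, can be sketched minimally. The AR(1) correlation scheme, the `rho` parameter, and the `forward_diffuse` helper below are illustrative assumptions about one plausible realization, not the paper's exact formulation:

```python
import numpy as np

def structured_st_noise(T, H, W, rho=0.9, rng=None):
    """Temporally correlated Gaussian noise for a T-frame clip.

    Consecutive frames share structure (AR(1) correlation rho), so the
    corrupted sequence stays temporally consistent; the scaling factor
    sqrt(1 - rho^2) keeps each frame's marginal variance at 1.
    """
    rng = np.random.default_rng(rng)
    eps = np.empty((T, H, W))
    eps[0] = rng.standard_normal((H, W))
    for t in range(1, T):
        eps[t] = rho * eps[t - 1] + np.sqrt(1 - rho ** 2) * rng.standard_normal((H, W))
    return eps

def forward_diffuse(x0, eps, alpha_bar):
    """Standard DDPM-style forward corruption q(x_t | x_0), with the
    i.i.d. noise replaced by the supplied structured noise."""
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1 - alpha_bar) * eps
```

Because each frame keeps unit marginal variance, this noise can be dropped into an ordinary diffusion forward process without changing the noise schedule.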
Pub Date: 2025-12-30 | DOI: 10.1016/j.ipm.2025.104581
Hao Liu, Dong Li, Bing Zeng, Haopeng Ren
Effective multi-hop reasoning over knowledge graphs is critical for knowledge graph completion, yet prior methods often struggle to model relation dependencies and capture neighborhood context interactions, thereby limiting path interpretability and predictive performance. To adequately model the interaction between neighborhood information and context, we introduce a graph attention convolutional (GAC) mechanism that aggregates and updates node information within the local first-order neighborhood. We then employ attention mechanisms to generate entity and relation reasoning contexts and construct GAC-based policy networks to reinforce interaction between these contexts and their corresponding neighborhoods. Extensive experiments on five knowledge graphs demonstrate the effectiveness of our method, which achieves notable improvements on FB15K-237, including a 7.6% relative improvement in Hits@1, a 14.6% increase in MRR, and a 6.9% enhancement in path interpretability.
{"title":"Graph attention convolutional networks for interpretable multi-hop knowledge graph reasoning","authors":"Hao Liu, Dong Li, Bing Zeng, Haopeng Ren","doi":"10.1016/j.ipm.2025.104581","journal":{"name":"Information Processing & Management","volume":"63 4","pages":"Article 104581"},"PeriodicalIF":6.9,"publicationDate":"2025-12-30"}
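The core GAC operation described in the abstract, attention-weighted aggregation of node features over the local first-order neighborhood, can be sketched in the style of a standard GAT layer. The weight shapes, the concatenation-based attention score, and the tanh nonlinearity below are illustrative assumptions, not the paper's exact layer:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gac_layer(H, adj, W, a):
    """One graph-attention aggregation step over first-order neighborhoods.

    H:   (N, F)  node features
    adj: (N, N)  binary adjacency (1 = edge; self-loops included)
    W:   (F, Fp) shared linear transform
    a:   (2*Fp,) attention scoring vector
    """
    Z = H @ W                                  # transform node features
    out = np.zeros_like(Z)
    for i in range(H.shape[0]):
        nbrs = np.flatnonzero(adj[i])          # first-order neighborhood of i
        # score each neighbor from the concatenated (center, neighbor) pair
        scores = np.array([a @ np.concatenate([Z[i], Z[j]]) for j in nbrs])
        alpha = softmax(scores)                # attention over the neighborhood
        out[i] = alpha @ Z[nbrs]               # weighted aggregation / update
    return np.tanh(out)
```

With a zero attention vector the layer degenerates to uniform neighborhood averaging, which is a convenient sanity check on the aggregation step.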