Pub Date: 2024-10-23 | DOI: 10.1016/j.ipm.2024.103931
Haijun He, Bobo Li, Yiyun Xiong, Li Zheng, Kang He, Fei Li, Donghong Ji
Personality Recognition in Conversations (PRC) is a task of significant interest and practical value. Existing studies on the PRC task utilize conversations inadequately and neglect affective information. Because the way these studies process information is not yet close enough to the concept of personality, we propose the SAH-GCN model for the PRC task. The model first processes the original conversation input to extract the central speaker feature. Leveraging contrastive learning, it continuously adjusts the embedding of each utterance by incorporating affective information to cope with semantic similarity among utterances. Subsequently, the model employs Graph Convolutional Networks to simulate the conversation dynamics, ensuring comprehensive interaction between the central speaker feature and other relevant features. Lastly, it heuristically fuses the central speaker features from multiple conversations involving the same speaker into one comprehensive feature, facilitating personality recognition. We conduct experiments on the recently released CPED dataset, a personality dataset encompassing affection labels and conversation details. Our results demonstrate that SAH-GCN achieves superior accuracy (+1.88%) compared to prior works on the PRC task. Further analysis verifies the efficacy of our scheme of fusing multiple conversations and incorporating affective information for personality recognition.
{"title":"Heuristic personality recognition based on fusing multiple conversations and utterance-level affection","authors":"Haijun He, Bobo Li, Yiyun Xiong, Li Zheng, Kang He, Fei Li, Donghong Ji","doi":"10.1016/j.ipm.2024.103931","DOIUrl":"10.1016/j.ipm.2024.103931","url":null,"abstract":"<div><div><strong>P</strong>ersonality <strong>R</strong>ecognition in <strong>C</strong>onversations (<strong>PRC</strong>) is a task of significant interest and practical value. Existing studies on the PRC task utilize conversation inadequately and neglect affective information. Considering the way of information processing of these studies is not yet close enough to the concept of personality, we propose the SAH-GCN model for the PRC task in this study. This model initially processes the original conversation input to extract the central speaker feature. Leveraging Contrastive Learning, it continuously adjusts the embedding of each utterance by incorporating affective information to cope with the semantic similarity. Subsequently, the model employs Graph Convolutional Networks to simulate the conversation dynamics, ensuring comprehensive interaction between the central speaker feature and other relevant features. Lastly, it heuristically fuses central speaker features from multiple conversations involving the same speaker into one comprehensive feature, facilitating personality recognition. We conduct experiments using the recently released CPED dataset, which is the personality dataset encompassing affection labels and conversation details. Our results demonstrate that SAH-GCN achieves superior accuracy (+1.88%) compared to prior works on the PRC task. Further analysis verifies the efficacy of our scheme that fuses multiple conversations and incorporates affective information for personality recognition.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103931"},"PeriodicalIF":7.4,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph contrastive learning (GCL) has recently attracted significant attention in the field of recommender systems. However, many GCL methods aim to enhance recommendation accuracy by employing dense matrix operations and frequent manipulation of graph structures to generate contrast views, leading to substantial computational resource consumption. While simpler GCL methods have lower computational costs, they fail to fully exploit collaborative filtering information, which reduces accuracy. On the other hand, more complex adaptive methods achieve higher accuracy but at the expense of significantly greater computational cost. Consequently, there exists a considerable gap in accuracy between these lightweight models and the more complex GCL methods focused on high accuracy.
To address this issue and achieve high predictive accuracy while maintaining low computational cost, we propose a novel method that incorporates attention-wise graph reconstruction with message masking and cross-view interaction for contrastive learning. The attention-wise graph reconstruction with message masking preserves the structural and semantic information of the graph while mitigating the overfitting problem. Linear attention ensures that the algorithm’s complexity remains low. Furthermore, the cross-view interaction is capable of capturing more high-quality latent features. Our results, validated on four datasets, demonstrate that the proposed method maintains a lightweight computational cost and significantly outperforms the baseline methods in recommendation accuracy.
{"title":"LacGCL: Lightweight message masking with linear attention and cross-view interaction graph contrastive learning for recommendation","authors":"Haohe Jia , Peng Hou , Yong Zhou , Hongbin Zhu , Hongfeng Chai","doi":"10.1016/j.ipm.2024.103930","DOIUrl":"10.1016/j.ipm.2024.103930","url":null,"abstract":"<div><div>Graph contrastive learning (GCL) has recently attracted significant attention in the field of recommender systems. However, many GCL methods aim to enhance recommendation accuracy by employing dense matrix operations and frequent manipulation of graph structures to generate contrast views, leading to substantial computational resource consumption. While simpler GCL methods have lower computational costs, they fail to fully exploit collaborative filtering information, leading to reduced accuracy. On the other hand, more complex adaptive methods achieve higher accuracy but at the expense of significantly greater computational cost. Consequently, there exists a considerable gap in accuracy between these lightweight models and the more complex GCL methods focused on high accuracy.</div><div>To address this issue and achieve high predictive accuracy while maintaining low computational cost, we propose a novel method that incorporates attention-wise graph reconstruction with message masking and cross-view interaction for contrastive learning. The attention-wise graph reconstruction with message masking preserves the structural and semantic information of the graph while mitigating the overfitting problem. Linear attention ensures that the algorithm’s complexity remains low. Furthermore, the cross-view interaction is capable of capturing more high-quality latent features. Our results, validated on four datasets, demonstrate that the proposed method maintains a lightweight computational cost and significantly outperforms the baseline methods in recommendation accuracy.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103930"},"PeriodicalIF":7.4,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-22 | DOI: 10.1016/j.ipm.2024.103917
Heng-yang Lu, Tian-ci Liu, Rui Cong, Jun Yang, Qiang Gan, Wei Fang, Xiao-jun Wu
Aspect-based Sentiment Analysis (ABSA) aims to extract fine-grained sentiment information from online reviews. Few-shot ABSA faces the challenge of limited labeled data, and recent generative models have outperformed traditional classification models on it. Existing methods use Question Answering (QA) templates with the Text-to-Text Transfer Transformer (T5) to extract sentiment elements, introducing a generative sentiment analysis paradigm. However, these models often fail to fully grasp ABSA rules, generating non-standard or incorrect outputs. This issue also arises with large language models (LLMs) due to insufficient labeled data for tuning and learning. Additionally, ABSA datasets often include many short, uninformative reviews, complicating sentiment element extraction in few-shot scenarios. This paper addresses two major challenges in few-shot ABSA: (1) how to make the generative model properly understand the ABSA rules in few-shot scenarios, and (2) how to enrich the review text with more information. We propose a Quantity Augmentation and Information Enhancement (QAIE) approach, leveraging LLMs to generate fluent texts and infer implicit information. First, we propose a quantity augmentation module, which leverages an LLM to obtain sufficient labeled data so the generative model can learn the ABSA rules better. Then, we introduce an information enhancement module, which gives the generative model more informative input by enhancing the information in each review. Comprehensive experiments on five ABSA tasks using three widely used datasets demonstrate that our QAIE model achieves approximately 10% improvement over state-of-the-art models. Specifically, for the most challenging ASQP task, our LLM-based model is compared with the existing state-of-the-art models on the Rest15 and Rest16 datasets, achieving F1 gains of 9.42% and 6.45%, respectively, in the k = 5 few-shot setting.
{"title":"QAIE: LLM-based Quantity Augmentation and Information Enhancement for few-shot Aspect-Based Sentiment Analysis","authors":"Heng-yang Lu , Tian-ci Liu , Rui Cong , Jun Yang , Qiang Gan , Wei Fang , Xiao-jun Wu","doi":"10.1016/j.ipm.2024.103917","DOIUrl":"10.1016/j.ipm.2024.103917","url":null,"abstract":"<div><div>Aspect-based Sentiment Analysis (ABSA) aims to extract fine-grained sentiment information from online reviews. Few-shot ABSA faces challenges with limited labeled data and recent generative models have outperformed traditional classification models. Existing methods use Question Answering (QA) templates with Text-to-Text Transfer Transformer (T5) to extract sentiment elements, introducing a generative sentiment analysis paradigm. However, these models often fail to fully grasp ABSA rules, generating non-standard or incorrect outputs. This issue also arises with large language models (LLMs) due to insufficient labeled data for tuning and learning. Additionally, ABSA datasets often include many short, uninformative reviews, complicating sentiment element extraction in few-shot scenarios. This paper addresses two major challenges in few-shot ABSA: (1) <em>How to let the generative model well understand the ABSA rules under few-shot scenarios</em>. (2) <em>How to enhance the review text with richer information</em>. We propose a <strong>Q</strong>uantity <strong>A</strong>ugmentation and <strong>I</strong>nformation <strong>E</strong>nhancement (<strong>QAIE</strong>) approach, leveraging LLMs to generate fluent texts and infer implicit information. First, we propose a quantity augmentation module, which leverages the large language model (LLM) to obtain sufficient labeled data for the generative model to learn the ABSA rules better. Then, we introduce an information enhancement module, which brings more informative input to the generative model by enhancing the information in the review. Comprehensive experiments on five ABSA tasks using three widely-used datasets demonstrate that our QAIE model achieves approximately 10% improvement over state-of-the-art models. Specifically, for the most challenging ASQP task, our LLM-based model is compared with the existing state-of-the-art models on datasets Rest15 and Rest16, achieving F1 gains of 9.42% and 6.45% respectively in the <span><math><mrow><mi>k</mi><mo>=</mo><mn>5</mn></mrow></math></span> few-shot setting.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103917"},"PeriodicalIF":7.4,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-20 | DOI: 10.1016/j.ipm.2024.103934
Qiang Cao, Xian Cheng
Despite the widespread adoption of deep learning to enhance image classification, significant obstacles remain. First, multisource data with diverse sizes and formats pose a great challenge for most current deep learning models. Second, the lack of manually labeled data for model training limits the application of deep learning. Third, the widely used CNN-based methods show limitations in extracting global features and yield poor performance on image topology. To address these issues, we propose a Hybrid Feature Fusion Deep Learning (HFFDL) framework for image classification. The framework consists of an automated image segmentation module, a two-stream backbone module, and a classification module. The automated image segmentation module utilizes the U-Net model and transfer learning to detect regions of interest (ROI) in multisource images; the two-stream backbone module integrates the Swin Transformer architecture with the Inception CNN to simultaneously extract local and global features for efficient representation learning. We evaluate the performance of the HFFDL framework on two publicly available image datasets: one for identifying COVID-19 through chest X-ray scans (30,386 images), and another for multiclass skin cancer screening using dermoscopy images (25,331 images). The HFFDL framework outperformed many cutting-edge models, achieving AUC scores of 0.9835 and 0.8789, respectively. Furthermore, a practical application study conducted in a hospital, identifying viable embryos from medical images, showed that the HFFDL framework outperformed embryologists.
{"title":"A hybrid feature fusion deep learning framework for multi-source medical image analysis","authors":"Qiang Cao , Xian Cheng","doi":"10.1016/j.ipm.2024.103934","DOIUrl":"10.1016/j.ipm.2024.103934","url":null,"abstract":"<div><div>Despite the widespread adoption of deep learning to enhance image classification, significant obstacles remain. First, multisource data with diverse sizes and formats is a great challenge for most current deep learning models. Second, lacking manual labeled data for model training limits the application of deep learning. Third, the widely used CNN-based methods shows their limitations in extracting global features and yield poor performance for image topology. To address these issues, we propose a Hybrid Feature Fusion Deep Learning (HFFDL) framework for image classification. This framework consists of an automated image segmentation module, a two-stream backbone module, and a classification module. The automatic image segmentation module utilizes the U-Net model and transfer learning to detect region of interest (ROI) in multisource images; the two-stream backbone module integrates the Swin Transformer architecture with the Inception CNN, with the aim of simultaneous extracting local and global features for efficient representation learning. We evaluate the performance of HFFDL framework with two publicly available image datasets: one for identifying COVID-19 through X-ray scans of the chest (30,386 images), and another for multiclass skin cancer screening using dermoscopy images (25,331 images). The HFFDL framework exhibited greater performance in comparison to many cutting-edge models, achieving the AUC score 0.9835 and 0.8789, respectively. Furthermore, a practical application study conducted in a hospital, identifying viable embryos using medical images, revealed the HFFDL framework outperformed embryologists.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103934"},"PeriodicalIF":7.4,"publicationDate":"2024-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142535923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-19 | DOI: 10.1016/j.ipm.2024.103922
Jinjin Zhang, Qimeng Fan, Dingan Wang, Pu Huang, Zhangjing Yang
Discriminative Least Squares Regression (DLSR) is an algorithm that employs the ε-draggings technique to enhance intra-class similarity. However, it overlooks that an increase in intra-class closeness may simultaneously decrease the distance between similar but different classes. To address this issue, we propose a new approach called Triple Sparse Denoising Discriminative Least Squares Regression (TSDDLSR), which combines three sparsity constraints: sparsity constraints between classes to enlarge the distance between similar classes; sparsity constraints on relaxation matrices to capture more local structure; and sparsity constraints on noise matrices to minimize the effect of outliers. In addition, we strategically position the matrix decomposition step in the label space, with the objective of enhancing denoising capabilities, safeguarding the label space from potential degradation, and preserving its underlying manifold structure. Our experiments evaluate the classification performance of the method on face recognition tasks (AR, CMU PIE, Extended Yale B, Georgia Tech, FERET datasets), biometric recognition tasks (PolyU Palmprint dataset), and object recognition tasks (COIL-20, ImageNet datasets). The results show that TSDDLSR significantly improves classification performance compared to existing methods.
{"title":"Triple Sparse Denoising Discriminantive Least Squares Regression for image classification","authors":"Jinjin Zhang, Qimeng Fan, Dingan Wang, Pu Huang, Zhangjing Yang","doi":"10.1016/j.ipm.2024.103922","DOIUrl":"10.1016/j.ipm.2024.103922","url":null,"abstract":"<div><div>Discriminantive Least Squares Regression (DLSR) is an algorithm that employs <span><math><mi>ɛ</mi></math></span>-draggings techniques to enhance intra-class similarity. However, it overlooks that an increase in intra-class closeness may simultaneously lead to a decrease in the distance between similar but different classes. To address this issue, we propose a new approach called Triple Sparse Denoising Discriminantive Least Squares Regression (TSDDLSR), which combines three sparsity constraints: sparsity constraints between classes to amplify the growth of the distance between similar classes; sparsity constraints on relaxation matrices to capture more local structure; sparsity constraints on noise matrices to minimize the effect of outliers. In addition, we position the matrix decomposition step in the label space strategically with the objective of enhancing denoising capabilities, safeguarding it from potential degradation, and preserving its underlying manifold structure. Our experiments evaluate the classification performance of the method under face recognition tasks (AR, CMU PIE, Extended Yale B, Georgia Tech, FERET datasets), biometric recognition tasks (PolyU Palmprint dataset), and object recognition tasks (COIL-20, ImageNet datasets). Meanwhile, the results show that TSDDLSR significantly improves classification performance compared to existing methods.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103922"},"PeriodicalIF":7.4,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-19 | DOI: 10.1016/j.ipm.2024.103933
Lin Runhui, Li Yalin, Ji Ze, Xie Qiqi, Chen Xiaoyu
Scientific breakthroughs have the potential to reshape the trajectory of knowledge flow and significantly impact later research. The aim of this study is to introduce the Degree of Innovation Breakthrough (DIB) metric to more accurately quantify the extent of scientific breakthroughs. The DIB metric takes into account changes in the trajectory of knowledge flow, as well as the depth and breadth of impact, and it modifies the traditional assumption of equal citation contributions by assigning weighted citation counts. The effectiveness of the DIB metric is assessed using ROC curves and AUC metrics, demonstrating its ability to differentiate between high and low scientific breakthroughs with high sensitivity and minimal false positives. Based on the ROC curves, this study proposes a method to calculate the threshold for high scientific breakthroughs, reducing subjectivity. The effectiveness of the proposed method is demonstrated on a dataset consisting of 1108 award-winning computer science papers and 9832 matched control papers, showing that the DIB metric surpasses single-dimensional metrics. The study also performs a granular analysis of the innovation breakthrough degree of non-award-winning papers, categorizing them into four types based on originality and impact through 2D histogram visualization, and suggests tailored management strategies. Adopting this refined classification strategy can optimize the management of innovation practices, ultimately fostering better innovative research outcomes. The quantitative tools introduced in this paper offer guidance for researchers in the fields of science intelligence mining and science trend prediction.
{"title":"Quantifying the degree of scientific innovation breakthrough: Considering knowledge trajectory change and impact","authors":"Lin Runhui , Li Yalin , Ji Ze , Xie Qiqi , Chen Xiaoyu","doi":"10.1016/j.ipm.2024.103933","DOIUrl":"10.1016/j.ipm.2024.103933","url":null,"abstract":"<div><div>Scientific breakthroughs have the potential to reshape the trajectory of knowledge flow and significantly impact later research. The aim of this study is to introduce the Degree of Innovation Breakthrough (DIB) metric to more accurately quantify the extent of scientific breakthroughs. The DIB metric takes into account changes in the trajectory of knowledge flow, as well as the deep and width of impact, and it modifies the traditional assumption of equal citation contributions by assigning weighted citation counts. The effectiveness of the DIB metric is assessed using ROC curves and AUC metrics, demonstrating its ability to differentiate between high and low scientific breakthroughs with high sensitivity and minimal false positives. Based on ROC curves, this study proposes a method to calculate the threshold for high scientific breakthrough, reducing subjectivity. The effectiveness of the proposed method is demonstrated through a dataset consisting of 1108 award-winning computer science papers and 9832 matched control papers, showing that the DIB metric surpasses single-dimensional metrics. The study also performs a granular analysis of the innovation breakthrough degree of non-award-winning papers, categorizing them into four types based on originality and impact through 2D histogram visualization, and suggests tailored management strategies. Through the adoption of this refined classification strategy, the management of innovation practices can be optimized, ultimately fostering the enhancement of innovative research outcomes. The quantitative tools introduced in this paper offer guidance for researchers in the fields of science intelligence mining and science trend prediction.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103933"},"PeriodicalIF":7.4,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-19 | DOI: 10.1016/j.ipm.2024.103927
Tao Wen, Yu-wang Chen, Tahir Abbas Syed, Darminder Ghataoura
Effectively understanding and enhancing communication flows among employees within an organizational hierarchy is crucial for optimizing operational and decision-making efficiency. To fill this significant gap in research, we propose a systematic and comprehensive social network analysis approach, coupled with a newly formulated communication vector and matrix, to examine communication behaviors and dynamics in an organizational hierarchy. We use the Enron email dataset, consisting of 619,499 emails, as an illustrative example to bridge the micro-macro divide of organizational communication research. A series of centrality measures are employed to evaluate the influential ability of individual employees, revealing that influential ability declines down the hierarchy and that communication behaviors change with hierarchical level. Through the identification of community structure and the proposed communication matrix, we also uncover that employees tend to communicate within the same functional teams. Furthermore, the emergent dynamics of organizational communication during a crisis are examined through a time-segmented dataset, showcasing the progressive absence of the legal team, the responsibility of top management, and the presence of hierarchy. By considering both individual and organizational perspectives, our work provides a systematic and data-driven approach to understanding how the organizational communication network emerges dynamically from individual communication behaviors within the hierarchy, which has the potential to enhance operational and decision-making efficiency within organizations.
{"title":"Examining communication network behaviors, structure and dynamics in an organizational hierarchy: A social network analysis approach","authors":"Tao Wen , Yu-wang Chen , Tahir Abbas Syed , Darminder Ghataoura","doi":"10.1016/j.ipm.2024.103927","DOIUrl":"10.1016/j.ipm.2024.103927","url":null,"abstract":"<div><div>Effectively understanding and enhancing communication flows among employees within an organizational hierarchy is crucial for optimizing operational and decision-making efficiency. To fill this significant gap in research, we propose a systematic and comprehensive social network analysis approach, coupled with a newly formulated communication vector and matrix, to examine communication behaviors and dynamics in an organizational hierarchy. We use the Enron email dataset, consisting of 619,499 emails, as an illustrative example to bridge the micro-macro divide of organizational communication research. A series of centrality measures are employed to evaluate the influential ability of individual employees, revealing descending influential ability and changing behaviors according to hierarchy. We also uncover that employees tend to communicate within the same functional teams through the identification of community structure and the proposed communication matrix. Furthermore, the emergent dynamics of organizational communication during a crisis are examined through a time-segmented dataset, showcasing the progressive absence of the legal team, the responsibility of top management, and the presence of hierarchy. By considering both individual and organizational perspectives, our work provides a systematic and data-driven approach to understanding how the organizational communication network emerges dynamically from individual communication behaviors within the hierarchy, which has the potential to enhance operational and decision-making efficiency within organizations.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103927"},"PeriodicalIF":7.4,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142535922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-18 | DOI: 10.1016/j.ipm.2024.103923
Amir Moslemi, Mina Jamshidi
Feature selection techniques are widely used as a preprocessing step when training machine learning algorithms, to circumvent the curse of dimensionality, overfitting, and long computation times. Projection-based methods are frequently employed in feature selection, leveraging the extraction of linear relationships among features; the absence of nonlinear information extraction among features is notable in this context. While auto-encoder based techniques have recently gained traction for feature selection, their focus remains primarily on the encoding phase, since it is through this phase that the selected features are derived. The subtle point is that the performance of an auto-encoder in obtaining the most discriminative features is significantly affected by the decoding phase. To address these challenges, in this paper we propose a novel auto-encoder-based feature selection method that not only extracts nonlinear information among features but also regularizes the decoding phase to enhance performance. We define a new auto-encoder model that keeps the topological information of the reconstruction close to that of the input data: the geometric structure of the input data is preserved in the projected space using a Laplacian graph, and the geometry of the projected space is preserved in the reconstructed space using a suitable term (an abstract Laplacian graph of the reconstructed data) in the optimization problem. Keeping the abstract Laplacian graph of the reconstructed data close to the Laplacian graph of the input data affects the performance of feature selection, as we show experimentally. We therefore present an effective approach to solving the corresponding objective. Since this approach is mainly intended for clustering, we conducted experiments on ten benchmark datasets and assessed our proposed method by clustering accuracy and the normalized mutual information (NMI) metric. Our method shows considerable superiority over recent state-of-the-art techniques in terms of NMI and accuracy.
{"title":"Unsupervised feature selection using sparse manifold learning: Auto-encoder approach","authors":"Amir Moslemi , Mina Jamshidi","doi":"10.1016/j.ipm.2024.103923","DOIUrl":"10.1016/j.ipm.2024.103923","url":null,"abstract":"<div><div>Feature selection techniques are widely being used as a preprocessing step to train machine learning algorithms to circumvent the curse of dimensionality, overfitting, and computation time challenges. Projection-based methods are frequently employed in feature selection, leveraging the extraction of linear relationships among features. The absence of nonlinear information extraction among features is notable in this context. While auto-encoder based techniques have recently gained traction for feature selection, their focus remains primarily on the encoding phase, as it is through this phase that the selected features are derived. The subtle point is that the performance of auto-encoder to obtain the most discriminative features is significantly affected by decoding phase. To address these challenges, in this paper, we proposed a novel feature selection based on auto-encoder to not only extracting nonlinear information among features but also decoding phase is regularized as well to enhance the performance of algorithm. In this study, we defined a new model of auto-encoder to preserve the topological information of reconstructed close to input data. To geometric structure of input data is preserved in projected space using Laplacian graph, and geometrical projected space is preserved in reconstructed space using a suitable term (abstract Laplacian graph of reconstructed data) in optimization problem. Preserving abstract Laplacian graph of reconstructed data close to Laplacian graph of input data affects the performance of feature selection and we experimentally showed this. Therefore, we show an effective approach to solve the objective of the corresponding problem. Since this approach can be mainly used for clustering aims, we conducted experiments on ten benchmark datasets and assessed our propped method based on clustering accuracy and normalized mutual information (NMI) metric. Our method obtained considerable superiority over recent state-of-the-art techniques in terms of NMI and accuracy.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103923"},"PeriodicalIF":7.4,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142446437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-18 | DOI: 10.1016/j.ipm.2024.103920
Shixuan Liu, Haoxiang Cheng, Yunfei Wang, Yue He, Changjun Fan, Zhong Liu
Heterogeneous Information Networks (HINs) encapsulate diverse entity and relation types, with meta-paths providing essential meta-level semantics for knowledge reasoning, although their utility is constrained by discovery challenges. While Large Language Models (LLMs) offer new prospects for meta-path discovery due to their extensive knowledge encoding and efficiency, their adaptation faces challenges such as corpora bias, lexical discrepancies, and hallucination. This paper pioneers the mitigation of these challenges by presenting EvoPath, an innovative framework that leverages LLMs to efficiently identify high-quality meta-paths. EvoPath is carefully designed, with each component aimed at addressing issues that could lead to potential knowledge conflicts. With a minimal subset of HIN facts, EvoPath iteratively generates and evolves meta-paths by dynamically replaying meta-paths in the buffer with prioritization based on their scores. Comprehensive experiments on three large, complex HINs with hundreds of relations demonstrate that our framework, EvoPath, enables LLMs to generate high-quality meta-paths through effective prompting, confirming its superior performance in HIN reasoning tasks. Further ablation studies validate the effectiveness of each module within the framework.
{"title":"EvoPath: Evolutionary meta-path discovery with large language models for complex heterogeneous information networks","authors":"Shixuan Liu , Haoxiang Cheng , Yunfei Wang , Yue He , Changjun Fan , Zhong Liu","doi":"10.1016/j.ipm.2024.103920","DOIUrl":"10.1016/j.ipm.2024.103920","url":null,"abstract":"<div><div>Heterogeneous Information Networks (HINs) encapsulate diverse entity and relation types, with meta-paths providing essential meta-level semantics for knowledge reasoning, although their utility is constrained by discovery challenges. While Large Language Models (LLMs) offer new prospects for meta-path discovery due to their extensive knowledge encoding and efficiency, their adaptation faces challenges such as corpora bias, lexical discrepancies, and hallucination. This paper pioneers the mitigation of these challenges by presenting EvoPath, an innovative framework that leverages LLMs to efficiently identify high-quality meta-paths. EvoPath is carefully designed, with each component aimed at addressing issues that could lead to potential knowledge conflicts. With a minimal subset of HIN facts, EvoPath iteratively generates and evolves meta-paths by dynamically replaying meta-paths in the buffer with prioritization based on their scores. Comprehensive experiments on three large, complex HINs with hundreds of relations demonstrate that our framework, EvoPath, enables LLMs to generate high-quality meta-paths through effective prompting, confirming its superior performance in HIN reasoning tasks. Further ablation studies validate the effectiveness of each module within the framework.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103920"},"PeriodicalIF":7.4,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Classical Chinese literature, with its long history spanning thousands of years, serves as an invaluable resource for historical and humanistic studies. Previous classical Chinese language models have achieved significant progress in semantic understanding. However, they largely neglected the dynamic evolution of language across different historical eras. In this paper, we introduce a novel diachronic pre-trained language model tailored for classical Chinese texts. This model utilizes a time-based transformer architecture that captures the continuous evolution of semantics over time. Moreover, it adeptly balances the contextual and temporal information, minimizing semantic ambiguities from excessive time-related inputs. A high-quality diachronic corpus for classical Chinese is developed for training. This corpus spans from the pre-Qin dynasty to the Qing dynasty and includes a diverse array of genres. We validate its effectiveness by enriching a well-known classical Chinese word sense disambiguation dataset with additional temporal annotations. The results demonstrate the state-of-the-art performance of our model in discerning classical Chinese word meanings across different historical periods. Our research helps linguists to rapidly grasp the extent of semantic changes across different periods from vast corpora.
{"title":"A diachronic language model for long-time span classical Chinese","authors":"Yuting Wei, Meiling Li, Yangfu Zhu, Yuanxing Xu, Yuqing Li, Bin Wu","doi":"10.1016/j.ipm.2024.103925","DOIUrl":"10.1016/j.ipm.2024.103925","url":null,"abstract":"<div><div>Classical Chinese literature, with its long history spanning thousands of years, serves as an invaluable resource for historical and humanistic studies. Previous classical Chinese language models have achieved significant progress in semantic understanding. However, they largely neglected the dynamic evolution of language across different historical eras. In this paper, we introduce a novel diachronic pre-trained language model tailored for classical Chinese texts. This model utilizes a time-based transformer architecture that captures the continuous evolution of semantics over time. Moreover, it adeptly balances the contextual and temporal information, minimizing semantic ambiguities from excessive time-related inputs. A high-quality diachronic corpus for classical Chinese is developed for training. This corpus spans from the pre-Qin dynasty to the Qing dynasty and includes a diverse array of genres. We validate its effectiveness by enriching a well-known classical Chinese word sense disambiguation dataset with additional temporal annotations. The results demonstrate the state-of-the-art performance of our model in discerning classical Chinese word meanings across different historical periods. Our research helps linguists to rapidly grasp the extent of semantic changes across different periods from vast corpora.<span><span><sup>1</sup></span></span></div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103925"},"PeriodicalIF":7.4,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142440957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}