
Latest Publications from Information Processing & Management

The joint extraction of fact-condition statement and super relation in scientific text with table filling method
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-10-01. DOI: 10.1016/j.ipm.2024.103906
Qizhi Chen , Hong Yao , Diange Zhou
Fact-condition statements are of great significance in scientific text: they record a natural phenomenon together with its precondition in detail. In previous studies, the extraction of fact-condition statements and their relation (the super relation) from scientific text was designed as a pipeline in which the fact-condition statements and the super relation are extracted successively, which leads to error propagation and lowers accuracy. To solve this problem, the table filling method is adopted for the first time for the joint extraction of fact-condition statements and super relations, and the Biaffine Convolution Neural Network (BCNN) model is proposed to complete the task. In the BCNN, a pretrained language model and a biaffine neural network serve as the encoder, while a convolutional neural network is added as the decoder to enhance local semantic information. Benefiting from this local semantic enhancement, the BCNN achieves the best F1 score across different pretrained language models in comparison with other baselines. Its F1 scores on GeothCF (geological text) reach 73.17% and 71.04% with BERT and SciBERT as the pretrained language model, respectively. Moreover, the local semantic enhancement also increases training efficiency, as the tag distribution can be more easily learned by the model. Besides, the BCNN trained on GeothCF also exhibits the best performance on BioCF (biomedical text), indicating that it can be widely applied to information extraction across scientific domains. Finally, a geological fact-condition knowledge graph is built with the BCNN, demonstrating a new pipeline for constructing scientific fact-condition knowledge graphs.
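The core of a biaffine table-filling extractor is a scorer that assigns each (token i, token j) pair a score for every tag, producing the table the decoder refines. A minimal NumPy sketch of that pair-scoring idea follows; the dimension names and tag inventory are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def biaffine_table_scores(H, U, W, b):
    """Score every token pair (i, j) for each table-filling tag.

    H: (n, d) token representations from a pretrained encoder.
    U: (t, d, d) bilinear weights, W: (t, 2d) linear weights, b: (t,) bias,
    where t is the number of tags.  Returns an (n, n, t) score table, the
    structure a CNN decoder would then refine with local context.
    """
    n, d = H.shape
    bilinear = np.einsum('id,tde,je->ijt', H, U, H)      # h_i^T U_t h_j
    pair = np.concatenate(
        [np.repeat(H[:, None, :], n, axis=1),            # h_i broadcast over j
         np.repeat(H[None, :, :], n, axis=0)], axis=-1)  # h_j broadcast over i
    linear = pair @ W.T                                  # (n, n, t)
    return bilinear + linear + b

rng = np.random.default_rng(0)
n, d, t = 5, 8, 4
table = biaffine_table_scores(rng.normal(size=(n, d)),
                              rng.normal(size=(t, d, d)),
                              rng.normal(size=(t, 2 * d)),
                              rng.normal(size=t))
print(table.shape)  # (5, 5, 4)
```

In a trained model, an argmax over the tag axis of this table yields the joint span-and-relation decisions in one pass, which is what removes the pipeline's error propagation.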
Citations: 0
Crowdsourced auction-based framework for time-critical and budget-constrained last mile delivery
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-30. DOI: 10.1016/j.ipm.2024.103888
Esraa Odeh , Shakti Singh , Rabeb Mizouni , Hadi Otrok
This work addresses the problem of Last Mile Delivery (LMD) in time-critical and budget-constrained environments. Given the rapid growth of e-commerce worldwide, LMD has become a primary bottleneck for the efficiency of delivery services due to several factors, including travelling distance, service cost, and delivery time. Existing works mainly target optimizing travelled distance and maximizing gained profit; however, they do not consider time-critical and budget-limited tasks. The deployment of UAVs and the development of crowdsourcing platforms have provided a range of solutions for advancing LMD frameworks, as they offer many crowdworkers at varying locations ready to perform tasks instead of a single point of departure. This work proposes a Hybrid, Crowdsourced, Auction-based LMD (HCA-LMD) framework with a dynamic allocation mechanism for optimized delivery of time-sensitive and budget-limited tasks. The proposed framework allocates tasks to workers as soon as they are submitted, given their urgency level and dropoff location, while considering the price, rating, and location of available workers. This work was compared against two benchmarks to assess the framework's performance in dynamic environments in terms of on-time deliveries, average delay, and profit. Extensive simulation results showed outstanding performance of the proposed LMD framework: it accomplished almost 92% on-time deliveries under varying time- and budget-constrained scenarios, outperformed the first benchmark in on-time allocation rate by fulfilling an additional 24% of the tasks that benchmark failed, and, compared against the second benchmark, cut average delay time by around 50% while gaining up to 5.8× the profit.
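The allocate-on-submission idea can be sketched as a feasibility check (price within budget, travel time within deadline) followed by picking the best remaining bid. The scoring rule below is a hypothetical stand-in; the real HCA-LMD auction mechanism is more elaborate.

```python
import math

def allocate(task, workers):
    """Assign a newly submitted delivery task to the best available worker.

    A worker is feasible if its price fits the task budget and its travel
    time meets the deadline; among feasible workers, the bid with the best
    rating per (price x distance) wins.  Illustrative scoring only.
    """
    best, best_score = None, -math.inf
    for w in workers:
        dist = math.dist(w["loc"], task["dropoff"])
        eta = dist / w["speed"]
        if w["price"] > task["budget"] or eta > task["deadline"]:
            continue  # infeasible bid: over budget or too slow
        score = w["rating"] / (w["price"] * (1.0 + dist))
        if score > best_score:
            best, best_score = w, score
    return best

workers = [
    {"id": "w1", "loc": (0, 0), "speed": 1.0, "price": 5.0, "rating": 4.5},
    {"id": "w2", "loc": (3, 4), "speed": 2.0, "price": 3.0, "rating": 4.0},
]
task = {"dropoff": (3, 3), "budget": 4.0, "deadline": 2.0}
print(allocate(task, workers)["id"])  # w2: w1 is over budget and too far
```

Allocating immediately per submitted task (rather than batching) is what makes the mechanism responsive to urgency, at the cost of forgoing globally optimal batch assignments.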
Citations: 0
An interpretable polytomous cognitive diagnosis framework for predicting examinee performance
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-29. DOI: 10.1016/j.ipm.2024.103913
Xiaoyu Li , Shaoyang Guo , Jin Wu , Chanjin Zheng
As a fundamental task of intelligent education, deep learning-based cognitive diagnostic models (CDMs) have been introduced to effectively model dichotomous testing data. However, modeling polytomous data within the deep-learning framework remains a challenge. This paper proposes a novel Polytomous Cognitive Diagnosis Framework (PCDF), which employs Cumulative Category Response Function (CCRF) theory to partition and consolidate data, thereby enabling existing cognitive diagnostic models to seamlessly analyze graded response data. Combining the proposed PCDF with IRT, MIRT, NCDM, KaNCD, and ICDM, extensive experiments were complemented by data re-encoding techniques on four real-world graded scoring datasets, along with baseline methods such as linear-split, one-vs-all, and random. The results suggest that, when combined with existing CDMs, PCDF outperforms the baseline models in terms of prediction. Additionally, we showcase the interpretability of examinee ability and item parameters through the utilization of PCDF.
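The cumulative-category recoding that lets a dichotomous CDM consume graded responses can be illustrated with a few lines: a response in category k becomes one binary outcome per threshold. This is one plausible reading of the CCRF partitioning step, not necessarily the paper's exact recoding.

```python
def cumulative_split(score, max_score):
    """Recode one graded response into dichotomous pseudo-responses.

    Under a cumulative-category scheme, a response in category `score`
    (0..max_score) becomes one binary outcome per threshold t = 1..max_score:
    1 if the examinee reached category t, else 0.  Existing dichotomous
    CDMs can then be trained on the recoded data.
    """
    return [1 if score >= t else 0 for t in range(1, max_score + 1)]

# A partial-credit response of 2 on a 0-3 item:
print(cumulative_split(2, 3))  # [1, 1, 0]
```

Consolidation runs the other way: the per-threshold probabilities predicted by the dichotomous model are differenced to recover a probability for each graded category.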
Citations: 0
A theoretical framework for human-centered intelligent information services: A systematic review
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-29. DOI: 10.1016/j.ipm.2024.103891
Qiao Li, Yuelin Li, Shuhan Zhang, Xin Zhou, Zhengyuan Pan
Intelligent Information Services (IIS) employ Artificial Intelligence (AI)-based systems to provide information that matches the user's needs in diverse and evolving environments. Acknowledging the importance of users to the success of AI-empowered IIS, a growing number of researchers are investigating AI-empowered IIS from a user-centric perspective, establishing the foundation for a new research domain called "Human-Centered Intelligent Information Services" (HCIIS). Nonetheless, a review of user studies in AI-empowered IIS is still lacking, impeding the development of a clear definition and research framework for the HCIIS field. To fill this gap, this study conducts a systematic review of 116 user studies in AI-empowered IIS. Results reveal two primary research themes: human-IIS interaction (including user experience, system quality, user attitude, intention and behavior, information quality, and individual task performance) and IIS ethics (e.g., explainability and interpretability, privacy and safety, and inclusivity). Analyzing research gaps within these topics, this study formulates an HCIIS research framework consisting of three interconnected elements: human values and needs, environment, and service. The interconnections between each pair of elements identify three key research domains in HCIIS: interaction, ethics, and evolution. Interaction pertains to facilitating human-IIS interaction to meet human needs, encompassing human-centered theory, evaluation, and the design of AI-empowered IIS interaction. Ethics emphasizes ensuring that AI-empowered IIS align with human values and norms within specific environments, covering general and context-specific AI-empowered IIS ethical principles, risk assessment, and governance strategies. Evolution focuses on fulfilling human needs in diverse and dynamic environments through continually evolving intelligence, involving the enhancement of AI-empowered IIS environmental sensitivity and adaptability within an intelligent ecosystem driven by technology integration. Central to HCIIS is co-creation, situated at the intersection of interaction, evolution, and ethics, emphasizing collaborative information creation between IIS and humans through hybrid intelligence. In conclusion, HCIIS is defined as a field centered on information co-creation between IIS and humans, distinguishing it from IIS, which focuses on providing information to humans.
Citations: 0
DualFLAT: Dual Flat-Lattice Transformer for domain-specific Chinese named entity recognition
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-28. DOI: 10.1016/j.ipm.2024.103902
Yinlong Xiao , Zongcheng Ji , Jianqiang Li , Qing Zhu
Recently, lexicon-enhanced methods for Chinese Named Entity Recognition (NER) have achieved great success, but they require a high-quality lexicon. However, for domain-specific Chinese NER, obtaining such a high-quality lexicon is challenging due to the distribution gap between the general lexicon and domain-specific data, and the high cost of constructing a domain lexicon. To address these challenges, we introduce dual-source lexicons (i.e., a general lexicon and a domain lexicon) to acquire enriched lexical knowledge. Considering that the general lexicon often contains more noise than its domain counterpart, we further propose a dual-stream model, the Dual Flat-LAttice Transformer (DualFLAT), designed to mitigate the impact of noise originating from the general lexicon while comprehensively harnessing the knowledge contained in the dual-source lexicons. Experimental results on three public domain-specific Chinese NER datasets (i.e., News, Novel, and E-commerce) demonstrate that our method consistently outperforms single-source lexicon-enhanced approaches, achieving state-of-the-art results. Specifically, our proposed DualFLAT model consistently outperforms the baseline FLAT, with increases of up to 1.52%, 4.84%, and 1.34% in F1 score on the News, Novel, and E-commerce datasets, respectively.
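The dual-source idea starts with lexicon matching: every word from either lexicon that matches a span of the character sequence becomes a lattice node, tagged with its source so a dual-stream encoder can weigh noisy general-lexicon matches differently. A simplistic sketch of that matching step (not the DualFLAT architecture itself) follows; the toy alphabet stands in for Chinese characters.

```python
def build_lattice(chars, general_lex, domain_lex):
    """Collect lexicon-matched spans over a character sequence.

    Each match becomes a flat-lattice node (word, head, tail, source).
    Keeping the source lets a dual-stream model treat general-lexicon
    matches (noisier) and domain-lexicon matches (cleaner) separately.
    """
    nodes = [(c, i, i, "char") for i, c in enumerate(chars)]
    for src, lex in (("general", general_lex), ("domain", domain_lex)):
        for i in range(len(chars)):
            for j in range(i + 1, len(chars) + 1):
                word = "".join(chars[i:j])
                if len(word) > 1 and word in lex:
                    nodes.append((word, i, j - 1, src))
    return nodes

nodes = build_lattice(list("abcd"), {"ab"}, {"bcd"})
print([n for n in nodes if n[3] != "char"])
# [('ab', 0, 1, 'general'), ('bcd', 1, 3, 'domain')]
```

In FLAT-style models, the head/tail positions feed relative-position encodings so the Transformer can attend over characters and matched words in one flat sequence.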
Citations: 0
Gauging, enriching and applying geography knowledge in Pre-trained Language Models
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-27. DOI: 10.1016/j.ipm.2024.103892
Nitin Ramrakhiyani , Vasudeva Varma , Girish Keshav Palshikar , Sachin Pawar
To employ Pre-trained Language Models (PLMs) as knowledge containers in niche domains, it is important to gauge these PLMs' knowledge of facts in those domains. Knowing how much enrichment effort is required to improve them is also an important prerequisite. As part of this work, we aim to gauge and enrich small PLMs' knowledge of world geography. Firstly, we develop a moderately sized dataset of masked sentences covering 24 different fact types about world geography to estimate PLMs' knowledge of these facts. We hypothesize that smaller PLMs may not be well equipped for this niche domain. Secondly, we enrich PLMs with this knowledge through fine-tuning and check whether the knowledge in the dataset is sufficiently infused. We further hypothesize that linguistic variability in the manual templates used to embed the knowledge in masked sentences does not affect the knowledge infusion. Finally, we demonstrate the application of PLMs to tourism blog search and Wikidata KB augmentation. In both applications, we aim to show the effectiveness of using PLMs to achieve competitive performance.
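Masked-sentence probing of this kind amounts to filling cloze templates per fact type and scoring the PLM's predictions against gold answers. A minimal harness is sketched below with a stub predictor in place of a real fill-mask model; the template and fact schema are illustrative, not the paper's dataset format.

```python
def probe_accuracy(templates, facts, predict_mask):
    """Estimate a PLM's knowledge of facts via masked cloze templates.

    `templates` maps a fact type to a pattern with a [MASK] slot;
    `facts` is a list of (fact_type, subject, gold_answer);
    `predict_mask` is any callable that fills the [MASK] slot
    (e.g. a HuggingFace fill-mask pipeline would slot in here).
    """
    correct = 0
    for fact_type, subject, gold in facts:
        sentence = templates[fact_type].format(subject=subject)
        if predict_mask(sentence).strip().lower() == gold.lower():
            correct += 1
    return correct / len(facts)

templates = {"capital": "The capital of {subject} is [MASK]."}
facts = [("capital", "France", "Paris"), ("capital", "Japan", "Kyoto")]

# Stub predictor standing in for a real PLM (always answers from a lookup):
lookup = {"France": "Paris", "Japan": "Tokyo"}
predict = lambda s: next(v for k, v in lookup.items() if k in s)
print(probe_accuracy(templates, facts, predict))  # 0.5
```

Running the same facts through several paraphrased templates per fact type is how one would test the paper's hypothesis that template wording does not change the measured knowledge.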
Citations: 0
DST: Continual event prediction by decomposing and synergizing the task commonality and specificity
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-26. DOI: 10.1016/j.ipm.2024.103899
Yuxin Zhang , Songlin Zhai , Yongrui Chen , Shenyu Zhang , Sheng Bi , Yuan Meng , Guilin Qi
Event prediction aims to forecast future events by analyzing the inherent development patterns of historical events. A desirable event prediction system should learn new event knowledge and adapt to new domains or tasks that arise in real-world application scenarios. However, continuous training can lead to catastrophic forgetting in the model. While existing continual learning methods can retain characteristic knowledge from previous domains, they ignore potential shared knowledge in subsequent tasks. To tackle these challenges, we propose a novel event prediction method based on graph structural commonality and domain characteristic prompts, which not only avoids forgetting but also facilitates bi-directional knowledge transfer across domains. Specifically, we mitigate model forgetting by designing domain characteristic-oriented prompts in a continuous task stream while keeping the backbone pre-trained model frozen. Building upon this, we further devise a commonality-based adaptive updating algorithm that harnesses a unique structural commonality prompt to elicit implicit common features across domains. Our experimental results on two public benchmark datasets for event prediction demonstrate the effectiveness of the proposed continual learning event prediction method compared to state-of-the-art baselines. In tests conducted on the IED-Stream, DST's ET-TA metric improved significantly, by 5.6% over the current best baseline model, while the ET-MD metric, which reveals forgetting, decreased by 5.8%.
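The frozen-backbone-plus-prompts recipe can be sketched as a small prompt pool: one shared commonality prompt and one characteristic prompt allocated per domain, with only these small vectors trained as tasks arrive. Dimensions, initialization, and the combination rule below are illustrative assumptions, not DST's exact design.

```python
import numpy as np

class PromptPool:
    """Frozen-backbone continual learner's trainable state.

    A shared `commonality` prompt captures cross-domain structure; a
    per-domain `characteristic` prompt is created as each new task
    arrives.  Only these vectors would receive gradients, leaving the
    backbone untouched, which is what prevents catastrophic forgetting.
    """
    def __init__(self, dim):
        self.commonality = np.zeros(dim)   # shared, adaptively updated
        self.characteristic = {}           # one prompt per seen domain

    def prompts_for(self, domain):
        # Lazily allocate a fresh characteristic prompt for a new domain.
        if domain not in self.characteristic:
            self.characteristic[domain] = np.zeros_like(self.commonality)
        # Prepend both prompts to the (frozen) backbone's input sequence.
        return np.stack([self.commonality, self.characteristic[domain]])

pool = PromptPool(dim=4)
p = pool.prompts_for("ied-stream")
print(p.shape)  # (2, 4)
```

Since each domain's characteristic prompt is untouched by later tasks, earlier domains can be replayed exactly, while updates to the shared commonality prompt carry knowledge in both directions across tasks.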
{"title":"DST: Continual event prediction by decomposing and synergizing the task commonality and specificity","authors":"Yuxin Zhang ,&nbsp;Songlin Zhai ,&nbsp;Yongrui Chen ,&nbsp;Shenyu Zhang ,&nbsp;Sheng Bi ,&nbsp;Yuan Meng ,&nbsp;Guilin Qi","doi":"10.1016/j.ipm.2024.103899","DOIUrl":"10.1016/j.ipm.2024.103899","url":null,"abstract":"<div><div>Event prediction aims to forecast future events by analyzing the inherent development patterns of historical events. A desirable event prediction system should learn new event knowledge, and adapt to new domains or tasks that arise in real-world application scenarios. However, continuous training can lead to catastrophic forgetting of the model. While existing continuous learning methods can retain characteristic knowledge from previous domains, they ignore potential shared knowledge in subsequent tasks. To tackle these challenges, we propose a novel event prediction method based on graph structural commonality and domain characteristic prompts, which not only avoids forgetting but also facilitates bi-directional knowledge transfer across domains. Specifically, we mitigate model forgetting by designing domain characteristic-oriented prompts in a continuous task stream with frozen the backbone pre-trained model. Building upon this, we further devise a commonality-based adaptive updating algorithm by harnessing a unique structural commonality prompt to inspire implicit common features across domains. Our experimental results on two public benchmark datasets for event prediction demonstrate the effectiveness of our proposed continuous learning event prediction method compared to state-of-the-art baselines. 
In tests conducted on the IED-Stream, DST’s ET-TA metric significantly improved by 5.6% over the current best baseline model, while the ET-MD metric, which reveals forgetting, decreased by 5.8%.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103899"},"PeriodicalIF":7.4,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
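The frozen-backbone, per-domain-prompt recipe described in the abstract can be sketched as follows. This is a minimal, illustrative analogue, not the paper's DST implementation: the class `PromptedPredictor`, the method `add_domain`, and all sizes are invented for this sketch, and a single linear layer stands in for the pre-trained model.

```python
import torch
import torch.nn as nn

class PromptedPredictor(nn.Module):
    """Toy continual learner: a frozen backbone plus one learnable
    prompt vector per domain (all names and sizes are illustrative)."""

    def __init__(self, dim=16, n_classes=4):
        super().__init__()
        self.backbone = nn.Linear(dim, n_classes)  # stands in for the pre-trained model
        for p in self.backbone.parameters():       # freeze: only prompts are trained
            p.requires_grad = False
        self.prompts = nn.ParameterDict()          # one prompt per domain

    def add_domain(self, name, dim=16):
        self.prompts[name] = nn.Parameter(torch.zeros(dim))

    def forward(self, x, domain):
        # The domain prompt is added to the input representation; the real
        # method prepends prompt tokens, this is just a minimal analogue.
        return self.backbone(x + self.prompts[domain])

model = PromptedPredictor()
model.add_domain("IED")
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the per-domain prompt receives gradient updates
```

Because the backbone is frozen, training on a new domain can only move that domain's prompt, which is what prevents earlier domains from being overwritten.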
An adaptive confidence-based data revision framework for Document-level Relation Extraction
IF 7.4 Tier 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2024-09-26 DOI: 10.1016/j.ipm.2024.103909
Chao Jiang, Jinzhi Liao, Xiang Zhao, Daojian Zeng, Jianhua Dai
Noisy annotations have become a key issue limiting Document-level Relation Extraction (DocRE). Previous research explored the problem through manual re-annotation. However, the handcrafted strategy is inefficient, incurs high human costs, and cannot be generalized to large-scale datasets. To address the problem, we construct a confidence-based Revision framework for DocRE (ReD), aiming to achieve high-quality automatic data revision. Specifically, we first introduce a denoising training module to recognize relational facts and prevent noisy annotations. Second, a confidence-based data revision module performs adaptive data revision for long-tail distributed relational facts. After the data revision, we design an iterative training module to create a virtuous cycle, which transforms the revised data into useful training data to support further revision. By capitalizing on ReD, we propose ReD-DocRED, which consists of 101,873 revised annotated documents from DocRED. ReD-DocRED introduces 57.1% new relational facts, and models trained on ReD-DocRED achieve significant improvements in F1 scores, ranging from 6.35 to 16.55. The experimental results demonstrate that ReD can achieve high-quality data revision and, to some extent, replace manual labeling.
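Confidence-based label revision of the kind the abstract describes can be sketched as below. This is a hedged illustration, not the paper's exact ReD algorithm: the function `revise_labels` and its thresholding scheme are invented for this sketch. The idea shown is that a noisy label is flipped to the model's prediction only when the predicted probability clears a per-class threshold, and that the threshold is relaxed for rare (long-tail) classes so their facts can still be revised.

```python
import numpy as np

def revise_labels(probs, labels, base_tau=0.9, counts=None):
    """Illustrative confidence-based revision: flip a label to the model's
    prediction when its confidence exceeds a class-frequency-aware threshold."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels).copy()
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    if counts is None:  # class frequencies observed in the data
        counts = np.bincount(labels, minlength=probs.shape[1])
    # Rarer classes get a lower threshold, down to a floor of 0.5.
    tau = base_tau * counts / max(counts.max(), 1)
    tau = np.clip(tau, 0.5, base_tau)
    revise = (preds != labels) & (conf >= tau[preds])
    labels[revise] = preds[revise]
    return labels, revise

# Three instances, two classes; the 1st and 3rd labels disagree with
# confident predictions and get revised, the 2nd is left alone.
probs = np.array([[0.05, 0.95], [0.6, 0.4], [0.97, 0.03]])
noisy = np.array([0, 0, 1])
revised, mask = revise_labels(probs, noisy)
```

An iterative loop in the spirit of the paper would retrain on `revised` and call `revise_labels` again with the refreshed probabilities.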
Information Processing & Management, Vol. 62, Issue 1, Article 103909.
Mitigating the negative impact of over-association for conversational query production
IF 7.4 Tier 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2024-09-26 DOI: 10.1016/j.ipm.2024.103907
Ante Wang, Linfeng Song, Zijun Min, Ge Xu, Xiaoli Wang, Junfeng Yao, Jinsong Su
Conversational query generation aims at producing search queries from dialogue histories, which are then used to retrieve relevant knowledge from a search engine to help knowledge-based dialogue systems. Trained to maximize the likelihood of gold queries, previous models suffer from the data-hunger issue, and they tend both to drop important concepts from dialogue histories and to generate irrelevant concepts at inference time. We attribute these issues to the over-association phenomenon, where a large number of gold queries are only indirectly related to the dialogue topics, because annotators may unconsciously perform reasoning with their background knowledge when generating these gold queries. We carefully analyze the negative effects of this phenomenon on pretrained Seq2seq query producers and then propose effective instance-level weighting strategies for training to mitigate these issues from multiple perspectives. Experiments on two benchmarks, Wizard-of-Internet and DuSinc, show that our strategies effectively alleviate the negative effects and lead to significant performance gains (2% to 5% across automatic metrics and human evaluation). Further analysis shows that our model selects better concepts from dialogue histories and is 10 times more data efficient than the baseline.
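Instance-level weighting of the kind the abstract proposes amounts to scaling each example's training loss. The sketch below is a generic, hedged illustration of that mechanism, not the paper's specific weighting strategies: the function `weighted_nll` and the example weights are invented here, with a lower weight standing in for a gold query that is only loosely grounded in the dialogue history (an over-association case).

```python
import numpy as np

def weighted_nll(log_probs, weights):
    """Instance-weighted negative log-likelihood: each example's NLL is
    scaled by its weight, then normalized by the total weight."""
    log_probs = np.asarray(log_probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(-(weights * log_probs).sum() / weights.sum())

# Two gold queries: the second is weakly grounded in the dialogue
# history, so it contributes less to the training signal.
loss = weighted_nll(log_probs=[-0.2, -2.0], weights=[1.0, 0.3])
```

With uniform weights this reduces to the usual maximum-likelihood objective; down-weighting suspicious instances is what keeps the model from imitating over-associated queries.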
Information Processing & Management, Vol. 62, Issue 1, Article 103907.
A comprehensive study on fidelity metrics for XAI
IF 7.4 Tier 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2024-09-26 DOI: 10.1016/j.ipm.2024.103900
Miquel Miró-Nicolau, Antoni Jaume-i-Capó, Gabriel Moyà-Alcover
The use of eXplainable Artificial Intelligence (XAI) systems has introduced a set of challenges that need resolution. Herein, we focus on how to correctly select an XAI method, an open question within the field. The inherent difficulty of this task is due to the lack of a ground truth. Several authors have proposed metrics to approximate the fidelity of different XAI methods, but these metrics lack verification and show concerning disagreements. In this study, we proposed a novel methodology to verify fidelity metrics using transparent models, which allowed us to obtain explanations with perfect fidelity. Our proposal constitutes the first objective benchmark for these metrics, facilitating a comparison of existing proposals and surpassing existing methods. We applied our benchmark to assess the existing fidelity metrics in two different experiments, each using public datasets comprising 52,000 images. The images in these datasets were synthetic, 128 by 128 pixels in size, which simplified the training process. We identified that two fidelity metrics, Faithfulness Estimate and Faithfulness Correlation, obtained the expected perfect results for linear models, showing their ability to approximate fidelity for this kind of method. However, when presented with non-linear models, such as those most used in the state of the art, all metric values indicated a lack of fidelity, with the best metric deviating by 30% from the values expected for a perfect explanation. Our experimentation led us to conclude that the current fidelity metrics are not reliable enough to be used in real scenarios. From this finding, we deemed it necessary to develop new metrics that avoid the detected problems, and we recommend the usage of our proposal as a benchmark within the scientific community to address these limitations.
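A perturbation-based fidelity score in the spirit of Faithfulness Correlation can be sketched as below. This is a simplified illustration under stated assumptions, not the exact metric the study evaluates: the function `faithfulness_correlation`, the subset size, and the number of trials are all choices made for this sketch. It perturbs random feature subsets and correlates the summed attribution of the perturbed features with the resulting drop in model output; for a transparent linear model with exact attributions `w * x`, the two quantities coincide, which is exactly the "perfect fidelity" case the benchmark exploits.

```python
import numpy as np

def faithfulness_correlation(model, x, attributions, baseline=0.0, rng=None):
    """Illustrative Faithfulness-Correlation-style score: Pearson correlation
    between summed attributions of perturbed features and output drops."""
    rng = np.random.default_rng(rng)
    deltas, attr_sums = [], []
    for _ in range(50):
        idx = rng.choice(len(x), size=max(1, len(x) // 4), replace=False)
        x_pert = x.copy()
        x_pert[idx] = baseline          # replace the chosen features
        deltas.append(model(x) - model(x_pert))
        attr_sums.append(attributions[idx].sum())
    return float(np.corrcoef(attr_sums, deltas)[0, 1])

# Transparent linear model: the attributions w * x are exact, so the
# correlation should be (numerically) perfect.
w = np.array([0.5, -1.0, 2.0, 0.3])
model = lambda x: float(w @ x)
x = np.array([1.0, 2.0, -1.0, 4.0])
score = faithfulness_correlation(model, x, attributions=w * x, rng=0)
```

For a non-linear model the two series diverge, and the score drops, which is the kind of behavior the study measures against its transparent-model ground truth.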
Information Processing & Management, Vol. 62, Issue 1, Article 103900.