Knowledge and Information Systems: Latest Articles

Latent side-information dynamic augmentation for incremental recommendation
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-26 | DOI: 10.1007/s10115-024-02165-9
Jing Zhang, Jin Shi, Jingsheng Duan, Yonggong Ren

Incremental recommendation updates an existing model by extracting information from interaction data at the current time-step, with the aim of maintaining model accuracy while addressing limitations such as parameter dependencies and inefficient training. However, real-time user interaction data often contains substantial noise and invalid samples, which poses two key challenges for incremental model updating: (1) how to effectively extract valuable new knowledge from interaction data at the current time-step to ensure model accuracy and timeliness, and (2) how to safeguard against catastrophic forgetting of long-term stable preference information, thus preserving the model’s sensitivity during cold-starts. In response to these challenges, we propose Incremental Recommendation with Stable Latent Side-information Updating (SIIFR). The model employs a side-information augmenter to extract valuable latent side-information from user interaction behavior at time-step T, thereby sidestepping the interference caused by noisy interaction data and acquiring stable user preferences. Moreover, the model combines rough interaction data at time-step (T+1) with the existing side-information enhancements to incrementally update latent preferences, thereby ensuring the model’s efficacy during cold-start. Furthermore, SIIFR leverages the change rate in user latent side-information to mitigate the catastrophic forgetting that causes the loss of long-term stable preference information. The effectiveness of the proposed model is validated and compared against existing models on four popular incremental datasets. The model code is available at: https://github.com/LNNU-computer-research-526/FR-sii.

Citations: 0
An overview of semantic-based process mining techniques: trends and future directions
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-26 | DOI: 10.1007/s10115-024-02147-x
Fadilul-lah Yassaanah Issahaku, Ke Lu, Fang Xianwen, Sumaiya Bashiru Danwana, Husein Mohammed Bandago

Process mining algorithms essentially reflect the execution behavior of events in an event log for conformance checking, model discovery, or enhancement. Domain experts have developed several process mining algorithms based on theoretical frameworks such as linear integer programming, heuristics, genetic algorithms, and region-based and semantic-based approaches. The idea is to generate insightful representations of the processes of information systems so that process mining practitioners can gain insight into their systems. Recently, there has been a shift toward semantic-based approaches for process mining, since they not only discover enhanced models but also emphasize context. To this effect, this paper conducts a comprehensive review of 30 articles on semantic process mining techniques. It was found that 44.7% of all works used semantics for process discovery and 23.7% for model enhancement, while conformance checking was the least common use at 10.5%. We further indicate the benefits and contributions of these methods to process mining. Challenges, opportunities, and prospective future research areas are also discussed.

Citations: 0
An aviation accidents prediction method based on MTCNN and Bayesian optimization
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-26 | DOI: 10.1007/s10115-024-02168-6
Minglan Xiong, Zhaoguo Hou, Huawei Wang, Changchang Che, Rui Luo

The safety of the civil aviation system has become an increasing concern following several accidents in recent years. There is an urgent need for a precise accident prediction model that can systematically analyze safety from the perspective of accident mechanism to enhance training accuracy. Furthermore, the predictive model is critical for stakeholders to identify risk and implement the proactive safety paradigm. In this work, to mitigate casualties and economic losses arising from aviation accidents and improve system safety, the focus is on predicting the aircraft damage severity, the injury/death severity, and the flight phases in the sequence of identifying event risk sources. This work establishes a multi-task deep convolutional neural network (MTCNN) learning framework to accomplish this goal. An innovative prediction rule is developed to refine prediction results using two approaches: handling imbalanced classes and Bayesian optimization. The proposed multi-task model is compared with other single-task machine learning models using ten-fold cross-validation and statistical testing, which demonstrates the effectiveness of the developed model in predicting aviation accident severity and flight phase.

Citations: 0
Deep reinforcement learning-based scheduling in distributed systems: a critical review
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-26 | DOI: 10.1007/s10115-024-02167-7
Zahra Jalali Khalil Abadi, Najme Mansouri, Mohammad Masoud Javidi

Many fields of research use parallelized and distributed computing environments, including astronomy, earth science, and bioinformatics. Due to an increase in client requests, service providers face various challenges, such as task scheduling, security, resource management, and virtual machine migration. NP-hard scheduling problems require a long time to find an optimal or suboptimal solution because of their large solution space. With recent advances in artificial intelligence, deep reinforcement learning (DRL) can be used to solve scheduling problems. The DRL approach combines the strength of deep learning and neural networks with reinforcement learning’s feedback-based learning. This paper provides a comprehensive overview of DRL-based scheduling algorithms in distributed systems by categorizing algorithms and applications. Several articles are assessed based on their main objectives, quality of service and scheduling parameters, as well as evaluation environments (i.e., simulation tools, real-world environment). The literature review indicates that algorithms based on RL, such as Q-learning, are effective for learning scaling and scheduling policies in a cloud environment. Additionally, the challenges and directions for further research on deep reinforcement learning to address scheduling problems are summarized (e.g., edge intelligence, ideal dynamic task scheduling framework, human–machine interaction, resource-hungry artificial intelligence (AI) and sustainability).
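
As a rough illustration of the feedback-based learning mentioned in the abstract, the sketch below trains a tabular Q-learning agent to assign arriving tasks to one of two servers. It is a toy example and not taken from the reviewed paper: the environment, the queue cap `CAP`, the rejection penalty `REJECT`, the service probability, and all hyperparameters are hypothetical choices, and a lookup table stands in for the neural network that DRL would use.

```python
import random

CAP = 3          # hypothetical maximum queue length per server
REJECT = -10.0   # hypothetical penalty for sending a task to a full queue

def step(state, action, rng):
    """Assign one arriving task to server `action`; return (next_state, reward)."""
    queues = list(state)
    if queues[action] >= CAP:
        reward = REJECT                    # task rejected, queue unchanged
    else:
        reward = -float(queues[action])    # waiting cost = tasks already queued
        queues[action] += 1
    # Each non-empty server finishes one task with probability 0.5.
    for i in range(2):
        if queues[i] > 0 and rng.random() < 0.5:
            queues[i] -= 1
    return tuple(queues), reward

def greedy(q, state):
    """Return the action with the larger Q-value (ties go to server 0)."""
    return 0 if q[state][0] >= q[state][1] else 1

def train(episodes=5000, horizon=20, alpha=0.1, gamma=0.8, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    states = [(a, b) for a in range(CAP + 1) for b in range(CAP + 1)]
    q = {s: [0.0, 0.0] for s in states}
    for _ in range(episodes):
        state = rng.choice(states)         # random starts so every state is visited
        for _ in range(horizon):
            # Epsilon-greedy action selection.
            action = rng.randrange(2) if rng.random() < epsilon else greedy(q, state)
            nxt, reward = step(state, action, rng)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

q_table = train()
print("server chosen when queues are (0, 3):", greedy(q_table, (0, 3)))
print("server chosen when queues are (3, 0):", greedy(q_table, (3, 0)))
```

In this toy setting the learned policy avoids the full queue; a realistic scheduler would have a far larger state space, which is where function approximation with deep networks comes in.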

Citations: 0
UCAD: commUnity disCovery method in Attribute-based multicoloreD networks
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-19 | DOI: 10.1007/s10115-024-02163-x
Félicité Gamgne Domgue, Norbert Tsopze, René Ndoundam

Many hierarchical methods for community detection in multicolored networks are capable of finding clusters when there is interslice correlation between layers. However, in general, they aggregate all the links in different layers, treating them as equivalent. Such aggregation might therefore ignore information about the relevance of a dimension in which a node is involved. In this paper, we fill this gap by proposing a hierarchical classification-based Louvain method for interslice-multicolored networks. In particular, we define a new node centrality measure named Attractivity to describe the inter-slice correlation; it incorporates within-dimension and across-dimension topological features in order to identify the relevant dimension. Then, after merging dimensions through a frequential aggregation, we group nodes by their relational and attribute similarity, where attributes correspond to their relevant dimensions. We conduct extensive experiments on seven real-world multicolored networks, including a comparison with state-of-the-art methods. Results show the significance of our proposed method in discovering relevant communities over multiple dimensions and highlight its ability to produce optimal covers with higher values of the multidimensional version of the modularity function.
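
To make the quantity that Louvain-style methods optimize more concrete, the sketch below computes standard Newman modularity on a graph obtained by merging layers. It rests on stated assumptions: `aggregate_layers` simply weights each edge by the number of layers containing it, which is only a stand-in for the paper's frequential aggregation, and `modularity` is the ordinary single-layer formula, not the multidimensional version the abstract refers to. The node names and layers are made up for illustration.

```python
from collections import defaultdict

def aggregate_layers(layers):
    """Weight each undirected edge by the number of layers that contain it."""
    weights = defaultdict(float)
    for edges in layers:
        for u, v in edges:
            key = (u, v) if u <= v else (v, u)   # canonical order for undirected edges
            weights[key] += 1.0
    return dict(weights)

def modularity(weights, community):
    """Newman modularity: Q = sum over communities c of w_in(c)/m - (deg(c)/(2m))^2."""
    m = sum(weights.values())                    # total edge weight
    internal = defaultdict(float)                # edge weight inside each community
    degree = defaultdict(float)                  # summed weighted degree per community
    for (u, v), w in weights.items():
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Two triangles joined by a bridge; the bridge appears in only one layer.
triangles = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f")]
layer_1 = triangles + [("c", "d")]
layer_2 = list(triangles)
weights = aggregate_layers([layer_1, layer_2])

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
print(modularity(weights, two_groups))   # 11/26, about 0.423
print(modularity(weights, one_group))    # 0.0
```

Splitting the graph into its two triangles scores higher than lumping all nodes together, which is the kind of comparison a Louvain pass makes when it decides whether to move a node.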

Citations: 0
Situational Data Integration in Question Answering systems: a survey over two decades
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-18 | DOI: 10.1007/s10115-024-02136-0
Maria Helena Franciscatto, Luis Carlos Erpen de Bona, Celio Trois, Marcos Didonet Del Fabro, João Carlos Damasceno Lima

Question Answering (QA) systems provide accurate answers to questions; however, they lack the ability to consolidate data from multiple sources, making it difficult to manage complex questions that could be answered with additional data retrieved and integrated on the fly. This integration is inherent to Situational Data Integration (SDI) approaches, which deal with the dynamic requirements of ad hoc queries that neither traditional database management systems nor search engines can answer effectively. Thus, if QA systems included SDI characteristics, they could return validated and immediate information to support users' decisions. For this reason, we surveyed QA-based systems, assessing their capabilities to support SDI features, i.e., Ad hoc Data Retrieval, Data Management, and Timely Decision Support. We also identified patterns concerning these features in the surveyed studies, highlighting them in a timeline that shows the SDI evolution in the QA domain. To the best of our knowledge, this study is a precursor in the joint analysis of SDI and QA, showing a combination that can improve the way systems support users. Our analyses show that most SDI features are rarely addressed in QA systems, and based on that, we discuss directions for further research.

Citations: 0
A hybrid storage blockchain-based query efficiency enhancement method for business environment evaluation
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-17 | DOI: 10.1007/s10115-024-02144-0
Su Li, Junlu Wang, Wanting Ji, Ze Chen, Baoyan Song

A favorable business environment plays a crucial role in facilitating the high-quality development of a modern economy. To enhance the credibility and efficiency of business environment evaluation, this paper proposes a hybrid storage blockchain-based query efficiency enhancement method. Currently, most blockchain systems store block data in key-value databases or file systems with simple semantic descriptions. However, such systems have a single query interface, limited supported query types, and high storage overhead, which leads to low performance. The proposed method tackles these challenges as follows. Firstly, data are stored in a hybrid data storage architecture combining on-chain and off-chain storage. Additionally, relational semantics are added to block data, and three index mechanisms are designed to expedite data access. Subsequently, corresponding query efficiency enhancement algorithms are designed for the query types applicable to these three index mechanisms, further refining the query processing. Finally, a comprehensive authentication query is implemented on the blockchain for the light client, so that the user can verify the soundness and integrity of the query results. Experimental results on three open datasets show that the proposed method significantly reduces storage overhead, achieves shorter query latency for three different query types, and improves retrieval performance and verification efficiency.
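
The general idea of combining on-chain and off-chain storage with verifiable queries can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's architecture: the class `HybridStore`, its single dictionary index, and the record layout are hypothetical, and a Python list stands in for a real blockchain. Only SHA-256 digests are kept "on-chain", full records live "off-chain", and a reader recomputes the digest to check integrity.

```python
import hashlib
import json

def digest(payload: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(payload).hexdigest()

class HybridStore:
    def __init__(self):
        self.chain = []        # "on-chain": small blocks holding only digests
        self.off_chain = {}    # "off-chain": full serialized records by record id
        self.index = {}        # record id -> block height (one simple lookup index)

    def put(self, record_id, record):
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        prev = self.chain[-1]["block_hash"] if self.chain else "0" * 64
        record_hash = digest(payload)
        # Each block commits to the previous block and to the record digest.
        block_hash = digest((prev + record_hash).encode("utf-8"))
        self.chain.append(
            {"prev": prev, "record_hash": record_hash, "block_hash": block_hash}
        )
        self.off_chain[record_id] = payload
        self.index[record_id] = len(self.chain) - 1

    def get_verified(self, record_id):
        """Fetch a record off-chain and check it against the on-chain digest."""
        payload = self.off_chain[record_id]
        block = self.chain[self.index[record_id]]
        if digest(payload) != block["record_hash"]:
            raise ValueError("off-chain record does not match on-chain digest")
        return json.loads(payload)

store = HybridStore()
store.put("region-1", {"indicator": "permit_days", "value": 12})
print(store.get_verified("region-1"))
```

Keeping only fixed-size digests in the chain is what reduces on-chain storage in this sketch, while the index avoids scanning blocks to locate a record.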

Citations: 0
GANCDE: Neural networks based on graphs and attention neural control differential equations for human activity recognition
IF 2.7 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-15 | DOI: 10.1007/s10115-024-02154-y
Tangzhi Teng, Jie Wan, XiaoFeng Zhang
Citations: 0
Extended ELECTRE method for multi-criteria group decision-making with spherical cubic fuzzy sets
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-14 DOI: 10.1007/s10115-024-02132-4
Ghous Ali, Muhammad Nabeel, Adeel Farooq
Citations: 0
Semantic similarity-aware feature selection and redundancy removal for text classification using joint mutual information
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-06-13 DOI: 10.1007/s10115-024-02143-1
Farek Lazhar, Benaidja Amira

The high dimensionality of text data is a challenging issue that requires efficient methods to reduce vector space and improve classification accuracy. Existing filter-based methods fail to address the redundancy issue, resulting in the selection of irrelevant and redundant features. Information theory-based methods effectively solve this problem but are not practical for large amounts of data due to their high time complexity. The proposed method, termed semantic similarity-aware feature selection and redundancy removal (SS-FSRR), employs joint mutual information between the pairs of semantically related terms and the class label to capture redundant features. It is predicated on the assumption that semantically related terms imply potentially redundant ones, which can significantly reduce execution time by avoiding sequential search strategies. In this work, we use Word2Vec’s CBOW model to obtain semantic similarity between terms. The efficiency of the SS-FSRR is compared to six state-of-the-art competitive selection methods for categorical data using two traditional classifiers (SVM and NB) and a robust deep learning model (LSTM) on seven datasets with 10-fold cross-validation, where experimental results show that the SS-FSRR outperforms the other methods on most tested datasets with high stability as measured by the Jaccard’s Index.

Citations: 0