Journal of Intelligent Information Systems: Latest Articles

TSUNAMI - an explainable PPM approach for customer churn prediction in evolving retail data environments
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-12-28 | DOI: 10.1007/s10844-023-00838-5
Vincenzo Pasquadibisceglie, Annalisa Appice, Giuseppe Ieva, Donato Malerba

Retail companies are greatly interested in continuously monitoring the purchase traces of customers, to identify weak customers and take the necessary actions to improve customer satisfaction and ensure their revenues remain unaffected. In this paper, we formulate the customer churn prediction problem as a Predictive Process Monitoring (PPM) problem to be addressed under the possibly dynamic conditions of evolving retail data environments. To this aim, we propose TSUNAMI as a PPM approach to monitor customer loyalty in the retail sector. It processes online the stream of sales receipts produced by the customers of a retail company and learns a deep neural model to detect early the customer purchase traces that are likely to end in churn. In addition, the proposed approach integrates a mechanism to detect concept drifts in customer purchase traces and adapts the deep neural model to them. Finally, to make the decisions of customer purchase monitoring explainable to potential stakeholders, we analyse the Shapley values of decisions to explain which characteristics of the customer purchase traces are the most relevant for disentangling churners from non-churners and how these characteristics have possibly changed over time. Experiments with two benchmark retail data sets explore the effectiveness of the proposed approach.
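
The abstract does not say which drift detector TSUNAMI uses. As a minimal sketch of the general idea only, the following assumes a Page-Hinkley test applied to a stream of per-batch error rates; the class name, the parameters `delta` and `threshold`, and the synthetic `errors` stream are illustrative assumptions, not taken from the paper.

```python
class PageHinkley:
    """Illustrative Page-Hinkley drift detector (an assumed stand-in,
    not necessarily the mechanism used by TSUNAMI)."""

    def __init__(self, delta=0.005, threshold=0.5):
        self.delta = delta          # magnitude of change that is tolerated
        self.threshold = threshold  # alarm threshold on the test statistic
        self.n = 0                  # number of observations seen so far
        self.mean = 0.0             # running mean of the observations
        self.cum = 0.0              # cumulative deviation from the mean
        self.min_cum = 0.0          # smallest cumulative deviation seen

    def update(self, x):
        """Feed one observation; return True if an upward drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold


def first_drift(stream, **kwargs):
    """Return the index of the first drift alarm in the stream, or None."""
    detector = PageHinkley(**kwargs)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


# Synthetic error rates: stable at 0.1, then jumping to 0.6 at index 50.
errors = [0.1] * 50 + [0.6] * 50
drift_index = first_drift(errors)
print("drift signalled at index:", drift_index)
```

In a pipeline of this kind, an alarm would typically trigger retraining or fine-tuning of the model on recent traces; how TSUNAMI adapts its network is described in the paper itself, not here.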

Citations: 0
A bayesian-neural-networks framework for scaling posterior distributions over different-curation datasets
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-12-26 | DOI: 10.1007/s10844-023-00837-6
Alfredo Cuzzocrea, Alessandro Baldo, Edoardo Fadda

In this paper, we propose and experimentally assess an innovative framework for scaling posterior distributions over different-curation datasets, based on Bayesian-Neural-Networks (BNN). Another innovation of our proposed study consists in enhancing the accuracy of the Bayesian classifier via intelligent sampling algorithms. The proposed methodology is relevant in emerging applicative settings, such as provenance detection and analysis and cybercrime. Our contributions are complemented by a comprehensive experimental evaluation and analysis over both static and dynamic image datasets. Derived results confirm the successful application of our proposed methodology to emerging big data analytics settings.

Citations: 0
Tell me what you Like: introducing natural language preference elicitation strategies in a virtual assistant for the movie domain
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-12-12 | DOI: 10.1007/s10844-023-00835-8
Cataldo Musto, Alessandro Francesco Maria Martina, Andrea Iovine, Fedelucio Narducci, Marco de Gemmis, Giovanni Semeraro

Preference elicitation is a crucial step for every recommendation algorithm. In this paper, we present a strategy that allows users to express their preferences and needs through natural language statements. In particular, our natural language preference elicitation pipeline allows users to express preferences on objective movie features (e.g., actors, directors, etc.) as well as on subjective features that are collected by mining user-written movie reviews. To validate our claims, we carried out a user study in the movie domain (N = 114). The main finding of our experiment is that users tend to express their preferences by using objective features, whose usage largely exceeds that of subjective features, which are more difficult to express. However, when users are also able to express their preferences in terms of subjective features, they obtain better recommendations in fewer conversation turns. We have also identified the main challenges that arise when users talk to the virtual assistant by using subjective features, and this paves the way for future developments of our methodology.

Citations: 0
Audio super-resolution via vision transformer
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-12-12 | DOI: 10.1007/s10844-023-00833-w
Simona Nisticò, Luigi Palopoli, Adele Pia Romano

Audio super-resolution refers to techniques that improve the quality of audio signals, usually by exploiting bandwidth extension methods, whereby audio enhancement is obtained by expanding the phase and the spectrogram of the input audio traces. These techniques are therefore highly significant in all those cases where audio traces miss relevant parts of the audible spectrum. In several cases, the given input signal contains the low-band frequencies (the easiest to capture with low-quality recording instruments) whereas the high band must be generated. In this paper, we illustrate techniques implemented in a bandwidth extension system that works on musical tracks and generates the high-band frequencies starting from the low-band ones. The system, called ViT Super-resolution (ViT-SR), features an architecture based on a Generative Adversarial Network and a Vision Transformer model. In particular, two versions of the architecture, which work on different input frequency ranges, are presented in this paper. Experiments reported in the paper demonstrate the effectiveness of our approach. In particular, they show that it is possible to faithfully reconstruct the high-band signal of an audio file having only its low-band spectrum available as input, including the harmonics occurring in the audio tracks, which are usually difficult to generate synthetically and which contribute significantly to the final perceived sound quality.
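
To make the low-band/high-band setting concrete, here is a small sketch of how a magnitude spectrum can be split into the part a bandwidth extension model would receive and the part it would have to generate. It uses a naive discrete Fourier transform on a toy signal; the function names, the 32-sample signal, and the cutoff bin are illustrative assumptions and say nothing about the actual ViT-SR preprocessing.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0 .. N/2 of a real-valued signal."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(coeff))
    return spectrum


def split_bands(spectrum, cutoff_bin):
    """Split a spectrum into (low band, high band) at the given bin index."""
    return spectrum[:cutoff_bin], spectrum[cutoff_bin:]


# Toy signal: a strong low-frequency tone (bin 2) plus a weaker
# high-frequency tone (bin 12), sampled at 32 points.
n_samples = 32
signal = [
    math.sin(2 * math.pi * 2 * t / n_samples)
    + 0.5 * math.sin(2 * math.pi * 12 * t / n_samples)
    for t in range(n_samples)
]

cutoff_bin = 8  # hypothetical boundary between model input and model target
low, high = split_bands(magnitude_spectrum(signal), cutoff_bin)
print("strongest low-band bin:", low.index(max(low)))
print("strongest high-band bin:", cutoff_bin + high.index(max(high)))
```

In this framing the low band plays the role of the model input and the high band the reconstruction target; a real system would work on short-time spectrograms rather than a single transform.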

Citations: 0
How can text mining improve the explainability of Food security situations?
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-12-11 | DOI: 10.1007/s10844-023-00832-x
Hugo Deléglise, Agnès Bégué, Roberto Interdonato, Elodie Maître d’Hôtel, Mathieu Roche, Maguelonne Teisseire

Food Security (FS) is a major concern in West Africa, particularly in Burkina Faso, which has been the epicenter of a humanitarian crisis since the beginning of this century. Early warning systems for FS and famines rely mainly on numerical data for their analyses, whereas textual data, which are more complex to process, are rarely used. However, this data is easy to access and represents a source of relevant information that is complementary to commonly used data sources. This study explores methods for obtaining the explanatory context associated with FS from textual data. Based on a corpus of local newspaper articles, we analyze FS over the last ten years in Burkina Faso. We propose an original and dedicated pipeline that combines different textual analysis approaches to obtain an explanatory model evaluated on real-world and large-scale data. The results of our analyses have proven how our approach provides significant results that offer distinct and complementary qualitative information on food security and its spatial and temporal characteristics.

Citations: 0
A mutually enhanced multi-scale relation-aware graph convolutional network for argument pair extraction
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-11-30 | DOI: 10.1007/s10844-023-00826-9
Xiaofei Zhu, Yidan Liu, Zhuo Chen, Xu Chen, Jiafeng Guo, Stefan Dietze

Argument pair extraction (APE) is a fine-grained task of argument mining which aims to identify arguments offered by different participants in some discourse and detect interaction relationships between arguments from different participants. In recent years, many research efforts have been devoted to dealing with APE in a multi-task learning framework. Although these approaches have achieved encouraging results, they still face several challenging issues. First, different types of sentence relationships as well as different levels of information exchange among sentences are largely ignored. Second, they model interactions between argument pairs with either an explicit or an implicit strategy only, neglecting the complementary effect of the two strategies. In this paper, we propose a novel Mutually Enhanced Multi-Scale Relation-Aware Graph Convolutional Network (MMR-GCN) for APE. Specifically, we first design a multi-scale relation-aware graph aggregation module to explicitly model the complex relationships between review and rebuttal passage sentences. In addition, we propose a mutual enhancement transformer module to implicitly and interactively enhance representations of review and rebuttal passage sentences. We experimentally validate MMR-GCN by comparing it with the state-of-the-art APE methods. Experimental results show that it considerably outperforms all baseline methods, and the relative performance improvement of MMR-GCN over the best performing baseline MRC-APE in terms of F1 score reaches 3.48% and 4.43% on the two benchmark datasets, respectively.

Citations: 0
T-shaped expert mining: a novel approach based on skill translation and focal loss
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-11-28 | DOI: 10.1007/s10844-023-00831-y
Zohreh Fallahnejad, Mahmood Karimian, Fatemeh Lashkari, Hamid Beigy

Hiring knowledgeable and cost-effective individuals, who use their knowledge and expertise to boost the organization, is extremely important for organizations as they are the most valuable assets. T-shaped experts are the best option based on agile methodology. The T-shaped professional has a deep understanding of one topic and broad knowledge of several others. Compared to other types of professionals, T-shaped professionals are better communicators and cheaper to hire. Finding T-shaped experts in a given skill area requires determining each candidate’s depth of knowledge and shape of expertise. To estimate each candidate’s depth of knowledge in a given skill area, we propose a translation-based method that utilizes two attention-based skill translation models to overcome the vocabulary mismatch between skills and user documents. We also propose two new approaches based on binary cross-entropy and focal loss to determine whether each user is T-shaped. Our experiments on three collections of the StackOverflow dataset demonstrate the efficiency of our proposed method compared to the state-of-the-art approaches.
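
The two loss functions named in the abstract can be compared with a short sketch. The formulas below are the standard binary cross-entropy and the standard binary focal loss; the default values `gamma=2.0` and `alpha=0.25` and the example probabilities are assumptions for illustration, and the abstract does not say how the authors configure them.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Standard binary cross-entropy for predicted probability p and label y."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Standard binary focal loss: down-weights well-classified examples."""
    p = min(max(p, eps), 1 - eps)
    p_t = p if y == 1 else 1 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1 - alpha  # class-balancing weight
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)


# An easy positive (confidently correct) and a hard positive (confidently wrong).
for name, p in (("easy", 0.9), ("hard", 0.1)):
    bce = binary_cross_entropy(p, 1)
    fl = focal_loss(p, 1)
    print(f"{name}: BCE={bce:.4f} focal={fl:.4f} ratio={fl / bce:.4f}")
```

The ratio column shows the design choice: focal loss shrinks the contribution of easy examples far more than that of hard ones, which is useful when one class (here, presumably T-shaped users) is rare.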

Citations: 0
Enhancing anomaly detectors with LatentOut
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-11-24 | DOI: 10.1007/s10844-023-00829-6
Fabrizio Angiulli, Fabio Fassetti, Luca Ferragina

LatentOut is a recently introduced algorithm for unsupervised anomaly detection which enhances latent space-based neural methods, namely (Variational) Autoencoders, GANomaly and ANOGan architectures. The main idea behind it is to exploit both the latent space and the baseline score of these architectures in order to provide a refined anomaly score by performing density estimation in the augmented latent-space/baseline-score feature space. In this paper we investigate the performance of LatentOut acting as a one-class classifier and we experiment with the combination of LatentOut and GAAL architectures, a novel type of Generative Adversarial Network for unsupervised anomaly detection. Moreover, we show that the feature space induced by LatentOut enhances the separation between normal and anomalous data. Indeed, we prove that standard data mining outlier detection methods perform better when applied to this novel augmented latent space rather than to the original data space.
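
The idea of scoring points in an augmented latent-space/baseline-score feature space can be illustrated with a toy example. The sketch below appends a baseline score to each latent vector and uses the mean distance to the k nearest neighbours as a stand-in density estimate; the choice of k-NN, the function names, and the hand-made data are assumptions, since the abstract does not specify the estimator LatentOut uses.

```python
import math


def augment(latents, baseline_scores):
    """Append each point's baseline anomaly score to its latent vector."""
    return [list(z) + [s] for z, s in zip(latents, baseline_scores)]


def knn_scores(points, k=2):
    """Mean distance to the k nearest neighbours (higher = sparser region)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical latent codes: the last point sits in the middle of the others,
# so it looks normal in latent space, but its baseline score is high.
latents = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
baseline = [0.10, 0.12, 0.11, 0.09, 0.90]

latent_only = knn_scores([list(z) for z in latents])
augmented = knn_scores(augment(latents, baseline))
print("most anomalous (latent only):", latent_only.index(max(latent_only)))
print("most anomalous (augmented):  ", augmented.index(max(augmented)))
```

In this toy case the suspicious point is only isolated once the baseline score is added as an extra feature, which is the kind of separation the abstract attributes to the augmented space.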

Citations: 0
A transformer-based framework for predicting geomagnetic indices with uncertainty quantification
IF 3.4 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2023-11-18 | DOI: 10.1007/s10844-023-00828-7
Yasser Abduallah, Jason T. L. Wang, Haimin Wang, Ju Jing

Geomagnetic activities have a crucial impact on Earth, which can affect spacecraft and electrical power grids. Geospace scientists use a geomagnetic index, called the Kp index, to describe the overall level of geomagnetic activity. This index is an important indicator of disturbances in the Earth’s magnetic field and is used by the U.S. Space Weather Prediction Center as an alert and warning service for users who may be affected by the disturbances. Another commonly used index, called the ap index, is converted from the Kp index. Early and accurate prediction of the Kp and ap indices is essential for preparedness and disaster risk management. In this paper, we present a deep learning framework, named GNet, to perform short-term forecasting of the Kp and ap indices. Specifically, GNet takes as input time series of solar wind parameters’ values, provided by NASA’s Space Science Data Coordinated Archive, and predicts as output the Kp and ap indices, respectively, at time point $t + w$ hours for a given time point $t$, where $w$ ranges from 1 to 9. GNet combines transformer encoder blocks with Bayesian inference, which is capable of quantifying both aleatoric uncertainty (data uncertainty) and epistemic uncertainty (model uncertainty) in making predictions. Experimental results show that GNet outperforms closely related machine learning methods in terms of the root mean square error and R-squared score. Furthermore, GNet can provide both data and model uncertainty quantification results, which the existing methods cannot offer. To our knowledge, this is the first time that Bayesian transformers have been used for geomagnetic activity prediction.
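
The forecasting setup and the two kinds of uncertainty described in the abstract can be illustrated with a minimal sketch. This is not GNet's implementation: the function names `make_windows` and `decompose_uncertainty`, the `history` parameter, and the use of repeated stochastic forward passes are assumptions made for illustration. The decomposition shown is the commonly used one, in which the average predicted variance stands for aleatoric uncertainty and the spread of the predicted means stands for epistemic uncertainty.

```python
from statistics import fmean, pvariance


def make_windows(series, targets, history, w):
    """Pair each input window ending at time t with the target at time t + w.

    `series` and `targets` are equally long, hourly-indexed sequences
    (hypothetical stand-ins for solar wind inputs and index values).
    """
    pairs = []
    for t in range(history - 1, len(series) - w):
        window = series[t - history + 1 : t + 1]  # the last `history` inputs up to t
        pairs.append((window, targets[t + w]))    # value to forecast, w hours ahead
    return pairs


def decompose_uncertainty(means, variances):
    """Split predictive variance over several stochastic forward passes.

    means[i] and variances[i] are the predicted mean and predicted noise
    variance from pass i (assumed model outputs, not taken from the paper).
    """
    aleatoric = fmean(variances)  # data uncertainty: average predicted noise
    epistemic = pvariance(means)  # model uncertainty: disagreement between passes
    return aleatoric, epistemic, aleatoric + epistemic
```

Under these assumptions, calling `make_windows` once for each `w` from 1 to 9 would give one training set per forecast horizon.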

Citations: 0
EqBal-RS: Mitigating popularity bias in recommender systems
Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2023-11-07, DOI: 10.1007/s10844-023-00817-w
Shivam Gupta, Kirandeep Kaur, Shweta Jain
Citations: 0