Pub Date: 2024-08-31 | DOI: 10.1007/s10844-024-00880-x
Allegra De Filippo, Emanuele Di Giacomo, Andrea Borghesi
Predicting the execution time of weather forecast models is a complex task, since these models are usually executed on High Performance Computing (HPC) systems and require large computing capabilities. A reliable prediction brings several benefits: improved planning of the model execution, better allocation of available resources, and the identification of possible anomalies. However, making such predictions is usually hard, since there is a scarcity of datasets that benchmark existing meteorological simulation models. In this work, we focus on runtime prediction for the COSMO (COnsortium for SMall-scale MOdeling) weather forecasting model used at the Hydro-Meteo-Climate Structure of the Regional Agency for the Environment and Energy Prevention Emilia-Romagna. We show how a range of Machine Learning approaches can obtain accurate runtime predictions for this complex model, by designing a new, well-defined benchmark for this application task. Our contribution is twofold: 1) the creation of a large public dataset reporting the runtime of COSMO runs under a variety of different configurations; 2) a comparative study of ML models, which greatly outperform the current state of practice used by the domain experts. This data collection represents an essential initial benchmark for this application field and a useful resource for analyzing the model's performance: better accuracy in runtime predictions could help facility owners improve job scheduling and resource allocation across the entire system, while, for a final user, a posteriori analysis could help identify anomalous runs.
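The regression setup described in the abstract can be sketched as follows. This is an illustrative example, not the paper's pipeline: the job-configuration features (node count, forecast horizon, grid resolution) and the synthetic runtime formula are invented stand-ins for the COSMO dataset, and the mean-runtime baseline is a generic stand-in for the domain experts' state of practice.

```python
# Hypothetical sketch: predict simulation runtime from job configuration
# features and compare an ML regressor against a naive mean baseline.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(1, 32, n),     # nodes allocated (invented feature)
    rng.integers(6, 73, n),     # forecast horizon in hours (invented feature)
    rng.uniform(1.0, 5.0, n),   # grid resolution in km (invented feature)
])
# Synthetic runtime: longer horizons and finer grids cost more; more nodes help.
y = 60 * X[:, 1] / (X[:, 2] * np.sqrt(X[:, 0])) + rng.normal(0, 20, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Naive baseline: always predict the mean runtime observed in training.
baseline_mae = mean_absolute_error(y_te, np.full(len(y_te), y_tr.mean()))
model_mae = mean_absolute_error(y_te, model.predict(X_te))
print(f"baseline MAE: {baseline_mae:.1f}s  model MAE: {model_mae:.1f}s")
```

On data with this much configuration-driven signal, the learned regressor comfortably beats the constant baseline, which mirrors the comparison the paper reports against the experts' current practice.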
Title: Machine learning approaches to predict the execution time of the meteorological simulation software COSMO (Journal of Intelligent Information Systems, IF 3.4)
Pub Date: 2024-08-22 | DOI: 10.1007/s10844-024-00881-w
Shuxia Ren, Zewei Guo, Xiaohan Li, Ruikun Zhong
Aspect-Based Sentiment Triplet Extraction (ASTE), a critical sub-task of Aspect-Based Sentiment Analysis (ABSA), has received extensive attention in recent years. ASTE aims to extract structured sentiment triplets from texts, with most existing studies focusing on designing new strategic frameworks. Nonetheless, these methods often overlook the complex characteristics of linguistic expression and the deeper semantic nuances, leading to deficiencies in extracting the semantic representations of triplets and effectively utilizing syntactic relationships in texts. To address these challenges, this paper introduces a span-based semantic and syntactic Dual-Enhanced model that deeply integrates rich syntactic information, such as part-of-speech tagging, constituent syntax, and dependency syntax structures. Specifically, we designed a semantic encoder and a syntactic encoder to capture the semantic-syntactic information closely related to the sentence's underlying intent. Through a Feature Interaction Module, we effectively integrate information across different dimensions and promote a more comprehensive understanding of the relationships between aspects and opinions. We also adopted a span-based tagging scheme that generates more precise aspect sentiment triplet extractions by exploring cross-level information and constraints. Experimental results on benchmark datasets derived from the SemEval challenge prove that our model significantly outperforms existing baselines.
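The span-based scheme the abstract refers to can be illustrated with the usual scaffolding: enumerate candidate token spans up to a maximum width, then pair spans as potential (aspect, opinion) candidates for classification. This is a generic sketch of span-based ASTE, not the paper's specific model; the width limit and pairing strategy are illustrative assumptions.

```python
# Generic span-based ASTE scaffolding: span enumeration and candidate pairing.
from itertools import product

def enumerate_spans(tokens, max_width=3):
    """All (start, end) spans, end inclusive, of at most max_width tokens."""
    return [(i, j) for i in range(len(tokens))
            for j in range(i, min(i + max_width, len(tokens)))]

def candidate_pairs(tokens, max_width=3):
    """Cartesian product of spans; each pair is a potential (aspect, opinion)."""
    spans = enumerate_spans(tokens, max_width)
    return [(a, o) for a, o in product(spans, spans) if a != o]

tokens = "the battery life is great".split()
spans = enumerate_spans(tokens)
print(len(spans), "candidate spans,", len(candidate_pairs(tokens)), "pairs")
```

A real model scores every pair with learned span representations (here, the semantic and syntactic encoders) and assigns a sentiment label, so keeping the candidate set small via a width limit matters for efficiency.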
Title: Span-based semantic syntactic dual enhancement for aspect sentiment triplet extraction
Pub Date: 2024-08-14 | DOI: 10.1007/s10844-024-00875-8
Anti Alman, Alessio Arleo, Iris Beerepoot, Andrea Burattin, Claudio Di Ciccio, Manuel Resinas
Knowledge-intensive processes represent a particularly challenging scenario for process mining. The flexibility that such processes allow constitutes a hurdle, as they are hard to capture in a single model. To tackle this problem, multiple visual representations of the same processes could be beneficial, each addressing different information dimensions according to the specific needs and background knowledge of the process workers and stakeholders. In this paper, we propose, describe, and evaluate a framework, named Tiramisù, that leverages visual analytics for the interactive visualization of multi-faceted process information, aimed at supporting users' investigation and insight generation in their process analysis tasks. Tiramisù is based on a multi-layer visualization methodology that includes a visual backdrop providing context and an arbitrary number of superimposed, on-demand dimension layers. This arrangement allows our framework to display process information from different perspectives and to project this information onto a domain-friendly representation of the context in which the process unfolds. We provide an in-depth description of the approach's founding principles, deeply rooted in visualization research, that justify our design choices for the whole framework. We demonstrate the feasibility of the framework through its application in two use-case scenarios in the context of healthcare and personal information management. In addition, we conducted qualitative evaluations with potential end users in both scenarios, gathering valuable insights about the efficacy and applicability of our framework to various application domains.
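The "backdrop plus stacked, on-demand layers" arrangement can be captured in a small data structure. This is a toy sketch of the layering idea only; the class names, layer names, and the hospital floor-plan backdrop are invented examples, not Tiramisù's actual API.

```python
# Toy model of a layered visualization: a context backdrop plus toggleable
# dimension layers rendered bottom-up.
from dataclasses import dataclass, field

@dataclass
class Layer:
    name: str
    visible: bool = True

@dataclass
class LayeredView:
    backdrop: str                      # domain-friendly context, e.g. a floor plan
    layers: list[Layer] = field(default_factory=list)

    def toggle(self, name: str) -> None:
        """Show or hide a dimension layer on demand."""
        for layer in self.layers:
            if layer.name == name:
                layer.visible = not layer.visible

    def render_order(self) -> list[str]:
        """Backdrop first, then the currently visible layers, bottom-up."""
        return [self.backdrop] + [l.name for l in self.layers if l.visible]

view = LayeredView("hospital-floor-plan",
                   [Layer("patient-flows"), Layer("waiting-times")])
view.toggle("waiting-times")
print(view.render_order())
```

Keeping layers independent of the backdrop is what lets the same process information be projected onto different domain representations, as the framework proposes.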
Title: Tiramisù: making sense of multi-faceted process information through time and space
Pub Date: 2024-08-13 | DOI: 10.1007/s10844-024-00873-w
Gyunam Park, Lukas Liss, Wil M. P. van der Aalst
This paper presents a novel approach for generating actionable recommendations from educational event data collected by Campus Management Systems (CMS) to enhance study planning in higher education. The approach unfolds in three phases: feature identification tailored to the educational context, predictive modeling employing the RuleFit algorithm, and extracting actionable recommendations. We utilize diverse features, encompassing academic histories and course sequences, to capture the multi-dimensional nature of student academic behaviors. The effectiveness of our approach is empirically validated using data from the computer science bachelor’s program at RWTH Aachen University, with the goal of predicting overall GPA and formulating recommendations to enhance academic performance. Our contributions lie in the novel adaptation of behavioral features for the educational domain and the strategic use of the RuleFit algorithm for both predictive modeling and the generation of practical recommendations, offering a data-driven foundation for informed study planning and academic decision-making.
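The RuleFit idea the abstract relies on, deriving binary rule features from a tree ensemble and then fitting a sparse linear model so that each surviving rule reads as a recommendation, can be sketched briefly. The student features and GPA-like target below are invented for illustration; this approximates the RuleFit recipe with scikit-learn parts rather than reproducing the paper's implementation.

```python
# Sketch of the RuleFit recipe: tree-ensemble leaves -> binary rule features
# -> sparse linear model whose nonzero weights mark the "active" rules.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n = 400
X = np.column_stack([
    rng.uniform(0, 30, n),   # credits passed in first year (invented feature)
    rng.uniform(0, 1, n),    # share of math courses taken early (invented)
])
y = 1.5 + 0.06 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(0, 0.2, n)  # GPA-like target

# Step 1: grow shallow trees; membership in each leaf is a binary rule feature.
ensemble = GradientBoostingRegressor(n_estimators=20, max_depth=2,
                                     random_state=0).fit(X, y)
leaves = ensemble.apply(X)               # (n_samples, n_trees) leaf indices
rules = np.column_stack([(leaves[:, t] == leaf).astype(float)
                         for t in range(leaves.shape[1])
                         for leaf in np.unique(leaves[:, t])])

# Step 2: sparse linear fit over rule features; L1 prunes uninformative rules.
lasso = Lasso(alpha=0.01).fit(rules, y)
active = int(np.count_nonzero(lasso.coef_))
print("active rules:", active, "of", rules.shape[1])
```

Each nonzero coefficient corresponds to a readable condition on the original features (a root-to-leaf path), which is what makes the approach usable for actionable study-planning advice rather than just prediction.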
Title: Learning recommendations from educational event data in higher education
Pub Date: 2024-08-13 | DOI: 10.1007/s10844-024-00878-5
Teng Fu, Gang Zhou
Temporal graph entity alignment aims at finding equivalent entity pairs across different temporal knowledge graphs (TKGs). Existing methods primarily adopt time-aware and relationship-aware approaches to embed and align entities. However, long-tail entities in TKGs still restrict alignment accuracy: their limited neighborhood information hampers the construction of high-quality embeddings and hence the effectiveness of entity alignment in the representation space. Moreover, most previous research is supervised, with heavy dependence on seed labels for alignment, restricting applicability in scenarios with limited resources. To tackle these challenges, we propose Temporal Knowledge Completion enhanced Self-supervised Entity Alignment (TSEA). We argue that, with high-quality embeddings, entities can be aligned in a self-supervised manner. To this end, TSEA consists of two modules: a graph completion module that predicts missing links for long-tail entities, and a self-supervised entity alignment module that operates on the improved graph to achieve unsupervised alignment. Experimental results on widely adopted benchmarks demonstrate improved performance compared to several recent baseline methods. Additional ablation experiments further corroborate the efficacy of the proposed modules.
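The claim that high-quality embeddings make alignment possible without seed labels can be illustrated with the simplest possible alignment step: nearest-neighbor matching by cosine similarity. The embeddings below are random stand-ins, not trained TKG embeddings, and real self-supervised alignment adds refinements this sketch omits.

```python
# Toy alignment step: match each entity in graph A to its nearest neighbor
# in graph B by cosine similarity, with no supervision.
import numpy as np

def cosine_align(emb_a: np.ndarray, emb_b: np.ndarray) -> np.ndarray:
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                      # pairwise cosine similarities
    return sim.argmax(axis=1)          # best match in B for each entity in A

rng = np.random.default_rng(0)
base = rng.normal(size=(10, 16))                         # "graph A" embeddings
noisy = base + rng.normal(scale=0.05, size=base.shape)   # same entities, perturbed
matches = cosine_align(base, noisy)
print(matches)
```

When the two embedding spaces agree (small perturbation), nearest-neighbor matching recovers the identity mapping; when long-tail entities have poor embeddings, their neighbors become unreliable, which is exactly the failure mode the completion module is meant to mitigate.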
Title: Temporal knowledge completion enhanced self-supervised entity alignment
Pub Date: 2024-08-07 | DOI: 10.1007/s10844-024-00870-z
Ahmed A. Ewees, Marwa A. Gaheen, Mohammed M. Alshahrani, Ahmed M. Anter, Fatma H. Ismail
This paper introduces MPAG, a new feature selection method aimed at overcoming the limitations of the conventional Marine Predators Algorithm (MPA). The MPA may experience stagnation and become trapped in local optima during optimization. To address this challenge, we propose a refined version of the MPA, termed MPAG, which incorporates the Local Escape Operator (LEO) from the gradient-based optimizer (GBO). By leveraging the LEO operator, MPAG enhances the exploration ability of the MPA, particularly during the initial one-third of iterations. This enhancement injects more diversity into populations, thereby improving search space discovery and mitigating the risk of premature convergence. The performance of MPAG is evaluated on 14 feature selection benchmark datasets, employing seven performance measures including fitness value, classification accuracy, and the number of selected features. Our findings indicate that MPAG outperforms other algorithms on 86% of the datasets, underscoring its capability to select the most relevant features across various datasets while maintaining stability. Additionally, MPAG is evaluated on two cybersecurity applications, specifically spam detection datasets, where it demonstrates superior performance across most performance measures compared to other methods.
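Wrapper-based feature selection methods of this kind optimize a fitness function that trades classification error against the number of selected features. The sketch below shows one common formulation of that fitness; the weighting alpha = 0.99 is a frequent convention in the metaheuristic feature-selection literature, not necessarily the paper's exact setting, and iris with a KNN classifier stands in for the benchmark datasets.

```python
# A common wrapper-selection fitness: lower is better, balancing cross-validated
# classification error against the fraction of features kept.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

def fitness(mask: np.ndarray, alpha: float = 0.99) -> float:
    """alpha * CV error + (1 - alpha) * selected-feature ratio."""
    if not mask.any():
        return 1.0                               # selecting nothing is worst
    acc = cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=5).mean()
    return alpha * (1 - acc) + (1 - alpha) * mask.mean()

all_features = np.array([True, True, True, True])
petal_only = np.array([False, False, True, True])
print(f"all: {fitness(all_features):.4f}  petal only: {fitness(petal_only):.4f}")
```

A metaheuristic such as MPA (or the proposed MPAG) searches over binary masks like these, using operators such as LEO to escape local optima in that discrete search space.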
Title: Improved machine learning technique for feature reduction and its application in spam email detection
Entity relation extraction is a key technology for extracting structured information from unstructured text and serves as the foundation for building large-scale knowledge graphs. Current joint entity relation extraction methods primarily focus on improving the recognition of overlapping triplets to enhance the overall performance of the model. However, such models still face numerous challenges in managing intra-triplet and inter-triplet interactions, expanding the breadth of semantic encoding, and reducing information redundancy during the extraction process. These issues make it difficult to achieve satisfactory performance in both normal and overlapping triplet extraction. To address these challenges, this study proposes a comprehensive prediction network with multi-feature semantic fusion. We developed a semantic fusion module that integrates entity mask embedding sequences, which strengthen connections between entities, and context embedding sequences, which provide richer semantic information, to enhance inter-triplet interactions and expand semantic encoding. A parallel decoder then generates a set of triplets simultaneously, improving the interaction among them. Additionally, we utilize an entity mask sequence to finely prune these triplets, optimizing the final triplet set. Experimental results on the publicly available NYT and WebNLG datasets demonstrate that, with BERT as the encoder, our model outperforms baseline models in terms of accuracy and F1 score.
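The fusion of the two embedding views (entity-mask sequence and context sequence) can be sketched with a simple gated combination. This is an illustrative mechanism only: the gating formulation, dimensions, and random inputs below are assumptions for demonstration, not the paper's actual fusion module.

```python
# Sketch of fusing two per-token embedding views with a learned sigmoid gate.
import numpy as np

def fuse(mask_emb: np.ndarray, ctx_emb: np.ndarray,
         W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Per-token, per-dimension gate decides how much of each view to keep."""
    z = np.concatenate([mask_emb, ctx_emb], axis=-1) @ W + b
    gate = 1 / (1 + np.exp(-z))                  # sigmoid in (0, 1)
    return gate * mask_emb + (1 - gate) * ctx_emb

rng = np.random.default_rng(0)
seq_len, d = 6, 8
mask_emb = rng.normal(size=(seq_len, d))   # stand-in entity-mask embeddings
ctx_emb = rng.normal(size=(seq_len, d))    # stand-in context embeddings
W = rng.normal(size=(2 * d, d)) * 0.1      # stand-in learned projection
b = np.zeros(d)

fused = fuse(mask_emb, ctx_emb, W, b)
print(fused.shape)
```

The fused sequence keeps one vector per token, so it can feed a downstream decoder unchanged while carrying information from both views.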
Title: Joint entity and relation extraction with fusion of multi-feature semantics
Ting Wang, Wenjie Yang, Tao Wu, Chuan Yang, Jiaying Liang, Hongyang Wang, Jia Li, Dong Xiang, Zheng Zhou
Pub Date: 2024-08-01 | DOI: 10.1007/s10844-024-00871-y
Pub Date: 2024-07-30 | DOI: 10.1007/s10844-024-00869-6
Vitor A. Batista, Diogo S. M. Gomes, Alexandre Evsukoff
Question Answering is one of the most relevant areas in the field of Natural Language Processing, rapidly evolving with promising results due to the increasing availability of suitable datasets and the advent of new technologies, such as Generative Models. This article introduces SESAME, a Self-supervised framework for Extractive queStion Answering over docuMent collEctions. SESAME aims to enhance open-domain question answering (ODQA) systems by leveraging domain adaptation with synthetic datasets, enabling efficient question answering over private document collections with low resource usage. The framework incorporates recent advances in large language models and an efficient hybrid method for context retrieval. We conducted several sets of experiments with the Machine Reading for Question Answering (MRQA) 2019 Shared Task datasets, FAQuAD (a Brazilian Portuguese reading comprehension dataset), Wikipedia, and the Retrieval-Augmented Generation Benchmark to demonstrate SESAME's effectiveness. The results indicate that SESAME's domain adaptation using synthetic data significantly improves QA performance, generalizes across different domains and languages, and competes with or surpasses state-of-the-art systems in ODQA. Finally, SESAME is an open-source tool, and all code, datasets, and experimental data are available for public use in our repository.
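The hybrid context retrieval mentioned above is commonly realized by blending a lexical signal (e.g. BM25) with a dense-embedding signal per passage. The sketch below shows one standard way to combine them; the scores are made-up numbers and the equal interpolation weight is a common default, not SESAME's actual configuration.

```python
# Generic hybrid retrieval: min-max normalize each score signal, interpolate,
# and rank passages by the combined score.
import numpy as np

def hybrid_rank(lexical: np.ndarray, dense: np.ndarray, w: float = 0.5):
    """Return passage indices ranked best-first by the blended score."""
    def norm(s):
        return (s - s.min()) / (s.max() - s.min() + 1e-9)
    combined = w * norm(lexical) + (1 - w) * norm(dense)
    return np.argsort(-combined)

lexical = np.array([12.1, 3.4, 8.7, 0.5])   # per-passage lexical scores (invented)
dense = np.array([0.82, 0.91, 0.40, 0.15])  # per-passage cosine sims (invented)
ranking = hybrid_rank(lexical, dense)
print(ranking)
```

Normalizing before interpolation matters because lexical and dense scores live on different scales; without it, whichever signal has the larger range dominates the ranking.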
Title: SESAME - self-supervised framework for extractive question answering over document collections
Pub Date : 2024-07-25 DOI: 10.1007/s10844-024-00867-8
Camilla Sancricca, Giovanni Siracusa, Cinzia Cappiello
Data play a key role in AI systems that support decision-making processes. Data-centric AI highlights the importance of high-quality input data for obtaining reliable results. However, preparing data well for machine learning is becoming difficult due to the variety of data quality issues and of available data preparation tasks. For this reason, approaches that help users perform this demanding phase are needed. This work proposes DIANA, a framework for data-centric AI that supports data exploration and preparation, suggesting suitable cleaning tasks to obtain valuable analysis results. We design an adaptive self-service environment that can handle the analysis and preparation of different types of sources, i.e., tabular and streaming data. The central component of our framework is a knowledge base that collects evidence on the effectiveness of data preparation actions, along with the type of input data and the machine learning model considered. In this paper, we first describe the framework, the knowledge base model, and its enrichment process. Then, we present the experiments conducted to enrich the knowledge base in a particular case study: time series data streams.
{"title":"Enhancing data preparation: insights from a time series case study","authors":"Camilla Sancricca, Giovanni Siracusa, Cinzia Cappiello","doi":"10.1007/s10844-024-00867-8","DOIUrl":"https://doi.org/10.1007/s10844-024-00867-8","url":null,"abstract":"<p>Data play a key role in AI systems that support decision-making processes. Data-centric AI highlights the importance of having high-quality input data to obtain reliable results. However, well-preparing data for machine learning is becoming difficult due to the variety of data quality issues and available data preparation tasks. For this reason, approaches that help users in performing this demanding phase are needed. This work proposes DIANA, a framework for data-centric AI to support data exploration and preparation, suggesting suitable cleaning tasks to obtain valuable analysis results. We design an adaptive self-service environment that can handle the analysis and preparation of different types of sources, i.e., tabular, and streaming data. The central component of our framework is a knowledge base that collects evidence related to the effectiveness of the data preparation actions along with the type of input data and the considered machine learning model. In this paper, we first describe the framework, the knowledge base model, and its enrichment process. 
Then, we show the experiments conducted to enrich the knowledge base in a particular case study: time series data streams.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"78 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141777860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
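The knowledge-base idea described in the DIANA abstract can be sketched as a toy evidence store: for each (data type, model) context it records the observed accuracy gain of each preparation action, then suggests the action with the best average gain. Class and method names below are hypothetical illustrations, not DIANA's implementation.

```python
from collections import defaultdict

class PrepKnowledgeBase:
    """Toy evidence store for data-preparation effectiveness.

    Maps a (data_type, model) context to a list of (action, accuracy_gain)
    observations and suggests the action with the highest mean gain.
    """

    def __init__(self):
        self.evidence = defaultdict(list)  # (data_type, model) -> [(action, gain)]

    def record(self, data_type: str, model: str, action: str, accuracy_gain: float):
        """Store one piece of evidence from a completed pipeline run."""
        self.evidence[(data_type, model)].append((action, accuracy_gain))

    def suggest(self, data_type: str, model: str):
        """Return the best-scoring preparation action, or None if no evidence."""
        runs = self.evidence.get((data_type, model), [])
        if not runs:
            return None
        gains = defaultdict(list)
        for action, gain in runs:
            gains[action].append(gain)
        return max(gains, key=lambda a: sum(gains[a]) / len(gains[a]))
```

The enrichment process described in the paper would correspond to repeated `record` calls over experimental runs (e.g., on time series streams), gradually sharpening the suggestions.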
Pub Date : 2024-07-10 DOI: 10.1007/s10844-024-00858-9
Yang Shi, Jinglang Cai, Lei Liao
The effectiveness of multimodal sentiment analysis (MSA) hinges on the seamless integration of information from diverse modalities, where the quality of modality fusion directly influences sentiment analysis accuracy. Prior methods often rely on intricate fusion strategies, raising computational costs and potentially yielding inaccurate multimodal representations due to distribution gaps and information redundancy across heterogeneous modalities. This paper centers on the backpropagation of loss and introduces a Transformer-based model called Multi-Task Learning and Mutual Information Maximization with Crossmodal Transformer (MMMT). To address inaccurate multimodal representations in MSA, MMMT combines mutual information maximization with a crossmodal Transformer to convey more modality-invariant information to the multimodal representation, fully exploiting modal commonalities. Notably, it uses multimodal labels for unimodal training, offering a fresh perspective on multi-task learning in MSA. Comparative experiments on the CMU-MOSI and CMU-MOSEI datasets demonstrate that MMMT improves accuracy while reducing computational burden, making it suitable for resource-constrained application scenarios that require real-time performance. Additionally, ablation experiments validate the efficacy of multi-task learning and probe the specific impact of combining mutual information maximization with the Transformer in MSA.
{"title":"Multi-task learning and mutual information maximization with crossmodal transformer for multimodal sentiment analysis","authors":"Yang Shi, Jinglang Cai, Lei Liao","doi":"10.1007/s10844-024-00858-9","DOIUrl":"https://doi.org/10.1007/s10844-024-00858-9","url":null,"abstract":"<p>The effectiveness of multimodal sentiment analysis hinges on the seamless integration of information from diverse modalities, where the quality of modality fusion directly influences sentiment analysis accuracy. Prior methods often rely on intricate fusion strategies, elevating computational costs and potentially yielding inaccurate multimodal representations due to distribution gaps and information redundancy across heterogeneous modalities. This paper centers on the backpropagation of loss and introduces a Transformer-based model called Multi-Task Learning and Mutual Information Maximization with Crossmodal Transformer (MMMT). Addressing the issue of inaccurate multimodal representation for MSA, MMMT effectively combines mutual information maximization with crossmodal Transformer to convey more modality-invariant information to multimodal representation, fully exploring modal commonalities. Notably, it utilizes multi-modal labels for uni-modal training, presenting a fresh perspective on multi-task learning in MSA. Comparative experiments on the CMU-MOSI and CMU-MOSEI datasets demonstrate that MMMT improves model accuracy while reducing computational burden, making it suitable for resource-constrained and real-time performance-requiring application scenarios. 
Additionally, ablation experiments validate the efficacy of multi-task learning and probe the specific impact of combining mutual information maximization with Transformer in MSA.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"16 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141567460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
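Mutual information maximization between paired modality embeddings is commonly realized with an InfoNCE-style contrastive objective, a lower bound on mutual information: each sample's embedding in one modality should score highest against its own pair in the other modality. The plain-Python sketch below illustrates this general kind of term; it is an assumption about the flavor of objective involved, not MMMT's actual loss formulation.

```python
import math

def cosine(u, v) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def info_nce(text_batch, audio_batch, temperature=0.1) -> float:
    """Mean InfoNCE loss over a batch of paired modality embeddings.

    For each text embedding i, the matching audio embedding i is the
    positive and all other audio embeddings are negatives. Lower loss
    corresponds to a higher mutual-information lower bound.
    """
    losses = []
    for i, t in enumerate(text_batch):
        logits = [cosine(t, a) / temperature for a in audio_batch]
        m = max(logits)  # log-sum-exp with max-shift for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(-(logits[i] - log_denom))
    return sum(losses) / len(losses)
```

With perfectly aligned pairs the loss approaches zero, while mismatched pairs drive it up; in a training loop this term would be added to the task losses so that gradients pull the modality encoders toward shared, modality-invariant information.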