2022 14th International Conference on Knowledge and Systems Engineering (KSE): Latest Papers

tBART: Abstractive summarization based on the joining of Topic modeling and BART
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953613
Binh Dang, Dinh-Truong Do, Le-Minh Nguyen
Topic information has been helpful to direct semantics in text summarization. In this paper, we present a study on a novel and efficient method to incorporate the topic information with the BART model for abstractive summarization, called the tBART. The proposed model inherits the advantages of the BART, learns latent topics, and transfers the topic vector of tokens to context space by an align function. The experimental results illustrate the effectiveness of our proposed method, which significantly outperforms previous methods on two benchmark datasets: XSUM and CNN/DAILY MAIL.
Citations: 1
Copyright Page
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953795
Presents the copyright information for the conference. May include reprint permission information.
Citations: 0
DWEN: A novel method for accurate estimation of cell type compositions from bulk data samples
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953757
Duc Tran, Ha Nguyen, Hung Nguyen, Tin Nguyen
Advances in single-cell RNA sequencing (scRNA-seq) technologies have allowed us to study the heterogeneity of cell populations. The cell compositions of tissues from different hosts may vary greatly, indicating the condition of the hosts from which the samples are collected. However, the high sequencing cost and the lack of fresh tissues make single-cell approaches less appealing. In many cases, it is practically impossible to generate single-cell data for a large number of subjects, making it challenging to monitor changes in cell type compositions in various diseases. Here we introduce a novel approach, named Deconvolution using Weighted Elastic Net (DWEN), that allows researchers to accurately estimate the cell type compositions from bulk data samples without the need to generate single-cell data. It also allows for the re-analysis of bulk data collected from rare conditions to extract more in-depth cell-type-level insights. The approach consists of two modules. The first module constructs the cell type signature matrix from single-cell data, while the second module estimates the cell type compositions of input bulk samples. In an extensive analysis using 20 datasets generated from scRNA-seq data of different human tissues, we demonstrate that DWEN outperforms current state-of-the-art methods in estimating cell type compositions of bulk samples.
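The second module described above (estimating compositions from a signature matrix) can be illustrated with a minimal sketch. This is not the DWEN implementation: the abstract does not give the weighting scheme or the solver, so the per-gene weights `w`, the penalty settings `lam` and `alpha`, the coordinate-descent solver, and the toy signature matrix are all assumptions made for illustration. The sketch fits a non-negative weighted elastic net and normalises the coefficients so they sum to one.

```python
def weighted_elastic_net_nonneg(S, y, w=None, lam=1e-6, alpha=0.5, n_iter=500):
    """Cyclic coordinate descent for
        0.5 * sum_g w[g] * (y[g] - sum_k S[g][k] * p[k])**2
        + lam * (alpha * |p|_1 + 0.5 * (1 - alpha) * |p|_2^2),  with p >= 0.
    S is genes x cell types, y is one bulk sample (hypothetical inputs)."""
    n_genes, n_types = len(S), len(S[0])
    w = w or [1.0] * n_genes          # uniform weights unless the caller supplies some
    p = [0.0] * n_types
    for _ in range(n_iter):
        for k in range(n_types):
            rho, z = 0.0, 0.0
            for g in range(n_genes):
                # Residual of gene g with cell type k left out of the fit.
                r = y[g] - sum(S[g][j] * p[j] for j in range(n_types) if j != k)
                rho += w[g] * S[g][k] * r
                z += w[g] * S[g][k] ** 2
            # Soft-threshold, then clip at zero to keep proportions non-negative.
            p[k] = max(0.0, rho - lam * alpha) / (z + lam * (1.0 - alpha))
    return p


def estimate_composition(S, y, **kwargs):
    """Normalise the fitted coefficients so they can be read as proportions."""
    p = weighted_elastic_net_nonneg(S, y, **kwargs)
    total = sum(p)
    return [v / total for v in p] if total > 0 else p


# Toy data: 4 genes, 2 cell types, bulk sample mixed at 30% / 70%.
signature = [[10.0, 1.0], [1.0, 8.0], [5.0, 5.0], [0.0, 3.0]]
true_props = [0.3, 0.7]
bulk = [sum(s * t for s, t in zip(row, true_props)) for row in signature]
est = estimate_composition(signature, bulk)
```

On this noise-free toy mixture the estimate recovers the mixing proportions closely; real bulk data would need a signature matrix built from single-cell profiles and tuned penalties.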
Citations: 0
Multilingual Natural Language Understanding for the FPT.AI Conversational Platform
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953794
Hong Phuong Le, Thi Thuy Linh Nguyen, Minh Tu Pham, Thanh Hai Vu
This paper presents a multilingual natural language understanding model which is based on BERT and ELECTRA neural networks. The model is pre-trained and fine-tuned on large datasets of four languages: Indonesian, Malaysian, Japanese and Vietnamese. Our fine-tuning method uses an attentional recurrent neural network instead of the common fine-tuning with linear layers. The proposed model is evaluated on several standard benchmark datasets, including intent classification, named entity recognition and sentiment analysis. For Indonesian and Malaysian, our model achieves the same or higher results compared to the existing state-of-the-art IndoNLU and Bahasa ELECTRA models for these languages. For Japanese, our model achieves promising results on sentiment analysis and two-layer named entity recognition. For Vietnamese, our model improves the performance of two sequence labeling tasks, part-of-speech tagging and named entity recognition, compared to the state-of-the-art results. The model has been deployed as a core component of the commercial FPT.AI conversational platform, effectively serving many clients in the Indonesian, Malaysian, Japanese and Vietnamese markets; the platform served 62 million API requests for chatbot services in the first five months of 2022 (including requests deployed for on-premise contracts).
Citations: 0
Using bliss points to enhance direction based multi-objective algorithms
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953747
Minh Tran Binh, Long Nguyen, D. N. Duc
Using improvement direction to control the evolution of multi-objective optimization algorithms is an interesting and effective method. Improvement direction techniques often evaluate the geometric properties of the solution set in the objective space and, based on that, adjust the evolutionary process to ensure it is capable of both exploration and exploitation. The direction of improvement is usually determined from the convergence and diversity of the solution population. In fact, the distribution of the solution population can suggest an online adjustment of the evolutionary process to overcome the problem of keeping the balance between convergence and diversity. In this study, we identify empty regions in the solution population and use the centers of those regions, which we call bliss points, to direct and adjust algorithms that use improvement directions, in order to enhance their quality. Experiments show competitive results, and the approach promises to apply to multi-objective evolutionary algorithms using other geometric techniques.
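As a rough illustration of the idea of locating an empty region and taking its center, the sketch below handles a two-objective front. The abstract does not say how empty regions are detected, so treating "the widest gap between neighbouring solutions" as the empty region, and its midpoint as the bliss point, is an assumption; the function name `bliss_point` is hypothetical.

```python
import math


def bliss_point(front):
    """Midpoint of the widest gap between neighbouring solutions
    in a 2-objective front (a stand-in for 'center of an empty region')."""
    pts = sorted(front)                      # order solutions by the first objective
    if len(pts) < 2:
        raise ValueError("need at least two solutions")
    # Pick the consecutive pair that is farthest apart in objective space.
    a, b = max(zip(pts, pts[1:]), key=lambda pair: math.dist(*pair))
    return tuple((u + v) / 2 for u, v in zip(a, b))


# Toy front with a visible hole between (0.2, 0.6) and (0.8, 0.1).
front = [(0.0, 1.0), (0.1, 0.8), (0.2, 0.6), (0.8, 0.1), (1.0, 0.0)]
bp = bliss_point(front)
```

A direction-based algorithm could then bias its search toward `bp` to fill the gap; how that steering is done is outside what the abstract specifies.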
Citations: 0
An Improved Method of The Static Directed Automated Random Testing Method in Test Data Generation for C/C++ Projects
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953772
Hoang-Viet Tran, Pham Ngoc Hung
Concolic testing is well known among software quality assurance methods thanks to its fully automated capability of generating test data, executing them, and producing code coverage reports. This paper presents ISDART, an improved version of SDART, one of the most recent advanced methods based on concolic testing, to increase its performance. The key idea of the proposed method is to eliminate the time wasted on generating and executing random test data that do not increase code coverage. Initially, ISDART generates random test data only once. Then, using the code coverage information retrieved from the randomly generated test data, ISDART explores an uncovered test path, transforms it into test path constraints, solves those constraints, and generates new test data from the resulting solution. The process is repeated until no uncovered test path can be found. We have implemented both SDART and ISDART and performed experiments with some common unit functions. The experimental results show that ISDART outperforms SDART in terms of speed for the whole testing process while reducing the number of generated test data.
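The loop described above (random data once, then constraint solving for each uncovered path) can be sketched on a toy example. Everything here is illustrative rather than the authors' tool: the unit under test `unit`, the hand-written path table `PATHS`, and the brute-force `solve` (a stand-in for a real constraint solver) are all assumptions, and the sketch is written in Python rather than targeting C/C++ code.

```python
import itertools
import random


def unit(x, y):
    """Toy unit under test; the return value names the path taken."""
    if x > 5:
        if y == x + 1:
            return "A"
        return "B"
    if y < 0:
        return "C"
    return "D"


# Statically enumerated test paths, each as a list of branch constraints.
PATHS = {
    "A": [lambda x, y: x > 5, lambda x, y: y == x + 1],
    "B": [lambda x, y: x > 5, lambda x, y: y != x + 1],
    "C": [lambda x, y: x <= 5, lambda x, y: y < 0],
    "D": [lambda x, y: x <= 5, lambda x, y: y >= 0],
}


def solve(constraints, domain=range(-10, 11)):
    """Brute-force search over a small domain, standing in for a solver."""
    for x, y in itertools.product(domain, repeat=2):
        if all(c(x, y) for c in constraints):
            return (x, y)
    return None


def generate_tests(n_random=3, seed=0):
    rng = random.Random(seed)
    tests, covered = [], set()
    # Step 1: random test data, generated only once; keep inputs that add coverage.
    for _ in range(n_random):
        t = (rng.randint(-10, 10), rng.randint(-10, 10))
        path = unit(*t)
        if path not in covered:
            covered.add(path)
            tests.append(t)
    # Step 2: for each path still uncovered, solve its constraints for a new input.
    for label, constraints in PATHS.items():
        if label in covered:
            continue
        t = solve(constraints)
        if t is not None and unit(*t) == label:
            covered.add(label)
            tests.append(t)
    return tests, covered


tests, covered = generate_tests()
```

Because each kept input covers a new path, the result holds exactly one test per reachable path, which mirrors the stated goal of fewer generated test data.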
Citations: 0
Distributionally Robust Fractional 0-1 Programming
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953793
T. Dam, Thuy Anh Ta
This work concerns a stochastic fractional 0-1 program whose coefficients are assumed to be random and follow a given distribution. To solve such a problem, one would need to sample over the randomness of the coefficients. However, in many situations the sample size is limited, which makes it difficult for existing approaches (e.g., the sample average approximation approach) to give good solutions. To deal with this issue, we explore a distributionally robust optimization (DRO) version of the fractional problem. We show that the DRO can be reformulated as an equivalent variance regularization version and can be further transformed into a mixed-integer second-order cone program (MISOCP), which an off-the-shelf solver (i.e., CPLEX) can handle. We then report computational results comparing our robust method against the conventional sample average approximation (SAA) on synthetic instances. Our results show that our approach is more effective than the SAA approach in protecting the decision-maker against bad scenarios.
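The contrast between SAA and a variance-regularized objective can be seen on a tiny instance. The sketch below is a generic illustration, not the paper's formulation or its MISOCP: the single-ratio objective, the `penalty` coefficient standing in for the regularization weight, the brute-force enumeration, and the two hand-picked samples are all assumptions.

```python
from itertools import product
from statistics import mean, pstdev


def ratio(x, a, b, a0=0.0, b0=1.0):
    """Fractional objective (a0 + a.x) / (b0 + b.x) for a 0-1 vector x."""
    num = a0 + sum(ai * xi for ai, xi in zip(a, x))
    den = b0 + sum(bi * xi for bi, xi in zip(b, x))
    return num / den


def best_solution(samples, b, penalty):
    """Maximise sample mean minus penalty * sample standard deviation
    by enumerating all 0-1 vectors (only viable for tiny instances)."""
    def score(x):
        vals = [ratio(x, a, b) for a in samples]
        return mean(vals) - penalty * pstdev(vals)
    return max(product((0, 1), repeat=len(b)), key=score)


# Two samples of the numerator coefficients: item 1 is stable, item 2 is volatile.
samples = [(2.0, 0.0), (2.0, 6.0)]
b = (1.0, 1.0)

saa = best_solution(samples, b, penalty=0.0)     # plain sample average
robust = best_solution(samples, b, penalty=1.0)  # variance-regularized
```

With no penalty the volatile item is selected because it raises the average; with the penalty the solution drops it, trading some expected value for protection against the bad sample.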
Citations: 0
Sentiment Classification for Beauty-fashion Reviews
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953782
L. Tran, Binh Van Duong, Binh T. Nguyen
The fast growth of e-commerce markets helps companies bring their products closer to customers and gives users many choices for online shopping. However, it also creates a need for every company to have a proper strategy for retaining customers. As a rising solution, sentiment analysis of users’ feedback using artificial intelligence is a timely way for business owners to understand their customers and clients, which could help them improve their business against competitors. Therefore, in the scope of our research, we introduce our results on the task of customers’ review sentiment analysis using the dataset provided in the Fashion and Beauty Review Rating (a competition organized on Kaggle), where our solution reached first place with a score of 0.51269 RMSE. Our proposed solution combines deep learning models (Bidirectional Long Short-term Memory, Bidirectional Gated Recurrent Unit, Convolutional Neural Network) and a rule-based method (a method that uses linguistic rules to predict the rating of reviews). We describe the solution in this paper, supported by analysis techniques, to offer further insight.
Citations: 1
Automatic Region of Interest Prediction from Instructor’s Behaviors in Lecture Archives
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953765
Yuhui Yang, Koichi Ota, Wen Gu, S. Hasegawa
This research proposes an automatic region of interest (ROI) prediction architecture with a deep neural network for estimating the learners’ ROI from the instructor’s behaviors in lecture archives, in order to generate ROI-zoomed videos that fit smaller screens such as those of smart devices. To achieve this goal, we first created a dataset of ROIs from learners’ gaze data while watching the archives and generated 16,039 ROI labels after clustering and smoothing with the K-means algorithm, based on the gaze point data obtained for the one-second segmented videos. Next, we extracted the instructor’s behaviors as feature maps from the segmented video, considering Frame Difference, Optical Flow, OpenPose, and temporal information. We then composed an Encoder-Decoder architecture that combined U-Net and ResNet, with these behavioral features as input, to build a deep neural network model for predicting ROI. In the experiment, the agreement between the ROI labels and the predicted regions was evaluated by Dice loss for each feature map and improved from 0.9 with a single image as the baseline to 0.4 with OpenPose and temporal features. The results indicate positive potential for automatic content generation for smart devices through ROI prediction from the instructor’s behaviors.
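Since the result above is reported as a Dice loss, where lower means better overlap, a small worked example may help in reading the 0.9 to 0.4 change. The sketch uses the standard definition, loss = 1 - 2|A ∩ B| / (|A| + |B|), on binary masks; the rectangle helper `rect_mask` and the toy frame size are hypothetical, and the paper may use a smoothed or soft variant.

```python
def rect_mask(width, height, x0, y0, x1, y1):
    """Binary mask with ones inside the half-open rectangle [x0, x1) x [y0, y1)."""
    return [[1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
            for y in range(height)]


def dice_loss(pred, target):
    """1 - Dice coefficient for two binary masks of equal shape."""
    inter = sum(p * t for prow, trow in zip(pred, target)
                for p, t in zip(prow, trow))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    if total == 0:
        return 0.0            # both masks empty: treat as perfect agreement
    return 1.0 - 2.0 * inter / total


# An 8x4 frame: the label ROI covers columns 0-3, the prediction columns 2-5.
label = rect_mask(8, 4, 0, 0, 4, 4)
pred = rect_mask(8, 4, 2, 0, 6, 4)
loss = dice_loss(pred, label)   # half of each region overlaps
```

Identical masks give a loss of 0 and disjoint masks give 1, so a drop from 0.9 to 0.4 corresponds to predicted regions overlapping the labels substantially more.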
Citations: 1
Experimental Evaluation of Homomorphic Encryption in Cloud and Edge Machine Learning
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953624
Joe Hrzich, Gunjan Basra, Talal Halabi
Machine Learning (ML)-based intelligent services are gradually becoming the leading service design and delivery model in edge computing, where user and device data is outsourced to take part in large-scale big-data analytics. This paradigm, however, entails challenging security and privacy concerns, which require rethinking the fundamental concepts behind performing ML. For instance, the encryption of sensitive data provides a straightforward solution that ensures data security and privacy. In particular, Homomorphic encryption allows arbitrary computation on encrypted data and has gained a lot of attention recently. However, it has not been fully adopted by edge computing-based ML due to its potential impact on classification accuracy and model performance. This paper conducts an experimental evaluation of different types of Homomorphic encryption techniques, namely Partial, Somewhat, and Fully Homomorphic encryption, over several ML models, which train on encrypted data and produce classification predictions based on encrypted input data. The results demonstrate two potential directions in the context of ML privacy at the network edge: privacy-preserving training and privacy-preserving classification. The performance of encryption-driven ML models is compared using different metrics such as accuracy and computation time for plaintext vs. encrypted text. This evaluation will guide future research in investigating which ML models perform better over encrypted data.
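To make "computation on encrypted data" concrete, here is a toy sketch of a partially (additively) homomorphic scheme, Paillier, used to compute a linear score over encrypted features. It is not taken from the paper, which does not name its schemes or libraries in the abstract. The tiny primes make it completely insecure and are chosen only so the arithmetic is easy to follow; the helper names are hypothetical, and `pow(lam, -1, n)` requires Python 3.8 or later.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier keys with generator g = n + 1 (tiny primes: insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                                 # modular inverse of lam
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt 0 <= m < n as (1 + m*n) * r^n mod n^2 with random r coprime to n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def add(n, c1, c2):
    """Ciphertext product decrypts to the sum of the plaintexts."""
    return c1 * c2 % (n * n)


def scale(n, c, k):
    """Ciphertext power decrypts to k times the plaintext."""
    return pow(c, k, n * n)


# Encrypted features, plaintext model weights: the server never sees the features.
n, priv = keygen()
features, weights = [4, 5, 6], [2, 3, 1]
enc_x = [encrypt(n, x) for x in features]
enc_score = encrypt(n, 0)
for c, w in zip(enc_x, weights):
    enc_score = add(n, enc_score, scale(n, c, w))
score = decrypt(n, priv, enc_score)   # 2*4 + 3*5 + 1*6
```

Only additions and multiplications by known constants are available in this scheme, which is why it suits linear scoring; non-linear steps would need a somewhat or fully homomorphic scheme, at a higher computational cost.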
Citations: 1