
2022 14th International Conference on Knowledge and Systems Engineering (KSE): Latest Papers

Vietnamese Text Detection, Recognition and Classification in Images
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953789
Tuan Le Xuan, Hang Pham Thi, Hai Nguyen Do
Detecting and recognizing text in images has received considerable attention recently because of its wide applicability in fields such as digitization, storage, lookup, and authentication. However, most current research and products focus on detecting and extracting text from images and pay little attention to analyzing and exploiting the semantics and nuances of the extracted text. In this study, we propose a three-in-one system to detect, recognize, and classify Vietnamese text in images collected from social media, to help authorities in monitoring tasks. The system takes images containing Vietnamese text as input and uses the Character-Region Awareness For Text detection (CRAFT) model to perform background processing and produce the regions of the image that contain text; these text regions are then arranged in the same order as in the original image, and the text image in each region is extracted. Next, we use the VietOCR model to convert these text images into text fragments. Finally, the resulting texts are classified using an ensemble of machine learning models. Preliminary results show that the proposed system achieves an accuracy of up to 88.0% in detecting and recognizing text and 94% in classifying text nuances on the collected data set.
Citations: 0
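The abstract above says the detected text regions are put back into the order they have in the original image, but it does not say how. One common approach, given here only as a minimal sketch under that assumption, is to group boxes into lines by vertical centre and then sort each line left to right. The `Box` layout, the function name `sort_reading_order`, and the `line_tol` parameter are all hypothetical and are not taken from the paper, CRAFT, or VietOCR.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def sort_reading_order(boxes: List[Box], line_tol: float = 0.5) -> List[Box]:
    """Order boxes top-to-bottom, then left-to-right within each text line.

    Two boxes share a line when their vertical centres differ by at most
    line_tol times the height of the box being placed (an assumed heuristic).
    """
    lines: List[List[Box]] = []
    # Visit boxes from the top of the image to the bottom.
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        centre_y = (box[1] + box[3]) / 2
        height = box[3] - box[1]
        if lines:
            last = lines[-1]
            last_centre = sum((b[1] + b[3]) / 2 for b in last) / len(last)
            if abs(centre_y - last_centre) <= line_tol * height:
                last.append(box)
                continue
        lines.append([box])
    # Within a line, read left to right.
    return [b for line in lines for b in sorted(line, key=lambda b: b[0])]


detected = [(100, 10, 180, 30), (10, 12, 90, 32), (10, 50, 90, 70)]
ordered = sort_reading_order(detected)
```

In a pipeline like the one described, each box in `ordered` would be cropped from the image and passed to the recognizer in turn. The tolerance would need tuning for skewed or multi-column images.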
Image classification of lung nodules by requiring the integration of Attention Mechanism into ResNet model
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953758
Khai Dinh Lai, T. Le, T. T. Nguyen
In this research, a deep learning model, ResNet101, is analyzed and chosen to accurately diagnose lung nodules using the LUNA16 dataset. The paper includes: (1) demonstrating the efficiency of the ResNet101 network on LUNA16; (2) analyzing the benefits and drawbacks of Attention modules before selecting the best Attention module to integrate into the ResNet101 model for the challenge of classifying lung nodules in CT scans; (3) comparing the efficacy of the proposed model with prior results to demonstrate the model’s feasibility.
Citations: 0
Composition Models of Fuzzy Relations Considering Importance Levels of Features*
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953776
N. Cao, R. Valášek, Stanislav Ožana
Several fuzzy concepts are involved in relational databases, such as the degree of fulfilment of a graded property, the level of importance (or of possibility) of a component in a query, grouping features, or the concept of fuzzy quantifiers. We have recently used the concepts of excluding features and unavoidable features to construct extensions of fuzzy relational compositions. The extended compositions also employ fuzzy quantifiers. In this work, we approach the concept of importance levels of the considered features in a particular sense that is intuitively suitable for classification tasks. We then propose a direction for incorporating this concept into the existing fuzzy relational compositions. We provide various useful properties of the new composition models. Furthermore, a simple example of the classification of animals in biology is used to illustrate the behaviour of the proposed models. Finally, we examine the applicability of the new models to the practical application of Dragonfly classification, which has been considered previously.
Citations: 1
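For readers unfamiliar with fuzzy relational compositions, the sketch below shows two standard ones that extensions of this kind typically build on: the basic sup-min composition and the Bandler-Kohout subproduct, here with the Łukasiewicz implication. It does not reproduce the importance-level models proposed in the paper, which the abstract does not define. The relation values and the animal/feature/class reading are made-up illustrations.

```python
from typing import List

Matrix = List[List[float]]  # membership degrees in [0, 1]


def sup_min(R: Matrix, S: Matrix) -> Matrix:
    """Basic composition: (R o S)(x, z) = max over y of min(R(x, y), S(y, z))."""
    return [
        [max(min(r, s) for r, s in zip(row, col)) for col in zip(*S)]
        for row in R
    ]


def lukasiewicz_implication(a: float, b: float) -> float:
    """Lukasiewicz residuum: a -> b = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def bk_subproduct(R: Matrix, S: Matrix) -> Matrix:
    """Bandler-Kohout subproduct: (R <| S)(x, z) = min over y of (R(x, y) -> S(y, z))."""
    return [
        [min(lukasiewicz_implication(r, s) for r, s in zip(row, col)) for col in zip(*S)]
        for row in R
    ]


# Illustrative data: R relates one animal to two features,
# S relates the two features to two classes.
R = [[1.0, 0.5]]
S = [[1.0, 0.0],
     [0.5, 1.0]]

some_feature_matches = sup_min(R, S)        # [[1.0, 0.5]]
all_features_covered = bk_subproduct(R, S)  # [[1.0, 0.0]]
```

The sup-min result is high when the animal shares at least one feature with a class, while the subproduct is high only when every feature the animal has is also a feature of the class, which is why the second class drops to 0.0 here.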
EndoUNet: A Unified Model for Anatomical Site Classification, Lesion Categorization and Segmentation for Upper Gastrointestinal Endoscopy
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953766
N. Manh, D. V. Hang, D. Long, Le Quang Hung, P. C. Khanh, Nguyen Thi Oanh, N. T. Thuy, D. V. Sang
Endoscopy is one of the most effective methods for diagnosing diseases in the upper GI tract. This paper proposes a unified encoder-decoder model for dealing with three tasks simultaneously: anatomical site classification, lesion classification, and lesion segmentation. In addition, the model can learn from a training set comprised of data from multiple sources. We report results on our own large dataset of 8207 images obtained during routine upper GI endoscopic examinations. Experiments show that our model performs admirably in terms of classification accuracy and yields competitive segmentation results compared to the single-task model with the same architecture.
Citations: 1
An Automated Stub Method for Unit Testing C/C++ Projects
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953784
Tran Nguyen Huong, Le Huu Chung, Lam Nguyen Tung, Hoang-Viet Tran, Pham Ngoc Hung
Automated stub generation is an important problem when testing units that contain calls to other, not yet completed functions, because the testing and development phases are normally performed in parallel. This paper presents a fully automated method, named AS4UT, for generating stubs used in unit testing of C/C++ projects. The key idea of AS4UT is to treat each function call as a mock variable. This is done by adding a Pre-process CFG (control flow graph) phase to the concolic testing method. In this phase, all function calls in the CFG of a unit under test are replaced by their corresponding mock variables. The updated CFG is then used as input to the concolic testing method to generate the required test data set. We have implemented AS4UT in a tool named AutoStubTesing and performed experiments with some common functions that call other units. Experimental results show that AS4UT can increase the code coverage of the generated test data set while reducing the number of test data and keeping the required time acceptable.
Citations: 0
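The "function call as mock variable" idea from the abstract above can be illustrated with a small sketch. It is written in Python rather than C/C++, and it uses a brute-force search instead of concolic execution, so it only shows the transformation, not AS4UT itself. The unit `classify_reading`, the unfinished callee `read_sensor`, and the candidate values are all hypothetical.

```python
from itertools import product
from typing import Dict, Iterable, Tuple


def classify_reading(sensor_id: int, read_sensor_mock: int) -> str:
    """Unit under test after the (hypothetical) stub transformation.

    The original body would call an unfinished function, for example
    `value = read_sensor(sensor_id)`. Here that call is replaced by the
    extra input `read_sensor_mock`, so a test generator controls its value.
    """
    value = read_sensor_mock
    if sensor_id < 0:
        return "invalid"
    if value > 100:
        return "high"
    return "normal"


def find_covering_inputs(candidates: Iterable[int]) -> Dict[str, Tuple[int, int]]:
    """Pick one (sensor_id, mock value) pair per distinct outcome.

    A brute-force stand-in for concolic test data generation.
    """
    chosen: Dict[str, Tuple[int, int]] = {}
    for sensor_id, mock_value in product(list(candidates), repeat=2):
        outcome = classify_reading(sensor_id, mock_value)
        chosen.setdefault(outcome, (sensor_id, mock_value))
    return chosen


test_data = find_covering_inputs([-1, 0, 101])
```

Because the callee's return value is now an ordinary input, the "high" branch becomes reachable by test data even though `read_sensor` does not exist yet; each generated mock value then tells the stub what to return for that test case.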
A liveness detection protocol based on deep visual-linguistic alignment
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953623
Viet-Trung Tran, Van-Sang Tran, Xuan-Bang Nguyen, The-Trung Tran
Face anti-spoofing has become increasingly critical due to the widespread deployment of face recognition technology. Current approaches mostly focus on presentation attacks, where they rely on textual and spatio-temporal features in captured facial videos. However, in an environment where end-users manage their own devices, attackers can cheat by using virtual camera sensors and easily bypass sophisticated approaches for presentation attacks. In this paper, we propose a novel liveness detection protocol in which users are required to read a randomly generated sequence of words. Our proposed prediction model, LipBERT, a deep visual-linguistic alignment model, is trained to detect whether the captured facial stream conforms to the valid textual sequence. For the experiments, we introduce VNFaceTalking (https://github.com/tranvansanghust/VNFaceTalking), an extensive dataset of 188,561 samples (around 130 hours in total). Each sample is a video of at most 3 seconds showing a frontal face speaking Vietnamese. Experiments on the VNFaceTalking dataset demonstrate promising results.
Citations: 0
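The challenge half of such a protocol, producing an unpredictable word sequence for the user to read, might look like the sketch below. The abstract does not describe how the sequence is drawn, so the vocabulary, the sequence length, and the function name `make_challenge` are assumptions; the verification half (checking the video against the words with LipBERT) is not sketched.

```python
import secrets
from typing import List, Sequence


def make_challenge(vocabulary: Sequence[str], length: int = 4) -> List[str]:
    """Draw `length` distinct words using an OS-backed random source.

    A cryptographically strong source is used so that an attacker cannot
    predict the sequence and pre-record a matching video.
    """
    unique_words = list(dict.fromkeys(vocabulary))  # drop duplicates, keep order
    if length > len(unique_words):
        raise ValueError("vocabulary is too small for the requested length")
    return secrets.SystemRandom().sample(unique_words, length)


# Hypothetical vocabulary; a real deployment would use a much larger word list.
vocabulary = ["xin", "chào", "cảm", "ơn", "một", "hai", "ba", "bốn"]
challenge = make_challenge(vocabulary, length=4)
```

A server would typically issue `challenge` with a short expiry and accept the session only if the model judges that the recorded lip movements match that exact sequence.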
Ensemble Learning Methods for Legal Processing Tasks in ALQAC 2022
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953756
Hau Nguyen Trung, S. N. Truong
The Automated Legal Question Answering Competition (ALQAC) is an annual competition to find the best solution for automatically answering legal questions based on well-known statute laws in the Vietnamese language. In this paper, we demonstrate how to solve the problems posed by ALQAC 2022 using BERT and its variants as a backbone network. In addition, we study the use of TF-IDF and BM25 to rank the relevance of legal documents. This publication also shows how to augment the training data to address the problem of limited training data.
Citations: 3
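As background for the lexical ranking step mentioned above, here is a minimal sketch of Okapi BM25 scoring. It uses the common non-negative IDF variant and default-style parameters `k1 = 1.5` and `b = 0.75`; the abstract does not state which variant or settings the authors used, and the toy English documents are invented stand-ins for tokenized Vietnamese legal articles.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates with k1 and is normalized by document length via b.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


articles = [
    ["land", "law", "article"],
    ["traffic", "law", "fine"],
    ["marriage", "family"],
]
scores = bm25_scores(["traffic", "fine"], articles)
best = max(range(len(articles)), key=lambda i: scores[i])
```

In a retrieve-then-rerank setup, the top-scoring articles from a step like this would usually be passed to a BERT-based model for finer relevance judgment.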
KSE 2022 Cover Page
Pub Date : 2022-10-19 DOI: 10.1109/kse56063.2022.9953775
Citations: 0
A Novel Deep Clustering Variational Auto-Encoder for Anomaly-based Network Intrusion Detection
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953763
Van Quan Nguyen, V. H. Nguyen, T. Hoang, Nathan Shone
The role of semi-supervised network intrusion detection systems is becoming increasingly important in the ever-changing digital landscape. Despite the boom in commercial and research interest, many concerns over accuracy have yet to be addressed. Two of the major limitations contributing to this concern are reliably learning the underlying probability distribution of normal network data and identifying the boundary between the normal and anomalous data regions in the latent space. Recent research has proposed many different ways to learn the latent representation of normal data in a semi-supervised manner, such as using a Clustering-based Autoencoder (CAE) and hybridized approaches combining Principal Component Analysis (PCA) and CAE. However, such approaches are still affected by these limitations, predominantly due to an overreliance on feature engineering or the inability to handle high data dimensionality. In this paper, we propose a novel Cluster Variational Autoencoder (CVAE) deep learning model to overcome the aforementioned limitations and increase the efficiency of network intrusion detection. This enables a more concise and dominant representation of the latent space to be learnt. The probability distribution learning capabilities of the VAE are fully exploited to learn the underlying probability distribution of the normal network data. This combination enables us to address the limitations discussed. The performance of the proposed model is evaluated using eight benchmark network intrusion datasets: NSL-KDD, UNSW-NB15, CICIDS2017 and five scenarios from CTU13 (CTU13-08, CTU13-09, CTU13-10, CTU13-12 and CTU13-13). The experimental results clearly demonstrate that the proposed method outperforms semi-supervised approaches from existing works.
Citations: 4
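The "probability distribution learning" that the abstract attributes to the VAE comes, in a standard VAE, from a KL-divergence term that pulls the encoder's Gaussian posterior toward a standard normal prior. The sketch below computes that term in closed form for a diagonal Gaussian. It covers only this standard ingredient; the clustering component of the proposed CVAE is not described in the abstract and is not reproduced, and the function name is hypothetical.

```python
import math
from typing import Sequence


def kl_to_standard_normal(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form.

    Per dimension: 0.5 * (mu^2 + sigma^2 - 1 - log(sigma^2)).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


# An encoding that already matches the prior costs nothing.
kl_at_prior = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
# Shifting one latent mean by 1 costs 0.5 nats.
kl_shifted = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
```

In semi-supervised anomaly detection the model is trained on normal traffic only, so inputs whose encodings or reconstructions deviate strongly from what was learnt can be flagged as anomalous; how the paper sets that boundary is not stated in the abstract.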
KSE 2022 Conference Committee
Pub Date : 2022-10-19 DOI: 10.1109/KSE56063.2022.9953760
Yuka Nagai, Nguyen Viet Ha, Nguyen Thanh Toai, Akira Shimazu, Alireza Alaei
Citations: 0