
Journal of Information and Telecommunication: Latest Articles

MSA-SDMN: multicast source authentication scheme for multi-domain software defined mobile networks
IF 2.7 Q1 Computer Science Pub Date : 2023-08-25 DOI: 10.1080/24751839.2023.2250123
Hamdi Eltaief, Ali El kamel, H. Youssef
Citations: 0
A novel solution for energy-saving and lifetime-maximizing of LoRa wireless mesh networks
IF 2.7 Q1 Computer Science Pub Date : 2023-08-05 DOI: 10.1080/24751839.2023.2235114
Hoang Hai Son, Vo Phuc Tinh, Duc Ngoc Minh Dang, Bui Thi Duyen, Duy-Dong Le, Thai-Thinh Dang, Q. Nguyen, Thanh-Qui Pham, Van-Luong Nguyen, Tran Anh Khoa, Nguyen Hoang Nam
Citations: 0
Distributed deep learning approach for intrusion detection system in industrial control systems based on big data technique and transfer learning
IF 2.7 Q1 Computer Science Pub Date : 2023-07-25 DOI: 10.1080/24751839.2023.2239617
Ahlem Abid, F. Jemili, O. Korbaa
ABSTRACT Industry 4.0 refers to a new generation of connected and intelligent factories driven by the emergence of new technologies such as artificial intelligence, Cloud computing, Big Data and industrial control systems (ICS), with the aim of automating all phases of industrial operations. The presence of connected systems in industrial environments poses a considerable security challenge. Moreover, with the huge amount of data generated daily, complex attacks occur in seconds and target production lines and their integrity. Until now, however, factories have not had all the necessary tools to protect themselves; they mainly rely on traditional protection. In order to improve industrial control systems in terms of efficiency and response time, the present paper proposes a new distributed intrusion detection approach that uses artificial intelligence methods and Big Data techniques and is deployed in a cloud environment. A variety of Machine Learning and Deep Learning algorithms, mainly convolutional neural networks (CNN), have been tested to compare performance and choose the most suitable model for the classification. We test the performance of our model using the industrial dataset SWaT.
Citations: 0
M2SA: a novel dataset for multi-level and multi-domain sentiment analysis
IF 2.7 Q1 Computer Science Pub Date : 2023-07-07 DOI: 10.1080/24751839.2023.2229700
H. Phan, N. Nguyen, D. Hwang, Yeong-Seok Seo
ABSTRACT With the development of social networks, people have more channels to express their opinions and feelings about events, products, and celebrities. These networks are becoming rich data sources, gaining attention both for practical applications and in research. Sentiment analysis (SA) is one of the most common uses of this data source. Most currently available SA datasets are suitable only for SA at one specific level, such as the document, sentence, or aspect level. This makes it difficult to develop practical systems that require a combination of sentiment analyses at all three levels. Additionally, previous datasets included opinions on only a single domain, although people often mention multiple domains when expressing their views. This study introduces a new dataset for SA called multi-level and multi-domain (M2SA). Each sample in M2SA contains a short text with at least two sentences and two aspects with different domains and sentiment polarities. The release of the M2SA dataset will contribute to research in the field of SA, primarily by promoting the development and improvement of methods for multi-level SA or multi-aspect, multi-domain SA. The M2SA dataset was tested using state-of-the-art SA methods and was compared with other standard datasets. The results demonstrate that the M2SA dataset is better than the previous datasets at supporting improvements in the performance of SA methods.
Citations: 0
Performance statistics of broadcasting networks with receiver diversity and Fountain codes
IF 2.7 Q1 Computer Science Pub Date : 2023-06-21 DOI: 10.1080/24751839.2023.2225254
L. Tu, T. N. Nguyen, Phuong T. Tran, Tran Trung Duy, Q.-S. Nguyen
ABSTRACT The performance of broadcasting networks employing Fountain codes with receiver diversity techniques is investigated in the present work. In particular, we derive closed-form expressions for the cumulative distribution function (CDF), the probability mass function (PMF), and the raw moments of the number of time slots needed to deliver a common message to all users under two diversity schemes, namely maximal ratio combining (MRC) and selection combining (SC). Numerical results are supplied to verify the accuracy of the analysis for the considered networks and to highlight the behaviour of these metrics as a function of some vital parameters, such as the number of receivers and the number of receive antennas. Additionally, we confirm the advantages of the MRC scheme over the SC scheme in broadcasting networks.
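The quantity studied in this abstract, the number of time slots needed until every user has received the common message, can be illustrated with a small Monte Carlo sketch. This is not the authors' closed-form analysis. It assumes Rayleigh fading (exponentially distributed per-antenna SNR), models Fountain decoding simply as "collect a fixed number of packets", and uses hypothetical parameter names and values throughout.

```python
import random


def slots_needed(num_users, num_antennas, packets_required,
                 snr_threshold, avg_snr, rng):
    """Simulate one broadcast session; return (slots_mrc, slots_sc).

    Both schemes see the same fading draws, so they can be compared directly.
    All parameters are hypothetical and chosen for illustration only.
    """
    got_mrc = [0] * num_users   # packets decoded so far under MRC
    got_sc = [0] * num_users    # packets decoded so far under SC
    slots_mrc = slots_sc = None
    slot = 0
    while slots_mrc is None or slots_sc is None:
        slot += 1
        for u in range(num_users):
            # Per-antenna SNRs: exponential, i.e. Rayleigh fading (assumption).
            snrs = [avg_snr * rng.expovariate(1.0) for _ in range(num_antennas)]
            if sum(snrs) >= snr_threshold:   # MRC: branch SNRs add up
                got_mrc[u] += 1
            if max(snrs) >= snr_threshold:   # SC: best branch only
                got_sc[u] += 1
        if slots_mrc is None and min(got_mrc) >= packets_required:
            slots_mrc = slot
        if slots_sc is None and min(got_sc) >= packets_required:
            slots_sc = slot
    return slots_mrc, slots_sc


def empirical_cdf(samples, k):
    """Fraction of trials that finished within k time slots."""
    return sum(1 for s in samples if s <= k) / len(samples)


rng = random.Random(0)
trials = [slots_needed(3, 2, 5, 2.0, 1.5, rng) for _ in range(2000)]
mrc_slots = [m for m, _ in trials]
sc_slots = [s for _, s in trials]
print("mean slots MRC:", sum(mrc_slots) / len(mrc_slots))
print("mean slots SC :", sum(sc_slots) / len(sc_slots))
print("P(slots <= 8) under MRC:", empirical_cdf(mrc_slots, 8))
```

Because the combined MRC SNR is the sum of the branch SNRs and the SC SNR is their maximum, MRC succeeds in every slot where SC does, so in this sketch MRC never needs more slots than SC on the same channel draws.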
Citations: 2
On the security and reliability performance of SWIPT-enabled full-duplex relaying in the non-orthogonal multiple access networks
IF 2.7 Q1 Computer Science Pub Date : 2023-06-03 DOI: 10.1080/24751839.2023.2218046
Q.-S. Nguyen, T. N. Nguyen, L. Tu
ABSTRACT The performance of simultaneous wireless information and power transfer (SWIPT) enabled full-duplex (FD) relaying in non-orthogonal multiple access (NOMA) networks is investigated in terms of both reliability and security. More precisely, from the viewpoint of reliability, we derive the outage probability (OP) at both end-users in closed form. For security, the intercept probability (IP) is considered a helpful metric for the considered systems, and we also derive the IP in closed form. Numerical results are given to confirm the correctness of the derived mathematical framework and to provide insights into both metrics as functions of key parameters such as the transmit power, the power-splitting (PS) ratio, and the power allocation ratio.
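For orientation, the two metrics named in this abstract are usually defined along the following lines. The symbols are assumptions introduced here, not taken from the paper: \gamma_{D} is the instantaneous SINR at a legitimate end-user, \gamma_{E} the SINR at the eavesdropper, and \gamma_{\mathrm{th}} a decoding threshold. The paper's closed-form expressions for the SWIPT-enabled FD relaying NOMA setting are not reproduced.

```latex
\mathrm{OP} = \Pr\left(\gamma_{D} < \gamma_{\mathrm{th}}\right),
\qquad
\mathrm{IP} = \Pr\left(\gamma_{E} \ge \gamma_{\mathrm{th}}\right)
```

Under these generic definitions, a lower OP indicates better reliability and a lower IP indicates better security.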
Citations: 2
Towards data fusion-based big data analytics for intrusion detection
IF 2.7 Q1 Computer Science Pub Date : 2023-05-24 DOI: 10.1080/24751839.2023.2214976
F. Jemili
ABSTRACT Intrusion detection is seen as the most promising approach to computer security. It is used to protect computer networks against different types of attacks. The major problem in the literature is the classification of data into two main classes: normal and intrusion. Several approaches have been proposed to solve this problem, but the problem of false alarms is still present. To address it, we propose a new intrusion detection approach based on data fusion. The main objective of this work is to suggest an approach of data fusion-based Big Data analytics to detect intrusions: it builds one dataset which combines various datasets and contains all the attack types. This research consists in merging the heterogeneous datasets and removing redundant information using Big Data analytics tools: Hadoop/MapReduce and Neo4j. In the next step, machine learning algorithms are applied for learning. The first algorithm, called SSDM (Semantically Similar Data Miner), uses fuzzy logic to generate association rules between the different item sets. The second algorithm, called K2, is a score-based greedy search algorithm for learning Bayesian networks from data. Experimental results show that, in both cases, data fusion contributes to very good results.
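The K2 algorithm mentioned in this abstract can be sketched as follows. This is a minimal, generic version of the score-based greedy search (the K2 metric of Cooper and Herskovits, evaluated in log form), not the paper's implementation. The toy data, the `max_parents` limit and all function names are assumptions made for illustration.

```python
from collections import Counter
from math import lgamma


def k2_score(data, child, parents, arity):
    """Log K2 score of `child` given a candidate parent set."""
    r = arity[child]
    n_ij = Counter()    # counts per parent configuration
    n_ijk = Counter()   # counts per (parent configuration, child value)
    for row in data:
        cfg = tuple(row[p] for p in parents)
        n_ij[cfg] += 1
        n_ijk[(cfg, row[child])] += 1
    # log of prod_j (r-1)! / (N_ij + r - 1)!  times  prod_k N_ijk!
    score = sum(lgamma(r) - lgamma(n + r) for n in n_ij.values())
    score += sum(lgamma(n + 1) for n in n_ijk.values())
    return score


def k2(data, order, arity, max_parents=2):
    """Greedy K2 search: a node may pick parents only among its predecessors."""
    parents = {node: [] for node in order}
    for idx, node in enumerate(order):
        best = k2_score(data, node, parents[node], arity)
        candidates = set(order[:idx])
        while candidates and len(parents[node]) < max_parents:
            scored = [(k2_score(data, node, parents[node] + [c], arity), c)
                      for c in sorted(candidates)]
            top_score, top = max(scored)
            if top_score <= best:
                break           # no candidate improves the score
            parents[node].append(top)
            candidates.remove(top)
            best = top_score
    return parents


# Hypothetical toy data: column 1 copies column 0, column 2 is unrelated.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
structure = k2(rows, order=[0, 1, 2], arity=[2, 2, 2])
print(structure)
```

The search depends on the node ordering supplied by the caller, which is a known property of K2: a parent can only be chosen from the nodes that come earlier in `order`.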
Citations: 2
Abnormal network packets identification using header information collected from Honeywall architecture
IF 2.7 Q1 Computer Science Pub Date : 2023-05-23 DOI: 10.1080/24751839.2023.2215135
Kha Van Nguyen, H. Nguyen, Thang Quyet Le, Quang Nhat Minh Truong
ABSTRACT Most devices are now connected through the Internet, so cybersecurity issues have raised concerns. This study proposes network services in a virtual environment to collect, analyze and identify network attacks with various techniques. Our contributions are multi-fold. First, we deployed a Honeynet architecture to collect network packets, including actual cyber-attacks performed by real hackers and crackers. Second, we leveraged several techniques to normalize data and extract header information with 29 features from 200,000 samples of many types of network attacks, for abnormal packet identification with machine learning algorithms. Furthermore, we introduce an Adaptive Cybersecurity (AC) system to detect attacks and provide warnings. The system can automatically collect more data for further analysis to improve performance. Our proposed method performs better than Snort in detecting dangerous malicious attacks. Finally, we experimented with different cyber-attack approaches to exploit the ten website security risks recommended by the Open Web Application Security Project (OWASP). Based on the research results, the system is expected to be able to detect cybercriminal attacks and provide early warnings to prevent potential cyber-attacks.
Citations: 0
Speech feature extraction using linear Chirplet transform and its applications*
IF 2.7 Q1 Computer Science Pub Date : 2023-05-03 DOI: 10.1080/24751839.2023.2207267
H. Do, D. Chau, S. Tran
ABSTRACT Most speech processing models begin with feature extraction and then pass the feature vector to the primary processing model. The solution's performance mainly depends on the quality of the feature representation and the model architecture. In the deep neural network era, much research focuses on designing robust deep network architectures and ignores the important role of feature representation. This work aims to exploit a new approach to design a speech signal representation in the time-frequency domain via the Linear Chirplet Transform (LCT). The proposed method provides a feature vector sensitive to the frequency change inside human speech with a solid mathematical foundation. This is a potential direction for many applications. The experimental results show the improvement of the feature based on LCT compared to MFCC or the Fourier Transform. In speaker gender recognition, dialect recognition, and speech recognition alike, LCT improved significantly on MFCC and other features. This result also implies that the feature based on LCT is independent of language, so it can be used in various applications.
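The idea behind a linear chirplet analysis can be sketched in a few lines: each frame is correlated with windowed atoms whose frequency changes linearly over time, so a frame whose pitch is sweeping matches a chirped atom better than a fixed-frequency one. The code below is a minimal illustration of one common form of the transform, not the authors' feature pipeline. The Gaussian window width, the hypothetical `lct_features` reduction (maximum magnitude over chirp rates) and all parameter values are assumptions.

```python
import cmath
import math


def chirplet_coefficient(frame, freq, chirp_rate):
    """Correlate one frame with a Gaussian-windowed linear chirp.

    freq is in cycles/sample and chirp_rate in cycles/sample^2, both
    measured relative to the frame centre. chirp_rate = 0 reduces to a
    windowed Fourier (STFT-like) coefficient.
    """
    n_samples = len(frame)
    centre = (n_samples - 1) / 2
    sigma = n_samples / 6  # window width: an arbitrary illustrative choice
    total = 0j
    for n, x in enumerate(frame):
        t = n - centre
        window = math.exp(-0.5 * (t / sigma) ** 2)
        phase = 2 * math.pi * (freq * t + 0.5 * chirp_rate * t * t)
        total += x * window * cmath.exp(-1j * phase)
    return total


def lct_features(frame, freqs, chirp_rates):
    """Hypothetical feature vector: per frequency bin, keep the largest
    coefficient magnitude found over the candidate chirp rates."""
    return [max(abs(chirplet_coefficient(frame, f, c)) for c in chirp_rates)
            for f in freqs]


# Synthetic test frame: a complex chirp with made-up parameters.
n = 128
f0, rate = 0.1, 0.001
centre = (n - 1) / 2
chirp = [cmath.exp(2j * math.pi * (f0 * (k - centre)
                                   + 0.5 * rate * (k - centre) ** 2))
         for k in range(n)]

matched = abs(chirplet_coefficient(chirp, f0, rate))
stft_like = abs(chirplet_coefficient(chirp, f0, 0.0))
print("matched chirp rate:", matched)
print("zero chirp rate   :", stft_like)
```

On this synthetic frame the atom with the matching chirp rate cancels the signal's phase exactly, so its magnitude equals the sum of the window and exceeds the zero-chirp-rate coefficient. That gap is the kind of sensitivity to frequency change the abstract refers to.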
Citations: 1
On the use of text augmentation for stance and fake news detection
IF 2.7 Q1 Computer Science Pub Date : 2023-04-19 DOI: 10.1080/24751839.2023.2198820
Ilhem Salah, Khaled Jouini, O. Korbaa
ABSTRACT Data Augmentation (DA) aims at synthesizing new training instances by applying transformations to available ones. DA has several well-known benefits, such as: (i) increasing generalization ability; (ii) preventing data scarcity; and (iii) helping resolve class imbalance issues. In this work, we investigate the use of DA for stance and fake news detection. In the first part of our work, we explore the effect of various DA techniques on the performance of common classification algorithms. Our study reveals that the motto 'the more, the better' is the wrong approach regarding text augmentation and that there is no one-size-fits-all text augmentation technique. The second part of our work leverages the results of our study to propose a novel augmentation-based ensemble learning approach. The proposed approach leverages text augmentation to enhance the diversity and accuracy of the base learners, and hence the predictive performance of the ensemble. The third part of our work experimentally investigates the use of DA to cope with the class imbalance problem. Class imbalance is very common in stance and fake news detection and often results in biased models. In this work, we show how and to what extent text augmentation can help resolve moderate and severe imbalance.
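As a concrete picture of "applying transformations to available instances", the sketch below implements two simple token-level transformations, random swap and random deletion. These particular operations, the function names and the probabilities are assumptions chosen for illustration; the abstract does not state which augmentation techniques the authors evaluate.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p_delete, rng):
    """Drop each token with probability p_delete, keeping at least one token."""
    if not tokens:
        return []
    kept = [tok for tok in tokens if rng.random() >= p_delete]
    return kept if kept else [rng.choice(tokens)]


def augment(text, n_copies, rng):
    """Produce n_copies variants, alternating between the two operations."""
    tokens = text.split()
    copies = []
    for k in range(n_copies):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, 1, rng)
        else:
            new_tokens = random_deletion(tokens, 0.1, rng)
        copies.append(" ".join(new_tokens))
    return copies


rng = random.Random(42)
headline = "officials deny the report published on monday"  # made-up example
for variant in augment(headline, 4, rng):
    print(variant)
```

For class imbalance, one plausible use is to call `augment` only on minority-class examples until the class counts are closer. Consistent with the abstract's warning against 'the more, the better', the number of copies is best treated as a parameter to tune rather than something to maximize.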
Citations: 2