Annals of Emerging Technologies in Computing: Latest Articles

The Proposal of Countermeasures for DeepFake Voices on Social Media Considering Waveform and Text Embedding

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-04-01 DOI: 10.33166/aetic.2024.02.002

Y. Yanagi, R. Orihara, Yasuyuki Tahara, Y. Sei, Tanel Alumäe, Akihiko Ohsuga

Recent advances in text-to-speech technologies have yielded more natural-sounding voices. However, this has also made it easier to generate malicious fake voices and disseminate false narratives. ASVspoof stands out as a prominent benchmark in the ongoing effort to detect fake voices automatically, and it plays a crucial role in countering illicit access to biometric systems. There is nonetheless a growing need to broaden this perspective, particularly when it comes to detecting fake voices on social media platforms. Moreover, existing detection models commonly face challenges related to their generalization performance. This study sheds light on specific instances involving the latest speech generation models. Furthermore, we introduce a novel framework designed to address the nuances of detecting fake voices in the context of social media. This framework considers not only the voice waveform but also the speech content. Our experiments demonstrate that the proposed framework considerably enhances classification performance, as evidenced by a reduction in equal error rate. This underscores the importance of considering both the waveform and the content of the voice when identifying fake voices that disseminate false claims.
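
The equal error rate (EER) mentioned above is the operating point at which the false-acceptance rate and the false-rejection rate coincide. The following is a minimal sketch of how it might be approximated from detector scores; the function name, the score convention (higher means "more likely genuine"), and the simple threshold sweep are assumptions made for illustration and are not taken from the paper.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a higher score means the detector considers the voice genuine.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine voice scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: a fake voice scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# One genuine sample scores below one fake sample, so each error rate is 1/4.
print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6]))
```

A lower EER indicates better separation between genuine and fake voices; perfectly separated scores give an EER of zero under this sketch.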

Citations: 0

Wildfire Prediction in the United States Using Time Series Forecasting Models

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-04-01 DOI: 10.33166/aetic.2024.02.003

Muhammad Khubayeeb Kabir, Kawshik Kumar Ghosh, Md. Fahim Ul Islam, Jia Uddin

Wildfires are a widespread phenomenon that, with the warming climate, affects every corner of the world. In the United States alone, wildfires burn tens of thousands of square kilometres of forests and vegetation every year, and the past decade has witnessed a dramatic increase in the number of wildfire incidents. This research aims to identify the regions of forests and vegetation across the US that are susceptible to wildfires using spatiotemporal kernel heat maps, and to forecast these wildfires at country-wide and state levels on a weekly and monthly basis, in an attempt to reduce the reaction time of suppression operations and to design resource maps that mitigate wildfires effectively. We employed the state-of-the-art Neural Basis Expansion Analysis for Time Series (N-BEATS) model to predict the total area burned by wildfires several weeks and months into the future. The model was evaluated on forecasting metrics including mean squared error (MSE) and mean absolute error (MAE). The N-BEATS model demonstrates improved performance compared to other state-of-the-art (SOTA) models, obtaining MSE values of 116.3, 38.2, and 19.0 for yearly, monthly, and weekly forecasting, respectively.
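
For readers unfamiliar with the two forecasting metrics named above, the sketch below computes MSE and MAE for a short series. The function names and the sample numbers are purely illustrative and are not drawn from the paper's data.

```python
def mse(actual, predicted):
    """Mean of the squared differences between observed and forecast values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean of the absolute differences between observed and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical burned-area values for three periods (arbitrary units).
observed = [10.0, 20.0, 30.0]
forecast = [12.0, 18.0, 33.0]
print(mse(observed, forecast), mae(observed, forecast))
```

MSE penalises large misses more heavily than MAE because the errors are squared before averaging, which is why the two are often reported together.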

Citations: 0

A Torpor-based Enhanced Security Model for CSMA/CA Protocol in Wireless Networks

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-04-01 DOI: 10.33166/aetic.2024.02.004

A. Akinwale, John E. Efiong, E. A. Olajubu, G. A. Aderounmu

Mobile wireless networks enable the connection of devices to a network with minimal or no infrastructure. This brings the advantages of ease and cost-effectiveness, which have largely popularized such networks. Notwithstanding these merits, the open physical media, infrastructure-less attributes, and pervasive deployment of wireless networks make the channel of communication (media access) vulnerable to attacks such as traffic analysis, monitoring, and jamming. This study designed a virtual local area network (VLAN) model to circumvent virtual jamming attacks and other intrusions at the Media Access Control (MAC) layer of the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) protocol. A Torpor VLAN (TVLAN) data frame encapsulation and an algorithm for TVLAN security in CSMA/CA were formulated and presented. A simulation experiment was conducted on the model using OMNeT++ software. The performance metrics used to evaluate the model were packet delivery ratio, network throughput, end-to-end channel delay, and channel load. The simulation results show that the TVLAN defence mechanism did not increase the channel load arbitrarily during TVLAN defence. Similarly, the system throughput was shown to be 82% during TVLAN defence. Nevertheless, the network delay of the system during TVLAN defence was significantly high, while the channel load was 297 when the TVLAN security mechanism was launched. These results demonstrate the model’s ability to provide a survivability mechanism for critical systems under attack and to add a security layer to the CSMA/CA protocol in wireless networks. Such performance is required of a CSMA/CA infrastructure to improve the cybersecurity posture of a wireless network.

Citations: 0

Enhancing Robot Navigation Efficiency Using Cellular Automata with Active Cells

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-04-01 DOI: 10.33166/AETiC.2024.02.005

Saleem Alzoubi, Mahdi H. Miraz

Enhancing robot navigation efficiency is a crucial objective in modern robotics. Robots relying on external navigation systems are often susceptible to electromagnetic interference (EMI) and environmental disturbances, resulting in orientation errors within their surroundings. Therefore, the study employed an internal navigation system to enhance robot navigation efficacy under interference conditions, based on the analysis of the internal parameters and the external signals. This article presents details of the robot’s autonomous operation, which allows the robot’s trajectory to be set using an embedded map. The robot’s navigation process involves counting the number of wheel revolutions as well as adjusting wheel orientation after each straight path section. The article presents an autonomous robot navigation system that leverages an embedded control navigation map utilising cellular automata with active cells, and which can navigate effectively in an environment containing various types of obstacles. By analysing the neighbouring cells of the active cell, the cellular environment determines which cell should become active during the robot’s next movement step. This approach ensures the robot’s independence from external control inputs. Furthermore, the accuracy and speed of the robot’s movement have been further enhanced using a hexagonal mosaic for navigation surface mapping. This concept of utilising cellular automata with active cells has been extended to the navigation of a group of robots on a shared navigation surface, taking into account the intersections of the robots’ trajectories over time. To achieve this, a distance control module has been used that records the travelled trajectories in terms of wheel turns and revolutions.
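
The active-cell idea described above, in which the neighbours of the currently active cell are inspected to decide which cell becomes active next, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the square grid with four neighbours (the paper uses a hexagonal mosaic), the greedy "closest to the goal" transition rule, and the function names `next_active` and `navigate`. The abstract does not state the authors' actual transition rule.

```python
def next_active(grid, active, goal, visited):
    """Pick the free, unvisited neighbour of the active cell nearest the goal."""
    r, c = active
    candidates = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
        # grid value 0 = free cell, 1 = obstacle (hypothetical encoding).
        if inside and grid[nr][nc] == 0 and (nr, nc) not in visited:
            dist = abs(goal[0] - nr) + abs(goal[1] - nc)  # Manhattan distance
            candidates.append((dist, (nr, nc)))
    return min(candidates)[1] if candidates else None


def navigate(grid, start, goal, max_steps=100):
    """Advance the active cell step by step; return the path or None if stuck."""
    path, visited, active = [start], {start}, start
    for _ in range(max_steps):
        if active == goal:
            return path
        active = next_active(grid, active, goal, visited)
        if active is None:
            return None  # the greedy rule has no free neighbour left
        visited.add(active)
        path.append(active)
    return path if active == goal else None


# A wall in the middle row forces a detour along the right-hand column.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(navigate(grid, (0, 0), (2, 0)))
```

Because each decision uses only the active cell's immediate neighbourhood and the embedded map, no external positioning signal is needed, which is the property the abstract emphasises. A purely greedy rule like this one can reach dead ends in more complex maps, so it should be read as an illustration of the mechanism rather than a complete planner.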

Citations: 0

Lightweight Model for Occlusion Removal from Face Images

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-04-01 DOI: 10.33166/aetic.2024.02.001

Sincy John, A. Danti

In deep learning, the prevalence of models with a large number of parameters poses a significant challenge for low-computation devices. Model size, governed primarily by the weight parameters, critically shapes the computational demands of the occlusion removal process. Existing occlusion removal algorithms typically demand substantial computational resources and storage capacity, so we advocate a shift towards solutions suited to low-computation environments. To support real-time applications, it is imperative to deploy trained models on resource-constrained devices such as handheld and Internet of Things (IoT) devices, which possess limited memory and computational capabilities. There is therefore a critical need to compress and accelerate these models for deployment on such devices without compromising significantly on model accuracy. Our study contributes a compressed model designed specifically for addressing occlusion in face images on low-computation devices. We apply a dynamic quantization technique that reduces the precision of the weights of the Pix2pix generator model. The trained model is then compressed, which significantly reduces its size and execution time. The proposed model is lightweight: its storage requirement is drastically reduced and its execution time is significantly improved. The performance of the proposed method has been compared with other state-of-the-art methods in terms of PSNR and SSIM. Hence the proposed lightweight model is more suitable for real-time applications, at a lower computational cost.
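
To show why quantizing weights shrinks a model, the sketch below maps floating-point weights to 8-bit integers plus a single scale factor. It is a plain-Python illustration of symmetric per-tensor quantization under assumed names (`quantize_int8`, `dequantize`); it is not the authors' pipeline, and the abstract does not say which toolkit or quantization scheme they used.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] plus one float scale."""
    # Fall back to 1.0 so an all-zero tensor does not divide by zero.
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.31, -1.0, 0.07, 0.0, 0.84]  # hypothetical layer weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight differs from the original by at most half a step.
print(q, max(abs(w - r) for w, r in zip(weights, restored)))
```

Storing each weight in 8 bits rather than 32 cuts the weight storage to roughly a quarter, at the cost of a rounding error bounded by half the scale per weight. Whether that error is acceptable is an empirical question, which is why quality metrics such as PSNR and SSIM are checked after compression.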

Citations: 0

Single-channel Speech Separation Based on Double-density Dual-tree CWT and SNMF

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-01-01 DOI: 10.33166/aetic.2024.01.001

Md. Imran Hossain, Md. Abdur Rahim, Md. Najmul Hossain

Speech is essential to human communication; therefore, distinguishing it from noise is crucial. Speech separation becomes challenging in real-world circumstances with background noise and overlapping speech. Moreover, speech separation using the short-time Fourier transform (STFT) and the discrete wavelet transform (DWT) faces time and frequency resolution issues and time-variation issues, respectively. To solve these issues, a new speech separation technique is presented based on the double-density dual-tree complex wavelet transform (DDDTCWT) and sparse non-negative matrix factorization (SNMF). The signal is separated into high-pass and low-pass frequency components using DDDTCWT wavelet decomposition. For this analysis, we considered only the low-pass frequency components and zeroed out the high-pass ones. The STFT is then applied to each sub-band signal to generate a complex spectrogram. We then use SNMF to factorize the joint form of the magnitude and the absolute values of the real and imaginary (RI) components into basis and weight matrices. Most researchers enhance the magnitude spectra only, ignore the phase spectra, and estimate the separated speech using the noisy phase. As a result, some noise components remain in the estimated speech. We handle the signal's magnitude as well as the RI components, and estimate the phase from the RI parts. Finally, the separated speech signals are obtained using the inverse STFT (ISTFT) and the inverse DDDTCWT (IDDDTCWT). Separation performance improves owing to the estimation of the phase component and to the shift invariance, better directional selectivity, and scheme freedom of the DDDTCWT. On the TIMIT dataset, the proposed algorithm outperforms the NMF method with masking by 6.53–8.17 dB in SDR gain, 7.37–9.87 dB in SAR gain, and 14.92–17.21 dB in SIR gain.

Citations: 0

Explainable Software Defects Classification Using SMOTE and Machine Learning

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-01-01 DOI: 10.33166/aetic.2024.01.004

Agboeze Jude, Jia Uddin

Software defect prediction is a critical task in software engineering that aims to identify and mitigate potential defects in software systems. In recent years, numerous techniques and approaches have been developed to improve the accuracy and efficiency of defect prediction models. In this paper, we propose a comprehensive approach that addresses class imbalance by utilizing stratified splitting, explainable AI techniques, and a hybrid machine learning algorithm. To mitigate the impact of class imbalance, we employed stratified splitting during the training and evaluation phases. This method ensures that the class distribution is maintained in both the training and testing sets, enabling the model to learn from and generalize to the minority class examples effectively. Furthermore, we leveraged the explainable AI methods LIME and SHAP to enhance the interpretability of the machine learning models. To improve prediction accuracy, we propose a hybrid machine learning algorithm that combines the strengths of multiple models, resulting in improved overall performance. The experiment is evaluated using the NASA-MD datasets. The results reveal that handling class-imbalanced data with the stratified splitting approach achieves better overall performance than the SMOTE approach in Software Defect Detection (SDD).
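
Stratified splitting, as described above, partitions each class separately so that the training and testing sets keep the original class proportions. The sketch below is one minimal way to do this with the standard library; the function name `stratified_split`, the 20% test fraction, and the 90/10 label mix are illustrative assumptions rather than details from the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) that preserve each class's share."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one test sample per class; a class with a single sample
        # therefore ends up only in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced data: 90 clean modules (0) and 10 defective (1).
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels)
print(sum(labels[i] for i in train_idx), sum(labels[i] for i in test_idx))
```

With a plain random split, a 10% minority class could by chance be nearly absent from the test set; splitting per class rules that out, which is the property the abstract relies on.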

Citations: 0

Deep Learning Based Cyberbullying Detection in Bangla Language

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-01-01 DOI: 10.33166/aetic.2024.01.005

Sristy Shidul Nath, Razuan Karim, Mahdi H. Miraz

The Internet is currently the largest platform for global communication, including the expression of opinions, reviews, content, images, videos and so forth. Moreover, social media has become a very broad and highly engaging platform owing to its immense popularity and swift adoption. Increased social networking, however, also has detrimental impacts on society, leading to a range of unwanted phenomena such as online assault, intimidation, digital bullying, criminality and trolling. Hence, cyberbullying has become a pervasive and worrying problem that causes considerable psychological and emotional harm, particularly amongst teens and young adults. To lessen its negative effects and provide victims with prompt support, a great deal of research on identifying cyberbullying instances on various online platforms is emerging. Compared with other languages, Bangla (also known as Bengali) has received fewer research studies in this domain. This study demonstrates a deep learning strategy for identifying cyberbullying in Bengali, using a dataset of 12,282 varied comments from multiple social media sites. A two-layer bidirectional long short-term memory (Bi-LSTM) model has been built to identify cyberbullying, using a variety of optimisers as well as 5-fold cross-validation. To evaluate the functionality and efficacy of the proposed system, rigorous assessment and validation procedures have been employed throughout the project. The results reveal that the proposed model's accuracy with the momentum-based stochastic gradient descent (SGD) optimiser is 94.46%. With the Adam optimiser, the model reaches a higher accuracy of 95.08% and an F1 score of 95.23%, and it attains an accuracy of 94.31% in 5-fold cross-validation.
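The abstract reports accuracy and F1 under 5-fold cross-validation. The sketch below shows, under stated assumptions, how such folds and metrics are commonly computed. The helper names `k_fold_indices` and `accuracy_and_f1` and the toy predictions are hypothetical, and the authors' actual evaluation pipeline is not described in the abstract.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k contiguous, near-equal folds.

    Real pipelines usually shuffle the data first; this sketch keeps the order.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for a binary task, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Fold sizes for a dataset of 12,282 comments, the size stated in the abstract.
folds = list(k_fold_indices(12282, k=5))
print([len(validation) for _, validation in folds])

# Toy labels and predictions (hypothetical, not from the paper).
accuracy, f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 0])
print(accuracy, f1)
```

In a full run, a model would be trained on each `train` list and scored on the matching `validation` list, with the metrics averaged across the five folds.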

Citations: 0

Optimisation and Performance Computation of a Phase Frequency Detector Module for IoT Devices

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-01-01 DOI: 10.33166/aetic.2024.01.002

Md. Shahriar Khan Hemel, M. Reaz, S. Ali, Mohammad Arif Sobhan Bhuiyan, Mahdi H. Miraz

The Internet of Things (IoT) is pivotal in transforming the way we live and interact with our surroundings. To keep pace with advancing technologies, it is vital to achieve accuracy together with speed. A phase frequency detector (PFD) is a critical component for regulating and providing an accurate frequency in IoT devices. Designing a PFD poses challenges in achieving precise phase detection, minimising dead zones, optimising power consumption, and ensuring robust performance across various operational frequencies, necessitating complex engineering and innovative solutions. This study delves into optimising a PFD circuit, designed using 90 nm standard CMOS technology, with the aim of achieving superior operational frequencies. An efficient, high-frequency PFD design is crafted and analysed using Cadence Virtuoso. The study focuses on investigating the impact of optimising the PFD design. With the optimised PFD, an operational frequency of 5 GHz has been achieved, along with a power consumption of only 29 µW. The dead zone of the PFD is only 25 ps.
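To put the reported figures in relation to one another, the short calculation below derives a few quantities from the numbers in this abstract. It is only a back-of-the-envelope sketch: the function name `pfd_figures` is hypothetical, and the energy-per-cycle value assumes the 29 µW was measured while operating at 5 GHz, which the abstract does not state explicitly.

```python
def pfd_figures(freq_hz, power_w, dead_zone_s):
    """Derive period, relative dead zone, and energy per cycle from headline PFD figures."""
    period_s = 1.0 / freq_hz
    return {
        "period_ps": period_s * 1e12,
        # Share of one clock period in which small phase errors go undetected.
        "dead_zone_fraction": dead_zone_s / period_s,
        "dead_zone_deg": 360.0 * dead_zone_s / period_s,
        # Assumes the stated power was drawn at the stated frequency.
        "energy_per_cycle_fj": power_w / freq_hz * 1e15,
    }


figures = pfd_figures(freq_hz=5e9, power_w=29e-6, dead_zone_s=25e-12)
for name, value in figures.items():
    print(f"{name}: {value:.3f}")
```

At 5 GHz the period is about 200 ps, so a 25 ps dead zone corresponds to roughly one eighth of a cycle (about 45 degrees of phase), and 29 µW works out to roughly 5.8 fJ per cycle under the stated assumption.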

Citations: 0

Trajectory Optimization for a 6 DOF Robotic Arm Based on Reachability Time

Q2 Computer Science

Annals of Emerging Technologies in Computing

Pub Date : 2024-01-01 DOI: 10.33166/aetic.2024.01.003

Mahmoud A. A. Mousa, Abdelrahman Elgohr, Hatem A. Khater

The design of a robotic arm's trajectory is based on solving the inverse kinematics problem, with additional refinement against certain criteria. One common design issue is the trajectory optimization of the robotic arm. Because of the difficulty of the problem, many previously suggested methods yielded only marginal improvements. This paper introduces two approaches to achieving robotic arm trajectory control while maintaining the minimum reachability time. One approach is based on rule-based optimization and the other on a genetic algorithm. Our treatment of the problem builds on the robot's forward and inverse kinematics and takes into account the minimization of operating time throughout the operation cycle. The proposed techniques were validated, and all recommended criteria were compared, on the trajectory optimization of the KUKA KR 4 R600 six-degree-of-freedom robot. In conclusion, the genetic algorithm performs better than the rule-based one in terms of achieving a minimal trip time. We found that the genetic algorithm generates its solutions approximately 3 times faster than the rule-based algorithm for the same paths. We argue that this is because the rule-based algorithm produces its solutions only after exploring the problem's entire search space, which is time-consuming, whereas the genetic algorithm does not.
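The contrast drawn in this abstract, between searching the whole space and evolving a population of candidate solutions, can be sketched on a toy problem. Everything below is an illustrative assumption rather than the authors' method: the three-joint arm, the `MAX_SPEED` limits, the candidate joint configurations (standing in for multiple inverse-kinematics solutions per waypoint), and the cost model in which the slowest joint sets the move time. The exhaustive search is a stand-in for any method that visits the full search space, not a reproduction of the paper's rule-based algorithm.

```python
import itertools
import random

MAX_SPEED = [1.0, 1.0, 1.5]  # hypothetical joint speed limits in rad/s


def move_time(q_from, q_to):
    """Time for a synchronized move: the slowest joint sets the duration."""
    return max(abs(b - a) / v for a, b, v in zip(q_from, q_to, MAX_SPEED))


def path_time(choice, candidates, start):
    """Total time when waypoint i uses candidate configuration choice[i]."""
    total, current = 0.0, start
    for options, pick in zip(candidates, choice):
        total += move_time(current, options[pick])
        current = options[pick]
    return total


def exhaustive(candidates, start):
    """Evaluate every combination; the search space grows exponentially."""
    ranges = [range(len(options)) for options in candidates]
    return min(itertools.product(*ranges), key=lambda c: path_time(c, candidates, start))


def genetic(candidates, start, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Simple genetic algorithm: elitist selection, one-point crossover, point mutation."""
    rng = random.Random(seed)
    population = [
        tuple(rng.randrange(len(options)) for options in candidates)
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda c: path_time(c, candidates, start))
        parents = population[: pop_size // 2]  # keep the fastest half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))  # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < mutation_rate:
                gene = rng.randrange(len(candidates))
                child[gene] = rng.randrange(len(candidates[gene]))
            children.append(tuple(child))
        population = parents + children
    return min(population, key=lambda c: path_time(c, candidates, start))


# Hypothetical data: four waypoints, each reachable by two or three joint configurations.
start = (0.0, 0.0, 0.0)
candidates = [
    [(0.5, 0.2, -0.3), (-2.6, 0.4, 0.3)],
    [(1.0, -0.4, 0.6), (0.9, 1.2, -1.5), (-2.0, 0.1, 0.2)],
    [(1.4, 0.3, 0.1), (-1.7, -0.3, 2.0)],
    [(0.2, 0.0, 0.0), (0.3, 1.0, -1.0)],
]
best_exhaustive = exhaustive(candidates, start)
best_genetic = genetic(candidates, start)
print(path_time(best_exhaustive, candidates, start), path_time(best_genetic, candidates, start))
```

The exhaustive search is guaranteed to find the minimum but must evaluate every combination, while the genetic search evaluates a fixed budget of candidates and may settle for a near-optimal path. On a problem this small both are instantaneous; the trade-off only matters as the number of waypoints and configurations grows.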

Citations: 0