Leveraging the ultra-wideband advantages of the terahertz band, integrated sensing and communication (ISAC) can meet high-precision sensing demands in human-centric smart home applications. ISAC channel characteristics are the basis for ISAC system design. Currently, the ISAC channel is divided into target and background channels. Existing research primarily focuses on the attributes of the human target itself, e.g. radar cross-section and micro-Doppler effect. However, neither the impact of the human target on the pathloss characteristic of the background channel nor its impact on the multipath propagation characteristic of the target channel has been considered. To address this gap, we conduct indoor channel measurements at 105 GHz to investigate the ISAC channel characteristics under the impact of a human target. First, by analysing the power angular delay profiles with and without the human target, changes in the number and power of multipath components (MPCs) are observed. Then, a parameter called the power control factor is proposed to evaluate the human target's impact on pathloss, thereby modifying the existing pathloss model of the background channel. Finally, the MPCs belonging to the target channel are extracted from the target-oriented power delay profile to compute the power proportion of the MPCs of each bounce order on the target-Rx link, which supports the necessity of modelling multi-bounce (indirect) paths in the target channel.
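The last step, computing the power proportion per bounce order, can be illustrated with a minimal sketch. It assumes each extracted MPC is already labelled with a power in dB and a bounce order. The function name, the input format, and the example values are hypothetical and are not taken from the measurements.

```python
from collections import defaultdict


def power_proportion_by_bounce(mpcs):
    """Return {bounce_order: fraction of total linear power}.

    `mpcs` is a list of (power_dB, bounce_order) pairs (assumed format).
    Powers are converted from dB to linear scale before summing.
    """
    totals = defaultdict(float)
    for power_db, bounce in mpcs:
        totals[bounce] += 10 ** (power_db / 10)
    total = sum(totals.values())
    return {bounce: p / total for bounce, p in sorted(totals.items())}


# Made-up MPCs: one single-bounce, two double-bounce, one triple-bounce path.
example_mpcs = [(-80.0, 1), (-90.0, 2), (-90.0, 2), (-100.0, 3)]
proportions = power_proportion_by_bounce(example_mpcs)
for bounce, share in proportions.items():
    print(f"{bounce}-bounce paths carry {share:.1%} of the power")
```

The proportions are taken in linear power rather than in dB, so they sum to one across bounce orders.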
{"title":"An empirical study of ISAC channel characteristics with human target impact at 105 GHz","authors":"Wenjun Chen, Yuxiang Zhang, Yameng Liu, Jianhua Zhang, Huiwen Gong, Tao Jiang, Liang Xia","doi":"10.1049/ell2.70017","DOIUrl":"https://doi.org/10.1049/ell2.70017","url":null,"abstract":"<p>Leveraging the ultra-wideband advantages of the terahertz band, Integrated sensing and communication (ISAC) facilitates high-precision sensing demands in human smart home applications. ISAC channel characteristics are the basis for ISAC system design. Currently, the ISAC channel is divided into target and background channels. Existing researches primarily focus on the attributes of human target itself, e.g. radar cross-section and micro-Doppler effect. However, the impact of human target on neither the pathloss characteristic of background channel nor the multipath propagation characteristic of target channel is considered. To address the gap, we conduct indoor channel measurements at 105 GHz to investigate the ISAC channel characteristics with the impact of human target. Firstly, by analysing the power angular delay profiles with and without human target, the changes in quantity and power of multipath components (MPCs) are observed. Then, a parameter called power control factor is proposed to evaluate the human target impact on pathloss, thereby modifying the existing pathloss model of background channel. 
Eventually, the MPCs belonging to target channel are extracted within target-oriented power delay profile to count the power proportion of each bounce MPCs of the target-Rx link, which supports the necessity of multi-bounce (indirect) paths modelling in target channel.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70017","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142231101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Harshvardhan Uppaluru, Zoe Templin, Mohammed Rafeeq Khan, Md Omar Faruque, Feng Zhao, Jinhui Wang
Promising synaptic behaviour has been exhibited by memristors based on natural organic materials. Such memristor-based neuromorphic systems offer notable benefits, including environmental sustainability, low production and disposal costs, non-volatile storage capability, and bio/Complementary Metal-Oxide-Semiconductor (CMOS) compatibility. Here, a 256-level honey memristor-based neuromorphic system is experimentally evaluated for image recognition. In detail, first, 256-level honey memristors are manufactured and tested based on in-house technology; next, the non-linear characteristics and inherent variation of honey memristor devices, which lead to imprecise weight updates and limit the inference accuracy, are investigated. Experimental results indicate that the inference accuracy of the 256-level honey memristor-based neuromorphic system is greater than 88% without cycle-to-cycle variations and 87% with cycle-to-cycle variations for different optimization algorithms. The overall performance of optimization algorithms with and without variation is compared in terms of energy and latency, where the momentum algorithm consistently outperforms the rest of the algorithms. This 256-level honey memristor is a promising alternative enabling sustainable neuromorphic systems, encouraging further research into natural organic materials for neuromorphic computing.
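As a rough illustration of how a limited number of device states makes weight updates imprecise, the sketch below applies a momentum update and then snaps the weight to one of 256 levels. The evenly spaced levels, the weight range, the learning rate, and the momentum coefficient are all assumptions made for illustration. The abstract reports that real honey memristors are non-linear and show cycle-to-cycle variation, which this sketch does not model.

```python
LEVELS = 256  # number of programmable states, as in the abstract


def quantize(w, w_min=-1.0, w_max=1.0, levels=LEVELS):
    """Snap w to the nearest of `levels` evenly spaced states (assumed linear)."""
    w = min(max(w, w_min), w_max)          # clip to the representable range
    step = (w_max - w_min) / (levels - 1)  # spacing between adjacent states
    return w_min + round((w - w_min) / step) * step


def momentum_step(w, velocity, grad, lr=0.1, beta=0.9):
    """One classical momentum update followed by quantisation of the weight."""
    velocity = beta * velocity + grad
    return quantize(w - lr * velocity), velocity


# Toy usage: minimise (w - 0.3)^2 starting from w = 0.
w, v = 0.0, 0.0
for _ in range(50):
    grad = 2 * (w - 0.3)
    w, v = momentum_step(w, v, grad)
print(w)
```

The stored weight can only land on one of the 256 states, so the optimiser settles near the target value rather than exactly on it.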
{"title":"256-level honey memristor-based in-memory neuromorphic system","authors":"Harshvardhan Uppaluru, Zoe Templin, Mohammed Rafeeq Khan, Md Omar Faruque, Feng Zhao, Jinhui Wang","doi":"10.1049/ell2.70029","DOIUrl":"https://doi.org/10.1049/ell2.70029","url":null,"abstract":"<p>Promising synaptic behaviour has been exhibited by memristors based on natural organic materials. Such memristor-based neuromorphic systems offer notable benefits, including environmental sustainability, low production and disposal costs, non-volatile storage capability, and bio/Complementary Metal-Oxide-Semiconductor (CMOS) compatibility. Here, a 256-level honey memristor-based neuromorphic system is experimentally evaluated for image recognition. In detail, first, 256-level honey memristors are manufactured and tested based on in-house technology; next, the non-linear characteristics and inherent variation of honey memristor devices, which lead to imprecise weight updates and limit the inference accuracy, are investigated. Experimental results indicate that the inference accuracy of the 256-level honey memristor-based neuromorphic system is greater than 88% without cycle-to-cycle variations and 87% with cycle-to-cycle variations for different optimization algorithms. The overall performance of optimization algorithms with and without variation is compared in terms of energy and latency, where the momentum algorithm consistently outperforms the rest of the algorithms. 
This 256-level honey memristor is a promising alternative enabling sustainable neuromorphic systems, encouraging further research into natural organic materials for neuromorphic computing.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70029","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142231102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual defect recognition techniques based on deep learning models are crucial for modern industrial quality inspection. The backbone, which serves as the primary feature extraction component of a defect recognition model, has not been thoroughly explored. High-performance vision transformers (ViTs) are less widely adopted because of their high computational complexity and the limited computational resources and storage hardware available in industrial scenarios. This paper presents LSA-Former, a lightweight transformer backbone that integrates the benefits of convolution and ViT. LSA-Former introduces a novel self-attention mechanism with linear computational complexity, enabling it to capture local and global semantic features with fewer parameters. LSA-Former is pre-trained on ImageNet-1K and surpasses state-of-the-art methods. LSA-Former is then employed as the backbone for various detectors and evaluated on the PCB defect detection task. The proposed method reduces the parameter count by at least 18M and exceeds the baseline by more than 2.2 mAP.
{"title":"A lightweight transformer with linear self-attention for defect recognition","authors":"Yuwen Zhai, Xinyu Li, Liang Gao, Yiping Gao","doi":"10.1049/ell2.13292","DOIUrl":"https://doi.org/10.1049/ell2.13292","url":null,"abstract":"<p>Visual defect recognition techniques based on deep learning models are crucial for modern industrial quality inspection. The backbone, serving as the primary feature extraction component of the defect recognition model, has not been thoroughly exploited. High-performance vision transformer (ViT) is less adopted due to high computational complexity and limitations of computational resources and storage hardware in industrial scenarios. This paper presents LSA-Former, a lightweight transformer architectural backbone that integrates the benefits of convolution and ViT. LSA-Former proposes a novel self-attention with linear computational complexity, enabling it to capture local and global semantic features with fewer parameters. LSA-Former is pre-trained on ImageNet-1K and surpasses state-of-the-art methods. LSA-Former is employed as the backbone for various detectors, evaluated specifically on the PCB defect detection task. The proposed method reduces at least 18M parameters and exceeds the baseline by more than 2.2 mAP.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.13292","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142169953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hao Niu, Fengming Luo, Bo Yuan, Yi Zhang, Jianyong Wang
Recently, the Vision Transformer (ViT) has revolutionized various domains in computer vision. Transferring ViT models pre-trained on large-scale datasets has proven to be a promising approach for downstream tasks. However, traditional transfer methods introduce numerous additional parameters in the transformer blocks, posing new challenges in learning downstream tasks. This article proposes an efficient transfer method from the perspective of neural Ordinary Differential Equations (ODEs) to address this issue. On the one hand, the residual connections in the transformer layers can be interpreted as the numerical integration of differential equations. Therefore, the transformer block can be described as two explicit Euler method equations. By dynamically learning the step size in the explicit Euler equation, a highly lightweight method for transferring the transformer block is obtained. On the other hand, a new learnable neural memory ODE block is proposed, inspired by the self-inhibition mechanism in neural systems. It increases the diversity of the neurons' dynamical behaviours to transfer the head block efficiently while also enhancing non-linearity. Experimental results in image classification demonstrate that the proposed approach can effectively transfer ViT models and outperform state-of-the-art methods.
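The reading of a transformer block as two explicit Euler steps can be sketched as follows. A residual update x + f(x) is an Euler step with step size 1, so scaling the residual branch by a step size h gives x + h·f(x). The two sublayers below are toy stand-ins rather than real attention or MLP layers, and the names `h_attn` and `h_mlp` are hypothetical. The sketch only shows where a learnable step size would enter, not the authors' implementation.

```python
import math


def euler_step(x, f, h):
    """One explicit Euler step x + h * f(x) on a list of floats."""
    return [xi + h * fi for xi, fi in zip(x, f(x))]


def toy_attention(x):
    """Stand-in for the attention sublayer: pulls each entry toward the mean."""
    mean = sum(x) / len(x)
    return [mean - xi for xi in x]


def toy_mlp(x):
    """Stand-in for the MLP sublayer: element-wise non-linearity."""
    return [math.tanh(xi) for xi in x]


def transformer_block(x, h_attn=1.0, h_mlp=1.0):
    """Two Euler steps; h_attn = h_mlp = 1 recovers plain residual connections."""
    x = euler_step(x, toy_attention, h_attn)
    return euler_step(x, toy_mlp, h_mlp)


print(transformer_block([1.0, 3.0]))            # standard residual block
print(transformer_block([1.0, 3.0], 0.5, 0.1))  # rescaled residual branches
```

Under this view, adapting a block means learning only two scalars per block, which is why such a scheme can be very lightweight.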
{"title":"Efficient visual transformer transferring from neural ODE perspective","authors":"Hao Niu, Fengming Luo, Bo Yuan, Yi Zhang, Jianyong Wang","doi":"10.1049/ell2.70015","DOIUrl":"https://doi.org/10.1049/ell2.70015","url":null,"abstract":"<p>Recently, the Visual Image Transformer (ViT) has revolutionized various domains in computer vision. The transfer of pre-trained ViT models on large-scale datasets has proven to be a promising method for downstream tasks. However, traditional transfer methods introduce numerous additional parameters in transformer blocks, posing new challenges in learning downstream tasks. This article proposes an efficient transfer method from the perspective of neural Ordinary Differential Equations (ODEs) to address this issue. On the one hand, the residual connections in the transformer layers can be interpreted as the numerical integration of differential equations. Therefore, the transformer block can be described as two explicit Euler method equations. By dynamically learning the step size in the explicit Euler equation, a highly lightweight method for transferring the transformer block is obtained. On the other hand, a new learnable neural memory ODE block is proposed by taking inspiration from the self-inhibition mechanism in neural systems. It increases the diversity of dynamical behaviours of the neurons to transfer the head block efficiently and enhances non-linearity simultaneously. 
Experimental results in image classification demonstrate that the proposed approach can effectively transfer ViT models and outperform state-of-the-art methods.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70015","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142169948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this letter, novel methods for quickly generating frequency hopping (FH) sequences with a good out-of-phase peak aperiodic Hamming auto-correlation (PAHAC) property are proposed. Compared with a brute-force search algorithm, the proposed algorithms have an obvious advantage in time complexity. Results show that the algorithms can generate FH sequences with optimal or quasi-optimal PAHAC performance in all tested cases.
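For reference, the metric and the brute-force baseline can be sketched as below. The sketch uses the standard definition of the aperiodic Hamming auto-correlation at shift τ: the number of positions i with x[i] = x[i+τ], for 0 ≤ i < L − τ. The function names are hypothetical, and the exhaustive search is only the baseline mentioned in the abstract, not the proposed low-complexity algorithms.

```python
from itertools import product


def hamming_autocorrelation(seq, shift):
    """Count the coincidences between seq and its copy delayed by `shift`."""
    return sum(1 for i in range(len(seq) - shift) if seq[i] == seq[i + shift])


def pahac(seq):
    """Peak out-of-phase aperiodic Hamming auto-correlation (shifts 1..L-1)."""
    return max(hamming_autocorrelation(seq, s) for s in range(1, len(seq)))


def brute_force_search(length, num_freqs):
    """Exhaustive search over all num_freqs**length sequences."""
    return min(product(range(num_freqs), repeat=length), key=pahac)


example = [0, 1, 2, 0, 3]        # frequency slot indices
print(pahac(example))            # → 1 (only shift 3 gives a coincidence)
best = brute_force_search(4, 2)  # 2 frequencies, length 4: 16 candidates
print(best, pahac(best))
```

The search space grows as `num_freqs ** length`, which is why exhaustive search quickly becomes impractical for realistic sequence lengths.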
{"title":"Low-complexity algorithms for generating frequency hopping sequences with good aperiodic hamming correlation property","authors":"Shifu Yang, Zhe Xu, Kaichuang Jiang, Xing Liu","doi":"10.1049/ell2.70006","DOIUrl":"https://doi.org/10.1049/ell2.70006","url":null,"abstract":"<p>In this letter, novel methods for quickly generating frequency hopping (FH) sequences with good out-of-phase peak aperiodic Hamming auto-correlation (PAHAC) property are proposed. Compared with brute-force search algorithm, the proposed algorithms have an obvious advantage on their time complexity. Results show that the algorithms can generate FH sequences with optimal/quasi-optimal PAHAC performance in all tested cases.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70006","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142158543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A constrained joint dictionary learning (CJDL) algorithm for high-precision channel representation in massive multiple input multiple output (MIMO) satellite systems is proposed. Taking into account the angular reciprocity of massive MIMO satellite systems, joint dictionary learning can establish a common support basis for both uplink and downlink. Previous deterministic dictionaries have used deterministic bases, such as the discrete Fourier transform (DFT) or orthogonal DFT (ODFT) basis, which tend to represent noise interference as part of the channel characteristics. Moreover, such deterministic dictionaries cannot adapt to dynamic communication environments. In contrast, dictionary learning has shown the potential to significantly improve the accuracy of channel representation. Nevertheless, current research on trained dictionaries lacks analysis of constraints and boundary requirements, resulting in a suboptimal basis. To address this issue, the conditional constraints associated with a joint dictionary for channel representation are analysed. To screen for the optimal basis, the joint dictionary is subjected to constraints, including uplink and downlink constraints. Furthermore, the authors aim to quantify the maximum boundary of joint dictionary learning. Additionally, a joint dictionary updating method based on singular value decomposition under constraint boundary conditions is proposed. Simulation results demonstrate that the proposed CJDL algorithm provides a more accurate and robust channel representation.
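To make the deterministic baseline concrete, the sketch below builds a unitary DFT dictionary and projects a single-path array response onto it. The uniform linear array, the steering-vector convention, and the 10% significance threshold are assumptions chosen for illustration and do not come from the letter. The example shows a generally known property of fixed DFT grids that the abstract does not discuss: a path aligned with a grid point needs one atom, while an off-grid path spreads over many.

```python
import cmath
import math


def dft_dictionary(n):
    """Return the n atoms (columns) of the unitary n-point DFT basis."""
    return [[cmath.exp(-2j * math.pi * m * k / n) / math.sqrt(n)
             for m in range(n)] for k in range(n)]


def steering(n, spatial_freq):
    """Single-path array response for a normalised spatial frequency."""
    return [cmath.exp(-2j * math.pi * m * spatial_freq) for m in range(n)]


def coefficients(atoms, h):
    """Inner products <atom, h>: the representation of h in the dictionary."""
    return [sum(a.conjugate() * x for a, x in zip(atom, h)) for atom in atoms]


def significant(coeffs, rel_threshold=0.1):
    """Number of coefficients above rel_threshold times the largest one."""
    peak = max(abs(c) for c in coeffs)
    return sum(1 for c in coeffs if abs(c) > rel_threshold * peak)


N = 16
atoms = dft_dictionary(N)
on_grid = significant(coefficients(atoms, steering(N, 3 / N)))
off_grid = significant(coefficients(atoms, steering(N, 3.5 / N)))
print(on_grid, off_grid)
```

A learned dictionary replaces these fixed atoms with atoms fitted to channel samples. The abstract credits that step with improving representation accuracy.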
{"title":"Sparse representation for massive MIMO satellite channel based on joint dictionary learning","authors":"Qing yang Guan, Shuang Wu","doi":"10.1049/ell2.70021","DOIUrl":"https://doi.org/10.1049/ell2.70021","url":null,"abstract":"<p>A constrained joint dictionary learning (CJDL) algorithm for high-precision channel representation in massive multiple input multiple output (MIMO) satellite systems is proposed. Furthermore, taking into account the angular reciprocity of massive MIMO satellite systems, joint dictionary learning can establish a common support basis for both uplink and downlink. Previous deterministic dictionary has utilized deterministic basis, such as discrete Fourier transform (DFT) or orthogonal DFT (ODFT) basis, which tend to represent noise interference as part of channel characteristics. Furthermore, this deterministic dictionary is not able to adapt to dynamic communication environments. However, dictionary learning has shown the potential to significantly improve the accuracy of channel representation. Nevertheless, current research on training dictionary lacks analysis regarding constraints and boundary requirements, resulting in a suboptimal basis. To address this issue, conditional constraints associated with joint dictionary for channel representation are analysed. To screen for optimal basis, the joint dictionary is subject to constraints, including uplink and downlink constraints. Furthermore, the authors aim to quantify the maximum boundary of joint dictionary learning. Additionally, a joint dictionary updating method with singular value decomposition under constraint boundary conditions is proposed. 
Simulation results demonstrate that the proposed CJDL algorithm provides a more accurate and robust channel representation.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70021","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142158542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wireless quantum optical communications over multipath channels with uncorrelated scattering still appear to be an open issue. Classical optical transmission over such channels has already been considered, but those studies neglect Bose–Einstein distributed thermal noise photons, so their results cannot be applied here. A viable way forward is proposed.
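For context, the Bose–Einstein statistics referred to above take the following textbook form for a single thermal mode. The symbols (mean photon number $\bar{n}$, optical frequency $f$, temperature $T$) are introduced here for illustration and are not necessarily the notation used in the letter.

```latex
% Probability of finding n thermal photons in a single mode
P(n) = \frac{\bar{n}^{\,n}}{(1+\bar{n})^{\,n+1}}, \qquad n = 0, 1, 2, \dots

% Mean thermal photon number at frequency f and temperature T
\bar{n} = \frac{1}{\exp\!\left(\dfrac{h f}{k_{\mathrm{B}} T}\right) - 1}

% Variance, which exceeds the Poisson value \bar{n}
\operatorname{Var}(n) = \bar{n}^{2} + \bar{n}
```

The variance exceeds that of Poisson-distributed photon counts with the same mean by $\bar{n}^{2}$, which is one way to see why results that ignore thermal photons need not carry over.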
{"title":"Wireless quantum optical communications with uncorrelated scattering multipath reception","authors":"Peter Jung, Guido Horst Bruck","doi":"10.1049/ell2.70022","DOIUrl":"https://doi.org/10.1049/ell2.70022","url":null,"abstract":"<p>Wireless quantum optical communications over multipath channels with uncorrelated scattering still seem to be an open issue, although classical optical transmission was already considered, however, neglecting Bose–Einstein distributed thermal noise photons, leading to results that cannot be applied here. A viable way forward is proposed.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70022","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142158541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reconfigurable intelligent surfaces (RISs) have attracted a great deal of interest due to their potential contributions to next-generation wireless networks. This letter proposes an enhancement to the physical layer security (PLS) of a multi-hop RIS-assisted underwater optical wireless communication (UOWC) system. Owing to the complexity of the underwater environment, a security-based adaptive RIS (SA-RIS) clustering strategy, which aims to reflect optical signals among clusters to improve the performance of the overall system, is evaluated. Based on the underwater channel model, closed-form expressions for the probability density function (PDF) and cumulative distribution function (CDF) are derived. Moreover, performance metrics such as the secrecy outage probability (SOP) and average secrecy capacity (ASC) are evaluated under different scenarios as the number of RIS clusters increases. The results demonstrate that, compared with the case in which no measure is taken against the eavesdropper, the proposed strategy can improve the SOP significantly in evasion scenarios. It can be concluded that the system secrecy performance is further improved by assigning RIS clusters with suitable channel quality.
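The two secrecy metrics can be illustrated with a small Monte Carlo sketch based on their standard definitions. The secrecy capacity is C_s = max(0, log2(1 + γ_B) − log2(1 + γ_E)), the SOP is the probability that C_s falls below a target rate, and the ASC is the mean of C_s. The exponentially distributed SNRs below are a placeholder assumption and do not represent the underwater channel model or the closed-form expressions of the letter. All names and parameter values are hypothetical.

```python
import math
import random


def secrecy_capacity(snr_main, snr_eve):
    """Instantaneous secrecy capacity in bit/s/Hz (never negative)."""
    return max(0.0, math.log2(1 + snr_main) - math.log2(1 + snr_eve))


def estimate_sop_asc(mean_snr_main, mean_snr_eve, target_rate,
                     trials=20_000, seed=0):
    """Monte Carlo estimates of (SOP, ASC) for exponentially distributed SNRs."""
    rng = random.Random(seed)
    outages, total = 0, 0.0
    for _ in range(trials):
        cs = secrecy_capacity(rng.expovariate(1 / mean_snr_main),
                              rng.expovariate(1 / mean_snr_eve))
        outages += cs < target_rate  # outage: secrecy rate not supported
        total += cs
    return outages / trials, total / trials


sop_weak, asc_weak = estimate_sop_asc(1.0, 1.0, target_rate=1.0)
sop_strong, asc_strong = estimate_sop_asc(100.0, 1.0, target_rate=1.0)
print(f"weak main link:   SOP={sop_weak:.3f}, ASC={asc_weak:.3f}")
print(f"strong main link: SOP={sop_strong:.3f}, ASC={asc_strong:.3f}")
```

Improving the legitimate link relative to the eavesdropper's link lowers the SOP and raises the ASC, which is the direction of improvement the clustering strategy targets.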
{"title":"A novel security-based adaptive reconfigurable intelligent surfaces assisted clustering strategy","authors":"Yue Tian, Xiaofan Zheng","doi":"10.1049/ell2.70008","DOIUrl":"https://doi.org/10.1049/ell2.70008","url":null,"abstract":"<p>Reconfigurable intelligent surfaces (RISs) have attracted a great deal of interest due to the potential contributions to the next-generation wireless networks. This letter proposes an enhancement to the physical layer security (PLS) of a multi-hop RIS-assisted underwater optical wireless communication (UOWC) system. Owing to the complexity of the underwater environment, a security-based adaptive RIS (SA-RIS) clustering strategy, which aims to reflect optical signals among clusters to improve the performance of the overall system, is evaluated. By combining the underwater channel model, the closed-form expressions of probability density function (PDF) and cumulative distribution function (CDF) are derived. Moreover, by increasing the numbers of RIS clusters, the performance metrics such as secrecy outage probability (SOP) and average secrecy capacity (ASC) are evaluated under different scenarios. The obtained results demonstrated that, in contrast to the case without preventing the eavesdropper, the proposed strategy in evasion scenarios could improve the SOP significantly. 
It can be concluded that the system secrecy performances are further improved by assigning different RIS clusters with proper channel quality.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70008","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142137827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jie Li, Kun Wang, Zhiyuan Li, Bingchen Zhang, Yirong Wu
Tomographic synthetic aperture radar is an advanced multi-channel interferometric technique for retrieving 3-D spatial information. It can be regarded as an inherently sparse reconstruction problem and can be solved using compressive sensing algorithms. However, the performance of these algorithms is limited by the number of acquisitions, and they suffer from a heavy computational burden in practice. This paper proposes a novel deep learning method that is carried out and optimized end to end by a generative adversarial network. The proposed method applies cascaded U-Net architectures to reconstruct full-channel synthetic aperture radar images and to refine the obtained tomographic results, respectively. The proposed network is trained using simulated data, and the technique is validated on both simulated and real data. The tests show promising results with a limited number of acquisitions while reducing the computation time.
{"title":"Tomographic SAR imaging via generative adversarial neural network with cascaded U-Net architecture","authors":"Jie Li, Kun Wang, Zhiyuan Li, Bingchen Zhang, Yirong Wu","doi":"10.1049/ell2.13211","DOIUrl":"https://doi.org/10.1049/ell2.13211","url":null,"abstract":"<p>Tomographic synthetic aperture radar is an advanced multi-channel interferometric technique for retrieving 3-D spatial information. It can be regarded as an inherently sparse reconstruction problem and can be solved using compressive sensing algorithms. However, the performances are limited by the number of acquisitions and suffer from computational burdens in practice. This paper proposes a novel method based on deep learning, which is carried out and optimized in an end-to-end manner by the generative adversarial neural networks. The proposed method applies the cascaded U-Net architectures to achieve the reconstruction of full-channel synthetic aperture radar images and the refinement of obtained tomographic results, respectively. The proposed network is trained using simulated data and validate the technique on simulated and real data. The tests show promising results with the limited number of acquisitions while reducing the computation time.</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.13211","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142137830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, video-based hand–object interaction has received widespread attention from researchers. However, due to the complexity and occlusion of hand movements, hand–object interaction recognition based on RGB videos remains a highly challenging task. Here, an end-to-end spatio-temporal former (STFormer) network for understanding hand behaviour in interactions is proposed. The network consists of three modules: FlexiViT feature extraction, hand–object pose estimator, and interaction action classifier. The FlexiViT is used to extract multi-scale features from each image frame. The hand–object pose estimator is designed to predict 3D hand pose keypoints and object labels for each frame. The interaction action classifier is used to predict the interaction action categories for the entire video. The experimental results demonstrate that our approach achieves competitive recognition accuracies of 94.96% and 88.84% on two datasets, namely first-person hand action (FPHA) and 2 Hands and Objects (H2O).
{"title":"STFormer: Spatio-temporal former for hand–object interaction recognition from egocentric RGB video","authors":"Jiao Liang, Xihan Wang, Jiayi Yang, Quanli Gao","doi":"10.1049/ell2.70010","DOIUrl":"https://doi.org/10.1049/ell2.70010","url":null,"abstract":"<p>In recent years, video-based hand–object interaction has received widespread attention from researchers. However, due to the complexity and occlusion of hand movements, hand–object interaction recognition based on RGB videos remains a highly challenging task. Here, an end-to-end spatio-temporal former (STFormer) network for understanding hand behaviour in interactions is proposed. The network consists of three modules: FlexiViT feature extraction, hand–object pose estimator, and interaction action classifier. The FlexiViT is used to extract multi-scale features from each image frame. The hand–object pose estimator is designed to predict 3D hand pose keypoints and object labels for each frame. The interaction action classifier is used to predict the interaction action categories for the entire video. The experimental results demonstrate that our approach achieves competitive recognition accuracies of 94.96% and 88.84% on two datasets, namely first-person hand action (FPHA) and 2 Hands and Objects (H2O).</p>","PeriodicalId":11556,"journal":{"name":"Electronics Letters","volume":null,"pages":null},"PeriodicalIF":0.7,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ell2.70010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142142361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}