Pub Date: 2023-05-05 | DOI: 10.3389/frsip.2023.1197240
Mirza Asif Haider, Yimin D. Zhang
Integrated sensing and communication (ISAC) is a cutting-edge technology aimed at achieving high-resolution target sensing and high data-rate communications using a shared spectrum. This innovative approach optimizes the usage of the radio spectrum with no or minimal mutual interference. The capability of the reconfigurable intelligent surface (RIS) to control the environment and provide additional degrees of freedom is driving the development of RIS-aided ISAC. In this mini-review, we provide an overview of the current state of the art of RIS-aided ISAC technology, including various system configurations, approaches, and signal processing techniques.
Title: RIS-aided integrated sensing and communication: a mini-review
Pub Date: 2023-05-02 | DOI: 10.3389/frsip.2023.1106465
Ahmed Cheikh Sidiya, Xuanang Xu, N. Xu, Xin Li
Blind restoration of low-quality faces in the real world has advanced rapidly in recent years. The rich and diverse priors encapsulated by pre-trained face GANs have demonstrated their effectiveness in reconstructing high-quality faces from low-quality observations in the real world. However, the modeling of degradation in real-world face images remains poorly understood, limiting the generalization of existing methods. Inspired by the success of pre-trained models and transformers in recent years, we propose to solve the problem of blind restoration by jointly exploiting their power for degradation and prior learning, respectively. On the one hand, we train a two-generator architecture for degradation learning to transfer the style of low-quality real-world faces to the high-resolution output of a pre-trained StyleGAN. On the other hand, we present a hybrid architecture, called Skip-Transformer (ST), which combines transformer encoder modules with a pre-trained StyleGAN-based decoder using skip layers. Such a hybrid design is innovative in that it represents the first attempt to jointly exploit the global attention mechanism of the transformer and pre-trained StyleGAN-based generative facial priors. We have compared our DL-ST model with three recent benchmark methods for blind image restoration (DFDNet, PSFRGAN, and GFP-GAN). Our experimental results have shown that this work outperforms all other competing methods, both subjectively and objectively (as measured by the Fréchet Inception Distance and NIQE metrics).
Title: Degradation learning and Skip-Transformer for blind face restoration
Pub Date: 2023-04-05 | DOI: 10.3389/frsip.2023.1132672
Victor Lazzarini, Damián Keller, Nemanja Radivojević
The reconstruction of tools and artworks belonging to the origins of music computing unveils the dynamics of distributed knowledge underlying some of the major breakthroughs that took place during the analogue-digital transition of the 1950s and 1960s. We document the implementation of two musical replicas, the Computer Suite for Little Boy and For Ann (Rising). Our archaeological ubiquitous-music methods yield fresh insights into both convergences and contradictions implicit in the creation of cutting-edge technologies, pointing to design qualities such as terseness and ambiguity. Through new renditions of historically significant artefacts, enabled by the recovery of artistic first-hand sources and of one of the early computer music environments, MUSIC V, we explore the emergence of exploratory simulations of new musical worlds.
Title: Issues of ubiquitous music archaeology: Shared knowledge, simulation, terseness, and ambiguity in early computer music
Pub Date: 2023-04-03 | DOI: 10.3389/frsip.2023.1064138
Hyeonseok Kim, Justin Luo, Shannon Chu, C. Cannard, Sven Hoffmann, M. Miyakoshi
Independent component analysis (ICA) has been widely used for electroencephalography (EEG) analyses. However, ICA performance relies on several crucial assumptions about the data. Here, we focus on the granularity of data rank, i.e., the number of linearly independent EEG channels. When the data are rank-full (i.e., all channels are independent), ICA produces as many independent components (ICs) as the number of input channels (rank-full decomposition). However, when the input data are rank-deficient, as is the case with bridged or interpolated electrodes, ICA produces the same number of ICs as the data rank (forced rank deficiency decomposition), introducing undesired ghost ICs and indicating a bug in ICA. We demonstrated that the ghost ICs have white noise properties, in both time and frequency domains, while maintaining surprisingly typical scalp topographies, and can therefore be easily missed by EEG researchers and affect findings in unknown ways. This problem occurs when the minimum eigenvalue λ_min of the input data is smaller than a certain threshold, leading to matrix inversion failure as if the rank-deficient inversion was forced, even if the data rank is cleanly deficient by one. We defined this problem as the effective rank deficiency. Using sound file mixing simulations, we first demonstrated the effective rank deficiency problem and determined that the critical threshold for λ_min is 10^−7 in the given situation. Second, we used empirical EEG data to show how two preprocessing stages, re-referencing to average without including the initial reference and non-linear electrode interpolation, caused this forced rank deficiency problem. Finally, we showed that the effective rank deficiency problem can be solved by using the identified threshold (λ_min = 10^−7) and the correct re-referencing procedure described herein.
The former ensures the achievement of effective rank-full decomposition by properly reducing the input data rank, and the latter allows avoidance of a widely practiced incorrect re-referencing approach. Based on the current literature, we discuss the ambiguous status of the initial reference electrode when re-referencing. We have made our data and code available to facilitate the implementation of our recommendations by the EEG community.
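The rank check described in the abstract can be sketched numerically. The helper below is illustrative (not the authors' released code): it treats covariance eigenvalues below the 10^−7 threshold reported above as numerically zero, and shows how average-referencing without keeping the initial reference silently drops the data rank by one.

```python
import numpy as np

def effective_rank(data, lam_min=1e-7):
    """Estimate the effective rank of channel-by-time data.

    Eigenvalues of the channel covariance below lam_min are treated as
    numerically zero, following the 1e-7 threshold from the abstract.
    (Hypothetical helper, for illustration only.)
    """
    eigvals = np.linalg.eigvalsh(np.cov(data))
    return int(np.sum(eigvals > lam_min))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5000))         # 8 independent channels
x_avg = x - x.mean(axis=0, keepdims=True)  # average reference: rows now sum to 0

full = effective_rank(x)           # all 8 channels independent
deficient = effective_rank(x_avg)  # rank drops by one after re-referencing
```

In practice one would reduce the data to `deficient` principal components before running ICA, so that the decomposition is effectively rank-full.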
Title: ICA’s bug: How ghost ICs emerge from effective rank deficiency caused by EEG electrode interpolation and incorrect re-referencing
Pub Date: 2023-03-13 | DOI: 10.3389/frsip.2023.1137006
Junjie Ke, Tian Zhang, Yilin Wang, P. Milanfar, Feng Yang
No-reference video quality assessment (NR-VQA) for user-generated content (UGC) is crucial for understanding and improving visual experience. Unlike video recognition tasks, VQA tasks are sensitive to changes in input resolution. Since a large share of UGC videos nowadays are 720p or above, the fixed and relatively small input used in conventional NR-VQA methods results in missing high-frequency details for many videos. In this paper, we propose a novel Transformer-based NR-VQA framework that preserves high-resolution quality information. With a multi-resolution input representation and a novel multi-resolution patch sampling mechanism, our method enables a comprehensive view of both the global video composition and local high-resolution details. The proposed approach can effectively aggregate quality information across different granularities in the spatial and temporal dimensions, making the model robust to input resolution variations. Our method achieves state-of-the-art performance on the large-scale UGC VQA datasets LSVQ and LSVQ-1080p, and on KoNViD-1k and LIVE-VQC without fine-tuning.
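A minimal sketch of multi-resolution patch sampling: patches of a fixed size are drawn from the frame at several scales, so the same patch budget covers both local full-resolution detail and a coarse global view. The patch size, the scale factors, and the strided downscaling below are illustrative choices, not taken from the MRET paper.

```python
import numpy as np

def multires_patches(frame, patch=32, scales=(1, 2, 4)):
    """Sample non-overlapping fixed-size patches at several resolutions.

    Illustrative sketch: scale 1 yields native-resolution detail patches,
    larger scales yield progressively more global views of the frame.
    """
    out = []
    for s in scales:
        small = frame[::s, ::s]  # naive downscale by striding
        h, w = small.shape[:2]
        for y in range(0, h - patch + 1, patch):
            for x in range(0, w - patch + 1, patch):
                out.append(small[y:y + patch, x:x + patch])
    return np.stack(out)

frame = np.random.rand(128, 128)
patches = multires_patches(frame)
# 16 patches at scale 1, 4 at scale 2, 1 at scale 4 -> 21 patches of 32x32
```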
Title: MRET: Multi-resolution transformer for video quality assessment
Pub Date: 2023-02-10 | DOI: 10.3389/frsip.2023.1089366
A. Almeida, Weicong Li, Emery Schubert, John Smith, J. Wolfe
Measuring fine-grained physical interaction between the human player and the musical instrument can significantly improve our understanding of music performance. This article presents a Musical Instrument Performance Capture and Analysis Toolbox (MIPCAT) that can be used to capture and to process the physical control variables used by a musician while performing music. This includes both a measurement apparatus with sensors and a software toolbox for analysis. Several of the components used here can also be applied in other musical contexts. The system is here applied to the clarinet, where the instrument sensors record blowing pressure, reed position, tongue contact, and sound pressures in the mouth, mouthpiece, and barrel. Radiated sound and multiple videos are also recorded to allow details of the embouchure and the instrument’s motion to be determined. The software toolbox can synchronise measurements from different devices, including video sources, extract time-variable descriptors, segment by notes and excerpts, and summarise descriptors per note, phrase, or excerpt. An example of its application shows how to compare performances from different musicians.
Title: Recording and analysing physical control variables used in clarinet playing: A musical instrument performance capture and analysis toolbox (MIPCAT)
Pub Date: 2023-01-09 | DOI: 10.3389/frsip.2022.936875
Sunil Kumar Vengalil, Bharath K. Krishnamurthy, N. Sinha
Introduction: Fundal imaging is the most commonly used non-invasive technique for early detection of many retinal diseases such as diabetic retinopathy (DR). An initial step in automatic processing of fundal images for detecting diseases is to identify and segment the normal landmarks: the optic disc, blood vessels, and macula. In addition to these structures, other features such as exudates that help in pathological evaluations are also visible in fundal images. Segmenting features like blood vessels poses multiple challenges because of their fine-grained structure that must be captured at original resolution and the fact that they are spread across the entire retina with varying patterns and densities. Exudates appear as white patches of irregular shapes that occur at multiple locations, and they can be confused with the optic disc if features like brightness or color are used for segmentation. Methods: Segmentation algorithms solely based on image processing involve multiple parameters and thresholds that need to be tuned. Another approach is to use machine learning models with inputs of hand-crafted features to segment the image. The challenge in this approach is to identify the correct features and then devise algorithms to extract these features. End-to-end deep neural networks take raw images with minimal preprocessing, such as resizing and normalization, as inputs, learn a set of image features in the intermediate layers, and then perform the segmentation in the last layer. These networks tend to have longer training and prediction times because of the complex architecture, which can involve millions of parameters. This also necessitates huge numbers of training images (2,000‒10,000). For structures like blood vessels and exudates that are spread across the entire image, one approach used to increase the training data is to generate multiple patches from a single training image, thus increasing the total number of training samples.
Patch-based training cannot be applied to structures like the optic disc and fovea that appear only once per image. The prediction time is also longer because segmenting a full image involves segmenting multiple patches. Results and Discussion: Most of the existing research has focused on segmenting these structures independently to achieve high performance metrics. In this work, we propose a multi-tasking deep learning architecture for segmenting the optic disc, blood vessels, macula, and exudates simultaneously. Both training and prediction are performed using the whole image. The objective was to improve the prediction results on blood vessels and exudates, which are relatively more challenging, while utilizing segmentation of the optic disc and the macula as auxiliary tasks. Our experimental results on images from publicly available datasets show that simultaneous segmentation of all these structures results in a significant improvement in performance. The proposed approach makes predictions of all four structures.
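One plausible reading of this multi-task setup is a weighted sum of per-structure segmentation losses, with the harder tasks (vessels, exudates) weighted more heavily. The loss choice and the weights below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy for one segmentation task."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1 - target) * np.log(1 - pred)))

def multitask_loss(preds, targets, weights):
    """Weighted sum of per-structure losses (illustrative sketch).

    Down-weighting the auxiliary tasks (optic disc, macula) is one
    hypothetical way to prioritise vessels and exudates.
    """
    return sum(w * bce(preds[k], targets[k]) for k, w in weights.items())

tasks = ["optic_disc", "vessels", "macula", "exudates"]
rng = np.random.default_rng(1)
preds = {k: rng.uniform(0.01, 0.99, (64, 64)) for k in tasks}
targets = {k: (rng.uniform(size=(64, 64)) > 0.5).astype(float) for k in tasks}
weights = {"optic_disc": 0.5, "vessels": 1.0, "macula": 0.5, "exudates": 1.0}
loss = multitask_loss(preds, targets, weights)
```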
Title: Simultaneous segmentation of multiple structures in fundal images using multi-tasking deep neural networks
Pub Date: 2023-01-05 | DOI: 10.3389/frsip.2022.1074053
Ze Li, Yue Li
Forward-looking imaging for maneuvering platforms has garnered significant interest in many military and civilian fields. Since the maneuvering trajectory within the scanning period can be approximated as a constant-acceleration maneuver, monopulse imaging is applied to enhance the azimuthal resolution of the forward-looking image. However, the maneuver causes severe range migration and Doppler shift; this often results in range location error due to the space-varying Doppler shifts and the failure of angle estimation. We propose a decimation keystone algorithm based on the chirp-Z transform (CZT). First, the pulse repetition frequency (PRF) is decimated by an integer factor; thus, the azimuthal sampling sequence is decimated into many sub-sequences. Then, linear range walk correction (LRWC) is performed on each sub-sequence using the keystone transform, significantly reducing the influence of the change of Doppler ambiguity number on range location. Further, the sub-sequences are regrouped as one sequence, and the range curvature due to the acceleration is compensated in the frequency domain. Finally, the varying Doppler centroid in each coherent processing interval (CPI) is analyzed and compensated for the sum-difference angular measurements. Simulation results demonstrate the effectiveness of the proposed algorithm for forward-looking imaging under constant acceleration maneuvers and the feasibility of range location error correction.
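The first step, decimating the PRF by an integer so the slow-time (pulse) sequence splits into interleaved sub-sequences, can be sketched as follows; the sequence and factor are illustrative only.

```python
import numpy as np

def decimate_slow_time(pulses, factor):
    """Split the slow-time pulse sequence into `factor` interleaved
    sub-sequences, i.e. decimate the PRF by an integer factor
    (illustrative sketch of the first step described above)."""
    return [pulses[i::factor] for i in range(factor)]

pulses = np.arange(12)                  # 12 pulse indices in one CPI
subs = decimate_slow_time(pulses, 3)    # each sub-sequence samples at PRF/3
# subs[0] holds pulses 0, 3, 6, 9; keystone LRWC is then applied per sub-sequence
```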
Title: Decimation keystone algorithm for forward-looking monopulse imaging on platforms with uniformly accelerated motion
Pub Date : 2022-11-30 DOI: 10.3389/frsip.2022.1019253
Shiva Salsabilian, L. Najafizadeh
Developing models for identifying mild traumatic brain injury (mTBI) has often been challenging due to large variations in data across subjects, which make it difficult for mTBI-identification models to generalize to data from unseen subjects. To tackle this problem, we present a long short-term memory-based adversarial variational autoencoder (LSTM-AVAE) framework for subject-invariant mTBI feature extraction. In the proposed model, an LSTM variational autoencoder (LSTM-VAE) first combines the representation learning ability of the variational autoencoder (VAE) with the temporal modeling characteristics of the LSTM to learn latent space representations from neural activity. Then, to detach the subject’s individuality from the neural feature representations and make the model suitable for cross-subject transfer learning, an adversary network is attached to the encoder in a discriminative setting. The model is trained using a leave-one-subject-out approach, and the trained encoder is used to extract representations from the held-out subject’s data. The extracted representations are then classified into normal and mTBI groups using different classifiers. The proposed model is evaluated on cortical recordings of Thy1-GCaMP6s transgenic mice obtained via widefield calcium imaging, before and after injury induction. In the cross-subject transfer learning experiments, the proposed LSTM-AVAE framework achieves classification accuracies of 95.8% and 97.79% without and with the conditional VAE (cVAE), respectively, demonstrating that the proposed model is capable of learning invariant representations from mTBI data.
{"title":"Subject-invariant feature learning for mTBI identification using LSTM-based variational autoencoder with adversarial regularization","authors":"Shiva Salsabilian, L. Najafizadeh","doi":"10.3389/frsip.2022.1019253","DOIUrl":"https://doi.org/10.3389/frsip.2022.1019253","url":null,"abstract":"Developing models for identifying mild traumatic brain injury (mTBI) has often been challenging due to large variations in data across subjects, which make it difficult for mTBI-identification models to generalize to data from unseen subjects. To tackle this problem, we present a long short-term memory-based adversarial variational autoencoder (LSTM-AVAE) framework for subject-invariant mTBI feature extraction. In the proposed model, an LSTM variational autoencoder (LSTM-VAE) first combines the representation learning ability of the variational autoencoder (VAE) with the temporal modeling characteristics of the LSTM to learn latent space representations from neural activity. Then, to detach the subject’s individuality from the neural feature representations and make the model suitable for cross-subject transfer learning, an adversary network is attached to the encoder in a discriminative setting. The model is trained using a leave-one-subject-out approach, and the trained encoder is used to extract representations from the held-out subject’s data. The extracted representations are then classified into normal and mTBI groups using different classifiers. The proposed model is evaluated on cortical recordings of Thy1-GCaMP6s transgenic mice obtained via widefield calcium imaging, before and after injury induction. In the cross-subject transfer learning experiments, the proposed LSTM-AVAE framework achieves classification accuracies of 95.8% and 97.79% without and with the conditional VAE (cVAE), respectively, demonstrating that the proposed model is capable of learning invariant representations from mTBI data.","PeriodicalId":93557,"journal":{"name":"Frontiers in signal processing","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80586582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
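The "1 held-out" training scheme in this abstract is leave-one-subject-out cross-validation: every subject takes one turn as the unseen test subject while the rest form the training set. A minimal pure-Python sketch of that split logic (illustrative only, not the authors' training code):

```python
def leave_one_subject_out(subjects):
    """Yield (train_subjects, held_out) pairs.

    Each subject is held out exactly once; the encoder is trained on the
    remaining subjects and evaluated on the held-out subject's data, so
    reported accuracy reflects generalization to an unseen subject.
    """
    for held_out in subjects:
        train = [s for s in subjects if s != held_out]
        yield train, held_out
```

For example, with mice `["m1", "m2", "m3"]` this produces three splits, the first training on `m2` and `m3` and testing on `m1`.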
Pub Date : 2022-11-22 DOI: 10.3389/frsip.2022.1067055
Wenjing Liu, Xiqing Liu, Shi Yan, Ling Zhao, M. Peng
The evaporation duct is an effective means of realizing non-line-of-sight (NLOS) wireless transmission over the sea. However, the effects of marine weather conditions on electromagnetic propagation have rarely been studied. In this study, the influence of the marine atmospheric environment on electromagnetic propagation was analyzed through numerical simulation. Additionally, the impacts of antenna height, transmission distance, and electromagnetic wave frequency on path loss were studied. Finally, the link capacity of a code division multiplexing (CDM) communication system in the evaporation duct environment was studied via numerical analysis and simulations. Simulation results demonstrated that CDM communication can improve the link capacity under an evaporation duct compared with spread-spectrum communication.
{"title":"Performance analysis of code division multiplexing communication under evaporation duct environment","authors":"Wenjing Liu, Xiqing Liu, Shi Yan, Ling Zhao, M. Peng","doi":"10.3389/frsip.2022.1067055","DOIUrl":"https://doi.org/10.3389/frsip.2022.1067055","url":null,"abstract":"The evaporation duct is an effective means for realizing non-line-of-sight (NLOS) wireless transmission over the sea. However, the effects of marine weather conditions on electromagnetic propagation have rarely been studied. In this study, the influence of the marine atmospheric environment on electromagnetic propagation was analyzed through numerical simulation. Additionally, the impacts of antenna height, transmission distance, and electromagnetic wave frequency on path loss were studied. Finally, the link capacity of the code division multiplexing (CDM) communication system in the evaporation duct environment was studied via numerical analysis and simulations. Simulation results demonstrated that CDM communication technology can improve the link capacity under an evaporation duct compared with that of the spread-spectrum communication technology.","PeriodicalId":93557,"journal":{"name":"Frontiers in signal processing","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83659945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
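Code division multiplexing separates streams sharing the same band by spreading each stream with its own orthogonal code; correlating the received chip sequence against each code then isolates that stream. A minimal NumPy sketch using Walsh–Hadamard codes (illustrative only; the paper's evaporation-duct channel model and link-capacity analysis are not reproduced):

```python
import numpy as np

def walsh_hadamard(n):
    """Build an n x n Walsh-Hadamard code matrix (n a power of two).

    Rows are mutually orthogonal: codes @ codes.T == n * I.
    """
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def cdm_multiplex(symbols, codes):
    """Spread each stream's symbol by its code and sum onto the channel."""
    return symbols @ codes          # (n_streams,) @ (n_streams, n_chips)

def cdm_demultiplex(chips, codes):
    """Correlate against each code; orthogonality recovers each symbol."""
    return chips @ codes.T / codes.shape[1]
```

Because the rows are orthogonal, the despreader recovers each stream's symbol exactly in the noiseless case; under the duct channel, path loss and noise perturb the correlation, which is what drives the link-capacity comparison in the paper.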