Pub Date: 2023-05-05 | DOI: 10.3389/frsip.2023.1197240
Mirza Asif Haider, Yimin D. Zhang
Integrated sensing and communication (ISAC) is a cutting-edge technology aimed at achieving high-resolution target sensing and high data-rate communications using a shared spectrum. This innovative approach optimizes the usage of the radio spectrum with no or minimal mutual interference. The capability of the reconfigurable intelligent surface (RIS) to control the environment and provide additional degrees of freedom is driving the development of RIS-aided ISAC. In this mini-review, we provide an overview of the current state of the art of RIS-aided ISAC technology, including various system configurations, approaches, and signal processing techniques.
Title: RIS-aided integrated sensing and communication: a mini-review
Pub Date: 2023-05-02 | DOI: 10.3389/frsip.2023.1106465
Ahmed Cheikh Sidiya, Xuanang Xu, N. Xu, Xin Li
Blind restoration of low-quality faces in the real world has advanced rapidly in recent years. The rich and diverse priors encapsulated by pre-trained face GANs have demonstrated their effectiveness in reconstructing high-quality faces from low-quality observations in the real world. However, the modeling of degradation in real-world face images remains poorly understood, limiting the generalization of existing methods. Inspired by the success of pre-trained models and transformers in recent years, we propose to solve the problem of blind restoration by jointly exploiting their power for degradation and prior learning, respectively. On the one hand, we train a two-generator architecture for degradation learning to transfer the style of low-quality real-world faces to the high-resolution output of a pre-trained StyleGAN. On the other hand, we present a hybrid architecture, called Skip-Transformer (ST), which combines transformer encoder modules with a pre-trained StyleGAN-based decoder using skip layers. Such a hybrid design is innovative in that it represents the first attempt to jointly exploit the global attention mechanism of the transformer and pre-trained StyleGAN-based generative facial priors. We have compared our DL-ST model with three recent benchmark methods for blind image restoration (DFDNet, PSFRGAN, and GFP-GAN). Our experimental results have shown that this work outperforms all other competing methods, both subjectively and objectively (as measured by the Fréchet Inception Distance and NIQE metrics).
Title: Degradation learning and Skip-Transformer for blind face restoration
Pub Date: 2023-04-05 | DOI: 10.3389/frsip.2023.1132672
Victor Lazzarini, Damián Keller, Nemanja Radivojević
The reconstruction of tools and artworks belonging to the origins of music computing unveils the dynamics of distributed knowledge underlying some of the major breakthroughs that took place during the analogue-digital transition of the 1950s and 1960s. We document the implementation of two musical replicas, the Computer Suite for Little Boy and For Ann (Rising). Our archaeological ubiquitous-music methods yield fresh insights into both convergences and contradictions implicit in the creation of cutting-edge technologies, pointing to design qualities such as terseness and ambiguity. Through new renditions of historically significant artefacts, enabled by the recovery of artistic first-hand sources and of one of the early computer music environments, MUSIC V, we explore the emergence of exploratory simulations of new musical worlds.
Title: Issues of ubiquitous music archaeology: Shared knowledge, simulation, terseness, and ambiguity in early computer music
Pub Date: 2023-04-03 | DOI: 10.3389/frsip.2023.1064138
Hyeonseok Kim, Justin Luo, Shannon Chu, C. Cannard, Sven Hoffmann, M. Miyakoshi
Independent component analysis (ICA) has been widely used for electroencephalography (EEG) analyses. However, ICA performance relies on several crucial assumptions about the data. Here, we focus on the granularity of data rank, i.e., the number of linearly independent EEG channels. When the data are rank-full (i.e., all channels are independent), ICA produces as many independent components (ICs) as the number of input channels (rank-full decomposition). However, when the input data are rank-deficient, as is the case with bridged or interpolated electrodes, ICA produces the same number of ICs as the data rank (forced rank deficiency decomposition), introducing undesired ghost ICs and indicating a bug in ICA. We demonstrated that the ghost ICs have white noise properties, in both time and frequency domains, while maintaining surprisingly typical scalp topographies, and can therefore be easily missed by EEG researchers and affect findings in unknown ways. This problem occurs when the minimum eigenvalue λ_min of the input data is smaller than a certain threshold, leading to matrix inversion failure as if the rank-deficient inversion was forced, even if the data rank is cleanly deficient by one. We defined this problem as the effective rank deficiency. Using sound file mixing simulations, we first demonstrated the effective rank deficiency problem and determined that the critical threshold for λ_min is 10^−7 in the given situation. Second, we used empirical EEG data to show how two preprocessing stages, re-referencing to average without including the initial reference and non-linear electrode interpolation, caused this forced rank deficiency problem. Finally, we showed that the effective rank deficiency problem can be solved by using the identified threshold (λ_min = 10^−7) and the correct re-referencing procedure described herein.
The former ensures the achievement of effective rank-full decomposition by properly reducing the input data rank, and the latter allows avoidance of a widely practiced incorrect re-referencing approach. Based on the current literature, we discuss the ambiguous status of the initial reference electrode when re-referencing. We have made our data and code available to facilitate the implementation of our recommendations by the EEG community.
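The rank check described in the abstract can be sketched numerically. The helper below is illustrative (not the authors' released code): it treats covariance eigenvalues below the 10^−7 threshold reported above as numerically zero, and shows how average-referencing without keeping the initial reference silently drops the data rank by one.

```python
import numpy as np

def effective_rank(data, lam_min=1e-7):
    """Estimate the effective rank of channel-by-time data.

    Eigenvalues of the channel covariance below lam_min are treated as
    numerically zero, following the 1e-7 threshold from the abstract.
    (Hypothetical helper, for illustration only.)
    """
    eigvals = np.linalg.eigvalsh(np.cov(data))
    return int(np.sum(eigvals > lam_min))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5000))         # 8 independent channels
x_avg = x - x.mean(axis=0, keepdims=True)  # average reference: rows now sum to 0

full = effective_rank(x)           # all 8 channels independent
deficient = effective_rank(x_avg)  # rank drops by one after re-referencing
```

In practice one would reduce the data to `deficient` principal components before running ICA, so that the decomposition is effectively rank-full.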
Title: ICA’s bug: How ghost ICs emerge from effective rank deficiency caused by EEG electrode interpolation and incorrect re-referencing
Pub Date: 2023-03-13 | DOI: 10.3389/frsip.2023.1137006
Junjie Ke, Tian Zhang, Yilin Wang, P. Milanfar, Feng Yang
No-reference video quality assessment (NR-VQA) for user-generated content (UGC) is crucial for understanding and improving visual experience. Unlike video recognition tasks, VQA tasks are sensitive to changes in input resolution. Since a large share of UGC videos nowadays are 720p or above, the fixed and relatively small input used in conventional NR-VQA methods results in missing high-frequency details for many videos. In this paper, we propose a novel Transformer-based NR-VQA framework that preserves high-resolution quality information. With a multi-resolution input representation and a novel multi-resolution patch sampling mechanism, our method enables a comprehensive view of both the global video composition and local high-resolution details. The proposed approach can effectively aggregate quality information across different granularities in the spatial and temporal dimensions, making the model robust to input resolution variations. Our method achieves state-of-the-art performance on the large-scale UGC VQA datasets LSVQ and LSVQ-1080p, and on KoNViD-1k and LIVE-VQC without fine-tuning.
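A minimal sketch of multi-resolution patch sampling: patches of a fixed size are drawn from the frame at several scales, so the same patch budget covers both local full-resolution detail and a coarse global view. The patch size, the scale factors, and the strided downscaling below are illustrative choices, not taken from the MRET paper.

```python
import numpy as np

def multires_patches(frame, patch=32, scales=(1, 2, 4)):
    """Sample non-overlapping fixed-size patches at several resolutions.

    Illustrative sketch: scale 1 yields native-resolution detail patches,
    larger scales yield progressively more global views of the frame.
    """
    out = []
    for s in scales:
        small = frame[::s, ::s]  # naive downscale by striding
        h, w = small.shape[:2]
        for y in range(0, h - patch + 1, patch):
            for x in range(0, w - patch + 1, patch):
                out.append(small[y:y + patch, x:x + patch])
    return np.stack(out)

frame = np.random.rand(128, 128)
patches = multires_patches(frame)
# 16 patches at scale 1, 4 at scale 2, 1 at scale 4 -> 21 patches of 32x32
```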
Title: MRET: Multi-resolution transformer for video quality assessment
Pub Date: 2023-02-10 | DOI: 10.3389/frsip.2023.1089366
A. Almeida, Weicong Li, Emery Schubert, John Smith, J. Wolfe
Measuring fine-grained physical interaction between the human player and the musical instrument can significantly improve our understanding of music performance. This article presents a Musical Instrument Performance Capture and Analysis Toolbox (MIPCAT) that can be used to capture and to process the physical control variables used by a musician while performing music. This includes both a measurement apparatus with sensors and a software toolbox for analysis. Several of the components used here can also be applied in other musical contexts. The system is here applied to the clarinet, where the instrument sensors record blowing pressure, reed position, tongue contact, and sound pressures in the mouth, mouthpiece, and barrel. Radiated sound and multiple videos are also recorded to allow details of the embouchure and the instrument’s motion to be determined. The software toolbox can synchronise measurements from different devices, including video sources, extract time-variable descriptors, segment by notes and excerpts, and summarise descriptors per note, phrase, or excerpt. An example of its application shows how to compare performances from different musicians.
Title: Recording and analysing physical control variables used in clarinet playing: A musical instrument performance capture and analysis toolbox (MIPCAT)
Pub Date: 2023-01-09 | DOI: 10.3389/frsip.2022.936875
Sunil Kumar Vengalil, Bharath K. Krishnamurthy, N. Sinha
Introduction: Fundal imaging is the most commonly used non-invasive technique for early detection of many retinal diseases such as diabetic retinopathy (DR). An initial step in automatic processing of fundal images for detecting diseases is to identify and segment the normal landmarks: the optic disc, blood vessels, and macula. In addition to these structures, other features such as exudates that help in pathological evaluations are also visible in fundal images. Segmenting features like blood vessels poses multiple challenges because of their fine-grained structure that must be captured at original resolution and the fact that they are spread across the entire retina with varying patterns and densities. Exudates appear as white patches of irregular shapes that occur at multiple locations, and they can be confused with the optic disc if features like brightness or color are used for segmentation. Methods: Segmentation algorithms solely based on image processing involve multiple parameters and thresholds that need to be tuned. Another approach is to use machine learning models with inputs of hand-crafted features to segment the image. The challenge in this approach is to identify the correct features and then devise algorithms to extract these features. End-to-end deep neural networks take raw images with minimal preprocessing, such as resizing and normalization, as inputs, learn a set of image features in the intermediate layers, and then perform the segmentation in the last layer. These networks tend to have longer training and prediction times because of the complex architecture, which can involve millions of parameters. This also necessitates huge numbers of training images (2,000‒10,000). For structures like blood vessels and exudates that are spread across the entire image, one approach used to increase the training data is to generate multiple patches from a single training image, thus increasing the total number of training samples.
Patch-based training cannot be applied to structures like the optic disc and fovea that appear only once per image. The prediction time is also longer because segmenting a full image involves segmenting multiple patches. Results and Discussion: Most of the existing research has focused on segmenting these structures independently to achieve high performance metrics. In this work, we propose a multi-tasking deep learning architecture for segmenting the optic disc, blood vessels, macula, and exudates simultaneously. Both training and prediction are performed using the whole image. The objective was to improve the prediction results on blood vessels and exudates, which are relatively more challenging, while utilizing segmentation of the optic disc and the macula as auxiliary tasks. Our experimental results on images from publicly available datasets show that simultaneous segmentation of all these structures results in a significant improvement in performance. The proposed approach makes predictions of all four structures.
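One plausible reading of this multi-task setup is a weighted sum of per-structure segmentation losses, with the harder tasks (vessels, exudates) weighted more heavily. The loss choice and the weights below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy for one segmentation task."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1 - target) * np.log(1 - pred)))

def multitask_loss(preds, targets, weights):
    """Weighted sum of per-structure losses (illustrative sketch).

    Down-weighting the auxiliary tasks (optic disc, macula) is one
    hypothetical way to prioritise vessels and exudates.
    """
    return sum(w * bce(preds[k], targets[k]) for k, w in weights.items())

tasks = ["optic_disc", "vessels", "macula", "exudates"]
rng = np.random.default_rng(1)
preds = {k: rng.uniform(0.01, 0.99, (64, 64)) for k in tasks}
targets = {k: (rng.uniform(size=(64, 64)) > 0.5).astype(float) for k in tasks}
weights = {"optic_disc": 0.5, "vessels": 1.0, "macula": 0.5, "exudates": 1.0}
loss = multitask_loss(preds, targets, weights)
```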
Title: Simultaneous segmentation of multiple structures in fundal images using multi-tasking deep neural networks
Pub Date: 2023-01-05 | DOI: 10.3389/frsip.2022.1074053
Ze Li, Yue Li
Forward-looking imaging for maneuvering platforms has garnered significant interest in many military and civilian fields. Since the maneuvering trajectory within the scanning period can be approximated as a constant-acceleration maneuver, monopulse imaging is applied to enhance the azimuthal resolution of the forward-looking image. However, the maneuver causes severe range migration and Doppler shift; this often results in range location error due to the space-varying Doppler shifts and the failure of angle estimation. We propose a decimation keystone algorithm based on the chirp-Z transform (CZT). First, the pulse repetition frequency (PRF) is decimated by an integer factor; thus, the azimuthal sampling sequence is decimated into many sub-sequences. Then, linear range walk correction (LRWC) is performed on each sub-sequence using the keystone transform, significantly reducing the influence of the change of Doppler ambiguity number on range location. Further, the sub-sequences are regrouped as one sequence, and the range curvature due to the acceleration is compensated in the frequency domain. Finally, the varying Doppler centroid in each coherent processing interval (CPI) is analyzed and compensated for the sum-difference angular measurements. Simulation results demonstrate the effectiveness of the proposed algorithm for forward-looking imaging under constant acceleration maneuvers and the feasibility of range location error correction.
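The first step, decimating the PRF by an integer so the slow-time (pulse) sequence splits into interleaved sub-sequences, can be sketched as follows; the sequence and factor are illustrative only.

```python
import numpy as np

def decimate_slow_time(pulses, factor):
    """Split the slow-time pulse sequence into `factor` interleaved
    sub-sequences, i.e. decimate the PRF by an integer factor
    (illustrative sketch of the first step described above)."""
    return [pulses[i::factor] for i in range(factor)]

pulses = np.arange(12)                  # 12 pulse indices in one CPI
subs = decimate_slow_time(pulses, 3)    # each sub-sequence samples at PRF/3
# subs[0] holds pulses 0, 3, 6, 9; keystone LRWC is then applied per sub-sequence
```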
Title: Decimation keystone algorithm for forward-looking monopulse imaging on platforms with uniformly accelerated motion
Pub Date : 2022-11-30 DOI: 10.3389/frsip.2022.1019253
Shiva Salsabilian, L. Najafizadeh
Developing models for identifying mild traumatic brain injury (mTBI) has often been challenging due to large variations in data across subjects, which make it difficult for mTBI-identification models to generalize to data from unseen subjects. To tackle this problem, we present a long short-term memory-based adversarial variational autoencoder (LSTM-AVAE) framework for subject-invariant mTBI feature extraction. In the proposed model, an LSTM variational autoencoder (LSTM-VAE) first combines the representation learning ability of the variational autoencoder (VAE) with the temporal modeling characteristics of the LSTM to learn latent space representations from neural activity. Then, to detach the subject’s individuality from the neural feature representations and make the model suitable for cross-subject transfer learning, an adversary network is attached to the encoder in a discriminative setting. The model is trained using a leave-one-subject-out approach, and the trained encoder is used to extract representations from the held-out subject’s data. The extracted representations are then classified into normal and mTBI groups using different classifiers. The proposed model is evaluated on cortical recordings of Thy1-GCaMP6s transgenic mice obtained via widefield calcium imaging, before and after injury induction. In the cross-subject transfer learning experiments, the proposed LSTM-AVAE framework achieves classification accuracies of 95.8% and 97.79% without and with the conditional VAE (cVAE), respectively, demonstrating that the proposed model is capable of learning invariant representations from mTBI data.
{"title":"Subject-invariant feature learning for mTBI identification using LSTM-based variational autoencoder with adversarial regularization","authors":"Shiva Salsabilian, L. Najafizadeh","doi":"10.3389/frsip.2022.1019253","DOIUrl":"https://doi.org/10.3389/frsip.2022.1019253","url":null,"abstract":"Developing models for identifying mild traumatic brain injury (mTBI) has often been challenging due to large variations in data across subjects, which make it difficult for mTBI-identification models to generalize to data from unseen subjects. To tackle this problem, we present a long short-term memory-based adversarial variational autoencoder (LSTM-AVAE) framework for subject-invariant mTBI feature extraction. In the proposed model, an LSTM variational autoencoder (LSTM-VAE) first combines the representation learning ability of the variational autoencoder (VAE) with the temporal modeling characteristics of the LSTM to learn latent space representations from neural activity. Then, to detach the subject’s individuality from the neural feature representations and make the model suitable for cross-subject transfer learning, an adversary network is attached to the encoder in a discriminative setting. The model is trained using a leave-one-subject-out approach, and the trained encoder is used to extract representations from the held-out subject’s data. The extracted representations are then classified into normal and mTBI groups using different classifiers. The proposed model is evaluated on cortical recordings of Thy1-GCaMP6s transgenic mice obtained via widefield calcium imaging, before and after injury induction. In the cross-subject transfer learning experiments, the proposed LSTM-AVAE framework achieves classification accuracies of 95.8% and 97.79% without and with the conditional VAE (cVAE), respectively, demonstrating that the proposed model is capable of learning invariant representations from mTBI data.","PeriodicalId":93557,"journal":{"name":"Frontiers in signal processing","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80586582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
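The "1 held-out" training scheme in this abstract is leave-one-subject-out cross-validation: every subject takes one turn as the unseen test subject while the rest form the training set. A minimal pure-Python sketch of that split logic (illustrative only, not the authors' training code):

```python
def leave_one_subject_out(subjects):
    """Yield (train_subjects, held_out) pairs.

    Each subject is held out exactly once; the encoder is trained on the
    remaining subjects and evaluated on the held-out subject's data, so
    reported accuracy reflects generalization to an unseen subject.
    """
    for held_out in subjects:
        train = [s for s in subjects if s != held_out]
        yield train, held_out
```

For example, with mice `["m1", "m2", "m3"]` this produces three splits, the first training on `m2` and `m3` and testing on `m1`.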
Pub Date : 2022-11-22 DOI: 10.3389/frsip.2022.1067055
Wenjing Liu, Xiqing Liu, Shi Yan, Ling Zhao, M. Peng
The evaporation duct is an effective means of realizing non-line-of-sight (NLOS) wireless transmission over the sea. However, the effects of marine weather conditions on electromagnetic propagation have rarely been studied. In this study, the influence of the marine atmospheric environment on electromagnetic propagation was analyzed through numerical simulation. Additionally, the impacts of antenna height, transmission distance, and electromagnetic wave frequency on path loss were studied. Finally, the link capacity of a code division multiplexing (CDM) communication system in the evaporation duct environment was studied via numerical analysis and simulations. Simulation results demonstrated that CDM communication can improve the link capacity under an evaporation duct compared with spread-spectrum communication.
{"title":"Performance analysis of code division multiplexing communication under evaporation duct environment","authors":"Wenjing Liu, Xiqing Liu, Shi Yan, Ling Zhao, M. Peng","doi":"10.3389/frsip.2022.1067055","DOIUrl":"https://doi.org/10.3389/frsip.2022.1067055","url":null,"abstract":"The evaporation duct is an effective means for realizing non-line-of-sight (NLOS) wireless transmission over the sea. However, the effects of marine weather conditions on electromagnetic propagation have rarely been studied. In this study, the influence of the marine atmospheric environment on electromagnetic propagation was analyzed through numerical simulation. Additionally, the impacts of antenna height, transmission distance, and electromagnetic wave frequency on path loss were studied. Finally, the link capacity of the code division multiplexing (CDM) communication system in the evaporation duct environment was studied via numerical analysis and simulations. Simulation results demonstrated that CDM communication technology can improve the link capacity under an evaporation duct compared with that of the spread-spectrum communication technology.","PeriodicalId":93557,"journal":{"name":"Frontiers in signal processing","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83659945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
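Code division multiplexing separates streams sharing the same band by spreading each stream with its own orthogonal code; correlating the received chip sequence against each code then isolates that stream. A minimal NumPy sketch using Walsh–Hadamard codes (illustrative only; the paper's evaporation-duct channel model and link-capacity analysis are not reproduced):

```python
import numpy as np

def walsh_hadamard(n):
    """Build an n x n Walsh-Hadamard code matrix (n a power of two).

    Rows are mutually orthogonal: codes @ codes.T == n * I.
    """
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def cdm_multiplex(symbols, codes):
    """Spread each stream's symbol by its code and sum onto the channel."""
    return symbols @ codes          # (n_streams,) @ (n_streams, n_chips)

def cdm_demultiplex(chips, codes):
    """Correlate against each code; orthogonality recovers each symbol."""
    return chips @ codes.T / codes.shape[1]
```

Because the rows are orthogonal, the despreader recovers each stream's symbol exactly in the noiseless case; under the duct channel, path loss and noise perturb the correlation, which is what drives the link-capacity comparison in the paper.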