Intelligent reflecting surface (IRS) is expected to play a pivotal role in future wireless sensing networks owing to its potential for high-resolution and high-accuracy sensing. In this work, we investigate a multi-target direction-of-arrival (DoA) estimation problem in a semi-passive IRS-assisted sensing system, where IRS reflecting elements (REs) reflect signals from the base station to targets, and IRS sensing elements (SEs) estimate DoA based on echo signals reflected by the targets. First, instead of relying solely on IRS SEs for DoA estimation as done in the existing literature, this work fully exploits the DoA information embedded in both the IRS RE and SE matrices via the atomic norm minimization (ANM) scheme. Subsequently, the Cramér-Rao bound for DoA estimation is derived, revealing an inverse proportionality to $MN^3+NM^3$ for a single target when the covariance matrix of the IRS measurement matrix is the identity, where $M$ and $N$ are the numbers of IRS SEs and REs, respectively. Finally, extensive numerical results substantiate the superior accuracy and resolution performance of the proposed ANM-based DoA estimation method over representative baselines.
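To make the stated scaling concrete, the following is a minimal sketch of how the bound shrinks as elements are added. It only evaluates the proportionality $1/(MN^3+NM^3)$ quoted in the abstract; the constant of proportionality (SNR, snapshots, geometry) is not given there and is deliberately left out, and the function name `crb_scaling` is a hypothetical helper, not something from the paper.

```python
def crb_scaling(m: int, n: int) -> float:
    """Relative CRB for single-target DoA, up to an unknown constant.

    m: number of IRS sensing elements (SEs), M in the abstract.
    n: number of IRS reflecting elements (REs), N in the abstract.
    Only the proportionality 1 / (M*N^3 + N*M^3) is modeled here.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive")
    return 1.0 / (m * n**3 + n * m**3)


# Doubling both M and N multiplies M*N^3 + N*M^3 by 2^4 = 16,
# so the relative bound drops by a factor of 16.
ratio = crb_scaling(8, 8) / crb_scaling(16, 16)
print(ratio)
```

Under this proportionality, REs and SEs enter symmetrically, so adding elements of either type tightens the bound.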
{"title":"Atomic Norm Minimization-based DoA Estimation for IRS-assisted Sensing Systems","authors":"Renwang Li, Shu Sun, Meixia Tao","doi":"arxiv-2409.09982","DOIUrl":"https://doi.org/arxiv-2409.09982","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the near-field context, the Fresnel approximation is typically employed to represent spherical waves with mathematically tractable functions. However, such efforts may fail to take into account the significant increase in the lower limit of the Fresnel approximation, known as the Fresnel distance. This lower bound imposes a constraint that becomes more pronounced as the array size grows. Once this constraint is violated, the Fresnel approximation is no longer valid. As a potential solution, the wavenumber-domain paradigm characterizes the spherical wave using a spectrum composed of a series of linear orthogonal bases. However, this approach falls short of covering the effects of the array geometry, especially when using Gaussian-mixture-model (GMM)-based von Mises-Fisher distributions to approximate all spectra. To fill this gap, this paper introduces a novel wavenumber-domain ellipse fitting (WDEF) method. In particular, the channel is accurately estimated in the near-field region by maximizing the closed-form likelihood function of the wavenumber-domain spectrum conditioned on the scatterers' geometric parameters. Simulation results demonstrate the robustness of the proposed scheme with respect to both distance and angle of arrival.
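As a rough illustration of why the constraint tightens with array size, the sketch below evaluates the commonly cited textbook expressions for the Fresnel distance, $0.62\sqrt{D^3/\lambda}$, and the Rayleigh (Fraunhofer) distance, $2D^2/\lambda$. The abstract does not state which definition the paper uses, so these formulas, the 30 GHz wavelength, and the aperture values are assumptions chosen only for illustration.

```python
import math


def fresnel_distance(aperture_m: float, wavelength_m: float) -> float:
    """Textbook lower limit of the Fresnel region: 0.62 * sqrt(D^3 / lambda)."""
    return 0.62 * math.sqrt(aperture_m**3 / wavelength_m)


def rayleigh_distance(aperture_m: float, wavelength_m: float) -> float:
    """Textbook far-field boundary: 2 * D^2 / lambda."""
    return 2.0 * aperture_m**2 / wavelength_m


wavelength = 0.01  # metres; roughly a 30 GHz carrier (assumed for illustration)
for aperture in (0.5, 1.0, 2.0):  # array apertures in metres (assumed)
    print(
        f"D = {aperture:.1f} m: "
        f"Fresnel distance = {fresnel_distance(aperture, wavelength):.2f} m, "
        f"Rayleigh distance = {rayleigh_distance(aperture, wavelength):.1f} m"
    )
```

With these expressions the Fresnel distance grows as $D^{3/2}$, so doubling the aperture pushes the lower limit out by a factor of about 2.83.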
{"title":"Wavenumber-Domain Near-Field Channel Estimation: Beyond the Fresnel Bound","authors":"Xufeng Guo, Yuanbin Chen, Ying Wang, Zhaocheng Wang, Chau Yuen","doi":"arxiv-2409.10123","DOIUrl":"https://doi.org/arxiv-2409.10123","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"75 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the sixth generation (6G) of cellular networks, the demands for capacity and connectivity will increase dramatically to meet the requirements of emerging services for both humans and machines. Semantic communication has shown great potential because of its efficiency and its suitability for users who care only about semantic meaning. However, bit communication is still needed for users requiring the original messages. Therefore, semantic and bit communications will coexist in future networks. This motivates us to explore how to allocate resources in such a coexistence scenario. We investigate different uplink multiple access (MA) schemes for the coexistence of semantic users and a bit user, namely orthogonal multiple access (OMA), non-orthogonal multiple access (NOMA) and rate-splitting multiple access (RSMA). We characterize the rate regions achieved by those MA schemes. The simulation results show that RSMA always outperforms NOMA and performs better than OMA in high semantic rate regimes. We find that the RSMA scheme design, rate region, and power allocation are quite different in the coexistence scenario compared to bit-only communication, primarily due to the need to consider understandability in semantic communications. Interestingly, in contrast to bit-only communications, where RSMA is capacity achieving without any need for time sharing, in the coexistence scenario time sharing helps enlarge the RSMA rate region.
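For reference, the bit-only baseline mentioned at the end of the abstract can be sketched with the standard two-user Gaussian uplink capacity region. The code below computes its two successive-interference-cancellation corner points, which NOMA reaches directly and between which RSMA can operate without time sharing. It does not model semantic rates or understandability, which the abstract identifies as the source of the differences; the function name and SNR values are illustrative assumptions.

```python
import math


def mac_corner_points(snr1: float, snr2: float):
    """Corner points of the two-user Gaussian uplink capacity region (bit/s/Hz).

    snr1, snr2: received SNRs of user 1 and user 2 (linear scale, assumed values).
    Returns ((R1, R2) with user 1 decoded first, (R1, R2) with user 2 decoded first).
    """
    def capacity(snr: float) -> float:
        return math.log2(1.0 + snr)

    # User 1 is decoded first, treating user 2 as noise; user 2 is then interference-free.
    user1_first = (capacity(snr1 / (1.0 + snr2)), capacity(snr2))
    # User 2 is decoded first, treating user 1 as noise; user 1 is then interference-free.
    user2_first = (capacity(snr1), capacity(snr2 / (1.0 + snr1)))
    return user1_first, user2_first


corner_a, corner_b = mac_corner_points(snr1=10.0, snr2=5.0)
sum_capacity = math.log2(1.0 + 10.0 + 5.0)
print(corner_a, corner_b, sum_capacity)
```

Both corner points lie on the sum-capacity line, which is why the bit-only region needs no time sharing once rate splitting can reach the points between them.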
{"title":"Rate-Splitting Multiple Access for Coexistence of Semantic and Bit Communications","authors":"Yuanwen Liu, Bruno Clerckx","doi":"arxiv-2409.10314","DOIUrl":"https://doi.org/arxiv-2409.10314","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Charbel Bou Chaaya, Abanoub M. Girgis, Mehdi Bennis
In this work, we propose a novel data-driven machine learning (ML) technique to model and predict the dynamics of the wireless propagation environment in latent space. Leveraging the idea of channel charting, which learns compressed representations of high-dimensional channel state information (CSI), we incorporate a predictive component to capture the dynamics of the wireless system. Hence, we jointly learn a channel encoder that maps the estimated CSI to an appropriate latent space, and a predictor that models the relationships between such representations. Accordingly, our problem boils down to training a joint-embedding predictive architecture (JEPA) that simulates the latent dynamics of a wireless network from CSI. We present numerical evaluations on measured data and show that the proposed JEPA displays a two-fold increase in accuracy over benchmarks, for longer look-ahead prediction tasks.
{"title":"Learning Latent Wireless Dynamics from Channel State Information","authors":"Charbel Bou Chaaya, Abanoub M. Girgis, Mehdi Bennis","doi":"arxiv-2409.10045","DOIUrl":"https://doi.org/arxiv-2409.10045","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Current emotional text-to-speech (TTS) models predominantly conduct supervised training to learn the conversion from text and desired emotion to its emotional speech, focusing on a single emotion per text-speech pair. These models only learn the correct emotional outputs without fully comprehending other emotion characteristics, which limits their capabilities of capturing the nuances between different emotions. We propose a controllable Emo-DPO approach, which employs direct preference optimization to differentiate subtle emotional nuances between emotions through optimizing towards preferred emotions over less preferred emotional ones. Instead of relying on traditional neural architectures used in existing emotional TTS models, we propose utilizing the emotion-aware LLM-TTS neural architecture to leverage LLMs' in-context learning and instruction-following capabilities. Comprehensive experiments confirm that our proposed method outperforms the existing baselines.
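The preference objective referred to above can be sketched with the standard direct preference optimization loss for a single pair. This is the generic DPO formulation, not necessarily the exact Emo-DPO objective, which the abstract does not spell out. The argument names are hypothetical: `logp_pref` and `logp_rej` stand for the policy's log-likelihoods of the speech output with the preferred and the less preferred emotion, the `ref_` values come from a frozen reference model, and `beta = 0.1` is an assumed default.

```python
import math


def dpo_loss(
    logp_pref: float,
    logp_rej: float,
    ref_logp_pref: float,
    ref_logp_rej: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one (preferred, rejected) pair.

    loss = -log sigmoid(beta * [(logp_pref - ref_logp_pref)
                                - (logp_rej - ref_logp_rej)])
    """
    margin = beta * ((logp_pref - ref_logp_pref) - (logp_rej - ref_logp_rej))
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The policy favours the preferred emotion more than the reference does,
# so the loss falls below log(2), its value when the margin is zero.
print(dpo_loss(logp_pref=-1.0, logp_rej=-3.0, ref_logp_pref=-2.0, ref_logp_rej=-2.0))
```

The loss decreases as the policy raises the likelihood of the preferred emotion relative to the less preferred one, measured against the reference model, which is how the contrast between emotions enters training.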
{"title":"Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization","authors":"Xiaoxue Gao, Chen Zhang, Yiming Chen, Huayun Zhang, Nancy F. Chen","doi":"arxiv-2409.10157","DOIUrl":"https://doi.org/arxiv-2409.10157","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"75 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xi Wang, Xin Liu, Songming Zhu, Zhanwen Li, Lina Gao
The recent emergence of Distributed Acoustic Sensing (DAS) technology has facilitated the effective capture of traffic-induced seismic data. Traffic-induced seismic waves are a prominent contributor to urban vibrations and contain crucial information for advancing urban exploration and governance. However, identifying vehicular movements within massive noisy data poses a significant challenge. In this study, we introduce a real-time semi-supervised vehicle monitoring framework tailored to urban settings. It requires only a small fraction of manual labels for initial training and exploits unlabeled data for model improvement. Additionally, the framework can autonomously adapt to newly collected unlabeled data. Before the DAS data undergo object detection as two-dimensional images to preserve spatial information, we apply comprehensive one-dimensional signal preprocessing to mitigate noise. Furthermore, we propose a novel prior loss that incorporates the shapes of vehicular traces to track a single vehicle with varying speeds. To evaluate our model, we conducted experiments with seismic data from the Stanford 2 DAS Array. The results showed that our model outperformed the baseline model Efficient Teacher and its supervised counterpart, YOLO (You Only Look Once), in both accuracy and robustness. With only 35 labeled images, our model surpassed YOLO by 18% on the mAP 0.5:0.95 criterion and showed a 7% increase over Efficient Teacher. We conducted comparative experiments with multiple update strategies for self-updating and identified an optimal approach. This approach surpasses the performance of non-overfitting training conducted with all data in a single pass.
{"title":"Self-Updating Vehicle Monitoring Framework Employing Distributed Acoustic Sensing towards Real-World Settings","authors":"Xi Wang, Xin Liu, Songming Zhu, Zhanwen Li, Lina Gao","doi":"arxiv-2409.10259","DOIUrl":"https://doi.org/arxiv-2409.10259","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Since hybrid beamforming (HBF) can approach the performance of fully-digital beamforming (FDBF) with much lower hardware complexity, we investigate the HBF design for beam-hopping (BH) low earth orbit (LEO) satellite communications (SatComs). Aiming at maximizing the sum-rate of totally illuminated beam positions during the whole BH period, we consider joint beamforming and illumination pattern design subject to the HBF constraints and sum-rate requirements. To address the non-convexity of the HBF constraints, we temporarily replace the HBF constraints with the FDBF constraints. Then we propose an FDBF and illumination pattern random search (FDBF-IPRS) scheme to optimize illumination patterns and fully-digital beamformers using constrained random search and fractional programming methods. To further reduce the computational complexity, we propose an FDBF and illumination pattern alternating optimization (FDBF-IPAO) scheme, where we relax the integer illumination pattern to continuous variables and after finishing all the iterations we quantize the continuous variables into integer ones. Based on the fully-digital beamformers designed by the FDBF-IPRS or FDBF-IPAO scheme, we propose an HBF alternating minimization algorithm to design the hybrid beamformers. Simulation results show that the proposed schemes can achieve satisfactory sum-rate performance for BH LEO SatComs.
{"title":"Joint Beamforming and Illumination Pattern Design for Beam-Hopping LEO Satellite Communications","authors":"Jing Wang, Chenhao Qi, Shui Yu, Shiwen Mao","doi":"arxiv-2409.10127","DOIUrl":"https://doi.org/arxiv-2409.10127","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"54 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal schizophrenia assessment systems have gained traction over the last few years. This work introduces a schizophrenia assessment system to discern between prominent symptom classes of schizophrenia and predict an overall schizophrenia severity score. We develop a Vector Quantized Variational Auto-Encoder (VQ-VAE) based Multimodal Representation Learning (MRL) model to produce task-agnostic speech representations from vocal Tract Variables (TVs) and Facial Action Units (FAUs). These representations are then used in a Multi-Task Learning (MTL) based downstream prediction model to obtain class labels and an overall severity score. The proposed framework outperforms the previous works on the multi-class classification task across all evaluation metrics (Weighted F1 score, AUC-ROC score, and Weighted Accuracy). Additionally, it estimates the schizophrenia severity score, a task not addressed by earlier approaches.
{"title":"Self-supervised Multimodal Speech Representations for the Assessment of Schizophrenia Symptoms","authors":"Gowtham Premananth, Carol Espy-Wilson","doi":"arxiv-2409.09733","DOIUrl":"https://doi.org/arxiv-2409.09733","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Brandon T. Hunt (Montana Technological University), Hussein Moradi (Idaho National Laboratory), Behrouz Farhang-Boroujeny (The University of Utah)
Growing traffic over the high-frequency (HF) band poses significant challenges to establishing robust communication links. While existing spread-spectrum HF transceivers are, to some degree, robust against harsh HF channel conditions, their performance significantly degrades in the presence of strong co-channel interference. To improve performance in congested channel conditions, we propose a filter-bank based multicarrier spread-spectrum waveform with noncontiguous subcarrier bands. The use of noncontiguous subcarriers allows the system to at once leverage the robustness of a wideband system while retaining the frequency agility of a narrowband system. In this study, we explore differences between contiguous and noncontiguous systems by considering their respective peak-to-average power ratios (PAPRs) and matched-filter responses. Additionally, we develop a modified filter-bank receiver structure to facilitate both efficient signal processing and noncontiguous channel estimation. We conclude by presenting simulated and over-the-air results of the noncontiguous waveform, demonstrating both its robustness in harsh HF channels and its enhanced performance in congested spectral conditions.
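The peak-to-average power ratio comparison mentioned above can be sketched as follows. The code computes PAPR for a toy multicarrier signal built from plain complex exponentials on a chosen set of subcarrier bins. It is not the filter-bank spread-spectrum waveform of the paper, whose prototype filter and spreading gains are not described in the abstract; the bin sets, the block length of 64 samples, and the function names are illustrative assumptions.

```python
import cmath
import math


def papr_db(samples) -> float:
    """Peak-to-average power ratio of a sampled signal, in dB."""
    powers = [abs(x) ** 2 for x in samples]
    mean_power = sum(powers) / len(powers)
    return 10.0 * math.log10(max(powers) / mean_power)


def multicarrier(bins, n_samples: int):
    """Sum of unit-amplitude, zero-phase complex tones on the given bins (toy model)."""
    return [
        sum(cmath.exp(2j * math.pi * k * n / n_samples) for k in bins)
        for n in range(n_samples)
    ]


contiguous = list(range(8))                      # eight adjacent subcarriers
noncontiguous = [0, 1, 2, 3, 20, 21, 22, 23]     # two separated groups of four

print(papr_db(multicarrier(contiguous, 64)))
print(papr_db(multicarrier(noncontiguous, 64)))
```

In this toy model with equal amplitudes and phases, both bin sets give the same PAPR of $10\log_{10} 8 \approx 9.03$ dB, since all eight tones add coherently at the first sample. Any PAPR difference between contiguous and noncontiguous systems therefore has to come from waveform details that this sketch leaves out.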
{"title":"Multicarrier Spread Spectrum Communications with Noncontiguous Subcarrier Bands for HF Skywave Links","authors":"Brandon T. Hunt (Montana Technological University), Hussein Moradi (Idaho National Laboratory), Behrouz Farhang-Boroujeny (The University of Utah)","doi":"arxiv-2409.09723","DOIUrl":"https://doi.org/arxiv-2409.09723","url":null,"PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142251377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammadali Mohammadi, Zahra Mobini, Hien Quoc Ngo, Michail Matthaiou
We present an overview of ongoing research on in-band full-duplex (IBFD) massive multiple-input multiple-output (MIMO) systems and their applications. In response to the unprecedented mobile traffic demands in current and upcoming wireless networks, a paradigm shift from conventional cellular networks to distributed communication systems becomes imperative. Cell-free massive MIMO (CF-mMIMO) emerges as a practical and scalable implementation of distributed/network MIMO systems, serving as a crucial physical layer technology for the advancement of next-generation wireless networks. This architecture inherits benefits from both co-located massive MIMO and distributed systems, and provides the flexibility for integration with IBFD technology. We delineate the evolutionary trajectory of cellular networks, from conventional half-duplex multi-user MIMO networks to IBFD CF-mMIMO. The discussion extends further to the emerging paradigm of network-assisted IBFD CF-mMIMO (NAFD CF-mMIMO), which serves as an energy-efficient prototype for asymmetric uplink and downlink communication services. This novel approach finds applications in dual-functionality scenarios, including simultaneous wireless power and information transmission, wireless surveillance, and integrated sensing and communications. We highlight various current use cases, discuss open challenges, and outline future research directions aimed at fully realizing the potential of NAFD CF-mMIMO systems to meet the evolving demands of future wireless networks.
{"title":"Ten Years of Research Advances in Full-Duplex Massive MIMO","authors":"Mohammadali Mohammadi, Zahra Mobini, Hien Quoc Ngo, Michail Matthaiou","doi":"arxiv-2409.09732","DOIUrl":"https://doi.org/arxiv-2409.09732","journal":{"name":"arXiv - EE - Signal Processing"},"publicationDate":"2024-09-15","publicationTypes":"Journal Article"}