
IEEE Transactions on Machine Learning in Communications and Networking: Latest Publications

Deep Learning-Based Positioning With Multi-Task Learning and Uncertainty-Based Fusion
Pub Date : 2024-08-09 DOI: 10.1109/TMLCN.2024.3441521
Anastasios Foliadis;Mario H. Castañeda Garcia;Richard A. Stirling-Gallacher;Reiner S. Thomä
Deep learning (DL) methods have been shown to improve the performance of several use cases for the fifth-generation (5G) New Radio (NR) air interface. In this paper, we investigate user equipment (UE) positioning using the channel state information (CSI) fingerprints between a UE and multiple base stations (BSs). In such a setup, we consider two different fusion techniques: early and late fusion. With early fusion, a single DL model can be trained for UE positioning by combining the CSI fingerprints of the multiple BSs as input. With late fusion, a separate DL model is trained at each BS using the CSI specific to that BS, and the outputs of these individual models are then combined to determine the UE’s position. In this work, we compare these fusion techniques and show that fusing the outputs of separate models achieves higher positioning accuracy, especially in a dynamic scenario. We also show that the combination of multiple outputs further benefits from considering the uncertainty of the DL model’s output at each BS. For a more efficient training of the DL models across BSs, we additionally propose a multi-task learning (MTL) scheme that shares some parameters across the models while jointly training all of them. This method not only improves the accuracy of the individual models but also that of the final combined estimate. Lastly, we evaluate the reliability of the uncertainty estimation to determine which of the fusion methods provides the highest-quality uncertainty estimates.
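As a minimal sketch of uncertainty-based late fusion, the per-BS position estimates can be combined by inverse-variance weighting (an illustrative rule assumed here; the function name, shapes, and exact combination may differ from the paper's):

```python
import numpy as np

def fuse_positions(means, variances):
    """Inverse-variance weighted late fusion of per-BS position estimates.

    means: (N, 2) array of per-BS (x, y) estimates.
    variances: (N, 2) array of per-coordinate predictive variances.
    Returns the fused (x, y) estimate.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances        # more confident BSs get larger weight
    weights /= weights.sum(axis=0)   # normalize per coordinate
    return (weights * means).sum(axis=0)

# Two BSs: the first is far more confident, so the fused
# estimate is pulled toward its prediction at (0, 0).
fused = fuse_positions([[0.0, 0.0], [10.0, 10.0]],
                       [[0.1, 0.1], [10.0, 10.0]])
```

Weighting by inverse variance is the standard way to let a low-uncertainty model dominate the combined estimate, which matches the abstract's observation that accounting for per-BS uncertainty improves the fusion.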
Citations: 0
Deep Reinforcement Learning for Uplink Scheduling in NOMA-URLLC Networks
Pub Date : 2024-08-02 DOI: 10.1109/TMLCN.2024.3437351
Benoît-Marie Robaglia;Marceau Coupechoux;Dimitrios Tsilimantos
This article addresses the problem of Ultra-Reliable Low-Latency Communications (URLLC) in wireless networks, a framework with particularly stringent constraints imposed by many Internet of Things (IoT) applications from diverse sectors. We propose a novel Deep Reinforcement Learning (DRL) scheduling algorithm, named NOMA-PPO, to solve the Non-Orthogonal Multiple Access (NOMA) uplink URLLC scheduling problem involving strict deadlines. The challenge of addressing uplink URLLC requirements in NOMA systems stems from the combinatorial complexity of the action space, since multiple devices can be scheduled simultaneously, and from the partial observability constraint that we impose on our algorithm in order to meet the IoT communication constraints and remain scalable. Our approach involves 1) formulating the NOMA-URLLC problem as a Partially Observable Markov Decision Process (POMDP) and introducing an agent state, serving as a sufficient statistic of past observations and actions, which enables a transformation of the POMDP into a Markov Decision Process (MDP); 2) adapting the Proximal Policy Optimization (PPO) algorithm to handle the combinatorial action space; 3) incorporating prior knowledge into the learning agent through a Bayesian policy. Numerical results reveal that our approach not only outperforms traditional multiple access protocols and DRL benchmarks on 3GPP scenarios, but also proves robust under various channel and traffic configurations, efficiently exploiting inherent time correlations.
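One common way to keep a combinatorial scheduling action space tractable for PPO is a factorized (independent Bernoulli) policy: one probability per device instead of a softmax over all 2^N subsets. The sketch below illustrates that trick only; it is not stated to be NOMA-PPO's exact parameterization:

```python
import numpy as np

def sample_schedule(logits, rng):
    """Sample a subset of devices to schedule from per-device logits.

    Factorized Bernoulli parameterization: each device is included
    independently, avoiding an explicit distribution over 2^N subsets.
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    action = rng.random(len(probs)) < probs   # boolean schedule mask
    # joint log-probability of the sampled subset, needed for the PPO ratio
    logp = np.sum(np.where(action, np.log(probs), np.log(1.0 - probs)))
    return action, logp

rng = np.random.default_rng(0)
action, logp = sample_schedule([2.0, -2.0, 0.0], rng)  # 3 devices
```

The joint log-probability is the sum of per-device terms, so the PPO importance ratio remains cheap to compute even for large device populations.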
Citations: 0
On Learning Suitable Caching Policies for In-Network Caching
Pub Date : 2024-07-31 DOI: 10.1109/TMLCN.2024.3436472
Stéfani Pires;Adriana Ribeiro;Leobino N. Sampaio
In-network cache architectures, such as Information-Centric Networks (ICNs), have proven to be an efficient alternative for dealing with the growing content consumption on networks. In caching networks, any device can potentially act as a caching node. In practice, different nodes in a real cache network may employ different cache replacement policies. The reason is that policies can vary in efficiency according to unbounded context factors, such as cache size, content request pattern, content popularity distribution, and the relative cache location. The lack of a policy suitable for all nodes and scenarios undermines the efficient use of available cache resources. Therefore, a new model becomes necessary for choosing caching policies appropriate to each cache's context, on demand and over time. In this direction, we propose a new caching meta-policy strategy capable of learning the most appropriate policy for a cache online and of dynamically adapting to context variations that change which policy is best. The meta-policy decouples the eviction strategy from the management of the context information used by the policy, and models the choice of suitable policies as an online learning problem with bandit feedback. The meta-policy supports deploying a diverse set of self-contained caching policies, including adaptive policies, in different scenarios. Experimental results with single and multiple caches show the meta-policy's effectiveness and adaptability to different content request models in synthetic and trace-driven simulations. Moreover, we compare the meta-policy's adaptive behavior with that of the Adaptive Replacement Cache (ARC) policy.
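The "online learning with bandit feedback" idea can be sketched with an epsilon-greedy bandit choosing among candidate eviction policies (a simplified stand-in; the policy names, reward signal, and learner are assumptions, not the paper's exact algorithm):

```python
import random

class MetaPolicy:
    """Epsilon-greedy bandit over candidate cache eviction policies.

    The reward observed after running a policy for a window (e.g. its
    hit rate) updates a running value estimate for that policy.
    """
    def __init__(self, policies, epsilon=0.1):
        self.policies = policies                 # e.g. ["lru", "lfu"]
        self.epsilon = epsilon
        self.counts = {p: 0 for p in policies}
        self.values = {p: 0.0 for p in policies}

    def choose(self):
        if random.random() < self.epsilon:       # explore
            return random.choice(self.policies)
        return max(self.values, key=self.values.get)  # exploit

    def update(self, policy, reward):
        # incremental mean of the observed rewards for this policy
        self.counts[policy] += 1
        self.values[policy] += (reward - self.values[policy]) / self.counts[policy]

meta = MetaPolicy(["lru", "lfu"], epsilon=0.0)
meta.update("lru", 0.2)   # observed hit rate under LRU
meta.update("lfu", 0.8)   # observed hit rate under LFU
best = meta.choose()
```

Because the learner only sees the reward of the policy it actually ran, this is bandit (not full-information) feedback, matching the problem framing in the abstract.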
Citations: 0
Ensemble-Based Reliability Enhancement for Edge-Deployed CNNs in Few-Shot Scenarios
Pub Date : 2024-07-29 DOI: 10.1109/TMLCN.2024.3435168
Zhen Gao;Shuang Liu;Junbo Zhao;Xiaofei Wang;Yu Wang;Zhu Han
Convolutional Neural Networks (CNNs) have been applied across wide areas of computer vision, and edge intelligence is expected to provide instant AI services with the support of broadband mobile networks. However, the deployment of CNNs on the network edge faces severe challenges. First, edge or embedded devices are usually not reliable, and hardware failures can corrupt the CNN system, which is unacceptable for critical applications such as autonomous driving and object detection on space platforms. Second, edge or embedded devices are usually resource-limited, so traditional redundancy-based protection methods are inapplicable due to their huge overhead. Although network pruning is effective at reducing the complexity of CNNs, in many scenarios we cannot obtain sufficient data for performance recovery due to privacy and security concerns. To enhance the reliability of CNNs on resource-limited devices under the few-shot constraint, we propose constructing an ensemble system of weak base CNNs pruned from the original strong CNN. To improve the ensemble performance through diverse base CNNs, we first propose a novel filter-importance evaluation method that combines the amplitude and gradient information of each filter. Since the gradient part depends on the input data, different subsets of data are used for layer sensitivity analysis of the different base CNNs, so that a different pruning configuration is obtained for each base CNN. On this basis, a modified ReLU function is proposed to determine the final pruning rate of each layer in each base CNN. Extensive experiments show that the proposed solution can effectively improve the reliability of CNNs with a much lower resource requirement for each edge server.
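The amplitude-plus-gradient filter scoring can be sketched as follows (the mixing coefficient `alpha` and the linear combination are illustrative assumptions; the paper's exact combination rule may differ):

```python
import numpy as np

def filter_importance(weights, grads, alpha=0.5):
    """Score each conv filter by combining weight and gradient magnitude.

    weights, grads: (F, C, K, K) arrays for one conv layer.
    Returns an (F,) importance score; higher means keep the filter.
    alpha is a hypothetical mixing coefficient between the two terms.
    """
    amp = np.abs(weights).sum(axis=(1, 2, 3))   # amplitude term
    grad = np.abs(grads).sum(axis=(1, 2, 3))    # data-dependent gradient term
    # normalize so the two terms are on a comparable scale
    amp /= amp.sum() + 1e-12
    grad /= grad.sum() + 1e-12
    return alpha * amp + (1 - alpha) * grad

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))           # 8 filters, toy layer
g = rng.standard_normal((8, 3, 3, 3))           # gradients from one data subset
scores = filter_importance(w, g)
pruned = np.argsort(scores)[: 8 // 2]           # least important half
```

Because the gradient term is computed on a data subset, feeding different subsets to different base CNNs yields different scores, and hence the diverse pruning configurations the abstract describes.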
Citations: 0
Hierarchical Reinforcement Learning for Multi-Layer Multi-Service Non-Terrestrial Vehicular Edge Computing
Pub Date : 2024-07-25 DOI: 10.1109/TMLCN.2024.3433620
Swapnil Sadashiv Shinde;Daniele Tarchi
Vehicular Edge Computing (VEC) represents a novel advancement within the Internet of Vehicles (IoV). Despite its implementation through Road Side Units (RSUs), VEC frequently falls short of satisfying the escalating demands of Vehicle Users (VUs) for new services, necessitating supplementary computational and communication resources. Non-Terrestrial Networks (NTNs) with onboard Edge Computing (EC) facilities are gaining a central place in the 6G vision, allowing future services to be extended to uncovered areas as well. This scenario, composed of a multitude of VUs and terrestrial and non-terrestrial nodes, and characterized by mobility and stringent requirements, introduces very high complexity. Machine Learning (ML) represents a perfect tool for solving these types of problems. Integrated Terrestrial and Non-Terrestrial (T-NT) EC, supported by innovative intelligent solutions enabled through ML technology, can boost VEC capacity, coverage range, and resource utilization. Therefore, by exploring integrated T-NT EC platforms, we design a multi-EC-enabled vehicular networking platform with a heterogeneous set of services. Next, we model the latency and energy requirements for processing VU tasks through partial computation offloading operations. We aim to optimize the overall latency and energy requirements for processing the VU data by selecting the appropriate edge nodes and the offloading amount. The problem is defined as a multi-layer sequential decision-making problem through Markov Decision Processes (MDPs). The Hierarchical Reinforcement Learning (HRL) method, implemented through a deep Q-network, is used to optimize the network selection and offloading policies. Simulation results are compared with different benchmark methods to show performance gains in terms of overall cost requirements and reliability.
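A textbook latency/energy model for partial computation offloading looks like the sketch below (all symbols and the parallel local/edge execution assumption are illustrative, not the paper's notation):

```python
def offload_cost(bits, cycles_per_bit, split, f_local, f_edge, rate, p_tx, p_cpu):
    """Latency and VU-side energy for partially offloading a task.

    split: fraction of the task's bits offloaded to the edge node.
    f_local, f_edge: CPU frequencies [cycles/s]; rate: uplink rate [bit/s];
    p_tx, p_cpu: transmit and local-compute power [W].
    """
    local_bits = (1 - split) * bits
    edge_bits = split * bits
    t_local = local_bits * cycles_per_bit / f_local
    t_tx = edge_bits / rate                       # uplink transmission time
    t_edge = edge_bits * cycles_per_bit / f_edge  # edge processing time
    # local computing runs in parallel with (transmission + edge computing)
    latency = max(t_local, t_tx + t_edge)
    energy = p_cpu * t_local + p_tx * t_tx        # VU-side energy only
    return latency, energy

# offloading everything vs. nothing, with a much faster edge server
lat_all, _ = offload_cost(1e6, 100, 1.0, f_local=1e8, f_edge=1e10,
                          rate=1e7, p_tx=0.5, p_cpu=1.0)
lat_none, _ = offload_cost(1e6, 100, 0.0, f_local=1e8, f_edge=1e10,
                           rate=1e7, p_tx=0.5, p_cpu=1.0)
```

An RL agent then searches over the edge node and the `split` value to minimize a weighted latency/energy cost, which is the decision the HRL scheme optimizes hierarchically.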
Citations: 0
Full-Duplex Millimeter Wave MIMO Channel Estimation: A Neural Network Approach
Pub Date : 2024-07-24 DOI: 10.1109/TMLCN.2024.3432865
Mehdi Sattari;Hao Guo;Deniz Gündüz;Ashkan Panahi;Tommy Svensson
Millimeter wave (mmWave) multiple-input multiple-output (MIMO) is now a reality with great potential for further improvement. We study full-duplex transmission as an effective way to improve mmWave MIMO systems. Compared to half-duplex systems, full-duplex transmission may offer higher data rates and lower latency. However, full-duplex transmission is hindered by self-interference (SI) at the receive antennas, and SI channel estimation becomes a crucial step in making full-duplex systems feasible. In this paper, we address the problem of channel estimation in full-duplex mmWave MIMO systems using neural networks (NNs). Our approach involves sharing pilot resources between user equipments (UEs) and transmit antennas at the base station (BS), aiming to reduce the pilot overhead of full-duplex systems to a level comparable to that of half-duplex systems. Additionally, in the case of separate antenna configurations at a full-duplex BS, providing channel estimates of the transmit antenna (TX) arrays to the downlink UEs poses another challenge, as the TX arrays are not capable of receiving pilot signals. To address this, we employ an NN to map the channel from the downlink UEs to the receive antenna (RX) arrays to the channel from the TX arrays to the downlink UEs. We further elaborate on how NNs perform the estimation with different architectures (e.g., different numbers of hidden layers), under non-linear distortion (e.g., with a 1-bit analog-to-digital converter (ADC)), and under different channel conditions (e.g., low-correlation and high-correlation channels). Our work provides novel insights into NN-based channel estimators.
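The RX-to-TX channel mapping can be pictured as a small regression network: a vectorized UE-to-RX channel estimate goes in, a predicted TX-to-UE channel comes out. The sketch below uses assumed array sizes, a single ReLU hidden layer, and randomly initialized weights standing in for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# assumed dimensions: 16-antenna RX array, 16-antenna TX array,
# complex channels stacked as [real; imag] vectors
n_in, n_hidden, n_out = 2 * 16, 64, 2 * 16

# random weights stand in for parameters learned from pilot data
W1 = rng.standard_normal((n_hidden, n_in)) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_out, n_hidden)) * 0.1
b2 = np.zeros(n_out)

def predict_tx_channel(h_rx):
    """Map a UE->RX channel estimate to a TX->UE channel prediction."""
    hidden = np.maximum(0.0, W1 @ h_rx + b1)   # ReLU hidden layer
    return W2 @ hidden + b2

h_rx = rng.standard_normal(n_in)               # toy channel estimate
h_tx_pred = predict_tx_channel(h_rx)
```

Such a mapping is learnable only insofar as the UE-to-RX and TX-to-UE channels are correlated, which is why the paper studies low- and high-correlation channel conditions separately.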
Citations: 0
Intellicise Router Promotes Endogenous Intelligence in Communication Network
Pub Date : 2024-07-24 DOI: 10.1109/TMLCN.2024.3432861
Qiyun Guo;Haotai Liang;Zhicheng Bao;Chen Dong;Xiaodong Xu;Zhongzheng Tang;Yue Bei
Endogenous intelligence has emerged as a crucial aspect of next-generation communication networks. This concept is closely intertwined with artificial intelligence (AI), with its primary components being data, algorithms, and computility. Data collection remains a critical concern that warrants focused attention. To address the challenge of data expansion and forwarding, the intellicise router is proposed. It extends the local dataset and continuously enhances the local model through a specifically crafted algorithm, which improves AI performance, as exemplified by its application to image recognition tasks. Service capability is employed to gauge the router’s ability to provide services, and its upper bounds are derived. To analyze the algorithm’s effectiveness, a category-increase model is developed to calculate the probability that the number of categories grows, under both equal and unequal probabilities of the image communication categories. The numerical analysis results align with the simulation results, affirming the validity of the category-increase model. To assess the performance of the intellicise router, a communication system is simulated. A comparative analysis of these experimental results demonstrates that the intellicise router can continuously improve its performance to provide better service.
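The equal- vs. unequal-probability comparison can be illustrated with a standard occupancy computation: the expected number of distinct categories observed after n images (an illustrative calculation, not the paper's exact category-increase model):

```python
import numpy as np

def expected_categories(K, n, p=None):
    """Expected number of distinct categories seen after n i.i.d. images.

    K: total number of categories; p: per-category probabilities
    (uniform when None). Uses E = sum_k (1 - (1 - p_k)^n).
    """
    if p is None:
        p = np.full(K, 1.0 / K)
    p = np.asarray(p, dtype=float)
    return float(np.sum(1.0 - (1.0 - p) ** n))

eq = expected_categories(K=10, n=50)                       # equal probabilities
uneq = expected_categories(K=10, n=50,
                           p=[0.55] + [0.05] * 9)          # one dominant category
```

Skewed category probabilities slow the growth of the local dataset's category count, which is why the equal and unequal cases are analyzed separately.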
Q. Guo, H. Liang, Z. Bao, C. Dong, X. Xu, Z. Tang, and Y. Bei, "Intellicise Router Promotes Endogenous Intelligence in Communication Network," IEEE Transactions on Machine Learning in Communications and Networking, vol. 2, pp. 1509-1526, 2024. DOI: 10.1109/TMLCN.2024.3432861. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10608170
Citations: 0
Accelerating Fair Federated Learning: Adaptive Federated Adam
Pub Date : 2024-07-04 DOI: 10.1109/TMLCN.2024.3423648
Li Ju;Tianru Zhang;Salman Toor;Andreas Hellander
Federated learning is a distributed and privacy-preserving approach for collaboratively training a statistical model from decentralized data held by different parties. However, when the datasets are not independent and identically distributed, models trained by naive federated algorithms may be biased towards certain participants, and model performance across participants is non-uniform. This is known as the fairness problem in federated learning. In this paper, we formulate fairness-controlled federated learning as a dynamic multi-objective optimization problem to ensure fairness and convergence with theoretical guarantees. To solve the problem efficiently, we study the convergence and bias of Adam as the server optimizer in federated learning, and propose Adaptive Federated Adam (AdaFedAdam) to accelerate fair federated learning with alleviated bias. We validate the effectiveness, Pareto optimality, and robustness of AdaFedAdam with numerical experiments and show that AdaFedAdam outperforms existing algorithms, providing better convergence and fairness properties for the federated scheme.
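AdaFedAdam itself is not given in this listing. As background for the abstract's "Adam as the server optimizer," here is a minimal sketch of plain server-side federated Adam, where the averaged client update acts as a pseudo-gradient for a standard Adam step (the function name and the toy two-client quadratic are illustrative assumptions, not the authors' algorithm):

```python
import numpy as np

def fedadam_round(x, client_deltas, m, v, t, lr=0.1,
                  beta1=0.9, beta2=0.99, eps=1e-8):
    """One server round: average the client model deltas, treat the negated
    average as a pseudo-gradient, and apply a bias-corrected Adam step."""
    g = -np.mean(client_deltas, axis=0)      # pseudo-gradient
    m = beta1 * m + (1 - beta1) * g          # first-moment estimate
    v = beta2 * v + (1 - beta2) * g ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias corrections
    v_hat = v / (1 - beta2 ** t)
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

# Toy heterogeneity: each client pulls the model towards its own optimum.
x = np.zeros(2)
m, v = np.zeros(2), np.zeros(2)
optima = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for t in range(1, 201):
    deltas = [0.5 * (c - x) for c in optima]  # one local GD step per client
    x, m, v = fedadam_round(x, deltas, m, v, t)
# x settles near the average of the client optima, (0.5, 0.5); a fair
# variant would instead reweight clients to control per-client performance.
```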
L. Ju, T. Zhang, S. Toor, and A. Hellander, "Accelerating Fair Federated Learning: Adaptive Federated Adam," IEEE Transactions on Machine Learning in Communications and Networking, vol. 2, pp. 1017-1032, 2024. DOI: 10.1109/TMLCN.2024.3423648. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10584508
Citations: 0
Learning on Bandwidth Constrained Multi-Source Data With MIMO-Inspired DPP MAP Inference
Pub Date : 2024-07-02 DOI: 10.1109/TMLCN.2024.3421907
Xiwen Chen;Huayu Li;Rahul Amin;Abolfazl Razi
Determinantal Point Process (DPP) is a powerful technique for enhancing data diversity by promoting repulsion among similar elements in the selected samples. In particular, DPP-based Maximum A Posteriori (MAP) inference is used to identify the subsets with the highest diversity. However, the commonly adopted presumption that all data samples are available at a single location hinders its applicability to real-world scenarios where data samples are distributed across distinct sources with intermittent and bandwidth-limited connections. This paper proposes a distributed version of DPP inference to enhance multi-source data diversification under limited communication budgets. First, we convert the lower bound of the diversity-maximized distributed sample selection from a matrix-determinant optimization to the simpler form of a sum of individual terms. Next, a determinant-preserving sparse representation of the selected samples is formed by the sink as a surrogate for the collected samples and sent back to the sources as lightweight messages, eliminating the need for raw data exchange. Our approach is inspired by the channel orthogonalization process of Multiple-Input Multiple-Output (MIMO) systems based on Channel State Information (CSI). Extensive experiments verify the superiority of our scalable method over the most commonly used data selection methods, including GreeDi, Greedymax, random selection, and stratified sampling, with a substantial gain of at least a 12% reduction in Relative Diversity Error (RDE).
X. Chen, H. Li, R. Amin, and A. Razi, "Learning on Bandwidth Constrained Multi-Source Data With MIMO-Inspired DPP MAP Inference," IEEE Transactions on Machine Learning in Communications and Networking, vol. 2, pp. 1341-1356, 2024. DOI: 10.1109/TMLCN.2024.3421907. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10580972
Citations: 0
Fair Probabilistic Multi-Armed Bandit With Applications to Network Optimization
Pub Date : 2024-07-01 DOI: 10.1109/TMLCN.2024.3421170
Zhiwu Guo;Chicheng Zhang;Ming Li;Marwan Krunz
Online learning, particularly Multi-Armed Bandit (MAB) algorithms, has been extensively adopted in various real-world networking applications. In certain applications, such as fair heterogeneous networks coexistence, multiple links (individual arms) are selected in each round, and the throughputs (rewards) of these arms depend on the chosen set of links. Additionally, ensuring fairness among individual arms is a critical objective. However, existing MAB algorithms are unsuitable for these applications due to different models and assumptions. In this paper, we introduce a new fair probabilistic MAB (FP-MAB) problem aimed at either maximizing the minimum reward for all arms or maximizing the total reward while imposing a fairness constraint that guarantees a minimum selection fraction for each arm. In FP-MAB, the learning agent probabilistically selects a meta-arm, which is associated with one or multiple individual arms in each decision round. To address the FP-MAB problem, we propose two algorithms: Fair Probabilistic Explore-Then-Commit (FP-ETC) and Fair Probabilistic Optimism In the Face of Uncertainty (FP-OFU). We also introduce a novel concept of regret in the context of the max-min fairness objective. We analyze the performance of FP-ETC and FP-OFU in terms of the upper bound of average regret and average constraint violation. Simulation results demonstrate that FP-ETC and FP-OFU achieve lower regrets (or higher objective values) under the same fairness requirements compared to existing MAB algorithms.
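FP-ETC's exact probabilistic meta-arm construction is not given in this listing. A minimal explore-then-commit sketch with a per-arm minimum selection fraction in the commit phase illustrates the kind of fairness constraint the abstract describes (arm means, noise model, and all names are illustrative assumptions, not the authors' algorithm):

```python
import random

def fair_etc(means, horizon, explore_per_arm, min_frac, seed=0):
    """Explore-then-commit with a fairness constraint: after uniform
    exploration, each round plays the empirically best arm with probability
    1 - (K-1)*min_frac and every other arm with probability min_frac."""
    rng = random.Random(seed)
    K = len(means)
    pulls, rewards = [0] * K, [0.0] * K

    def pull(a):
        rewards[a] += means[a] + rng.gauss(0, 0.1)  # noisy reward
        pulls[a] += 1

    for a in range(K):                      # exploration phase
        for _ in range(explore_per_arm):
            pull(a)
    best = max(range(K), key=lambda a: rewards[a] / pulls[a])
    probs = [min_frac] * K                  # commit phase: fair mixture
    probs[best] = 1 - min_frac * (K - 1)
    for _ in range(horizon - K * explore_per_arm):
        pull(rng.choices(range(K), weights=probs, k=1)[0])
    return pulls, best

pulls, best = fair_etc([0.9, 0.5, 0.4], horizon=3000,
                       explore_per_arm=100, min_frac=0.1)
# Every arm retains roughly its guaranteed 10% selection fraction, while
# the empirically best arm absorbs the remaining probability mass.
```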
Z. Guo, C. Zhang, M. Li, and M. Krunz, "Fair Probabilistic Multi-Armed Bandit With Applications to Network Optimization," IEEE Transactions on Machine Learning in Communications and Networking, vol. 2, pp. 994-1016, 2024. DOI: 10.1109/TMLCN.2024.3421170. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10579843
Citations: 0