Backscatter-Assisted Computation Offloading for Energy Harvesting IoT Devices via Policy-based Deep Reinforcement Learning

Yutong Xie, Zhengzhuo Xu, Yuxing Zhong, Jing Xu, Shimin Gong, Yi Wang
2019 IEEE/CIC International Conference on Communications Workshops in China (ICCC Workshops)
DOI: 10.1109/ICCChinaW.2019.8849964
Published: 2019-08-01 · Citations: 22

Abstract

Wireless Internet of Things (IoT) devices can be deployed for data acquisition and decision making, e.g., wearable sensors used for healthcare monitoring. Due to their limited computation capability, low-power IoT devices can optionally offload power-consuming computation to a nearby computing server. To balance power consumption between data offloading and computation, we propose a novel hybrid data offloading scheme that allows each device to offload data via either conventional RF communications or low-power backscatter communications. Such flexibility makes it more complicated to optimize the offloading strategy under uncertain workload and energy supply at each device. As such, we propose deep reinforcement learning (DRL) to learn the optimal offloading policy from past experience. In particular, we rely on a policy-based DRL approach for continuous control problems in the actor-critic framework. By interacting with the network environment, we can optimize each user's energy harvesting time and the workload allocation among the different offloading schemes. The numerical results show that the proposed DRL approach achieves much higher reward and faster learning compared to the conventional deep Q-network (DQN) method.
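The abstract's key idea is an actor-critic agent that outputs continuous controls (the energy-harvesting time fraction and the workload split between backscatter and active RF). The sketch below illustrates that pattern with a linear Gaussian actor and a linear TD(0) critic on a toy environment; the reward and dynamics are illustrative assumptions, not the paper's actual system model, and the paper's own method uses deep networks rather than linear features.

```python
import math
import random

random.seed(0)

def step(state, action):
    """Toy environment (assumed, not the paper's model).
    State: (workload, stored energy). Action: (harvest fraction,
    backscatter share of offloaded data; the rest uses active RF)."""
    workload, energy = state
    harvest_frac, bs_frac = action
    harvested = 2.0 * harvest_frac                   # energy gained while harvesting
    tx_time = 1.0 - harvest_frac                     # time left for offloading
    # Backscatter: low rate, near-zero power. Active RF: high rate, high power.
    offloaded = tx_time * (0.5 * bs_frac + 2.0 * (1.0 - bs_frac))
    cost = tx_time * (0.1 * bs_frac + 1.5 * (1.0 - bs_frac))
    reward = min(offloaded, workload) - max(0.0, cost - energy - harvested)
    next_state = (random.uniform(0.5, 2.0), random.uniform(0.0, 1.0))
    return next_state, reward

def squash(x):
    """Map an unbounded sample into (0, 1), the valid range of both controls."""
    return 1.0 / (1.0 + math.exp(-x))

class ActorCritic:
    """Linear Gaussian actor + linear critic, updated with the TD advantage."""
    def __init__(self, lr=0.01, sigma=0.2, gamma=0.9):
        self.wa = [[0.0] * 3, [0.0] * 3]             # actor weights, one row per control
        self.wc = [0.0] * 3                          # critic weights
        self.lr, self.sigma, self.gamma = lr, sigma, gamma

    def features(self, s):
        return [s[0], s[1], 1.0]

    def value(self, s):
        return sum(w * f for w, f in zip(self.wc, self.features(s)))

    def act(self, s):
        f = self.features(s)
        mean = [sum(w * x for w, x in zip(row, f)) for row in self.wa]
        raw = [m + random.gauss(0.0, self.sigma) for m in mean]
        return tuple(squash(r) for r in raw), raw, mean

    def update(self, s, raw, mean, r, s2):
        td = r + self.gamma * self.value(s2) - self.value(s)   # TD advantage
        f = self.features(s)
        for i in range(3):                           # critic: TD(0) step
            self.wc[i] += self.lr * td * f[i]
        for k in range(2):                           # actor: Gaussian policy gradient
            g = (raw[k] - mean[k]) / self.sigma ** 2
            for i in range(3):
                self.wa[k][i] += self.lr * td * g * f[i]

agent = ActorCritic()
state = (1.0, 0.5)
for _ in range(2000):
    action, raw, mean = agent.act(state)
    state_next, reward = step(state, action)
    agent.update(state, raw, mean, reward, state_next)
    state = state_next
```

The continuous action space is why the authors favor a policy-based actor-critic method over DQN: a value-based learner would have to discretize the harvesting-time and workload-split controls, whereas the Gaussian policy samples them directly.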