Reinforcement Learning Environment for Wavefront Sensorless Adaptive Optics in Single-Mode Fiber Coupled Optical Satellite Communications Downlinks

IF 2.1 | Zone 4 (Physics and Astrophysics) | JCR Q2 (Optics) | Photonics | Pub Date: 2023-12-13 | DOI: 10.3390/photonics10121371
Payam Parvizi, Runnan Zou, Colin Bellinger, R. Cheriton, Davide Spinello

Abstract

Optical satellite communications (OSC) downlinks can support much higher bandwidths than radio-frequency channels. However, atmospheric turbulence degrades the optical beam wavefront, leading to reduced data transfer rates. In this study, we propose using reinforcement learning (RL) as a lower-cost alternative to standard wavefront sensor-based solutions. We estimate that RL has the potential to reduce system latency, while lowering system costs by omitting the wavefront sensor and low-latency wavefront processing electronics. This is achieved by adopting a control policy learned through interactions with a cost-effective and ultra-fast readout of a low-dimensional photodetector array, rather than relying on a wavefront phase profiling camera. However, RL-based wavefront sensorless adaptive optics (AO) for OSC downlinks faces challenges relating to prediction latency, sample efficiency, and adaptability. To gain a deeper insight into these challenges, we have developed and shared the first OSC downlink RL environment and evaluated a diverse set of deep RL algorithms in the environment. Our results indicate that the Proximal Policy Optimization (PPO) algorithm outperforms the Soft Actor–Critic (SAC) and Deep Deterministic Policy Gradient (DDPG) algorithms. Moreover, PPO converges to within 86% of the maximum performance achievable by the predominant Shack–Hartmann wavefront sensor-based AO system. Our findings indicate the potential of RL in replacing wavefront sensor-based AO while reducing the cost of OSC downlinks.
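The interaction loop described in the abstract (a policy that sees only a low-dimensional photodetector readout, commands a corrector, and is rewarded by the power coupled into the fiber) can be illustrated with a minimal sketch. This is not the authors' environment: the class name `ToySensorlessAOEnv`, the three pupil modes, the 3x3 detector, the Gaussian fiber-mode width, and the static per-episode aberration are all assumptions chosen for brevity, and only NumPy is used.

```python
import numpy as np


class ToySensorlessAOEnv:
    """Hypothetical toy environment (NOT the authors' code).

    State:       static random aberration over three pupil modes.
    Action:      corrector command for the same three modes.
    Observation: focal-plane intensity binned onto a 3x3 detector.
    Reward:      overlap of the corrected field with a Gaussian fiber mode.
    """

    def __init__(self, grid=32, turbulence_rms=0.5, horizon=20, seed=0):
        self.rng = np.random.default_rng(seed)
        self.grid, self.horizon = grid, horizon
        self.turbulence_rms = turbulence_rms
        self.n_pix, self.bin = 3, 3  # 3x3 detector, 3x3 FFT samples per pixel
        y, x = np.mgrid[-1:1:grid * 1j, -1:1:grid * 1j]
        r2 = x ** 2 + y ** 2
        self.pupil = (r2 <= 1.0).astype(float)
        # Assumed basis: tip, tilt, defocus (unnormalised, Zernike-like).
        self.modes = np.stack([x, y, 2 * r2 - 1]) * self.pupil
        self.n_modes = self.modes.shape[0]
        # Assumed Gaussian fiber mode, expressed in the pupil plane.
        self.fiber_mode = np.exp(-r2 / 0.8 ** 2) * self.pupil
        # Best achievable reward: flat wavefront.
        self.max_coupling = self._coupling(np.zeros((grid, grid)))
        self.reset()

    def _phase(self):
        residual = self.aberration - self.correction
        return np.tensordot(residual, self.modes, axes=1)

    def _coupling(self, phase):
        # Normalised overlap integral between the field and the fiber mode.
        field = self.pupil * np.exp(1j * phase)
        overlap = np.abs(np.sum(field * self.fiber_mode)) ** 2
        norm = np.sum(np.abs(field) ** 2) * np.sum(self.fiber_mode ** 2)
        return float(overlap / norm)

    def _observe(self, phase):
        # Zero-padded FFT gives the focal-plane field; keep the central 9x9
        # samples and sum them into a 3x3 detector readout.
        field = self.pupil * np.exp(1j * phase)
        pad = 4 * self.grid
        focal = np.fft.fftshift(np.fft.fft2(field, s=(pad, pad)))
        intensity = np.abs(focal) ** 2
        c, half = pad // 2, (self.n_pix * self.bin) // 2
        crop = intensity[c - half:c + half + 1, c - half:c + half + 1]
        binned = crop.reshape(self.n_pix, self.bin, self.n_pix, self.bin)
        return (binned.sum(axis=(1, 3)) / intensity.sum()).ravel()

    def reset(self):
        self.aberration = self.rng.normal(0.0, self.turbulence_rms, self.n_modes)
        self.correction = np.zeros(self.n_modes)
        self.t = 0
        return self._observe(self._phase())

    def step(self, action):
        # The action is an absolute corrector command, clipped to a safe range.
        self.correction = np.clip(np.asarray(action, dtype=float), -2.0, 2.0)
        self.t += 1
        phase = self._phase()
        reward = self._coupling(phase)
        done = self.t >= self.horizon
        return self._observe(phase), reward, done, {}
```

An agent would call `reset()` once per episode and then `step(action)` repeatedly, learning to map the nine detector values to mode commands that push the reward toward `max_coupling`. A realistic environment would additionally evolve the turbulence over time and model detector noise, which this sketch deliberately omits.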
Source journal
Photonics (Physics and Astronomy: Instrumentation)
CiteScore: 2.60
Self-citation rate: 20.80%
Articles published: 817
Review time: 8 weeks
Journal description: Photonics (ISSN 2304-6732) aims at a fast turnaround time for peer-reviewing manuscripts and producing accepted articles. The online-only and open access nature of the journal allows for speedy and wide circulation of research and review articles. The journal aims to establish itself as a leading venue for high-impact fundamental research as well as applications of optics and photonics, and it particularly welcomes both theoretical (simulation) and experimental research. Its aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of papers. Full experimental details must be provided so that the results can be reproduced. Electronic files and software giving the full details of the calculation and experimental procedure, if they cannot be published in the normal way, can be deposited as supplementary material.