Quantum logic gate synthesis as a Markov decision process

IF 6.6, Zone 1 (Physics and Astrophysics), Q1 (PHYSICS, APPLIED), npj Quantum Information, Pub Date: 2023-10-25, DOI: 10.1038/s41534-023-00766-w

M. Sohaib Alam, Noah F. Berthusen, Peter P. Orth


Citations: 4

Abstract



Reinforcement learning has witnessed recent applications to a variety of tasks in quantum programming. The underlying assumption is that those tasks could be modeled as Markov decision processes (MDPs). Here, we investigate the feasibility of this assumption by exploring its consequences for single-qubit quantum state preparation and gate compilation. By forming discrete MDPs, we solve for the optimal policy exactly through policy iteration. We find optimal paths that correspond to the shortest possible sequence of gates to prepare a state or compile a gate, up to some target accuracy. Our method works in both the absence and presence of noise and compares favorably to other quantum compilation methods, such as the Ross–Selinger algorithm. This work provides theoretical insight into why reinforcement learning may be successfully used to find optimally short gate sequences in quantum programming.
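The abstract's central idea, casting gate synthesis as a discrete MDP and solving it exactly by policy iteration, can be illustrated with a minimal toy sketch. This is not the authors' implementation: the state space (the six single-qubit stabilizer states, global phase ignored), the gate set (H and S), the cost of 1 per gate, the discount factor, and all function names are assumptions chosen so that the MDP is exactly finite, deterministic, and noise-free.

```python
# Toy MDP (illustrative assumption): states are the six single-qubit
# stabilizer states up to global phase; actions are the gates H and S.
STATES = ["0", "1", "+", "-", "+i", "-i"]
ACTIONS = ["H", "S"]
STEP = {
    "H": {"0": "+", "+": "0", "1": "-", "-": "1", "+i": "-i", "-i": "+i"},
    "S": {"0": "0", "1": "1", "+": "+i", "+i": "-", "-": "-i", "-i": "+"},
}
GAMMA = 0.9  # discount factor (assumed value)


def q_value(state, action, values, target):
    """One-step lookahead: each gate costs 1; the target is absorbing with value 0."""
    nxt = STEP[action][state]
    return -1.0 + GAMMA * (0.0 if nxt == target else values[nxt])


def policy_iteration(target, tol=1e-12):
    """Return an optimal gate choice for every non-target state."""
    policy = {s: ACTIONS[0] for s in STATES if s != target}
    values = {s: 0.0 for s in STATES}
    while True:
        # Policy evaluation: sweep the Bellman equation for the fixed policy
        # until the values stop changing.
        while True:
            delta = 0.0
            for s in policy:
                new_v = q_value(s, policy[s], values, target)
                delta = max(delta, abs(new_v - values[s]))
                values[s] = new_v
            if delta < tol:
                break
        # Policy improvement: switch to a gate only if it is strictly better.
        stable = True
        for s in policy:
            best = max(ACTIONS, key=lambda a: q_value(s, a, values, target))
            current = q_value(s, policy[s], values, target)
            if q_value(s, best, values, target) > current + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy


def gate_sequence(start, target):
    """Follow the optimal policy from start to target and list the gates applied."""
    policy = policy_iteration(target)
    state, gates = start, []
    while state != target:
        gates.append(policy[state])
        state = STEP[policy[state]][state]
    return gates


print(gate_sequence("0", "-i"))  # ['H', 'S', 'H']
```

Because every gate carries the same cost, the optimal policy in this toy model traces a shortest gate sequence to the target state, which mirrors the correspondence between optimal paths and shortest sequences described in the abstract. A noisy variant would replace the deterministic `STEP` table with transition probabilities.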


Source journal

npj Quantum Information Computer Science-Computer Science (miscellaneous)

CiteScore: 13.70

Self-citation rate: 3.90%

Articles published: 130

Review time: 29 weeks

Journal description: The scope of npj Quantum Information spans all relevant disciplines, fields, approaches and levels, and so considers outstanding work ranging from fundamental research to applications and technologies.