Pub Date: 2024-11-01 | DOI: 10.1038/s43588-024-00713-5
Weikang Li, Dong-Ling Deng
A method is introduced to compute provable bounds on noise-free quantum expectation values from noisy samples, with potential applications in quantum optimization and machine learning.
Pub Date: 2024-11-01 | DOI: 10.1038/s43588-024-00709-1
Samantha V. Barron, Daniel J. Egger, Elijah Pelofske, Andreas Bärtschi, Stephan Eidenbenz, Matthis Lehmkuehler, Stefan Woerner
Quantum computing has emerged as a powerful computational paradigm capable of solving problems beyond the reach of classical computers. However, today’s quantum computers are noisy, posing challenges to obtaining accurate results. Here, we explore the impact of noise on quantum computing, focusing on the challenges in sampling bit strings from noisy quantum computers and the implications for optimization and machine learning. We formally quantify the sampling overhead to extract good samples from noisy quantum computers and relate it to the layer fidelity, a metric to determine the performance of noisy quantum processors. Further, we show how this allows us to use the conditional value at risk of noisy samples to determine provable bounds on noise-free expectation values. We discuss how to leverage these bounds for different algorithms and demonstrate our findings through experiments on real quantum computers involving up to 127 qubits. The results show strong alignment with theoretical predictions.

In this study, the authors investigate the impact of noise on quantum computing with a focus on the challenges in sampling bit strings from noisy quantum computers, which has implications for optimization and machine learning.
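The conditional value at risk (CVaR) mentioned in the abstract is, informally, the mean of the best α-fraction of sampled objective values. A minimal numerical sketch of that quantity for a minimization problem (illustrative only, not the authors' implementation; the sample distribution is hypothetical):

```python
import numpy as np

def cvar(values, alpha):
    """CVaR_alpha for minimization: the mean of the lowest alpha-fraction
    of sampled objective values (alpha in (0, 1])."""
    values = np.sort(np.asarray(values, dtype=float))
    k = max(1, int(np.ceil(alpha * len(values))))
    return float(values[:k].mean())

rng = np.random.default_rng(0)
# Hypothetical objective values of bit strings sampled from a noisy device.
samples = rng.normal(loc=-3.0, scale=1.0, size=1000)

# Averaging only the best samples makes CVaR at most the plain sample mean,
# which is why it is a natural handle for bounding noise-free optima.
print(cvar(samples, alpha=0.1), samples.mean())
```

At alpha=1 the CVaR reduces to the ordinary sample mean; shrinking alpha concentrates the estimate on the tail of good samples.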
Provable bounds for noise-free expectation values computed from noisy samples. Nature Computational Science 4(11), 865–875. Open access PDF: https://www.nature.com/articles/s43588-024-00709-1.pdf
Generative artificial intelligence (GAI) requires substantial computational resources for model training and inference, but the electronic-waste (e-waste) implications of GAI and its management strategies remain underexplored. Here we introduce a computational power-driven material flow analysis framework to quantify and explore ways of managing the e-waste generated by GAI, with a particular focus on large language models. Our findings indicate that this e-waste stream could increase, potentially reaching a total accumulation of 1.2–5.0 million tons during 2020–2030, under different future GAI development settings. This may be intensified in the context of geopolitical restrictions on semiconductor imports and the rapid server turnover for operational cost savings. Meanwhile, we show that the implementation of circular economy strategies along the GAI value chain could reduce e-waste generation by 16–86%. This underscores the importance of proactive e-waste management in the face of advancing GAI technologies.

Generative artificial intelligence (GAI) is driving a surge in e-waste due to intensive computational infrastructure needs. This study emphasizes the necessity for proactive implementation of circular economy practices throughout GAI value chains.
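A back-of-the-envelope reading of the reported ranges (a sketch using only the figures quoted in the abstract, not the paper's material flow model):

```python
# Cumulative 2020-2030 GAI e-waste range reported in the abstract (million tons).
low, high = 1.2, 5.0

# Circular economy strategies are reported to cut generation by 16-86%;
# applying those cuts to the cumulative range shows the span of outcomes.
for cut in (0.16, 0.86):
    print(f"{cut:.0%} reduction: {low * (1 - cut):.2f}-{high * (1 - cut):.2f} Mt")
```

Even the weakest reported mitigation trims roughly a sixth of the projected stream; the strongest leaves under one million tons of the high-end scenario.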
E-waste challenges of generative artificial intelligence. Peng Wang, Ling-Yu Zhang, Asaf Tzachor, Wei-Qiang Chen. Pub Date: 2024-10-28 | DOI: 10.1038/s43588-024-00712-6. Nature Computational Science 4(11), 818–823.
Pub Date: 2024-10-25 | DOI: 10.1038/s43588-024-00716-2
Yunxin Xu, Di Liu, Haipeng Gong
Accurate prediction of protein mutation effects is of great importance in protein engineering and design. Here we propose GeoStab-suite, a suite of three geometric learning-based models—GeoFitness, GeoDDG and GeoDTm—for the prediction of fitness score, ΔΔG and ΔTm of a protein upon mutations, respectively. GeoFitness engages a specialized loss function to allow supervised training of a unified model using the large amount of multi-labeled fitness data in the deep mutational scanning database. To further improve the downstream tasks of ΔΔG and ΔTm prediction, the encoder of GeoFitness is reutilized as a pre-trained module in GeoDDG and GeoDTm to overcome the challenge of lacking sufficient labeled data. This pre-training strategy, in combination with data expansion, markedly improves model performance and generalizability. In the benchmark test, GeoDDG and GeoDTm outperform the other state-of-the-art methods by at least 30% and 70%, respectively, in terms of the Spearman correlation coefficient.

In this study, the authors propose a strategy to train a unified model to learn the general mutational effects based on multi-labeled deep mutational scanning (DMS) data, and then reutilize this pre-trained model to improve the downstream protein stability prediction tasks.
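The Spearman correlation used for benchmarking is the Pearson correlation of rank-transformed predictions and labels. A minimal tie-free sketch (illustrative, not the authors' evaluation code; the ΔΔG values are made up):

```python
import numpy as np

def spearman(x, y):
    """Spearman correlation for tie-free data: Pearson correlation
    of the two rank sequences."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

# Hypothetical predicted vs. measured ddG values for five mutations.
pred = np.array([-1.2, 0.4, 2.1, 0.9, -0.3])
true = np.array([-0.8, 0.1, 1.7, 1.2, -0.5])
print(spearman(pred, true))  # → 1.0 (these illustrative rank orders agree exactly)
```

Because it depends only on rank order, the metric rewards getting the relative ordering of mutation effects right rather than their absolute magnitudes.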
Improving the prediction of protein stability changes upon mutations by geometric learning and a pre-training strategy. Nature Computational Science 4(11), 840–850.
Pub Date: 2024-10-23 | DOI: 10.1038/s43588-024-00723-3
He Li, Zun Wang, Nianlong Zou, Meng Ye, Runzhang Xu, Xiaoxun Gong, Wenhui Duan, Yong Xu
Author Correction: Deep-learning density functional theory Hamiltonian for efficient ab initio electronic-structure calculation. Nature Computational Science 4(11), 876. Open access PDF: https://www.nature.com/articles/s43588-024-00723-3.pdf
Pub Date: 2024-10-23 | DOI: 10.1038/s43588-024-00704-6
Zachary Fralish, Daniel Reker
Active machine learning is employed in academia and industry to support drug discovery. A recent study unravels the factors that influence a deep learning model's ability to guide iterative discovery.
Taking a deep dive with active learning for drug discovery. Nature Computational Science 4(10), 727–728.
Pub Date: 2024-10-22 | DOI: 10.1038/s43588-024-00708-2
Christina L. Vizcarra, Ryan F. Trainor, Ashley Ringer McDonald, Chris T. Richardson, Davit Potoyan, Jessica A. Nash, Britt Lundgren, Tyler Luchko, Glen M. Hocky, Jonathan J. Foley IV, Timothy J. Atherton, Grace Y. Stokes
An interdisciplinary effort to integrate coding into science courses. Nature Computational Science 4(11), 803–804.
Pub Date: 2024-10-15 | DOI: 10.1038/s43588-024-00699-0
Guy Durant, Fergus Boyles, Kristian Birchall, Charlotte M. Deane
Many studies have prophesied that the integration of machine learning techniques into small-molecule therapeutics development will help to deliver a true leap forward in drug discovery. However, increasingly advanced algorithms and novel architectures have not always yielded substantial improvements in results. In this Perspective, we propose that a greater focus on the data for training and benchmarking these models is more likely to drive future improvement, and explore avenues for future research and strategies to address these data challenges.

The application of machine learning techniques to small-molecule drug discovery has not yet yielded a true leap forward in the field. This Perspective discusses how a renewed focus on data and validation could help unlock machine learning’s potential.
The future of machine learning for small-molecule drug discovery will be driven by data. Nature Computational Science 4(10), 735–743.