Deep learning has revolutionized chemical research by accelerating the discovery and understanding of complex chemical systems. However, polymer chemistry lacks a unified deep learning framework owing to the complexity of polymer structures. Existing self-supervised learning methods simplify polymers into repeating units and neglect their inherent periodicity, thereby limiting the models’ ability to generalize across tasks. To address this, we propose a periodicity-aware deep learning framework for polymers, PerioGT. In pre-training, a chemical knowledge-driven periodicity prior is constructed and incorporated into the model through contrastive learning. Then, periodicity prompts are learned in fine-tuning based on the prior. Additionally, a graph augmentation strategy is employed, which integrates additional conditions via virtual nodes to model complex chemical interactions. PerioGT achieves state-of-the-art performance on 16 downstream tasks. Wet-lab experiments highlight PerioGT’s potential in the real world, identifying two polymers with potent antimicrobial properties. Our results demonstrate that introducing the periodicity prior effectively enhances model performance. PerioGT is a self-supervised learning framework for polymer property prediction, integrating periodicity priors and additional conditions to enhance generalization under data scarcity and enable broad applicability.
Periodicity-aware deep learning for polymers. Yuhui Wu, Cong Wang, Xintian Shen, Tianyi Zhang, Peng Zhang, Jian Ji. Nature Computational Science 5(12), 1214-1226. Pub Date: 2025-11-20; DOI: 10.1038/s43588-025-00903-9
Pub Date: 2025-11-18; DOI: 10.1038/s43588-025-00900-y
Arnab Bhattacharjee, Zaid Zada, Haocheng Wang, Bobbi Aubrey, Werner Doyle, Patricia Dugan, Daniel Friedman, Orrin Devinsky, Adeen Flinker, Peter J Ramadge, Uri Hasson, Ariel Goldstein, Samuel A Nastase
Recent research demonstrates that large language models can predict neural activity recorded via electrocorticography during natural language processing. To predict word-by-word neural activity, most prior work evaluates encoding models within individual electrodes and participants, limiting generalizability. Here we analyze electrocorticography data from eight participants listening to the same 30-min podcast. Using a shared response model, we estimate a common information space across participants. This shared space substantially enhances large language model-based encoding performance and enables denoising of individual brain responses by projecting back into participant-specific electrode spaces, yielding a 37% average improvement in encoding accuracy (from r = 0.188 to r = 0.257). The greatest gains occur in brain areas specialized for language comprehension, particularly the superior temporal gyrus and inferior frontal gyrus. Our findings highlight that estimating a shared space allows us to construct encoding models that better generalize across individuals.
Aligning brains into a shared space improves their alignment with large language models. Nature Computational Science.
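As a quick check on the reported figure, the relative gain implied by the two correlations can be computed directly. This is only a back-of-the-envelope sketch: it uses the two average values quoted in the abstract and assumes that the 37% refers to the relative change between them. The variable names are illustrative, not taken from the study.

```python
# Average encoding accuracies quoted in the abstract (Pearson r).
baseline_r = 0.188       # encoding in individual electrode space
shared_space_r = 0.257   # after denoising via the shared space

# Relative improvement of the shared-space result over the baseline.
relative_gain = (shared_space_r - baseline_r) / baseline_r

print(f"{relative_gain:.1%}")  # prints 36.7%
```

The ratio of the two averages comes out at about 36.7%, consistent with the stated 37%. If the authors averaged per-electrode improvements instead, the exact value could differ slightly.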
Pub Date: 2025-11-17; DOI: 10.1038/s43588-025-00901-x
Fiona R. Kolbinger, Jakob Nikolas Kather
Technical metrics used to evaluate medical artificial intelligence tools often fail to predict their clinical impact. We characterize this discordance and propose a framework of study designs to guide the translational process for clinical artificial intelligence tools, acknowledging their diversity and specific validation requirements.
Adaptive validation strategies for real-world clinical artificial intelligence. Nature Computational Science 5(11), 980-986.
Pub Date: 2025-11-14; DOI: 10.1038/s43588-025-00898-3
Yi-Xin Sha, Ming-Yao Xia, Ling Lu, Yi Yang
Topological photonics and acoustics have attracted wide research interest for their ability to manipulate light and sound at surfaces. The supercell technique is the conventional standard approach used to calculate these boundary effects, but, as the supercell grows in size, this method requires increasingly large computational resources. Additionally, it falls short in differentiating the surface states at opposite boundaries and, due to finite-size effects, from bulk states. Here, to overcome these limitations, we provide two complementary efficient methods for obtaining the ideal topological surface states of semi-infinite systems of diverse surface configurations. The first is the cyclic reduction method, which is based on iteratively inverting the Hamiltonian for a single unit cell, and the other is the transfer matrix method, which relies on eigenanalysis of a transfer matrix for a pair of unit cells. Numerical benchmarks, including gyromagnetic photonic crystals, valley photonic crystals, spin-Hall acoustic crystals and quadrupole photonic crystals, jointly show that both methods can effectively sort out the boundary modes via the surface density of states, at reduced computational cost and increased speed. Our computational schemes enable direct comparisons with near-field scanning measurements, thereby expediting the exploration of topological artificial materials and the design of topological devices. This study reports two efficient methods—cyclic reduction and transfer matrix—to compute topological surface states in photonic and acoustic systems, cutting memory and time use by up to 100-fold and enabling the faster design of advanced topological devices.
Efficient algorithms for the surface density of states in topological photonic and acoustic systems. Nature Computational Science 5(12), 1192-1201.
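To make the idea of a surface density of states for a semi-infinite system concrete, here is a minimal sketch. It is not the authors' algorithm: it uses a standard decimation iteration for the surface Green's function (in the style of Sancho, López Sancho and Rubio), and it is applied to a toy two-site Su-Schrieffer-Heeger chain rather than a photonic or acoustic crystal. The function names, the hopping parameters `v` and `w`, and the broadening `eta` are all assumptions made for illustration.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def resolvent(z, h):
    """Return (z*I - h)^(-1) for a 2x2 matrix h."""
    m00, m01 = z - h[0][0], -h[0][1]
    m10, m11 = -h[1][0], z - h[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]

def surface_dos(energy, v, w, eta=1e-3, n_iter=40):
    """Surface density of states of a semi-infinite toy SSH chain.

    v: hopping inside a unit cell, w: hopping between neighbouring cells.
    Each iteration doubles the number of cells folded into the surface.
    """
    z = complex(energy, eta)
    h00 = [[0, v], [v, 0]]        # single unit cell
    alpha = [[0, 0], [w, 0]]      # coupling from cell n to cell n+1
    beta = [[0, w], [0, 0]]       # its Hermitian conjugate
    eps_s, eps = h00, h00         # effective surface and bulk blocks
    for _ in range(n_iter):
        g = resolvent(z, eps)
        ag, bg = mat_mul(alpha, g), mat_mul(beta, g)
        agb, bga = mat_mul(ag, beta), mat_mul(bg, alpha)
        eps_s = mat_add(eps_s, agb)
        eps = mat_add(mat_add(eps, agb), bga)
        alpha, beta = mat_mul(ag, alpha), mat_mul(bg, beta)
    g_s = resolvent(z, eps_s)     # surface Green's function
    return -(g_s[0][0] + g_s[1][1]).imag / math.pi

# Mid-gap surface weight: large when w > v (edge mode), tiny when v > w.
dos_topological = surface_dos(0.0, v=0.5, w=1.0)
dos_trivial = surface_dos(0.0, v=1.0, w=0.5)
print(dos_topological > dos_trivial)
```

In this toy model the mid-gap peak appears only when the inter-cell hopping exceeds the intra-cell hopping, which is the kind of boundary-mode signature that a surface density of states is meant to isolate from bulk states.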
In recent years, artificial intelligence has advanced the design–make–test–analyze cycle, transforming molecular discovery. Despite these advances, the compartmentalized approach to computer-aided molecular design and synthesis remains a critical bottleneck, limiting further optimization of the design–make–test–analyze cycle. To address this bottleneck, we introduce SynGFN, which models molecular design as a cascade of simulated chemical reactions, enabling the assembly of molecules from synthesizable building blocks. SynGFN features two key ingredients: (1) a hierarchically pretrained policy network that accelerates learning across diverse distributions of desirable molecules in chemical spaces, and (2) a multifidelity acquisition framework to alleviate the cost of reward evaluations. These technical developments collectively endow SynGFN with the capability to explore a chemical space up to an order of magnitude larger (measured in terms of #Circles) than that of other synthesis-aware generative models, while identifying the most diverse, synthesizable and high-performance molecules. We demonstrate SynGFN’s potential impact by designing inhibitors for GluN1/GluN3A, a therapeutic target for neuropsychiatric disorders. A persistent gap between theoretical molecules and experimentally viable compounds has hindered the practical adoption of generative algorithms. This study proposes SynGFN as a bridge linking molecular design and synthesis, accelerating exploration and producing diverse, synthesizable, high-performance molecules.
SynGFN: learning across chemical space with generative flow-based molecular discovery. Yuchen Zhu, Shuwang Li, Jihong Chen, Donghai Zhao, Xiaorui Wang, Yitong Li, Yifei Liu, Yue Kong, Beichen Zhang, Chang Liu, Tingjun Hou, Chang-Yu Hsieh. Nature Computational Science 6(1), 29-38. Pub Date: 2025-11-13; DOI: 10.1038/s43588-025-00902-w
Pub Date: 2025-11-12; DOI: 10.1038/s43588-025-00905-7
Samuel A. Nastase
A systematic comparison of large language models suggests that larger models align better with both human behavior and brain activity during natural reading. Instruction tuning, however, does not yield a similar benefit.
Larger language models better align with the reading brain. Nature Computational Science 5(11), 994-995.
Pub Date: 2025-11-04; DOI: 10.1038/s43588-025-00897-4
Yiqing Zhou, Chao Wan, Yichen Xu, Jin Peng Zhou, Kilian Q. Weinberger, Eun-Ah Kim
As quantum hardware advances toward enabling error-corrected quantum circuits in the near future, the absence of an efficient polynomial-time decoding algorithm for logical circuits presents a critical bottleneck. While quantum memory decoding has been well studied, inevitable correlated errors introduced by transversal entangling logical gates prevent the straightforward generalization of quantum memory decoders. Here we introduce a data-centric, modular decoder framework, the Multi-Core Circuit Decoder (MCCD), which consists of decoder modules corresponding to each logical operation supported by the quantum hardware. The MCCD handles both single-qubit and entangling gates within a unified framework. We train MCCD using mirror-symmetric random Clifford circuits, demonstrating its ability to effectively learn correlated decoding patterns. Through extensive testing on circuits substantially deeper than those used in training, we show that MCCD maintains high logical accuracy while exhibiting competitive polynomial decoding time across increasing circuit depths and code distances. When compared with conventional decoders such as minimum weight perfect matching (MWPM), most likely error (MLE) and belief propagation with ordered statistics post-processing (BP-OSD), MCCD achieves competitive accuracy with substantially better time efficiency, particularly for circuits with entangling gates. Our approach provides a noise-model-agnostic solution to the decoding challenge in deep logical quantum circuits. This study reports a machine learning decoder that efficiently corrects errors in quantum logical circuits with entangling gates. The Multi-Core Circuit Decoder achieves competitive accuracy while running much faster than conventional methods.
Learning to decode logical circuits. Nature Computational Science 5(12), 1158-1167. Open-access PDF: https://www.nature.com/articles/s43588-025-00897-4.pdf
Pub Date: 2025-10-31; DOI: 10.1038/s43588-025-00899-2
Alexandre Mojon, Robert Mahari, Sandro Claudio Lera
Selecting capable counsel can shape the outcome of litigation, yet evaluating law firm performance remains challenging. Widely used rankings prioritize prestige, size and revenue over empirical litigation outcomes, offering little practical guidance. Here, to address this gap, we build on the Bradley–Terry model and introduce a new ranking framework that treats each lawsuit as a competitive game between plaintiff and defendant law firms. Leveraging a newly constructed dataset of 60,540 US civil lawsuits involving 54,541 law firms, our findings show that existing reputation-based rankings correlate poorly with actual litigation success, while our outcome-based ranking substantially improves predictive accuracy. These findings establish a foundation for more transparent, data-driven assessments of legal performance. This study introduces a data-driven method for ranking law firms based on litigation outcomes, revealing that traditional reputation-based rankings do not reflect legal performance accurately.
Data-driven law firm rankings to reduce information asymmetry in legal disputes. Nature Computational Science 5(11), 1010-1016. Open-access PDF: https://www.nature.com/articles/s43588-025-00899-2.pdf
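For readers unfamiliar with the Bradley–Terry model mentioned in the abstract above, the following is a minimal sketch of how pairwise outcomes can be turned into a ranking. It fits the basic model with the classical iterative (minorization-maximization) update on a made-up win matrix for three hypothetical firms. The study's actual framework builds on this model and is likely more elaborate, so the data, function names and update scheme here are illustrative assumptions, not the authors' implementation.

```python
def fit_bradley_terry(wins, n_iter=200):
    """Estimate Bradley-Terry strengths from a square win-count matrix.

    wins[i][j] is the number of contests in which i beat j.
    Returns strengths normalized to sum to 1.
    """
    n = len(wins)
    strength = [1.0] * n
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Contests against each opponent, weighted by current strengths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
                for j in range(n) if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        strength = [s / norm for s in updated]
    return strength

def win_probability(strength, i, j):
    """Model probability that i prevails over j."""
    return strength[i] / (strength[i] + strength[j])

# Hypothetical outcomes: firm 0 beat firm 1 in 7 of 10 cases, and so on.
toy_wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = fit_bradley_terry(toy_wins)
ranking = sorted(range(len(strengths)), key=lambda k: -strengths[k])
print(ranking)  # prints [0, 1, 2]
```

Treating each lawsuit as one contest between a plaintiff firm and a defendant firm, the fitted strengths give both a ranking and a predicted win probability for any pairing, which is what makes the predictive accuracy of an outcome-based ranking testable.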