Gene fusions are common cancer drivers and therapeutic targets, but clinical-grade open-source bioinformatic tools are lacking. Here, we introduce a fusion detection method named SplitFusion, which achieves high speed by leveraging Burrows-Wheeler Aligner maximal exact match (BWA-MEM) split alignments, detects cryptic splice-site fusions (e.g., EML4::ALK v3b and ARv7), calls fusions involving highly repetitive gene partners (e.g., CIC::DUX4), and infers frame-ness and exon-boundary alignment for functional prediction and for minimizing false positives. Using 1,848 datasets of various sizes, SplitFusion demonstrated superior sensitivity and specificity compared to three other tools. In 1,076 formalin-fixed, paraffin-embedded lung cancer samples, SplitFusion identified novel fusions and revealed that EML4::ALK variant 3 was associated with multiple fusion variants coexisting in the same tumor. Additionally, SplitFusion can call targeted splicing variants. Using data from 515 samples from The Cancer Genome Atlas (TCGA), SplitFusion showed the highest sensitivity and uncovered two cases of SLC34A2::ROS1 that were missed in previous studies. These capabilities make SplitFusion highly suitable for clinical applications and for studying fusion-defined tumor heterogeneity.
{"title":"SplitFusion enables ultrasensitive gene fusion detection and reveals fusion variant-associated tumor heterogeneity.","authors":"Weiwei Bian, Baifeng Zhang, Zhengbo Song, Binyamin A Knisbacher, Yee Man Chan, Chloe Bao, Chunwei Xu, Wenxian Wang, Athena Hoi Yee Chu, Chenyu Lu, Hongxian Wang, Siyu Bao, Zhenyu Gong, Hoi Yee Keung, Zi-Ying Maggie Chow, Yiping Zhang, Wah Cheuk, Gad Getz, Valentina Nardi, Mengsu Yang, William Chi Shing Cho, Jian Wang, Juxiang Chen, Zongli Zheng","doi":"10.1016/j.patter.2025.101174","DOIUrl":"10.1016/j.patter.2025.101174","url":null,"abstract":"<p><p>Gene fusions are common cancer drivers and therapeutic targets, but clinical-grade open-source bioinformatic tools are lacking. Here, we introduce a fusion detection method named SplitFusion, which is fast by leveraging Burrows-Wheeler Aligner-maximal exact match (BWA-MEM) split alignments, can detect cryptic splice-site fusions (e.g., <i>EML4::ALK</i> v3b and <i>ARv7</i>), call fusions involving highly repetitive gene partners (e.g., <i>CIC::DUX4</i>), and infer frame-ness and exon-boundary alignments for functional prediction and minimizing false positives. Using 1,848 datasets of various sizes, SplitFusion demonstrated superior sensitivity and specificity compared to three other tools. In 1,076 formalin-fixed paraffin-embedded lung cancer samples, SplitFusion identified novel fusions and revealed that <i>EML4::ALK</i> variant 3 was associated with multiple fusion variants coexisting in the same tumor. Additionally, SplitFusion can call targeted splicing variants. Using data from 515 The Cancer Genome Atlas (TCGA) samples, SplitFusion showed the highest sensitivity and uncovered two cases of <i>SLC34A2::ROS1</i> that were missed in previous studies. These capabilities make SplitFusion highly suitable for clinical applications and the study of fusion-defined tumor heterogeneity.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101174"},"PeriodicalIF":6.7,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873004/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-14. DOI: 10.1016/j.patter.2025.101183
Alejandra Alvarado
{"title":"Lessons from the EU AI Act.","authors":"Alejandra Alvarado","doi":"10.1016/j.patter.2025.101183","DOIUrl":"https://doi.org/10.1016/j.patter.2025.101183","url":null,"abstract":"","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101183"},"PeriodicalIF":6.7,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873001/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-06 (eCollection 2025-02-14). DOI: 10.1016/j.patter.2025.101176
Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Mingchuan Yang, Bo Tang, Feiyu Xiong, Zhiyu Li
Large language models (LLMs) have demonstrated performance approaching human levels in tasks such as long-text comprehension and mathematical reasoning, but they remain black-box systems. Understanding the reasoning bottlenecks of LLMs remains a critical challenge, as these limitations are deeply tied to their internal architecture. Attention heads play a pivotal role in reasoning and are thought to share similarities with human brain functions. In this review, we explore the roles and mechanisms of attention heads to help demystify the internal reasoning processes of LLMs. We first introduce a four-stage framework inspired by the human thought process. Using this framework, we review existing research to identify and categorize the functions of specific attention heads. Additionally, we analyze the experimental methodologies used to discover these special heads and further summarize relevant evaluation methods and benchmarks. Finally, we discuss the limitations of current research and propose several potential future directions.
{"title":"Attention heads of large language models.","authors":"Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Mingchuan Yang, Bo Tang, Feiyu Xiong, Zhiyu Li","doi":"10.1016/j.patter.2025.101176","DOIUrl":"10.1016/j.patter.2025.101176","url":null,"abstract":"<p><p>Large language models (LLMs) have demonstrated performance approaching human levels in tasks such as long-text comprehension and mathematical reasoning, but they remain black-box systems. Understanding the reasoning bottlenecks of LLMs remains a critical challenge, as these limitations are deeply tied to their internal architecture. Attention heads play a pivotal role in reasoning and are thought to share similarities with human brain functions. In this review, we explore the roles and mechanisms of attention heads to help demystify the internal reasoning processes of LLMs. We first introduce a four-stage framework inspired by the human thought process. Using this framework, we review existing research to identify and categorize the functions of specific attention heads. Additionally, we analyze the experimental methodologies used to discover these special heads and further summarize relevant evaluation methods and benchmarks. Finally, we discuss the limitations of current research and propose several potential future directions.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101176"},"PeriodicalIF":6.7,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873009/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-06 (eCollection 2025-02-14). DOI: 10.1016/j.patter.2025.101177
Frans van der Sluis, Egon L van den Broek
Balancing prediction accuracy, model interpretability, and domain generalization (also known as out-of-distribution evaluation) is a central challenge in machine learning. To assess this challenge, we took 120 interpretable and 166 opaque models from 77,640 tuned configurations, complemented with ChatGPT, 3 probabilistic language models, and Vec2Read. The models first performed text classification to derive principles of textual complexity (task 1) and then generalized these to predict readers' appraisals of processing difficulty (task 2). The results confirmed the known accuracy-interpretability trade-off on task 1. However, on task 2's domain generalization, interpretable models outperformed complex, opaque models. Multiplicative interactions further improved the interpretable models' domain generalization incrementally. We advocate the value of big data for training, complemented by (1) external theories to enhance interpretability and guide machine learning and (2) small, well-crafted out-of-distribution datasets to validate models, together ensuring domain generalization and robustness against data shifts.
{"title":"Model interpretability enhances domain generalization in the case of textual complexity modeling.","authors":"Frans van der Sluis, Egon L van den Broek","doi":"10.1016/j.patter.2025.101177","DOIUrl":"10.1016/j.patter.2025.101177","url":null,"abstract":"<p><p>Balancing prediction accuracy, model interpretability, and domain generalization (also known as [a.k.a.] out-of-distribution testing/evaluation) is a central challenge in machine learning. To assess this challenge, we took 120 interpretable and 166 opaque models from 77,640 tuned configurations, complemented with ChatGPT, 3 probabilistic language models, and Vec2Read. The models first performed text classification to derive principles of textual complexity (task 1) and then generalized these to predict readers' appraisals of processing difficulty (task 2). The results confirmed the known accuracy-interpretability trade-off on task 1. However, task 2's domain generalization showed that interpretable models outperform complex, opaque models. Multiplicative interactions further improved interpretable models' domain generalization incrementally. We advocate for the value of big data for training, complemented by (1) external theories to enhance interpretability and guide machine learning and (2) small, well-crafted out-of-distribution data to validate models-together ensuring domain generalization and robustness against data shifts.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101177"},"PeriodicalIF":6.7,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873011/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-04 (eCollection 2025-02-14). DOI: 10.1016/j.patter.2025.101172
Tristan Pelser, Jann Michael Weinand, Patrick Kuckertz, Detlef Stolten
Accurate renewable energy resource assessments are necessary for energy system planning to meet climate goals, yet inconsistencies in methods and data can produce significant differences in results. This paper introduces ETHOS.REFLOW, a Python-based workflow manager that ensures transparency and reproducibility in energy potential assessments. The tool enables reproducible analyses with minimal effort by automating the entire workflow, from data acquisition to reporting. We demonstrate its functionality by estimating the technical offshore wind potential of the North Sea for fixed-foundation and mixed-technology (including floating turbines) scenarios. Two turbine-siting methods (explicit placement vs. uniform power density), as well as different wind datasets, are compared. Results show a maximum installable capacity of 768-861 GW and an annual yield of 2,961-3,047 TWh, with capacity factors between 41% and 46% and significant temporal variability. ETHOS.REFLOW offers a robust framework for reproducible energy potential studies, enabling energy system modelers to build on existing work and fostering trust in findings.
{"title":"ETHOS.REFLOW: An open-source workflow for reproducible renewable energy potential assessments.","authors":"Tristan Pelser, Jann Michael Weinand, Patrick Kuckertz, Detlef Stolten","doi":"10.1016/j.patter.2025.101172","DOIUrl":"10.1016/j.patter.2025.101172","url":null,"abstract":"<p><p>Accurate renewable energy resource assessments are necessary for energy system planning to meet climate goals, yet inconsistencies in methods and data can produce significant differences in results. This paper introduces ETHOS.REFLOW, a Python-based workflow manager that ensures transparency and reproducibility in energy potential assessments. The tool enables reproducible analyses with minimal effort by automating the entire workflow, from data acquisition to reporting. We demonstrate its functionality by estimating the technical offshore wind potential of the North Sea, for fixed-foundation and mixed-technology (including floating turbines) scenarios. Two methods for turbine siting (explicit placement vs. uniform power density) and wind datasets are compared. Results show a maximum installable capacity of 768-861 GW and an annual yield of 2,961-3,047 TWh, with capacity factors between 41% and 46% and significant temporal variability. ETHOS.REFLOW offers a robust framework for reproducible energy potential studies, enabling energy system modelers to build on existing work and fostering trust in findings.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101172"},"PeriodicalIF":6.7,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873006/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-04 (eCollection 2025-02-14). DOI: 10.1016/j.patter.2025.101175
Chaoyu Lei, Kang Dang, Sifan Song, Zilong Wang, Sien Ping Chew, Ruitong Bian, Xichen Yang, Zhouyu Guan, Claudia Isabel Marques de Abreu Lopes, Mini Hang Wang, Richard Wai Chak Choy, Xiaoyan Hu, Kenneth Ka Hei Lai, Kelvin Kam Lung Chong, Chi Pui Pang, Xuefei Song, Jionglong Su, Xiaowei Ding, Huifang Zhou
Medical conditions and systemic diseases often manifest as distinct facial characteristics, making identification of these unique features crucial for disease screening. However, detecting diseases using facial photography remains challenging because of the wide variability in human facial features and disease conditions. The integration of artificial intelligence (AI) into facial analysis represents a promising frontier offering a user-friendly, non-invasive, and cost-effective screening approach. This review explores the potential of AI-assisted facial analysis for identifying subtle facial phenotypes indicative of health disorders. First, we outline the technological framework essential for effective implementation in healthcare settings. Subsequently, we focus on the role of AI-assisted facial analysis in disease screening. We further expand our examination to include applications in health monitoring, support of treatment decision-making, and disease follow-up, thereby contributing to comprehensive disease management. Despite its promise, the adoption of this technology faces several challenges, including privacy concerns, model accuracy, issues with model interpretability, biases in AI algorithms, and adherence to regulatory standards. Addressing these challenges is crucial to ensure fair and ethical use. By overcoming these hurdles, AI-assisted facial analysis can empower healthcare providers, improve patient care outcomes, and enhance global health.
{"title":"AI-assisted facial analysis in healthcare: From disease detection to comprehensive management.","authors":"Chaoyu Lei, Kang Dang, Sifan Song, Zilong Wang, Sien Ping Chew, Ruitong Bian, Xichen Yang, Zhouyu Guan, Claudia Isabel Marques de Abreu Lopes, Mini Hang Wang, Richard Wai Chak Choy, Xiaoyan Hu, Kenneth Ka Hei Lai, Kelvin Kam Lung Chong, Chi Pui Pang, Xuefei Song, Jionglong Su, Xiaowei Ding, Huifang Zhou","doi":"10.1016/j.patter.2025.101175","DOIUrl":"10.1016/j.patter.2025.101175","url":null,"abstract":"<p><p>Medical conditions and systemic diseases often manifest as distinct facial characteristics, making identification of these unique features crucial for disease screening. However, detecting diseases using facial photography remains challenging because of the wide variability in human facial features and disease conditions. The integration of artificial intelligence (AI) into facial analysis represents a promising frontier offering a user-friendly, non-invasive, and cost-effective screening approach. This review explores the potential of AI-assisted facial analysis for identifying subtle facial phenotypes indicative of health disorders. First, we outline the technological framework essential for effective implementation in healthcare settings. Subsequently, we focus on the role of AI-assisted facial analysis in disease screening. We further expand our examination to include applications in health monitoring, support of treatment decision-making, and disease follow-up, thereby contributing to comprehensive disease management. Despite its promise, the adoption of this technology faces several challenges, including privacy concerns, model accuracy, issues with model interpretability, biases in AI algorithms, and adherence to regulatory standards. Addressing these challenges is crucial to ensure fair and ethical use. By overcoming these hurdles, AI-assisted facial analysis can empower healthcare providers, improve patient care outcomes, and enhance global health.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101175"},"PeriodicalIF":6.7,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873005/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-23 (eCollection 2025-02-14). DOI: 10.1016/j.patter.2025.101179
Qiong Wu, Nathan M Pajor, Yiwen Lu, Charles J Wolock, Jiayi Tong, Vitaly Lorman, Kevin B Johnson, Jason H Moore, Christopher B Forrest, David A Asch, Yong Chen
[This corrects the article DOI: 10.1016/j.patter.2024.101079.].
{"title":"Erratum: A latent transfer learning method for estimating hospital-specific post-acute healthcare demands following SARS-CoV-2 infection.","authors":"Qiong Wu, Nathan M Pajor, Yiwen Lu, Charles J Wolock, Jiayi Tong, Vitaly Lorman, Kevin B Johnson, Jason H Moore, Christopher B Forrest, David A Asch, Yong Chen","doi":"10.1016/j.patter.2025.101179","DOIUrl":"https://doi.org/10.1016/j.patter.2025.101179","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.1016/j.patter.2024.101079.].</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101179"},"PeriodicalIF":6.7,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11872999/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 (eCollection 2025-02-14). DOI: 10.1016/j.patter.2025.101173
Xinmei Yuan, Jiangbiao He, Yutong Li, Yu Liu, Yifan Ma, Bo Bao, Leqi Gu, Lili Li, Hui Zhang, Yucheng Jin, Long Sun
[This corrects the article DOI: 10.1016/j.patter.2024.100950.].
{"title":"Erratum: Data-driven evaluation of electric vehicle energy consumption for generalizing standard testing to real-world driving.","authors":"Xinmei Yuan, Jiangbiao He, Yutong Li, Yu Liu, Yifan Ma, Bo Bao, Leqi Gu, Lili Li, Hui Zhang, Yucheng Jin, Long Sun","doi":"10.1016/j.patter.2025.101173","DOIUrl":"https://doi.org/10.1016/j.patter.2025.101173","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.1016/j.patter.2024.100950.].</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101173"},"PeriodicalIF":6.7,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873020/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17 (eCollection 2025-02-14). DOI: 10.1016/j.patter.2024.101149
Paolo Muratore, Alireza Alemi, Davide Zoccolan
Despite their prominence as model systems for visual functions, it remains unclear whether rodents are capable of truly advanced processing of visual information. Here, we used a convolutional neural network (CNN) to measure the computational complexity required to account for rat object vision. We found that the rats' ability to discriminate objects despite scaling, translation, and rotation was well accounted for by the CNN's mid-level layers. However, the tolerance displayed by rats to more severe image manipulations (occlusion and reduction of objects to outlines) was achieved by the network only in its final layers. Moreover, rats deployed perceptual strategies that were more invariant than the CNN's, as they relied more consistently on the same set of diagnostic features across transformations. These results reveal an unexpected level of sophistication in rat object vision, while reinforcing the intuition that CNNs learn solutions that only marginally match those of biological visual systems.
{"title":"Unraveling the complexity of rat object vision requires a full convolutional network and beyond.","authors":"Paolo Muratore, Alireza Alemi, Davide Zoccolan","doi":"10.1016/j.patter.2024.101149","DOIUrl":"10.1016/j.patter.2024.101149","url":null,"abstract":"<p><p>Despite their prominence as model systems of visual functions, it remains unclear whether rodents are capable of truly advanced processing of visual information. Here, we used a convolutional neural network (CNN) to measure the computational complexity required to account for rat object vision. We found that rat ability to discriminate objects despite scaling, translation, and rotation was well accounted for by the CNN mid-level layers. However, the tolerance displayed by rats to more severe image manipulations (occlusion and reduction of objects to outlines) was achieved by the network only in the final layers. Moreover, rats deployed perceptual strategies that were more invariant than those of the CNN, as they more consistently relied on the same set of diagnostic features across transformations. These results reveal an unexpected level of sophistication of rat object vision, while reinforcing the intuition that CNNs learn solutions that only marginally match those of biological visual systems.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101149"},"PeriodicalIF":6.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873012/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}