Patterns: Latest Articles


PYPE: A pipeline for phenome-wide association and Mendelian randomization in investigator-driven biobank scale analysis

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-05-01 DOI: 10.1016/j.patter.2024.100982

Taykhoom Dalal, Chirag J. Patel

Phenome-wide association studies (PheWASs) serve as a way of documenting the relationship between genotypes and multiple phenotypes, helping to uncover unexplored genotype-phenotype associations (known as pleiotropy). Secondly, Mendelian randomization (MR) can be harnessed to make causal statements about a pair of phenotypes by comparing their genetic architecture. Thus, approaches that automate both PheWASs and MR can enhance biobank-scale analyses, circumventing the need for multiple tools by providing a comprehensive, end-to-end tool to drive scientific discovery. To this end, we present PYPE, a Python pipeline for running, visualizing, and interpreting PheWASs. PYPE utilizes input genotype or phenotype files to automatically estimate associations between the chosen independent variables and phenotypes. PYPE can also produce a variety of visualizations and can be used to identify nearby genes and functional consequences of significant associations. Finally, PYPE can identify possible causal relationships between phenotypes using MR under a variety of causal effect modeling scenarios.
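To make the MR step in the abstract above concrete, here is a minimal sketch of one common two-sample estimator, the fixed-effect inverse-variance weighted (IVW) estimate. The abstract does not say which MR estimators PYPE implements, so this is a generic illustration and not PYPE's API; the function name `ivw_estimate` and the toy numbers are hypothetical.

```python
from math import sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect of exposure on outcome.

    Each list holds one entry per genetic variant (instrument):
    beta_exposure -- variant effect on the exposure phenotype
    beta_outcome  -- variant effect on the outcome phenotype
    se_outcome    -- standard error of beta_outcome
    Hypothetical helper for illustration; not part of PYPE.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per variant")
    if not beta_exposure:
        raise ValueError("at least one variant is required")

    # Weight each variant by the inverse variance of its outcome effect.
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))

    estimate = numerator / denominator
    std_error = sqrt(1.0 / denominator)
    return estimate, std_error


# Toy summary statistics (made up): outcome effects are exactly half
# of the exposure effects, so the estimate should be close to 0.5.
bx = [0.1, 0.2, 0.4]
by = [0.05, 0.1, 0.2]
se = [0.01, 0.02, 0.01]
effect, effect_se = ivw_estimate(bx, by, se)
print(f"IVW effect = {effect:.3f} (SE {effect_se:.3f})")
```

The IVW estimate is only valid under the usual instrumental-variable assumptions; in particular it is biased when variants affect the outcome through pathways other than the exposure, which is why MR tools typically offer several estimators.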


Citations: 0

The landscape of biomedical research

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-04-09 DOI: 10.1016/j.patter.2024.100968

Rita González-Márquez, Luca Schmidt, Benjamin M. Schmidt, Philipp Berens, Dmitry Kobak

The number of publications in biomedicine and life sciences has grown so much that it is difficult to keep track of new scientific works and to have an overview of the evolution of the field as a whole. Here, we present a two-dimensional (2D) map of the entire corpus of biomedical literature, based on the abstract texts of 21 million English articles from the PubMed database. To embed the abstracts into 2D, we used the large language model PubMedBERT, combined with t-SNE tailored to handle samples of this size. We used our map to study the emergence of the COVID-19 literature, the evolution of the neuroscience discipline, the uptake of machine learning, the distribution of gender imbalance in academic authorship, and the distribution of retracted paper mill articles. Furthermore, we present an interactive website that allows easy exploration and will enable further insights and facilitate future research.


Citations: 0

Improving antibody language models with native pairing

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-04-04 DOI: 10.1016/j.patter.2024.100967

Sarah M. Burbach, Bryan Briney

Existing antibody language models are limited by their use of unpaired antibody sequence data. A recently published dataset of ∼1.6 × 10⁶ natively paired human antibody sequences offers a unique opportunity to evaluate how antibody language models are improved by training with native pairs. We trained three baseline antibody language models (BALM), using natively paired (BALM-paired), randomly-paired (BALM-shuffled), or unpaired (BALM-unpaired) sequences from this dataset. To address the paucity of paired sequences, we additionally fine-tuned ESM (evolutionary scale modeling)-2 with natively paired antibody sequences (ft-ESM). We provide evidence that training with native pairs allows the model to learn immunologically relevant features that span the light and heavy chains, which cannot be simulated by training with random pairs. We additionally show that training with native pairs improves model performance on a variety of metrics, including the ability of the model to classify antibodies by pathogen specificity.


Citations: 0

Enhancing molecular design efficiency: Uniting language models and generative networks with genetic algorithms

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-03-14 DOI: 10.1016/j.patter.2024.100947

Debsindhu Bhowmik, Pei Zhang, Zachary Fox, Stephan Irle, John Gounley

This study examines the effectiveness of generative models in drug discovery, material science, and polymer science, aiming to overcome constraints associated with traditional inverse design methods relying on heuristic rules. Generative models generate synthetic data resembling real data, enabling deep learning model training without extensive labeled datasets. They prove valuable in creating virtual libraries of molecules for material science and facilitating drug discovery by generating molecules with specific properties. While generative adversarial networks (GANs) are explored for these purposes, mode collapse restricts their efficacy, limiting novel structure variability. To address this, we introduce a masked language model (LM) inspired by natural language processing. Although LMs alone can have inherent limitations, we propose a hybrid architecture combining LMs and GANs to efficiently generate new molecules, demonstrating superior performance over standalone masked LMs, particularly for smaller population sizes. This hybrid LM-GAN architecture enhances efficiency in optimizing properties and generating novel samples.


Citations: 0

Optimal shrinkage denoising breaks the noise floor in high-resolution diffusion MRI

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-03-14 DOI: 10.1016/j.patter.2024.100954

Khoi Huynh, Wei-Tang Chang, Ye Wu, Pew-Thian Yap

The spatial resolution attainable in diffusion magnetic resonance (MR) imaging is inherently limited by noise. The weaker signal associated with a smaller voxel size, especially at a high level of diffusion sensitization, is often buried under the noise floor owing to the non-Gaussian nature of the MR magnitude signal. Here, we show how the noise floor can be suppressed remarkably via optimal shrinkage of singular values associated with noise in complex-valued k-space data from multiple receiver channels. We explore and compare different low-rank signal matrix recovery strategies to utilize the inherently redundant information from multiple channels. In combination with background phase removal, the optimal strategy reduces the noise floor by 11 times. Our framework enables imaging with substantially improved resolution for precise characterization of tissue microstructure and white matter pathways without relying on expensive hardware upgrades and time-consuming acquisition repetitions, outperforming other related denoising methods.
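As a rough illustration of what "optimal shrinkage of singular values" can mean, the sketch below applies one published shrinker, the Frobenius-loss rule of Gavish and Donoho, to a list of singular values. The abstract does not say which shrinker, loss, or noise-estimation procedure the authors use, so this is an assumption-laden stand-in and not the paper's method. The function name and arguments are hypothetical, and the noise standard deviation is assumed to be known.

```python
from math import sqrt


def shrink_singular_values(singular_values, n_rows, n_cols, noise_sigma):
    """Shrink singular values of a noisy low-rank matrix (Frobenius loss).

    Assumes additive i.i.d. noise with known standard deviation
    noise_sigma on an n_rows x n_cols matrix. Hypothetical helper for
    illustration; not the pipeline described in the abstract.
    """
    small, large = sorted((n_rows, n_cols))
    beta = small / large                # aspect ratio, 0 < beta <= 1
    scale = sqrt(large) * noise_sigma   # normalizes noise to unit level
    threshold = 1.0 + sqrt(beta)        # edge of the noise bulk

    shrunk = []
    for value in singular_values:
        y = value / scale
        if y < threshold:
            # Indistinguishable from noise: remove the component entirely.
            shrunk.append(0.0)
        else:
            inner = (y * y - beta - 1.0) ** 2 - 4.0 * beta
            shrunk.append(scale * sqrt(max(inner, 0.0)) / y)
    return shrunk


# Toy example with made-up singular values of a 4 x 4 matrix.
print(shrink_singular_values([3.0, 2.0, 1.0], n_rows=4, n_cols=4,
                             noise_sigma=0.5))
```

Components below the threshold are set to zero and the remaining ones are pulled toward zero by an amount that depends on how far they rise above the noise level. The denoised matrix would then be rebuilt from the original singular vectors and the shrunken values.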


Citations: 0

Anti-deficit is anti-racist and transformative

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-03-08 DOI: 10.1016/j.patter.2024.100934

Leticia Márquez-Magaña

Citations: 0

Physicians should build their own machine-learning models

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-03-08 DOI: 10.1016/j.patter.2024.100948

Yosra Magdi Mekki

Citations: 0

Meet the authors: Georgios Rizos, Jenna L. Lawson, and Björn W. Schuller

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-03-08 DOI: 10.1016/j.patter.2024.100952

Georgios Rizos, Jenna L. Lawson, Björn W. Schuller

Citations: 0

From scraped to published

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-03-08 DOI: 10.1016/j.patter.2024.100953

Alejandra Alvarado, Andrew L. Hufton

Citations: 0

Data-driven evaluation of electric vehicle energy consumption for generalizing standard testing to real-world driving

IF 6.5 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Patterns

Pub Date : 2024-03-08 DOI: 10.1016/j.patter.2024.100950

Xinmei Yuan, Jiangbiao He, Yutong Li, Yu Liu, Yifan Ma, Bo Bao, Leqi Gu, Lili Li, Hui Zhang, Yucheng Jin, Long Sun

Standard energy-consumption testing, providing the only publicly available quantifiable measure of battery electric vehicle (BEV) energy consumption, is crucial for promoting transparency and accountability in the electrified automotive industry; however, significant discrepancies between standard testing and real-world driving have hindered energy and environmental assessments of BEVs and their broader adoption. In this study, we propose a data-driven evaluation method for standard testing to characterize BEV energy consumption. By decoupling the impact of the driving profile, our evaluation approach is generalizable to various driving conditions. In experiments with our approach for estimating energy consumption, we achieve a 3.84% estimation error for 13 different multiregional standardized test cycles and a 7.12% estimation error for 106 diverse real-world trips. Our results highlight the great potential of the proposed approach for promoting public awareness of BEV energy consumption through standard testing while also providing a reliable fundamental model of BEVs.


Citations: 0
