Pub Date : 2024-06-01 DOI: 10.1016/j.patter.2024.101009
Sam Freesun Friedman, Shaan Khurshid
{"title":"Consider this a WARNing","authors":"Sam Freesun Friedman, Shaan Khurshid","doi":"10.1016/j.patter.2024.101009","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101009","url":null,"abstract":"","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141414393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-31 DOI: 10.1016/j.patter.2024.100994
Eve Richardson, Raphael Trevizani, Jason A. Greenbaum, Hannah Carter, Morten Nielsen, Bjoern Peters
Many problems in biology require looking for a “needle in a haystack,” corresponding to a binary classification where there are a few positives within a much larger set of negatives, which is referred to as a class imbalance. The receiver operating characteristic (ROC) curve and the associated area under the curve (AUC) have been reported as ill-suited to evaluate prediction performance on imbalanced problems where there is more interest in performance on the positive minority class, while the precision-recall (PR) curve is preferable. We show via simulation and a real case study that this is a misinterpretation of the difference between the ROC and PR spaces, showing that the ROC curve is robust to class imbalance, while the PR curve is highly sensitive to class imbalance. Furthermore, we show that class imbalance cannot be easily disentangled from classifier performance measured via PR-AUC.
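The contrast between the two metrics can be illustrated with a small simulation. The sketch below is not the authors' simulation: it assumes a hypothetical classifier whose scores for positives and negatives follow two unit-variance normal distributions one standard deviation apart, and it only changes the number of negatives. The function names (`make_scores`, `roc_auc`, `average_precision`) and the sample sizes are illustrative choices.

```python
from statistics import NormalDist

def make_scores(n_pos, n_neg, shift=1.0):
    """Deterministic (score, label) pairs taken at evenly spaced quantiles
    of two unit-variance normals (an assumed, idealized classifier)."""
    pos_dist, neg_dist = NormalDist(shift, 1.0), NormalDist(0.0, 1.0)
    pos = [(pos_dist.inv_cdf((i + 0.5) / n_pos), 1) for i in range(n_pos)]
    neg = [(neg_dist.inv_cdf((j + 0.5) / n_neg), 0) for j in range(n_neg)]
    return pos + neg

def roc_auc(scored):
    """Fraction of (positive, negative) pairs in which the positive scores higher.
    An exact tie would count as a win; none are expected with these scores."""
    n_pos = n_neg = wins = 0
    for _, label in sorted(scored):  # ascending score
        if label == 1:
            n_pos += 1
            wins += n_neg  # every negative seen so far scores lower
        else:
            n_neg += 1
    return wins / (n_pos * n_neg)

def average_precision(scored):
    """Mean of the precision values measured at the rank of each positive."""
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(sorted(scored, reverse=True), start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank
    return total / true_pos

# Same score distributions, different class ratios (1:1 versus 1:50).
balanced = make_scores(1000, 1000)
imbalanced = make_scores(1000, 50000)

auc_balanced, auc_imbalanced = roc_auc(balanced), roc_auc(imbalanced)
ap_balanced, ap_imbalanced = average_precision(balanced), average_precision(imbalanced)

print(f"ROC-AUC: balanced={auc_balanced:.3f}, imbalanced={auc_imbalanced:.3f}")
print(f"PR-AUC (average precision): balanced={ap_balanced:.3f}, imbalanced={ap_imbalanced:.3f}")
```

Because the score distributions are identical in both runs, any change in average precision here comes from the class ratio alone, which is the entanglement the abstract describes.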
{"title":"The receiver operating characteristic curve accurately assesses imbalanced datasets","authors":"Eve Richardson, Raphael Trevizani, Jason A. Greenbaum, Hannah Carter, Morten Nielsen, Bjoern Peters","doi":"10.1016/j.patter.2024.100994","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100994","url":null,"abstract":"<p>Many problems in biology require looking for a “needle in a haystack,” corresponding to a binary classification where there are a few positives within a much larger set of negatives, which is referred to as a class imbalance. The receiver operating characteristic (ROC) curve and the associated area under the curve (AUC) have been reported as ill-suited to evaluate prediction performance on imbalanced problems where there is more interest in performance on the positive minority class, while the precision-recall (PR) curve is preferable. We show via simulation and a real case study that this is a misinterpretation of the difference between the ROC and PR spaces, showing that the ROC curve is robust to class imbalance, while the PR curve is highly sensitive to class imbalance. Furthermore, we show that class imbalance cannot be easily disentangled from classifier performance measured via PR-AUC.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141191060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-13 DOI: 10.1016/j.patter.2024.100989
Eivind Heggernes Ask, Astrid Tschan-Plessl, Hanna Julie Hoel, Arne Kolstad, Harald Holte, Karl-Johan Malmberg
Flow cytometry is a powerful technology for high-throughput protein quantification at the single-cell level. Technical advances have substantially increased data complexity, but novel bioinformatical tools often show limitations in statistical testing, data sharing, cross-experiment comparability, or clinical data integration. We developed MetaGate as a platform for interactive statistical analysis and visualization of manually gated high-dimensional cytometry data with integration of metadata. MetaGate provides a data reduction algorithm based on a combinatorial gating system that produces a small, portable, and standardized data file. This is subsequently used to produce figures and statistical analyses through a fast web-based user interface. We demonstrate the utility of MetaGate through a comprehensive mass cytometry analysis of peripheral blood immune cells from 28 patients with diffuse large B cell lymphoma along with 17 healthy controls. Through MetaGate analysis, our study identifies key immune cell population changes associated with disease progression.
{"title":"MetaGate: Interactive analysis of high-dimensional cytometry data with metadata integration","authors":"Eivind Heggernes Ask, Astrid Tschan-Plessl, Hanna Julie Hoel, Arne Kolstad, Harald Holte, Karl-Johan Malmberg","doi":"10.1016/j.patter.2024.100989","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100989","url":null,"abstract":"<p>Flow cytometry is a powerful technology for high-throughput protein quantification at the single-cell level. Technical advances have substantially increased data complexity, but novel bioinformatical tools often show limitations in statistical testing, data sharing, cross-experiment comparability, or clinical data integration. We developed MetaGate as a platform for interactive statistical analysis and visualization of manually gated high-dimensional cytometry data with integration of metadata. MetaGate provides a data reduction algorithm based on a combinatorial gating system that produces a small, portable, and standardized data file. This is subsequently used to produce figures and statistical analyses through a fast web-based user interface. We demonstrate the utility of MetaGate through a comprehensive mass cytometry analysis of peripheral blood immune cells from 28 patients with diffuse large B cell lymphoma along with 17 healthy controls. Through MetaGate analysis, our study identifies key immune cell population changes associated with disease progression.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140933481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-13 DOI: 10.1016/j.patter.2024.100990
Samar Samir Khalil, Noha S. Tawfik, Marco Spruit
The incidences of mental health illnesses, such as suicidal ideation and depression, are increasing, which highlights the urgent need for early detection methods. There is a growing interest in using natural language processing (NLP) models to analyze textual data from patients, but accessing patients’ data for research purposes can be challenging due to privacy concerns. Federated learning (FL) is a promising approach that can balance the need for centralized learning with data ownership sensitivity. In this study, we examine the effectiveness of FL models in detecting depression by using a simulated multilingual dataset. We analyzed social media posts in five different languages with varying sample sizes. Our findings indicate that FL achieves strong performance in most cases while maintaining clients’ privacy for both independent and non-independent client partitioning.
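The abstract does not say which aggregation rule the study used. As a rough illustration of how federated learning keeps raw data on each client, the sketch below uses federated averaging (FedAvg), a common choice, on a toy one-dimensional regression. The two clients, their data, the learning rate, and all function names are hypothetical and stand in for the language-specific partitions and language models of the actual study.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of full-batch gradient descent on one client's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)

def fed_avg(client_weights, client_sizes):
    """Server step: average client parameters, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return tuple(
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    )

def train(clients, rounds=300):
    """Only parameters travel between server and clients; raw data stays local."""
    global_w = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = fed_avg(updates, [len(data) for data in clients])
    return global_w

# Two hypothetical clients with different sizes and input ranges
# (a non-identical partition), both drawn from y = 2x + 1.
clients = [
    [(k / 10, 2 * (k / 10) + 1) for k in range(0, 10)],
    [(k / 10, 2 * (k / 10) + 1) for k in range(10, 40)],
]

w, b = train(clients)
print(f"learned slope={w:.3f}, intercept={b:.3f}")
```

Weighting by sample count means larger clients influence the global model more, which matters when, as in the study, sample sizes vary across languages.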
{"title":"Federated learning for privacy-preserving depression detection with multilingual language models in social media posts","authors":"Samar Samir Khalil, Noha S. Tawfik, Marco Spruit","doi":"10.1016/j.patter.2024.100990","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100990","url":null,"abstract":"<p>The incidences of mental health illnesses, such as suicidal ideation and depression, are increasing, which highlights the urgent need for early detection methods. There is a growing interest in using natural language processing (NLP) models to analyze textual data from patients, but accessing patients’ data for research purposes can be challenging due to privacy concerns. Federated learning (FL) is a promising approach that can balance the need for centralized learning with data ownership sensitivity. In this study, we examine the effectiveness of FL models in detecting depression by using a simulated multilingual dataset. We analyzed social media posts in five different languages with varying sample sizes. Our findings indicate that FL achieves strong performance in most cases while maintaining clients’ privacy for both independent and non-independent client partitioning.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140934296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-10 DOI: 10.1016/j.patter.2024.100972
Fabio Crameri, Sari Hason
Color is crucial in scientific visualization, yet it is often misused. Addressing this, we think accessible and accurate techniques, such as color-blind friendly palettes and perceptually even gradients, are vital. Accountability and basic knowledge in data visualization are key in fostering a culture of color integrity, ensuring accurate and inclusive data representation.
{"title":"Navigating color integrity in data visualization","authors":"Fabio Crameri, Sari Hason","doi":"10.1016/j.patter.2024.100972","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100972","url":null,"abstract":"<p>Color is crucial in scientific visualization, yet it is often misused. Addressing this, we think accessible and accurate techniques, such as color-blind friendly palettes and perceptually even gradients, are vital. Accountability and basic knowledge in data visualization are key in fostering a culture of color integrity, ensuring accurate and inclusive data representation.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140933308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-10 DOI: 10.1016/j.patter.2024.100988
Peter S. Park, Simon Goldstein, Aidan O’Gara, Michael Chen, Dan Hendrycks
This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta’s CICERO) and general-purpose AI systems (including large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI. Finally, we outline several potential solutions: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.
{"title":"AI deception: A survey of examples, risks, and potential solutions","authors":"Peter S. Park, Simon Goldstein, Aidan O’Gara, Michael Chen, Dan Hendrycks","doi":"10.1016/j.patter.2024.100988","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100988","url":null,"abstract":"<p>This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta’s CICERO) and general-purpose AI systems (including large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI. Finally, we outline several potential solutions: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140933222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-08 DOI: 10.1016/j.patter.2024.100991
Zhehuan Fan, Jie Yu, Xiang Zhang, Yijie Chen, Shihui Sun, Yuanyuan Zhang, Mingan Chen, Fu Xiao, Wenyong Wu, Xutong Li, Mingyue Zheng, Xiaomin Luo, Dingyan Wang
Deep-learning-based classification models are increasingly used for predicting molecular properties in drug development. However, traditional classification models using the Softmax function often give overconfident mispredictions for out-of-distribution samples, highlighting a critical lack of accurate uncertainty estimation. Such limitations can result in substantial costs and should be avoided during drug development. Inspired by advances in evidential deep learning and Posterior Network, we replaced the Softmax function with a normalizing flow to enhance the uncertainty estimation ability of the model in molecular property classification. The proposed strategy was evaluated across diverse scenarios, including simulated experiments based on a synthetic dataset, ADMET predictions, and ligand-based virtual screening. The results demonstrate that compared with the vanilla model, the proposed strategy effectively alleviates the problem of giving overconfident but incorrect predictions. Our findings support the promising application of evidential deep learning in drug development and offer a valuable framework for further research.
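To make the overconfidence problem concrete, the toy sketch below compares a Softmax classifier with a density-based Dirichlet prediction, loosely in the spirit of Posterior Network. Everything in it is an assumption for illustration: the one-dimensional input, the fixed linear logits, the Gaussian class densities (standing in for learned per-class normalizing flows), and the class counts. It is not the authors' model.

```python
import math
from statistics import NormalDist

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical stand-ins: one Gaussian density per class in place of a learned
# normalizing flow, and the number of training samples seen for each class.
CLASS_DENSITY = [NormalDist(-1.0, 0.5), NormalDist(1.0, 0.5)]
CLASS_COUNTS = [100, 100]

def softmax_confidence(x):
    """Confidence of a fixed linear two-class Softmax model; it keeps growing with |x|."""
    return max(softmax([-2.0 * x, 2.0 * x]))

def dirichlet_prediction(x):
    """Dirichlet parameters = uniform prior (1) + count-scaled class density at x."""
    alpha = [1.0 + n * d.pdf(x) for n, d in zip(CLASS_COUNTS, CLASS_DENSITY)]
    total = sum(alpha)
    probs = [a / total for a in alpha]
    uncertainty = len(alpha) / total  # equals 1.0 when no evidence supports any class
    return probs, uncertainty

for x in (1.0, 20.0):  # in-distribution input versus far out-of-distribution input
    probs, uncertainty = dirichlet_prediction(x)
    print(
        f"x={x}: softmax confidence={softmax_confidence(x):.3f}, "
        f"dirichlet max prob={max(probs):.3f}, uncertainty={uncertainty:.3f}"
    )
```

Far from the training data the class densities vanish, so the Dirichlet prediction falls back to the uniform prior and reports high uncertainty, whereas the Softmax output saturates toward full confidence.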
{"title":"Reducing overconfident errors in molecular property classification using Posterior Network","authors":"Zhehuan Fan, Jie Yu, Xiang Zhang, Yijie Chen, Shihui Sun, Yuanyuan Zhang, Mingan Chen, Fu Xiao, Wenyong Wu, Xutong Li, Mingyue Zheng, Xiaomin Luo, Dingyan Wang","doi":"10.1016/j.patter.2024.100991","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100991","url":null,"abstract":"<p>Deep-learning-based classification models are increasingly used for predicting molecular properties in drug development. However, traditional classification models using the Softmax function often give overconfident mispredictions for out-of-distribution samples, highlighting a critical lack of accurate uncertainty estimation. Such limitations can result in substantial costs and should be avoided during drug development. Inspired by advances in evidential deep learning and Posterior Network, we replaced the Softmax function with a normalizing flow to enhance the uncertainty estimation ability of the model in molecular property classification. The proposed strategy was evaluated across diverse scenarios, including simulated experiments based on a synthetic dataset, ADMET predictions, and ligand-based virtual screening. The results demonstrate that compared with the vanilla model, the proposed strategy effectively alleviates the problem of giving overconfident but incorrect predictions. Our findings support the promising application of evidential deep learning in drug development and offer a valuable framework for further research.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140933408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-03 DOI: 10.1016/j.patter.2024.100983
Abdelrahman Sharafeldin, Nabil Imam, Hannah Choi
We present an end-to-end architecture for embodied exploration inspired by two biological computations: predictive coding and uncertainty minimization. The architecture can be applied to any exploration setting in a task-independent and intrinsically driven manner. We first demonstrate our approach in a maze navigation task and show that it can discover the underlying transition distributions and spatial features of the environment. Second, we apply our model to a more complex active vision task, whereby an agent actively samples its visual environment to gather information. We show that our model builds unsupervised representations through exploration that allow it to efficiently categorize visual scenes. We further show that using these representations for downstream classification leads to superior data efficiency and learning speed compared to other baselines while maintaining lower parameter complexity. Finally, the modular structure of our model facilitates interpretability, allowing us to probe its internal mechanisms and representations during exploration.
{"title":"Active sensing with predictive coding and uncertainty minimization","authors":"Abdelrahman Sharafeldin, Nabil Imam, Hannah Choi","doi":"10.1016/j.patter.2024.100983","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100983","url":null,"abstract":"<p>We present an end-to-end architecture for embodied exploration inspired by two biological computations: predictive coding and uncertainty minimization. The architecture can be applied to any exploration setting in a task-independent and intrinsically driven manner. We first demonstrate our approach in a maze navigation task and show that it can discover the underlying transition distributions and spatial features of the environment. Second, we apply our model to a more complex active vision task, whereby an agent actively samples its visual environment to gather information. We show that our model builds unsupervised representations through exploration that allow it to efficiently categorize visual scenes. We further show that using these representations for downstream classification leads to superior data efficiency and learning speed compared to other baselines while maintaining lower parameter complexity. Finally, the modular structure of our model facilitates interpretability, allowing us to probe its internal mechanisms and representations during exploration.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140832974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-02 DOI: 10.1016/j.patter.2024.100987
Kelly Rootes-Murdy, Sandeep Panta, Ross Kelly, Javier Romero, Yann Quidé, Murray J. Cairns, Carmel Loughland, Vaughan J. Carr, Stanley V. Catts, Assen Jablensky, Melissa J. Green, Frans Henskens, Dylan Kiltschewskij, Patricia T. Michie, Bryan Mowry, Christos Pantelis, Paul E. Rasser, William R. Reay, Ulrich Schall, Rodney J. Scott, Vince D. Calhoun
Structural neuroimaging studies have identified a combination of shared and disorder-specific patterns of gray matter (GM) deficits across psychiatric disorders. Pooling large data allows for examination of a possible common neuroanatomical basis that may identify a certain vulnerability for mental illness. Large-scale collaborative research is already facilitated by data repositories, institutionally supported databases, and data archives. However, these data-sharing methodologies can suffer from significant barriers. Federated approaches augment these approaches by enabling access or more sophisticated, shareable and scaled-up analyses of large-scale data. We examined GM alterations using Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation, an open-source, decentralized analysis application. Through federated analysis of eight sites, we identified significant overlap in the GM patterns (n = 4,102) of individuals with schizophrenia, major depressive disorder, and autism spectrum disorder. These results show cortical and subcortical regions that may indicate a shared vulnerability to psychiatric disorders.
{"title":"Cortical similarities in psychiatric and mood disorders identified in federated VBM analysis via COINSTAC","authors":"Kelly Rootes-Murdy, Sandeep Panta, Ross Kelly, Javier Romero, Yann Quidé, Murray J. Cairns, Carmel Loughland, Vaughan J. Carr, Stanley V. Catts, Assen Jablensky, Melissa J. Green, Frans Henskens, Dylan Kiltschewskij, Patricia T. Michie, Bryan Mowry, Christos Pantelis, Paul E. Rasser, William R. Reay, Ulrich Schall, Rodney J. Scott, Vince D. Calhoun","doi":"10.1016/j.patter.2024.100987","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100987","url":null,"abstract":"<p>Structural neuroimaging studies have identified a combination of shared and disorder-specific patterns of gray matter (GM) deficits across psychiatric disorders. Pooling large data allows for examination of a possible common neuroanatomical basis that may identify a certain vulnerability for mental illness. Large-scale collaborative research is already facilitated by data repositories, institutionally supported databases, and data archives. However, these data-sharing methodologies can suffer from significant barriers. Federated approaches augment these approaches by enabling access or more sophisticated, shareable and scaled-up analyses of large-scale data. We examined GM alterations using Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation, an open-source, decentralized analysis application. Through federated analysis of eight sites, we identified significant overlap in the GM patterns (<em>n</em> = 4,102) of individuals with schizophrenia, major depressive disorder, and autism spectrum disorder. These results show cortical and subcortical regions that may indicate a shared vulnerability to psychiatric disorders.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140832823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-02 DOI: 10.1016/j.patter.2024.100986
Seyednami Niyakan, Jianting Sheng, Yuliang Cao, Xiang Zhang, Zhan Xu, Ling Wu, Stephen T.C. Wong, Xiaoning Qian
Spatially resolved transcriptomics has revolutionized genome-scale transcriptomic profiling by providing high-resolution characterization of transcriptional patterns. Here, we present our spatial transcriptomics analysis framework, MUSTANG (MUlti-sample Spatial Transcriptomics data ANalysis with cross-sample transcriptional similarity Guidance), which is capable of performing multi-sample spatial transcriptomics spot cellular deconvolution by allowing both cross-sample expression-based similarity information sharing as well as spatial correlation in gene expression patterns within samples. Experiments on a semi-synthetic spatial transcriptomics dataset and three real-world spatial transcriptomics datasets demonstrate the effectiveness of MUSTANG in revealing biological insights inherent in the cellular characterization of tissue samples under study.
{"title":"MUSTANG: Multi-sample spatial transcriptomics data analysis with cross-sample transcriptional similarity guidance","authors":"Seyednami Niyakan, Jianting Sheng, Yuliang Cao, Xiang Zhang, Zhan Xu, Ling Wu, Stephen T.C. Wong, Xiaoning Qian","doi":"10.1016/j.patter.2024.100986","DOIUrl":"https://doi.org/10.1016/j.patter.2024.100986","url":null,"abstract":"<p>Spatially resolved transcriptomics has revolutionized genome-scale transcriptomic profiling by providing high-resolution characterization of transcriptional patterns. Here, we present our spatial transcriptomics analysis framework, MUSTANG (MUlti-sample Spatial Transcriptomics data ANalysis with cross-sample transcriptional similarity Guidance), which is capable of performing multi-sample spatial transcriptomics spot cellular deconvolution by allowing both cross-sample expression-based similarity information sharing as well as spatial correlation in gene expression patterns within samples. Experiments on a semi-synthetic spatial transcriptomics dataset and three real-world spatial transcriptomics datasets demonstrate the effectiveness of MUSTANG in revealing biological insights inherent in the cellular characterization of tissue samples under study.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":null,"pages":null},"PeriodicalIF":6.5,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140832909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}