Manual review is an integral part of any study. As the cost of data generation continues to decrease, the rapid rise in large-scale multi-omic studies calls for a modular, flexible framework to perform what is currently a tedious, error-prone process. We developed AnnoMate, a Python-based package built with Plotly Dash that creates interactive, highly customizable dashboards for reviewing and annotating data. Its object-oriented framework enables easy development and modification of custom dashboards for specific manual review tasks. We utilized this framework to implement “reviewer” dashboards for various tasks often performed in cancer genome sequencing studies.
{"title":"AnnoMate: Exploring and annotating integrated molecular data through custom interactive visualizations","authors":"Claudia Chu, Conor Messer, Samantha Van Seters, Mendy Miller, Kristy Schlueter-Kuck, Gad Getz","doi":"10.1016/j.patter.2024.101060","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101060","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-09-16","publicationTypes":"Journal Article"}
Pub Date: 2024-09-13 DOI: 10.1016/j.patter.2024.101049
Erik-Jan van Kesteren
For over 30 years, synthetic data have been heralded as a solution to make sensitive datasets accessible. However, despite much research effort, their adoption as a tool for research with sensitive data is lacking. This article argues that to make progress in this regard, the data science community should focus on improving the accessibility of existing privacy-friendly synthesis techniques.
{"title":"To democratize research with sensitive data, we should make synthetic data more accessible","authors":"Erik-Jan van Kesteren","doi":"10.1016/j.patter.2024.101049","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101049","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-09-13","publicationTypes":"Journal Article"}
Pub Date: 2024-09-13 DOI: 10.1016/j.patter.2024.101061
Andrew L. Hufton
No Abstract
{"title":"Balancing innovation and integrity in peer review","authors":"Andrew L. Hufton","doi":"10.1016/j.patter.2024.101061","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101061","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-09-13","publicationTypes":"Journal Article"}
Pub Date: 2024-09-13 DOI: 10.1016/j.patter.2024.101040
Mol Mir, Stephanie H. Nowotarski
The “stacking cell puzzle” is a data visualization project consisting of a three-dimensional puzzle made with electron microscopy data of planarian cells.
{"title":"The stacking cell puzzle","authors":"Mol Mir, Stephanie H. Nowotarski","doi":"10.1016/j.patter.2024.101040","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101040","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-09-13","publicationTypes":"Journal Article"}
Pub Date: 2024-09-12 DOI: 10.1016/j.patter.2024.101059
Mingxuan Liu, Yilin Ning, Yuhe Ke, Yuqing Shang, Bibhas Chakraborty, Marcus Eng Hock Ong, Roger Vaughan, Nan Liu
The escalating integration of machine learning in high-stakes fields such as healthcare raises substantial concerns about model fairness. We propose an interpretable framework, fairness-aware interpretable modeling (FAIM), to improve model fairness without compromising performance, featuring an interactive interface to identify a “fairer” model from a set of high-performing models and promoting the integration of data-driven evidence and clinical expertise to enhance contextualized fairness. We demonstrate FAIM’s value in reducing intersectional biases arising from race and sex by predicting hospital admission with two real-world databases, the Medical Information Mart for Intensive Care IV Emergency Department (MIMIC-IV-ED) and the database collected from Singapore General Hospital Emergency Department (SGH-ED). For both datasets, FAIM models not only exhibit satisfactory discriminatory performance but also significantly mitigate biases as measured by well-established fairness metrics, outperforming commonly used bias mitigation methods. Our approach demonstrates the feasibility of improving fairness without sacrificing performance and provides a modeling mode that invites domain experts to engage, fostering a multidisciplinary effort toward tailored AI fairness.
{"title":"FAIM: Fairness-aware interpretable modeling for trustworthy machine learning in healthcare","authors":"Mingxuan Liu, Yilin Ning, Yuhe Ke, Yuqing Shang, Bibhas Chakraborty, Marcus Eng Hock Ong, Roger Vaughan, Nan Liu","doi":"10.1016/j.patter.2024.101059","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101059","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-09-12","publicationTypes":"Journal Article"}
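The FAIM abstract above refers to bias "as measured by well-established fairness metrics" without naming them. As an illustration only, the sketch below computes one widely used metric, the equal-opportunity gap (the spread in true positive rates across groups). The function names, group labels, and toy data are hypothetical and are not taken from the FAIM paper or its software.

```python
from collections import defaultdict

def true_positive_rates(y_true, y_pred, groups):
    """Return the true positive rate (TPR) for each group label."""
    hits = defaultdict(int)       # correctly predicted positives per group
    positives = defaultdict(int)  # actual positives per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                hits[group] += 1
    # Groups with no actual positives are left out (their TPR is undefined).
    return {g: hits[g] / positives[g] for g in positives}

def equal_opportunity_gap(y_true, y_pred, groups):
    """Largest TPR difference between any two groups (0 means parity)."""
    rates = true_positive_rates(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())

# Toy, made-up labels; each group is a hypothetical intersection of two attributes.
y_true = [1, 1, 0, 1, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 1, 1, 1]
groups = ["A-F", "A-F", "A-F", "A-M", "A-M", "B-F", "B-F", "B-F"]
gap = equal_opportunity_gap(y_true, y_pred, groups)
```

Defining groups as intersections (for example race by sex) rather than single attributes is what makes the measure intersectional; a model-selection step could then prefer, among models with similar accuracy, the one with the smallest gap.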
Pub Date: 2024-09-10 DOI: 10.1016/j.patter.2024.101057
Cheng Tang, Yang Zhou, Shuaizhu Zhao, Mingshu Xie, Ruizhe Zhang, Xiaoyan Long, Lingqiang Zhu, Youming Lu, Guangzhi Ma, Hao Li
Accurate analysis of social behaviors in animals is hindered by methodological challenges. Here, we develop a segmentation tracking and clustering system (STCS) to address two major challenges in computational neuroethology: reliable multi-animal tracking and pose estimation under complex interaction conditions and providing interpretable insights into social differences guided by genotype information. We established a comprehensive, long-term, multi-animal-tracking dataset across various experimental settings. Benchmarking STCS against state-of-the-art tracking algorithms, we demonstrated its superior efficacy in analyzing behavioral experiments and establishing a robust tracking baseline. By analyzing the behavior of mice with autism spectrum disorder (ASD) using a novel weakly supervised clustering method under both solitary and social conditions, STCS reveals potential links between social stress and motor impairments. Benefiting from its modular and web-based design, STCS allows researchers to easily integrate the latest computer vision methods, enabling comprehensive behavior analysis services over the Internet, even from a single laptop.
{"title":"Segmentation tracking and clustering system enables accurate multi-animal tracking of social behaviors","authors":"Cheng Tang, Yang Zhou, Shuaizhu Zhao, Mingshu Xie, Ruizhe Zhang, Xiaoyan Long, Lingqiang Zhu, Youming Lu, Guangzhi Ma, Hao Li","doi":"10.1016/j.patter.2024.101057","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101057","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-09-10","publicationTypes":"Journal Article"}
Pub Date: 2024-08-29 DOI: 10.1016/j.patter.2024.101045
Gale M. Lucas, Burcin Becerik-Gerber, Shawn C. Roll
With the exponential rise in the prevalence of automation, trust in such technology has become more critical than ever before. Trust is confidence in a particular entity, especially in regard to the consequences it can have for the trustor, and calibrated trust is the extent to which the judgments of trust are accurate. The focus of this paper is to reevaluate the general understanding of calibrating trust in automation, update this understanding, and apply it to workers’ trust in automation in the workplace. Seminal models of trust in automation were designed for automation that was already common in workforces, where the machine’s “intelligence” (i.e., capacity for decision making, cognition, and/or understanding) was limited. Now, burgeoning automation with more human-like intelligence is intended to be more interactive with workers, serving in roles such as decision aid, assistant, or collaborative coworker. Thus, we revise “calibrated trust in automation” to include more intelligent automated systems.
{"title":"Calibrating workers’ trust in intelligent automated systems","authors":"Gale M. Lucas, Burcin Becerik-Gerber, Shawn C. Roll","doi":"10.1016/j.patter.2024.101045","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101045","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-08-29","publicationTypes":"Journal Article"}
Pub Date: 2024-08-29 DOI: 10.1016/j.patter.2024.101047
Julius Vetter, Jakob H. Macke, Richard Gao
Denoising diffusion probabilistic models (DDPMs) have recently been shown to accurately generate complicated data such as images, audio, or time series. Experimental and clinical neuroscience also stand to benefit from this progress, as the accurate generation of neurophysiological time series can enable or improve many neuroscientific applications. Here, we present a flexible DDPM-based method for modeling multichannel, densely sampled neurophysiological recordings. DDPMs can generate realistic synthetic data for a variety of datasets from different species and recording techniques. The generated data capture important statistics, such as frequency spectra and phase-amplitude coupling, as well as fine-grained features such as sharp wave ripples. Furthermore, data can be generated based on additional information such as experimental conditions. We demonstrate the flexibility of DDPMs in several applications, including brain-state classification and missing-data imputation. In summary, DDPMs can serve as accurate generative models of neurophysiological recordings and have broad utility in the probabilistic generation of synthetic recordings for neuroscientific applications.
{"title":"Generating realistic neurophysiological time series with denoising diffusion probabilistic models","authors":"Julius Vetter, Jakob H. Macke, Richard Gao","doi":"10.1016/j.patter.2024.101047","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101047","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-08-29","publicationTypes":"Journal Article"}
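For readers unfamiliar with DDPMs, the sketch below shows only the standard forward (noising) step, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, applied to a toy one-channel signal. The linear noise schedule, its parameter values, and the sine-wave "recording" are assumptions chosen for illustration; the abstract above does not specify them, and the learned reverse (denoising) network that actually generates data is not shown.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    Assumes num_steps >= 2.
    """
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_signal(x0, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in x0]

rng = random.Random(0)
# Toy single-channel "recording": a 10 Hz sine sampled at 1 kHz for 0.2 s.
x0 = [math.sin(2 * math.pi * 10 * n / 1000) for n in range(200)]
alpha_bars = alpha_bar_schedule(num_steps=1000)
x_early = noise_signal(x0, alpha_bars[10], rng)  # small noise, still close to x0
x_late = noise_signal(x0, alpha_bars[-1], rng)   # signal almost fully replaced by noise
```

Training a DDPM amounts to teaching a network to predict the noise ε from x_t and t; sampling then runs the learned denoising steps from pure noise back to a synthetic signal.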
Pub Date: 2024-08-29 DOI: 10.1016/j.patter.2024.101048
Roberto De Filippo, Dietmar Schmitz
Serotonin (5-HT) is crucial for regulating brain functions such as mood, sleep, and cognition. This study presents a comprehensive transcriptomic analysis of 5-HT receptors (Htrs) across ≈4 million cells in the adult mouse brain using single-cell RNA sequencing (scRNA-seq) data from the Allen Institute. We observed differential transcription patterns of all 14 Htr subtypes, revealing diverse prevalence and distribution across cell classes. Remarkably, we found that 65.84% of cells transcribe RNA of at least one Htr, with frequent co-transcription of multiple Htrs, underscoring the complexity of the 5-HT system even at the single-cell level. Leveraging a multiplexed error-robust fluorescence in situ hybridization (MERFISH) dataset of ≈10 million cells provided by Harvard University, we analyzed the spatial distribution of each Htr, confirming previous findings and uncovering novel transcription patterns. To aid in exploring Htr transcription, we provide an online interactive visualizer.
{"title":"Transcriptomic mapping of the 5-HT receptor landscape","authors":"Roberto De Filippo, Dietmar Schmitz","doi":"10.1016/j.patter.2024.101048","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101048","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-08-29","publicationTypes":"Journal Article"}
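A statistic such as "65.84% of cells transcribe RNA of at least one Htr" can be read as a simple computation over a cells-by-genes count matrix. The sketch below shows that computation on a tiny made-up matrix; the detection threshold (any count above zero), the function names, and the toy counts are assumptions for illustration and do not reproduce the authors' pipeline or data.

```python
def fraction_expressing(counts, threshold=0):
    """Fraction of cells with at least one gene count above threshold."""
    expressing = sum(1 for cell in counts if any(c > threshold for c in cell))
    return expressing / len(counts)

def fraction_co_transcribing(counts, threshold=0):
    """Fraction of cells with two or more genes above threshold."""
    multi = sum(1 for cell in counts if sum(c > threshold for c in cell) >= 2)
    return multi / len(counts)

# Rows are cells, columns are receptor genes; all counts are invented.
genes = ["Htr1a", "Htr2a", "Htr3a"]
toy_counts = [
    [0, 0, 0],  # no receptor transcript detected
    [3, 0, 0],  # one receptor
    [1, 2, 0],  # two receptors (co-transcription)
    [0, 0, 5],  # one receptor
]
any_htr = fraction_expressing(toy_counts)
multi_htr = fraction_co_transcribing(toy_counts)
```

On real scRNA-seq data the choice of threshold matters, because low counts can reflect technical noise rather than genuine transcription.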
Pub Date: 2024-08-28 DOI: 10.1016/j.patter.2024.101046
Michael A. Lones
Mistakes in machine learning practice are commonplace and can result in loss of confidence in the findings and products of machine learning. This tutorial outlines common mistakes that occur when using machine learning and what can be done to avoid them. While it should be accessible to anyone with a basic understanding of machine learning techniques, it focuses on issues that are of particular concern within academic research, such as the need to make rigorous comparisons and reach valid conclusions. It covers five stages of the machine learning process: what to do before model building, how to reliably build models, how to robustly evaluate models, how to compare models fairly, and how to report results.
{"title":"Avoiding common machine learning pitfalls","authors":"Michael A. Lones","doi":"10.1016/j.patter.2024.101046","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101046","journal":{"name":"Patterns"},"PeriodicalIF":6.5,"publicationDate":"2024-08-28","publicationTypes":"Journal Article"}
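The tutorial abstract above lists robust evaluation and fair model comparison among its stages without detailing specific techniques. One common practice in that spirit, shown here as an assumption rather than as the tutorial's own recommendation, is k-fold cross-validation with a fixed seed so that every model is scored on identical folds. The helper names and the `fit_and_score` callback are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and split them into k near-equal test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(fit_and_score, n_samples, k=5, seed=0):
    """Average a score over k folds.

    fit_and_score(train_idx, test_idx) must train on train_idx only and
    return a float score measured on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test in enumerate(folds):
        # Training set is every index that is not in the current test fold.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train, test))
    return sum(scores) / k
```

Passing the same `seed` when evaluating two different models keeps the folds identical, so differences in the averaged score reflect the models rather than the data split.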