Jackson P Lautier, Stella Grosser, Jessica Kim, Hyewon Kim, Junghi Kim
Journal of Biopharmaceutical Statistics, pages 1-19. Published 2024-06-18. DOI: https://doi.org/10.1080/10543406.2024.2365389
Clustering plasma concentration-time curves: applications of unsupervised learning in pharmacogenomics.
Pharmaceutical researchers are continually searching for techniques to improve both drug development processes and patient outcomes. An area of recent interest is the potential for machine learning (ML) applications within pharmacology. One such application not yet given close study is the unsupervised clustering of plasma concentration-time curves, hereafter pharmacokinetic (PK) curves. In this paper, we present our findings on how to cluster PK curves by their similarity. Specifically, we find clustering to be effective at identifying similarly shaped PK curves and informative for understanding patterns within each cluster of PK curves. Because PK curves are time series data objects, our approach uses the extensive body of research on the clustering of time series data as a starting point. As such, we examine many dissimilarity measures between time series data objects to find those most suitable for PK curves. We identify Euclidean distance as generally most appropriate for clustering PK curves, and we further show that dynamic time warping, Fréchet, and structure-based measures of dissimilarity like correlation may produce unexpected results. As an illustration, we apply these methods in a case study with 250 PK curves used in a previous pharmacogenomic study. Our case study finds that an unsupervised ML clustering with Euclidean distance, without any subject genetic information, is able to independently validate the same conclusions as the reference pharmacogenomic results. To our knowledge, this is the first such demonstration. Further, the case study demonstrates how the clustering of PK curves may generate insights that could be difficult to perceive solely with population-level summary statistics of PK metrics.
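The contrast the abstract draws between Euclidean distance and structure-based measures such as correlation can be illustrated with a small sketch. This is not the authors' code or data: the one-compartment curve model, the parameter values, the scale factors, and the choice of average-linkage hierarchical clustering are all assumptions made for illustration. The curves below share one shape and differ only in exposure level, so correlation dissimilarity treats them as nearly identical, while Euclidean distance separates the low-exposure and high-exposure groups.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def pk_curve(t, dose, ka, ke):
    """Hypothetical one-compartment oral-absorption curve (unit volume of distribution)."""
    return dose * ka / (ka - ke) * (np.exp(-ke * t) - np.exp(-ka * t))


# Hourly sampling over 24 hours (an assumed sampling schedule).
t = np.linspace(0.0, 24.0, 25)
base = pk_curve(t, dose=1.0, ka=1.0, ke=0.2)

# Synthetic subjects: same curve shape, two assumed exposure levels.
low_scales = [0.90, 0.95, 1.00, 1.05, 1.10]
high_scales = [2.70, 2.85, 3.00, 3.15, 3.30]
curves = np.array([s * base for s in low_scales + high_scales])

# Pairwise dissimilarities in SciPy's condensed form.
euclid = pdist(curves, metric="euclidean")
corr = pdist(curves, metric="correlation")  # 1 - Pearson correlation

# Average-linkage hierarchical clustering on Euclidean distance, cut into 2 clusters.
labels = fcluster(linkage(euclid, method="average"), t=2, criterion="maxclust")

print("Euclidean cluster labels:", labels)
print("Largest correlation dissimilarity:", np.abs(corr).max())
```

Because every synthetic curve is a scaled copy of the same shape, the correlation dissimilarities are all zero up to floating-point error, so a correlation-based clustering has no information with which to separate the two exposure groups. Real PK curves also differ in shape and timing, so this sketch shows only one way a structure-based measure can behave unexpectedly.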
About the journal:
The Journal of Biopharmaceutical Statistics, a rapid publication journal, discusses quality applications of statistics in biopharmaceutical research and development. Now publishing six times per year, it includes expositions of statistical methodology with immediate applicability to biopharmaceutical research in the form of full-length and short manuscripts, review articles, selected/invited conference papers, short articles, and letters to the editor. Addressing timely and provocative topics important to the biostatistical profession, the journal covers:
Drug, device, and biological research and development;
Drug screening and drug design;
Assessment of pharmacological activity;
Pharmaceutical formulation and scale-up;
Preclinical safety assessment;
Bioavailability, bioequivalence, and pharmacokinetics;
Phase I, II, and III clinical development including complex innovative designs;
Premarket approval assessment of clinical safety;
Postmarketing surveillance;
Big data and artificial intelligence and applications.