First, Do No Harm: Addressing AI's Challenges With Out-of-Distribution Data in Medicine

Clinical and Translational Science · IF 2.8 · JCR Q2 (Medicine, Research & Experimental) · Published: 2025-01-16 · DOI: 10.1111/cts.70132
Chu Weng, Wesley Lin, Sherry Dong, Qi Liu, Hanrui Zhang

Abstract

The advent of AI has brought transformative changes across many fields, particularly the biomedical field, where AI is now being used to facilitate drug discovery and development, enhance diagnostic and prognostic accuracy, and support clinical decision-making. For example, since 2021, there has been a notable increase in AI-related submissions to the US Food and Drug Administration (FDA) Center for Drug Evaluation and Research (CDER), reflecting the rapid expansion of AI applications in drug development [1]. In addition, the rapid growth in AI health applications is reflected by the exponential increase in the number of such studies found on PubMed [2]. However, the translation of AI models from development to real-world deployment remains challenging. This is due to various factors, including data drift, where the characteristics of data in the deployment phase differ from those used in model training. Consequently, ensuring the performance of medical AI models in the deployment phase has become a critical area of focus, as AI models that excel in controlled environments may still struggle with real-world variability, leading to poor predictions for patients whose characteristics differ significantly from the training set. Such cases, often referred to as out-of-distribution (OOD) samples, present a major challenge for AI-driven decision-making, such as making a diagnosis or selecting a treatment for a patient. Failure to recognize these OOD samples can result in suboptimal or even harmful decisions.

To address this, we propose a prescreening procedure for medical AI model deployment (especially when the AI model risk is high), aimed at avoiding or flagging the predictions by AI models on OOD samples (Figure 1a). This procedure, we believe, can be beneficial for ensuring the trustworthiness of AI in medicine.

OOD scenarios are a common challenge in medical AI applications. For instance, a model trained predominantly on data from a specific demographic group may underperform when applied to patients from different demographic groups, resulting in inaccurate predictions. OOD cases can also arise when AI models encounter data that differ from the training data due to factors like variations in medical practices and treatment landscapes of the clinical trials. These issues can potentially lead to harm to patients (e.g., misdiagnosis, inappropriate treatment recommendations), and a loss of trust in AI systems.

The importance of detecting OOD samples to define the scope of use for AI models has been highlighted in multiple research and clinical studies. A well-known example is the Medical Out-of-Distribution-Analysis (MOOD) Challenge [3], which benchmarked OOD detection algorithms across several supervised and unsupervised models, including autoencoder neural networks, U-Net, vector-quantized variational autoencoders, principal component analysis (PCA), and linear Gaussian process regression. These algorithms were used to identify brain magnetic resonance imaging (MRI) and abdominal computed tomography (CT) scan images that deviated from the training data, thereby reducing the risk of overconfident predictions from machine learning models. Similarly, methods such as the Gram matrix algorithm and linear/outlier synthesis have been employed to detect OOD samples in skin lesion images [4].

Beyond medical imaging, OOD detection has also been recommended for other healthcare data types, such as electronic health records (EHRs), to enhance model reliability [5]. In addition to diagnostic applications, OOD detection can enrich clinical trial cohorts by identifying patients with canonical symptoms. For example, Hopkins et al. used anomaly scores to determine whether patients with bipolar depression should be included in clinical trials for non-racemic amisulpride (SEP-4199). The patients identified as anomalies exhibited distinct responses to the treatment compared to canonical patients [6].

To demonstrate how OOD detection techniques can be integrated into existing medical AI pipelines, we extend a previously published antimalarial prediction model by incorporating a machine-learning-based OOD detector (Figure 1b). After adding OOD detection, the system exhibits more robust performance when evaluated on transcriptomes from a previously unseen geographic region.

We originally trained a tree-based gradient boosting model, LightGBM, using transcriptomes from Plasmodium falciparum isolates obtained from patients, to predict resistance to artemisinin, an antimalarial drug [7]. Briefly, the training data consisted of transcriptomes from isolates in Southeast Asia, alongside the clearance rates of these isolates [8]. Isolates with slow clearance rates were labeled as resistant, while others were classified as non-resistant [7].

To enhance the model, we incorporated an OOD detection approach that discriminates between in-distribution (ID) and OOD samples. This was done based on the distance between each sample's representation in the latent space of a deep neural network, trained using contrastive learning, and its k-th nearest neighbor in the training set [9]. A sample was classified as OOD if this distance exceeded a threshold set so that 5% of the training observations would themselves be flagged as OOD. This approach has demonstrated strong performance in other domains, such as image OOD detection and time series modeling [9, 10].

To simulate applying a pretrained model in a new setting, we tested artemisinin resistance prediction in a geographically distinct region. We trained our model on 786 transcriptomes from Southeast Asian countries in the Mok et al. dataset, excluding Myanmar. We then validated the pretrained model on samples from Myanmar, using the deep nearest neighbor approach [9] to identify which samples were OOD relative to the training data. We first evaluated the pretrained model on the entire validation set, then removed the OOD samples and reassessed performance to examine the impact.

During OOD detection, 5 of the 82 samples from Myanmar were identified as OOD. We evaluated the model's predictive performance using AUROC, achieving 0.5973 [0.5508, 0.6605] AUROC on the full Myanmar dataset. After removing the OOD samples, performance improved to 0.6934 [0.6356, 0.7593] AUROC. In contrast, for the five OOD samples, the model's performance was significantly lower, at 0.3310 [0.2153, 0.4421] AUROC—well below random chance (AUROC = 0.5). These results demonstrate that OOD detection effectively identifies samples for which the pretrained model's predictions are unreliable, and that removing those samples improves performance in the validation setting.
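The before/after evaluation described above amounts to computing AUROC on the full validation set and again with flagged samples dropped. A hedged sketch on synthetic data (not the Myanmar transcriptomes; labels, scores, and the OOD mask are invented for illustration):

```python
# Sketch of the evaluation step: AUROC on the full validation set versus
# AUROC after removing samples flagged as OOD. Data are synthetic.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=82)           # 82 validation labels (toy)
scores = 0.3 * y_true + rng.uniform(size=82)   # imperfect model scores (toy)
ood_mask = np.zeros(82, dtype=bool)
ood_mask[:5] = True                            # pretend 5 samples were flagged OOD

auc_full = roc_auc_score(y_true, scores)
auc_id = roc_auc_score(y_true[~ood_mask], scores[~ood_mask])
print(round(auc_full, 3), round(auc_id, 3))
```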

AI models need a clearly defined scope of use in terms of input data. This is especially true when moving the AI model from the development stage into a real-world setting. For example, many AI models are developed using data from clinical trials. In general, clinical trial data are generated from highly controlled environments, with carefully selected, relatively homogeneous patient populations. These trials have predetermined durations and predefined endpoints. In contrast, real-world data (RWD) are collected under routine clinical conditions, often in an uncontrolled setting, from patients with diverse demographics, varying disease presentations, and different treatment approaches. The heterogeneity in RWD can present challenges, as AI models may struggle to generalize to broader, more diverse populations. Unlike clinical trials, RWD might not have predefined endpoints for data collection, and it may introduce more variability and potential biases. These discrepancies between clinical trial data and real-world data can lead to suboptimal performance when AI models are applied beyond the scope of their original training.

Given the high dimensionality of AI model input data, defining the scope of use can become complex. OOD detection can play a valuable role in addressing this challenge. By incorporating OOD detection as a pre-screening step, clinicians can better evaluate whether AI models are suitable for a specific patient, enhancing the safety and effectiveness of AI in medicine. This approach is crucial for reducing the risk of incorrect diagnoses or treatments and ensuring AI is used responsibly in healthcare settings.

However, OOD detection can be challenging, and current methods still have limitations. In most datasets, there are no explicit labels indicating whether a sample is OOD. As a result, OOD models are often trained to detect when input data follow a distribution different from the training data, introducing an element of randomness into the detection process. Combining multiple OOD detection models with different random seeds, or using models that rely on different mechanisms—such as K-nearest neighbors, isolation forest, Bayesian neural networks, variational autoencoders (VAEs), or contrastive learning—can improve the specificity of OOD detection. More research is needed to further improve OOD detection methods.
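One way to realize the combination strategy above is to flag a sample only when detectors based on different mechanisms agree, trading some sensitivity for specificity. A minimal sketch combining a k-NN distance detector with an isolation forest (thresholds, data, and the agreement rule are all illustrative choices, not prescribed by the text):

```python
# Sketch of an OOD detector ensemble: flag a sample only when both a k-NN
# distance detector and an IsolationForest call it OOD. Data are synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))
X_new = np.vstack([rng.normal(size=(20, 8)),            # in-distribution
                   rng.normal(loc=6.0, size=(5, 8))])   # last 5 strongly shifted

# Detector 1: k-NN distance with a 95th-percentile training threshold.
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X_train)
thr = np.quantile(nn.kneighbors(X_train)[0][:, k], 0.95)
flag_knn = nn.kneighbors(X_new, n_neighbors=k)[0][:, k - 1] > thr

# Detector 2: IsolationForest; predict() returns -1 for outliers.
iso = IsolationForest(contamination=0.05, random_state=0).fit(X_train)
flag_iso = iso.predict(X_new) == -1

flag_both = flag_knn & flag_iso  # conservative rule: require agreement
print(flag_both)
```

Requiring agreement suppresses each detector's idiosyncratic false positives while the clearly shifted samples are still flagged by both.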

It is important to note that the OOD approach could incentivize greater inclusivity and diversity in clinical trials and training data. Generally speaking, the more inclusive and diverse the clinical trials and training data are, the broader the scope of use will be for the AI model, as fewer patients will be classified as OOD during the model deployment phase. In addition, the OOD approach can help identify gaps in data inclusivity and offer valuable insights for improving future data collection and enhancing the diversity in clinical trial datasets.

Looking ahead, the integration of OOD detection into medical AI systems can be an important step toward responsible AI deployment. By explicitly addressing the limitations of our training data and our model capabilities, we can build more trustworthy AI systems that align with the rigorous standards of medical practice and the fundamental principle of “first, do no harm.”

The authors declare no conflicts of interest.

This article reflects the views of the authors and should not be construed to represent FDA's views or policies. As an Associate Editor for Clinical and Translational Science, Qi Liu was not involved in the review or decision process for this paper.
