
The journal of machine learning for biomedical imaging: Latest Articles

Probabilistic dipole inversion for adaptive quantitative susceptibility mapping
Pub Date : 2020-09-07 DOI: 10.59275/j.melba.2021-bbf2
Jinwei Zhang, Hang Zhang, M. Sabuncu, P. Spincemaille, Thanh D. Nguyen, Yi Wang
A learning-based posterior distribution estimation method, Probabilistic Dipole Inversion (PDI), is proposed to solve the quantitative susceptibility mapping (QSM) inverse problem in MRI with uncertainty estimation. In PDI, a deep convolutional neural network (CNN) is used to represent a multivariate Gaussian distribution as the approximate posterior distribution of susceptibility given the input measured field. This CNN is first trained on healthy subjects via posterior density estimation, where the training dataset contains samples from the true posterior distribution. Domain adaptations are then deployed on patient datasets with new pathologies not included in pre-training, where PDI updates the pre-trained CNN’s weights in an unsupervised fashion by minimizing the Kullback-Leibler divergence between the approximate posterior distribution represented by the CNN and the true posterior distribution derived from the likelihood distribution of a known physical model and a pre-defined prior distribution. Based on our experiments, PDI provides additional uncertainty estimation compared to the conventional MAP approach, while addressing the potential issue of the pre-trained CNN when test data deviate from the training data. Our code is available at https://github.com/Jinwei1209/Bayesian_QSM.
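The posterior density estimation step described in the abstract can be illustrated with a Gaussian negative log-likelihood. This is a minimal sketch, not the authors' implementation: it assumes a diagonal covariance (the abstract only says "multivariate Gaussian"), and the names `gaussian_nll`, `chi`, `mean` and `log_var` are hypothetical. In a real pipeline `mean` and `log_var` would be the two outputs of the CNN for a given measured field.

```python
import numpy as np

def gaussian_nll(chi, mean, log_var):
    """Negative log-density of chi under N(mean, diag(exp(log_var))).

    Assumption: diagonal covariance, one variance per voxel.
    """
    chi, mean, log_var = (np.asarray(a, dtype=float) for a in (chi, mean, log_var))
    sq_err = (chi - mean) ** 2
    # Per-voxel term: 0.5 * [log(2*pi) + log(var) + (chi - mean)^2 / var]
    per_voxel = 0.5 * (np.log(2.0 * np.pi) + log_var + sq_err * np.exp(-log_var))
    return float(per_voxel.sum())

# Toy susceptibility sample and two candidate predictions.
chi = np.array([0.10, -0.20, 0.05])
exact = gaussian_nll(chi, chi, np.zeros(3))          # perfect mean
shifted = gaussian_nll(chi, chi + 0.5, np.zeros(3))  # biased mean
print(exact < shifted)
```

Minimizing this quantity over training pairs pushes the predicted mean toward the reference susceptibility while letting the predicted variance grow where the error is large, which is what makes the variance usable as an uncertainty map.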
Citations: 5
Learning Interpretable Microscopic Features of Tumor by Multi-task Adversarial CNNs To Improve Generalization
Pub Date : 2020-08-04 DOI: 10.59275/j.melba.2023-3462
Mara Graziani, Sebastian Otálora, S. Marchand-Maillet, H. Muller, V. Andrearczyk
Adopting Convolutional Neural Networks (CNNs) in the daily routine of primary diagnosis requires not only near-perfect precision, but also a sufficient degree of generalization to data acquisition shifts and transparency. Existing CNN models act as black boxes, giving physicians no assurance that important diagnostic features are used by the model. Building on successful existing techniques such as multi-task learning, domain adversarial training and concept-based interpretability, this paper addresses the challenge of introducing diagnostic factors in the training objectives. Here we show that our architecture, by learning end-to-end an uncertainty-based weighting combination of multi-task and adversarial losses, is encouraged to focus on pathology features such as density and pleomorphism of nuclei, e.g. variations in size and appearance, while discarding misleading features such as staining differences. Our results on breast lymph node tissue show significantly improved generalization in the detection of tumorous tissue, with best average AUC 0.89 (0.01) against the baseline AUC 0.86 (0.005). By applying the interpretability technique of linearly probing intermediate representations, we also demonstrate that interpretable pathology features such as nuclei density are learned by the proposed CNN architecture, confirming the increased transparency of this model. This result is a starting point towards building interpretable multi-task architectures that are robust to data heterogeneity. Our code is available at https://github.com/maragraziani/multitask_adversarial
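The "uncertainty-based weighting combination" of losses mentioned in the abstract can be sketched as follows. The exact form used in the paper is not given here, so this assumes the commonly used log-variance weighting, in which each task loss is scaled by a learned precision and a penalty term keeps the weights from collapsing to zero. The function name `uncertainty_weighted_loss` and the argument `log_vars` are hypothetical, and the gradient reversal normally used for the adversarial branch is not shown.

```python
import numpy as np

def uncertainty_weighted_loss(losses, log_vars):
    """Combine task losses as sum_i exp(-s_i) * L_i + s_i.

    Assumption: s_i = log(sigma_i^2) is a trainable scalar per task.
    """
    losses = np.asarray(losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    # exp(-s_i) down-weights noisy tasks; + s_i penalises inflating the variance.
    return float(np.sum(np.exp(-log_vars) * losses + log_vars))

# Toy values: a main classification loss and one auxiliary concept loss.
equal = uncertainty_weighted_loss([1.0, 2.0], [0.0, 0.0])
reweighted = uncertainty_weighted_loss([1.0, 2.0], [0.0, np.log(2.0)])
print(equal, reweighted)
```

With all log-variances at zero the result reduces to a plain sum of the losses, so the learned weights only depart from equal weighting when the data support it.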
Citations: 0
Joint Frequency and Image Space Learning for MRI Reconstruction and Analysis
Pub Date : 2020-07-02 DOI: 10.59275/j.melba.2022-16cc
Nalini Singh, J. E. Iglesias, E. Adalsteinsson, Adrian V. Dalca, P. Golland
We propose neural network layers that explicitly combine frequency and image feature representations and show that they can be used as a versatile building block for reconstruction from frequency space data. Our work is motivated by the challenges arising in MRI acquisition where the signal is a corrupted Fourier transform of the desired image. The proposed joint learning schemes enable both correction of artifacts native to the frequency space and manipulation of image space representations to reconstruct coherent image structures at every layer of the network. This is in contrast to most current deep learning approaches for image reconstruction that treat frequency and image space features separately and often operate exclusively in one of the two spaces. We demonstrate the advantages of joint convolutional learning for a variety of tasks, including motion correction, denoising, reconstruction from undersampled acquisitions, and combined undersampling and motion correction on simulated and real-world multicoil MRI data. The joint models produce consistently high-quality output images across all tasks and datasets. When integrated into a state-of-the-art unrolled optimization network with physics-inspired data consistency constraints for undersampled reconstruction, the proposed architectures significantly improve the optimization landscape, which yields an order of magnitude reduction of training time. This result suggests that joint representations are particularly well suited for MRI signals in deep learning networks. Our code and pretrained models are publicly available at https://github.com/nalinimsingh/interlacer.
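The idea of combining frequency and image features at every layer can be sketched with a plain mixing step. This is an illustrative simplification under stated assumptions, not the layer from the paper: the learned convolutions and nonlinearities are omitted, and `joint_mix` and its `mix` coefficient are hypothetical names. Each branch receives a blend of its own features and the other branch's features transformed into its domain.

```python
import numpy as np

def joint_mix(img_feat, freq_feat, mix=0.5):
    """Blend each domain's features with the other domain's transformed features.

    Assumption: a fixed scalar `mix` stands in for learned mixing weights.
    """
    img_from_freq = np.fft.ifft2(freq_feat)   # frequency branch seen in image space
    freq_from_img = np.fft.fft2(img_feat)     # image branch seen in frequency space
    new_img = mix * img_feat + (1.0 - mix) * img_from_freq
    new_freq = mix * freq_feat + (1.0 - mix) * freq_from_img
    return new_img, new_freq

# When both branches describe the same image, mixing leaves them unchanged.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kspace = np.fft.fft2(image)
new_img, new_freq = joint_mix(image, kspace)
print(np.allclose(new_img, image), np.allclose(new_freq, kspace))
```

In a trained network each branch would apply its own convolution after this blend, so artifacts that are local in one domain can be corrected there while the other branch keeps the image structure coherent.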
Citations: 8
The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up
Pub Date : 2020-02-09 DOI: 10.59275/j.melba.2021-2dcc
Razvan V. Marinescu, N. Oxtoby, A. Young, E. Bron, A. Toga, M. Weiner, F. Barkhof, Nick C Fox, A. Eshaghi, Tina Toni, Marcin Salaterski, V. Lunina, M. Ansart, S. Durrleman, Pascal Lu, S. Iddi, Dan Li, W. Thompson, M. Donohue, A. Nahon, Yarden Levy, Dan Halbersberg, M. Cohen, Huiling Liao, Tengfei Li, Kaixian Yu, Hongtu Zhu, Jose Gerardo Tamez-Peña, A. Ismail, Timothy Wood, H. C. Bravo, Minh Nguyen, Nanbo Sun, Jiashi Feng, B. Yeo, Gan Chen, Kexin Qi, Shi-Yu Chen, D. Qiu, I. Buciuman, A. Kelner, R. Pop, Denisa Rimocea, M. Ghazi, M. Nielsen, S. Ourselin, Lauge Sørensen, Vikram Venkatraghavan, Keli Liu, C. Rabe, P. Manser, S. Hill, J. Howlett, Zhiyue Huang, S. Kiddle, S. Mukherjee, Anaïs Rouanet, B. Taschler, B. Tom, S. White, N. Faux, S. Sedai, Javier de Velasco Oriol, Edgar E. V. Clemente, K. Estrada, Leon M. Aksman, A. Altmann, C. Stonnington, Yalin Wang, Jianfeng Wu, Vivek Devadas, Clémentine Fourrier, L. L. Rakêt, Aristeidis Sotiras, G. Erus, J. Doshi, C. Davatzikos, J. Vogel, Andrew Doyle, Angela Tam, A
Accurate prediction of progression in subjects at risk of Alzheimer's disease is crucial for enrolling the right subjects in clinical trials. However, a prospective comparison of state-of-the-art algorithms for predicting disease onset and progression is currently lacking. We present the findings of "The Alzheimer's Disease Prediction Of Longitudinal Evolution" (TADPOLE) Challenge, which compared the performance of 92 algorithms from 33 international teams at predicting the future trajectory of 219 individuals at risk of Alzheimer's disease. Challenge participants were required to make a prediction, for each month of a 5-year future time period, of three key outcomes: clinical diagnosis, Alzheimer's Disease Assessment Scale Cognitive Subdomain (ADAS-Cog13), and total volume of the ventricles. The methods used by challenge participants included multivariate linear regression, machine learning methods such as support vector machines and deep neural networks, as well as disease progression models. No single submission was best at predicting all three outcomes. For clinical diagnosis and ventricle volume prediction, the best algorithms strongly outperform simple baselines in predictive ability. However, for ADAS-Cog13 no single submitted prediction method was significantly better than random guesswork. Two ensemble methods based on taking the mean and median over all predictions obtained top scores on almost all tasks. Better-than-average performance at diagnosis prediction was generally associated with the additional inclusion of features from cerebrospinal fluid (CSF) samples and diffusion tensor imaging (DTI). On the other hand, better performance at ventricle volume prediction was associated with inclusion of summary statistics, such as the slope or maxima/minima of patient-specific biomarkers. On a limited, cross-sectional subset of the data emulating clinical trials, performance of the best algorithms at predicting clinical diagnosis decreased only slightly (2 percentage points) compared to the full longitudinal dataset. The submission system remains open via the website https://tadpole.grand-challenge.org, while TADPOLE SHARE (https://tadpole-share.github.io/) collates code for submissions. TADPOLE's unique results suggest that current prediction algorithms provide sufficient accuracy to exploit biomarkers related to clinical diagnosis and ventricle volume, for cohort refinement in clinical trials for Alzheimer's disease. However, results call into question the usage of cognitive test scores for patient selection and as a primary endpoint in clinical trials.
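The two ensemble methods mentioned in the abstract, the mean and the median over all submitted predictions, can be sketched in a few lines. The array layout (one row per submission, one column per forecast target) and the name `ensemble_forecasts` are assumptions made for illustration; the numbers are toy values, not challenge data.

```python
import numpy as np

def ensemble_forecasts(predictions):
    """Return (mean, median) over submissions for each forecast target.

    Assumption: predictions has shape (n_submissions, n_targets).
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.mean(axis=0), np.median(predictions, axis=0)

# Three hypothetical submissions forecasting two targets.
toy_predictions = [[1.0, 10.0],
                   [2.0, 20.0],
                   [9.0, 30.0]]
mean_forecast, median_forecast = ensemble_forecasts(toy_predictions)
print(mean_forecast, median_forecast)
```

The first target shows why the two ensembles can differ: the outlying submission pulls the mean upward while the median stays with the majority.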
Citations: 40
Distributionally Robust Deep Learning using Hardness Weighted Sampling
Pub Date : 2020-01-08 DOI: 10.59275/j.melba.2022-8b6a
Lucas Fidon, S. Ourselin, T. Vercauteren
Limiting failures of machine learning systems is of paramount importance for safety-critical applications. In order to improve the robustness of machine learning systems, Distributionally Robust Optimization (DRO) has been proposed as a generalization of Empirical Risk Minimization (ERM). However, its use in deep learning has been severely restricted due to the relative inefficiency of the optimizers available for DRO in comparison to the widespread variants of Stochastic Gradient Descent (SGD) optimizers for ERM. We propose SGD with hardness weighted sampling, a principled and efficient optimization method for DRO in machine learning that is particularly suited in the context of deep learning. Similar to a hard example mining strategy in practice, the proposed algorithm is straightforward to implement and computationally as efficient as SGD-based optimizers used for deep learning, requiring minimal overhead computation. In contrast to typical ad hoc hard mining approaches, we prove the convergence of our DRO algorithm for over-parameterized deep learning networks with ReLU activation and a finite number of layers and parameters. Our experiments on fetal brain 3D MRI segmentation and brain tumor segmentation in MRI demonstrate the feasibility and the usefulness of our approach. Using our hardness weighted sampling for training a state-of-the-art deep learning pipeline leads to improved robustness to anatomical variabilities in automatic fetal brain 3D MRI segmentation and to improved robustness to the image protocol variations in brain tumor segmentation, including a decrease of 2% of the interquartile range of the Dice scores for the enhanced tumor and the tumor core regions. Our code is available at https://github.com/LucasFidon/HardnessWeightedSampler
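The sampling idea in the abstract can be sketched as drawing training examples with probability that grows with their most recent loss. This is a minimal sketch under assumptions, not the released sampler: it takes the sampling distribution to be a softmax over per-example losses scaled by a robustness parameter, and the names `hardness_weighted_probs`, `sample_batch` and `beta` are hypothetical.

```python
import numpy as np

def hardness_weighted_probs(losses, beta):
    """Softmax of beta * loss: harder examples get higher sampling probability."""
    z = beta * np.asarray(losses, dtype=float)
    z = z - z.max()          # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def sample_batch(losses, beta, batch_size, rng):
    """Draw example indices (with replacement) from the hardness distribution."""
    p = hardness_weighted_probs(losses, beta)
    return rng.choice(len(p), size=batch_size, replace=True, p=p)

# Toy per-example losses recorded the last time each example was seen.
stale_losses = [0.1, 0.5, 2.0]
probs = hardness_weighted_probs(stale_losses, beta=1.0)
batch = sample_batch(stale_losses, beta=1.0, batch_size=4, rng=np.random.default_rng(0))
print(probs, batch)
```

Setting `beta` to zero recovers uniform sampling, that is, ordinary SGD for ERM, while larger values concentrate training on the hardest examples. After each step the stored loss of every sampled example would be refreshed with its newly computed value.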
Citations: 9