Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention: Latest Articles
Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43996-4_36
Yubo Fan, Jianing Wang, Yiyuan Zhao, Rui Li, Han Liu, Robert F Labadie, Jack H Noble, Benoit M Dawant
Cochlear implants (CIs) are neuroprosthetics that can provide a sense of sound to people with severe-to-profound hearing loss. A CI contains an electrode array (EA) that is threaded into the cochlea during surgery. Recent studies have shown that hearing outcomes are correlated with EA placement. Image-guided cochlear implant programming builds on this correlation: it uses the EA location relative to the intracochlear anatomy to help audiologists adjust the CI settings to improve hearing. Automated methods to localize EAs in postoperative CT images are of great interest for large-scale studies and for translation into the clinical workflow. In this work, we propose a unified deep-learning-based framework for automated EA localization. It consists of a multi-task network and a series of postprocessing algorithms to localize various types of EAs. Evaluation on a dataset of 27 cadaveric samples shows that its localization error is slightly smaller than that of the state-of-the-art method. A second evaluation on a large-scale clinical dataset of 561 cases from two institutions demonstrates a significant improvement in robustness over the state-of-the-art method. This suggests that the technique could be integrated into the clinical workflow and provide audiologists with information that facilitates programming of the implant, leading to improved patient care.
Title: A Unified Deep-Learning-Based Framework for Cochlear Implant Electrode Array Localization. Volume 14228, pages 376-385. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10976972/pdf/
Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43996-4_24
Mohammad M R Khan, Yubo Fan, Benoit M Dawant, Jack H Noble
In cochlear implant (CI) procedures, an electrode array is surgically inserted into the cochlea. The electrodes are used to stimulate the auditory nerve and restore hearing sensation for the recipient. If the array folds inside the cochlea during insertion, it can lead to trauma, damage to residual hearing, and poor hearing restoration. Intraoperative detection of such a case allows the surgeon to perform reimplantation. However, intraoperative detection requires experience, and electrophysiological tests sometimes fail to detect array folding. Because array folding has a low incidence, we generated a dataset of CT images with folded synthetic electrode arrays and realistic metal artifacts. The dataset was used to train a custom multi-task 3D-UNet model for array fold detection. We tested the trained model on real post-operative CTs (7 with folded arrays and 200 without). Our model correctly classified all the fold-over cases while misclassifying only 3 non-fold-over cases. Therefore, the model is a promising option for array fold detection.
Title: Cochlear Implant Fold Detection in Intra-operative CT Using Weakly Supervised Multi-task Deep Learning. Volume 14228, pages 249-259. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10953791/pdf/
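The test-set counts reported in the fold-detection abstract above (7 folded arrays, all detected; 200 non-folded arrays, 3 misclassified) imply a set of summary metrics. The following is a worked arithmetic sketch in Python; the function and variable names are illustrative and not from the paper, and the derived ratios are not figures the authors report.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, precision and accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Counts taken from the abstract: 7 folded (all found), 200 non-folded (3 flagged).
metrics = confusion_metrics(tp=7, fn=0, fp=3, tn=197)
```

Here sensitivity is 7/7 = 1.0, specificity is 197/200 = 0.985, and precision is 7/10 = 0.7.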
Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43999-5_32
Matthew Ragoza, Kayhan Batmanghelich
Magnetic resonance elastography (MRE) is a medical imaging modality that non-invasively quantifies tissue stiffness (elasticity) and is commonly used for diagnosing liver fibrosis. Constructing an elasticity map of tissue requires solving an inverse problem involving a partial differential equation (PDE). Current numerical techniques to solve the inverse problem are noise-sensitive and require explicit specification of physical relationships. In this work, we apply physics-informed neural networks to solve the inverse problem of tissue elasticity reconstruction. Our method does not rely on numerical differentiation and can be extended to learn relevant correlations from anatomical images while respecting physical constraints. We evaluate our approach on simulated data and in vivo data from a cohort of patients with non-alcoholic fatty liver disease (NAFLD). Compared to numerical baselines, our method is more robust to noise and more accurate on realistic data, and its performance is further enhanced by incorporating anatomical information.
Title: Physics-Informed Neural Networks for Tissue Elasticity Reconstruction in Magnetic Resonance Elastography. Volume 14229, pages 333-343. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141115/pdf/
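For orientation, the inverse problem described in the MRE abstract above can be written down in a simplified form. The equations below are a sketch under stated assumptions (time-harmonic motion at angular frequency ω, constant density ρ, and a locally homogeneous, incompressible, linearly elastic medium), not necessarily the exact PDE or loss used in the paper. The symbols u, μ, ρ, ω, λ and the network subscripts θ and φ are introduced here for illustration.

```latex
% Helmholtz equation relating the measured displacement field u to the shear modulus \mu
\mu \nabla^2 u + \rho \omega^2 u = 0
\quad\Longrightarrow\quad
\mu = -\frac{\rho \omega^2 u}{\nabla^2 u}

% A generic physics-informed loss: fit the measured wave field and penalize the PDE residual
\mathcal{L}(\theta, \phi) =
  \frac{1}{N}\sum_{i=1}^{N} \bigl| u_\theta(x_i) - u_i \bigr|^2
  + \lambda \, \frac{1}{M}\sum_{j=1}^{M}
    \bigl| \mu_\phi(x_j)\, \nabla^2 u_\theta(x_j) + \rho \omega^2 u_\theta(x_j) \bigr|^2
```

The direct inversion on the first line divides by a Laplacian that numerical baselines typically estimate by finite differences on noisy data, which is one way to see why they are noise-sensitive. In a physics-informed formulation, derivatives of the network output are usually obtained by automatic differentiation instead.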
Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43895-0_49
Myeongkyun Kang, Philip Chikontwe, Soopil Kim, Kyong Hwan Jin, Ehsan Adeli, Kilian M Pohl, Sang Hyun Park
One-shot federated learning (FL) has emerged as a promising solution in scenarios where multiple communication rounds are not practical. Notably, as feature distributions in medical data are less discriminative than those of natural images, robust global model training with FL is non-trivial and can lead to overfitting. To address this issue, we propose a novel one-shot FL framework leveraging Image Synthesis and Client model Adaptation (FedISCA) with knowledge distillation (KD). To prevent overfitting, we generate diverse synthetic images ranging from random noise to realistic images. This approach (i) alleviates data privacy concerns and (ii) facilitates robust global model training using KD with decentralized client models. To mitigate domain disparity in the early stages of synthesis, we design noise-adapted client models, whose batch normalization statistics are updated on random noise (synthetic images), to enhance KD. Lastly, the global model is trained with both the original and noise-adapted client models via KD and synthetic images. This process is repeated until the global model converges. Extensive evaluation of this design on five small-scale and three large-scale medical image classification datasets reveals superior accuracy over prior methods. Code is available at https://github.com/myeongkyunkang/FedISCA.
Title: One-shot Federated Learning on Medical Data using Knowledge Distillation with Image Synthesis and Client Model Adaptation. Volume 14221, pages 521-531. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10781197/pdf/
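The distillation step in the FedISCA abstract above can be illustrated with a generic ensemble knowledge-distillation loss. This is a minimal NumPy sketch, assuming the teacher signal is the average of the client models' softened predictions and the loss is a KL divergence. The function names, the temperature parameter, and the averaging rule are illustrative assumptions, not the FedISCA implementation (the linked repository is the authoritative source for that).

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softmax over the last axis with temperature, computed stably."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_kd_loss(client_logits, global_logits, temperature=1.0):
    """KL(teacher || student), averaged over the batch.

    client_logits: array of shape (num_clients, batch, classes)
    global_logits: array of shape (batch, classes)
    """
    # Teacher: mean of the clients' softened class probabilities (assumed rule).
    teacher = softmax(client_logits, temperature).mean(axis=0)
    student = softmax(global_logits, temperature)
    eps = 1e-12  # avoids log(0)
    kl = np.sum(teacher * (np.log(teacher + eps) - np.log(student + eps)), axis=-1)
    return float(kl.mean())
```

The loss is zero when the global model reproduces the averaged client distribution and positive otherwise, so minimizing it on synthetic images pulls the global model toward the client ensemble.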
Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43999-5_67
Tianyi Zeng, Jiazhen Zhang, Eléonore V Lieffrig, Zhuotong Cai, Fuyao Chen, Chenyu You, Mika Naganawa, Yihuan Lu, John A Onofrey
Head motion correction is an essential component of brain PET imaging, in which even small-magnitude motion can greatly degrade image quality and introduce artifacts. Building upon previous work, we propose a new head motion correction framework that takes fast reconstructions as input. The main characteristics of the proposed method are: (i) the adoption of a high-resolution short-frame fast reconstruction workflow; (ii) the development of a novel encoder for PET data representation extraction; and (iii) the implementation of data augmentation techniques. Ablation studies are conducted to assess the individual contribution of each of these design choices. Furthermore, multi-subject studies are conducted on an 18F-FPEB dataset, and the method's performance is evaluated qualitatively and quantitatively through a MOLAR reconstruction study and a corresponding brain region-of-interest (ROI) standardized uptake value (SUV) evaluation. We also compare our method with a conventional intensity-based registration method. Our results demonstrate that the proposed method outperforms other methods on all subjects and can accurately estimate motion for subjects outside the training set. All code is publicly available on GitHub: https://github.com/OnofreyLab/dl-hmc_fast_recon_miccai2023.
Title: Fast Reconstruction for Deep Learning PET Head Motion Correction. Volume 14229, pages 710-719. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10758999/pdf/
Vision Transformer (ViT) models have demonstrated a breakthrough in a wide range of computer vision tasks. However, compared to Convolutional Neural Network (CNN) models, ViT models have been observed to struggle to capture high-frequency components of images, which can limit their ability to detect local textures and edge information. As abnormalities in human tissue, such as tumors and lesions, may vary greatly in structure, texture, and shape, high-frequency information such as texture is crucial for effective semantic segmentation. To address this limitation in ViT models, we propose a new technique, Laplacian-Former, that enhances the self-attention map by adaptively re-calibrating the frequency information in a Laplacian pyramid. More specifically, our method uses a dual attention mechanism that combines efficient attention and frequency attention: efficient attention reduces the complexity of self-attention to linear while producing the same output, and frequency attention selectively intensifies the contribution of shape and texture features. Furthermore, we introduce a novel efficient enhancement multi-scale bridge that effectively transfers spatial information from the encoder to the decoder while preserving the fundamental features. We demonstrate the efficacy of Laplacian-Former on multi-organ and skin lesion segmentation tasks, improving Dice scores by 1.87% and 0.76%, respectively, over SOTA approaches. Our implementation is publicly available on GitHub.
Title: Laplacian-Former: Overcoming the Limitations of Vision Transformers in Local Texture Detection. Authors: Reza Azad, Amirhossein Kazerouni, Babak Azad, Ehsan Khodapanah Aghdam, Yury Velichko, Ulas Bagci, Dorit Merhof. Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43898-1_70. Volume 14222, pages 736-746. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10830169/pdf/
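As background for the Laplacian pyramid mentioned in the Laplacian-Former abstract above, the sketch below splits a 2D image into frequency bands and reconstructs the image from them. It is a generic NumPy illustration, not the Laplacian-Former module: 2x2 average pooling and nearest-neighbour upsampling stand in for the Gaussian filtering of a classical pyramid, and all function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """2x2 average pooling; assumes even height and width."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Nearest-neighbour upsampling by a factor of 2 along both axes."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    """Return `levels` high-frequency bands followed by one low-frequency residual."""
    bands = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        bands.append(current - upsample(low))  # detail lost by downsampling
        current = low
    bands.append(current)  # coarsest, low-frequency image
    return bands

def reconstruct(bands):
    """Invert laplacian_pyramid by adding the bands back from coarse to fine."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = upsample(current) + band
    return current
```

Each band isolates detail at one scale, so a model can reweight fine texture and edges separately from coarse structure; the decomposition itself loses no information, since `reconstruct` recovers the input.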
Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43993-3_18
Deeksha M Shama, Jiasen Jing, Archana Venkataraman
We propose a robust deep learning framework to simultaneously detect and localize seizure activity from multichannel scalp EEG. Our model, called DeepSOZ, consists of a transformer encoder that generates global and channel-wise encodings. The global branch is combined with an LSTM for temporal seizure detection. In parallel, we employ attention-weighted multi-instance pooling of channel-wise encodings to predict the seizure onset zone. DeepSOZ is trained in a supervised fashion and generates high-resolution predictions at the level of individual seconds (temporal) and individual EEG channels (spatial). We validate DeepSOZ via bootstrapped nested cross-validation on a large dataset of 120 patients curated from the Temple University Hospital corpus. Compared to baseline approaches, DeepSOZ provides robust overall performance in our multi-task learning setup. We also evaluate the intra-seizure and intra-patient consistency of DeepSOZ as a first step toward establishing its trustworthiness for integration into the clinical workflow for epilepsy.
Title: DeepSOZ: A Robust Deep Model for Joint Temporal and Spatial Seizure Onset Localization from Multichannel EEG Data. Volume 2023, pages 184-194. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11545985/pdf/
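The attention-weighted pooling of channel-wise encodings mentioned in the DeepSOZ abstract above can be sketched generically as follows. This is a minimal NumPy illustration of attention-based multi-instance pooling in general, assuming a single linear scoring vector; the function name, the scoring rule, and the shapes are hypothetical and not taken from DeepSOZ.

```python
import numpy as np

def attention_pool(channel_encodings, w):
    """Pool per-channel encodings into one vector using softmax attention.

    channel_encodings: array of shape (channels, features)
    w: scoring vector of shape (features,), a stand-in for learned parameters
    Returns (pooled vector of shape (features,), weights of shape (channels,)).
    """
    scores = channel_encodings @ w
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    pooled = weights @ channel_encodings
    return pooled, weights
```

The weights are non-negative and sum to one, so they can be inspected per EEG channel as a soft ranking. How DeepSOZ maps such weights to an onset-zone prediction is not specified in the abstract.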
Brain tissue microarchitecture is characterized by heterogeneous degrees of diffusivity and rates of transverse relaxation. Unlike standard diffusion MRI with a single echo time (TE), which provides information primarily on diffusivity, relaxation-diffusion MRI involves multiple TEs and multiple diffusion-weighting strengths for probing tissue-specific coupling between relaxation and diffusivity. Here, we introduce a relaxation-diffusion model that characterizes tissue apparent relaxation coefficients for a spectrum of diffusion length scales and at the same time factors out the effects of intra-voxel orientation heterogeneity. We examined the model with an in vivo dataset, acquired using a clinical scanner, involving different health conditions. Experimental results indicate that our model caters to heterogeneous tissue microstructure and can distinguish fiber bundles with similar diffusivities but different relaxation rates. Code with sample data is available at https://github.com/dryewu/RDSI.
Title: Relaxation-Diffusion Spectrum Imaging for Probing Tissue Microarchitecture. Authors: Ye Wu, Xiaoming Liu, Xinyuan Zhang, Khoi Minh Huynh, Sahar Ahmad, Pew-Thian Yap. Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43993-3_15. Volume 14227, pages 152-162. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11340880/pdf/
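For reference, the simplest single-compartment signal model that couples the two contrasts discussed in the relaxation-diffusion abstract above is shown below. It is a textbook mono-exponential form, given under the assumption of one compartment with a single apparent diffusivity D and transverse relaxation rate R_2. The model proposed in the paper is more general (a spectrum over diffusion length scales), and the symbols here are introduced for illustration only.

```latex
S(b, \mathrm{TE}) = S_0 \, e^{-\mathrm{TE}\, R_2} \, e^{-b D},
\qquad R_2 = \frac{1}{T_2}
```

With a single TE, the factor e^{-TE R_2} is a constant that cannot be separated from S_0, which is why acquisitions at multiple TEs are needed to probe relaxation alongside diffusivity.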
Pub Date: 2023-10-01; DOI: 10.1007/978-3-031-43895-0_59
Shantanu Ghosh, Ke Yu, Kayhan Batmanghelich
Building generalizable AI models is one of the primary challenges in the healthcare domain. While radiologists rely on generalizable descriptive rules of abnormality, Neural Network (NN) models degrade even under a slight shift in input distribution (e.g., scanner type). Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain. In this paper, we develop an interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost. We assume the interpretable component of the NN to be approximately domain-invariant. However, interpretable models typically underperform compared to their Blackbox (BB) variants. We start with a BB in the source domain and distill it into a mixture of shallow interpretable models using human-understandable concepts. As each interpretable model covers a subset of the data, the mixture of interpretable models achieves performance comparable to that of the BB. Further, we use the pseudo-labeling technique from semi-supervised learning (SSL) to learn the concept classifier in the target domain, followed by fine-tuning the interpretable models in the target domain. We evaluate our model using a real-life large-scale chest X-ray (CXR) classification dataset. The code is available at: https://github.com/batmanlab/MICCAI-2023-Route-interpret-repeat-CXRs.
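The pseudo-labeling step mentioned in the abstract above can be sketched with a common confidence-thresholding variant. This is a minimal NumPy illustration, assuming that only predictions whose top class probability reaches a threshold are kept as pseudo-labels; the function name, the 0.9 threshold, and the sample probabilities are hypothetical and not taken from the paper or its repository.

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    """Return (indices, labels) of samples whose top class probability >= threshold.

    probs: array of shape (samples, classes) with rows summing to 1
    """
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]

# Hypothetical predictions on four unlabeled target-domain samples.
probs = np.array([
    [0.95, 0.05],
    [0.60, 0.40],
    [0.08, 0.92],
    [0.50, 0.50],
])
indices, labels = select_pseudo_labels(probs, threshold=0.9)
```

Only the first and third samples pass the threshold here, so they alone would be added, with labels 0 and 1, to the training set for the target-domain classifier.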
Title: Distilling BlackBox to Interpretable Models for Efficient Transfer Learning. Published in: Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, volume 14221, pages 628-638, 2023-10-01. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141113/pdf/
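The pseudo-labeling step mentioned in the abstract above can be illustrated with a minimal sketch. This is a generic confidence-thresholded variant from semi-supervised learning, not the authors' implementation: the function name `select_pseudo_labels`, the 0.9 threshold, and the toy probabilities are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed cutoff; the abstract does not give one
) -> List[Tuple[int, int]]:
    """Return (sample_index, pseudo_label) pairs for confident predictions.

    `probs[i]` holds the class probabilities that a source-trained
    classifier assigns to the i-th unlabeled target-domain sample.
    """
    selected = []
    for i, p in enumerate(probs):
        # The pseudo-label is the most probable class.
        label = max(range(len(p)), key=lambda k: p[k])
        # Keep the sample only if the classifier is confident enough.
        if p[label] >= threshold:
            selected.append((i, label))
    return selected


# Toy predictions on three unlabeled target-domain samples (hypothetical).
target_probs = [
    [0.97, 0.03],  # confident: class 0
    [0.55, 0.45],  # ambiguous: dropped
    [0.08, 0.92],  # confident: class 1
]
pseudo = select_pseudo_labels(target_probs, threshold=0.9)
```

The retained pairs would then serve as training targets for the target-domain classifier; how the threshold is chosen, and whether selection is repeated over several rounds, is left open here.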
Pub Date : 2023-10-01 DOI: 10.1007/978-3-031-43987-2_12
Yimu Pan, Tongan Cai, Manas Mehta, Alison D Gernand, Jeffery A Goldstein, Leena Mithal, Delia Mwinyelle, Kelly Gallagher, James Z Wang
The placenta is a valuable organ that can aid in understanding adverse events during pregnancy and predicting issues post-birth. Manual pathological examination and report generation, however, are laborious and resource-intensive. Limitations in diagnostic accuracy and model efficiency have impeded previous attempts to automate placenta analysis. This study presents a novel framework for the automatic analysis of placenta images that aims to improve accuracy and efficiency. Building on previous vision-language contrastive learning (VLC) methods, we propose two enhancements, namely Pathology Report Feature Recomposition and Distributional Feature Recomposition, which increase representation robustness and mitigate feature suppression. In addition, we employ efficient neural networks as image encoders to achieve model compression and inference acceleration. Experiments validate that the proposed approach outperforms prior work in both performance and efficiency by significant margins. The benefits of our method, including enhanced efficacy and deployability, may have significant implications for reproductive healthcare, particularly in rural areas or low- and middle-income countries.
Title: Enhancing Automatic Placenta Analysis through Distributional Feature Recomposition in Vision-Language Contrastive Learning. Published in: Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, volume 14225, pages 116-126, 2023-10-01. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11192145/pdf/
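For readers unfamiliar with vision-language contrastive learning, the baseline objective that such methods build on can be sketched as a symmetric InfoNCE loss over paired image and text embeddings. This is a generic, dependency-free illustration and does not implement the two recomposition enhancements named in the abstract; the function names, the temperature of 0.07, and the toy embeddings are assumptions.

```python
import math
from typing import List, Sequence


def _normalize(v: Sequence[float]) -> List[float]:
    """Scale a (non-zero) vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _cross_entropy_diag(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(
    image_emb: Sequence[Sequence[float]],
    text_emb: Sequence[Sequence[float]],
    temperature: float = 0.07,  # assumed value, common in CLIP-style setups
) -> float:
    """Symmetric InfoNCE loss; pair i is (image_emb[i], text_emb[i])."""
    img = [_normalize(v) for v in image_emb]
    txt = [_normalize(v) for v in text_emb]
    # Cosine similarity of every image with every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(i_vec, t_vec)) / temperature for t_vec in txt]
        for i_vec in img
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_t))


# Toy check: correctly paired embeddings should give a lower loss
# than the same embeddings with the text order swapped.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In practice the embeddings would come from trained image and text encoders and the loss would be computed per mini-batch with an autodiff framework; plain lists are used here only to keep the sketch self-contained.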