Pub Date: 2023-06-01. Epub Date: 2023-06-08. DOI: 10.1007/978-3-031-34048-2_24
Mohammad Farazi, Zhangsihao Yang, Wenhui Zhu, Peijie Qiu, Yalin Wang
Convolutional neural networks (CNNs) have been broadly studied on images, videos, graphs, and triangular meshes, but have seldom been studied on tetrahedral meshes. Given the merits of using volumetric meshes in applications like brain image analysis, we introduce a novel interpretable graph CNN framework for the tetrahedral mesh structure. Inspired by ChebyNet, our model exploits the volumetric Laplace-Beltrami operator (LBO) to define filters, in place of the commonly used graph Laplacian, which lacks the Riemannian metric information of 3D manifolds. For pooling adaptation, we introduce new objective functions for localized minimum cuts in the Graclus algorithm based on the LBO. We employ a piecewise-constant approximation scheme that uses the clustering assignment matrix to estimate the LBO on sampled meshes after each pooling. Finally, adapting the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm to tetrahedral meshes, we use the obtained heatmaps to visualize discovered regions of interest as biomarkers. We demonstrate the effectiveness of our model on cortical tetrahedral meshes from patients with Alzheimer's disease, as there is scientific evidence that cortical thickness correlates with neurodegenerative disease progression. Our results show the superiority of our LBO-based convolution layer and adapted pooling over the conventionally used unitary cortical thickness, graph Laplacian, and point cloud representations.
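The ChebyNet-style convolution the abstract builds on expands a spectral filter in Chebyshev polynomials of the rescaled Laplacian, so filtering needs only sparse matrix-vector products and stays K-hop localized. A minimal numpy sketch, using a toy graph Laplacian in place of the volumetric LBO (the function name and the toy graph are illustrative):

```python
import numpy as np

def chebyshev_conv(L, X, theta):
    """ChebyNet-style filtering: sum_k theta[k] * T_k(L_hat) @ X, where L_hat
    is the Laplacian with eigenvalues rescaled into [-1, 1] and T_k are
    Chebyshev polynomials (T_0 = I, T_1 = L_hat, T_k = 2 L_hat T_{k-1} - T_{k-2})."""
    n = L.shape[0]
    lam_max = np.max(np.abs(np.linalg.eigvalsh(L)))   # exact here; estimated in practice
    L_hat = (2.0 / lam_max) * L - np.eye(n)
    Tk_prev = X
    if len(theta) == 1:
        return theta[0] * Tk_prev
    Tk = L_hat @ X
    out = theta[0] * Tk_prev + theta[1] * Tk
    for k in range(2, len(theta)):
        Tk_prev, Tk = Tk, 2.0 * L_hat @ Tk - Tk_prev  # three-term recurrence
        out = out + theta[k] * Tk
    return out

# Tiny 4-node path-graph Laplacian standing in for a tetrahedral-mesh LBO.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
L = np.diag(A.sum(1)) - A
X = np.eye(4)                  # one-hot feature per vertex
Y = chebyshev_conv(L, X, theta=np.array([0.5, 0.3, 0.2]))
```

With `theta = [1, 0]` the filter reduces to the identity, which makes the recurrence easy to sanity-check.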
{"title":"TetCNN: Convolutional Neural Networks on Tetrahedral Meshes.","authors":"Mohammad Farazi, Zhangsihao Yang, Wenhui Zhu, Peijie Qiu, Yalin Wang","doi":"10.1007/978-3-031-34048-2_24","DOIUrl":"10.1007/978-3-031-34048-2_24","url":null,"abstract":"<p><p>Convolutional neural networks (CNN) have been broadly studied on images, videos, graphs, and triangular meshes. However, it has seldom been studied on tetrahedral meshes. Given the merits of using volumetric meshes in applications like brain image analysis, we introduce a novel interpretable graph CNN framework for the tetrahedral mesh structure. Inspired by ChebyNet, our model exploits the volumetric Laplace-Beltrami Operator (LBO) to define filters over commonly used graph Laplacian which lacks the Riemannian metric information of 3D manifolds. For pooling adaptation, we introduce new objective functions for localized minimum cuts in the Graclus algorithm based on the LBO. We employ a piece-wise constant approximation scheme that uses the clustering assignment matrix to estimate the LBO on sampled meshes after each pooling. Finally, adapting the Gradient-weighted Class Activation Mapping algorithm for tetrahedral meshes, we use the obtained heatmaps to visualize discovered regions-of-interest as biomarkers. We demonstrate the effectiveness of our model on cortical tetrahedral meshes from patients with Alzheimer's disease, as there is scientific evidence showing the correlation of cortical thickness to neurodegenerative disease progression. Our results show the superiority of our LBO-based convolution layer and adapted pooling over the conventionally used unitary cortical thickness, graph Laplacian, and point cloud representation.</p>","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... 
conference","volume":"13939 ","pages":"303-315"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10765307/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139099273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Atlases are crucial to imaging statistics as they enable the standardization of inter-subject and inter-population analyses. While existing atlas estimation methods based on fluid/elastic/diffusion registration yield high-quality results for the human brain, these deformation models do not extend to a variety of other challenging areas of neuroscience such as the anatomy of C. elegans worms and fruit flies. To this end, this work presents a general probabilistic deep network-based framework for atlas estimation and registration that flexibly incorporates various deformation models and levels of keypoint supervision, and that can be applied to a wide class of model organisms. Of particular relevance, it also develops a deformable piecewise rigid atlas model which is regularized to preserve inter-observation distances between neighbors. These modeling considerations are shown to improve atlas construction and keypoint alignment across a diversity of datasets with small sample sizes, including neuron positions in C. elegans hermaphrodites, fluorescence microscopy of male C. elegans, and images of fruit fly wings. Code is accessible at https://github.com/amin-nejat/Deformable-Atlas.
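The distance-preservation regularizer described above can be written as a penalty on how much a warp changes distances between neighboring points; under a single rigid motion it is exactly zero. A hedged numpy sketch (the pair list and function names are illustrative, not the paper's API):

```python
import numpy as np

def rigid_apply(points, R, t):
    """Apply one rigid piece (rotation R, translation t) to an (N, 3) array."""
    return points @ R.T + t

def neighbor_distance_penalty(before, after, pairs):
    """Mean squared change in inter-point distance over neighbor pairs,
    mirroring the regularizer that keeps the piecewise-rigid atlas coherent."""
    d0 = np.linalg.norm(before[pairs[:, 0]] - before[pairs[:, 1]], axis=1)
    d1 = np.linalg.norm(after[pairs[:, 0]] - after[pairs[:, 1]], axis=1)
    return float(np.mean((d0 - d1) ** 2))

rng = np.random.default_rng(0)
pts = rng.normal(size=(10, 3))                    # toy neuron positions
pairs = np.array([(i, i + 1) for i in range(9)])  # chain of neighbors
theta = 0.7                                       # rotate about z, then shift
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
warped = rigid_apply(pts, R, np.array([1.0, -2.0, 0.5]))
penalty = neighbor_distance_penalty(pts, warped, pairs)
```

A non-rigid warp (e.g. per-point noise) would make the penalty strictly positive, which is what discourages pieces from drifting apart.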
{"title":"Learning Probabilistic Piecewise Rigid Atlases of Model Organisms via Generative Deep Networks.","authors":"Amin Nejatbakhsh, Neel Dey, Vivek Venkatachalam, Eviatar Yemini, Liam Paninski, Erdem Varol","doi":"10.1007/978-3-031-34048-2_26","DOIUrl":"10.1007/978-3-031-34048-2_26","url":null,"abstract":"<p><p>Atlases are crucial to imaging statistics as they enable the standardization of inter-subject and inter-population analyses. While existing atlas estimation methods based on fluid/elastic/diffusion registration yield high-quality results for the human brain, these deformation models do not extend to a variety of other challenging areas of neuroscience such as the anatomy of <i>C. elegans</i> worms and fruit flies. To this end, this work presents a general probabilistic deep network-based framework for atlas estimation and registration which can flexibly incorporate various deformation models and levels of keypoint supervision that can be applied to a wide class of model organisms. Of particular relevance, it also develops a deformable piecewise rigid atlas model which is regularized to preserve inter-observation distances between neighbors. These modeling considerations are shown to improve atlas construction and key-point alignment across a diversity of datasets with small sample sizes including neuron positions in <i>C. elegans</i> hermaphrodites, fluorescence microscopy of male <i>C. elegans</i>, and images of fruit fly wings. Code is accessible at https://github.com/amin-nejat/Deformable-Atlas.</p>","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... 
conference","volume":"13939 ","pages":"332-343"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10358289/pdf/nihms-1910173.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10239564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Surface reconstruction of cortical and subcortical structures is crucial for brain morphological studies. Existing deep learning surface reconstruction methods, such as DeepCSR and Vox2Surf, learn an implicit field function for computing the isosurface but do not consider mesh topology. In this paper, we propose a novel and efficient deep learning mesh deformation network, called MeshDeform, to reconstruct topologically correct surfaces of subcortical structures from brain MR images. MeshDeform combines features extracted from a U-Net encoder with mesh deformation blocks to predict surfaces of subcortical structures by deforming spherical mesh templates. MeshDeform reconstructs the surfaces of a left-right pair of subcortical structures with subvoxel accuracy in less than 10 seconds, and all 17 subcortical structures in less than one and a half minutes; by contrast, Vox2Surf takes about 20-30 minutes for all subcortical structures. Visual and quantitative evaluation on the Human Connectome Project (HCP) dataset demonstrates that MeshDeform generates accurate subcortical surfaces in limited time while preserving mesh topology.
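Because a deformation block only moves template vertices and never edits connectivity, the sphere's topology carries over to the reconstructed surface by construction. A toy numpy sketch of that invariant, using an octahedron as a stand-in spherical template (the mesh and names are illustrative):

```python
import numpy as np

def deform(verts, offsets):
    """MeshDeform-style block output: template vertices plus predicted
    per-vertex displacements; the face array is never touched."""
    return verts + offsets

def euler_characteristic(faces):
    """V - E + F; equals 2 for any genus-0 (sphere-like) closed triangle mesh."""
    V = len(np.unique(faces))
    E = len({tuple(sorted((int(f[i]), int(f[(i + 1) % 3]))))
             for f in faces for i in range(3)})
    return V - E + len(faces)

# Octahedron as a minimal spherical template.
verts = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                  [0, -1, 0], [0, 0, 1], [0, 0, -1]], float)
faces = np.array([[0, 2, 4], [2, 1, 4], [1, 3, 4], [3, 0, 4],
                  [2, 0, 5], [1, 2, 5], [3, 1, 5], [0, 3, 5]])
new_verts = deform(verts, 0.1 * np.ones_like(verts))
```

Since `faces` is shared between template and output, the Euler characteristic (and hence topological correctness) is identical before and after deformation.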
{"title":"MeshDeform: Surface Reconstruction of Subcortical Structures in Human Brain MRI.","authors":"Junjie Zhao, Siyuan Liu, Sahar Ahmad, Yap Pew-Thian","doi":"10.1007/978-3-031-34048-2_41","DOIUrl":"10.1007/978-3-031-34048-2_41","url":null,"abstract":"<p><p>Surface reconstruction of cortical and subcortical structures is crucial for brain morphological studies. Existing deep learning surface reconstruction methods, such as DeepCSR and Vox2Surf, learn an implicit field function for computing the isosurface, but do not consider mesh topology. In this paper, we propose a novel and efficient deep learning mesh deformation network, called MeshDeform, to reconstruct topologically correct surfaces of subcortical structures using brain MR images. MeshDeform combines features extracted from a U-Net encoder with mesh deformation blocks to predict surfaces of subcortical structures by deforming spherical mesh templates. MeshDeform is able to reconstruct in less than 10 seconds the surfaces of a left-right pair of subcortical structures with subvoxel accuracy. Reconstruction of all 17 subcortical structures takes less than one and a half minutes. By contrast, Vox2Surf takes about 20-30 minutes for all subcortical structures. Visual and quantitative evaluation on the Human Connectome Project (HCP) dataset demonstrate that MeshDeform generates accurate subcortical surfaces in limited time while preserving mesh topology.</p>","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... 
conference","volume":"13939 ","pages":"536-547"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10617631/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71429784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-01. Epub Date: 2023-06-08. DOI: 10.1007/978-3-031-34048-2_49
Chenyu You, Weicheng Dai, Yifei Min, Lawrence Staib, James S Duncan
Contrastive learning has shown great promise for annotation-scarcity problems in medical image segmentation. Existing approaches typically assume a balanced class distribution for both labeled and unlabeled medical images. However, medical image data in reality is commonly imbalanced (i.e., multi-class label imbalance), which naturally yields blurry contours and often mislabels rare objects. Moreover, it remains unclear whether all negative samples are equally negative. In this work, we present ACTION, an Anatomical-aware ConTrastive dIstillatiON framework, for semi-supervised medical image segmentation. Specifically, we first develop an iterative contrastive distillation algorithm by softly labeling the negatives rather than using binary supervision between positive and negative pairs. We also capture more semantically similar features from the randomly chosen negative set, compared to the positives, to enforce the diversity of the sampled data. Second, we raise a more important question: can we really handle imbalanced samples to yield better performance? Hence, the key innovation in ACTION is to learn global semantic relationships across the entire dataset and local anatomical features among neighbouring pixels with minimal additional memory footprint. During training, we introduce anatomical contrast by actively sampling a sparse set of hard negative pixels, which can generate smoother segmentation boundaries and more accurate predictions. Extensive experiments across two benchmark datasets and different unlabeled settings show that ACTION significantly outperforms the current state-of-the-art semi-supervised methods.
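The "softly labeling the negatives" step can be read as distillation: the student's similarity distribution over the negative set is pulled toward the teacher's, instead of treating every negative as equally negative. A small numpy sketch under that reading (temperature, shapes, and names are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def soft_negative_distill(student_emb, teacher_emb, negatives, tau=0.1):
    """Cross-entropy between the teacher's and student's similarity
    distributions over the negative set: soft labels instead of a binary
    positive/negative split."""
    s = softmax(student_emb @ negatives.T / tau)
    t = softmax(teacher_emb @ negatives.T / tau)
    return float(-(t * np.log(s + 1e-12)).sum(axis=-1).mean())

rng = np.random.default_rng(0)
neg = rng.normal(size=(32, 16))                      # sampled negative features
student = rng.normal(size=(4, 16))                   # 4 anchor embeddings
teacher = student + 0.05 * rng.normal(size=(4, 16))  # slowly-updated teacher
loss = soft_negative_distill(student, teacher, neg)
```

Cross-entropy is minimized exactly when the student matches the teacher's distribution, so the loss with `student_emb = teacher_emb` lower-bounds the loss above.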
{"title":"Bootstrapping Semi-supervised Medical Image Segmentation with Anatomical-Aware Contrastive Distillation.","authors":"Chenyu You, Weicheng Dai, Yifei Min, Lawrence Staib, James S Duncan","doi":"10.1007/978-3-031-34048-2_49","DOIUrl":"10.1007/978-3-031-34048-2_49","url":null,"abstract":"<p><p>Contrastive learning has shown great promise over annotation scarcity problems in the context of medical image segmentation. Existing approaches typically assume a balanced class distribution for both labeled and unlabeled medical images. However, medical image data in reality is commonly imbalanced (<i>i.e</i>., multi-class label imbalance), which naturally yields blurry contours and usually incorrectly labels rare objects. Moreover, it remains unclear whether all negative samples are equally negative. In this work, we present <b>ACTION</b>, an <b>A</b>natomical-aware <b>C</b>on<b>T</b>rastive d<b>I</b>stillati<b>ON</b> framework, for semi-supervised medical image segmentation. Specifically, we first develop an iterative contrastive distillation algorithm by softly labeling the negatives rather than binary supervision between positive and negative pairs. We also capture more semantically similar features from the randomly chosen negative set compared to the positives to enforce the diversity of the sampled data. Second, we raise a more important question: Can we really handle imbalanced samples to yield better performance? Hence, the <i>key</i> <i>innovation</i> in ACTION is to learn global semantic relationship across the entire dataset and local anatomical features among the neighbouring pixels with minimal additional memory footprint. During the training, we introduce anatomical contrast by actively sampling a sparse set of hard negative pixels, which can generate smoother segmentation boundaries and more accurate predictions. 
Extensive experiments across two benchmark datasets and different unlabeled settings show that ACTION significantly outperforms the current state-of-the-art semi-supervised methods.</p>","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... conference","volume":"13939 ","pages":"641-653"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10322187/pdf/nihms-1913000.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9794875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-01. DOI: 10.1007/978-3-031-34048-2_32
Wenhui Zhu, Peijie Qiu, Oana M Dumitrascu, Jacob M Sobczak, Mohammad Farazi, Zhangsihao Yang, Keshav Nandakumar, Yalin Wang
Non-mydriatic retinal color fundus photography (CFP) is widely available because it does not require pupillary dilation, but it is prone to poor quality due to operator error, systemic imperfections, or patient-related causes. Optimal retinal image quality is mandated for accurate medical diagnoses and automated analyses. Herein, we leverage Optimal Transport (OT) theory to propose an unpaired image-to-image translation scheme for mapping low-quality retinal CFPs to high-quality counterparts. Furthermore, to improve the flexibility, robustness, and applicability of our image enhancement pipeline in clinical practice, we generalize a state-of-the-art model-based image reconstruction method, regularization by denoising, by plugging in priors learned by our OT-guided image-to-image translation network; we name this regularization by enhancing (RE). We validate the integrated framework, OTRE, on three publicly available retinal image datasets by assessing the quality after enhancement and the performance on various downstream tasks, including diabetic retinopathy grading, vessel segmentation, and diabetic lesion segmentation. The experimental results demonstrate the superiority of our proposed framework over several state-of-the-art unsupervised competitors and a state-of-the-art supervised method.
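The "regularization by enhancing" idea plugs a learned enhancer into a RED-style objective in place of a denoiser; each update then mixes a data-fidelity gradient with the residual x - E(x). A minimal 1-D numpy sketch with a moving-average filter standing in for the OT-guided network (step size, lambda, and the toy signal are all illustrative):

```python
import numpy as np

def red_step(x, y, enhancer, step=0.1, lam=0.5):
    """One gradient step on 0.5*||x - y||^2 plus the RED prior, with the
    denoiser replaced by a learned enhancer E: grad = (x - y) + lam*(x - E(x))."""
    return x - step * ((x - y) + lam * (x - enhancer(x)))

def smooth(x):
    """Toy 'enhancer': a 3-tap moving average standing in for the trained
    OT-guided image-to-image translation network."""
    return np.convolve(x, np.ones(3) / 3.0, mode="same")

y = np.array([0.0, 1.0, 0.0, 1.0, 0.0])   # degraded observation (toy signal)
x = y.copy()
for _ in range(200):                       # fixed-point iteration
    x = red_step(x, y, smooth)
```

At convergence the residual (x - y) + lam*(x - E(x)) vanishes, i.e., the reconstruction balances fidelity to the observation against agreement with the enhancer's output.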
{"title":"OTRE: Where Optimal Transport Guided Unpaired Image-to-Image Translation Meets Regularization by Enhancing.","authors":"Wenhui Zhu, Peijie Qiu, Oana M Dumitrascu, Jacob M Sobczak, Mohammad Farazi, Zhangsihao Yang, Keshav Nandakumar, Yalin Wang","doi":"10.1007/978-3-031-34048-2_32","DOIUrl":"https://doi.org/10.1007/978-3-031-34048-2_32","url":null,"abstract":"<p><p>Non-mydriatic retinal color fundus photography (CFP) is widely available due to the advantage of not requiring pupillary dilation, however, is prone to poor quality due to operators, systemic imperfections, or patient-related causes. Optimal retinal image quality is mandated for accurate medical diagnoses and automated analyses. Herein, we leveraged the <i>Optimal Transport (OT)</i> theory to propose an unpaired image-to-image translation scheme for mapping low-quality retinal CFPs to high-quality counterparts. Furthermore, to improve the flexibility, robustness, and applicability of our image enhancement pipeline in the clinical practice, we generalized a state-of-the-art model-based image reconstruction method, regularization by denoising, by plugging in priors learned by our OT-guided image-to-image translation network. We named it as <i>regularization by enhancing (RE)</i>. We validated the integrated framework, OTRE, on three publicly available retinal image datasets by assessing the quality after enhancement and their performance on various downstream tasks, including diabetic retinopathy grading, vessel segmentation, and diabetic lesion segmentation. The experimental results demonstrated the superiority of our proposed framework over some state-of-the-art unsupervised competitors and a state-of-the-art supervised method.</p>","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... 
conference","volume":"13939 ","pages":"415-427"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329768/pdf/nihms-1880857.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9811952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-01. Epub Date: 2023-06-08. DOI: 10.1007/978-3-031-34048-2_57
Lei Zhou, Huidong Liu, Joseph Bae, Junjun He, Dimitris Samaras, Prateek Prasanna
Can we use sparse tokens for dense prediction, e.g., segmentation? Although token sparsification has been applied to Vision Transformers (ViT) to accelerate classification, it is still unknown how to perform segmentation from sparse tokens. To this end, we reformulate segmentation as a sparse encoding → token completion → dense decoding (SCD) pipeline. We first empirically show that naïvely applying existing approaches from classification token pruning and masked image modeling (MIM) leads to failure and inefficient training, caused by inappropriate sampling algorithms and the low quality of the restored dense features. In this paper, we propose Soft-topK Token Pruning (STP) and Multi-layer Token Assembly (MTA) to address these problems. In sparse encoding, STP predicts token importance scores with a lightweight sub-network and samples the topK tokens. The intractable topK gradients are approximated through a continuous perturbed score distribution. In token completion, MTA restores a full token sequence by assembling both sparse output tokens and pruned multi-layer intermediate ones. The last dense decoding stage is compatible with existing segmentation decoders, e.g., UNETR. Experiments show SCD pipelines equipped with STP and MTA are much faster than baselines without token pruning in both training (up to 120% higher throughput) and inference (up to 60.6% higher throughput) while maintaining segmentation quality. Code is available here: https://github.com/cvlab-stonybrook/TokenSparse-for-MedSeg.
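The sampling side of STP can be sketched as perturbed top-K selection: during training the importance scores are perturbed (a Gumbel-style relaxation is what makes the top-K gradients tractable; only the forward sampling is shown here), while at inference the selection is a deterministic top-K. A numpy sketch (scorer and shapes are illustrative):

```python
import numpy as np

def soft_topk_prune(scores, k, rng=None):
    """Return the (sorted) indices of the k tokens to keep. With rng set,
    scores are perturbed with Gumbel noise (training-style sampling);
    with rng=None the selection is plain deterministic top-K (inference)."""
    if rng is not None:
        scores = scores + rng.gumbel(size=scores.shape)
    return np.sort(np.argpartition(-scores, k)[:k])

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))        # 16 encoder tokens, dim 8 (toy)
scores = tokens @ rng.normal(size=8)     # lightweight scorer (illustrative)
keep = soft_topk_prune(scores, k=4)      # deterministic, inference-style
sparse_tokens = tokens[keep]
```

The pruned indices would later drive token completion: the dropped positions are the ones MTA has to restore before dense decoding.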
{"title":"Token Sparsification for Faster Medical Image Segmentation.","authors":"Lei Zhou, Huidong Liu, Joseph Bae, Junjun He, Dimitris Samaras, Prateek Prasanna","doi":"10.1007/978-3-031-34048-2_57","DOIUrl":"https://doi.org/10.1007/978-3-031-34048-2_57","url":null,"abstract":"<p><p><i>Can we use sparse tokens for dense prediction, e.g., segmentation?</i> Although token sparsification has been applied to Vision Transformers (ViT) to accelerate classification, it is still unknown how to perform segmentation from sparse tokens. To this end, we reformulate segmentation as a <i>s</i><i>parse encoding</i> → <i>token</i> <i>c</i><i>ompletion</i> → <i>d</i><i>ense decoding</i> (SCD) pipeline. We first empirically show that naïvely applying existing approaches from classification token pruning and masked image modeling (MIM) leads to failure and inefficient training caused by inappropriate sampling algorithms and the low quality of the restored dense features. In this paper, we propose <i>Soft-topK Token Pruning (STP)</i> and <i>Multi-layer Token Assembly (MTA)</i> to address these problems. In <i>sparse encoding</i>, <i>STP</i> predicts token importance scores with a lightweight sub-network and samples the topK tokens. The intractable topK gradients are approximated through a continuous perturbed score distribution. In <i>token completion</i>, <i>MTA</i> restores a full token sequence by assembling both sparse output tokens and pruned multi-layer intermediate ones. The last <i>dense decoding</i> stage is compatible with existing segmentation decoders, e.g., UNETR. Experiments show SCD pipelines equipped with <i>STP</i> and <i>MTA</i> are much faster than baselines without token pruning in both training (up to 120% higher throughput) and inference (up to 60.6% higher throughput) while maintaining segmentation quality. 
Code is available here: https://github.com/cvlab-stonybrook/TokenSparse-for-MedSeg.</p>","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... conference","volume":"13939 ","pages":"743-754"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11056020/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140856074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-01. Epub Date: 2023-06-08. DOI: 10.1007/978-3-031-34048-2_22
Jinghan Huang, Moo K Chung, Anqi Qiu
This study proposes a novel heterogeneous graph convolutional neural network (HGCNN) to handle complex brain fMRI data at regional and across-region levels. We introduce a generic formulation of spectral filters on heterogeneous graphs by introducing the k-th Hodge-Laplacian (HL) operator. In particular, we propose Laguerre polynomial approximations of HL spectral filters and prove that their spatial localization on graphs is related to the polynomial order. Furthermore, based on the bijection property of boundary operators on simplex graphs, we introduce a generic topological graph pooling (TGPool) method that can be used on simplices of any dimension. This study designs HL-node, HL-edge, and HL-HGCNN neural networks to learn signal representations at the node level, the edge level, and both, respectively. Our experiments employ fMRI from the Adolescent Brain Cognitive Development (ABCD) study (n=7693) to predict general intelligence. Our results demonstrate the advantage of the HL-edge network over the HL-node network when functional brain connectivity is used as features. The HL-HGCNN outperforms state-of-the-art graph neural network (GNN) approaches such as GAT, BrainGNN, dGCN, BrainNetCNN, and Hypergraph NN. The functional connectivity features learned from the HL-HGCNN are meaningful in interpreting neural circuits related to general intelligence.
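The Laguerre approximation mentioned above evaluates a spectral filter as a fixed-order polynomial of the Hodge-Laplacian via the three-term recurrence (k+1)·L_{k+1}(t) = (2k+1-t)·L_k(t) - k·L_{k-1}(t), so only matrix-vector products are needed and no eigendecomposition. A numpy sketch (the toy operator and coefficients are illustrative):

```python
import numpy as np

def laguerre_filter(H, x, theta):
    """y = sum_k theta[k] * L_k(H) x, with Laguerre polynomials evaluated by
    the recurrence (k+1) L_{k+1}(t) = (2k+1 - t) L_k(t) - k L_{k-1}(t);
    H stands for a (Hodge-)Laplacian acting on node or edge signals."""
    Lk_prev = np.asarray(x, float)          # L_0(H) x = x
    y = theta[0] * Lk_prev
    if len(theta) == 1:
        return y
    Lk = Lk_prev - H @ Lk_prev              # L_1(H) x = (I - H) x
    y = y + theta[1] * Lk
    for k in range(1, len(theta) - 1):
        Lk_next = ((2 * k + 1) * Lk - H @ Lk - k * Lk_prev) / (k + 1)
        Lk_prev, Lk = Lk, Lk_next
        y = y + theta[k + 1] * Lk
    return y
```

On a 1x1 "operator" the recurrence reduces to the scalar Laguerre polynomials, e.g. L_2(t) = (t^2 - 4t + 2)/2, which gives L_2(2) = -1.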
{"title":"Heterogeneous Graph Convolutional Neural Network via Hodge-Laplacian for Brain Functional Data.","authors":"Jinghan Huang, Moo K Chung, Anqi Qiu","doi":"10.1007/978-3-031-34048-2_22","DOIUrl":"10.1007/978-3-031-34048-2_22","url":null,"abstract":"<p><p>This study proposes a novel heterogeneous graph convolutional neural network (HGCNN) to handle complex brain fMRI data at regional and across-region levels. We introduce a generic formulation of spectral filters on heterogeneous graphs by introducing the <i>k</i>-<i>th</i> Hodge-Laplacian (HL) operator. In particular, we propose Laguerre polynomial approximations of HL spectral filters and prove that their spatial localization on graphs is related to the polynomial order. Furthermore, based on the bijection property of boundary operators on simplex graphs, we introduce a generic topological graph pooling (TGPool) method that can be used at any dimensional simplices. This study designs HL-node, HL-edge, and HL-HGCNN neural networks to learn signal representation at a graph node, edge levels, and both, respectively. Our experiments employ fMRI from the Adolescent Brain Cognitive Development (ABCD; n=7693) to predict general intelligence. Our results demonstrate the advantage of the HL-edge network over the HL-node network when functional brain connectivity is considered as features. The HL-HGCNN outperforms the state-of-the-art graph neural networks (GNNs) approaches, such as GAT, BrainGNN, dGCN, BrainNetCNN, and Hypergraph NN. The functional connectivity features learned from the HL-HGCNN are meaningful in interpreting neural circuits related to general intelligence.</p>","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... 
conference","volume":"13939 ","pages":"278-290"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11108189/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141077300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-05-23. DOI: 10.48550/arXiv.2305.13756
B. Kim, J. Dolz, Pierre-Marc Jodoin, Christian Desrosiers
Privacy protection in medical data is a legitimate obstacle for centralized machine learning applications. Here, we propose a client-server image segmentation system that allows the analysis of multi-centric medical images while preserving patient privacy. In this approach, the client protects the to-be-segmented patient image by mixing it with a reference image. As shown in our work, it is challenging to separate the image mixture back into its exact original content, making the data unworkable and unrecognizable to an unauthorized person. This proxy image is sent to a server for processing. The server then returns the mixture of segmentation maps, which the client can unmix to recover the correct target segmentation. Our system has two components: 1) a segmentation network on the server side that processes the image mixture, and 2) a segmentation unmixing network that recovers the correct segmentation map from the segmentation mixture. The whole system is trained end-to-end. The proposed method is validated on MRI brain segmentation using images from two different datasets. Results show that the segmentation accuracy of our method is comparable to that of a system trained on raw images, and outperforms other privacy-preserving methods with little computational overhead.
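The client-side protocol above can be sketched with convex mixing. Note the hedge in `unmix_segmentation`: a closed-form inverse only exists if the server's map were linear in its input; the paper instead trains a dedicated unmixing network end-to-end, and the linear case below is only the ideal that network approximates:

```python
import numpy as np

def mix(patient, reference, alpha):
    """Client side: hide the patient image in a convex mixture with a reference."""
    return alpha * patient + (1 - alpha) * reference

def unmix_segmentation(mixed_seg, reference_seg, alpha):
    """Client side: recover the target segmentation from the server's output.
    Assumes (for this sketch only) the server's output mixes linearly; the
    paper trains a segmentation unmixing network for this step."""
    return (mixed_seg - (1 - alpha) * reference_seg) / alpha

rng = np.random.default_rng(0)
patient_seg = (rng.random((8, 8)) > 0.5).astype(float)    # toy label maps
reference_seg = (rng.random((8, 8)) > 0.5).astype(float)
alpha = 0.6
# Ideal case: the server's segmentation of the mixed image equals the same
# mixture of the two segmentation maps, which the client then inverts.
mixed_seg = mix(patient_seg, reference_seg, alpha)
recovered = unmix_segmentation(mixed_seg, reference_seg, alpha)
```

The server only ever sees `mix(patient, reference, alpha)`, never the raw patient image.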
{"title":"Mixup-Privacy: A simple yet effective approach for privacy-preserving segmentation","authors":"B. Kim, J. Dolz, Pierre-Marc Jodoin, Christian Desrosiers","doi":"10.48550/arXiv.2305.13756","DOIUrl":"https://doi.org/10.48550/arXiv.2305.13756","url":null,"abstract":"Privacy protection in medical data is a legitimate obstacle for centralized machine learning applications. Here, we propose a client-server image segmentation system which allows for the analysis of multi-centric medical images while preserving patient privacy. In this approach, the client protects the to-be-segmented patient image by mixing it to a reference image. As shown in our work, it is challenging to separate the image mixture to exact original content, thus making the data unworkable and unrecognizable for an unauthorized person. This proxy image is sent to a server for processing. The server then returns the mixture of segmentation maps, which the client can revert to a correct target segmentation. Our system has two components: 1) a segmentation network on the server side which processes the image mixture, and 2) a segmentation unmixing network which recovers the correct segmentation map from the segmentation mixture. Furthermore, the whole system is trained end-to-end. The proposed method is validated on the task of MRI brain segmentation using images from two different datasets. Results show that the segmentation accuracy of our method is comparable to a system trained on raw images, and outperforms other privacy-preserving methods with little computational overhead.","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... 
conference","volume":"40 1","pages":"717-729"},"PeriodicalIF":0.0,"publicationDate":"2023-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83483216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical image segmentation is a fundamental task in medical image analysis. In this paper, a novel network architecture, referred to as Convolution, Transformer, and Operator (CTO), is proposed. CTO combines Convolutional Neural Networks (CNNs), a Vision Transformer (ViT), and an explicit boundary detection operator to achieve high recognition accuracy while maintaining an optimal balance between accuracy and efficiency. CTO follows the standard encoder-decoder segmentation paradigm, where the encoder incorporates a popular CNN backbone for capturing local semantic information and a lightweight ViT assistant for integrating long-range dependencies. To enhance learning capacity on boundaries, a boundary-guided decoder network is proposed that uses a boundary mask, obtained from a dedicated boundary detection operator, as explicit supervision to guide the decoding process. The performance of the proposed method is evaluated on six challenging medical image segmentation datasets, demonstrating that CTO achieves state-of-the-art accuracy with competitive model complexity.
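One concrete way to derive such explicit boundary supervision is to run an edge operator over the ground-truth mask; a Sobel filter is sketched below as a stand-in for the paper's dedicated boundary detection operator (the toy mask is illustrative):

```python
import numpy as np

def sobel_boundary(mask):
    """Binary boundary map from a binary segmentation mask: Sobel gradients
    in x and y, thresholded at zero magnitude (border pixels are left at 0)."""
    m = mask.astype(float)
    gx = np.zeros_like(m)
    gy = np.zeros_like(m)
    # 3x3 Sobel kernels written as shifted-slice sums (right col - left col, etc.)
    gx[1:-1, 1:-1] = (m[:-2, 2:] + 2 * m[1:-1, 2:] + m[2:, 2:]
                      - m[:-2, :-2] - 2 * m[1:-1, :-2] - m[2:, :-2])
    gy[1:-1, 1:-1] = (m[2:, :-2] + 2 * m[2:, 1:-1] + m[2:, 2:]
                      - m[:-2, :-2] - 2 * m[:-2, 1:-1] - m[:-2, 2:])
    return (np.hypot(gx, gy) > 0).astype(np.uint8)

mask = np.zeros((7, 7), np.uint8)
mask[2:5, 2:5] = 1                 # a 3x3 square object
boundary = sobel_boundary(mask)
```

Interior pixels of the square get 0 (zero gradient), while pixels on its edge fire, which is exactly the sparse signal the boundary-guided decoder is supervised with.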
{"title":"Rethinking Boundary Detection in Deep Learning Models for Medical Image Segmentation","authors":"Yi-Mou Lin, Dong-Ming Zhang, Xiaori Fang, Yufan Chen, Kwang-Ting Cheng, Hao Chen","doi":"10.48550/arXiv.2305.00678","DOIUrl":"https://doi.org/10.48550/arXiv.2305.00678","url":null,"abstract":"Medical image segmentation is a fundamental task in the community of medical image analysis. In this paper, a novel network architecture, referred to as Convolution, Transformer, and Operator (CTO), is proposed. CTO employs a combination of Convolutional Neural Networks (CNNs), Vision Transformer (ViT), and an explicit boundary detection operator to achieve high recognition accuracy while maintaining an optimal balance between accuracy and efficiency. The proposed CTO follows the standard encoder-decoder segmentation paradigm, where the encoder network incorporates a popular CNN backbone for capturing local semantic information, and a lightweight ViT assistant for integrating long-range dependencies. To enhance the learning capacity on boundary, a boundary-guided decoder network is proposed that uses a boundary mask obtained from a dedicated boundary detection operator as explicit supervision to guide the decoding learning process. The performance of the proposed method is evaluated on six challenging medical image segmentation datasets, demonstrating that CTO achieves state-of-the-art accuracy with a competitive model complexity.","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... 
conference","volume":"77 1","pages":"730-742"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87056818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
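The explicit boundary supervision in CTO hinges on deriving a boundary mask from a segmentation mask. Below is a minimal NumPy sketch of one such operator (the paper's actual operator may differ): a foreground pixel is marked as boundary if any of its 4-neighbours lies outside the mask.

```python
import numpy as np

def boundary_mask(seg):
    """Derive a thin boundary map from a binary segmentation mask.

    A pixel is flagged wherever it differs from a 4-neighbour; the
    final AND with `seg` keeps only the foreground side of the edge.
    """
    seg = seg.astype(bool)
    b = np.zeros_like(seg)
    b[1:, :] |= seg[1:, :] ^ seg[:-1, :]    # differs from pixel above
    b[:-1, :] |= seg[:-1, :] ^ seg[1:, :]   # differs from pixel below
    b[:, 1:] |= seg[:, 1:] ^ seg[:, :-1]    # differs from pixel left
    b[:, :-1] |= seg[:, :-1] ^ seg[:, 1:]   # differs from pixel right
    return b & seg

# A 3x3 foreground square: its single interior pixel is not boundary,
# the 8 ring pixels are.
seg = np.zeros((6, 6), dtype=bool)
seg[2:5, 2:5] = True
print(int(boundary_mask(seg).sum()))  # 8
```

In CTO such a mask serves as an auxiliary supervision target for the decoder; here it is shown only as a standalone operator on a ground-truth mask.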
Pub Date : 2023-03-31DOI: 10.48550/arXiv.2303.18019
Gary Sarwin, A. Carretta, V. Staartjes, M. Zoli, D. Mazzatenta, L. Regli, C. Serra, E. Konukoglu
Advanced minimally invasive neurosurgery navigation relies mainly on Magnetic Resonance Imaging (MRI) guidance, which in most cases provides only pre-operative information. Once surgery begins, the value of this guidance diminishes because of the anatomical changes caused by the surgery itself. Guidance with live image feedback coming directly from the surgical device, e.g., an endoscope, can complement MRI-based navigation or serve as an alternative when MRI guidance is not feasible. With this motivation, we present a method for live image-only guidance that leverages a large data set of annotated neurosurgical videos. First, we report the performance of a deep learning-based object detection method, YOLO, on detecting anatomical structures in neurosurgical images. Second, we present a method for generating neurosurgical roadmaps using unsupervised embedding, without assuming exact anatomical matches between patients, the presence of an extensive anatomical atlas, or the need for simultaneous localization and mapping. A generated roadmap encodes the common anatomical paths taken in the surgeries of the training set. At inference, the roadmap can be used to map a surgeon's current location from live image feedback and to predict which structures should appear going forward or backward along the path, much like a mapping application. Even though the embedding is not supervised by position information, we show that it correlates with the location inside the brain and along the surgical path. We trained and evaluated the proposed method on a data set of 166 transsphenoidal adenomectomy procedures.
{"title":"Live image-based neurosurgical guidance and roadmap generation using unsupervised embedding","authors":"Gary Sarwin, A. Carretta, V. Staartjes, M. Zoli, D. Mazzatenta, L. Regli, C. Serra, E. Konukoglu","doi":"10.48550/arXiv.2303.18019","DOIUrl":"https://doi.org/10.48550/arXiv.2303.18019","url":null,"abstract":"Advanced minimally invasive neurosurgery navigation relies mainly on Magnetic Resonance Imaging (MRI) guidance. MRI guidance, however, only provides pre-operative information in the majority of the cases. Once the surgery begins, the value of this guidance diminishes to some extent because of the anatomical changes due to surgery. Guidance with live image feedback coming directly from the surgical device, e.g., endoscope, can complement MRI-based navigation or be an alternative if MRI guidance is not feasible. With this motivation, we present a method for live image-only guidance leveraging a large data set of annotated neurosurgical videos. First, we report the performance of a deep learning-based object detection method, YOLO, on detecting anatomical structures in neurosurgical images. Second, we present a method for generating neurosurgical roadmaps using unsupervised embedding without assuming exact anatomical matches between patients, presence of an extensive anatomical atlas, or the need for simultaneous localization and mapping. A generated roadmap encodes the common anatomical paths taken in surgeries in the training set. At inference, the roadmap can be used to map a surgeon's current location using live image feedback on the path to provide guidance by being able to predict which structures should appear going forward or backward, much like a mapping application. Even though the embedding is not supervised by position information, we show that it is correlated to the location inside the brain and on the surgical path. 
We trained and evaluated the proposed method with a data set of 166 transsphenoidal adenomectomy procedures.","PeriodicalId":73379,"journal":{"name":"Information processing in medical imaging : proceedings of the ... conference","volume":"12 1","pages":"107-118"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82443376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
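The roadmap-based guidance described above can be illustrated with a toy nearest-neighbour localization in embedding space. Everything below is hypothetical for illustration: the 2-D roadmap embeddings and the structure labels are made up (the paper learns the embedding unsupervised from endoscopic video), but the lookup shows how a live frame's embedding could be mapped onto the path to predict upcoming structures.

```python
import numpy as np

# Hypothetical roadmap: an ordered sequence of embedding vectors, one
# per step along the common surgical path, each tagged with the
# structures typically visible there.
roadmap = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.5], [3.0, 1.0]])
structures = ["nasal cavity", "sphenoid sinus", "sellar floor", "pituitary"]

def localize(frame_embedding):
    # Nearest roadmap node to the live frame's embedding.
    d = np.linalg.norm(roadmap - frame_embedding, axis=1)
    return int(np.argmin(d))

live = np.array([1.9, 0.4])     # embedding of the current endoscope frame
i = localize(live)
ahead = structures[i + 1:]       # structures expected going forward
print(structures[i], ahead)      # sellar floor ['pituitary']
```

Because the roadmap is ordered along the path, the same index also yields the structures expected when moving backward (`structures[:i]`), which is the "mapping application" behaviour the abstract describes.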