
Latest Publications: IEEE Journal of Biomedical and Health Informatics

Inhibitory Components in Muscle Synergies Factorized by The Rectified Latent Variable Model from Electromyographic Data.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-09 · DOI: 10.1109/JBHI.2024.3453603
Xiaoyu Guo, Subing Huang, Borong He, Chuanlin Lan, Jodie J Xie, Kelvin Y S Lau, Tomohiko Takei, Arthur D P Mak, Roy T H Cheung, Kazuhiko Seki, Vincent C K Cheung, Rosa H M Chan

Non-negative matrix factorization (NMF), widely used in motor neuroscience for identifying muscle synergies from electromyographic signals (EMGs), extracts only non-negative synergies and thus cannot identify potential negative components (NegCps) in synergies underpinned by inhibitory spinal interneurons. To overcome this constraint, we propose to use the rectified latent variable model (RLVM) to extract muscle synergies. RLVM is an autoencoder neural network whose weight matrix may contain negative entries, while its latent variables must remain non-negative. When the inputs to the model are EMGs, the weight matrix and latent variables represent the muscle synergies and their temporal activation coefficients, respectively. We compared the performance of NMF and RLVM in identifying muscle synergies on simulated and experimental datasets. Our simulation results showed that RLVM performed better at identifying the muscle-synergy subspace, while NMF correlated well with the ground truth. Finally, we applied RLVM to a previously published experimental dataset comprising EMGs from upper-limb muscles and spike recordings of spinal premotor interneurons (PreM-INs) collected from two macaque monkeys during grasping tasks. RLVM and NMF synergies were highly similar, but a few small negative muscle components were observed in the RLVM synergies. The muscles with NegCps identified by RLVM exhibited near-zero values in the corresponding synergies identified by NMF. Importantly, the NegCps of RLVM synergies corresponded to the muscle connectivity of PreM-INs with inhibitory muscle fields, as identified by spike-triggered averaging of EMGs. Our results demonstrate the feasibility of RLVM for extracting potential inhibitory muscle-synergy components from EMGs.
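
To make the factorization concrete, here is a minimal sketch of an RLVM-style autoencoder in PyTorch: the latent activation coefficients are rectified to stay non-negative, while the decoder weight matrix (the synergies) is unconstrained and may contain negative entries, unlike NMF. All shapes and training details are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RectifiedLatentAutoencoder(nn.Module):
    """RLVM-style factorization: non-negative latents, unconstrained weights."""
    def __init__(self, n_muscles: int, n_synergies: int):
        super().__init__()
        self.encoder = nn.Linear(n_muscles, n_synergies)
        # The decoder weight plays the role of the synergy matrix; it is
        # unconstrained, so entries may be negative (candidate inhibitory
        # components), which NMF cannot represent.
        self.decoder = nn.Linear(n_synergies, n_muscles, bias=False)

    def forward(self, x):
        h = torch.relu(self.encoder(x))    # rectified activation coefficients
        return self.decoder(h), h

emg = torch.rand(1000, 8)                  # toy EMG envelopes: 1000 samples, 8 muscles
model = RectifiedLatentAutoencoder(n_muscles=8, n_synergies=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):                       # plain reconstruction training
    recon, _ = model(emg)
    loss = ((recon - emg) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

synergies = model.decoder.weight.detach()  # (8 muscles x 3 synergies); sign-free
```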

Citations: 0
Automated Quantification of HER2 Amplification Levels Using Deep Learning.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-09 · DOI: 10.1109/JBHI.2024.3476554
Ching-Wei Wang, Kai-Lin Chu, Ting-Sheng Su, Keng-Wei Liu, Yi-Jia Lin, Tai-Kuang Chao

HER2 assessment is necessary for patient selection in anti-HER2 targeted treatment. However, manual assessment of HER2 amplification is time-consuming, labor-intensive, highly subjective and error-prone. Challenges in HER2 analysis of fluorescence in situ hybridization (FISH) and dual in situ hybridization (DISH) images include unclear and blurry cell boundaries, large variations in cell shapes and signals, overlapping and clustered cells, and sparse labels, as manual annotations cover only high-confidence cells, making assessment scores dependent on each annotator's choice of cells. To address these issues, we developed a soft-sampling cascade deep learning model and a signal detection model that quantify CEN17 and HER2 signals in cells, to assist assessment of HER2 amplification status when selecting breast cancer patients for HER2-targeted therapy. In evaluations on two kinds of clinical datasets, a FISH dataset and a DISH dataset, the proposed method achieves high accuracy, recall and F1-score on both in the instance segmentation of HER2-related cells, which must contain both CEN17 and HER2 signals. Moreover, the proposed method significantly outperforms seven recently published state-of-the-art deep learning methods, including the contour proposal network (CPN), soft label-based FCN (SL-FCN), modified fully convolutional network (M-FCN), bilayer convolutional network (BCNet), SOLOv2, Cascade R-CNN and DeepLabv3+ with three different backbones (p ≤ 0.01). Clinically, anti-HER2 therapy can also be applied to gastric cancer patients. We applied the developed model to assist HER2 DISH amplification assessment for gastric cancer patients, where it also showed promising predictive results (accuracy 97.67 ± 1.46%, precision 96.15 ± 5.82%).
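
For orientation, the downstream scoring step can be as simple as averaging per-cell signal counts. The sketch below assumes a detector has already emitted (HER2, CEN17) counts for each valid cell; the ratio cutoff of 2.0 follows the common ASCO/CAP convention rather than anything specified in the paper, and the full guideline (which also considers average HER2 copy number) is deliberately simplified.

```python
# Illustrative post-processing only: aggregate per-cell signal counts into a
# mean HER2/CEN17 ratio and a simplified amplification call.
def her2_amplification(cells):
    """cells: list of (her2_count, cen17_count) for segmented cells that
    contain both signal types."""
    valid = [(h, c) for h, c in cells if c > 0]
    if not valid:
        raise ValueError("no cells with countable CEN17 signals")
    mean_her2 = sum(h for h, _ in valid) / len(valid)
    mean_cen17 = sum(c for _, c in valid) / len(valid)
    ratio = mean_her2 / mean_cen17
    return ratio, ("amplified" if ratio >= 2.0 else "not amplified")

ratio, status = her2_amplification([(6, 2), (5, 2), (8, 3)])
print(f"HER2/CEN17 = {ratio:.2f} -> {status}")   # HER2/CEN17 = 2.71 -> amplified
```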

Citations: 0
A Trustworthy Curriculum Learning Guided Multi-Target Domain Adaptation Network for Autism Spectrum Disorder Classification.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-08 · DOI: 10.1109/JBHI.2024.3476076
Jiale Dun, Jun Wang, Juncheng Li, Qianhui Yang, Wenlong Hang, Xiaofeng Lu, Shihui Ying, Jun Shi

Domain adaptation has demonstrated success in the classification of multi-center autism spectrum disorder (ASD). However, current domain adaptation methods primarily focus on classifying data in a single target domain with the assistance of one or multiple source domains, and lack the capability to address the clinical scenario of identifying ASD in multiple target domains. In response to this limitation, we propose a Trustworthy Curriculum Learning Guided Multi-Target Domain Adaptation (TCL-MTDA) network for identifying ASD in multiple target domains. To effectively handle varying degrees of data shift across target domains, we propose a trustworthy curriculum learning procedure based on the Dempster-Shafer (D-S) theory of evidence. Additionally, a domain-contrastive adaptation method is integrated into the TCL-MTDA process to align data distributions between source and target domains, facilitating the learning of domain-invariant features. The proposed TCL-MTDA method is evaluated on 437 subjects (220 ASD patients and 217 normal controls) from the Autism Brain Imaging Data Exchange (ABIDE). Experimental results validate the effectiveness of the proposed method in multi-target ASD classification: it achieves an average accuracy of 71.46% (95% CI: 68.85%-74.06%) across four target domains, significantly outperforming most baseline methods (p < 0.05).
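
As a rough sketch of how evidence-based trustworthiness can drive a curriculum, the snippet below uses the standard subjective-logic uncertainty u = K/S over a Dirichlet distribution built from network evidence, and orders target samples from easy to hard. This is the common evidential-deep-learning formulation, assumed here for illustration; the paper's actual D-S weighting scheme may differ.

```python
import torch
import torch.nn.functional as F

def evidential_uncertainty(logits):
    # logits: (batch, K) network outputs for K classes
    evidence = F.softplus(logits)        # non-negative evidence per class
    alpha = evidence + 1.0               # Dirichlet concentration parameters
    strength = alpha.sum(dim=1)          # S = sum_k alpha_k
    k = logits.shape[1]
    return k / strength                  # u in (0, 1]; low = trustworthy

def curriculum_order(logits):
    # Easy-to-hard: target-sample indices sorted by ascending uncertainty.
    return torch.argsort(evidential_uncertainty(logits))

logits = torch.randn(5, 2)               # toy binary ASD / NC outputs
print(curriculum_order(logits))
```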

Citations: 0
Cascaded Inner-Outer Clip Retformer for Ultrasound Video Object Segmentation.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-07 · DOI: 10.1109/JBHI.2024.3464732
Jialu Li, Lei Zhu, Zhaohu Xing, Baoliang Zhao, Ying Hu, Faqin Lv, Qiong Wang

Computer-aided ultrasound (US) imaging is an important prerequisite for early clinical diagnosis and treatment. Owing to harsh US image quality and blurry tumor areas, recent memory-based video object segmentation (VOS) models achieve frame-level segmentation by performing intensive similarity matching against past frames, which inevitably results in computational redundancy. Furthermore, the attention mechanism in recent models allocates the same attention level across whole spatial-temporal memory features without distinction, which can degrade accuracy. In this paper, we first build a larger annotated benchmark dataset for breast lesion segmentation in ultrasound videos, and then propose a lightweight clip-level VOS framework that achieves higher segmentation accuracy while maintaining speed. The Inner-Outer Clip Retformer is proposed to extract spatial-temporal tumor features in parallel: the Outer Clip Retformer extracts tumor movement features from past video clips to locate the tumor in the current clip, while the Inner Clip Retformer extracts detailed current tumor features that produce more accurate segmentation results. A Clip Contrastive loss function is further proposed to align the extracted tumor features along both the spatial and temporal dimensions, improving segmentation accuracy. In addition, Global Retentive Memory is proposed to maintain complementary tumor features with lower computing resources, generating coherent temporal movement features. In this way, our model significantly improves spatial-temporal perception without adding a large number of parameters, achieving more accurate segmentation results while maintaining a faster segmentation speed. Finally, extensive experiments on several video object segmentation datasets show that our framework outperforms state-of-the-art segmentation methods.
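
The clip-level contrastive alignment can be pictured with a generic InfoNCE loss over paired per-clip tumor features. This is an assumed stand-in for the paper's Clip Contrastive loss, whose exact form is not reproduced in the abstract; temperature and feature sizes are invented.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(feats_a, feats_b, temperature=0.1):
    # feats_a, feats_b: (n_clips, dim); row i of each view is the same clip,
    # so matching rows are positives and all other rows are negatives.
    a = F.normalize(feats_a, dim=1)
    b = F.normalize(feats_b, dim=1)
    logits = a @ b.t() / temperature        # cosine similarity matrix
    targets = torch.arange(a.shape[0])      # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

loss = clip_contrastive_loss(torch.randn(4, 128), torch.randn(4, 128))
print(loss.item())
```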

Citations: 0
HMDA: A Hybrid Model with Multi-scale Deformable Attention for Medical Image Segmentation.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-07 · DOI: 10.1109/JBHI.2024.3469230
Mengmeng Wu, Tiantian Liu, Xin Dai, Chuyang Ye, Jinglong Wu, Shintaro Funahashi, Tianyi Yan

Transformers have been applied to medical image segmentation tasks owing to their excellent long-range modeling capability, compensating for the inability of Convolutional Neural Networks (CNNs) to extract global features. However, the standardized self-attention modules in Transformers, with their uniform and inflexible pattern of attention distribution, frequently incur unnecessary computational redundancy on high-dimensional data, impeding the model's capacity to concentrate precisely on salient image regions. Additionally, achieving effective explicit interaction between the spatially detailed features captured by CNNs and the long-range contextual features provided by Transformers remains challenging. In this work, we propose a hybrid Transformer and CNN architecture with Multi-scale Deformable Attention (HMDA), designed to address these issues effectively. Specifically, we introduce a Multi-scale Spatially Adaptive Deformable Attention (MSADA) mechanism, which attends to a small set of key sampling points around a reference point within the multi-scale features, to achieve better performance. In addition, we propose the Cross Attention Bridge (CAB) module, which integrates multi-scale Transformer and local features through channel-wise cross attention, enriching feature synthesis. HMDA is validated on multiple datasets, and the results demonstrate the effectiveness of our approach, which achieves competitive results compared with previous methods.
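
The core mechanic of deformable attention, where each query predicts a few sampling offsets around a reference point and takes an attention-weighted sum of bilinearly sampled values, can be sketched in a single-head, single-scale form. The multi-scale and cross-bridge machinery of HMDA is omitted, and all names, shapes, and the 0.1 offset scale are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableAttentionSketch(nn.Module):
    def __init__(self, dim: int, n_points: int = 4):
        super().__init__()
        self.n_points = n_points
        self.offsets = nn.Linear(dim, 2 * n_points)   # (dx, dy) per sampling point
        self.weights = nn.Linear(dim, n_points)       # attention weight per point
        self.value_proj = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, query, ref_xy, value_map):
        # query: (B, Nq, C); ref_xy: (B, Nq, 2) in [-1, 1]; value_map: (B, C, H, W)
        B, Nq, _ = query.shape
        v = self.value_proj(value_map)
        offs = self.offsets(query).view(B, Nq, self.n_points, 2)
        locs = (ref_xy.unsqueeze(2) + 0.1 * offs.tanh()).clamp(-1, 1)
        sampled = F.grid_sample(v, locs, align_corners=False)   # (B, C, Nq, P)
        w = self.weights(query).softmax(dim=-1)                 # (B, Nq, P)
        return torch.einsum('bcqp,bqp->bqc', sampled, w)        # (B, Nq, C)

attn = DeformableAttentionSketch(dim=32)
out = attn(torch.randn(2, 5, 32), torch.rand(2, 5, 2) * 2 - 1,
           torch.randn(2, 32, 16, 16))
print(out.shape)  # torch.Size([2, 5, 32])
```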

Citations: 0
TBE-Net: A Deep Network Based on Tree-like Branch Encoder for Medical Image Segmentation.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-07 · DOI: 10.1109/JBHI.2024.3468904
Shukai Yang, Xiaoqian Zhang, Youdong He, Yufeng Chen, Ying Zhou

In recent years, encoder-decoder network structures have been widely used in designing medical image segmentation models. However, these methods still face some limitations: 1) limited feature extraction capability, primarily because insufficient attention is paid to the encoder, so rich and effective features are not extracted; and 2) unidirectional stepwise decoding of smaller-sized feature maps, which restricts segmentation performance. To address these limitations, we propose an innovative Tree-like Branch Encoder Network (TBE-Net), which adopts a tree-like branch encoder to better extract features and preserve feature information. Additionally, we introduce the Depth and Width Expansion (D-WE) module to expand network depth and width at low parameter cost, thereby enhancing network performance. Furthermore, we design a Deep Aggregation Module (DAM) to better aggregate and process encoder features. The aggregated features are then decoded directly to generate the segmentation map. Experimental results show that, compared with other advanced algorithms and at the lowest parameter cost, our method improves the IoU metric on the TNBC, PH2, CHASE-DB1, STARE, and COVID-19-CT-Seg datasets by 1.6%, 0.46%, 0.81%, 1.96%, and 0.86%, respectively.
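
The abstract does not spell out how D-WE works internally. As one plausible way to grow depth and width cheaply, the snippet below contrasts a plain 3x3 convolution with a depthwise-separable pair that is both deeper and wider for roughly an eighth of the parameters. This is an assumption-driven illustration, not the authors' module.

```python
import torch.nn as nn

def param_count(m):
    return sum(p.numel() for p in m.parameters())

plain = nn.Conv2d(64, 128, kernel_size=3, padding=1)
separable = nn.Sequential(                        # deeper and wider, fewer params
    nn.Conv2d(64, 64, 3, padding=1, groups=64),   # depthwise: spatial mixing only
    nn.Conv2d(64, 128, 1),                        # pointwise: width expansion
)
print(param_count(plain), param_count(separable))  # 73856 vs 8960
```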

Citations: 0
A novel recognition and classification approach for motor imagery based on spatio-temporal features.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-07 · DOI: 10.1109/JBHI.2024.3464550
Renjie Lv, Wenwen Chang, Guanghui Yan, Wenchao Nie, Lei Zheng, Bin Guo, Muhammad Tariq Sadiq

Motor imagery, as a paradigm of brain-machine interfaces, holds vast potential in the field of medical rehabilitation. Given the challenges posed by the non-stationarity and low signal-to-noise ratio of EEG signals, effectively extracting features from motor imagery signals for accurate recognition is a key focus of motor imagery brain-machine interface technology. This paper proposes a motor imagery EEG classification model that combines functional brain networks with graph convolutional networks. First, functional brain networks are constructed using different brain functional connectivity metrics, and graph-theoretic features are calculated to analyze in depth the characteristics of brain networks under different motor tasks. Then, the constructed functional brain networks are combined with graph convolutional networks to classify and recognize motor imagery tasks. The functional connectivity analysis reveals that connectivity strength during the both-fists task is significantly higher than in the other motor imagery tasks, and that connectivity strength during actual movement is generally higher than during motor imagery. In experiments on the PhysioNet public dataset, the proposed model achieved a classification accuracy of 88.39% under multi-subject conditions, significantly outperforming traditional methods. Under single-subject conditions, the model effectively addressed individual variability, achieving an average classification accuracy of 99.31%. These results indicate that the proposed model not only performs excellently in classifying motor imagery tasks but also provides new insights into the functional connectivity characteristics of different motor tasks and their corresponding brain regions.
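
A minimal NumPy sketch of this pipeline, correlation-based functional connectivity followed by one symmetric-normalized graph-convolution step H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W), is shown below. The channel count and sampling rate match the PhysioNet EEG recordings, but the edge weighting by |r| and the feature choice are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def functional_connectivity(eeg):
    # eeg: (channels, samples) -> (channels, channels) Pearson correlation
    return np.corrcoef(eeg)

def gcn_layer(adj, feats, weight):
    a = np.abs(adj) + np.eye(adj.shape[0])    # |r| as edge strength + self-loops
    d = np.diag(1.0 / np.sqrt(a.sum(axis=1))) # D^{-1/2}
    return np.maximum(d @ a @ d @ feats @ weight, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
eeg = rng.standard_normal((64, 640))          # 64 channels, 4 s at 160 Hz
adj = functional_connectivity(eeg)
h = gcn_layer(adj, feats=eeg, weight=rng.standard_normal((640, 32)))
print(h.shape)  # (64, 32): one 32-d embedding per channel/node
```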

Citations: 0
Acoustic COVID-19 Detection Using Multiple Instance Learning.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-04 · DOI: 10.1109/JBHI.2024.3474975
Michael Reiter, Franz Pernkopf

During the COVID-19 pandemic, a rigorous testing scheme was crucial. However, tests can be time-consuming and expensive. A machine learning-based diagnostic tool for audio recordings could enable widespread testing at low cost. To achieve comparability between such algorithms, the DiCOVA challenge was created. It is based on the Coswara dataset, which offers the recording categories cough, speech, breath and vowel phonation. Recording durations vary greatly, ranging from one second to over a minute. A base model is pre-trained on random short time intervals. Subsequently, a Multiple Instance Learning (MIL) model based on self-attention makes collective predictions over multiple time segments within each audio recording, taking advantage of longer durations. To compete in the fusion category of the DiCOVA challenge, we use a linear regression approach, among other fusion methods, to combine predictions from the most successful models for each sound modality. The MIL approach significantly improves generalizability, leading to an AUC-ROC score of 86.6% in the fusion category. By incorporating previously unused data, including the sound modality 'sustained vowel phonation' and patient metadata, we were able to improve significantly on our previous results, reaching a score of 92.2%.
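
The collective prediction over segments can be pictured with the classic attention-based MIL pooling of Ilse et al. (2018): segment embeddings are weighted by learned attention scores and summed into one bag embedding per recording. The paper's self-attention model is more elaborate, so treat this as a schematic stand-in with invented sizes.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, dim=128, hidden=64):
        super().__init__()
        self.attend = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                    nn.Linear(hidden, 1))
        self.classify = nn.Linear(dim, 1)

    def forward(self, segments):
        # segments: (n_segments, dim) embeddings of one recording's windows
        a = torch.softmax(self.attend(segments), dim=0)   # (n_segments, 1)
        bag = (a * segments).sum(dim=0)                   # weighted bag embedding
        return torch.sigmoid(self.classify(bag)), a.squeeze(-1)

model = AttentionMIL()
prob, seg_weights = model(torch.randn(12, 128))  # 12 windows, one recording
print(prob.item(), seg_weights.shape)            # COVID probability + weights
```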

Citations: 0
BioSAM: Generating SAM Prompts From Superpixel Graph for Biological Instance Segmentation.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-04 · DOI: 10.1109/JBHI.2024.3474706
Miaomiao Cai, Xiaoyu Liu, Zhiwei Xiong, Xuejin Chen

Proposal-free instance segmentation methods have significantly advanced the field of biological image analysis. Recently, the Segment Anything Model (SAM) has shown an extraordinary ability to handle challenging instance boundaries. However, directly applying SAM to biological images containing instances with complex morphologies and dense distributions fails to yield satisfactory results. In this work, we propose BioSAM, a new biological instance segmentation framework that generates SAM prompts from a superpixel graph. Specifically, to avoid over-merging, we first generate sufficient superpixels as graph nodes and construct an initialized graph. We then generate initial prompts from each superpixel and aggregate them through a graph neural network (GNN) that predicts the relationships among superpixels, avoiding over-segmentation. We employ the SAM encoder embeddings and the SAM-assisted superpixel similarity as new graph features to enhance the graph's discrimination capability. With this graph-based prompt aggregation, we use the aggregated prompts in SAM to refine the segmentation and generate more accurate instance boundaries. Comprehensive experiments on four representative biological datasets demonstrate that our proposed method outperforms state-of-the-art methods.
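
The first stage, over-segmenting an image into superpixels that become graph nodes, can be sketched with scikit-image. SLIC and the mean-color region adjacency graph here stand in for whatever over-segmentation and graph construction the authors actually used, and the snippet assumes scikit-image >= 0.20, where `graph` lives at the package top level.

```python
import numpy as np
from skimage.segmentation import slic
from skimage import graph

rng = np.random.default_rng(0)
image = rng.random((128, 128, 3))                     # toy stand-in for a biological image
labels = slic(image, n_segments=200, compactness=10)  # plenty of small superpixels (nodes)
rag = graph.rag_mean_color(image, labels)             # edges link touching superpixels
print(rag.number_of_nodes(), rag.number_of_edges())   # the initialized graph's size
```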

Citations: 0
Transformer³: A Pure Transformer Framework for fMRI-Based Representations of Human Brain Function.
IF 6.7 · CAS Zone 2 (Medicine) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · Pub Date: 2024-10-04 · DOI: 10.1109/JBHI.2024.3471186
Xiaoxi Tian, Hao Ma, Yun Guan, Le Xu, Jiangcong Liu, Lixia Tian

Effective representation learning is essential for neuroimage-based individualized predictions. Numerous studies have performed fMRI-based individualized predictions by leveraging the sample-wise, spatial, and temporal interdependencies hidden in fMRI data. However, these studies failed to fully exploit the effective information hidden in fMRI data, as only one or two types of interdependency were analyzed. To extract representations of human brain function that fully leverage all three types of interdependency, we establish a pure transformer-based framework, Transformer³, exploiting the transformer's strong ability to capture interdependencies within its input. Transformer³ consists mainly of three transformer modules: a Batch Transformer module for addressing sample-wise similarities and differences, a Region Transformer module for handling complex spatial interdependencies among brain regions, and a Time Transformer module for capturing temporal interdependencies across time points. Experiments on age, IQ, and sex prediction based on two public datasets demonstrate the effectiveness of the proposed Transformer³. As the only hypothesis is that sample-wise, spatial, and temporal interdependencies exist extensively within the input data, Transformer³ can be widely used for representation learning on multivariate time series. Furthermore, the pure transformer framework makes it convenient to understand the driving factors underlying predictive models based on Transformer³.
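
The three-module idea reduces to running self-attention along each axis of a (subjects, regions, time, features) tensor in turn. The sketch below shares one attention module across all three axes purely for brevity, whereas Transformer³ uses three distinct modules; all sizes are invented.

```python
import torch
import torch.nn as nn

def attend_along(x, axis, mha):
    # Move `axis` to the sequence position and fold the other dims into batch.
    x = x.movedim(axis, -2)                        # (..., seq, dim)
    lead = x.shape[:-2]
    flat = x.reshape(-1, x.shape[-2], x.shape[-1])
    out, _ = mha(flat, flat, flat)                 # self-attention over `axis`
    return out.reshape(*lead, *out.shape[-2:]).movedim(-2, axis)

dim = 32
x = torch.randn(8, 90, 120, dim)                   # 8 subjects, 90 ROIs, 120 TRs
mha = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
for axis in (0, 1, 2):                             # batch, region, time transformers
    x = attend_along(x, axis, mha)
print(x.shape)  # torch.Size([8, 90, 120, 32])
```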

Citations: 0