
Biomedical Physics & Engineering Express: Latest Articles

Temporal and comorbidity-aware representation of longitudinal patient trajectories from electronic health records.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-30 DOI: 10.1088/2057-1976/ae38de
M Sreenivasan, S Madhavendranath, Anu Mary Chacko

Electronic health records (EHRs) capture longitudinal multi-visit patient journeys but are difficult to analyze due to temporal irregularity, multimorbidity, and heterogeneous coding. This study introduces a temporal and comorbidity-aware trajectory representation that restructures admissions into ordered symbolic visit states while preserving diagnostic progression, secondary comorbidities, procedure categories, demographics, outcomes, and inter-visit intervals. These symbolic states are subsequently encoded as fixed-length numerical vectors suitable for computational analysis. Validation was conducted in two stages: Stage I assessed construction fidelity using coverage metrics, comorbidity preservation, diagnostic transition structures, and exact inter-visit gap encoding, and Stage II assessed analytical utility through clustering experiments using different clustering approaches such as sequence similarity, Gaussian Mixture Models (GMM), and a temporal LSTM autoencoder (TS-LSTM). As a proof of concept, a subset of patient cohorts from the MIMIC-IV database was encoded, consisting of 2,280 patients with 8,849 admissions having complete primary diagnosis coverage and near-complete secondary coverage. The Stage I assessment, based on cohort-level coverage metrics, confirmed that the transformation preserved essential clinical information and key properties of longitudinal EHRs. In Stage II, clustering experiments validated the analytical utility of the representation across sequence-based, Gaussian mixture, and temporal LSTM autoencoder approaches. Ablation studies further demonstrated that both multimorbidity depth and inter-visit gap encoding are critical to maintaining cluster separability and temporal fidelity. The findings show that explicit encoding of comorbidity and timing improves interpretability and subgroup coherence. Although evaluated on a single dataset, the use of standardised ICD-10 EHR structure supports the assumption that the framework can generalise across healthcare settings; future work will incorporate multimodal data and external validation.
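The idea of turning ordered admissions into fixed-length vectors that keep comorbidities and inter-visit gaps can be sketched as follows. This is only a minimal illustration, not the authors' encoding: the vocabulary `DIAG_VOCAB`, the function names, the `max_secondary` cut-off, and the vector layout (one-hot primary diagnosis, multi-hot secondary diagnoses, gap in days) are all assumptions made for the example.

```python
from datetime import date

# Hypothetical diagnosis vocabulary; the paper's symbol set is not given in the abstract.
DIAG_VOCAB = ["I10", "E11", "N18", "J44"]


def encode_visit(primary, secondary, gap_days, max_secondary=3):
    """Encode one admission as a fixed-length numeric vector (assumed layout)."""
    # One-hot block for the primary diagnosis.
    primary_vec = [0.0] * len(DIAG_VOCAB)
    if primary in DIAG_VOCAB:
        primary_vec[DIAG_VOCAB.index(primary)] = 1.0
    # Multi-hot block for secondary comorbidities, truncated to max_secondary codes.
    secondary_vec = [0.0] * len(DIAG_VOCAB)
    for code in secondary[:max_secondary]:
        if code in DIAG_VOCAB:
            secondary_vec[DIAG_VOCAB.index(code)] = 1.0
    # The exact inter-visit gap (in days) is appended as the last feature.
    return primary_vec + secondary_vec + [float(gap_days)]


def encode_trajectory(admissions):
    """admissions: list of (admit_date, primary_code, [secondary_codes])."""
    ordered = sorted(admissions, key=lambda a: a[0])  # chronological order
    vectors, previous = [], None
    for admit, primary, secondary in ordered:
        gap = 0 if previous is None else (admit - previous).days
        vectors.append(encode_visit(primary, secondary, gap))
        previous = admit
    return vectors


admissions = [
    (date(2020, 3, 1), "E11", ["I10"]),
    (date(2020, 1, 1), "I10", ["E11", "N18"]),
]
trajectory = encode_trajectory(admissions)
```

Each visit vector has the same length regardless of how many secondary codes the admission carries, which is what makes the trajectory usable by clustering methods that expect fixed-size inputs.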

Citations: 0
Impact of fine-grained learning rate configuration on the performance of medical image segmentation models.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-30 DOI: 10.1088/2057-1976/ae3830
Fang Wang, Ji Li, Rui Zhang, Jing Hu, Gaimei Gao

Research on deep learning for medical image segmentation has shifted from single-modality networks to multimodal data fusion. Updating the parameters of such deep learning models is crucial for accurate segmentation predictions. Although existing optimizers can perform global parameter updates, the fine-grained initialization of learning rates across different network hierarchies and its influence on segmentation performance have not been sufficiently explored. To address this, we conducted a series of experiments showing that the initialization of a differentiated learning rate across network layers directly affected the performance of medical image segmentation models. To determine the optimal initial learning rate for each network level, we summarized a general statistical relationship between early-stage training results and the model's final optimal performance. In this paper, we propose a fine-grained learning rate configuration algorithm. To verify the effectiveness of the proposed algorithm, we evaluated 10 segmentation models on three benchmark datasets: the colon polyp segmentation dataset CVC-ClinicDB, the gastrointestinal polyp dataset Kvasir-SEG, and the breast tumor segmentation dataset BUSI. The models that achieved the most significant improvement in mIoU on these three datasets were H-vmunet, MSRUNet, and H-vmunet, with increases of 3.87%, 4.67%, and 6.22%, respectively. Additionally, we validated the generalization and transferability of the proposed algorithm using a thyroid nodule segmentation dataset and a skin lesion segmentation dataset. Finally, a series of analyses, including segmentation result analysis, feature map visualization, training process analysis, computational overhead analysis, and clinical relevance analysis, confirmed the effectiveness of the proposed method. The core code is publicly available at https://github.com/Lambda-Wave/PaperCoreCode.
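To make the notion of a differentiated initial learning rate per network level concrete, here is a minimal sketch. It is not the algorithm from the paper or from the linked repository: the function name, the level names, and the geometric scaling rule are assumptions chosen purely for illustration.

```python
def layerwise_lr_groups(level_names, base_lr=1e-3, decay=0.5):
    """Assign each network level its own initial learning rate.

    Assumed rule (illustrative only): the last level gets base_lr and each
    earlier level is scaled down by a further factor of `decay`.
    """
    n = len(level_names)
    return [
        {"name": name, "lr": base_lr * decay ** (n - 1 - i)}
        for i, name in enumerate(level_names)
    ]


groups = layerwise_lr_groups(["encoder", "bottleneck", "decoder"])
for group in groups:
    print(group["name"], group["lr"])
```

A list of dictionaries with an `lr` entry per group mirrors the way per-group options are usually passed to optimizers in deep learning frameworks; in practice each dictionary would also carry the parameters belonging to that level.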

Citations: 0
Model uncertainty estimates for deep learning mammographic density prediction using ordinal and classification approaches.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-30 DOI: 10.1088/2057-1976/ae39e2
Steven Squires, Grey Kuling, D Gareth Evans, Anne L Martel, Susan M Astley

Purpose. Mammographic density is associated with the risk of developing breast cancer and can be predicted using deep learning methods. Model uncertainty estimates are not produced by standard regression approaches but would be valuable for clinical and research purposes. Our objective is to produce deep learning models with in-built uncertainty estimates without degrading predictive performance. Approach. We analysed data from over 150,000 mammogram images with associated continuous density scores from expert readers in the Predicting Risk Of Cancer At Screening (PROCAS) study. We re-designated the continuous density scores to 100 density classes, then trained classification and ordinal deep learning models. Distributions and distribution-free methods were applied to extract predictions and uncertainties. A deep learning regression model was trained on the continuous density scores to act as a direct comparison. Results. The root mean squared error (RMSE) between expert-assigned density labels and predictions of the standard regression model was 8.42 (8.34-8.51), while the RMSEs for the classification and ordinal classification models were 8.37 (8.28-8.46) and 8.44 (8.35-8.53), respectively. The average uncertainties produced by the models were higher when the density scores from pairs of expert readers differ more, when different mammogram views of the same views are more variable, and when two separately trained models show higher variation. Conclusions. Using either a classification or ordinal approach we can produce model uncertainty estimates without loss of predictive performance.
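One way a classifier over ordered density classes can yield both a prediction and an uncertainty is to treat its output probabilities as a predictive distribution and report the mean and standard deviation. The sketch below assumes this simple distribution-based reading; the abstract does not specify the exact extraction methods, and the function names are hypothetical.

```python
import math


def predict_with_uncertainty(probs):
    """probs: class probabilities over ordered density classes 0..K-1.

    Returns (mean, standard deviation) of the predictive distribution.
    Using the standard deviation as the uncertainty is an assumption.
    """
    total = sum(probs)
    p = [x / total for x in probs]  # renormalise defensively
    mean = sum(k * pk for k, pk in enumerate(p))
    variance = sum(pk * (k - mean) ** 2 for k, pk in enumerate(p))
    return mean, math.sqrt(variance)


def rmse(predictions, labels):
    """Root mean squared error between predictions and reference labels."""
    squared = [(a - b) ** 2 for a, b in zip(predictions, labels)]
    return math.sqrt(sum(squared) / len(squared))


# A confident output: all mass on class 40.
confident = [0.0] * 100
confident[40] = 1.0

# An ambiguous output: mass split between classes 30 and 50.
ambiguous = [0.0] * 100
ambiguous[30] = 0.5
ambiguous[50] = 0.5

confident_result = predict_with_uncertainty(confident)
ambiguous_result = predict_with_uncertainty(ambiguous)
```

Both outputs give the same point prediction (40), but only the second carries a non-zero spread, which is the kind of information a plain regression head does not provide.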

Citations: 0
Microdosimetric analysis of proton boron capture therapy using microdosimetric kinetic model.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-30 DOI: 10.1088/2057-1976/ae3965
Abdur Rahim, Tatsuhiko Sato, Hiroshi Fukuda, Mehrdad Shahmohammadi Beni, Hiroshi Watabe

Objective. Proton boron capture therapy (PBCT) is a novel approach that utilizes alpha particles generated through the proton-induced capture reaction with ¹¹B. Early studies reported substantial dose enhancements of 50%-96% near the Bragg peak, suggesting a promising therapeutic advantage. However, subsequent investigations have raised critical concerns regarding the practical feasibility of PBCT, primarily due to the relatively low reaction cross section in the Bragg peak region and the need for clinically unrealistic boron concentrations. The aim of this study is to evaluate Relative Biological Effectiveness (RBE) enhancement in PBCT using microdosimetric analysis across a wide range of boron concentrations. Approach. In the present work, we employed a Monte Carlo model using the Particle and Heavy Ion Transport code System (PHITS) package combined with the Microdosimetric Kinetic Model to quantify both physical and biological dose enhancements at varying concentrations of ¹¹B. Main Results. Microdosimetric analysis revealed that the total dose is dominated by protons, although alpha particles dominate in regions of higher linear energy deposition. The resulting RBE enhancement factors were 1.0011, 1.0080, and 1.1275 (for Human Salivary Gland (HSG) cell type) for 100, 1000, and 10,000 ppm boron concentrations, respectively. While the enhancements at lower concentrations are negligible, a modest increase is observed at very high boron levels. Significance. Based on the resulting RBE enhancement factors, it can be concluded that although alpha particles generated via the p + ¹¹B → 3α reaction contribute high-Linear Energy Transfer (LET) energy at the cellular level, the overall biological dose enhancement remains minimal. These results indicate that under clinically achievable boron concentrations, the therapeutic benefit of PBCT may be limited.
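The reported RBE enhancement factors are easier to compare when expressed as percentage increases over the boron-free case. The short calculation below uses only the three factors quoted in the abstract; the helper name `percent_enhancement` is introduced here for illustration.

```python
# RBE enhancement factors quoted in the abstract, keyed by boron concentration (ppm).
rbe_factors = {100: 1.0011, 1000: 1.0080, 10000: 1.1275}


def percent_enhancement(factor):
    """Convert an enhancement factor into a percentage increase over 1.0."""
    return (factor - 1.0) * 100.0


percentages = {ppm: round(percent_enhancement(f), 2) for ppm, f in rbe_factors.items()}
for ppm, pct in percentages.items():
    print(f"{ppm} ppm: {pct}% RBE enhancement")
```

This gives roughly 0.11%, 0.80%, and 12.75% for 100, 1000, and 10,000 ppm, which makes the gap between these values and the 50%-96% dose enhancements reported in early studies explicit.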

Citations: 0
Dual pseudo-labeling based adversarial domain adaptation for EEG-based emotion recognition.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-29 DOI: 10.1088/2057-1976/ae395e
Ling Huang, Mingxuan Li, Guangpeng Gao, Mengjie Qian

In recent years, unsupervised domain adaptation (UDA) has emerged as a promising approach for constructing cross-subject emotion recognition models. However, most existing UDA methods do not fully exploit class information in the target domain, resulting in a relatively coarse-grained process of domain alignment and a higher risk of incorrect class matching. To address this issue, this paper proposes a novel adversarial training-based domain adaptation framework. The proposed method leverages emotion class prototypes to enhance intra-class correlation between the source and target feature distributions. Meanwhile, soft pseudo-labels generated by prototype clustering are utilized to further improve the inter-class discriminability within each domain. In order to enhance the robustness and quality of hard pseudo-labels in the target domain, a dual pseudo-labeling strategy is introduced. Finally, adversarial training is conducted to achieve a more fine-grained alignment of data distributions across domains. We conduct cross-subject and cross-session evaluations on the SEED and SEED-IV datasets, respectively. Experimental results demonstrate the effectiveness of our method and its advantages over several state-of-the-art UDA approaches. By introducing dual pseudo-labels, our study incorporates additional supervision, enabling a more refined domain adaptation process and significantly improving the generalization capability of EEG-based emotion recognition models.
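The prototype-based soft pseudo-labels mentioned above can be illustrated with a small sketch. It assumes prototypes are per-class mean feature vectors from the labelled source domain and that soft labels are a softmax over negative squared distances; these choices, the `temperature` parameter, and the function names are assumptions, and the paper's dual pseudo-labeling strategy and adversarial training are not reproduced here.

```python
import math


def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class, computed from labelled source samples."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feature, label in zip(features, labels):
        counts[label] += 1
        for j, value in enumerate(feature):
            sums[label][j] += value
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def soft_pseudo_labels(feature, prototypes, temperature=1.0):
    """Softmax over negative squared distances to each class prototype."""
    distances = [sum((a - b) ** 2 for a, b in zip(feature, p)) for p in prototypes]
    logits = [-d / temperature for d in distances]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


source_features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
source_labels = [0, 0, 1, 1]
prototypes = class_prototypes(source_features, source_labels, num_classes=2)

# An unlabelled target-domain sample lying close to class 0.
soft = soft_pseudo_labels([1.0, 1.0], prototypes)
hard = soft.index(max(soft))  # a hard pseudo-label derived from the soft one
```

The soft vector keeps the relative confidence across emotion classes, while the hard label is the single most likely class; both kinds of label can then act as extra supervision for target-domain samples.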

Citations: 0
A magnetic induction-based differential method for intracerebral hemorrhage lateralization.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-29 DOI: 10.1088/2057-1976/ae38e6
Hui Quan Wang, Guo Chong Chen, Rui Juan Chen, Xin Ma, Jin Hai Wang, Xiang Yang Xu

Magnetic induction technology (MIT), as a non-contact and non-invasive sensing approach, has shown great potential for detecting brain lesions since it is unaffected by skull shielding. However, most MIT-based studies on intracerebral hemorrhage (ICH) have mainly focused on identifying the presence or estimating the volume of bleeding, while research on spatial localization has remained limited. In this study, a magnetic induction differential localization (MIDL) method was proposed to detect and localize ICH. A pair of symmetrically arranged detection coils was designed to sense the differential magnetic field perturbations caused by variations in the electrical conductivity and permittivity of brain tissues. The feasibility and response characteristics of the system were verified through numerical simulations and physical phantom experiments, followed by in vivo validation on eight New Zealand white rabbits with unilateral induced hemorrhages. The real and imaginary components of the differential signals were analyzed to investigate their correlation with the side and volume of hemorrhage. Both simulations and phantom experiments demonstrated opposite variation trends of the real and imaginary components for left- and right-side hemorrhages. Animal experiments further confirmed that, after the injection of 1 ml of blood, the signal variation amplitudes significantly exceeded the baseline deviation (P < 0.05), exhibiting opposite directions of change between the two hemispheres. These results indicate that the proposed MIDL method can effectively distinguish the hemorrhage side and provide a theoretical and experimental foundation for non-invasive localization of intracerebral hemorrhage using MIT.
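The decision logic implied by the abstract (a change that exceeds the baseline deviation, with its direction indicating the side) can be sketched as below. This is a toy illustration only: the threshold multiplier `k`, the use of the real component alone, and the mapping of a positive change to the left side are assumptions, not values or conventions taken from the paper.

```python
import statistics


def lateralize(baseline, measurement, k=3.0, positive_side="left"):
    """Guess the hemorrhage side from a complex differential coil signal.

    baseline: complex differential readings taken before the hemorrhage.
    measurement: complex differential reading taken afterwards.
    The sign convention (positive_side) and threshold k are hypothetical.
    """
    baseline_real = [z.real for z in baseline]
    mean_real = statistics.fmean(baseline_real)
    deviation = statistics.pstdev(baseline_real)
    change = measurement.real - mean_real
    # Changes within k baseline deviations are treated as noise.
    if abs(change) <= k * deviation:
        return "undetermined"
    other_side = "right" if positive_side == "left" else "left"
    return positive_side if change > 0 else other_side


baseline = [complex(0.0, 0.0), complex(0.1, 0.0), complex(-0.1, 0.0)]
print(lateralize(baseline, complex(1.0, -0.5)))
print(lateralize(baseline, complex(-1.0, 0.5)))
print(lateralize(baseline, complex(0.05, 0.0)))
```

The same comparison could be applied to the imaginary component, which the abstract reports as changing in the opposite direction to the real component.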

Citations: 0
Statistical modeling of blood and tissue signatures using ultrasonic color flow imaging.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-29 DOI: 10.1088/2057-1976/ae3f36
Atefeh Abdolmanafi, Jonathan Rubin, Stephen Z Pinter, J Brian Fowlkes, Oliver Kripfgans

Conventional color flow processing is primarily optimized for qualitative visualization of flow dynamics, which limits its diagnostic use in regions where vascular structures are small relative to the ultrasound beamwidth. Leveraging the statistical properties of color flow data may provide a pathway toward quantitative discrimination between blood and tissue signals. This could enhance detection of vascular abnormalities, improve diagnostic accuracy, and support monitoring in diseases with small hemodynamic changes. Experimental data were obtained using a clinical GE LOGIQ 9 ultrasound system with a 10L linear array probe (3.75 MHz) positioned on an in-house half-space flow phantom, with the focus at a depth of 3 cm. The simulation data, obtained from Field II, used a setup analogous to the experimental settings. The theoretical probability density function of ultrasound color flow power was derived using a gamma distribution. Shape parameters for blood and tissue were estimated by maximum likelihood estimation (MLE) in both simulation and experimental data. Color flow power was found to follow the gamma distribution in both cases. The estimated shape parameters aligned with theoretical predictions and distinguished blood from tissue: they were less than or equal to 1 for tissue samples and greater than 1 for blood samples. This study presents a statistical modeling approach to enhance blood-tissue differentiation in color flow ultrasound, enabling blood characterization and perfusion quantification for improved detection and monitoring of vascular abnormalities.
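As an illustration of the estimation step described in this abstract, the following is a minimal sketch of maximum likelihood estimation of a gamma shape parameter, followed by the "shape ≤ 1 means tissue, shape > 1 means blood" rule stated above. It is not the authors' implementation: the function names (`fit_gamma_shape`, `classify_shape`), the Newton iteration, and the synthetic samples are assumptions made for this example, and real color flow power data would replace the simulated values.

```python
import math
import random


def digamma(x):
    """Digamma function via upward recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 * inv - series


def trigamma(x):
    """Trigamma function via upward recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return result + inv + 0.5 * inv2 + series


def fit_gamma_shape(samples, iterations=20):
    """Estimate the gamma shape parameter k by maximum likelihood.

    Solves ln(k) - digamma(k) = ln(mean(x)) - mean(ln(x)) with Newton steps.
    All samples must be strictly positive.
    """
    n = len(samples)
    mean_x = sum(samples) / n
    mean_log = sum(math.log(v) for v in samples) / n
    s = math.log(mean_x) - mean_log
    if s <= 0.0:
        raise ValueError("samples carry no spread; shape is not identifiable")
    # Closed-form starting point, then refine with Newton's method.
    k = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
    for _ in range(iterations):
        step = (math.log(k) - digamma(k) - s) / (1.0 / k - trigamma(k))
        k -= step
        if abs(step) < 1e-10:
            break
    return k


def classify_shape(k):
    """Apply the threshold reported in the abstract (assumed to be exactly 1)."""
    return "blood" if k > 1.0 else "tissue"


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data, for illustration only
    power = [rng.gammavariate(2.5, 1.0) for _ in range(5000)]
    k_hat = fit_gamma_shape(power)
    print(f"estimated shape: {k_hat:.3f} -> {classify_shape(k_hat)}")
```

The sketch estimates only the shape parameter, because that is the quantity the abstract uses to separate blood from tissue; the scale follows as the sample mean divided by the estimated shape.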

Citations: 0
LMSA-Net: a lightweight multi-scale attention network for EEG-based emotion recognition.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-23 DOI: 10.1088/2057-1976/ae3763
Hao Yue, Hengrui Ruan, Yawu Zhao

Electroencephalogram (EEG)-based emotion recognition holds great potential in affective computing, mental health assessment, and human-computer interaction. However, EEG signals are non-stationary, noisy, and composed of multiple frequency bands, making direct feature learning from raw data particularly challenging. While end-to-end models alleviate the need for manual feature engineering, advancing the performance frontier of lightweight architectures remains a crucial and complex challenge for practical deployment. To address these issues, we propose LMSA-Net (Lightweight Multi-Scale Attention Network), a lightweight, interpretable, end-to-end model that learns spatio-temporal features directly from raw EEG signals. The architecture integrates learnable channel weighting for adaptive spatial encoding, multi-scale temporal separable convolution for rhythm-specific feature extraction, and a Sim Attention Module for parameter-free saliency enhancement. LMSA-Net is evaluated on three benchmark datasets, SEED, SEED-IV, and DEAP, under subject-dependent protocols. It achieves top performance on SEED (65.53% accuracy), competitive results on SEED-IV (48.52% accuracy), and strong performance in arousal classification on DEAP, demonstrating good generalization. Ablation studies confirm the critical role of each proposed module. Frequency analysis reveals that the multi-scale temporal kernels inherently specialize in distinct EEG rhythms, validating their neurophysiological alignment. The lightweight design is evidenced by a minimal parameter count (7.64K) and low latency, making it well suited to edge deployment. Interpretability analysis further shows the model's focus on emotion-related brain regions. LMSA-Net thus delivers an efficient, interpretable, and high-performing solution. The code is available at https://github.com/rhr0411/LMSA-Net.git.

Citations: 0
Feasibility of breath-hold gating with visual-tactile guidance on an MR-Linac.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-22 DOI: 10.1088/2057-1976/ae3570
Kai Yuan, Matthew Manhin Cheung, Wai Kin Lai, Wing Ki Wong, Ashley Chi Kin Cheng, Louis Lee

This study aimed to assess a visual-tactile breath-hold (BH) workflow integrated with Elekta Unity's comprehensive motion management (CMM) system for gated MR-guided radiotherapy in situations where verbal coaching is impractical. A visual guidance program and a 3D-printed couch-mounted tactile pointer were implemented to instruct patients and stabilize voluntary BH. Two patients, one with pancreatic cancer and one with lung cancer, were treated using this workflow. Treatment beam gating was driven by CMM BH criteria, and audit log files from CMM-guided treatments were analyzed. Expected gating efficiencies were 40% for the pancreas case and 51.4% for the lung case, while measured efficiencies were 42.59 ± 2.56% and 54.95 ± 0.54%, respectively. The corresponding beam-on times were 14.75 ± 0.96 and 16.25 ± 0.50 min. The workflow reduced reliance on motion prediction for gating and mitigated frequent beam holds typically observed with free-breathing strategies, thereby decreasing dosimetric uncertainty. These findings indicate that a visual-tactile BH workflow on a 1.5 T MR-Linac is feasible and practical, supporting efficient gated delivery and reproducible breath-holds when verbal coaching is limited.
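To make the efficiency figures in this abstract concrete, here is a small sketch of how a gating efficiency could be computed from beam-on intervals. The abstract does not say how the audit logs were processed, so the definition used here (beam-enabled time divided by total session time), the function name `gating_efficiency`, and the interval values are all assumptions for illustration.

```python
def gating_efficiency(beam_on_intervals, session_start, session_end):
    """Return the fraction of the session during which the beam was enabled.

    beam_on_intervals: list of (start, end) times in seconds, assumed to be
    non-overlapping and to lie within [session_start, session_end].
    """
    total = session_end - session_start
    if total <= 0:
        raise ValueError("session_end must be later than session_start")
    beam_on = sum(end - start for start, end in beam_on_intervals)
    return beam_on / total


if __name__ == "__main__":
    # Hypothetical breath-hold windows, not values taken from the study.
    intervals = [(0, 20), (50, 70)]
    efficiency = gating_efficiency(intervals, 0, 100)
    print(f"gating efficiency: {efficiency:.1%}")
```

Per-fraction values computed this way could then be summarized as a mean and standard deviation, which is the form in which the abstract reports the measured efficiencies.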

Citations: 0
Hires-diagnoser: a dual stream medical image diagnosis framework based on multi-level resolution adaptive sensing.
IF 1.6 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Pub Date : 2026-01-22 DOI: 10.1088/2057-1976/ae2b74
Si-Chao Zhao, Jun-Jun Chen, Shi-Long Shi, Ge Deng, Xue-Jun Qiu

Improving medical image diagnosis performance relies on effectively representing features across various scales and accurately capturing local lesion characteristics and spatial context. Traditional convolutional neural networks are limited by fixed local receptive fields, which hinders their ability to model global semantic relationships, whereas transformers with self-attention mechanisms excel at capturing long-range contextual information but struggle to identify small lesions. To overcome these challenges, this study introduces Hires-Diagnoser, a dual-stream framework for medical image diagnosis that supports multiple resolution levels. The framework combines ConvNeXt and Swin-Transformer branches in a parallel architecture. The ConvNeXt branch focuses on extracting local texture features through convolutions, while the Swin-Transformer branch captures global contextual dependencies using window-based self-attention. Additionally, a cross-modal correlation module (LCA) facilitates dynamic interaction and adaptive fusion of features across different resolutions. Experimental assessments on four datasets (RaabinWBC, Brain Tumor MRI, LC25000, and OCT-C8) demonstrated accuracy rates of 98.59%, 95.45%, 99.43%, and 95.23%, respectively, surpassing existing methods. By incorporating a cross-modal feature interaction mechanism, this framework achieves high performance and precise pathological interpretations, offering an effective solution for medical image diagnosis with certain practical implications. The source code of this proposal can be found at https://github.com/si-yuan20/hire-diagnoser.

Citations: 0