
International Journal of Imaging Systems and Technology: Latest Publications

Indexers Should Actively Support the Fight Against Paper Mills
IF 2.5 · CAS Tier 4, Computer Science · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-10-13 · DOI: 10.1002/ima.70230
Mohamed L. Seghier
<p>Fake publications, proliferated by paper mills, are a symptom of the modern number-centric academia [<span>1, 2</span>]. The recent Richardson et al. study found that, while fraudulent papers make up a small portion of all publications, the scale of fraud has increased at a shocking rate, not suspected by many [<span>3</span>]. Specifically, the conclusion of Richardson et al.'s study is alarming: that is, “the ability to evade interventions is enabling the number of fraudulent publications to grow at a rate far outpacing that of legitimate science” [<span>3</span>]. All this is reinforced by a degree of impunity that perpetrators enjoy, leaving researchers disillusioned about the authenticity of existing scientific evidence. The question on the lips of all stakeholders is this: if it is still ‘easy’ to publish fake papers, why has academia allowed the problem to persist and undermine scholarly communication? Here, we discuss this problem from a pragmatic perspective, while stressing that this real problem should not be weaponized against science [<span>4</span>].</p><p>Fraud is neither new nor specific to academia. For example, academia has been wrestling with different types of fraudulent activities for decades, including cheating in exams, bogus colleges, forged degrees, and doctored CVs. Fraud in scholarly communication should be examined comprehensively within this broader context. The core underlying issue is that not all researchers have the necessary skills or resources to maintain high levels of research productivity. Consequently, some individuals may resort to unethical practices to achieve high h-index scores and publication counts like those of well-supported researchers at leading institutions, yet without investing equivalent effort. 
If scholarly communication is vulnerable to corruption [<span>5</span>], and the consequences for being caught are not harsh, why not game the system?</p><p>I believe this is where the problem lies: the whole purpose of fraudulent activities is to inflate research metrics through manipulation and fraud [<span>6</span>]. Yet, manipulation and fraud are rarely fed back into the system to adjust these metrics. This is why a solution must actively involve the entities that calculate and promote such research metrics: the indexers, such as journal indexing and university ranking agencies. Specifically, indexers can deter individuals (and institutions) who engage in fraudulent practices by hurting their research metrics.</p><p>Scholarly communication was profoundly shaped by the introduction of research metrics, originally packaged as objective quantitative measures of research quality and impact [<span>7</span>]. These metrics are widely adopted in academia despite their known limitations and inherent biases [<span>8, 9</span>]. But academia also knew that these metrics can be manipulated, as they soon become bad metrics (Goodhart's law), making the system vulnerable to corruption (Campbell's law). Yet, academia has remained complacent about letting bad metrics become so pervasive, ignoring several calls to save the system from corruption. Since many bad metrics are created by indexers, perhaps indexers can, to some extent, help fix the system. For instance, when one opens a researcher's profile on Scopus, many key metrics are displayed on the first page, easily accessible. Not so for retracted papers, which are often buried among the other publications. Simply displaying the total number of retractions in a profile would help expose the scale of fake or unreliable research. Admittedly, retractions are sometimes caused by genuine errors; nevertheless, they frequently signal poor research quality (fake or unreliable work) and should therefore be highlighted in researcher profiles (Figure 1). The aim is not to harm researchers' reputations but to encourage reflection on the pace of their output and the quality of their collaborations, especially when retractions become substantial. For such corrective measures to be effective, the system must be fast and reliable at identifying and retracting bad papers. The retraction process is currently slow and convoluted, so publishers should streamline their procedures to speed it up. Indeed, some publishers have been slow to act (or have not acted at all) when the community flags unreliable or fake publications. This is why it is important for indexers and publishers to work closely with existing community-led initiatives, such as United2Act, Retraction Watch, and PubPeer, to better understand retraction patterns and to ensure that rigorous and comprehensive solutions are implemented [<span>12</span>]. In particular, indexers and publishers should promote and support sleuths in the fight against fake publications.</p><p>Another relevant lever is the power of indexers to de-index journals suspected of poor or compromised editorial conduct. Journals need many checks in their editorial workflows to detect fake papers. For example, in this age of artificial intelligence, it is still surprising that some journals have no rigorous screening systems for plagiarism, image alteration, data manipulation, flawed statistics, fake or irrelevant references, unidentifiable or false authors, or suggested reviewers with fake accounts or obvious conflicts of interest. Such journals should be required to integrate these checks or risk de-indexing. Furthermore, indexers should flag journals with unusually high rates of authorship-change requests, as these may indicate suspicious authorship sales [<span>15</span>]. Likewise, journals with high retraction rates should be assessed more frequently and given clear action plans to address their problems; failing that, they could risk losing their impact factor or having their publications excluded from citation counts, which would help deter citation selling. In addition, journals that publish too many special issues, or articles outside their aims and scope, should be de-indexed: it makes no sense, for example, for a communication or education journal to publish a special issue on artificial intelligence in agriculture! Last but not least, some indexers leave loopholes in their metric-calculation methods that should be closed. A recent investigation showed that Google Scholar keeps counting citations from fabricated papers even after they have been retracted [<span>6</span>], which gives fraudulent outfits an incentive to flood preprint servers with nonsensical work that artificially inflates citations. Similarly, indexers should work closely with publishers to identify editors suspected of involvement in fraudulent editorial activities. Sadly, some editors have succumbed to the temptation of letting fake research be published. They may seek two kinds of benefit: (1) direct gains from the fraudulent activity itself, such as financial compensation, collaborations with authors who received favorable editorial decisions, or access to author and editor networks that can ease publication in other journals; or (2) indirect advantages through boosted journal metrics, because fake papers backed by networks of interconnected researchers can bring higher submission volumes, publication counts, and citations. Indexers should therefore work with publishers to investigate such practices, either helping journals hijacked by paper mills or taking decisive action against editors actively involved in promoting such fake publications.</p><p>This touches the root of the environments in which cheaters (some of them vulnerable researchers) operate: the working conditions fostered by their universities. It is unsurprising (and sad) that much fraud originates from emerging, ranking-obsessed universities. Eager to punch above their weight, these universities hire researchers as ‘academic mercenaries’. Typically, those researchers are required to publish X papers per year, regardless of other factors such as heavy teaching loads and scarce resources. Researchers are further incentivized with financial rewards for publishing in leading journals, even as their administrative and teaching duties keep growing; failing to publish X papers can jeopardize their jobs or promotion prospects. In this context, universities that do not scrutinize how their members acquire publications should see their positions in ranking tables adjusted accordingly. For example, besides counting publications per faculty member, ranking agencies should look closely at the conditions under which those publications were produced: does the university have a healthy research ecosystem that can sustain such productivity? Several dimensions can serve as red flags for questioning atypical research productivity, including faculty workload models, the number of PhD students or postdoctoral staff per published paper, the number of properly equipped laboratories, the amount of research funding per published paper in a given field, and the size and type of international collaborations per paper. The rise of hyperprolific researchers at emerging universities highlights the extent of the questionable methods used to sustain research productivity [<span>12, 20</span>]. Likewise, universities with abnormal retraction rates should be penalized accordingly in the rankings. Downgrading universities with atypically inflated metrics would send a strong signal prompting them to reconsider their research strategies.</p><p>In short, researchers hand publishers their intellectual output, such as articles, free of charge and volunteer their time for peer review. In return, publishers should maintain publication processes that meet the standards of research integrity and reliability. By ensuring that the metrics used by indexers and ranking agencies account for fraudulent output, the system can effectively deter authors from engaging in the unethical practices promoted by paper mills and predatory journals. If fewer and fewer researchers resort to these fraudulent services, they will become financially unviable and may disappear altogether. In line with the United2Act initiative, raising awareness of the dangers of fake publications is essential.</p>
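The editorial's proposal to feed retractions back into researcher metrics can be made concrete with a small sketch. The function names and the adjustment rule below are hypothetical illustrations (no indexer currently computes this): an h-index recomputed over non-retracted papers only.

```python
def h_index(citations):
    """Classic h-index: the largest h such that h papers have >= h citations."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

def retraction_aware_h_index(papers):
    """One possible feedback mechanism: compute the h-index only over
    papers that have not been retracted.
    papers: list of (citation_count, is_retracted) tuples."""
    return h_index([c for c, retracted in papers if not retracted])

# A researcher with one retracted paper sees the adjusted metric drop:
papers = [(10, False), (8, True), (6, False), (5, False), (1, False)]
print(h_index([c for c, _ in papers]))        # classic h-index
print(retraction_aware_h_index(papers))       # retraction-aware variant
```

Displaying both values side by side in a profile, as the editorial suggests for retraction counts, would make the cost of unreliable output visible without erasing the researcher's record.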
Citations: 0
Adaptive Dual-Model Federated Learning for Generalizable Brain Tumor Segmentation
IF 2.5 · CAS Tier 4, Computer Science · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-10-12 · DOI: 10.1002/ima.70223
Abdul Raheem, Zhen Yang, Malik Abdul Manan, Shahzad Ahmed, Fahad Sabah

Accurate segmentation of brain tumors in magnetic resonance imaging (MRI) is critical for diagnosis, treatment planning, and longitudinal monitoring. The development of robust and generalizable deep learning models for tumor segmentation is hindered by challenges such as data privacy, limited annotations, and domain variability across clinical institutions. To address these issues, we propose a dual-model federated learning framework for brain tumor segmentation that enables collaborative model training without sharing patient data. The framework employs two specialized architectures: a Multi-Scale Encoder U-Net (MSE-UNet) for fine-grained, multi-resolution feature extraction and a Residual Attention Transpose U-Net (ART-UNet) that leverages residual learning and dual attention mechanisms to enhance contextual sensitivity and robustness under non-IID conditions. To ensure effective learning across distributed, heterogeneous clients, we introduce a Dual-Model Architecture-Aware Aggregation (DAAA) strategy, which performs independent, performance-weighted aggregation of each architecture's updates. The proposed method is evaluated on two benchmark datasets, BraTS 2018 and TCGA-LGG, demonstrating superior performance compared to several state-of-the-art baselines. The model achieves Dice scores of 91.30% and 90.10% on BraTS and TCGA-LGG, respectively, with improved IoU, sensitivity, and boundary precision. Ablation studies confirm that each component, including auxiliary supervision, architectural duality, and adaptive aggregation, contributes significantly to overall performance. Clinically, this framework offers a scalable and privacy-preserving solution that can be integrated into real-world healthcare systems without compromising patient data security.
By enabling cross-institutional collaboration and ensuring robust performance across diverse imaging protocols, the proposed model facilitates early and accurate tumor delineation, supporting radiologists in critical decision-making processes such as surgical planning, radiotherapy guidance, and follow-up assessment. This study establishes a foundation for practical deployment in federated clinical environments, particularly within resource-constrained or privacy-sensitive institutions.
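The abstract describes DAAA as independent, performance-weighted aggregation of each architecture's client updates, but gives no formula. A minimal sketch of one plausible reading, score-weighted federated averaging applied to a single architecture's parameters (the weighting by validation score, e.g. local Dice, is an assumption):

```python
def weighted_aggregate(client_params, scores):
    """Performance-weighted federated averaging for one architecture.
    client_params: one flat parameter vector per client.
    scores: per-client validation scores (e.g., local Dice) used as weights.
    Under a DAAA-style scheme this would be run separately per architecture."""
    total = sum(scores)
    weights = [s / total for s in scores]  # normalize scores to weights
    dim = len(client_params[0])
    # weighted sum of each parameter across clients
    return [sum(w * p[i] for w, p in zip(weights, client_params))
            for i in range(dim)]

# Two clients: the better-scoring client pulls the global model toward itself.
global_params = weighted_aggregate([[1.0], [3.0]], scores=[3.0, 1.0])
print(global_params)  # closer to the higher-scoring client's parameter
```

In a dual-model setup, the server would keep two such aggregates, one per architecture, so that updates from MSE-UNet clients never mix with ART-UNet parameters.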

Citations: 0
Skin Cancer Classification in Dermoscopic Images Using Multi-Scale Feature Map Fusion Based on Deep Learning
IF 2.5 · CAS Tier 4, Computer Science · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-10-10 · DOI: 10.1002/ima.70219
Arvind Singh Rajpoot, Rahul Dixit, Anupam Shukla

Skin cancer is a prevalent and potentially life-threatening condition that requires accurate and timely diagnosis. Dermoscopic imaging aids in diagnosis but is often limited by manual interpretation, which can be time-intensive and error-prone. Automated classification systems using convolutional neural networks (CNNs) have shown significant promise in enhancing accuracy and efficiency. To further improve classification performance, we propose a novel deep learning-based Multi-Scale Feature Map Fusion (MFMF) model that extracts and fuses features from multiple convolutional layers. The MFMF module effectively combines these multi-scale features, enabling robust feature capture even with limited datasets. Additionally, a hair removal algorithm is proposed to enhance prediction accuracy through improved image preprocessing. On the HAM10000 dataset, our proposed model achieves an overall accuracy of 96.12%, an AUC of 98.7%, and an F1 Score of 93.29%, along with notable improvements in precision, recall, and sensitivity.
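The abstract says features from several convolutional layers are fused; the fusion operator itself is not specified. A minimal illustration under the assumption of a pool-then-concatenate scheme (one common way to combine feature maps of different spatial sizes, not necessarily the paper's MFMF design):

```python
def global_avg_pool(fmap):
    """Reduce one C x H x W feature map (nested lists) to a length-C descriptor
    by averaging each channel over its spatial grid."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def fuse_multiscale(fmaps):
    """Concatenate pooled descriptors from several conv stages into one vector.
    Each stage may have a different channel count and spatial resolution."""
    fused = []
    for f in fmaps:
        fused.extend(global_avg_pool(f))
    return fused

# Two stages: a 2-channel 2x2 map and a 1-channel 1x1 map -> 3-dim fused vector.
stage1 = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]
stage2 = [[[4]]]
print(fuse_multiscale([stage1, stage2]))  # [1.0, 2.0, 4.0]
```

Pooling first makes the fused vector's length depend only on the total channel count, so layers at any resolution can feed the same classifier head.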

Citations: 0
A Review of Deep Learning for Respiratory Motion Tracking in 2D and 3D Ultrasound Sequences for Image-Guided Therapy
IF 2.5 · CAS Tier 4, Computer Science · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-10-07 · DOI: 10.1002/ima.70217
Maryam Zebarjadi, Hubert H. Lim

Ultrasound (US) imaging is widely used in image-guided therapies due to its real-time capability, high temporal resolution, cost-effectiveness, safety, and portability. While many US tracking methods exist, most reviews focus on conventional techniques and overlook recent advances in deep learning (DL). Existing DL surveys often emphasize modalities like MRI (Magnetic Resonance Imaging) or CT (Computed Tomography), but US tracking presents unique challenges, such as speckle noise, acoustic shadowing, and occlusions, that require tailored solutions. With increasing interest in automating US-guided procedures, particularly for respiratory-induced motion (RIM) tracking, this review fills a gap by focusing on recent DL-based methods specifically developed for US motion tracking. We conducted a targeted review of state-of-the-art DL techniques, categorizing them by network architecture. Special attention is given to studies using benchmark datasets like CLUST (Challenge on Liver Ultrasound Tracking), which enable standardized performance comparison. DL-based methods demonstrate improved accuracy and robustness in difficult scenarios, such as out-of-plane motion and occlusions, where traditional approaches often underperform. DL has significantly improved the feasibility of markerless US tracking in clinical settings, supporting more automated and reliable image-guided interventions. While conventional methods still offer value, hybrid strategies that integrate DL with traditional tracking or sensor fusion approaches show strong potential. This review complements prior surveys on conventional US tracking, serving as a useful resource for researchers developing robust, automated US-based applications.
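The conventional baselines that the review contrasts with DL methods often amount to template matching; a minimal sketch of exhaustive normalized cross-correlation (NCC) search, the kind of classical tracker that degrades under the speckle, shadowing, and out-of-plane motion discussed above (an illustrative baseline, not one of the reviewed methods):

```python
def ncc(a, b):
    """Normalized cross-correlation of two equal-size 2D patches (nested lists)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    da = [v - ma for v in fa]
    db = [v - mb for v in fb]
    num = sum(x * y for x, y in zip(da, db))
    den = (sum(x * x for x in da) * sum(y * y for y in db)) ** 0.5
    return 0.0 if den == 0 else num / den

def track_template(frame, template):
    """Return the top-left (row, col) of the best NCC match in a frame.
    Real trackers restrict the search to a window around the last position."""
    th, tw = len(template), len(template[0])
    H, W = len(frame), len(frame[0])
    best, pos = -2.0, (0, 0)
    for r in range(H - th + 1):
        for c in range(W - tw + 1):
            win = [row[c:c + tw] for row in frame[r:r + th]]
            score = ncc(win, template)
            if score > best:
                best, pos = score, (r, c)
    return pos
```

Because NCC relies on the template's appearance staying stable, occlusions and speckle decorrelation break it, which is precisely where the review reports DL methods gaining their advantage.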

Citations: 0
Unsupervised Domain Adaptive Medical Segmentation Network Based on Contrastive Learning
IF 2.5 · CAS Tier 4, Computer Science · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-10-06 · DOI: 10.1002/ima.70210
Siqi Wang, Hao Wu, Xiaosheng Yu, Chengdong Wu

Accurate organ segmentation from magnetic resonance imaging (MRI) or computed tomography (CT) images is essential for surgical planning and decision-making. Traditional fully supervised deep learning methods often exhibit a significant decline in performance when applied to datasets that differ from the training data, thus limiting their clinical applicability. This study proposes a novel segmentation method based on unsupervised domain adaptation, aiming to improve cross-domain segmentation performance without the need for ground truth labels in the target domain. Specifically, our method trains the network with labeled source images and unlabeled target images, introducing a bidirectional feature-prototype contrastive loss to align features across domains, minimizing within-class variations and maximizing between-class variations. To further improve model performance, we propose a prototype-guided pseudo-label fusion module that generates high-quality pseudo-labels for the unlabeled target images between domain prototypes. Experimental results show that our method outperforms other unsupervised domain adaptation segmentation approaches, achieving state-of-the-art performance. Code is available at: https://github.com/WANGSIQII/UDA.git.
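The feature-prototype contrastive loss described above can be sketched in one direction: pull each feature toward its class prototype and push it from the others, InfoNCE-style. The paper's exact bidirectional formulation is not given in the abstract, so the details below (cosine similarity, temperature, mean prototypes) are assumptions:

```python
import math

def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class (the 'prototype')."""
    protos = []
    for k in range(num_classes):
        members = [f for f, l in zip(features, labels) if l == k]
        dim = len(members[0])
        protos.append([sum(m[i] for m in members) / len(members) for i in range(dim)])
    return protos

def _cos(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def proto_contrastive_loss(features, labels, prototypes, tau=0.1):
    """InfoNCE over feature-prototype similarities: minimizing this shrinks
    within-class variation and grows between-class separation."""
    loss = 0.0
    for f, l in zip(features, labels):
        logits = [_cos(f, p) / tau for p in prototypes]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        loss += -(logits[l] - log_z)
    return loss / len(features)
```

In the bidirectional variant, the same loss would also be applied with the roles swapped (prototypes contrasted against feature batches), and pseudo-labels would stand in for the missing target-domain labels.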

Citations: 0
Fully Automated Glioblastoma Segmentation and Classification in Multispectral Magnetic Resonance Images Based on Level Set and Deep Neural Network
IF 2.5 · CAS Tier 4, Computer Science · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-10-06 · DOI: 10.1002/ima.70198
Asieh Khosravanian, Mehrzad Lotfi, Saeed Mozaffari, Saeed Ayat, Ali Reza Safarpour

Automatic segmentation and classification methods of glioblastomas in magnetic resonance imaging (MRI) scans are essential to overcome the limitations of error-prone manual methods, especially given intrinsic challenges such as intensity nonuniformity, diverse anatomical brain alterations, and significant variations in tumor shape, size, and location. These complexities pose major hurdles for radiologists in diagnosis and surgical planning, underscoring the critical importance of robust automated solutions. This study presents a novel fully automated approach for segmentation and classification of glioblastoma brain tumors using multi-spectral MRI data. Our proposed framework integrates two key steps. In the first step, a new level set method is presented for segmentation, enhanced by super-pixel fuzzy entropy-based clustering (a technique designed to handle image inhomogeneities and noise), density peak clustering, and a lattice Boltzmann solver for efficient contour evolution. In the second step, a VGG-16 deep neural network is employed for precise classification. To assess the capability of the proposed method in both segmentation and classification tasks, real T2-weighted and fluid-attenuated inversion recovery magnetic resonance images of glioblastomas from the BraTS 2020 dataset are used simultaneously in a multi-spectral manner. Our segmentation results were evaluated using the Dice coefficient, Jaccard index, sensitivity, specificity, and running time; the mean values (mean ± standard deviation) of these metrics are 0.8915 ± 0.0293, 0.8055 ± 0.0478, 0.9535 ± 0.0644, 0.9910 ± 0.0364, and 2.2909 ± 0.2597, respectively. Additionally, the average values of accuracy, precision, recall, and F1-score across the fivefold cross-validation of the classification method are 0.9149, 0.9532, 0.9160, and 0.9349, respectively.
According to the experiments, our proposed fully automated framework not only achieves superior performance in simultaneous segmentation and classification compared to other state-of-the-art segmentation methods but also offers a robust and efficient solution for clinical applications. While this study demonstrates strong potential, future work will focus on extending the framework for multi-label segmentation of different tumor sub-regions and validating its efficacy on even larger and more diverse clinical datasets.
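The Dice coefficient and Jaccard index reported above are standard overlap metrics between a predicted and a ground-truth mask; on flattened binary masks they reduce to a few lines:

```python
def dice(pred, gt):
    """Dice coefficient: 2|P ∩ G| / (|P| + |G|) for flat binary masks."""
    inter = sum(p and g for p, g in zip(pred, gt))
    return 2.0 * inter / (sum(pred) + sum(gt))

def jaccard(pred, gt):
    """Jaccard index (IoU): |P ∩ G| / |P ∪ G| for flat binary masks."""
    inter = sum(p and g for p, g in zip(pred, gt))
    union = sum(p or g for p, g in zip(pred, gt))
    return inter / union

# Half-overlapping masks: Dice = 0.5, Jaccard = 1/3.
print(dice([1, 1, 0, 0], [1, 0, 1, 0]))
print(jaccard([1, 1, 0, 0], [1, 0, 1, 0]))
```

The two metrics are monotonically related (Dice = 2J / (1 + J)), which is why the paper's Dice of 0.8915 and Jaccard of 0.8055 move together.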

Fully Automated Glioblastoma Segmentation and Classification in Multispectral Magnetic Resonance Images Based on Level Set and Deep Neural Network, by Asieh Khosravanian, Mehrzad Lotfi, Saeed Mozaffari, Saeed Ayat, and Ali Reza Safarpour. International Journal of Imaging Systems and Technology 35(6). IF 2.5, CAS Tier 4 (Computer Science). Pub Date: 2025-10-06. DOI: 10.1002/ima.70198.
Citations: 0
Explainable Attention-Enhanced Approach for Multimodal Breast Cancer Diagnosis Across Diverse Imaging Modalities
IF 2.5, CAS Tier 4 (Computer Science), JCR Q2 (Engineering, Electrical & Electronic). Pub Date: 2025-10-03. DOI: 10.1002/ima.70209
Uzma Nawaz, Zubair Saeed, Hafiz Muhammad UbaidUllah, Farheen Mirza, Mirza Muzzamil

Early and accurate detection of breast cancer is critical for improving survival rates. This study presents a robust deep learning framework that integrates convolutional and attention-based modules to enhance feature extraction across various imaging modalities. The proposed model is evaluated on four benchmark breast cancer datasets: BreakHis (400×), INbreast, BUSI, and CBIS-DDSM, which capture variations in histopathological, mammographic, and ultrasound images. A stratified fivefold cross-validation strategy was adopted to ensure model generalizability. The proposed approach achieves outstanding classification performance, with accuracies of 98.75% on BreakHis, 99.12% on INbreast, 98.40% on BUSI, and 99.05% on CBIS-DDSM. These results consistently surpass those of traditional CNNs and recent baseline models, such as ResNet50, DenseNet121, EfficientNet-B0, and Vision Transformers, across all datasets. A detailed ablation study confirms the effectiveness of each component in the architecture. A computational cost analysis demonstrates that the proposed model achieves superior accuracy with reduced training epochs and competitive inference times.
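Stratified fivefold cross-validation, as adopted above, partitions the data so that each fold preserves the overall class proportions. In practice one would use scikit-learn's `StratifiedKFold`; a dependency-free sketch of the idea:

```python
from collections import defaultdict

def stratified_kfold(labels, k=5):
    """Partition sample indices into k folds, preserving per-class proportions."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for i, idx in enumerate(idxs):
            folds[i % k].append(idx)
    return folds

# Illustrative imbalance: 10 benign (0) and 5 malignant (1) samples.
# Each fold then receives 2 benign and 1 malignant sample.
labels = [0] * 10 + [1] * 5
folds = stratified_kfold(labels, k=5)
```

Stratification matters for the imbalanced class distributions typical of medical imaging datasets, since a plain random split could leave a fold with no malignant cases at all.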

International Journal of Imaging Systems and Technology 35(6), 2025-10-03. DOI: 10.1002/ima.70209. Open access.
Citations: 0
Low-Dose Computed Tomography Image Denoising Vision Transformer Model Optimization Using Space State Method
IF 2.5, CAS Tier 4 (Computer Science), JCR Q2 (Engineering, Electrical & Electronic). Pub Date: 2025-10-01. DOI: 10.1002/ima.70220
Luella Marcos, Paul Babyn, Javad Alirezaie

Low-dose computed tomography (LDCT) is widely used to promote reduction of patient radiation exposure, but the associated increase in image noise poses challenges for diagnostic accuracy. In this study, we propose a Vision Transformer (ViT)-based denoising framework enhanced with a State Space Optimizing Block (SSOB) to improve both image quality and computational efficiency. The SSOB upgrades the multihead self-attention mechanism by reducing spatial redundancy and optimizing contextual feature fusion, thereby strengthening the transformer's ability to capture long-range dependencies and preserve fine anatomical structures under severe noise. Extensive evaluations on randomized and categorized datasets demonstrate that the proposed model consistently outperforms existing state-of-the-art denoising approaches. It achieved the highest average SSIM (up to 6.10% improvement), PSNR values (36.51 ± 0.37 dB on randomized and 36.30 ± 0.36 dB on categorized datasets), and the lowest RMSE, surpassing recent CNN-transformer-based denoising hybrid models by approximately 12%. Intensity profile analysis further confirmed its effectiveness, showing sharper edge transitions and more accurate gray-level distributions across anatomical boundaries, closely aligning with ground truth and retaining subtle diagnostic features often lost in competing models. In addition to improved reconstruction quality, the SSOB-empowered ViT achieved notable computational gains. It delivered the fastest inference (0.42 s per image), highest throughput (2.38 images/s), lowest GPU memory usage (750 MB), and smallest model size (7.6 MB), alongside one of the shortest training times (6.5 h). Compared to legacy architectures, which required up to 16 h of training and substantially more resources, the proposed model offers both accuracy and deployability. 
Collectively, these findings establish the SSOB as a key component for efficient transformer-based LDCT denoising, addressing memory and convergence challenges while preserving global contextual advantages.
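PSNR, the headline metric above, is derived from the mean squared error between the denoised output and the reference (normal-dose) image. A minimal sketch, assuming 8-bit pixel data:

```python
import numpy as np

def psnr(ref, test, data_range=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

ref = np.full((64, 64), 128.0)
noisy = ref + np.ones_like(ref)  # uniform error of 1 gray level -> MSE = 1
value = psnr(ref, noisy)         # 20*log10(255) ≈ 48.13 dB
```

Higher is better: the reported ~36.5 dB means the squared reconstruction error is roughly 15× larger than in the 48 dB toy case above, a typical range for LDCT denoising.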

International Journal of Imaging Systems and Technology 35(6), 2025-10-01. DOI: 10.1002/ima.70220. Open access.
Citations: 0
Problem-Oriented Strategy for Diabetic Retinopathy Identification
IF 2.5, CAS Tier 4 (Computer Science), JCR Q2 (Engineering, Electrical & Electronic). Pub Date: 2025-09-25. DOI: 10.1002/ima.70216
Mahdi Hadef, Said Yacine Boulahia, Abdenour Amamra

Diabetic retinopathy is a prevalent and sight-threatening complication of diabetes that affects individuals worldwide. Effectively addressing this condition requires adapting approaches to the specific characteristics of retinal images. Existing works often tackle the diagnostic challenge without focusing on a specific aspect. In contrast, our study introduces a new problem-oriented strategy that addresses key gaps in diabetic retinopathy using three novel, tailored approaches. First, to address the underexploitation of high-resolution retinal images, we propose a resolution-preserving, data-based approach that employs patch-based analysis without downscaling while also mitigating data scarcity and imbalance. Second, inspired by real-world clinical practice, we develop a symptoms-based approach that explicitly segments multiple key pathological indicators (blood vessels, exudates, and microaneurysms) and then uses them to guide the classification network. Third, we propose a hierarchical approach that decomposes the multi-stage classification task into multiple hierarchical binary classifications, enabling more specialized feature learning and informed decision-making across different severity levels. Evaluations on both EyePACS and APTOS benchmark datasets showcased superior performance, surpassing or matching contemporary state-of-the-art results. These outcomes demonstrate the effectiveness of our proposed approaches and underscore the strategy's potential to improve diabetic retinopathy diagnosis.
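The resolution-preserving approach described above analyzes full-resolution retinal images tile by tile instead of downscaling them. A minimal sketch of non-overlapping patch extraction (the 64-pixel patch size and stride are illustrative assumptions, not values from the paper):

```python
import numpy as np

def extract_patches(img, patch=64, stride=64):
    """Slice an image into patches; stride == patch gives non-overlapping tiles."""
    h, w = img.shape[:2]
    return [img[y:y + patch, x:x + patch]
            for y in range(0, h - patch + 1, stride)
            for x in range(0, w - patch + 1, stride)]

retina = np.zeros((256, 256))      # stand-in for a high-resolution fundus image
patches = extract_patches(retina)  # 4 x 4 = 16 patches of 64 x 64
```

Patch-level analysis also mitigates data scarcity, since each labeled image yields many training samples; an overlapping stride (stride < patch) would increase that count further.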

International Journal of Imaging Systems and Technology 35(5), 2025-09-25. DOI: 10.1002/ima.70216.
Citations: 0
FANet: Feature Aggregation Network With Dual Encoders for Fundus Retinal Vessel Segmentation
IF 2.5, CAS Tier 4 (Computer Science), JCR Q2 (Engineering, Electrical & Electronic). Pub Date: 2025-09-23. DOI: 10.1002/ima.70213
Linfeng Kong, Yun Wu

Fundus retinal vessel segmentation is important for assisting the diagnosis and monitoring of related ophthalmic diseases. Because fundus retinal vessels exhibit both a locally complex topology (e.g., branching structures) and a globally wide-area distribution, a segmentation method must capture local detail and global context simultaneously and fuse the two fully. To this end, this paper proposes a feature aggregation network (FANet) with dual encoders for fundus retinal vessel segmentation. First, we employ a convolutional neural network (CNN) and a Transformer to construct dual-path encoders that extract local detail information and global context information, respectively. Within the Transformer block, to enhance the feature expression ability of the feed-forward network (FFN), we design a feature-optimized FFN (F3N). Next, we introduce a dual path feature aggregation (DPFA) module to fully fuse the features extracted from the CNN and Transformer paths. Finally, we introduce a multi-scale feature aggregation (MFA) module to obtain rich multi-scale information and adapt to the scale variation of vessels. Experimental results on the CHASE-DB1, DRIVE, and STARE datasets demonstrate that FANet outperforms existing mainstream segmentation methods in a comprehensive comparison across multiple evaluation metrics, verifying its effectiveness.
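The DPFA module above fuses CNN and Transformer feature maps; the paper's exact design is not reproduced here, but the simplest form of such dual-path fusion is channel concatenation followed by a learned 1×1 projection. A toy sketch with fixed averaging weights standing in for the learned projection:

```python
import numpy as np

def fuse_dual_path(local_feat, global_feat):
    """Concatenate two (C, H, W) feature maps along channels, then project back to C."""
    c, h, w = local_feat.shape
    fused = np.concatenate([local_feat, global_feat], axis=0)  # (2C, H, W)
    # Hypothetical fixed weights; in a real network this is a trained 1x1 conv.
    proj = np.full((c, 2 * c), 1.0 / (2 * c))
    return (proj @ fused.reshape(2 * c, -1)).reshape(c, h, w)

local = np.ones((4, 8, 8))            # CNN path: local detail features
global_ = 3.0 * np.ones((4, 8, 8))    # Transformer path: global context features
out = fuse_dual_path(local, global_)  # each element is the channel mean, 2.0
```

The point of the sketch is only the data flow: both paths contribute every output channel, so gradients reach both encoders during training.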

International Journal of Imaging Systems and Technology 35(5), 2025-09-23. DOI: 10.1002/ima.70213.
Citations: 0