Title: MHWT: Wide-range attention modeling using window transformer for multi-modal MRI reconstruction
Authors: Qiuyi Han, Hongwei Du
DOI: 10.1016/j.mri.2025.110362
Journal: Magnetic Resonance Imaging, Vol. 118, Article 110362 (JCR Q2, Radiology, Nuclear Medicine & Medical Imaging; IF 2.1)
Publication date: 2025-02-21
Full text: https://www.sciencedirect.com/science/article/pii/S0730725X25000463
Code: https://github.com/EnieHan/MHWT
Citations: 0
Abstract
The Swin Transformer, with its window-based attention mechanism, demonstrates strong feature modeling capabilities. However, it struggles with high-resolution feature maps due to its fixed window size, particularly when capturing long-range dependencies in magnetic resonance image reconstruction tasks. To overcome this, we propose a novel multi-modal hybrid window attention Transformer (MHWT) that introduces a retractable attention mechanism combined with shape-alternating window design. This approach expands attention coverage while maintaining computational efficiency. Additionally, we employ a variable and shifted window attention strategy to model both local and global dependencies more flexibly. Improvements to the Transformer encoder, including adjustments to normalization and attention score computation, enhance training stability and reconstruction performance. Experimental results on multiple public datasets show that our method outperforms state-of-the-art approaches in both single-modal and multi-modal scenarios, demonstrating superior image reconstruction ability and adaptability. The code is publicly available at https://github.com/EnieHan/MHWT.
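To make the window-based attention idea concrete, below is a minimal sketch of the two partitioning steps the abstract builds on: splitting a feature map into non-overlapping windows (within which attention is computed) and the shifted variant that lets border tokens attend across old window boundaries. This is an illustrative NumPy sketch of the generic Swin-style mechanism, not the authors' MHWT implementation; the function names and the half-window shift size are assumptions. Note that rectangular `win_h != win_w` windows hint at how a shape-alternating design can vary attention coverage without changing the token count per window.

```python
import numpy as np

def window_partition(x, win_h, win_w):
    """Split an (H, W, C) feature map into non-overlapping windows.

    Returns (num_windows, win_h * win_w, C); self-attention is then
    computed independently inside each window of win_h * win_w tokens.
    """
    H, W, C = x.shape
    assert H % win_h == 0 and W % win_w == 0, "map must tile evenly"
    x = x.reshape(H // win_h, win_h, W // win_w, win_w, C)
    x = x.transpose(0, 2, 1, 3, 4)  # group rows/cols of windows together
    return x.reshape(-1, win_h * win_w, C)

def shifted_window_partition(x, win_h, win_w):
    """Cyclically shift the map by half a window before partitioning,
    so tokens near the old window borders fall into shared windows."""
    shifted = np.roll(x, shift=(-(win_h // 2), -(win_w // 2)), axis=(0, 1))
    return window_partition(shifted, win_h, win_w)

# Toy example: an 8x8 single-channel map with 4x4 windows -> 4 windows
# of 16 tokens each; a 2x8 "flat" window gives the same token count
# with a different (wider) attention shape.
feat = np.arange(8 * 8, dtype=np.float32).reshape(8, 8, 1)
print(window_partition(feat, 4, 4).shape)   # (4, 16, 1)
print(window_partition(feat, 2, 8).shape)   # (4, 16, 1)
print(shifted_window_partition(feat, 4, 4).shape)  # (4, 16, 1)
```

Alternating square and rectangular windows across layers, as sketched by the two calls above, is one way to widen attention coverage while the per-window attention cost stays fixed at `(win_h * win_w)^2`.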
Journal description:
Magnetic Resonance Imaging (MRI) is the first international multidisciplinary journal encompassing physical, life, and clinical science investigations as they relate to the development and use of magnetic resonance imaging. MRI is dedicated to basic research, technological innovation, and applications, providing a single forum for communication among radiologists, physicists, chemists, biochemists, biologists, engineers, internists, pathologists, physiologists, computer scientists, and mathematicians.