Pub Date : 2025-12-23 DOI: 10.1016/j.infrared.2025.106352
Jie Liu, Linhan Li, Yuhan Li, Qingwu Duan, Juan Yue, Shijing Hao, Sili Gao
Semantic segmentation is vital for applications like autonomous driving and smart cities. Infrared imaging provides complementary information in complex environments, improving scene understanding. However, existing datasets rarely include multiple spectral bands simultaneously. This paper introduces a novel street-scene semantic segmentation dataset containing four spectral bands: visible, short-wave infrared, mid-wave infrared, and long-wave infrared. The data were collected using a custom multi-band camera equipped with high-resolution sensors. Mid-wave and long-wave images were captured using cooled infrared detectors, allowing high dynamic range imaging and enhanced sensitivity to fine details. Semantic annotations were created through lightweight preprocessing and manual labeling, covering three common object classes in street scenes. We evaluate multiple mainstream segmentation models using different band combinations. Results show that fusing all four bands significantly improves segmentation accuracy. This dataset provides a valuable resource for advancing multi-band image segmentation in challenging real-world scenarios.
Title : City-4Band: A four-band urban street scene dataset for semantic segmentation
Journal : Infrared Physics & Technology, Volume 153, Article 106352
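As a rough illustration of what "different band combinations" could cover in the evaluation described in the abstract above, the sketch below enumerates every non-empty subset of the four bands. The abstract does not say which subsets the authors actually tested, and the short band labels (`VIS`, `SWIR`, `MWIR`, `LWIR`) and the function name `band_combinations` are assumptions made for this example.

```python
from itertools import combinations

# Band names come from the abstract; the short labels are our own shorthand.
BANDS = ("VIS", "SWIR", "MWIR", "LWIR")


def band_combinations(bands=BANDS):
    """Return every non-empty subset of bands, smallest subsets first."""
    return [
        combo
        for size in range(1, len(bands) + 1)
        for combo in combinations(bands, size)
    ]


combos = band_combinations()
print(len(combos))  # 15 non-empty subsets of four bands
print(combos[-1])   # the last entry is the full four-band input
```

With four bands there are 2^4 - 1 = 15 possible inputs, so an exhaustive comparison would already need 15 training runs per segmentation model.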
Pub Date : 2025-12-22 DOI: 10.1016/j.infrared.2025.106330
Tingting Yao, Meiwen Zhu, Xinyu Gu, Wanting Luo, Qing Hu
Infrared small target detection is widely used in both military and civil fields. Although numerous approaches have been proposed, detection accuracy is still limited by the poor resolution and insufficient detail of infrared images, so existing approaches often suffer from false and missed detections. This paper proposes a detailed information compensation and mask guided network to address this problem. First, to compensate for target information across different scales, a target detailed information compensation module is designed. It captures both local and non-local information of the target, so that the information loss caused by the upsampling operation during multi-scale feature fusion can be restored. Furthermore, a dynamic boundary information extraction module is designed. It extracts more high-frequency texture information of the target in the shallow layers and incorporates it into the deeper layers, compensating for the information loss caused by successive pooling operations. Finally, to further improve detection accuracy, target masks are generated from the ground truth, and a mask-based loss constraint is applied during network training. Qualitative and quantitative comparison experiments on the SIRST and ISATD datasets demonstrate that the proposed network achieves superior detection accuracy compared to state-of-the-art methods.
Title : DICMG-Net: Detailed information compensation and mask guided network for infrared small target detection
Journal : Infrared Physics & Technology, Volume 153, Article 106330
Pub Date : 2025-12-22 DOI: 10.1016/j.infrared.2025.106338
Shuai Qiu, Shenao Qin, Ying Cao, Shulin Han, Qikai Wang, Yuzhi Song, Chuan-Kui Wang, Lei Cai
Smaller ionization energy and electronegativity allow the emission of eco-friendly Sn-based perovskite light-emitting diodes (PeLEDs) to reach the near-infrared range, which has a wide range of applications in night vision, biomedicine, and communications. Nevertheless, the ready oxidation of Sn²⁺ and the rapid crystallization of Sn-based perovskites lead to poor film quality and thus to diminished efficiency in tin-based PeLEDs. We developed effective near-infrared (NIR) PeLEDs based on FA0.875Cs0.125SnI3 by using D-serine benzyl ester hydrochloride (D-SBEHC), which has significant steric hindrance and multifunctional groups, as an additive. The hydrogen bonding and coordination between D-SBEHC and FA0.875Cs0.125SnI3 effectively slow the crystallization of the perovskite, inhibit the oxidation of Sn²⁺, and suppress defect formation. The perovskite films modified by D-SBEHC exhibit a nearly threefold increase in photoluminescence quantum yield. Finally, we fabricated an efficient and stable PeLED with an emission peak at 903 nm, an external quantum efficiency (EQE) of 3.99% (eight times that of the control device), and a maximum radiance of 31 W/sr/m².
Title : Multifunctional additive collaborative strategy for efficient near-infrared tin-based perovskite light-emitting diodes
Journal : Infrared Physics & Technology, Volume 153, Article 106338
Pub Date : 2025-12-22 DOI: 10.1016/j.infrared.2025.106350
Yingjue Cao, Chengjin Wu, Xiangjun Li, Le Zhang, Jining Li, Dexian Yan
With the rapid advancement of artificial intelligence, deep learning offers an efficient solution for the design of complex metamaterials. This study proposes a design framework for two-dimensional terahertz metamaterial absorbers based on deep learning-surrogate optimization. A convolutional neural network is developed to encode metamaterial structures as 6 × 6 × 1 grayscale images and predict their absorption spectra at 251 uniformly spaced frequency points within the 12–15 THz range. The model achieves high prediction accuracy, with a loss value of 0.0312 and root mean square error of 0.249 on both training and test sets. To enable inverse design, a single-objective optimization model is constructed and integrated with a surrogate optimization algorithm. The optimization is performed by categorizing structures based on the number of metal blocks, systematically exploring all possible configurations, and iteratively identifying the optimal solution. The predicted absorption performance shows strong agreement with full-wave simulation results, confirming the model’s reliability. By integrating deep learning with surrogate optimization, this approach forms a closed-loop framework for both forward prediction and inverse design. It significantly reduces the computational cost of parameter tuning and enables a scalable, automated design process for terahertz metamaterials, offering a powerful strategy for advanced electromagnetic device development.
Title : A deep learning-surrogate optimization strategy for the design of two-dimensional terahertz metamaterial absorbers
Journal : Infrared Physics & Technology, Volume 153, Article 106350
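To make the numbers in the abstract above concrete, the following sketch builds the 251-point frequency grid over 12–15 THz and defines the root mean square error used to compare a predicted spectrum with a reference one. It is a minimal illustration only: the function names `frequency_grid` and `rmse` are assumptions, and the abstract does not state whether the authors compute the error per spectrum or over the whole data set.

```python
import math


def frequency_grid(start_thz=12.0, stop_thz=15.0, points=251):
    """Uniformly spaced frequencies in THz, both endpoints included."""
    step = (stop_thz - start_thz) / (points - 1)
    return [start_thz + i * step for i in range(points)]


def rmse(predicted, target):
    """Root mean square error between two equal-length spectra."""
    if len(predicted) != len(target):
        raise ValueError("spectra must have the same length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(target))


grid = frequency_grid()
print(len(grid), round(grid[1] - grid[0], 6))  # 251 0.012
```

A grid of 251 points over a 3 THz span gives 250 intervals of 0.012 THz each, which is the resolution at which the network's output vector samples the absorption spectrum.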
Pub Date : 2025-12-21 DOI: 10.1016/j.infrared.2025.106335
Zahra Sofastaei, Akbar Eshaghi, Hossein Jamali, Hossien Zabolian
In this work, germanium-carbon (Ge₁₋ₓCₓ) coatings were deposited on zinc sulfide substrates using magnetron sputtering with Ar and CH₄ gases as precursors. The chemical bonding and optical properties of these films were investigated as a function of substrate temperature (Tₛ) in the range of 150 °C to 300 °C. Fourier Transform Infrared (FTIR) spectroscopy, X-ray Diffraction (XRD), Raman Spectroscopy (RS), Field Emission Scanning Electron Microscopy (FESEM), and environmental tests were employed to evaluate and characterize the coatings. It was found that the coatings possessed both amorphous and crystalline structures, and as the temperature increased, the coating structure shifted towards a more crystalline form. Structural investigations revealed that the Ge₁₋ₓCₓ coatings are a composite material consisting of germanium and carbon. The intensity of the C–C and Ge–Ge bonds increased from approximately 1400 to 4000 (a.u.) with rising temperature. Furthermore, the coatings exhibited a smooth surface free from any surface cavities. Additionally, the transmittance percentage of the coatings decreased by 7 % with increasing substrate temperature.
Title : Effect of deposition temperatures on properties of germanium-carbon coatings prepared by a RF reactive magnetron sputtering method
Journal : Infrared Physics & Technology, Volume 153, Article 106335
Pub Date : 2025-12-21 DOI: 10.1016/j.infrared.2025.106321
Xiong Li, Xinlin Xiong, Yawen Guo, Wenwei Wang, Bojin Yang, Xiangguo He, Yande Liu
Machine vision or spectral analysis alone cannot simultaneously detect surface defects and internal soluble solids content (SSC) in kumquats, limiting efficient postharvest quality assessment. This study integrated hyperspectral imaging (HSI), two-dimensional correlation spectroscopy (2D-COS), and optimized segmentation/chemometric models to bridge this gap. Using 287 kumquat samples (normal/rotten/bruised/green), HSI (400–1000 nm) was acquired with black-white correction and ROI extraction. 2D-COS qualitatively differentiated defect types via synchronous-asynchronous spectral contours, revealing distinct biochemical pathways (e.g., pectin degradation in rot, flavonoid synthesis in bruising). An improved Morphological-Canny Segmentation (IMS) algorithm, combining PCA preprocessing and marker correction, achieved 95 % defect detection accuracy, outperforming Otsu (90 %) and Watershed (91.7 %). For SSC prediction, PLS/LS-SVR models used SG/StandardScaler-preprocessed spectra, with 30 characteristic wavelengths selected via CARS/UVE/SPA intersection. The LS-SVR model (SG + StandardScaler) yielded optimal performance: R² = 0.888, RMSEP = 1.529 (test set) and R² = 0.960, RMSEP = 0.696 (validation with 108 normal samples). This work demonstrates HSI’s feasibility for simultaneous surface defect and internal SSC detection in kumquats, providing a reliable tool for small-fruited citrus postharvest grading and quality control.
Title : Simultaneous detection of surface defects and prediction of internal SSC of kumquats based on hyperspectral imaging technology
Journal : Infrared Physics & Technology, Volume 153, Article 106321
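The abstract above mentions two pieces of arithmetic that are easy to sketch: keeping only the wavelengths chosen by all of the selection methods (the CARS/UVE/SPA intersection) and scoring SSC predictions with the coefficient of determination and RMSEP. The code below is a minimal illustration under our own assumptions. The function names and the toy numbers are hypothetical, the selection algorithms themselves are not implemented, and the authors' exact metric definitions are not given in the abstract.

```python
import math


def shared_wavelengths(*selections):
    """Wavelengths (nm) picked by every selection method, sorted ascending."""
    if not selections:
        return []
    common = set(selections[0])
    for chosen in selections[1:]:
        common &= set(chosen)
    return sorted(common)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


# Toy values for illustration only; they are not data from the paper.
print(shared_wavelengths([450, 520, 680], [520, 680, 900], [680, 520]))
print(r_squared([10.0, 12.0, 14.0], [10.5, 12.0, 13.5]))
print(rmsep([10.0, 12.0, 14.0], [10.5, 12.0, 13.5]))
```

Taking the intersection is the most conservative way to combine selectors: a wavelength survives only if every method agrees on it, which trades some coverage for a smaller, more stable feature set.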
Pub Date : 2025-12-21 DOI: 10.1016/j.infrared.2025.106346
Runxuan An, Yanyin Guo, Ziyu Wang, Zhuoyi Zhao, Chuiyi Deng, Junwei Li
In multispectral object detection, the complementary nature of infrared and visible images remains underexploited due to perceptual differences and spatial misalignment between modalities, posing significant challenges to accurate detection. We propose MRT-DETR (Multispectral RT-DETR), an end-to-end multispectral real-time detection framework that addresses weak alignment and complex lighting conditions by combining brightness-aware weighting with deformable alignment. Built on the RT-DETR backbone, our method introduces an Early-stage Deformable Alignment (EDA) module that learns attention-guided offsets at shallow layers to explicitly align infrared features to visible features in the spatial domain. Additionally, a Dual-Branch Brightness Weighting (DBW) module derives patch-wise fusion preference maps from global and local illumination cues of the visible image, enabling spatially adaptive modality selection. Furthermore, we design a Hierarchical Cross-modal Attention Fusion (HCAF) module, which performs progressive, stage-wise cross-attention and self-attention to refine joint representations and enhance discriminative cues. Extensive experiments on LLVIP, FLIR, and M3FD demonstrate that MRT-DETR achieves substantial improvements in detection accuracy with a modest parameter count, maintaining robust performance under weakly aligned conditions and challenging illumination. Codes and data are available at https://github.com/arx48/MRT-DETR.git.
Title : MRT-DETR: A robust visible–infrared object detector with adaptive cross-modal feature fusion
Journal : Infrared Physics & Technology, Volume 153, Article 106346
Pub Date : 2025-12-21 DOI: 10.1016/j.infrared.2025.106342
Pengfei Xu, Gang Luo, Jinping Liu, Xinyu Zhou, Dianyi Song
Infrared and visible image fusion (IVIF) aims to integrate complementary information from both modalities to generate informative visual representations that fully exploit their respective advantages. While visible images provide rich texture and color information, they are susceptible to lighting variations and environmental conditions. Conversely, infrared images excel at thermal radiation detection but typically lack fine-grained texture details. Existing methods often fuse infrared and visible features indiscriminately, failing to effectively address the fundamental differences between these modalities, which limits fusion quality. To address this limitation, we propose DIFuse, a dual-domain interactive fusion network that decomposes features into global and local domains. The Cross-Modal Complementary Module (CCM) performs this decomposition to reduce the modality gap. In the global domain, the Cross-Modal Interactive Attention (CMIA) mechanism with adaptive weighting enhances semantic understanding and global interactions. In the local domain, the Spatial Context Adaptive Attention (SCAA) module integrates multi-scale features with directional information to improve local detail preservation. Furthermore, the Progressive Feature Perception Module (PFPM) enriches semantic representation, and the Information Compensation Module (ICM) ensures comprehensive multi-level information preservation. Extensive experiments on three public datasets (MSRS, RoadScene, TNO) demonstrate that DIFuse achieves superior performance in both quantitative metrics and visual quality. Moreover, the enhanced fusion quality shows strong potential for real-world applications, such as pedestrian detection, directly improving downstream task performance.
Title : DIFuse: Dual-Domain interactive fusion of Cross-Modality infrared and visible images with decomposition deep network
Journal : Infrared Physics & Technology, Volume 153, Article 106342
Pub Date : 2025-12-21 DOI: 10.1016/j.infrared.2025.106347
Chao Tan, Bin Cheng, Hui Chen, Zan Lin
Gastrodia elata is not only a valuable Chinese medicine but also an important health food. Wild Gastrodia elata often sells for 3–5 times the price of cultivated material. Some unscrupulous dealers pass off cultivated Gastrodia elata as wild to gain illegal profits, so a rapid method for identifying wild Gastrodia elata is needed. Because wild samples are scarce, the collected data set is usually imbalanced, which makes it challenging to build a robust and accurate predictive model by data-driven methods. This work explores the feasibility of near-infrared (NIR) spectroscopy integrated with virtual sample-based ensemble modeling for identifying wild Gastrodia elata. Partial least squares-discriminant analysis (PLS-DA) is used as the algorithm for constructing predictive models. To mitigate class imbalance, two algorithms, the Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN), are used for virtual sample generation. A fixed test set containing 22 wild samples and 48 cultivated samples was used for testing and comparing the models. Three minority-class training sample sizes (6, 14 and 22) were considered. The experimental results indicate that the proposed scheme improves prediction, and that ADASYN performs better than SMOTE, with accuracies of 85.7 %, 90 % and 97.1 % for the three cases, respectively. Also, the lower the class imbalance of the original training samples, the more obvious the improvement. The robustness of the proposed scheme is also analyzed. The proposed scheme is a good reference for NIR-based applications with class imbalance.
{"title":"Application of virtual sample-based ensemble strategy to enhance the spectral recognition of wild Gastrodia elata","authors":"Chao Tan , Bin Cheng , Hui Chen , Zan Lin","doi":"10.1016/j.infrared.2025.106347","DOIUrl":"10.1016/j.infrared.2025.106347","url":null,"abstract":"<div><div>Gastrodia elata is not only a valuable Chinese medicine but also an important food for health care. The price of Wild Gastrodia elata is often 3–5 times that of cultivated ones. Some unscrupulous dealers often play off cultivated gastrodia elata as wild ones wild for gaining illegal profits. It is necessary to quickly identify wild gastrodia elata. Given the scarcity of wild samples, an unbalanced data set is usually collected and it is therefore challenging to build a robust and accurate predictive model by data-driven methods. This work explores the feasibility of near-infrared (NIR) spectroscopy integrated with virtual sample-based ensemble modeling for realizing the identification of wild Gastrodia elata. Partial least square-discriminant analysis (PLS-DA) is used as the algorithm for constructing predictive models. To mitigate class imbalance, two algorithms including Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN) are used for virtual sample generation. A fixed test set containing 22 wild samples and 48 cultivated samples was used for testing and comparison of various models. Sample sizes of the minority class for three cases (6, 14 and 22) were considered for training models. The experimental result indicates that the proposed scheme can produce improved prediction, and the ADASYN performs better than the SMOTE, with an accuracy of 85.7 %, 90 % and 97.1 % for three cases, respectively. Also, the lower the class imbalance of original training samples, the more obvious the improved effect is. The robustness of the proposed is always analyzed. 
The proposed scheme is a good reference for NIR-based applications with class imbalance.</div></div>","PeriodicalId":13549,"journal":{"name":"Infrared Physics & Technology","volume":"153 ","pages":"Article 106347"},"PeriodicalIF":3.4,"publicationDate":"2025-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145836909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-19 DOI: 10.1016/j.infrared.2025.106340
Xiangyu Jiang, Xinyu Chen, Jing Zhang, Jingliang Liu, Yongji Yu, Guangyong Jin
This study designed a compact eight-pass folded resonant cavity based on a rectangular prism. Through the combined action of the rectangular prism and a reflecting mirror, the beam passed through the Tm:YLF gain medium eight times within a cavity length of 670 mm, enhancing mode selectivity and pump energy utilization efficiency. A laser output of 6.05 W was obtained at a pump power of 90 W, with a slope efficiency of 27.6 %. The beam quality factors were M_y² = 1.59 on the fast axis and M_x² = 1.68 on the slow axis, providing an effective and feasible optimization approach for developing high-power, high-beam-quality mid-infrared lasers.
{"title":"Research on beam quality optimization of Tm: YLF laser with eight-pass folded cavity based on rectangular prism","authors":"Xiangyu Jiang , Xinyu Chen , Jing Zhang , Jingliang Liu , Yongji Yu , Guangyong Jin","doi":"10.1016/j.infrared.2025.106340","DOIUrl":"10.1016/j.infrared.2025.106340","url":null,"abstract":"<div><div>This study designed a compact eight-pass folded resonant cavity structure based on a rectangular prism. Through the synergistic cooperation of rectangular prism and reflection mirror, the beam passed through the Tm: YLF gain medium eight times within a cavity length of 670 mm, enhancing mode selectivity and pump energy utilization efficiency. A 6.05 W laser output was obtained at a pump power of 90 W, with a slope efficiency of 27.6 %. The fast axis beam quality was <em>My<sup>2</sup></em> = 1.59, and the slow axis beam quality was <em>Mx<sup>2</sup></em> = 1.68, providing an effective and feasible optimization technology solution for the development of high-power and high beam quality mid infrared lasers.</div></div>","PeriodicalId":13549,"journal":{"name":"Infrared Physics & Technology","volume":"153 ","pages":"Article 106340"},"PeriodicalIF":3.4,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145836408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}