Pub Date: 2026-01-10
DOI: 10.1016/j.infrared.2026.106384
Sheng-hui Rong, Wang Zi-ming, Gao Xue-zhen, Zhao Wen-feng, Wu Xu-peng, Zhang Tao
Infrared small target detection is a crucial technique in the field of computer vision. With the advancement of deep learning, Convolutional Neural Network (CNN)-based methods have achieved promising results in target detection. However, due to the small size of the targets, relying solely on dense convolutional layers may lead to information loss. To address the issue of inaccurate background prediction in complex backgrounds, we propose an end-to-end infrared background prediction method based on a conditional diffusion model with an adaptive blocking strategy. On one hand, the adaptive blocking strategy effectively integrates both local and global information from the infrared image while significantly accelerating the inference speed of the diffusion model. On the other hand, the multi-scale attention segmentation module can effectively suppress background clutter and enhance the target. We also created the IRDF (infrared for diffusion) dataset, comprising 23,378 images, to evaluate the detection performance of the proposed method and the comparison methods. Extensive experiments demonstrate that our approach detects targets precisely and performs effectively in various complex backgrounds.
Title: A novel diffusion-based background estimation for infrared dim small target detection
Journal: Infrared Physics & Technology, Volume 154, Article 106384
Pub Date: 2026-01-10
DOI: 10.1016/j.infrared.2026.106367
Huijie Zhu, Qingyuan Zhu, Kai Ding, Hao Tang, Yuchen Li
The combination of visible (RGB) and thermal infrared (TIR) data holds significant potential for all-day-all-night applications. However, research in this area is hampered by the limited size and demanding alignment requirements of multi-modal datasets. To address these challenges, we propose a generative adversarial network (GAN) to translate RGB data into TIR data, thereby significantly expanding the availability of RGBT data and alleviating the need for laborious alignment processes. Our method employs pixel-wise perceptual loss and a multi-scale architecture in the generator and discriminator, respectively, to ensure high-quality TIR data generation. Conditioned on the original RGB data, our model generates TIR data depicting the same scene, providing paired and aligned RGBT data that facilitates downstream tasks. Qualitative and quantitative analyses demonstrate the effectiveness of the generated RGBT data. In a questionnaire, participants found it difficult to distinguish between generated and real data. On the RGBT tracking task, methods trained with generated data performed comparably to those trained with real data, proving the utility and efficacy of our approach. Code is available at https://github.com/NJ587/RGB2TIR.
Title: Generative adversarial translation of RGB to thermal infrared images for enhanced multimodal data
Journal: Infrared Physics & Technology, Volume 154, Article 106367
Pub Date: 2026-01-10
DOI: 10.1016/j.infrared.2026.106368
Xianhui Yang, Jianfeng Sun, Xin Zhou, Le Ma, Wei Lu, Feng Liu, Jie Lu
The Gm-APD LiDAR can capture the three-dimensional structure of a target and has single-photon sensitivity, so it can respond to extremely weak light; however, this also makes the image quality highly susceptible to background noise. Based on the spatio-temporal distribution characteristics of single-photon lidar data, a depth image estimation method using the region growing method is proposed. Multiple distance values are extracted from the histogram to construct a point cloud, ensuring the detection rate of the echo signal. Based on the spatio-temporal distribution characteristics of the point cloud data, the two-dimensional Otsu threshold method is used to denoise the point cloud, and then the region growing method is used to obtain the depth image. Extensive simulations and experiments show that, under very low signal-to-background ratio (SBR) conditions, the proposed method using a small amount of data outperforms the sparse Poisson intensity reconstruction algorithm (SPIRAL) using more data. When the SBR is 0.004, the target recovery ratio of the proposed method reaches 79.3% with 0.05 s of data, which is 66.4% higher than that of the SPIRAL method. With 0.15 s of data, the recovery ratio of the proposed method reaches 91.5%, which is 79.5% higher than that of the SPIRAL method. The proposed method improves the noise suppression of the LiDAR system, greatly improves the integrity of the target, and provides a basis for long-distance weak target detection and recognition.
Title: Depth image reconstruction algorithm of Gm-APD LiDAR using the region growing method
Journal: Infrared Physics & Technology, Volume 154, Article 106368
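The region growing step named in the abstract above can be illustrated with a generic sketch. This is not the authors' algorithm: their method works on a point cloud denoised with a two-dimensional Otsu threshold, whereas the toy below grows a 4-connected region on a small 2D depth grid. The function name `region_grow`, the seed position, the tolerance `tol`, and the depth values are all hypothetical.

```python
from collections import deque

def region_grow(depth, seed, tol):
    """Grow a 4-connected region from `seed`, adding each neighbour whose depth
    differs from the pixel it was reached from by at most `tol`."""
    rows, cols = len(depth), len(depth[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(depth[nr][nc] - depth[r][c]) <= tol:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

# Hypothetical depth map in metres: a target near 10 m against a background near 50 m.
depth_map = [
    [50.0, 50.2, 49.9, 50.1],
    [50.1, 10.0, 10.1, 50.0],
    [49.8, 10.2, 10.1, 50.3],
    [50.0, 50.1, 49.9, 50.2],
]
target = region_grow(depth_map, seed=(1, 1), tol=0.5)
print(sorted(target))
```

With these values the grown region is the 2x2 block of target pixels, because every step across the target boundary changes depth by roughly 40 m, far above the tolerance.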
Pub Date: 2026-01-09
DOI: 10.1016/j.infrared.2026.106391
Bo Zhao, Weiqin Li, Yiwen Li, Puyousen Zhang, Peifeng Tan, Yao Li, Binbin Pei
Infrared emissivity is a critical parameter in thermal radiation and aerospace remote sensing. Traditional contact-based techniques, including the integrating sphere and calorimetric methods, are limited by low accuracy and the lack of directional information. Non-contact measurements typically rely on Lambertian and isotropic assumptions, making them inadequate for characterizing the directional properties of real materials. Recent learning-based BRDF models, including CNN, Transformer, and factorization-based architectures, improve angular fitting flexibility but still lack explicit physical constraints. As a result, they struggle to maintain stability, non-negativity, and hemispherical energy consistency under sparse directional sampling, motivating the comparison conducted in this work. To this end, this study proposes LORENet, a neural-network-based directional emissivity inversion method incorporating a dual-peak asymmetric physical prior. First, asymmetric broadening and multi-peak structures are used to capture complex directional distributions. Second, the parameter network leverages physical priors to generate parameter fields while enforcing constraints of non-negativity and energy conservation. Finally, the reconstructed bidirectional reflectance distribution function (BRDF) is integrated to derive reflectance and enable high-precision emissivity inversion. Results show that the proposed approach provides superior accuracy and directional resolution in modeling non-Lambertian rough surfaces, and offers strong practicality and broad potential for application.
Title: A method for anisotropic BRDF modeling and infrared emissivity prediction of non-Lambertian coatings
Journal: Infrared Physics & Technology, Volume 154, Article 106391
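The final step in the abstract above, integrating a BRDF to obtain reflectance and then emissivity, can be sketched numerically. This is a minimal illustration rather than LORENet itself. It assumes an opaque surface in thermal equilibrium, so that Kirchhoff's law gives directional emissivity as one minus the directional-hemispherical reflectance. The function names, the midpoint-rule grid sizes, and the Lambertian test BRDF with albedo 0.3 are assumptions made for this example.

```python
import math

def directional_hemispherical_reflectance(brdf, theta_i, phi_i, n_theta=200, n_phi=200):
    """Midpoint-rule integral of brdf * cos(theta_o) over the outgoing hemisphere."""
    d_theta = (math.pi / 2) / n_theta
    d_phi = (2 * math.pi) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta_o = (i + 0.5) * d_theta
        # cos(theta_o) projects the outgoing direction; sin(theta_o) is the solid-angle Jacobian.
        weight = math.cos(theta_o) * math.sin(theta_o) * d_theta * d_phi
        for j in range(n_phi):
            phi_o = (j + 0.5) * d_phi
            total += brdf(theta_i, phi_i, theta_o, phi_o) * weight
    return total

def directional_emissivity(brdf, theta_i, phi_i):
    """Kirchhoff's law for an opaque surface: emissivity = 1 - reflectance."""
    return 1.0 - directional_hemispherical_reflectance(brdf, theta_i, phi_i)

def lambertian(theta_i, phi_i, theta_o, phi_o, albedo=0.3):
    """Ideal diffuse BRDF, used only as a check case with a known answer."""
    return albedo / math.pi

eps = directional_emissivity(lambertian, theta_i=0.0, phi_i=0.0)
print(round(eps, 4))
```

The Lambertian case is useful as a sanity check because its reflectance equals the albedo analytically; a fitted anisotropic BRDF would be passed in place of `lambertian`.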
Pub Date: 2026-01-09
DOI: 10.1016/j.infrared.2026.106392
Zexiao Zheng, Yaohong Zhao, Wei Xiang
Low-frequency non-uniformity noise, caused by scene-independent stray thermal radiation incident on the infrared detector, is a common form of additive noise in infrared images. Its presence significantly degrades image quality and adversely affects subsequent image processing and analysis. Due to the complex and diverse origins of such radiation, low-frequency non-uniformity exhibits varying characteristics, while existing correction algorithms generally have limited generalization capability and suboptimal performance. To address this issue, a correction method based on gradient-domain weighted B-spline is proposed. Specifically, non-uniform B-splines are employed in the gradient domain with an adaptive knot placement strategy, which allows the density of B-spline knots to be flexibly adjusted across different regions for accurate fitting. Furthermore, an adaptive gradient-domain filter is designed to robustly extract low-frequency information, with adaptive parameters estimating the noise distribution and better suppressing edges and texture details. To further suppress residual high-frequency components, a weighting scheme based on a second-order derivative prior is incorporated into the model. Experimental results demonstrate that the proposed method achieves superior adaptability and robustness, effectively removing diverse low-frequency non-uniform noise.
Title: Infrared low-frequency non-uniformity correction method based on gradient-domain weighted B-spline
Journal: Infrared Physics & Technology, Volume 154, Article 106392
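The non-uniform B-splines mentioned in the abstract above rest on basis functions defined over an unevenly spaced knot vector. The sketch below evaluates them with the standard Cox-de Boor recursion and shows how denser knots can be placed in one region. It does not reproduce the paper's adaptive knot placement, gradient-domain fitting, or weighting scheme; the knot vector and function names are hypothetical.

```python
def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: value at t of the i-th B-spline basis function of degree k."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    span_left = knots[i + k] - knots[i]
    if span_left > 0:  # skip zero-width spans created by repeated knots
        left = (t - knots[i]) / span_left * bspline_basis(i, k - 1, t, knots)
    span_right = knots[i + k + 1] - knots[i + 1]
    if span_right > 0:
        right = (knots[i + k + 1] - t) / span_right * bspline_basis(i + 1, k - 1, t, knots)
    return left + right

def evaluate_spline(coeffs, degree, knots, t):
    """Spline value at t as a weighted sum of basis functions."""
    return sum(c * bspline_basis(i, degree, t, knots) for i, c in enumerate(coeffs))

# Clamped cubic knot vector with interior knots packed more densely near 0
# (hypothetical placement, standing in for a region that needs a finer fit).
degree = 3
knots = [0, 0, 0, 0, 0.1, 0.2, 0.3, 0.6, 1, 1, 1, 1]
n_basis = len(knots) - degree - 1

# The basis functions sum to 1 everywhere in [0, 1), so constant coefficients
# reproduce a constant bias field exactly.
partition = [
    sum(bspline_basis(i, degree, t, knots) for i in range(n_basis))
    for t in (0.05, 0.25, 0.5, 0.9)
]
flat = evaluate_spline([2.5] * n_basis, degree, knots, 0.42)
print(partition, flat)
```

In a correction pipeline of this kind, a fitted spline surface would play the role of the estimated additive low-frequency bias that is subtracted from the observed image.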
Pub Date: 2026-01-09
DOI: 10.1016/j.infrared.2026.106376
An Wang, Yu-Cun Zhang, Qun Li
This paper proposes an emissivity correction model based on the surface phonon-photon coupling (SPCC) system of aluminum alloy, aimed at addressing the complexity of emissivity variation with temperature and wavelength under high-temperature conditions. The model combines the SPCC system with a wavelength-weight optimization algorithm, considering the interaction between phonons and photons on the aluminum alloy surface, and accurately describes the impact of multi-band infrared radiation intensity on temperature measurement. By introducing a swarm optimization algorithm to optimize the wavelength weight function, the model adjusts the contribution of different bands to temperature measurement, significantly improving the infrared temperature measurement accuracy in the range of 300–500 °C. Experimental results demonstrate that, compared to traditional fixed emissivity models, this model reduces temperature measurement errors by more than 20.6 %, providing a crucial theoretical foundation and technical support for precise temperature control of high-temperature aluminum alloy ring forgings.
Title: Aluminium alloy emissivity correction model based on photon-phonon coupling and wavelength weight optimization
Journal: Infrared Physics & Technology, Volume 154, Article 106376
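To see why a fixed emissivity assumption introduces temperature error, the baseline the abstract above compares against, a single-band Planck inversion is enough. The sketch below is not the SPCC model or the wavelength-weight optimization. It only shows how a wrong emissivity biases the recovered temperature. The 4 µm band, the emissivity values 0.25 and 0.35, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def spectral_radiance(wavelength_m, temp_k, emissivity=1.0):
    """Planck spectral radiance of a grey emitter, W sr^-1 m^-3."""
    x = H * C / (wavelength_m * K_B * temp_k)
    return emissivity * 2.0 * H * C**2 / wavelength_m**5 / math.expm1(x)

def temperature_from_radiance(wavelength_m, radiance, emissivity):
    """Invert Planck's law for temperature, given an assumed emissivity."""
    scale = emissivity * 2.0 * H * C**2 / wavelength_m**5
    return H * C / (wavelength_m * K_B * math.log1p(scale / radiance))

true_temp = 673.15     # 400 degrees C, inside the 300-500 degrees C range in the abstract
wavelength = 4.0e-6    # hypothetical mid-wave infrared band
true_eps = 0.25        # hypothetical surface emissivity

measured = spectral_radiance(wavelength, true_temp, true_eps)
recovered = temperature_from_radiance(wavelength, measured, true_eps)
biased = temperature_from_radiance(wavelength, measured, 0.35)  # wrong fixed emissivity
print(recovered, biased)
```

Overestimating the emissivity makes the inversion attribute the measured radiance to a cooler surface, so `biased` falls below the true temperature; an emissivity model that tracks temperature and wavelength aims to remove that bias.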
Pub Date: 2026-01-07
DOI: 10.1016/j.infrared.2026.106389
Jinlin Jiang, Gang Hu, Guanglei Sheng, Guo Wei
Image fusion enhances complementary details and visual quality by integrating information from multiple modalities, thereby supporting more accurate decision-making in downstream tasks. While diffusion models show strong generative ability in fusion tasks, the absence of real-image supervision restricts their ability to capture local features. To address this, we propose a Dual-Task Directed Residual Denoising Diffusion Model (DTRDM) to better capture multi-scale diffusion features and enrich fused image content. First, we introduce two diffusion biases: “image residuals and pure noise” to guide forward diffusion in a goal-oriented manner. This strategy explicitly guides the inverse fusion process while reducing training complexity. Second, we design a noise prediction module based on a dual U-Net architecture, which generates residual or noise prediction features depending on the training objective. Multi-scale features are refined through cascading and iterative extraction, enabling the model to capture local details across modalities and enhance the fused representation. Finally, we introduce a color–structure-preserving composite loss for denoising, which strengthens feature complementarity across scales. Extensive experiments show that DTRDM achieves state-of-the-art results across key metrics with strong adaptability. It generalizes to diverse fusion tasks without retraining, and its results substantially improve decision-making in applications such as autonomous driving, traffic monitoring, and medical imaging.
Title: DTRDM: Dual-task directional residual denoising diffusion model for multimodal image fusion
Journal: Infrared Physics & Technology, Volume 154, Article 106389
Pub Date: 2026-01-07
DOI: 10.1016/j.infrared.2026.106388
Haiqing Yang, Xiaoyu Zhou, Xingyue Li, Lixin Peng, Yongqiang Yue
Deterioration pattern identification is the basis for studying the deterioration mechanism of stone cultural heritage and implementing protection measures. However, traditional photogrammetry-based methods for identifying heritage deterioration patterns exhibit excessive subjectivity, heavy reliance on surveyors’ experience, and low efficiency. To address this issue, this study proposes an intelligent recognition method for typical deterioration patterns based on hyperspectral imaging technology with Maijishan Grottoes as the research object. First, spectral data within the 400–1000 nm wavelength range were processed using Savitzky-Golay smoothing, normalization, and continuum removal to effectively enhance data quality. Next, feature wavelengths were selected through Competitive Adaptive Reweighted Sampling (CARS), Successive Projections Algorithm (SPA), and Random Frog Leaping Algorithm (RFLA) to reduce data redundancy. Subsequently, recognition models were constructed and trained based on these feature wavelengths, followed by a comparison of the performance of four different models in identifying typical deterioration patterns. Finally, the best-performing Random Forest (RF) model was applied to assess the overall deterioration distribution across the statues in the study area. The results demonstrate that the proposed recognition model can accurately identify deterioration patterns such as flaking, surface contamination, and salt crystallization. Additionally, the formation mechanisms of these deterioration patterns were analyzed, and corresponding conservation measures were proposed. This study provides an efficient and objective technical approach for the precise identification and quantitative analysis of deterioration patterns for stone cultural heritage.
Title: Non-contact deterioration patterns identification method of stone building heritage based on hyperspectral image technology
Journal: Infrared Physics & Technology, Volume 154, Article 106388
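Two of the preprocessing steps listed in the abstract above, Savitzky-Golay smoothing and normalization, can be sketched in a few lines. The study does not state its window length, polynomial order, or normalization scheme, so the 5-point quadratic window and min-max scaling used here are assumptions, as are the function names and the sample reflectance values. Continuum removal and feature-wavelength selection are not shown.

```python
# Classic 5-point Savitzky-Golay smoothing weights for a quadratic (or cubic) fit.
SG_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0

def savgol_5pt(spectrum):
    """Smooth interior points with the 5-point window; the two points at each end are left unchanged."""
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window)) / SG_NORM
    return smoothed

def min_max_normalize(spectrum):
    """Rescale a spectrum linearly to the range [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0 for _ in spectrum]
    return [(v - lo) / (hi - lo) for v in spectrum]

# A quadratic passes through the filter unchanged, which is the defining property
# of a quadratic Savitzky-Golay window.
quadratic = [0.5 * x * x - x + 2.0 for x in range(8)]
smoothed_quadratic = savgol_5pt(quadratic)

# Hypothetical noisy reflectance samples for one pixel.
noisy = [0.20, 0.26, 0.21, 0.30, 0.27, 0.35, 0.31, 0.40]
processed = min_max_normalize(savgol_5pt(noisy))
print(processed)
```

Because the filter fits a local polynomial instead of averaging, it reduces noise while keeping the shape of absorption features, which matters when feature wavelengths are selected afterwards.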
Pub Date: 2026-01-06
DOI: 10.1016/j.infrared.2026.106375
Lei Zhang, Jinsong Du, Jiakang Li, Chengyuan Li, Jiandong Zhang, Lianlian Wu
Early detection of mildew in tobacco leaves is essential for maintaining product quality. While hyperspectral imaging (HSI) offers a non-destructive alternative with rich spectral–spatial information, the high dimensionality of HSI and the complex characteristics of early mildew pose significant challenges for conventional deep learning approaches. In this article, we propose a novel multi-attention enhanced 3D Residual Convolutional Neural Network (3D-ResCNN) for early mildew detection of tobacco leaves using HSI data. First, the model employs 3D convolutions to simultaneously extract spatial and spectral features, while residual connections mitigate the vanishing gradient problem in deep networks. To improve mildew localization and spectral discrimination, a spatial–spectral attention module is integrated to selectively emphasize mildew-sensitive spatial regions and identify key spectral bands. Subsequently, a channel attention mechanism is introduced to adaptively reweight feature channels, thereby suppressing redundancy and emphasizing the most discriminative feature maps. Extensive experiments conducted on a real-world HSI tobacco dataset demonstrate that the proposed method achieves superior performance over traditional deep learning models in terms of accuracy and early-stage detection sensitivity, which validates the model’s effectiveness.
Title: Early detection of tobacco leaf mildew using multi-attention enhanced 3D residual convolutional Neural network with hyperspectral imaging. Journal: Infrared Physics & Technology, vol. 154, Article 106375. Publication date: 2026-01-06. Impact factor: 3.4.
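The channel reweighting step mentioned in the abstract above can be illustrated with a small sketch. The abstract does not say which form of channel attention the authors use, so this assumes the common squeeze-and-excitation pattern (global average pooling, a bottleneck layer, a sigmoid gate per channel). The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy feature values are all hypothetical and are not taken from the paper.

```python
import math


def channel_attention(features, w1, w2):
    """Reweight channels in the squeeze-and-excitation style (assumed form).

    features: list of C channels, each a flat list of activations
              (spatial and spectral positions flattened together).
    w1: hidden x C weight matrix; w2: C x hidden weight matrix.
    Both stand in for learned parameters and are fixed here for illustration.
    """
    # Squeeze: one descriptor per channel via global average pooling.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Reweight: scale every activation in a channel by that channel's gate.
    reweighted = [[g * v for v in ch] for g, ch in zip(gates, features)]
    return reweighted, gates


# Toy input: 3 channels with 2 positions each (hypothetical values).
features = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
w1 = [[1.0, 0.0, 0.0]]          # bottleneck with a single hidden unit
w2 = [[2.0], [0.0], [-2.0]]     # one output gate per channel
out, gates = channel_attention(features, w1, w2)
```

With these toy weights the first channel receives a gate close to 1 and the third a gate close to 0, which is the sense in which such a mechanism can suppress redundant channels and emphasize discriminative ones.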
Pub Date: 2026-01-05 DOI: 10.1016/j.infrared.2026.106374
Wenyi Sun, Yinji Chen, Zhiqi Gao, Tianyu Li, Xupu Chen, Yizhen Liu, Qiaoyun Wang
Dry matter content (DMC) is a key indicator of olive fruit quality, particularly of its suitability for oil extraction. Near-infrared (NIR) spectroscopy combined with machine learning methods is widely used to evaluate the DMC of olive fruit. In this paper, the extreme gradient boosting (XGBoost) algorithm, chosen for its efficiency, accuracy and flexibility, was used as a preprocessing method to enhance the predictive performance of DMC estimation models. The prediction results of XGBoost preprocessing combined with partial least squares (PLS) and Multi-Layer Perceptron (MLP) models were compared with those of other widely used preprocessing methods (D¹, D², MA, MSC, SG, SNV, WAVE). Experimental results showed that the XGBoost preprocessing method outperformed the other preprocessing methods in predictive accuracy, achieving lower root mean square error of cross-validation (RMSECV), root mean square error of prediction (RMSEP) and standard error of prediction (SEP), and a higher ratio of performance to deviation (RPD) and coefficient of determination (R²). Moreover, the XGBoost-MLP model performed better than the XGBoost-PLS model. Overall, XGBoost preprocessing achieved better fitting performance than the other preprocessing methods.
Title: NIRS regression model for dry matter content estimation in olive fruit with XGBoost pre-treatment method. Journal: Infrared Physics & Technology, vol. 154, Article 106374. Publication date: 2026-01-05. Impact factor: 3.4.
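The evaluation metrics named in the abstract above (RMSEP, SEP, RPD, R²) can be computed as in the following minimal sketch. Conventions vary between chemometrics papers and the abstract does not state which one the authors use. This version assumes SEP is the bias-corrected standard deviation of the prediction errors with an n − 1 denominator, and that RPD is the sample standard deviation of the reference values divided by SEP. RMSECV follows the same formula as RMSEP, applied to cross-validation predictions. The function name and the DMC values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSEP, SEP, RPD and R2 under the conventions stated above."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)

    # RMSEP: root mean square of the raw prediction errors.
    rmsep = math.sqrt(ss_res / n)

    # SEP: standard deviation of the errors after removing the mean bias.
    bias = sum(errors) / n
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))

    # R2: 1 minus residual sum of squares over total sum of squares.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    # RPD: spread of the reference values relative to SEP (assumed convention).
    sd_true = math.sqrt(ss_tot / (n - 1))
    rpd = sd_true / sep

    return {"RMSEP": rmsep, "SEP": sep, "RPD": rpd, "R2": r2}


# Hypothetical reference and predicted DMC values (percent), for illustration.
y_true = [40.0, 42.0, 44.0, 46.0, 48.0]
y_pred = [41.0, 41.0, 45.0, 45.0, 48.0]
metrics = regression_metrics(y_true, y_pred)
```

Lower RMSEP and SEP and higher RPD and R² indicate a better model, which is the direction of comparison used in the abstract.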