Pub Date : 2025-03-04 DOI: 10.1016/j.measurement.2025.117197
Kai Ma , Zhanqiang Liu , Yukui Cai , Bing Wang
Cutting deformation behaviors at the microscale provide important evidence for understanding fundamental cutting mechanisms. However, characterizing the microstructure of materials within narrow deformation zones during high-speed cutting presents significant challenges. In this work, a characterization methodology based on the digital image correlation (DIC) technique was developed to determine the microstructure evolution within the cutting deformation zones. High-speed in-situ visible and infrared image acquisition systems were used to capture gray and infrared image sequences during orthogonal cutting of Ti6Al4V. A deformation field reconstruction method based on the dislocation density-based (DDB) model was developed to derive the total dislocation density fields. The stored strain energy fields were then determined from the dislocation density fields. The generation process of serrated chips was investigated from macroscopic and microscopic perspectives to reveal the material removal mechanism during machining. This work provides a novel characterization methodology for investigating microstructure evolution under dynamic deformation conditions.
Title: In-situ DIC characterization of dislocation density and stored strain energy fields for deformation zones during cutting of Ti6Al4V alloy. Journal: Measurement, vol. 250, Article 117197.
Pub Date : 2025-03-04 DOI: 10.1016/j.measurement.2025.117179
Lei Zhao , Yunfeng Wang , Fanmin Bu , Pengfei Wang , Libin Tian , Caiwei Liu
This study proposes a corrosion grade prediction network for coated steel components under varying illumination conditions. Images of corroded steel components were captured under low illumination, ambient, and high illumination conditions by adjusting camera parameters in a field workshop. Three parallel enhanced Mobile-Vision-Transformer networks were developed to assess prediction performance for corrosion grades under different illumination conditions and two transfer learning approaches. Network weights were fused, incorporating a global average pooling layer and a convolution layer to enable direct corrosion grade prediction across varied illumination conditions. The impact of factors such as learning rate, input image size, and image augmentation technique on network performance was investigated. The interpretability of the network was enhanced using the gradient-weighted class activation mapping method. Furthermore, prediction accuracy was verified using images of corroded coated steel plates captured under diverse illumination conditions and corrosion grades from accelerated laboratory corrosion tests. Finally, a graphical user interface was designed for automated corrosion grade prediction in coated steel components under varying illumination conditions.
Title: Corrosion damage detection and evaluation of coated steel components under multiple illumination conditions. Journal: Measurement, vol. 250, Article 117179.
Pub Date : 2025-03-04 DOI: 10.1016/j.measurement.2025.117178
Shuo Shan, Yixin Ji, Jianhua Wang
To improve three-dimensional (3D) measurement efficiency, an efficient and robust absolute phase extraction algorithm that uses only four images and employs the Hilbert transform (HT) is proposed. First, because the HT does not filter the background intensity well, the background intensity is captured and subtracted from a high-frequency fringe; the result is used to generate a π/2 phase-shifted fringe, from which the high-frequency wrapped phase is calculated. Next, low-frequency sine and cosine fringe images are captured, and the low-frequency wrapped phase is extracted using a 2 + 1 phase-shifting algorithm (PSA). Finally, the absolute phase is robustly extracted using dual-frequency hierarchical temporal phase unwrapping (HTPU); the combination of PSA and HTPU ensures the reliability of the proposed method. The paper analyzes the effect of fringe tilt on the HT and gives corresponding improvement methods. It also explores the noise immunity of the proposed algorithm and investigates the possibility of further reducing the number of frames. Simulations and experiments verify the effectiveness of the method. Compared with the currently most efficient method combining PSA and HTPU, the number of frames is reduced by 20 % with only a slight loss in accuracy.
Title: Four-image-based 3D measurement approach employing Hilbert transform. Journal: Measurement, vol. 250, Article 117178.
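The four-image pipeline described in this abstract (background, one high-frequency fringe, low-frequency sine and cosine fringes) can be illustrated on a single noise-free image row. The following is a minimal sketch of the general principle, not the authors' implementation: the fringe frequencies, the amplitudes `A` and `B`, the row length `N`, and all function names are assumptions chosen for illustration, and a naive DFT stands in for a proper FFT so the example needs only the standard library.

```python
import cmath
import math


def analytic_signal(samples):
    """Return samples + 1j * Hilbert(samples) using a naive O(N^2) DFT."""
    n = len(samples)
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0  # DC and Nyquist bins are kept unchanged
        elif k < (n + 1) // 2:
            gain = 2.0  # positive frequencies are doubled
        else:
            gain = 0.0  # negative frequencies are removed
        spectrum[k] *= gain
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def recover_absolute_phase(background, high, low_sin, low_cos, freq_ratio):
    """Recover the absolute high-frequency phase from four images (one row each)."""
    # Subtract the captured background, since the HT handles the DC term poorly.
    high_ac = [h - b for h, b in zip(high, background)]
    # The HT supplies the pi/2-shifted (quadrature) fringe; atan2 gives the wrapped phase.
    wrapped_high = [math.atan2(z.imag, z.real) for z in analytic_signal(high_ac)]
    # 2 + 1 phase shifting: sine fringe, cosine fringe, and the shared background.
    # With a single low-frequency period, this phase in [0, 2*pi) is already absolute.
    wrapped_low = [
        math.atan2(s - b, c - b) % (2 * math.pi)
        for s, c, b in zip(low_sin, low_cos, background)
    ]
    # Hierarchical temporal unwrapping: the scaled low phase selects the fringe order.
    absolute = []
    for ph, pl in zip(wrapped_high, wrapped_low):
        order = round((freq_ratio * pl - ph) / (2 * math.pi))
        absolute.append(ph + 2 * math.pi * order)
    return absolute


# Synthetic, noise-free test row (all values below are illustrative assumptions).
N = 128            # pixels in the row
A, B = 100.0, 50.0  # background intensity and fringe modulation
F_HIGH, F_LOW = 8, 1  # fringe periods across the row

true_phase = [2 * math.pi * F_HIGH * x / N for x in range(N)]
background = [A] * N
high = [A + B * math.cos(p) for p in true_phase]
low_sin = [A + B * math.sin(p / F_HIGH) for p in true_phase]
low_cos = [A + B * math.cos(p / F_HIGH) for p in true_phase]

recovered = recover_absolute_phase(background, high, low_sin, low_cos, F_HIGH / F_LOW)
max_error = max(abs(r - t) for r, t in zip(recovered, true_phase))
print(f"max phase error: {max_error:.2e} rad")
```

The sketch uses an integer number of fringe periods so that the DFT has no spectral leakage; real fringes are truncated, tilted, and noisy, which is presumably why the paper analyzes fringe tilt and noise immunity separately.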
Pub Date : 2025-03-04 DOI: 10.1016/j.measurement.2025.117038
Liang Tao , Siyong Zhang , Dashan Zhang
Intelligent tire technology provides a novel method for improving the intelligence of wheeled tractors. To address the challenge of optimally placing uncertain sensors within intelligent tires, a combined evaluation method based on random weighting CRITIC-TOPSIS is developed. First, a finite element model of a tractor radial tire was constructed, and static loading tests suggest that the maximum error between experimental and simulated data is below 7%. Second, 24 sensor measurement points were designed to output strain, displacement, and acceleration signals. The sample entropy and peak values of the signals were taken as evaluation indicators. Finally, the random weighting CRITIC-TOPSIS method was employed to calculate the scores. The findings indicate that the tire sidewall is the optimal location for placing both the strain and acceleration sensors, while the displacement sensor should be placed at the leading edge of the root of the pattern block near the tire’s central plane.
Title: Research on evaluation method of in-tire sensor placement position for wheeled tractor intelligent tires. Journal: Measurement, vol. 250, Article 117038.
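The scoring step named in this abstract combines CRITIC (objective weights from contrast and inter-criterion conflict) with TOPSIS (ranking by closeness to an ideal alternative). Below is a minimal sketch of the textbook versions of both methods. It does not reproduce the paper's random weighting variant, the candidate data are invented purely for illustration, and treating both indicators as "larger is better" is an assumption.

```python
import math


def column(matrix, j):
    return [row[j] for row in matrix]


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def critic_weights(matrix):
    """CRITIC: weight_j is proportional to std_j * sum_k (1 - corr_jk)."""
    n_criteria = len(matrix[0])
    normalized = []
    for j in range(n_criteria):
        col = column(matrix, j)
        lo, hi = min(col), max(col)
        normalized.append([(v - lo) / (hi - lo) for v in col])  # min-max scaling
    information = []
    for j in range(n_criteria):
        col = normalized[j]
        mean = sum(col) / len(col)
        sigma = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        conflict = sum(1 - pearson(col, normalized[k]) for k in range(n_criteria))
        information.append(sigma * conflict)
    total = sum(information)
    return [c / total for c in information]


def topsis_scores(matrix, weights):
    """TOPSIS closeness in [0, 1]; higher means closer to the ideal alternative."""
    n_criteria = len(matrix[0])
    norms = [math.sqrt(sum(v * v for v in column(matrix, j))) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix
    ]
    best = [max(column(weighted, j)) for j in range(n_criteria)]
    worst = [min(column(weighted, j)) for j in range(n_criteria)]
    scores = []
    for row in weighted:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical candidates: rows are measurement points, columns are
# (sample entropy, peak value). These numbers are NOT from the paper.
candidates = [
    [0.82, 1.9],
    [0.40, 1.1],
    [0.65, 2.4],
    [0.30, 0.8],
]
weights = critic_weights(candidates)
scores = topsis_scores(candidates, weights)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print("weights:", weights)
print("ranking (best first):", ranking)
```

A candidate that is worst on every indicator coincides with the negative ideal and therefore scores exactly zero, which is a quick sanity check when adapting the sketch to real data.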
Pub Date : 2025-03-04 DOI: 10.1016/j.measurement.2025.117102
Dawid Kucharski, Michał Wieczorowski
This paper presents a newly developed phase extraction algorithm to determine surface displacements and accurately estimate surface roughness parameters. The algorithm utilises radial image processing to extract the phase from the central region of an interferogram with circular fringes obtained from a self-built Michelson-type interferometer with a single rough surface. A key advantage of the method lies in its ability to leverage the entire image for phase extraction, even when the fringes are fragmented or the central region is unclear. The linear relationship between the extracted phase and the object’s displacement eliminates ambiguities associated with intensity measurements near maximum or minimum values while reliably indicating the direction of motion. Furthermore, the algorithm effectively addresses speckle interference challenges, which are common in industrial applications where lasers are the primary light source. The proposed setup, combined with the advanced image processing algorithm, achieves displacement measurements with sub-micrometre precision, offering a robust tool for analysing rough surfaces in industrial environments.
Title: Radial image processing for phase extraction in rough-surface interferometry. Journal: Measurement, vol. 250, Article 117102.
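Two ingredients mentioned in this abstract lend themselves to a short illustration: averaging a circular-fringe image over rings around the fringe centre, and the linear phase-to-displacement relation of a Michelson interferometer. The sketch below is one plausible reading of "radial image processing", not the authors' algorithm. The image size, the fringe model, and the 632.8 nm wavelength are assumptions made for the example.

```python
import math


def radial_profile(image, cx, cy):
    """Mean intensity over rings of integer radius around the centre (cx, cy)."""
    sums, counts = {}, {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = round(math.hypot(x - cx, y - cy))
            sums[r] = sums.get(r, 0.0) + value
            counts[r] = counts.get(r, 0) + 1
    return {r: sums[r] / counts[r] for r in sorted(sums)}


def displacement_from_phase(delta_phase_rad, wavelength_m):
    """Michelson geometry: a mirror displacement d changes the optical path by 2d,
    so the phase changes by 4*pi*d / wavelength."""
    return wavelength_m * delta_phase_rad / (4 * math.pi)


# Synthetic circular fringes (assumed model: phase grows with radius squared).
SIZE, CENTER = 41, 20
CENTRAL_PHASE = 1.0  # rad, hypothetical
image = [
    [
        math.cos(CENTRAL_PHASE + 0.02 * ((x - CENTER) ** 2 + (y - CENTER) ** 2))
        for x in range(SIZE)
    ]
    for y in range(SIZE)
]

profile = radial_profile(image, CENTER, CENTER)
one_fringe = displacement_from_phase(2 * math.pi, 632.8e-9)  # assumed He-Ne wavelength
print(f"ring-averaged intensity at r = 0: {profile[0]:.4f}")
print(f"displacement per full fringe: {one_fringe * 1e9:.1f} nm")
```

Because every pixel contributes to some ring, the profile still carries information when individual fringes are fragmented, which matches the advantage the abstract attributes to using the entire image.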
Pub Date : 2025-03-03 DOI: 10.1016/j.measurement.2025.117100
Jiahao Wang, Xiaobo Li, Zhendi Ma
In deep learning methods, acquiring the semantic information of retinal fundus vessels relies on feature extraction because of the complexity of the vessels' geometric structure. However, segmenting the complete vascular structure poses challenges for feature extraction. First, handling both global and local features at the same level requires different feature extraction methods. In addition, the emphasis and methods of feature extraction must change across different stages and levels of feature maps. Therefore, we propose MSTP-Net, the Multi-Scale Three-Path Network, a new architecture consisting of a backbone and three output paths. The backbone uses different components to transform the original features into high-level features. The three paths, i.e., Local Path, Global Path, and Fusion Path, take on different multi-scale tasks to extract features. Local Path uses fine-grained methods to select the local detailed vascular semantics. Global Path focuses on obtaining the global vascular structure. Fusion Path combines the local features from Local Path with the global features from Global Path to extract fusion features. Finally, we integrate the global, local, and fusion features to obtain the final output. We evaluated our method on four retinal datasets (DRIVE, STARE, CHASE_DB1, HRF). The experiments indicate that MSTP-Net achieves competitive performance in retinal vessel segmentation. The source code of the proposed MSTP-Net is available at https://github.com/KokoloNaga/MSTP-Net.git.
Title: Multi-Scale Three-Path Network (MSTP-Net): A new architecture for retinal vessel segmentation. Journal: Measurement, vol. 250, Article 117100.
Pub Date : 2025-03-03 DOI: 10.1016/j.measurement.2025.117097
Melanie Schaller , Mathis Kruse , Antonio Ortega , Marius Lindauer , Bodo Rosenhahn
Addressing sensor drift is essential in industrial measurement systems, where precise data output is necessary for maintaining accuracy and reliability in monitoring processes, because drift progressively degrades the performance of machine learning models over time. Our findings indicate that the standard cross-validation method used in existing model training overestimates performance by inadequately accounting for drift. This is primarily because typical cross-validation techniques allow data instances to appear in both training and testing sets, thereby distorting the accuracy of the predictive evaluation. As a result, these models are unable to precisely predict future drift effects, compromising their ability to generalize and adapt to evolving data conditions. This paper presents two solutions: (1) a novel sensor drift compensation learning paradigm for validating models, and (2) automated machine learning (AutoML) techniques to enhance classification performance and compensate for sensor drift. By employing strategies such as data balancing, meta-learning, automated ensemble learning, hyperparameter optimization, feature selection, and boosting, our AutoML-DC (Drift Compensation) model significantly improves classification performance against sensor drift. AutoML-DC further adapts effectively to varying drift severities.
Title: AutoML for multi-class anomaly compensation of sensor drift. Journal: Measurement, vol. 250, Article 117097.
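The validation concern raised in this abstract can be made concrete with a time-ordered split: when samples are grouped into acquisition batches, a model is trained only on earlier batches and tested on a later one, so the reported score reflects unseen drift. The sketch below shows this generic forward-chaining scheme. It is an assumption about what a drift-aware protocol could look like, not the paper's learning paradigm, and the batch labels are invented.

```python
def forward_chaining_splits(batch_ids):
    """Yield (train_indices, test_indices) where the test batch is always later
    than every training batch, so no future data leaks into training."""
    ordered_batches = sorted(set(batch_ids))
    for position in range(1, len(ordered_batches)):
        earlier = set(ordered_batches[:position])
        test_batch = ordered_batches[position]
        train = [i for i, b in enumerate(batch_ids) if b in earlier]
        test = [i for i, b in enumerate(batch_ids) if b == test_batch]
        yield train, test


# Hypothetical acquisition batches for ten samples, in chronological order.
batch_ids = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]
splits = list(forward_chaining_splits(batch_ids))

for train, test in splits:
    train_batches = sorted({batch_ids[i] for i in train})
    test_batches = sorted({batch_ids[i] for i in test})
    print(f"train on batches {train_batches}, test on batch {test_batches}")
```

In contrast, a shuffled k-fold split would mix samples from all batches into every fold, which is the kind of leakage the abstract identifies as a source of overestimated performance.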
Pub Date : 2025-03-01 DOI: 10.1016/j.measurement.2025.117117
Yan Li, Sheng Bao
This paper investigates the anisotropic magnetic memory signal of low-carbon steel produced by wire-arc directed energy deposition (wire-arc DED) using tensile tests. Two rectangular specimens with different printing directions were tested. The residual magnetic field on the specimen surfaces during two loading stages was measured using a TSC-PC-16 magnetometer. The study reveals that material stress history significantly affects magnetic memory signals, with clear anisotropy between transversal and longitudinal specimens. The normal magnetic memory signal is less influenced by surface roughness and better reflects applied stress. Magnetic characteristic parameters are defined to quantify anisotropy and exhibit a quadratic relationship with load, enabling load level evaluation. This research highlights the potential for identifying printing direction and evaluating surface roughness in wire-arc DED components through magnetic memory signals, contributing to non-destructive testing of additive manufacturing steel.
Title: Anisotropic magnetic memory signal of low-carbon steel fabricated by wire-arc directed energy deposition. Journal: Measurement, vol. 250, Article 117117.
Pub Date : 2025-03-01 DOI: 10.1016/j.measurement.2025.116968
Griffani Megiyanto Rahmatullah , Shanq-Jang Ruan , Lieber Po-Hung Li
Lipreading is one of the techniques that can enhance speech perception. However, lipreading research on low-resource languages such as Indonesian remains limited. In this study, we introduce the Lipreading Information Resource Assembler-Generator (LIRA-Gen), an instrument designed to generate lipreading datasets from CC BY video data available on YouTube. Using this instrument, we present the first Indonesian language lipreading dataset (IDLRW), which contains over 48,000 videos covering 100 word categories spoken by various persons in natural conditions. We also developed a deep learning architecture consisting of an Advanced Residual Network (ARN) using ResNet-34 incorporated with a Channel Spatial Attention (CSA) module, improved sequence modeling that fuses Bi-Gru with Mamba (BGM), an integrated word decision module, and fine-tuned hyperparameters. Our measurement shows that it reaches an accuracy of 60.51% on the IDLRW dataset and outperforms state-of-the-art lipreading models from another dataset even without implementing an additional learning strategy.
Title: Recognizing Indonesian words based on visual cues of lip movement using deep learning. Journal: Measurement, vol. 250, Article 116968.
Pub Date : 2025-03-01 DOI: 10.1016/j.measurement.2025.117075
Yu Zhao , Jie Meng , Peng Ye , Aijun Chen , Wuhuang Huang , Duyu Qiu , Qinchuan Zhang , Kuojun Yang
The prevalent architecture for wide-band data acquisition (DAQ) systems is the time-interleaving (TI) sampling architecture. However, addressing the frequency response mismatch (FRM) and frequency response distortion (FRD) errors caused by the analog front-end circuit is crucial for accurate sampling in this architecture. This paper models the wide-band DAQ system based on the TI architecture as a periodic time-varying (PTV) system and proposes a joint-PTV compensation (J-PTVC) method for both FRM and FRD errors. The compensation of FRM and FRD errors involves designing a digital PTV filter in the digital back end and an improved FFT convolution architecture for the PTV filter system. Furthermore, this paper constructs a TI-based DAQ system with a sampling rate of 40 GSPS using four ADCs, each with a sampling rate of 10 GSPS, and implements the proposed improved FFT convolution architecture on a field-programmable gate array (FPGA). After the joint compensation, the spurious-free dynamic range (SFDR) of the system increases from 24.4 dB to 51.94 dB, the effective number of bits (ENOB) increases from 3.16 bits to 6.34 bits, the magnitude-frequency response flatness reaches 0.25 dB, and the step response rise time decreases from 65 ps to 52.5 ps with a 25.6 ps fast-edge signal input.
Title: General digital background compensation strategy for wide-band time-interleaved data acquisition system based on periodically time-varying filters. Journal: Measurement, vol. 250, Article 117075.
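Two standard relations help put the figures in this abstract in context: the conversion between ENOB and SINAD for a full-scale sine input, and the frequencies at which channel mismatch in an M-channel time-interleaved ADC produces spurs. The sketch below applies these textbook relations to the quoted numbers. It is not taken from the paper, the 1 GHz input tone is a hypothetical value, and the function names are illustrative.

```python
def sinad_db_from_enob(enob_bits):
    """Standard relation for a full-scale sine: SINAD = 6.02 * ENOB + 1.76 dB."""
    return 6.02 * enob_bits + 1.76


def interleave_spur_frequencies(f_in, f_s, channels):
    """Mismatch spurs of an M-channel TI-ADC appear at k * f_s / M +/- f_in,
    folded into the first Nyquist zone [0, f_s / 2]."""
    spurs = set()
    for k in range(1, channels):
        for sign in (-1, 1):
            f = (sign * f_in + k * f_s / channels) % f_s
            if f > f_s / 2:
                f = f_s - f  # fold aliases back into the first Nyquist zone
            spurs.add(round(f, 9))
    return sorted(spurs)


CHANNELS = 4          # ADCs, from the abstract
CHANNEL_RATE = 10.0   # GSPS per ADC, from the abstract
aggregate_rate = CHANNELS * CHANNEL_RATE  # GSPS

sinad_before = sinad_db_from_enob(3.16)   # ENOB before compensation, from the abstract
sinad_after = sinad_db_from_enob(6.34)    # ENOB after compensation, from the abstract
spurs = interleave_spur_frequencies(1.0, aggregate_rate, CHANNELS)  # 1 GHz tone is hypothetical

print(f"aggregate rate: {aggregate_rate} GSPS")
print(f"SINAD before/after: {sinad_before:.2f} dB / {sinad_after:.2f} dB")
print(f"mismatch spur frequencies for a 1 GHz tone: {spurs} GHz")
```

Under these relations, the reported ENOB gain of about 3.2 bits corresponds to roughly 19 dB of SINAD improvement, which is consistent in magnitude with the reported SFDR gain if mismatch spurs dominated the uncompensated spectrum.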