Displays: Latest Publications

Private compression for intermediate feature in IoT-supported mobile cloud inference
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-19 | DOI: 10.1016/j.displa.2024.102857
Yuan Zhang , Zixi Wang , Xiaodi Guan , Lijun He , Fan Li
In the emerging Internet of Things (IoT) paradigm, mobile cloud inference serves as an efficient application framework that relieves the computation and storage burden on resource-constrained mobile devices by offloading the workload to cloud servers. However, mobile cloud inference faces computation, communication, and privacy challenges: it must ensure efficient system inference while protecting the privacy of the information collected from mobile users. To enable the deployment of large-capacity deep neural networks (DNNs), we propose split computing (SC), in which the entire model is divided into two parts executed on the mobile and cloud ends, respectively. However, the transmission of intermediate data poses a bottleneck to system performance. This paper first demonstrates the privacy issue arising from machine analysis-oriented intermediate features. We conduct a preliminary experiment to intuitively reveal the latent potential for enhancing the privacy-preserving ability of the initial feature. Motivated by this, we propose a framework for privacy-preserving intermediate feature compression, which addresses the limitations in both compression and privacy of the originally extracted feature data. Specifically, we propose a method that jointly enhances privacy and encoding efficiency through the collaboration of an encoding-feature privacy enhancement module and a privacy-feature ordering enhancement module. Additionally, we develop a gradient-reversal optimization strategy based on information theory to ensure maximal concealment of core privacy information throughout the entire codec process. We evaluate the proposed method on two DNN models using two datasets, demonstrating that it achieves superior analysis accuracy and stronger privacy preservation than HEVC. Furthermore, we provide an application case of a wireless sensor network to validate the effectiveness of the proposed method in a real-world scenario.
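The split-computing idea above can be sketched in a few lines: a toy model is divided into a mobile-side head and a cloud-side tail, and the intermediate feature is coarsely quantized before "transmission". The layer shapes, weights, and the uniform 8-bit quantization scheme are illustrative assumptions, not the paper's actual feature codec.

```python
# Minimal split-computing sketch: head runs on the device, tail on the server,
# with the intermediate feature quantized for transmission. All numbers are
# hypothetical illustration values.

def mobile_head(x, w1):
    # First "layer" on the device: elementwise ReLU(w1 * x).
    return [max(0.0, w * xi) for w, xi in zip(w1, x)]

def quantize(feature, scale=255.0, max_val=10.0):
    # Uniform 8-bit quantization, assuming feature values lie in [0, max_val].
    return [round(min(f, max_val) / max_val * scale) for f in feature]

def dequantize(codes, scale=255.0, max_val=10.0):
    return [c / scale * max_val for c in codes]

def cloud_tail(feature, w2):
    # Second "layer" on the server: a single weighted sum as the prediction.
    return sum(w * f for w, f in zip(w2, feature))

x = [1.0, -2.0, 3.0]
w1 = [0.5, 1.0, 0.25]
w2 = [1.0, 1.0, 2.0]

codes = quantize(mobile_head(x, w1))   # what would be sent over the network
y = cloud_tail(dequantize(codes), w2)  # cloud-side inference result
```

The paper's contribution sits between `quantize` and the network: the intermediate feature is additionally transformed so that an eavesdropper on the transmitted codes learns as little as possible about the raw input.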
Citations: 0
Icon similarity model based on cognition and deep learning
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-19 | DOI: 10.1016/j.displa.2024.102864
Linlin Wang, Yixuan Zou, Haiyan Wang, Chengqi Xue
Human-computer cooperation guided by natural interaction, intelligent interaction, and human-computer integration is gradually becoming a new trend in human-computer interfaces. An icon is an indispensable pictographic symbol in an interface that can convey pivotal semantics between humans and computers. Research on how humans perceive similar icons and how computers discriminate them can reduce misunderstandings and facilitate transparent cooperation. Therefore, this research proceeds step by step from icon images to extracted contours and four contour features: curvature, proportion, orientation, and line. By manipulating feature values, 360 similar icons were generated, and a cognitive experiment was conducted with 25 participants to explore the boundary values of the feature dimensions that produce different levels of similarity. These boundary values were applied in deep learning to train a discrimination model on a dataset of 1500 similar icons. The dataset was used to train a Siamese neural network whose branches use the 16-layer Visual Geometry Group (VGG-16) network, optimized with stochastic gradient descent. This method of combining human cognition with deep learning technology is meaningful for establishing a consensus on icon semantics, including content and emotions, by outputting similarity levels and values. Taking icon similarity discrimination as an example, this study explored analysis and simulation methods that let computer vision approximate human visual cognition. The evaluated accuracy is 90.82%. Precision was 90% for high, 80.65% for medium, and 97.30% for low similarity; recall was 100% for high, 89.29% for medium, and 83.72% for low. It has been verified that the model can compensate for fuzzy cognition in humans and enable computers to cooperate efficiently.
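The scoring step of such a Siamese comparison can be sketched as follows: two icon embeddings are compared by Euclidean distance, the distance is converted into a similarity value, and the value is binned into a high/medium/low level. The embedding vectors and level thresholds below are hypothetical stand-ins; the paper derives its boundary values from the cognitive experiment.

```python
# Sketch of Siamese-style similarity scoring with hypothetical thresholds.
import math

def similarity(emb_a, emb_b):
    # Map Euclidean distance between embeddings into a (0, 1] similarity value.
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))
    return 1.0 / (1.0 + d)

def similarity_level(sim, high=0.8, medium=0.5):
    # Hypothetical cut-offs separating the three similarity levels.
    if sim >= high:
        return "high"
    if sim >= medium:
        return "medium"
    return "low"

a = [0.1, 0.9, 0.3]
b = [0.1, 0.8, 0.3]   # nearly identical icon embedding
c = [0.9, 0.1, 0.7]   # clearly different icon embedding
```

In the paper's setting the embeddings would come from the two shared-weight VGG-16 branches, and the levels would align with the experimentally determined cognitive boundaries.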
Citations: 0
Multicenter evaluation of CT deep radiomics model in predicting Leibovich score risk groups for non-metastatic clear cell renal cell carcinoma
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-19 | DOI: 10.1016/j.displa.2024.102867
Wuchao Li , Tongyin Yang , Pinhao Li , Xinfeng Liu , Shasha Zhang , Jianguo Zhu , Yuanyuan Pei , Yan Zhang , Tijiang Zhang , Rongpin Wang

Background

Non-metastatic clear cell renal cell carcinoma (nccRCC) poses a significant risk of postoperative recurrence and metastasis, underscoring the importance of accurate preoperative risk assessment. While the Leibovich score is effective, it relies on postoperative histopathological data. This study aims to evaluate the efficacy of CT radiomics and deep learning models in predicting Leibovich score risk groups in nccRCC, and to explore the interrelationship between CT and pathological features.

Patients and Methods

This research analyzed 600 nccRCC patients from four datasets, dividing them into low-risk (Leibovich scores 0-2) and intermediate-to-high-risk (Leibovich scores of 3 or higher) groups. A radiological model was developed from subjective CT features, and radiomics and deep learning models were constructed from CT images. Additionally, a deep radiomics model combining radiomics and deep learning features was developed, alongside a fusion model incorporating all feature types. Model performance was assessed by AUC values, while survival differences across predicted groups were analyzed using survival curves and the log-rank test. Moreover, the research investigated the interrelationship between CT features and pathological features derived from whole-slide pathological images.

Results

Within the training dataset, four radiological, three radiomics, and thirteen deep learning features were selected to develop models predicting nccRCC Leibovich score risk groups. The deep radiomics model demonstrated superior predictive accuracy, evidenced by AUC values of 0.881, 0.829, and 0.819 in external validation datasets. Notably, significant differences in overall survival were observed among patients classified by this model (log-rank test p < 0.05 across all datasets). Furthermore, a correlation and complementarity were observed between CT deep radiomics features and pathological deep learning features.

Conclusions

The CT deep radiomics model precisely predicts nccRCC Leibovich score risk groups preoperatively and highlights the synergistic effect between CT and pathological data.
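The models above are ranked by AUC. A minimal sketch of how AUC can be computed from predicted risk scores uses the Mann-Whitney statistic: the probability that a randomly chosen positive (intermediate-to-high-risk) case is scored above a randomly chosen negative (low-risk) case. The labels and scores below are made-up illustration data, not the study's results.

```python
# Rank-based (Mann-Whitney) AUC on toy risk scores.

def auc(labels, scores):
    # labels: 1 = intermediate-to-high risk, 0 = low risk.
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count positive-vs-negative score comparisons; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
```

An AUC of 0.881 as reported for the deep radiomics model would mean that roughly 88% of such positive/negative pairs are ordered correctly by the model's risk score.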
Citations: 0
LLD-GAN: An end-to-end network for low-light image demosaicking
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-18 | DOI: 10.1016/j.displa.2024.102856
Li Wang , Cong Shi , Shrinivas Pundlik , Xu Yang , Liyuan Liu , Gang Luo
Demosaicking of low- and ultra-low-light images has wide applications in consumer electronics, security, and industrial machine vision, and denoising is a key challenge in the demosaicking process. This study introduces a comprehensive end-to-end low-light demosaicking framework called LLD-GAN (Low-Light Demosaicking Generative Adversarial Network), which greatly reduces computational complexity. Our architecture employs a Wasserstein GAN framework enhanced by a gradient-penalty mechanism. We redesigned the generator based on the UNet++ network, as well as its corresponding discriminator, which makes model learning more efficient. In addition, we propose a new loss metric grounded in the principles of perceptual loss to obtain images with better visual quality. Our ablation experiments confirm the benefit of both the Wasserstein GAN with gradient penalty and the perceptual loss function. For RGB images, we tested the proposed model on 16-bit images with added noise under a wide range of low light levels, from 1/30 to 1/150 of the normal light level. For actual low-light raw sensor images, the model was evaluated under three distinct lighting conditions: 1/100, 1/250, and 1/300 of normal exposure. Qualitative and quantitative comparisons against advanced techniques demonstrate the validity and superiority of LLD-GAN as a unified denoising-demosaicking tool.
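The gradient-penalty term in a WGAN-GP objective is lambda * (||grad_x D(x_hat)||_2 - 1)^2, evaluated at a random interpolate x_hat between a real and a generated sample. The sketch below illustrates just that term with a toy linear critic and a finite-difference gradient; a real LLD-GAN critic would be a deep network with autograd, so everything here is an illustrative stand-in.

```python
# Toy numerical illustration of the WGAN gradient penalty.
import random

def critic(x, w):
    # Toy linear critic D(x) = w . x; its true gradient is w itself.
    return sum(wi * xi for wi, xi in zip(w, x))

def grad_norm(f, x, eps=1e-5):
    # Central finite differences approximate grad_x f(x).
    g = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + eps] + x[i + 1:]
        dn = x[:i] + [x[i] - eps] + x[i + 1:]
        g.append((f(up) - f(dn)) / (2 * eps))
    return sum(gi * gi for gi in g) ** 0.5

def gradient_penalty(f, x_real, x_fake, lam=10.0):
    # Penalize the critic's gradient norm deviating from 1 at a random
    # interpolate between a real and a fake sample.
    t = random.random()
    x_hat = [t * r + (1 - t) * fk for r, fk in zip(x_real, x_fake)]
    return lam * (grad_norm(f, x_hat) - 1.0) ** 2

w = [0.6, 0.8]                      # ||w|| = 1, so the penalty should vanish
d = lambda x: critic(x, w)
gp = gradient_penalty(d, [1.0, 2.0], [0.0, 0.5])
```

Because the toy critic is 1-Lipschitz by construction (its gradient norm is exactly 1 everywhere), the penalty evaluates to approximately zero, which is the behavior the penalty is designed to enforce during training.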
Citations: 0
Subjective and objective quality evaluation for industrial images
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-18 | DOI: 10.1016/j.displa.2024.102858
Chengxu Zhou , Yanlin Jiang , Hongyan Liu , Jingchao Cao , Ke Gu
Recently, the demand for ever-better image processing techniques has continued to grow in the fields of industrial scenario monitoring and industrial process inspection. The subjective and objective quality evaluation of industrial images is vital for advancing industrial visual perception and enhancing the quality of industrial image/video processing applications. However, the scarcity of publicly available industrial image databases with reliable subjective scores restricts the development of industrial image quality evaluation (IIQE). To fill this gap, this article first establishes two industrial image databases for assessing IIQE metrics: the industrial scenario image dataset (ISID) and the industrial process image dataset (IPID). Furthermore, to avoid drowning out industrial image nuances in the wavelet subband summation, we present a novel subband information fidelity standard (SIFS) evaluation method for industrial applications, based on the channel capacity of visual signals in the wavelet domain. Specifically, we first build a visual-signal channel model based on the perception process from the human eye to the brain. Second, we compute and compare the channel capacities of the reference and distorted images to measure the information fidelity in each wavelet subband. Third, we sum the information fidelity ratios over the subbands to obtain the overall quality score. Finally, we fairly compare several up-to-date image quality evaluation (IQE) methods and our proposed method on the two new industrial datasets. The ISID and IPID datasets can evaluate most IQE metrics comprehensively and pave the way for further research on IIQE. Our SIFS model shows remarkable performance compared with other up-to-date IQE methods.
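The three-step pipeline above (per-subband channel capacity, reference-vs-distorted comparison, pooling of fidelity ratios) can be sketched with the textbook Gaussian-channel capacity C = 0.5 * log2(1 + SNR). The subband data, the fixed noise variance, and the equal-weight pooling are illustrative assumptions; the paper's SIFS model builds on a perceptual channel model rather than this bare formula.

```python
# Sketch of subband information-fidelity pooling with Shannon capacity.
import math

def capacity(subband, noise_var=1.0):
    # Gaussian-channel capacity with the subband's variance as signal power.
    mean = sum(subband) / len(subband)
    power = sum((s - mean) ** 2 for s in subband) / len(subband)
    return 0.5 * math.log2(1.0 + power / noise_var)

def sifs_score(ref_subbands, dist_subbands):
    # Per-subband fidelity ratio of distorted to reference capacity,
    # pooled by a simple average over subbands.
    ratios = []
    for ref, dist in zip(ref_subbands, dist_subbands):
        c_ref, c_dist = capacity(ref), capacity(dist)
        ratios.append(min(c_dist, c_ref) / c_ref if c_ref > 0 else 1.0)
    return sum(ratios) / len(ratios)

ref_bands = [[0.0, 2.0, 4.0, 6.0], [1.0, 3.0, 5.0, 7.0]]
flat = [[3.0, 3.0, 3.0, 3.0], [1.0, 3.0, 5.0, 7.0]]  # first subband wiped out
```

An undistorted image scores 1.0; wiping out one of the two subbands halves the pooled score, mirroring how distortion in any subband lowers the overall quality estimate.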
Citations: 0
Underwater image enhancement with zero-point symmetry prior and reciprocal mapping
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-17 | DOI: 10.1016/j.displa.2024.102845
Fei Li , Chang Liu , Xiaomao Li
Images captured underwater typically exhibit color distortion, low brightness, and pseudo-haze due to light absorption and scattering. These degradations limit underwater image display and analysis, and still challenge the performance of current methods. To overcome these drawbacks, we propose a targeted and systematic method. Specifically, based on a key observation and extensive statistical analysis, we develop a Zero-Point Symmetry Prior (ZPSP): for color-balanced images, the histograms of channels a and b in the Lab color space exhibit a symmetric distribution around the zero point. Guided by the ZPSP, a Color Histogram Symmetry (CHS) method is proposed to balance color differences between channels a and b by ensuring that they adhere to the ZPSP. For channel L, a Reciprocal Mapping (RM) method is proposed to remove pseudo-haze and improve brightness by aligning its reflectance and illumination components with the Dark Channel Prior (DCP) and the Bright Channel Prior (BCP), respectively. The method further employs a divide-and-conquer strategy, distinguishing underwater image degradations in decomposed sub-images and tackling them individually. Notably, the proposed methods are integrated into a systematic enhancement framework that applies targeted optimization to each type of degradation. Benefiting from this strategy, the various degradations are individually optimized and mutually reinforcing, consistently producing visually pleasing results. Comprehensive experiments demonstrate that the proposed method achieves remarkable performance on various underwater image datasets and applications and shows good generalization ability. The code is available at: https://github.com/CN-lifei/ZSRM.
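The intuition behind the ZPSP can be illustrated with a minimal corrective step: if a color-balanced image has a/b histograms symmetric about zero, then a color cast shows up as an off-zero channel mean, and shifting the channel so its mean sits at the zero point removes the bias. Shifting the mean is a deliberate simplification of the paper's CHS histogram method, used here only to show the prior at work.

```python
# Minimal ZPSP-style correction: recenter a Lab chroma channel at zero.

def zero_point_shift(channel):
    # Subtract the channel mean so values distribute symmetrically around zero.
    mean = sum(channel) / len(channel)
    return [v - mean for v in channel]

# A greenish underwater cast appears as an a-channel biased below zero.
a_channel = [-14.0, -10.0, -12.0, -8.0]
a_balanced = zero_point_shift(a_channel)
```

After the shift the channel mean is exactly zero, satisfying the prior; the full CHS method goes further and enforces symmetry of the whole histogram, not just its mean.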
Citations: 0
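The ZPSP above says that a color-balanced image has Lab chroma histograms (channels a and b) symmetric about zero. A minimal toy sketch of that idea — not the paper's CHS algorithm — is to shift a chroma channel so its values are centered at zero, a gray-world-style proxy on invented values:

```python
# Toy illustration of the Zero-Point Symmetry Prior (ZPSP) idea: for a
# color-balanced image, the Lab a/b chroma histograms should be symmetric
# around zero. A crude proxy is to shift each chroma channel so its mean
# sits at zero (NOT the paper's CHS method, just the underlying intuition).

def center_chroma(channel):
    """Shift a list of chroma values so their mean is zero."""
    mean = sum(channel) / len(channel)
    return [v - mean for v in channel]

# A greenish color cast: channel 'a' values skewed negative.
a_channel = [-12.0, -8.0, -10.0, -6.0, -14.0]
balanced = center_chroma(a_channel)

print(round(sum(balanced) / len(balanced), 6))  # mean is now 0.0
```

Real CHS works on the full histograms rather than a simple mean shift, but the target state — a distribution symmetric about the zero point — is the same.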
Evaluating ASD in children through automatic analysis of paintings
IF 3.7 Zone 2 Engineering & Technology Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2024-10-16 DOI: 10.1016/j.displa.2024.102850
Ji-Feng Luo, Zhijuan Jin, Xinding Xia, Fangyu Shi, Zhihao Wang, Chi Zhang
Autism spectrum disorder (ASD) is a hereditary neurodevelopmental disorder affecting individuals, families, and societies worldwide. Screening for ASD relies on specialized medical resources, and current machine learning-based screening methods depend on expensive professional devices and algorithms. There is therefore a critical need for accessible and easily implementable methods of ASD assessment. In this study, we seek such an ASD screening and rehabilitation assessment solution based on children's paintings. From an ASD painting database, 375 paintings from children with ASD and 160 paintings from typically developing children were selected, and a series of image signal processing algorithms based on typical characteristics of children with ASD were designed to extract features from the images. The effectiveness of the extracted features was evaluated with statistical methods, and the features were then classified using a support vector machine (SVM) and XGBoost (eXtreme Gradient Boosting). In 5-fold cross-validation, the SVM achieved a recall of 94.93%, a precision of 86.40%, an accuracy of 85.98%, and an AUC of 90.90%, while XGBoost achieved a recall of 96.27%, a precision of 93.78%, an accuracy of 92.90%, and an AUC of 98.00%. This efficacy remained high during additional validation on a set of newly collected paintings. The performance not only surpassed that of participating human experts; the high recall rate, together with the method's affordability, manageability, and ease of implementation, also indicates its potential for wide screening and rehabilitation assessment. All analysis code is public at GitHub: dishangti/ASD-Painting-Pub.
{"title":"Evaluating ASD in children through automatic analysis of paintings","authors":"Ji-Feng Luo ,&nbsp;Zhijuan Jin ,&nbsp;Xinding Xia ,&nbsp;Fangyu Shi ,&nbsp;Zhihao Wang ,&nbsp;Chi Zhang","doi":"10.1016/j.displa.2024.102850","DOIUrl":"10.1016/j.displa.2024.102850","url":null,"abstract":"<div><div>Autism spectrum disorder (ASD) is a hereditary neurodevelopmental disorder affecting individuals, families, and societies worldwide. Screening for ASD relies on specialized medical resources, and current machine learning-based screening methods depend on expensive professional devices and algorithms. Therefore, there is a critical need to develop accessible and easily implementable methods for ASD assessment. In this study, we are committed to finding such an ASD screening and rehabilitation assessment solution based on children’s paintings. From an ASD painting database, 375 paintings from children with ASD and 160 paintings from typically developing children were selected, and a series of image signal processing algorithms based on typical characteristics of children with ASD were designed to extract features from images. The effectiveness of extracted features was evaluated through statistical methods, and they were then classified using a support vector machine (SVM) and XGBoost (eXtreme Gradient Boosting). In 5-fold cross-validation, the SVM achieved a recall of 94.93%, a precision of 86.40%, an accuracy of 85.98%, and an AUC of 90.90%, while the XGBoost achieved a recall of 96.27%, a precision of 93.78%, an accuracy of 92.90%, and an AUC of 98.00%. This efficacy persists at a high level even during additional validation on a set of newly collected paintings. Not only did the performance surpass that of participated human experts, but the high recall rate, as well as its affordability, manageability, and ease of implementation, indicates potentiality in wide screening and rehabilitation assessment. 
All analysis code is public at GitHub: <span><span>dishangti/ASD-Painting-Pub</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102850"},"PeriodicalIF":3.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142527924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
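The recall, precision, and accuracy figures reported for the SVM and XGBoost classifiers all follow from a binary confusion matrix. A minimal sketch with invented counts (not the study's data) shows how each is computed:

```python
# Minimal sketch: recall, precision, and accuracy from a binary confusion
# matrix, as reported for the SVM/XGBoost classifiers above.
# The counts below are illustrative only, not the paper's data.

def classification_metrics(tp, fp, fn, tn):
    recall = tp / (tp + fn)                    # fraction of true positives caught
    precision = tp / (tp + fp)                 # fraction of positive calls that are right
    accuracy = (tp + tn) / (tp + fp + fn + tn) # fraction of all calls that are right
    return recall, precision, accuracy

recall, precision, accuracy = classification_metrics(tp=90, fp=10, fn=5, tn=30)
print(f"recall={recall:.4f} precision={precision:.4f} accuracy={accuracy:.4f}")
# → recall=0.9474 precision=0.9000 accuracy=0.8889
```

The pattern in the abstract — recall higher than precision — corresponds to a classifier tuned to miss few ASD paintings at the cost of some false positives, which is the sensible trade-off for a screening tool.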
Using query semantic and feature transfer fusion to enhance cardinality estimating of property graph queries
IF 3.7 Zone 2 Engineering & Technology Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2024-10-16 DOI: 10.1016/j.displa.2024.102854
Zhenzhen He, Tiquan Gu, Jiong Yu
With the increasing complexity and diversity of query tasks, cardinality estimation has become one of the most challenging problems in query optimization. In this study, we propose an efficient and accurate cardinality estimation method for property graph queries, addressing in particular the neglect of contextual semantic features in current research. We first give a formal representation of property graph queries and define their cardinality estimation problem. Then, through query featurization, we transform each query into a vector representation that the estimation model can learn from, and enrich this representation with the query's contextual semantic information. Finally, we propose an estimation model for property graph queries that introduces a feature information transfer module to dynamically control the information flow while achieving feature fusion and inference. Experimental results on three datasets show that the model estimates the cardinality of property graph queries accurately and efficiently: the mean Q_error and RMSE are reduced by about 30% and 25%, respectively, compared with state-of-the-art estimation models. The contextual semantic features of queries improve the model's estimation accuracy, reducing the mean Q_error by about 20% and the RMSE by about 5%.
{"title":"Using query semantic and feature transfer fusion to enhance cardinality estimating of property graph queries","authors":"Zhenzhen He ,&nbsp;Tiquan Gu ,&nbsp;Jiong Yu","doi":"10.1016/j.displa.2024.102854","DOIUrl":"10.1016/j.displa.2024.102854","url":null,"abstract":"<div><div>With the increasing complexity and diversity of query tasks, cardinality estimation has become one of the most challenging problems in query optimization. In this study, we propose an efficient and accurate cardinality estimation method to address the cardinality estimation problem in property graph queries, particularly in response to the current research gap regarding the neglect of contextual semantic features. We first propose formal representations of the property graph query and define its cardinality estimation problem. Then, through the query featurization, we transform the query into a vector representation that can be learned by the estimation model, and enrich the feature vector representation by the context semantic information of the query. We finally propose an estimation model for property graph queries, specifically introducing a feature information transfer module to dynamically control the information flow meanwhile achieving the model’s feature fusion and inference. Experimental results on three datasets show that the estimation model can accurately and efficiently estimate the cardinality of property graph queries, the mean Q_error and RMSE are reduced by about 30% and 25% than the state-of-art estimation models. 
The context semantics features of queries can improve the model’s estimation accuracy, the mean Q_error result is reduced by about 20% and the RMSE result is about 5%.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102854"},"PeriodicalIF":3.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142527922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
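The abstract quotes two error measures. Q_error is the standard cardinality-estimation metric, max(est, true)/min(est, true) — it is always ≥ 1, with 1 meaning a perfect estimate — and RMSE is the usual root mean squared error. A short sketch with made-up estimates (not the paper's results) shows how both are computed:

```python
import math

# Sketch of the two error measures quoted in the abstract. Q_error is the
# standard cardinality-estimation metric max(est, true) / min(est, true);
# RMSE is the usual root mean squared error. Values are illustrative only.

def q_error(estimate, truth):
    return max(estimate, truth) / min(estimate, truth)

def mean_q_error(estimates, truths):
    return sum(q_error(e, t) for e, t in zip(estimates, truths)) / len(truths)

def rmse(estimates, truths):
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths))

true_cards = [100, 2000, 50]
est_cards = [120, 1500, 60]
print(round(mean_q_error(est_cards, true_cards), 3))
print(round(rmse(est_cards, true_cards), 3))
```

Q_error is the preferred headline metric in this literature because it is symmetric between over- and under-estimation and scale-free, whereas RMSE is dominated by errors on the largest cardinalities.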
Profiles of cybersickness symptoms
IF 3.7 Zone 2 Engineering & Technology Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2024-10-11 DOI: 10.1016/j.displa.2024.102853
Jonathan W. Kelly, Nicole L. Hayes, Taylor A. Doty, Stephen B. Gilbert, Michael C. Dorneich
Cybersickness – discomfort caused by virtual reality (VR) – remains a significant problem that negatively affects the user experience. Research on individual differences in cybersickness has typically focused on overall sickness intensity, but a detailed understanding should also cover whether individuals differ in the relative intensity of cybersickness symptoms. This study used latent profile analysis (LPA) to explore whether there exist groups of individuals who experience common patterns of cybersickness symptoms. Participants played a VR game for up to 20 min. LPA indicated three groups with low, medium, and high overall cybersickness. Further, there were similarities and differences in the relative patterns of nausea, disorientation, and oculomotor symptoms between groups. Disorientation was lower than nausea and oculomotor symptoms for all three groups. Nausea and oculomotor symptoms were experienced at similar levels within the high and low sickness groups, but the medium sickness group experienced more nausea than oculomotor symptoms. Characteristics of group members varied across groups, including gender, virtual reality experience, video game experience, and history of motion sickness. These findings identify distinct individual experiences in symptomology that go beyond overall sickness intensity, which could enable future interventions that target certain groups of individuals and specific symptoms.
{"title":"Profiles of cybersickness symptoms","authors":"Jonathan W. Kelly ,&nbsp;Nicole L. Hayes ,&nbsp;Taylor A. Doty ,&nbsp;Stephen B. Gilbert ,&nbsp;Michael C. Dorneich","doi":"10.1016/j.displa.2024.102853","DOIUrl":"10.1016/j.displa.2024.102853","url":null,"abstract":"<div><div>Cybersickness – discomfort caused by virtual reality (VR) – remains a significant problem that negatively affects the user experience. Research on individual differences in cybersickness has typically focused on overall sickness intensity, but a detailed understanding should include whether individuals differ in the relative intensity of cybersickness symptoms. This study used latent profile analysis (LPA) to explore whether there exist groups of individuals who experience common patterns of cybersickness symptoms. Participants played a VR game for up to 20 min. LPA indicated three groups with low, medium, and high overall cybersickness. Further, there were similarities and differences in relative patterns of nausea, disorientation, and oculomotor symptoms between groups. Disorientation was lower than nausea and oculomotor symptoms for all three groups. Nausea and oculomotor were experienced at similar levels within the high and low sickness groups, but the medium sickness group experienced more nausea than oculomotor. Characteristics of group members varied across groups, including gender, virtual reality experience, video game experience, and history of motion sickness. 
These findings identify distinct individual experiences in symptomology that go beyond overall sickness intensity, which could enable future interventions that target certain groups of individuals and specific symptoms.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102853"},"PeriodicalIF":3.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142444995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
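Real LPA fits a finite mixture model by maximum likelihood, but its core output — assigning each person to the latent profile whose symptom pattern they best match — can be illustrated, in a greatly simplified form, by nearest-centroid assignment over (nausea, oculomotor, disorientation) score vectors. The profile means and participant scores below are invented, not the study's data:

```python
# Greatly simplified stand-in for latent profile analysis (LPA): real LPA
# estimates a mixture model and assigns people by posterior probability,
# but nearest-centroid assignment conveys the idea of profile membership.
# Profile means and participant scores are invented, not the study's data.

def nearest_profile(score, profile_means):
    """Index of the profile mean closest (squared Euclidean distance)
    to a (nausea, oculomotor, disorientation) score vector."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(profile_means)), key=lambda i: dist2(score, profile_means[i]))

# Hypothetical low / medium / high overall-sickness profile means.
profiles = [(5.0, 5.0, 2.0), (40.0, 25.0, 15.0), (70.0, 68.0, 45.0)]
participant = (38.0, 30.0, 12.0)
print(nearest_profile(participant, profiles))  # → 1 (the medium profile)
```

Note how the hypothetical medium profile mirrors the study's finding — nausea above oculomotor, disorientation lowest — while the low and high profiles keep nausea and oculomotor roughly level.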
A novel heart rate estimation framework with self-correcting face detection for Neonatal Intensive Care Unit
IF 3.7 Zone 2 Engineering & Technology Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2024-10-11 DOI: 10.1016/j.displa.2024.102852
Kangyang Cao, Tao Tan, Zhengxuan Chen, Kaiwen Yang, Yue Sun
Remote photoplethysmography (rPPG) is a non-invasive method for monitoring heart rate (HR) and other vital signs by measuring subtle facial color changes caused by blood flow variations beneath the skin, typically captured through video-based imaging. Current rPPG technology, which is optimized for ideal conditions, faces significant challenges in real-world clinical settings such as Neonatal Intensive Care Units (NICUs). These challenges arise primarily from the limitations of the automatic face detection algorithms embedded in HR estimation frameworks, which struggle to accurately detect newborns' faces and to follow their frequent position changes; together with fluctuations in lighting, these factors significantly degrade the accuracy of HR estimation. To address the challenges of inadequate face detection and HR estimation in newborns, we propose a novel HR estimation framework that incorporates a Self-Correcting face detection module. Our framework introduces an innovative rPPG value reference module to mitigate the effects of lighting variations, significantly reducing HR estimation error. The Self-Correcting module improves face detection accuracy by enhancing robustness to occlusions and position changes while automating the process to minimize manual intervention. The proposed framework demonstrates notable improvements in both face detection accuracy and HR estimation, outperforming existing methods for newborns in NICUs.
{"title":"A novel heart rate estimation framework with self-correcting face detection for Neonatal Intensive Care Unit","authors":"Kangyang Cao,&nbsp;Tao Tan,&nbsp;Zhengxuan Chen,&nbsp;Kaiwen Yang,&nbsp;Yue Sun","doi":"10.1016/j.displa.2024.102852","DOIUrl":"10.1016/j.displa.2024.102852","url":null,"abstract":"<div><div>Remote photoplethysmography (rPPG) is a non-invasive method for monitoring heart rate (HR) and other vital signs by measuring subtle facial color changes caused by blood flow variations beneath the skin, typically captured through video-based imaging. Current rPPG technology, which is optimized for ideal conditions, faces significant challenges in real-world clinical settings such as Neonatal Intensive Care Units (NICUs). These challenges primarily arise from the limitations of automatic face detection algorithms embedded in HR estimation frameworks, which have difficulty accurately detecting the faces of newborns. Additionally, variations in lighting conditions can significantly affect the accuracy of HR estimation. The combination of these positional changes and fluctuations in lighting significantly impacts the accuracy of HR estimation. To address the challenges of inadequate face detection and HR estimation in newborns, we propose a novel HR estimation framework that incorporates a Self-Correcting face detection module. Our HR estimation framework introduces an innovative rPPG value reference module to mitigate the effects of lighting variations, significantly reducing HR estimation error. The Self-Correcting module improves face detection accuracy by enhancing robustness to occlusions and position changes while automating the process to minimize manual intervention. 
Our proposed framework demonstrates notable improvements in both face detection accuracy and HR estimation, outperforming existing methods for newborns in NICUs.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102852"},"PeriodicalIF":3.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142527917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
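At the heart of any rPPG pipeline, heart rate is read off as the dominant frequency of a skin-color signal extracted from the face video. A hedged, stdlib-only sketch of that final step — not the paper's framework — generates a synthetic 1.5 Hz "pulse" trace and recovers it with a brute-force DFT scan over the plausible heart-rate band (0.7–4 Hz, i.e. 42–240 bpm):

```python
import math

# Sketch of the core rPPG frequency-analysis step (not the proposed
# framework): heart rate is the dominant frequency of the color signal.
# A synthetic 1.5 Hz pulse trace is generated and recovered by scanning
# DFT magnitudes over the physiologically plausible band.

def estimate_hr_bpm(signal, fps, f_lo=0.7, f_hi=4.0, step=0.01):
    """Return the frequency (in beats/min) with the largest DFT magnitude."""
    best_f, best_mag = f_lo, -1.0
    f = f_lo
    while f <= f_hi:
        re = sum(s * math.cos(2 * math.pi * f * k / fps) for k, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * f * k / fps) for k, s in enumerate(signal))
        mag = re * re + im * im
        if mag > best_mag:
            best_f, best_mag = f, mag
        f += step
    return best_f * 60.0

fps = 30.0  # typical camera frame rate
pulse = [math.sin(2 * math.pi * 1.5 * k / fps) for k in range(300)]  # 10 s at 1.5 Hz
print(round(estimate_hr_bpm(pulse, fps)))  # → 90
```

In a real NICU signal the trace is buried in motion and lighting noise, which is exactly why the abstract's rPPG value reference module and self-correcting face detection matter: the frequency analysis itself is only as good as the facial region and signal it is fed.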