Future perspectives of digital twin technology in orthodontics
Pub Date: 2024-08-24 | DOI: 10.1016/j.displa.2024.102818
Yanning Ma, Yiran Li, Xulin Liu, Jie Gao, Axian Wang, Haiwen Chen, Zhi Liu, Zuolin Jin
Orthodontic treatment encompasses the prevention, treatment, and prognostic care of malocclusion. The complexity and multimodality of orthodontic treatment restrict the development of intelligent orthodontics: patients at different treatment stages differ in growth and development characteristics, and treatment begun at different stages carries different prognoses. Digital twin technology, with its ability to interpret the deep information in big data, can effectively address these problems. Building upon medical knowledge, this paper succinctly summarizes the application of digital twin technology in key areas of orthodontics, including precision medicine, personalized orthodontic treatment, prediction of soft and hard tissues before and after orthodontic treatment, and the orthodontic cloud platform. The study provides a feasibility analysis of an intelligent predictive model for orthodontic treatment under multimodal fusion, offering a robust solution for establishing a comprehensive digital twin-assisted diagnostic paradigm.
{"title":"Future perspectives of digital twin technology in orthodontics","authors":"Yanning Ma , Yiran Li , Xulin Liu , Jie Gao , Axian Wang , Haiwen chen , Zhi Liu , Zuolin Jin","doi":"10.1016/j.displa.2024.102818","DOIUrl":"10.1016/j.displa.2024.102818","url":null,"abstract":"<div><p>Orthodontic treatment is a subject of prevention, treatment and prognostic care for malocclusion. The complexity and multimodality of orthodontic treatment restrict the development of intelligent orthodontics, due to the different growth and development characteristics of patients with different treatment stages and the different prognosis of treatment at different stages. Digital twin technology can effectively solve the problems of orthodontics ,due to its ability to interpret the deep information of big data. Building upon medical knowledge, this paper succinctly summarizes the application of digital twin technology in key areas of orthodontics, including precision Medicine, personalized orthodontic treatment, prediction of soft and hard tissues before and after orthodontic treatment, and the orthodontic cloud platform. The study provides a feasibility analysis of an intelligent predictive model for orthodontic treatment under multimodal fusion, offering a robust solution for establishing a comprehensive digital twin-assisted diagnostic paradigm.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102818"},"PeriodicalIF":3.7,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142148484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Label-aware aggregation on heterophilous graphs for node representation learning
Pub Date: 2024-08-24 | DOI: 10.1016/j.displa.2024.102817
Linruo Liu, Yangtao Wang, Yanzhao Xie, Xin Tan, Lizhuang Ma, Maobin Tang, Meie Fang
Learning node representations on heterophilous graphs is challenging because connected nodes may have diverse labels and attributes. The main idea is to balance the contributions of the center node and its neighborhoods. However, existing methods fail to make full use of the personalized contributions of different neighborhoods according to whether they share the center node's label, making it necessary to explore the distinctive contributions of similar and dissimilar neighborhoods. We reveal that both similar and dissimilar neighborhoods have positive impacts on feature aggregation under different homophily ratios; in particular, dissimilar neighborhoods play a significant role under low homophily ratios. Based on this, we propose LAAH, a label-aware aggregation approach for node representation learning on heterophilous graphs. LAAH separates each center node from its neighborhoods and generates their own node representations. Additionally, for each neighborhood, LAAH records its label information based on whether it belongs to the same class as the center node and then aggregates its effective feature in a weighted manner. Finally, a learnable parameter is used to balance the contributions of each center node and all its neighborhoods, leading to updated representations. Extensive experiments on 8 real-world heterophilous datasets and a synthetic dataset verify that LAAH achieves competitive or superior accuracy in node classification with lower parameter scale and computational complexity than the SOTA methods. The code is released at GitHub: https://github.com/laah123graph/LAAH.
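To make the aggregation scheme concrete, here is a minimal sketch (ours, not the released code at the GitHub link above) of the label-aware idea as the abstract describes it: same-label and different-label neighbors are transformed separately, and a learnable parameter balances the center node against the aggregated neighborhood. The module name, the use of training labels at aggregation time, and the mean normalization are our assumptions.

```python
import torch
import torch.nn as nn

class LabelAwareAggregation(nn.Module):
    """Sketch: transform same-label and different-label neighbors separately,
    then balance center vs. neighborhood with a learnable parameter."""
    def __init__(self, dim):
        super().__init__()
        self.w_sim = nn.Linear(dim, dim)   # applied to same-label neighbors
        self.w_dis = nn.Linear(dim, dim)   # applied to different-label neighbors
        self.alpha = nn.Parameter(torch.tensor(0.5))  # center/neighborhood balance

    def forward(self, x, edge_index, labels):
        # x: (N, dim) node features; edge_index: (2, E) src->dst; labels: (N,)
        src, dst = edge_index
        same = (labels[src] == labels[dst]).float().unsqueeze(-1)   # (E, 1)
        msg = same * self.w_sim(x[src]) + (1.0 - same) * self.w_dis(x[src])
        agg = torch.zeros_like(x).index_add_(0, dst, msg)           # sum messages
        deg = torch.zeros(x.size(0), 1, device=x.device).index_add_(
            0, dst, torch.ones(src.size(0), 1, device=x.device)).clamp(min=1.0)
        return self.alpha * x + (1.0 - self.alpha) * agg / deg      # updated reps
```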
{"title":"Label-aware aggregation on heterophilous graphs for node representation learning","authors":"Linruo Liu , Yangtao Wang , Yanzhao Xie , Xin Tan , Lizhuang Ma , Maobin Tang , Meie Fang","doi":"10.1016/j.displa.2024.102817","DOIUrl":"10.1016/j.displa.2024.102817","url":null,"abstract":"<div><p>Learning node representation on heterophilous graphs has been challenging due to nodes with diverse labels/attributes being connected. The main idea is to balance contributions between the center node and neighborhoods. However, existing methods failed to make full use of personalized contributions of different neighborhoods based on whether they own the same label as the center node, making it necessary to explore the distinctive contributions of similar/dissimilar neighborhoods. We reveal that both similar/dissimilar neighborhoods have positive impacts on feature aggregation under different homophily ratios. Especially, dissimilar neighborhoods play a significant role under low homophily ratios. Based on this, we propose LAAH, a label-aware aggregation approach for node representation learning on heterophilous graphs. LAAH separates each center node from its neighborhoods and generates their own node representations. Additionally, for each neighborhood, LAAH records its label information based on whether it belongs to the same class as the center node and then aggregates its effective feature in a weighted manner. Finally, a learnable parameter is used to balance the contributions of each center node and all its neighborhoods, leading to updated representations. Extensive experiments on 8 real-world heterophilous datasets and a synthetic dataset verify that LAAH can achieve competitive or superior accuracy in node classification with lower parameter scale and computational complexity compared with the SOTA methods. The code is released at GitHub: <span><span>https://github.com/laah123graph/LAAH</span><svg><path></path></svg></span>.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102817"},"PeriodicalIF":3.7,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142084113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pig-DTpV: A prior information guided directional TpV algorithm for orthogonal translation computed laminography
Pub Date: 2024-08-22 | DOI: 10.1016/j.displa.2024.102812
Yarui Xi, Zhiwei Qiao, Ao Wang, Chenyun Fang, Fenglin Liu
Local scanning orthogonal translation computed laminography (OTCL) has great potential for detecting tiny faults in laminated thin-plate parts. However, it generates limited-angle and truncated projection data, which result in aliasing and truncation artifacts in the reconstructed images. The directional total variation (DTV) algorithm has been demonstrated to achieve highly accurate reconstructions in limited-angle computed tomography (CT), but its application to local scanning OTCL has not been explored. Building on this algorithm, we introduce the l_p norm to better suppress artifacts, and prior information to further constrain the reconstructed image. Thus, we propose a prior information guided directional total p-variation (DTpV) algorithm (Pig-DTpV). The Pig-DTpV model is a constrained non-convex optimization model: the constraint terms are the six DTpV terms, whereas the objective term is the data fidelity term. We then use an iterative reweighting strategy and the Chambolle–Pock (CP) algorithm to solve the model. The Pig-DTpV reconstruction algorithm's performance is compared with other algorithms, such as the simultaneous algebraic reconstruction technique (SART), TV, reweighted anisotropic-TV (RwATV), and DTV, in simulation and real-data experiments. The results demonstrate that Pig-DTpV reduces truncation and aliasing artifacts and enhances the quality of reconstructed images.
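The solver combines iterative reweighting with the Chambolle–Pock primal–dual algorithm. The sketch below illustrates that combination on a deliberately simple 1-D weighted-TV denoising problem — not the authors' six-term DTpV laminography model; the step sizes, the reweighting formula w_i = (|(Dx)_i| + eps)^(p-1), and all parameter values are our assumptions.

```python
import numpy as np

def weighted_tv_cp(b, lam, w, n_iter=200, tau=0.25, sigma=0.5):
    """Chambolle-Pock for min_x 0.5*||x - b||^2 + lam * sum_i w_i * |(Dx)_i|,
    with D the 1-D forward-difference operator (a toy stand-in for DTpV)."""
    x, x_bar = b.copy(), b.copy()
    y = np.zeros(len(b) - 1)                        # dual variable, one per difference
    for _ in range(n_iter):
        y = y + sigma * np.diff(x_bar)              # dual ascent step
        y = np.clip(y, -lam * w, lam * w)           # project onto the weighted ball
        dty = np.zeros_like(x)                      # D^T y (adjoint of np.diff)
        dty[:-1] -= y
        dty[1:] += y
        x_new = (x - tau * dty + tau * b) / (1.0 + tau)  # prox of the fidelity term
        x_bar = 2.0 * x_new - x
        x = x_new
    return x

def reweighted_tpv(b, lam=0.5, p=0.5, eps=1e-3, outer=5):
    """Iterative reweighting: each outer pass solves a weighted-TV problem whose
    weights approximate the non-convex l_p quasi-norm of the differences."""
    x = b.copy()
    for _ in range(outer):
        w = (np.abs(np.diff(x)) + eps) ** (p - 1.0)
        x = weighted_tv_cp(b, lam, w)
    return x
```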
{"title":"Pig-DTpV: A prior information guided directional TpV algorithm for orthogonal translation computed laminography","authors":"Yarui Xi , Zhiwei Qiao , Ao Wang , Chenyun Fang , Fenglin Liu","doi":"10.1016/j.displa.2024.102812","DOIUrl":"10.1016/j.displa.2024.102812","url":null,"abstract":"<div><p>The local scanning orthogonal translation computed laminography (OTCL) has great potential for tiny fault detection of laminated structure thin-plate parts. However, it generates limited-angle and truncated projection data, which result in aliasing and truncation artifacts in the reconstructed images. The directional total variation (DTV) algorithm has been demonstrated to achieve highly accurate reconstructed images in limited-angle computed tomography (CT). However, its application in local scanning OTCL has not been explored. Based on this algorithm, we introduce the <span><math><msub><mrow><mi>l</mi></mrow><mrow><mi>p</mi></mrow></msub></math></span> norm to better suppress artifacts, and prior information to further constrain the reconstructed image. Thus, we propose a prior information guided directional total p-variation (DTpV) algorithm (Pig-DTpV). The Pig-DTpV model is a constrained non-convex optimization model. The constraint term are the six DTpV terms, whereas the objective term is the data fidelity term. Then, we use the iterative reweighting strategy and the Chambolle–Pock (CP) algorithm to solve the model. The Pig-DTpV reconstruction algorithm’s performance is compared with other algorithms such as simultaneous algebraic reconstruction technique (SART), TV, reweighted anisotropic-TV (RwATV), and DTV in simulation and real data experiments. The experiment results demonstrate that the Pig-DTpV algorithm can reduce truncation and aliasing artifacts and enhance the quality of reconstructed images.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102812"},"PeriodicalIF":3.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142058186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generative adversarial networks with deep blind degradation powered terahertz ptychography
Pub Date: 2024-08-21 | DOI: 10.1016/j.displa.2024.102815
Ziwei Ming, Defeng Liu, Long Xiao, Siyu Tu, Peng Chen, Yingshan Ma, Jinsong Liu, Zhengang Yang, Kejia Wang
Ptychography is an imaging technique that uses the redundancy of information generated by overlapping adjacent illuminated regions to calculate the relative phase of adjacent regions and reconstruct the image. To make ptychography better serve engineering applications in the terahertz domain, we propose a deep learning terahertz ptychography system that is easier to realize in engineering practice. Specifically, we use a powerful deep blind degradation model that applies isotropic and anisotropic Gaussian kernels for random blurring, chooses the downsampling mode from nearest-neighbor interpolation, bilinear interpolation, bicubic interpolation, and a down-up-sampling method, and introduces Gaussian noise, JPEG compression noise, and processed detector noise. Additionally, a random shuffle strategy is used to further expand the degradation space of the image. Using paired low/high-resolution images generated by the deep blind degradation model, we trained a multi-layer residual network with residual scaling parameters and a dense connection structure, achieving neural-network super-resolution of terahertz ptychography for the first time. We compare our model with two representative neural networks, SwinIR and RealESRGAN. Experimental results show that the proposed method achieves better accuracy and visual quality than other terahertz ptychographic image super-resolution algorithms. Further quantitative evaluation confirms these advantages, with 33.09 dB on the peak signal-to-noise ratio (PSNR) index and 3.05 on the naturalness image quality estimator (NIQE) index. This efficient and engineered approach fills the gap in improving terahertz ptychography with neural networks.
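A minimal sketch of the kind of blind degradation pipeline the abstract describes: random isotropic/anisotropic Gaussian blur, a randomly chosen downsampling mode (including a down-up variant), then Gaussian noise and JPEG compression applied in shuffled order. All kernel sizes, probabilities, and parameter ranges are our assumptions, not the paper's settings, and the "processed detector noise" stage is omitted.

```python
import random
import numpy as np
import cv2

def random_blur(img):
    """Random isotropic or anisotropic Gaussian blur (parameter ranges assumed)."""
    k = random.choice([7, 9, 11, 13, 15, 17, 21])
    sigma_x = random.uniform(0.2, 3.0)
    sigma_y = sigma_x if random.random() < 0.5 else random.uniform(0.2, 3.0)
    return cv2.GaussianBlur(img, (k, k), sigma_x, sigmaY=sigma_y)

def random_downsample(img, scale):
    """Downsample with a randomly chosen mode; sometimes via a down-up detour."""
    h, w = img.shape[:2]
    mode = random.choice([cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC])
    if random.random() < 0.25:  # down-up-sampling variant
        tmp = cv2.resize(img, (w // (2 * scale), h // (2 * scale)), interpolation=mode)
        return cv2.resize(tmp, (w // scale, h // scale), interpolation=mode)
    return cv2.resize(img, (w // scale, h // scale), interpolation=mode)

def add_gaussian_noise(img):
    sigma = random.uniform(1.0, 25.0)
    noisy = img.astype(np.float32) + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def jpeg_compress(img):
    quality = random.randint(30, 95)
    _, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    return cv2.imdecode(buf, cv2.IMREAD_UNCHANGED)

def degrade(hr, scale=4):
    """HR image (uint8) -> synthetic LR counterpart for paired training."""
    lr = random_downsample(random_blur(hr), scale)
    noise_ops = [add_gaussian_noise, jpeg_compress]
    random.shuffle(noise_ops)  # shuffling the noise order widens the degradation space
    for op in noise_ops:
        lr = op(lr)
    return lr
```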
{"title":"Generative adversarial networks with deep blind degradation powered terahertz ptychography","authors":"Ziwei Ming , Defeng Liu , Long Xiao , Siyu Tu , Peng Chen , Yingshan Ma , Jinsong Liu , Zhengang Yang , Kejia Wang","doi":"10.1016/j.displa.2024.102815","DOIUrl":"10.1016/j.displa.2024.102815","url":null,"abstract":"<div><p>Ptychography is an imaging technique that uses the redundancy of information generated by the overlapping of adjacent light regions to calculate the relative phase of adjacent regions and reconstruct the image. In the terahertz domain, in order to make the ptychography technology better serve engineering applications, we propose a set of deep learning terahertz ptychography system that is easier to realize in engineering and plays an outstanding role. To address this issue, we propose to use a powerful deep blind degradation model which uses isotropic and anisotropic Gaussian kernels for random blurring, chooses the downsampling modes from nearest interpolation, bilinear interpolation, bicubic interpolation and down-up-sampling method, and introduces Gaussian noise, JPEG compression noise, and processed detector noise. Additionally, a random shuffle strategy is used to further expand the degradation space of the image. Using paired low/high resolution images generated by the deep blind degradation model, we trained a multi-layer residual network with residual scaling parameters and dense connection structure to achieve the neural network super-resolution of terahertz ptychography for the first time. We use two representative neural networks, SwinIR and RealESRGAN, to compare with our model. Experimental result shows that the proposed method achieved better accuracy and visual improvement than other terahertz ptychographic image super-resolution algorithms. Further quantitative calculation proved that our method has significant advantages in terahertz ptychographic image super-resolution, achieving a resolution of 33.09 dB on the peak signal-to-noise ratio (PSNR) index and 3.05 on the naturalness image quality estimator (NIQE) index. This efficient and engineered approach fills the gap in the improvement of terahertz ptychography by using neural networks.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102815"},"PeriodicalIF":3.7,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142040041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial–angular–epipolar transformer for light field spatial and angular super-resolution
Pub Date: 2024-08-20 | DOI: 10.1016/j.displa.2024.102816
Sizhe Wang, Hao Sheng, Rongshan Chen, Da Yang, Zhenglong Cui, Ruixuan Cong, Zhang Xiong
Transformer-based light field (LF) super-resolution (SR) methods have recently achieved significant performance improvements thanks to global feature modeling by self-attention mechanisms. However, because the transformer was designed for natural language processing, 4D LFs must be reshaped into 1D sequences with an immense number of tokens, which incurs quadratic computational complexity. In this paper, a spatial–angular–epipolar swin transformer (SAEST) is proposed for spatial and angular SR (SASR), which extracts SR information in the spatial, angular, and epipolar domains using local self-attention with shifted windows. Specifically, in SAEST, a spatial swin transformer and an angular standard transformer are first cascaded to extract spatial and angular SR features separately. The extracted SR feature is then reshaped into the epipolar-plane-image pattern and fed into an epipolar swin transformer to extract spatial–angular correlation information. Finally, several SAEST blocks are cascaded in a Unet framework to extract multi-scale SR features for SASR. Experimental results indicate that SAEST is a fast transformer-based SASR method with less running time and GPU consumption and outstanding performance on simulated and real-world public datasets.
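The three attention domains can be read as three ways of factoring a 4-D light field into token sequences. Below is a sketch in our own notation — the windowed ("swin") attention, the actual SAEST blocks, and the Unet cascade are all omitted — with angular coordinates (u, v) and spatial coordinates (h, w):

```python
import torch

# A light field batch laid out as (B, U, V, H, W, C):
# (U, V) index the views (angular), (H, W) the pixels (spatial).

def spatial_tokens(lf):
    B, U, V, H, W, C = lf.shape
    return lf.reshape(B * U * V, H * W, C)        # per view, attend over pixels

def angular_tokens(lf):
    B, U, V, H, W, C = lf.shape
    # per pixel, attend over views
    return lf.permute(0, 3, 4, 1, 2, 5).reshape(B * H * W, U * V, C)

def epipolar_tokens(lf):
    B, U, V, H, W, C = lf.shape
    # horizontal epipolar-plane images: fix (u, h), attend jointly over (v, w)
    return lf.permute(0, 1, 3, 2, 4, 5).reshape(B * U * H, V * W, C)

lf = torch.randn(1, 5, 5, 32, 32, 16)             # toy 5x5-view light field
print(spatial_tokens(lf).shape, angular_tokens(lf).shape, epipolar_tokens(lf).shape)
```

Each layout keeps the attention sequence short (H·W, U·V, or V·W tokens) instead of flattening all of U·V·H·W into one sequence, which is where the quadratic cost of a naive transformer comes from.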
{"title":"Spatial–angular–epipolar transformer for light field spatial and angular super-resolution","authors":"Sizhe Wang , Hao Sheng , Rongshan Chen , Da Yang , Zhenglong Cui , Ruixuan Cong , Zhang Xiong","doi":"10.1016/j.displa.2024.102816","DOIUrl":"10.1016/j.displa.2024.102816","url":null,"abstract":"<div><p>Transformer-based light field (LF) super-resolution (SR) methods have recently achieved significant performance improvements due to global feature modeling by self-attention mechanisms. However, as a method designed for natural language processing, 4D LFs are reshaped into 1D sequences with an immense set of tokens, which results in a quadratic computational complexity cost. In this paper, a spatial–angular–epipolar swin transformer (SAEST) is proposed for spatial and angular SR (SASR), which sufficiently extracts SR information in the spatial, angular, and epipolar domains using local self-attention with shifted windows. Specifically, in SAEST, a spatial swin transformer and an angular standard transformer are firstly cascaded to extract spatial and angular SR features, separately. Then, the extracted SR feature is reshaped into the epipolar plane image pattern and fed into an epipolar swin transformer to extract the spatial–angular correlation information. Finally, several SAEST blocks are cascaded in a Unet framework to extract multi-scale SR features for SASR. Experiment results indicate that SAEST is a fast transformer-based SASR method with less running time and GPU consumption and has outstanding performance on simulated and real-world public datasets.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102816"},"PeriodicalIF":3.7,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142148483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TS-BEV: BEV object detection algorithm based on temporal-spatial feature fusion
Pub Date: 2024-08-19 | DOI: 10.1016/j.displa.2024.102814
Xinlong Dong, Peicheng Shi, Heng Qi, Aixi Yang, Taonian Liang
To accurately identify occluded targets and infer the motion state of objects, we propose a Bird's-Eye View object detection network based on temporal-spatial feature fusion (TS-BEV), which replaces multi-frame sampling with cyclic propagation of historical frame instance information. We design a new temporal-spatial feature fusion attention module that fully integrates temporal information with spatial features and improves inference and training speed. To realize multi-frame feature fusion across multiple scales and views, we propose an efficient temporal-spatial deformable aggregation module, which performs feature sampling and weighted summation over feature maps from historical and current frames, exploiting the parallel computing capabilities of GPUs and AI chips to further improve efficiency. Furthermore, to address the lack of global inference in temporal-spatial fused BEV features and the inability of instance features distributed across locations to interact fully, we design a BEV self-attention mechanism module that operates on features globally, enhancing global inference ability and instance-feature interaction. We have carried out extensive experiments on the challenging nuScenes BEV object detection dataset. Quantitative results show that our method achieves excellent performance of 61.5% mAP and 68.5% NDS on camera-only 3D object detection tasks, and qualitative results show that TS-BEV effectively handles 3D object detection in complex traffic backgrounds and low-light night scenes, with good robustness and scalability.
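To illustrate the deformable aggregation step, here is a minimal sketch of the general pattern — bilinearly sample each frame's BEV feature map at predicted locations, then take a learned weighted sum over frames. The function name, shapes, and the use of `grid_sample` are our assumptions about the mechanism, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def temporal_spatial_aggregate(feat_maps, sample_pts, weights):
    """Deformable-style aggregation sketch.
    feat_maps:  (T, C, H, W) current + historical BEV feature maps;
    sample_pts: (T, N, 2) sampling locations in [-1, 1] grid coordinates;
    weights:    (T, N, 1) learned per-point aggregation weights."""
    fused = []
    for t in range(feat_maps.size(0)):
        grid = sample_pts[t].view(1, -1, 1, 2)                   # (1, N, 1, 2)
        s = F.grid_sample(feat_maps[t:t + 1], grid,
                          mode="bilinear", align_corners=False)  # (1, C, N, 1)
        fused.append(s[0, :, :, 0].t())                          # (N, C)
    fused = torch.stack(fused)                                    # (T, N, C)
    return (weights * fused).sum(dim=0)                           # (N, C) fused feature
```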
{"title":"TS-BEV: BEV object detection algorithm based on temporal-spatial feature fusion","authors":"Xinlong Dong , Peicheng Shi , Heng Qi , Aixi Yang , Taonian Liang","doi":"10.1016/j.displa.2024.102814","DOIUrl":"10.1016/j.displa.2024.102814","url":null,"abstract":"<div><p>In order to accurately identify occluding targets and infer the motion state of objects, we propose a Bird’s-Eye View Object Detection Network based on Temporal-Spatial feature fusion (TS-BEV), which replaces the previous multi-frame sampling method by using the cyclic propagation mode of historical frame instance information. We design a new Temporal-Spatial feature fusion attention module, which fully integrates temporal information and spatial features, and improves the inference and training speed. In response to realize multi-frame feature fusion across multiple scales and views, we propose an efficient Temporal-Spatial deformable aggregation module, which performs feature sampling and weighted summation from multiple feature maps of historical frames and current frames, and makes full use of the parallel computing capabilities of GPUs and AI chips to further improve efficiency. Furthermore, in order to solve the lack of global inference in the context of temporal-spatial fusion BEV features and the inability of instance features distributed in different locations to fully interact, we further design the BEV self-attention mechanism module to perform global operation of features, enhance global inference ability and fully interact with instance features. We have carried out extensive experimental experiments on the challenging BEV object detection nuScenes dataset, quantitative results show that our method achieves excellent performance of 61.5% mAP and 68.5% NDS in camera-only 3D object detection tasks, and qualitative results show that TS-BEV can effectively solve the problem of 3D object detection in complex traffic background with lack of light at night, with good robustness and scalability.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102814"},"PeriodicalIF":3.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142040451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Skeuomorphic or flat? The effects of icon style on visual search and recognition performance
Pub Date: 2024-08-17 | DOI: 10.1016/j.displa.2024.102813
Zhangfan Shen, Tiantian Chen, Yi Wang, Moke Li, Jiaxiang Chen, Zhanpeng Hu
Although there have been many previous studies on icon visual search and recognition performance, only a few have considered the effects of both the internal and external characteristics of icons. In this behavioral study, we employed a visual search task and a semantic recognition task to explore the effects of icon style, semantic distance (SD), and task difficulty on users’ performance in perceiving and identifying icons. First, we created and filtered 64 new icons, which were divided into four different groups (flat design & close SD, flat design & far SD, skeuomorphic design & close SD, skeuomorphic design & far SD) through expert evaluation. A total of 40 participants (13 men and 27 women, ages ranging from 19 to 25 years, mean age = 21.9 years, SD = 1.93) were asked to perform an icon visual search task and an icon recognition task after a round of learning. Participants’ accuracy and response time were measured as a function of the following independent variables: two icon styles (flat or skeuomorphic), two levels of SD (close or far), and two levels of task difficulty (easy or difficult). The results showed that flat icons yielded better visual search performance than skeuomorphic icons, and this benefit increased as the task difficulty increased. However, in the icon recognition task, participants’ performance in recalling skeuomorphic icons was significantly better than in recalling flat icons. Furthermore, a strong interaction effect between icon style and task difficulty was observed for response time: as the task difficulty decreased, the difference in recognition performance between the two icon styles increased significantly. These findings provide valuable guidance for the design of icons in human–computer interaction interfaces.
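For readers reproducing this kind of 2 × 2 × 2 within-subject design, a repeated-measures ANOVA on response time could be run as in the sketch below; the file name, column names, and analysis choice are hypothetical, not the authors' code.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: one row per trial, with columns
# subject, style (flat/skeuomorphic), sd (close/far),
# difficulty (easy/difficult), and rt (response time in ms).
df = pd.read_csv("icon_search_rt.csv")

# Repeated-measures ANOVA over the three within-subject factors;
# trials are averaged per cell before fitting.
res = AnovaRM(df, depvar="rt", subject="subject",
              within=["style", "sd", "difficulty"],
              aggregate_func="mean").fit()
print(res)
```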
{"title":"Skeuomorphic or flat? The effects of icon style on visual search and recognition performance","authors":"Zhangfan Shen, Tiantian Chen, Yi Wang, Moke Li, Jiaxiang Chen, Zhanpeng Hu","doi":"10.1016/j.displa.2024.102813","DOIUrl":"10.1016/j.displa.2024.102813","url":null,"abstract":"<div><p>Although there have been many previous studies on icon visual search and recognition performance, only a few have considered the effects of both the internal and external characteristics of icons. In this behavioral study, we employed a visual search task and a semantic recognition task to explore the effects of icon style, semantic distance (SD), and task difficulty on users’ performance in perceiving and identifying icons. First, we created and filtered 64 new icons, which were divided into four different groups (flat design & close SD, flat design & far SD, skeuomorphic design & close SD, skeuomorphic design & far SD) through expert evaluation. A total of 40 participants (13 men and 27 women, ages ranging from 19 to 25 years, mean age = 21.9 years, SD=1.93) were asked to perform an icon visual search task and an icon recognition task after a round of learning. Participants’ accuracy and response time were measured as a function of the following independent variables: two icon styles (flat or skeuomorphic style), two levels of SD (close or far), and two levels of task difficulty (easy or difficult). The results showed that flat icons had better visual search performance than skeuomorphic icons; this beneficial effect increased as the task difficulty increased. However, in the icon recognition task, participants’ performance in recalling skeuomorphic icons was significantly better than that in recalling flat icons. Furthermore, a strong interaction effect between icon style and task difficulty was observed for response time. As the task difficulty decreased, the difference in recognition performance between these two different icon styles increased significantly. These findings provide valuable guidance for the design of icons in human–computer interaction interfaces.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102813"},"PeriodicalIF":3.7,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142021071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interactive geometry editing of Neural Radiance Fields
Pub Date: 2024-08-13 | DOI: 10.1016/j.displa.2024.102810
Shaoxu Li, Ye Pan
Neural Radiance Fields (NeRF) have recently emerged as a promising approach for synthesizing highly realistic images from 3D scenes. This technology has shown impressive results in capturing intricate details and producing photorealistic renderings. However, one of the limitations of traditional NeRF approaches is the difficulty in editing and manipulating the geometry of the scene once it has been captured. This restriction hinders creative freedom and practical applicability.
In this paper, we propose a method that enables interactive geometry editing for neural radiance field manipulation. We use two proxy cages (an inner cage and an outer cage) to edit a scene: the inner cage defines the operation target, and the outer cage defines the adjustment space. Various operations apply to the two cages. After cage selection, operations on the inner cage produce the desired transformation of the inner cage and a corresponding adjustment of the outer cage. Users can edit the scene with translation, rotation, scaling, or combinations thereof; operations on the corners and edges of the cages are also supported. Our method does not need any explicit 3D geometry representation, as the interactive geometry editing applies directly to the implicit neural radiance fields. Extensive experimental results demonstrate the effectiveness of our approach.
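To illustrate the two-cage idea, here is a toy sketch: the user's rigid edit applies fully inside an inner region, fades to zero at an outer region, and points beyond the outer region stay fixed. The spherical regions and linear blend are simplifications of ours — the paper uses polygonal proxy cages — and rendering the edited scene would query the unmodified NeRF at inverse-warped ray samples.

```python
import numpy as np

def warp_points(pts, center, R, t, r_inner, r_outer):
    """pts: (N, 3) scene points; (R, t) is the user's rigid edit.
    The edit is full inside radius r_inner, blends to zero by r_outer,
    and leaves points outside the outer region untouched."""
    d = np.linalg.norm(pts - center, axis=-1, keepdims=True)
    blend = np.clip((r_outer - d) / (r_outer - r_inner), 0.0, 1.0)  # 1 in, 0 out
    edited = (pts - center) @ R.T + center + t                      # rigid edit
    return blend * edited + (1.0 - blend) * pts
```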
{"title":"Interactive geometry editing of Neural Radiance Fields","authors":"Shaoxu Li, Ye Pan","doi":"10.1016/j.displa.2024.102810","DOIUrl":"10.1016/j.displa.2024.102810","url":null,"abstract":"<div><p>Neural Radiance Fields (NeRF) have recently emerged as a promising approach for synthesizing highly realistic images from 3D scenes. This technology has shown impressive results in capturing intricate details and producing photorealistic renderings. However, one of the limitations of traditional NeRF approaches is the difficulty in editing and manipulating the geometry of the scene once it has been captured. This restriction hinders creative freedom and practical applicability.</p><p>In this paper, we propose a method that enables interactive geometry editing for neural radiance fields manipulation. We use two proxy cages (inner cage and outer cage) to edit a scene. The inner cage defines the operation target, and the outer cage defines the adjustment space. Various operations apply to the two cages. After cage selection, operations on the inner cage lead to the desired transformation of the inner cage and adjustment of the outer cage. Users can edit the scene with translation, rotation, scaling, or combinations. The operations on the corners and edges of the cage are also supported. Our method does not need any explicit 3D geometry representations. The interactive geometry editing applies directly to the implicit neural radiance fields. Extensive experimental results demonstrate the effectiveness of our approach.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102810"},"PeriodicalIF":3.7,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141978908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards benchmarking VR sickness: A novel methodological framework for assessing contributing factors and mitigation strategies through rapid VR sickness induction and recovery
Pub Date: 2024-08-13 | DOI: 10.1016/j.displa.2024.102807
Rose Rouhani, Narmada Umatheva, Jannik Brockerhoff, Behrang Keshavarz, Ernst Kruijff, Jan Gugenheimer, Bernhard E. Riecke
Virtual Reality (VR) sickness remains a significant challenge in the widespread adoption of VR technologies. The absence of a standardized benchmark system hinders progress in understanding and effectively countering VR sickness. This paper proposes an initial step towards a benchmark system, utilizing a novel methodological framework to serve as a common platform for evaluating contributing VR sickness factors and mitigation strategies. Our benchmark, grounded in established theories and leveraging existing research, features both small and large environments. In two research studies, we validated our system by demonstrating its capability to (1) quickly, reliably, and controllably induce VR sickness in both environments, followed by a rapid decline post-stimulus, facilitating cost and time-effective within-subject studies and increased statistical power, (2) integrate and evaluate established VR sickness mitigation methods — static and dynamic field of view reduction, blur, and virtual nose — demonstrating their effectiveness in reducing symptoms in the benchmark and their direct comparison within a standardized setting. Our proposed benchmark also enables broader, more comparative research into different technical, setup, and participant variables influencing VR sickness and overall user experience, ultimately paving the way for building a comprehensive database to identify the most effective strategies for specific VR applications.
{"title":"Towards benchmarking VR sickness: A novel methodological framework for assessing contributing factors and mitigation strategies through rapid VR sickness induction and recovery","authors":"Rose Rouhani , Narmada Umatheva , Jannik Brockerhoff , Behrang Keshavarz , Ernst Kruijff , Jan Gugenheimer , Bernhard E. Riecke","doi":"10.1016/j.displa.2024.102807","DOIUrl":"10.1016/j.displa.2024.102807","url":null,"abstract":"<div><p>Virtual Reality (VR) sickness remains a significant challenge in the widespread adoption of VR technologies. The absence of a standardized benchmark system hinders progress in understanding and effectively countering VR sickness. This paper proposes an initial step towards a benchmark system, utilizing a novel methodological framework to serve as a common platform for evaluating contributing VR sickness factors and mitigation strategies. Our benchmark, grounded in established theories and leveraging existing research, features both small and large environments. In two research studies, we validated our system by demonstrating its capability to (1) quickly, reliably, and controllably induce VR sickness in both environments, followed by a rapid decline post-stimulus, facilitating cost and time-effective within-subject studies and increased statistical power, (2) integrate and evaluate established VR sickness mitigation methods — static and dynamic field of view reduction, blur, and virtual nose — demonstrating their effectiveness in reducing symptoms in the benchmark and their direct comparison within a standardized setting. Our proposed benchmark also enables broader, more comparative research into different technical, setup, and participant variables influencing VR sickness and overall user experience, ultimately paving the way for building a comprehensive database to identify the most effective strategies for specific VR applications.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102807"},"PeriodicalIF":3.7,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0141938224001719/pdfft?md5=2e64eaeb33beb05d2ed088ab7163143d&pid=1-s2.0-S0141938224001719-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142048360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A feature fusion module based on complementary attention for medical image segmentation
Pub Date: 2024-08-10 | DOI: 10.1016/j.displa.2024.102811
Mingyue Yang, Xiaoxuan Dong, Wang Zhang, Peng Xie, Chuan Li, Shanxiong Chen
Automated segmentation algorithms are a crucial component of medical image analysis, playing an essential role in assisting professionals to achieve accurate diagnoses. Traditional convolutional neural networks (CNNs) face challenges when dealing with complex and variable lesions: limited by the receptive field of convolutional operators, CNNs often struggle to capture long-range dependencies of complex lesions. The transformer's outstanding ability to capture long-range dependencies offers a new perspective on addressing these challenges. Inspired by this, our research aims to combine the precise spatial detail extraction capabilities of CNNs with the global semantic understanding abilities of transformers. Unlike traditional fusion methods, we propose a fine-grained feature fusion strategy based on complementary attention, deeply exploring and complementarily fusing the feature representations of the encoder. Moreover, considering that relying on feature fusion alone might overlook critical texture details and key edge features in the segmentation task, we designed a feature enhancement module based on information entropy. This module emphasizes shallow texture features and edge information, enabling the model to more accurately capture and enhance multi-level details of the image, further improving segmentation results. Our method was tested on multiple public segmentation datasets of polyps and skin lesions, and performed better than state-of-the-art methods. Extensive qualitative results indicate that our method remains robust even on challenging cases of narrow or blurry-boundary lesions.
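A minimal sketch of an information-entropy-based enhancement of the kind the abstract describes: a local-entropy map (high on texture and edges) re-weights shallow features. The window size, the gating formula, and the placement of the module are our assumptions, not the paper's design.

```python
import numpy as np
import torch
from skimage.filters.rank import entropy
from skimage.morphology import disk

def entropy_gate(image_u8, shallow_feat):
    """image_u8: (H, W) uint8 grayscale input; shallow_feat: (1, C, H, W) tensor
    of shallow encoder features at full resolution (an assumption). High local
    entropy marks texture and edges, so those regions get emphasized."""
    ent = entropy(image_u8, disk(5))                   # local Shannon entropy map
    ent = (ent - ent.min()) / (np.ptp(ent) + 1e-8)     # normalize to [0, 1]
    gate = torch.from_numpy(ent).float()[None, None]   # (1, 1, H, W)
    return shallow_feat * (1.0 + gate)                 # residual-style emphasis
```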
{"title":"A feature fusion module based on complementary attention for medical image segmentation","authors":"Mingyue Yang , Xiaoxuan Dong , Wang Zhang , Peng Xie , Chuan Li , Shanxiong Chen","doi":"10.1016/j.displa.2024.102811","DOIUrl":"10.1016/j.displa.2024.102811","url":null,"abstract":"<div><p>Automated segmentation algorithms are a crucial component of medical image analysis, playing an essential role in assisting professionals to achieve accurate diagnoses. Traditional convolutional neural networks (CNNs) face challenges when dealing with complex and variable lesions: limited by the receptive field of convolutional operators, CNNs often struggle to capture long-range dependencies of complex lesions. The transformer’s outstanding ability to capture long-range dependencies offers a new perspective on addressing these challenges. Inspired by this, our research aims to combine the precise spatial detail extraction capabilities of CNNs with the global semantic understanding abilities of transformers. Unlike traditional fusion methods, we propose a fine-grained feature fusion strategy based on complementary attention, deeply exploring and complementarily fusing the feature representations of the encoder. Moreover, considering that merely relying on feature fusion might overlook critical texture details and key edge features in the segmentation task, we designed a feature enhancement module based on information entropy. This module emphasizes shallow texture features and edge information, enabling the model to more accurately capture and enhance multi-level details of the image, further optimizing segmentation results. Our method was tested on multiple public segmentation datasets of polyps and skin lesions,and performed better than state-of-the-art methods. Extensive qualitative experimental results indicate that our method maintains robust performance even when faced with challenging cases of narrow or blurry-boundary lesions.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102811"},"PeriodicalIF":3.7,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141997547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}