Computers & Graphics-Uk最新文献_第10页

GIC-Flow: Appearance flow estimation via global information correlation for virtual try-on under large deformation GIC-Flow：通过大变形下虚拟试穿的全局信息相关性进行外观流估计

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-12 DOI: 10.1016/j.cag.2024.104071

Peng Zhang , Jiamei Zhan , Kexin Sun , Jie Zhang , Meng Wei , Kexin Wang

The primary aim of image-based virtual try-on is to seamlessly deform the target garment image to align with the human body. Owing to the inherent non-rigid nature of garments, current methods prioritise flexible deformation through appearance flow with high degrees of freedom. However, existing appearance flow estimation methods solely focus on the correlation of local feature information. While this strategy successfully avoids the extensive computational effort associated with the direct computation of the global information correlation of feature maps, it leads to challenges in garments adapting to large deformation scenarios. To overcome these limitations, we propose the GIC-Flow framework, which obtains appearance flow by calculating the global information correlation while reducing computational regression. Specifically, our proposed global streak information matching module is designed to decompose the appearance flow into horizontal and vertical vectors, effectively propagating global information in both directions. This innovative approach considerably diminishes computational requirements, contributing to an enhanced and efficient process. In addition, to ensure the accurate deformation of local texture in garments, we propose the local aggregate information matching module to aggregate information from the nearest neighbours before computing the global correlation and to enhance weak semantic information. Comprehensive experiments conducted using our method on the VITON and VITON-HD datasets show that GIC-Flow outperforms existing state-of-the-art algorithms, particularly in cases involving complex garment deformation.

基于图像的虚拟试穿的主要目的是对目标服装图像进行无缝变形，使其与人体保持一致。由于服装固有的非刚性特性，目前的方法优先考虑通过高自由度的外观流进行灵活变形。然而，现有的外观流估算方法仅关注局部特征信息的相关性。虽然这种策略成功地避免了直接计算特征图的全局信息相关性所带来的大量计算工作，但却给服装适应大变形场景带来了挑战。为了克服这些限制，我们提出了 GIC-Flow 框架，通过计算全局信息相关性获得外观流，同时减少计算回归。具体来说，我们提出的全局条纹信息匹配模块旨在将外观流分解为水平和垂直向量，从而有效地在两个方向传播全局信息。这种创新方法大大降低了计算要求，有助于提高流程的效率。此外，为了确保服装局部纹理的准确变形，我们提出了局部聚合信息匹配模块，在计算全局相关性之前聚合最近邻域的信息，并增强弱语义信息。使用我们的方法在 VITON 和 VITON-HD 数据集上进行的综合实验表明，GIC-Flow 优于现有的最先进算法，尤其是在涉及复杂服装变形的情况下。

{"title":"GIC-Flow: Appearance flow estimation via global information correlation for virtual try-on under large deformation","authors":"Peng Zhang , Jiamei Zhan , Kexin Sun , Jie Zhang , Meng Wei , Kexin Wang","doi":"10.1016/j.cag.2024.104071","DOIUrl":"10.1016/j.cag.2024.104071","url":null,"abstract":"<div><p>The primary aim of image-based virtual try-on is to seamlessly deform the target garment image to align with the human body. Owing to the inherent non-rigid nature of garments, current methods prioritise flexible deformation through appearance flow with high degrees of freedom. However, existing appearance flow estimation methods solely focus on the correlation of local feature information. While this strategy successfully avoids the extensive computational effort associated with the direct computation of the global information correlation of feature maps, it leads to challenges in garments adapting to large deformation scenarios. To overcome these limitations, we propose the GIC-Flow framework, which obtains appearance flow by calculating the global information correlation while reducing computational regression. Specifically, our proposed global streak information matching module is designed to decompose the appearance flow into horizontal and vertical vectors, effectively propagating global information in both directions. This innovative approach considerably diminishes computational requirements, contributing to an enhanced and efficient process. In addition, to ensure the accurate deformation of local texture in garments, we propose the local aggregate information matching module to aggregate information from the nearest neighbours before computing the global correlation and to enhance weak semantic information. Comprehensive experiments conducted using our method on the VITON and VITON-HD datasets show that GIC-Flow outperforms existing state-of-the-art algorithms, particularly in cases involving complex garment deformation.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104071"},"PeriodicalIF":2.5,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142229691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

MuSic-UDF: Learning Multi-Scale dynamic grid representation for high-fidelity surface reconstruction from point clouds MuSic-UDF：学习多尺度动态网格表示法，实现点云高保真表面重建

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-10 DOI: 10.1016/j.cag.2024.104081

Chuan Jin , Tieru Wu , Yu-Shen Liu , Junsheng Zhou

Surface reconstruction for point clouds is a central task in 3D modeling. Recently, the attractive approaches solve this problem by learning neural implicit representations, e.g., unsigned distance functions (UDFs), from point clouds, which have achieved good performance. However, the existing UDF-based methods still struggle to recover the local geometrical details. One of the difficulties arises from the used inflexible representations, which is hard to capture the local high-fidelity geometry details. In this paper, we propose a novel neural implicit representation, named MuSic-UDF, which leverages Multi-Scale dynamic grids for high-fidelity and flexible surface reconstruction from raw point clouds with arbitrary typologies. Specifically, we initialize a hierarchical voxel grid where each grid point stores a learnable 3D coordinate. Then, we optimize these grids such that different levels of geometry structures can be captured adaptively. To further explore the geometry details, we introduce a frequency encoding strategy to hierarchically encode these coordinates. MuSic-UDF does not require any supervisions like ground truth distance values or point normals. We conduct comprehensive experiments under widely-used benchmarks, where the results demonstrate the superior performance of our proposed method compared to the state-of-the-art methods.

点云表面重建是三维建模的一项核心任务。最近，一些有吸引力的方法通过从点云中学习神经隐式表示（如无符号距离函数（UDF））来解决这一问题，并取得了良好的效果。然而，现有的基于 UDF 的方法仍难以恢复局部几何细节。其中一个困难来自于所使用的表征方式不够灵活，难以捕捉局部高保真几何细节。在本文中，我们提出了一种名为 MuSic-UDF 的新型神经隐式表示法，它利用多尺度动态网格从任意类型的原始点云中进行高保真、灵活的曲面重建。具体来说，我们初始化了一个分层体素网格，其中每个网格点都存储了一个可学习的三维坐标。然后，我们对这些网格进行优化，从而可以自适应地捕捉不同层次的几何结构。为了进一步探索几何细节，我们引入了频率编码策略，对这些坐标进行分层编码。MuSic-UDF 不需要任何监督，如地面真实距离值或点法线。我们在广泛使用的基准下进行了全面的实验，实验结果表明，与最先进的方法相比，我们提出的方法性能更优越。

{"title":"MuSic-UDF: Learning Multi-Scale dynamic grid representation for high-fidelity surface reconstruction from point clouds","authors":"Chuan Jin , Tieru Wu , Yu-Shen Liu , Junsheng Zhou","doi":"10.1016/j.cag.2024.104081","DOIUrl":"10.1016/j.cag.2024.104081","url":null,"abstract":"<div><p>Surface reconstruction for point clouds is a central task in 3D modeling. Recently, the attractive approaches solve this problem by learning neural implicit representations, e.g., unsigned distance functions (UDFs), from point clouds, which have achieved good performance. However, the existing UDF-based methods still struggle to recover the local geometrical details. One of the difficulties arises from the used inflexible representations, which is hard to capture the local high-fidelity geometry details. In this paper, we propose a novel neural implicit representation, named MuSic-UDF, which leverages <strong>Mu</strong>lti-<strong>S</strong>cale dynam<strong>ic</strong> grids for high-fidelity and flexible surface reconstruction from raw point clouds with arbitrary typologies. Specifically, we initialize a hierarchical voxel grid where each grid point stores a learnable 3D coordinate. Then, we optimize these grids such that different levels of geometry structures can be captured adaptively. To further explore the geometry details, we introduce a frequency encoding strategy to hierarchically encode these coordinates. MuSic-UDF does not require any supervisions like ground truth distance values or point normals. We conduct comprehensive experiments under widely-used benchmarks, where the results demonstrate the superior performance of our proposed method compared to the state-of-the-art methods.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104081"},"PeriodicalIF":2.5,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142241025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Voice user interfaces for effortless navigation in medical virtual reality environments 在医疗虚拟现实环境中轻松导航的语音用户界面

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-07 DOI: 10.1016/j.cag.2024.104069

Jan Hombeck, Henrik Voigt, Kai Lawonn

In various situations, such as clinical environments with sterile conditions or when hands are occupied with multiple devices, traditional methods of navigation and scene adjustment are impractical or even impossible. We explore a new solution by using voice control to facilitate interaction in virtual worlds to avoid the use of additional controllers. Therefore, we investigate three scenarios: Object Orientation, Visualization Customization, and Analytical Tasks and evaluate whether natural language interaction is possible and promising in each of these scenarios. In our quantitative user study participants were able to control virtual environments effortlessly using verbal instructions. This resulted in rapid orientation adjustments, adaptive visual aids, and accurate data analysis. In addition, user satisfaction and usability surveys showed consistently high levels of acceptance and ease of use. In conclusion, our study shows that the use of natural language can be a promising alternative for the improvement of user interaction in virtual environments. It enables intuitive interactions in virtual spaces, especially in situations where traditional controls have limitations.

在各种情况下，如无菌条件下的临床环境或双手被多个设备占用时，传统的导航和场景调整方法是不切实际的，甚至是不可能的。我们探索了一种新的解决方案，即使用语音控制来促进虚拟世界中的交互，从而避免使用额外的控制器。因此，我们研究了三种情况：对象定向、可视化定制和分析任务，并评估自然语言交互在这些场景中是否可行和有前景。在我们的定量用户研究中，参与者能够使用语言指令毫不费力地控制虚拟环境。这导致了快速的方向调整、自适应视觉辅助和准确的数据分析。此外，用户满意度和可用性调查显示，用户的接受度和易用性始终保持在较高水平。总之，我们的研究表明，使用自然语言是改进虚拟环境中用户交互的一种有前途的替代方法。它能在虚拟空间中实现直观的交互，尤其是在传统控制有局限性的情况下。

{"title":"Voice user interfaces for effortless navigation in medical virtual reality environments","authors":"Jan Hombeck, Henrik Voigt, Kai Lawonn","doi":"10.1016/j.cag.2024.104069","DOIUrl":"10.1016/j.cag.2024.104069","url":null,"abstract":"<div><p>In various situations, such as clinical environments with sterile conditions or when hands are occupied with multiple devices, traditional methods of navigation and scene adjustment are impractical or even impossible. We explore a new solution by using voice control to facilitate interaction in virtual worlds to avoid the use of additional controllers. Therefore, we investigate three scenarios: Object Orientation, Visualization Customization, and Analytical Tasks and evaluate whether natural language interaction is possible and promising in each of these scenarios. In our quantitative user study participants were able to control virtual environments effortlessly using verbal instructions. This resulted in rapid orientation adjustments, adaptive visual aids, and accurate data analysis. In addition, user satisfaction and usability surveys showed consistently high levels of acceptance and ease of use. In conclusion, our study shows that the use of natural language can be a promising alternative for the improvement of user interaction in virtual environments. It enables intuitive interactions in virtual spaces, especially in situations where traditional controls have limitations.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104069"},"PeriodicalIF":2.5,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324002048/pdfft?md5=5dba80971d593332ff92694bfbd894e8&pid=1-s2.0-S0097849324002048-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142167137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Synthetic surface mesh generation of aortic dissections using statistical shape modeling 利用统计形状建模生成主动脉夹层的合成表面网格

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-06 DOI: 10.1016/j.cag.2024.104070

Kai Ostendorf , Kathrin Bäumler , Domenico Mastrodicasa , Dominik Fleischmann , Bernhard Preim , Gabriel Mistelbauer

Aortic dissection is a rare disease affecting the aortic wall layers splitting the aortic lumen into two flow channels: the true and false lumen. The rarity of the disease leads to a sparsity of available datasets resulting in a low amount of available training data for in-silico studies or the training of machine learning algorithms. To mitigate this issue, we use statistical shape modeling to create a database of Stanford type B dissection surface meshes. We account for the complex disease anatomy by modeling two separate flow channels in the aorta, the true and false lumen. Former approaches mainly modeled the aortic arch including its branches but not two separate flow channels inside the aorta. To our knowledge, our approach is the first to attempt generating synthetic aortic dissection surface meshes. For the statistical shape model, the aorta is parameterized using the centerlines of the respective lumen and the according ellipses describing the cross-section of the lumen while being aligned along the centerline employing rotation-minimizing frames. To evaluate our approach we introduce disease-specific quality criteria by investigating the torsion and twist of the true lumen.

主动脉夹层是一种罕见的疾病，会影响主动脉壁层，将主动脉管腔分成两个流道：真管腔和假管腔。这种疾病的罕见性导致可用数据集的稀缺性，导致用于室内研究或机器学习算法训练的可用训练数据量较少。为了缓解这一问题，我们使用统计形状建模创建了斯坦福 B 型剖腹产表面网格数据库。我们通过对主动脉的两个独立流道--真腔和假腔--进行建模，来解释复杂的疾病解剖结构。以前的方法主要对主动脉弓（包括其分支）进行建模，但没有对主动脉内的两个独立流道进行建模。据我们所知，我们的方法是首次尝试生成合成主动脉夹层表面网格。在统计形状模型中，主动脉的参数使用了各自管腔的中心线和描述管腔横截面的相应椭圆，同时使用旋转最小化框架沿中心线对齐。为了评估我们的方法，我们通过研究真实管腔的扭转和扭曲情况，引入了针对特定疾病的质量标准。

{"title":"Synthetic surface mesh generation of aortic dissections using statistical shape modeling","authors":"Kai Ostendorf , Kathrin Bäumler , Domenico Mastrodicasa , Dominik Fleischmann , Bernhard Preim , Gabriel Mistelbauer","doi":"10.1016/j.cag.2024.104070","DOIUrl":"10.1016/j.cag.2024.104070","url":null,"abstract":"<div><p>Aortic dissection is a rare disease affecting the aortic wall layers splitting the aortic lumen into two flow channels: the true and false lumen. The rarity of the disease leads to a sparsity of available datasets resulting in a low amount of available training data for in-silico studies or the training of machine learning algorithms. To mitigate this issue, we use statistical shape modeling to create a database of Stanford type B dissection surface meshes. We account for the complex disease anatomy by modeling two separate flow channels in the aorta, the true and false lumen. Former approaches mainly modeled the aortic arch including its branches but not two separate flow channels inside the aorta. To our knowledge, our approach is the first to attempt generating synthetic aortic dissection surface meshes. For the statistical shape model, the aorta is parameterized using the centerlines of the respective lumen and the according ellipses describing the cross-section of the lumen while being aligned along the centerline employing rotation-minimizing frames. To evaluate our approach we introduce disease-specific quality criteria by investigating the torsion and twist of the true lumen.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104070"},"PeriodicalIF":2.5,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S009784932400205X/pdfft?md5=f0b8f98a6ffb57b157863af63c74d980&pid=1-s2.0-S009784932400205X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142166764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A semantic edge-aware parameter efficient image filtering technique 语义边缘感知参数高效图像过滤技术

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-06 DOI: 10.1016/j.cag.2024.104068

Kunal Pradhan , Swarnajyoti Patra

The success of a structure preserving filtering technique has relied on its capability to recognize structures and textures present in the input image. In this paper a novel structure preserving filtering technique is presented that first, generates an edge-map of the input image by exploiting semantic information. Then, an edge-aware adaptive recursive median filter is utilized to produce the filter image. The technique provides satisfactory results for a wide variety of images with minimal fine-tuning of its parameters. Moreover, along with the various computer graphics applications the proposed technique also shows its robustness to incorporate spatial information for spectral-spatial classification of hyperspectral images. A MATLAB implementation of the proposed technique is available at-https://www.github.com/K-Pradhan/A-semantic-edge-aware-parameter-efficient-image-filtering-technique

结构保留过滤技术的成功取决于其识别输入图像中结构和纹理的能力。本文提出了一种新颖的结构保护滤波技术，首先，利用语义信息生成输入图像的边缘图。然后，利用边缘感知自适应递归中值滤波器生成滤波图像。该技术只需对参数进行最小限度的微调，就能为各种图像提供令人满意的结果。此外，除了各种计算机图形应用之外，所提出的技术还显示了其在结合空间信息对高光谱图像进行光谱空间分类方面的鲁棒性。拟议技术的 MATLAB 实现可在以下网址获取：https://www.github.com/K-Pradhan/A-semantic-edge-aware-parameter-efficient-image-filtering-technique

引用次数: 0

TPVis: A visual analytics system for exploring test case prioritization methods TPVis：用于探索测试用例优先级排序方法的可视化分析系统

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-05 DOI: 10.1016/j.cag.2024.104064

José Arthur Silveira , Leandro Vieira , Nivan Ferreira

Software testing is a vital tool to ensure the quality and trustworthiness of the pieces of software produced. Test suites are often large, which makes the process of testing software a costly and time-consuming process. In this context, test case prioritization (TCP) methods play an important role by ranking test cases in order to enable early fault detection and, hence, enable quicker problem fixes. The evaluation of such methods is a difficult problem, due to the variety of the methods and objectives. To address this issue, we present TPVis, a visual analytics framework that enables the evaluation and comparison of TCP methods designed in collaboration with experts in software testing. Our solution is an open-source web application that provides a variety of analytical tools to assist in the exploration of test suites and prioritization algorithms. Furthermore, TPVis also provides dashboard presets, that were validated with our domain collaborators, that support common analysis goals. We illustrate the usefulness of TPVis through a series of use cases that illustrate our system’s flexibility in addressing different problems in analyzing TCP methods. Finally, we also report on feedback received from the domain experts that indicate the effectiveness of TPVis. TPVis is available at https://github.com/vixe-cin-ufpe/TPVis.

软件测试是确保软件质量和可信度的重要工具。测试套件通常很大，这使得软件测试过程既费钱又费时。在这种情况下，测试用例优先级排序（TCP）方法发挥了重要作用，通过对测试用例进行排序，可以及早发现故障，从而更快地解决问题。由于方法和目标的多样性，对此类方法进行评估是一个难题。为解决这一问题，我们提出了 TPVis，这是一个可视化分析框架，可对与软件测试专家合作设计的 TCP 方法进行评估和比较。我们的解决方案是一个开源网络应用程序，提供各种分析工具，协助探索测试套件和优先级算法。此外，TPVis 还提供了仪表板预设，这些预设已与我们的领域合作者进行了验证，支持共同的分析目标。我们通过一系列用例说明了 TPVis 的实用性，这些用例说明了我们的系统在解决 TCP 方法分析中的不同问题时的灵活性。最后，我们还报告了来自领域专家的反馈，这些反馈表明 TPVis 非常有效。TPVis 可在 https://github.com/vixe-cin-ufpe/TPVis 上查阅。

{"title":"TPVis: A visual analytics system for exploring test case prioritization methods","authors":"José Arthur Silveira , Leandro Vieira , Nivan Ferreira","doi":"10.1016/j.cag.2024.104064","DOIUrl":"10.1016/j.cag.2024.104064","url":null,"abstract":"<div><p>Software testing is a vital tool to ensure the quality and trustworthiness of the pieces of software produced. Test suites are often large, which makes the process of testing software a costly and time-consuming process. In this context, test case prioritization (TCP) methods play an important role by ranking test cases in order to enable early fault detection and, hence, enable quicker problem fixes. The evaluation of such methods is a difficult problem, due to the variety of the methods and objectives. To address this issue, we present TPVis, a visual analytics framework that enables the evaluation and comparison of TCP methods designed in collaboration with experts in software testing. Our solution is an open-source web application that provides a variety of analytical tools to assist in the exploration of test suites and prioritization algorithms. Furthermore, TPVis also provides dashboard presets, that were validated with our domain collaborators, that support common analysis goals. We illustrate the usefulness of TPVis through a series of use cases that illustrate our system’s flexibility in addressing different problems in analyzing TCP methods. Finally, we also report on feedback received from the domain experts that indicate the effectiveness of TPVis. TPVis is available at <span><span>https://github.com/vixe-cin-ufpe/TPVis</span><svg><path></path></svg></span>.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104064"},"PeriodicalIF":2.5,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142166763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Transferring transfer functions (TTF): A guided approach to transfer function optimization in volume visualization 传递函数（TTF）：体积可视化中转移函数优化的指导方法

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-04 DOI: 10.1016/j.cag.2024.104067

Amin Nasim Saravi, Joshua Horacsek, Usman Alim, Julio Daniel Silva

In volume visualization, a transfer function tailored for one volume usually does not work for other similar volumes without careful tuning. This process can be tedious and time-consuming for a large set of volumes. In this work, we present a novel approach to transfer function optimization based on the differentiable volume rendering of a reference volume and its corresponding transfer function. Using two fully connected neural networks, our approach learns a continuous 2D separable transfer function that visualizes the features of interest with consistent visual properties between the volumes. Because many volume visualization software packages support separable transfer functions, users can export the optimized transfer function into a domain-specific application for further interactions. In tandem with domain experts’ input and assessments, we present two use cases to demonstrate the effectiveness of our approach. The first use case tracks the effect of an asteroid blast near the ocean surface. In this application, a volume and its corresponding transfer function seed our method, cascading transfer function optimization for the proceeding time steps. The second use case focuses on the visualization of white matter, gray matter, and cerebrospinal fluid in magnetic resonance imaging (MRI) volumes. We optimize an intensity-gradient transfer function for one volume from its segmentation. Then we use these results to visualize other brain volumes with different intensity ranges acquired on different MRI machines.

在体积可视化中，如果不仔细调整，为一个体积量身定制的传递函数通常不适用于其他类似的体积。对于大量的体积集来说，这一过程可能既繁琐又耗时。在这项工作中，我们提出了一种基于参考体的可变体渲染及其相应传递函数的传递函数优化新方法。通过使用两个全连接的神经网络，我们的方法可以学习到连续的二维可分离传递函数，该传递函数可以将感兴趣的特征可视化，并且各体之间具有一致的视觉特性。由于许多体积可视化软件包都支持可分离传递函数，因此用户可以将优化后的传递函数导出到特定领域的应用程序中，以便进一步交互。结合领域专家的意见和评估，我们介绍了两个使用案例，以展示我们方法的有效性。第一个用例是追踪海洋表面附近小行星爆炸的影响。在该应用中，一个体积及其相应的传递函数为我们的方法播下了种子，并在接下来的时间步骤中对传递函数进行了级联优化。第二个用例侧重于磁共振成像（MRI）体积中白质、灰质和脑脊液的可视化。我们通过对一个容积的分割来优化其强度梯度传递函数。然后，我们利用这些结果来可视化在不同核磁共振成像仪上获取的具有不同强度范围的其他脑体积。

{"title":"Transferring transfer functions (TTF): A guided approach to transfer function optimization in volume visualization","authors":"Amin Nasim Saravi, Joshua Horacsek, Usman Alim, Julio Daniel Silva","doi":"10.1016/j.cag.2024.104067","DOIUrl":"10.1016/j.cag.2024.104067","url":null,"abstract":"<div><p>In volume visualization, a transfer function tailored for one volume usually does not work for other similar volumes without careful tuning. This process can be tedious and time-consuming for a large set of volumes. In this work, we present a novel approach to transfer function optimization based on the differentiable volume rendering of a reference volume and its corresponding transfer function. Using two fully connected neural networks, our approach learns a continuous 2D separable transfer function that visualizes the features of interest with consistent visual properties between the volumes. Because many volume visualization software packages support separable transfer functions, users can export the optimized transfer function into a domain-specific application for further interactions. In tandem with domain experts’ input and assessments, we present two use cases to demonstrate the effectiveness of our approach. The first use case tracks the effect of an asteroid blast near the ocean surface. In this application, a volume and its corresponding transfer function seed our method, cascading transfer function optimization for the proceeding time steps. The second use case focuses on the visualization of white matter, gray matter, and cerebrospinal fluid in magnetic resonance imaging (MRI) volumes. We optimize an intensity-gradient transfer function for one volume from its segmentation. Then we use these results to visualize other brain volumes with different intensity ranges acquired on different MRI machines.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104067"},"PeriodicalIF":2.5,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324002024/pdfft?md5=a9712309d212a1aa1bbb843ba60a6d2c&pid=1-s2.0-S0097849324002024-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142229684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Empowering sign language communication: Integrating sentiment and semantics for facial expression synthesis 增强手语交流能力：整合情感和语义进行面部表情合成

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-09-03 DOI: 10.1016/j.cag.2024.104065

Rafael V. Azevedo , Thiago M. Coutinho , João P. Ferreira , Thiago L. Gomes , Erickson R. Nascimento

Translating written sentences from oral languages to a sequence of manual and non-manual gestures plays a crucial role in building a more inclusive society for deaf and hard-of-hearing people. Facial expressions (non-manual), in particular, are responsible for encoding the grammar of the sentence to be spoken, applying punctuation, pronouns, or emphasizing signs. These non-manual gestures are closely related to the semantics of the sentence being spoken and also to the utterance of the speaker’s emotions. However, most Sign Language Production (SLP) approaches are centered on synthesizing manual gestures and do not focus on modeling the speaker’s expression. This paper introduces a new method focused in synthesizing facial expressions for sign language. Our goal is to improve sign language production by integrating sentiment information in facial expression generation. The approach leverages a sentence’s sentiment and semantic features to sample from a meaningful representation space, integrating the bias of the non-manual components into the sign language production process. To evaluate our method, we extend the Fréchet gesture distance (FGD) and propose a new metric called Fréchet Expression Distance (FED) and apply an extensive set of metrics to assess the quality of specific regions of the face. The experimental results showed that our method achieved state of the art, being superior to the competitors on How2Sign and PHOENIX14T datasets. Moreover, our architecture is based on a carefully designed graph pyramid that makes it simpler, easier to train, and capable of leveraging emotions to produce facial expressions. Our code and pretrained models will be available at: https://github.com/verlab/empowering-sign-language.

将书面句子从口语翻译成一系列手动和非手动手势，对于为聋人和重听者建立一个更具包容性的社会起着至关重要的作用。特别是面部表情（非手动），它负责对口语句子的语法进行编码，应用标点符号、代词或强调符号。这些非手动手势与所说句子的语义以及说话者的情感表达密切相关。然而，大多数手语制作（SLP）方法都以合成手动手势为中心，并不关注说话者的表情建模。本文介绍了一种专注于手语面部表情合成的新方法。我们的目标是通过在面部表情生成中整合情感信息来改进手语的制作。该方法利用句子的情感和语义特征从有意义的表示空间中采样，将非人工成分的偏差整合到手语制作过程中。为了评估我们的方法，我们扩展了弗雷谢特手势距离（FGD），提出了一种名为弗雷谢特表情距离（FED）的新度量，并应用一系列广泛的度量来评估面部特定区域的质量。实验结果表明，我们的方法达到了先进水平，在 How2Sign 和 PHOENIX14T 数据集上优于竞争对手。此外，我们的架构基于精心设计的图金字塔，使其更简单、更易于训练，并能利用情绪生成面部表情。我们的代码和预训练模型可在以下网址获取：https://github.com/verlab/empowering-sign-language。

{"title":"Empowering sign language communication: Integrating sentiment and semantics for facial expression synthesis","authors":"Rafael V. Azevedo , Thiago M. Coutinho , João P. Ferreira , Thiago L. Gomes , Erickson R. Nascimento","doi":"10.1016/j.cag.2024.104065","DOIUrl":"10.1016/j.cag.2024.104065","url":null,"abstract":"<div><p>Translating written sentences from oral languages to a sequence of manual and non-manual gestures plays a crucial role in building a more inclusive society for deaf and hard-of-hearing people. Facial expressions (non-manual), in particular, are responsible for encoding the grammar of the sentence to be spoken, applying punctuation, pronouns, or emphasizing signs. These non-manual gestures are closely related to the semantics of the sentence being spoken and also to the utterance of the speaker’s emotions. However, most Sign Language Production (SLP) approaches are centered on synthesizing manual gestures and do not focus on modeling the speaker’s expression. This paper introduces a new method focused in synthesizing facial expressions for sign language. Our goal is to improve sign language production by integrating sentiment information in facial expression generation. The approach leverages a sentence’s sentiment and semantic features to sample from a meaningful representation space, integrating the bias of the non-manual components into the sign language production process. To evaluate our method, we extend the Fréchet gesture distance (FGD) and propose a new metric called Fréchet Expression Distance (FED) and apply an extensive set of metrics to assess the quality of specific regions of the face. The experimental results showed that our method achieved state of the art, being superior to the competitors on How2Sign and PHOENIX14T datasets. Moreover, our architecture is based on a carefully designed graph pyramid that makes it simpler, easier to train, and capable of leveraging emotions to produce facial expressions. Our code and pretrained models will be available at: <span><span>https://github.com/verlab/empowering-sign-language</span><svg><path></path></svg></span>.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104065"},"PeriodicalIF":2.5,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142173877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Human-in-the-loop: Using classifier decision boundary maps to improve pseudo labels 人在回路中：使用分类决策边界图改进伪标签

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-08-30 DOI: 10.1016/j.cag.2024.104062

Bárbara C. Benato , Cristian Grosu , Alexandre X. Falcão , Alexandru C. Telea

For classification tasks, several strategies aim to tackle the problem of not having sufficient labeled data, usually by automatic labeling or by fully passing this task to a user. Automatic labeling is simple to apply but can fail handling complex situations where human insights may be required to decide the correct labels. Conversely, manual labeling leverages the expertise of specialists but may waste precious effort which could be handled by automatic methods. More specifically, automatic solutions could be improved by combining an active learning loop with manual labeling assisted by visual depictions of a classifier’s behavior. We propose to include the human in the labeling loop by using manual labeling in feature spaces produced by a deep feature annotation (DeepFA) technique. To assist manual labeling, we provide users with visual insights on the classifier’s decision boundaries. Finally, we use the manual and automatically computed labels jointly to retrain the classifier in an active learning (AL) loop scheme. Experiments using a toy and a real-world application dataset show that our proposed combination of manual labeling supported by visualization of decision boundaries and automatic labeling can yield a significant increase in classifier performance with a quite limited user effort.

对于分类任务，有几种策略旨在解决标注数据不足的问题，通常是通过自动标注或将此任务完全交给用户来完成。自动标注简单易用，但在处理复杂情况时可能会失败，因为在这种情况下可能需要人的洞察力来决定正确的标注。相反，人工标注利用了专家的专业知识，但可能会浪费宝贵的精力，而这些精力本可以通过自动方法来处理。更具体地说，可以通过将主动学习环路与人工标注相结合，并辅以分类器行为的可视化描述，来改进自动解决方案。我们建议在深度特征标注（DeepFA）技术生成的特征空间中使用人工标注，从而将人类纳入标注循环。为了帮助人工标注，我们为用户提供了分类器决策边界的可视化见解。最后，我们利用手动和自动计算的标签，在主动学习（AL）循环方案中重新训练分类器。使用玩具数据集和实际应用数据集进行的实验表明，我们提出的将手动标注与决策边界可视化和自动标注相结合的方法可以显著提高分类器的性能，而用户只需付出相当有限的努力。

{"title":"Human-in-the-loop: Using classifier decision boundary maps to improve pseudo labels","authors":"Bárbara C. Benato , Cristian Grosu , Alexandre X. Falcão , Alexandru C. Telea","doi":"10.1016/j.cag.2024.104062","DOIUrl":"10.1016/j.cag.2024.104062","url":null,"abstract":"<div><p>For classification tasks, several strategies aim to tackle the problem of not having sufficient labeled data, usually by automatic labeling or by fully passing this task to a user. Automatic labeling is simple to apply but can fail handling complex situations where human insights may be required to decide the correct labels. Conversely, manual labeling leverages the expertise of specialists but may waste precious effort which could be handled by automatic methods. More specifically, automatic solutions could be improved by combining an active learning loop with manual labeling assisted by visual depictions of a classifier’s behavior. We propose to include the human in the labeling loop by using manual labeling in feature spaces produced by a deep feature annotation (DeepFA) technique. To assist manual labeling, we provide users with visual insights on the classifier’s decision boundaries. Finally, we use the manual and automatically computed labels jointly to retrain the classifier in an active learning (AL) loop scheme. Experiments using a toy and a real-world application dataset show that our proposed combination of manual labeling supported by visualization of decision boundaries and automatic labeling can yield a significant increase in classifier performance with a quite limited user effort.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104062"},"PeriodicalIF":2.5,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142150523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

SingVisio: Visual analytics of diffusion model for singing voice conversion SingVisio：歌声转换扩散模型的可视化分析

IF 2.5 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk

Pub Date : 2024-08-30 DOI: 10.1016/j.cag.2024.104058

Liumeng Xue , Chaoren Wang , Mingxuan Wang , Xueyao Zhang , Jun Han , Zhizheng Wu

In this study, we present SingVisio, an interactive visual analysis system that aims to explain the diffusion model used in singing voice conversion. SingVisio provides a visual display of the generation process in diffusion models, showcasing the step-by-step denoising of the noisy spectrum and its transformation into a clean spectrum that captures the desired singer’s timbre. The system also facilitates side-by-side comparisons of different conditions, such as source content, melody, and target timbre, highlighting the impact of these conditions on the diffusion generation process and resulting conversions. Through comparative and comprehensive evaluations, SingVisio demonstrates its effectiveness in terms of system design, functionality, explainability, and user-friendliness. It offers users of various backgrounds valuable learning experiences and insights into the diffusion model for singing voice conversion.

在本研究中，我们介绍了一个交互式可视分析系统 SingVisio，旨在解释歌声转换中使用的扩散模型。SingVisio 提供了扩散模型生成过程的可视化显示，展示了噪声频谱的逐步去噪和转化为干净频谱的过程，从而捕捉到所需的歌手音色。该系统还便于对不同条件（如源内容、旋律和目标音色）进行并排比较，突出显示这些条件对扩散生成过程和转换结果的影响。通过比较和综合评估，SingVisio 展示了其在系统设计、功能、可解释性和用户友好性方面的有效性。它为不同背景的用户提供了宝贵的学习经验和对歌唱语音转换扩散模型的深入了解。

引用次数: 0