Computers & Graphics-Uk: Latest Publications

MuSic-UDF: Learning Multi-Scale dynamic grid representation for high-fidelity surface reconstruction from point clouds
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-10 · DOI: 10.1016/j.cag.2024.104081

Surface reconstruction from point clouds is a central task in 3D modeling. Recently, attractive approaches have tackled this problem by learning neural implicit representations, e.g., unsigned distance functions (UDFs), from point clouds, and have achieved good performance. However, existing UDF-based methods still struggle to recover local geometric details. One difficulty arises from the inflexible representations used, which make it hard to capture local high-fidelity geometric details. In this paper, we propose a novel neural implicit representation, named MuSic-UDF, which leverages Multi-Scale dynamic grids for high-fidelity and flexible surface reconstruction from raw point clouds with arbitrary topologies. Specifically, we initialize a hierarchical voxel grid where each grid point stores a learnable 3D coordinate. Then, we optimize these grids such that different levels of geometric structure can be captured adaptively. To further explore geometric details, we introduce a frequency encoding strategy to hierarchically encode these coordinates. MuSic-UDF does not require any supervision such as ground-truth distance values or point normals. We conduct comprehensive experiments on widely used benchmarks, where the results demonstrate the superior performance of our method compared to state-of-the-art methods.
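The abstract includes no code; as a rough illustration of the frequency encoding idea it describes, here is a minimal Python/PyTorch sketch (all sizes, names, and the multi-scale setup are hypothetical) of NeRF-style sin/cos encoding applied to learnable per-level grid coordinates:

```python
import math
import torch

def frequency_encode(x, num_bands=6):
    """Encode 3D coordinates with sin/cos at increasing frequencies,
    in the style of NeRF positional encodings. x: (N, 3) in [-1, 1]."""
    feats = [x]
    for k in range(num_bands):
        freq = (2.0 ** k) * math.pi
        feats.append(torch.sin(freq * x))
        feats.append(torch.cos(freq * x))
    return torch.cat(feats, dim=-1)  # (N, 3 + 3 * 2 * num_bands)

# Hypothetical multi-scale setup: each level stores learnable 3D
# coordinates at its grid points, optimized jointly with the UDF network.
levels = [torch.nn.Parameter(torch.rand(res ** 3, 3) * 2 - 1)
          for res in (8, 16, 32)]
encoded = [frequency_encode(grid) for grid in levels]
print([e.shape for e in encoded])
```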

Citations: 0
Voice user interfaces for effortless navigation in medical virtual reality environments
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-07 · DOI: 10.1016/j.cag.2024.104069

In various situations, such as clinical environments with sterile conditions or when the hands are occupied with multiple devices, traditional methods of navigation and scene adjustment are impractical or even impossible. We explore a new solution that uses voice control to facilitate interaction in virtual worlds and avoid additional controllers. We investigate three scenarios (Object Orientation, Visualization Customization, and Analytical Tasks) and evaluate whether natural language interaction is possible and promising in each of them. In our quantitative user study, participants were able to control virtual environments effortlessly using verbal instructions, resulting in rapid orientation adjustments, adaptive visual aids, and accurate data analysis. In addition, user satisfaction and usability surveys showed consistently high levels of acceptance and ease of use. In conclusion, our study shows that natural language can be a promising alternative for improving user interaction in virtual environments. It enables intuitive interactions in virtual spaces, especially where traditional controls have limitations.
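As a toy illustration of the kind of voice-to-action mapping such a system needs (the command grammar and handlers below are invented, not the paper's implementation):

```python
import re

def rotate(direction, degrees):
    print(f"Rotating {direction} by {degrees} degrees")

def zoom(direction):
    print(f"Zooming {direction}")

# Hypothetical grammar: each pattern maps a recognized utterance to a
# scene operation. A real system would sit behind a speech recognizer
# and a richer dialogue model.
COMMANDS = [
    (re.compile(r"rotate (left|right) (\d+) degrees?"),
     lambda m: rotate(m.group(1), int(m.group(2)))),
    (re.compile(r"zoom (in|out)"),
     lambda m: zoom(m.group(1))),
]

def handle_utterance(text):
    for pattern, action in COMMANDS:
        m = pattern.fullmatch(text.strip().lower())
        if m:
            action(m)
            return True
    return False  # unrecognized: ask the user to rephrase

handle_utterance("Rotate left 45 degrees")
handle_utterance("zoom in")
```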

Citations: 0
Synthetic surface mesh generation of aortic dissections using statistical shape modeling
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-06 · DOI: 10.1016/j.cag.2024.104070

Aortic dissection is a rare disease in which the layers of the aortic wall separate, splitting the aortic lumen into two flow channels: the true and the false lumen. The rarity of the disease leads to a scarcity of available datasets and thus little training data for in-silico studies or for training machine learning algorithms. To mitigate this issue, we use statistical shape modeling to create a database of Stanford type B dissection surface meshes. We account for the complex disease anatomy by modeling two separate flow channels in the aorta, the true and the false lumen. Former approaches mainly modeled the aortic arch including its branches, but not two separate flow channels inside the aorta. To our knowledge, our approach is the first to attempt generating synthetic aortic dissection surface meshes. For the statistical shape model, the aorta is parameterized using the centerlines of the respective lumina and corresponding ellipses describing the lumen cross-sections, aligned along the centerline using rotation-minimizing frames. To evaluate our approach, we introduce disease-specific quality criteria by investigating the torsion and twist of the true lumen.
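Rotation-minimizing frames are commonly computed with the double-reflection method of Wang et al.; the sketch below shows that standard algorithm on a toy centerline, and is not the paper's actual code:

```python
import numpy as np

def rotation_minimizing_frames(points, tangents):
    """Propagate a reference normal along a centerline with the
    double-reflection method, minimizing twist between frames.
    points, tangents: (N, 3) arrays; tangents assumed unit length
    and consecutive points assumed distinct."""
    # Pick any initial normal perpendicular to the first tangent.
    t0 = tangents[0]
    arbitrary = np.array([1.0, 0.0, 0.0])
    if abs(np.dot(arbitrary, t0)) > 0.9:
        arbitrary = np.array([0.0, 1.0, 0.0])
    r = np.cross(t0, arbitrary)
    r /= np.linalg.norm(r)

    frames = [r]
    for i in range(len(points) - 1):
        v1 = points[i + 1] - points[i]          # first reflection plane
        c1 = np.dot(v1, v1)
        r_l = frames[-1] - (2.0 / c1) * np.dot(v1, frames[-1]) * v1
        t_l = tangents[i] - (2.0 / c1) * np.dot(v1, tangents[i]) * v1
        v2 = tangents[i + 1] - t_l              # second reflection plane
        c2 = np.dot(v2, v2)
        frames.append(r_l - (2.0 / c2) * np.dot(v2, r_l) * v2)
    return np.array(frames)  # one normal per centerline point

# Toy centerline: a quarter circle in the xy-plane.
theta = np.linspace(0.0, np.pi / 2.0, 20)
pts = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
tans = np.stack([-np.sin(theta), np.cos(theta), np.zeros_like(theta)], axis=1)
print(rotation_minimizing_frames(pts, tans)[:3])
```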

Citations: 0
A semantic edge-aware parameter efficient image filtering technique
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-06 · DOI: 10.1016/j.cag.2024.104068

The success of a structure-preserving filtering technique relies on its capability to recognize the structures and textures present in the input image. In this paper, a novel structure-preserving filtering technique is presented that first generates an edge-map of the input image by exploiting semantic information. Then, an edge-aware adaptive recursive median filter is utilized to produce the filtered image. The technique provides satisfactory results for a wide variety of images with minimal fine-tuning of its parameters. Moreover, along with various computer graphics applications, the proposed technique also proves robust at incorporating spatial information for spectral-spatial classification of hyperspectral images. A MATLAB implementation of the proposed technique is available at https://www.github.com/K-Pradhan/A-semantic-edge-aware-parameter-efficient-image-filtering-technique
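The abstract does not specify the filter's internals; one plausible reading, sketched below with illustrative parameters, is a median filter that excludes neighbors flagged in the edge-map and runs recursively, each pass filtering the previous output:

```python
import numpy as np

def edge_aware_median(image, edge_map, radius=1, passes=3):
    """Recursive median filter that ignores neighbors marked as edges,
    so smoothing does not mix values across structure boundaries.
    image: 2D float array; edge_map: 2D bool array (True = edge pixel)."""
    out = image.astype(float).copy()
    h, w = image.shape
    for _ in range(passes):  # recursive: each pass filters the last output
        prev = out.copy()
        for y in range(h):
            for x in range(w):
                y0, y1 = max(0, y - radius), min(h, y + radius + 1)
                x0, x1 = max(0, x - radius), min(w, x + radius + 1)
                window = prev[y0:y1, x0:x1]
                mask = ~edge_map[y0:y1, x0:x1]  # keep non-edge neighbors
                vals = window[mask]
                if vals.size:
                    out[y, x] = np.median(vals)
    return out

rng = np.random.default_rng(0)
img = rng.random((16, 16))
edges = np.zeros((16, 16), dtype=bool)
edges[8, :] = True  # a horizontal edge the filter will not smooth across
print(edge_aware_median(img, edges).shape)
```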

Citations: 0
TPVis: A visual analytics system for exploring test case prioritization methods
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-05 · DOI: 10.1016/j.cag.2024.104064

Software testing is a vital tool to ensure the quality and trustworthiness of the software produced. Test suites are often large, which makes testing software costly and time-consuming. In this context, test case prioritization (TCP) methods play an important role by ranking test cases to enable early fault detection and, hence, quicker problem fixes. Evaluating such methods is difficult due to the variety of methods and objectives. To address this issue, we present TPVis, a visual analytics framework, designed in collaboration with experts in software testing, that enables the evaluation and comparison of TCP methods. Our solution is an open-source web application that provides a variety of analytical tools to assist in the exploration of test suites and prioritization algorithms. Furthermore, TPVis provides dashboard presets, validated with our domain collaborators, that support common analysis goals. We illustrate the usefulness of TPVis through a series of use cases that demonstrate the system's flexibility in addressing different problems in the analysis of TCP methods. Finally, we report feedback from the domain experts indicating the effectiveness of TPVis. TPVis is available at https://github.com/vixe-cin-ufpe/TPVis.
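For context on what TPVis visualizes, here is a minimal sketch of one classic TCP baseline, the greedy "additional" coverage strategy; the coverage data is illustrative and this is not part of the paper itself:

```python
def additional_greedy(coverage):
    """Order test cases so each pick covers the most still-uncovered
    statements. coverage: dict mapping test id -> set of covered stmts."""
    remaining = dict(coverage)
    covered = set()
    order = []
    while remaining:
        # Pick the test adding the most new coverage (ties: smallest id).
        best = max(sorted(remaining),
                   key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
        if all(not (s - covered) for s in remaining.values()):
            # No remaining test adds coverage; append the rest stably.
            order.extend(sorted(remaining))
            break
    return order

suite = {
    "t1": {1, 2, 3},
    "t2": {3, 4},
    "t3": {5},
    "t4": {1, 2},
}
print(additional_greedy(suite))  # ['t1', 't2', 't3', 't4']
```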

Citations: 0
Transferring transfer functions (TTF): A guided approach to transfer function optimization in volume visualization
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-04 · DOI: 10.1016/j.cag.2024.104067

In volume visualization, a transfer function tailored for one volume usually does not work for other, similar volumes without careful tuning. This process can be tedious and time-consuming for a large set of volumes. In this work, we present a novel approach to transfer function optimization based on the differentiable volume rendering of a reference volume and its corresponding transfer function. Using two fully connected neural networks, our approach learns a continuous 2D separable transfer function that visualizes the features of interest with consistent visual properties across volumes. Because many volume visualization software packages support separable transfer functions, users can export the optimized transfer function into a domain-specific application for further interaction. Together with domain experts' input and assessments, we present two use cases to demonstrate the effectiveness of our approach. The first use case tracks the effect of an asteroid blast near the ocean surface. In this application, a volume and its corresponding transfer function seed our method, cascading transfer function optimization through the subsequent time steps. The second use case focuses on the visualization of white matter, gray matter, and cerebrospinal fluid in magnetic resonance imaging (MRI) volumes. We optimize an intensity-gradient transfer function for one volume from its segmentation and then use the results to visualize other brain volumes with different intensity ranges acquired on different MRI machines.
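A heavily simplified sketch of the underlying idea (not the paper's two-network architecture): a small learnable transfer function optimized by gradient descent through a differentiable front-to-back compositing renderer, with toy data standing in for the reference volume and rendering:

```python
import torch

def render(volume, tf):
    """Front-to-back emission-absorption compositing along axis 0.
    volume: (D, H, W) intensities in [0, 1]; tf maps intensity -> RGBA."""
    rgba = tf(volume.reshape(-1, 1)).reshape(*volume.shape, 4)
    color, alpha = rgba[..., :3], rgba[..., 3:]
    out = torch.zeros(volume.shape[1], volume.shape[2], 3)
    trans = torch.ones(volume.shape[1], volume.shape[2], 1)
    for d in range(volume.shape[0]):  # march front to back
        out = out + trans * alpha[d] * color[d]
        trans = trans * (1.0 - alpha[d])
    return out

tf = torch.nn.Sequential(  # tiny learnable 1D transfer function
    torch.nn.Linear(1, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 4), torch.nn.Sigmoid())

volume = torch.rand(16, 32, 32)
target = torch.rand(32, 32, 3)  # stand-in for a reference rendering
opt = torch.optim.Adam(tf.parameters(), lr=1e-2)
for step in range(50):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(render(volume, tf), target)
    loss.backward()
    opt.step()
print(float(loss))
```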

Citations: 0
Empowering sign language communication: Integrating sentiment and semantics for facial expression synthesis
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-09-03 · DOI: 10.1016/j.cag.2024.104065

Translating written sentences from oral languages into a sequence of manual and non-manual gestures plays a crucial role in building a more inclusive society for deaf and hard-of-hearing people. Facial expressions (non-manual gestures), in particular, are responsible for encoding the grammar of the sentence to be spoken, applying punctuation and pronouns, or emphasizing signs. These non-manual gestures are closely related to the semantics of the sentence being spoken and to the expression of the speaker's emotions. However, most Sign Language Production (SLP) approaches center on synthesizing manual gestures and do not model the speaker's expression. This paper introduces a new method focused on synthesizing facial expressions for sign language. Our goal is to improve sign language production by integrating sentiment information into facial expression generation. The approach leverages a sentence's sentiment and semantic features to sample from a meaningful representation space, integrating the bias of the non-manual components into the sign language production process. To evaluate our method, we extend the Fréchet Gesture Distance (FGD), propose a new metric called the Fréchet Expression Distance (FED), and apply an extensive set of metrics to assess the quality of specific regions of the face. The experimental results show that our method achieves state-of-the-art performance, outperforming competitors on the How2Sign and PHOENIX14T datasets. Moreover, our architecture is based on a carefully designed graph pyramid that makes it simpler, easier to train, and capable of leveraging emotions to produce facial expressions. Our code and pretrained models will be available at: https://github.com/verlab/empowering-sign-language.
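Fréchet metrics of this family follow the same recipe as the well-known FID: fit Gaussians to feature embeddings of real and generated samples and compute the Fréchet distance between them. A sketch of that computation, with random features standing in for a real expression-feature extractor:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fit to two feature sets:
    ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2))."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 16))  # stand-in features
fake = rng.normal(0.3, 1.1, size=(500, 16))
print(frechet_distance(real, fake))
```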

Citations: 0
Human-in-the-loop: Using classifier decision boundary maps to improve pseudo labels
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-08-30 · DOI: 10.1016/j.cag.2024.104062

For classification tasks, several strategies aim to tackle the problem of insufficient labeled data, usually by automatic labeling or by fully passing this task to a user. Automatic labeling is simple to apply but can fail in complex situations where human insight may be required to decide the correct labels. Conversely, manual labeling leverages the expertise of specialists but may waste precious effort on cases that automatic methods could handle. More specifically, automatic solutions can be improved by combining an active learning loop with manual labeling assisted by visual depictions of a classifier's behavior. We propose to include the human in the labeling loop by using manual labeling in feature spaces produced by a deep feature annotation (DeepFA) technique. To assist manual labeling, we provide users with visual insights into the classifier's decision boundaries. Finally, we use the manual and automatically computed labels jointly to retrain the classifier in an active learning (AL) loop. Experiments on a toy dataset and a real-world application dataset show that the proposed combination of visualization-supported manual labeling and automatic labeling can yield a significant increase in classifier performance with quite limited user effort.
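A minimal sketch of the general loop the paper builds on, using scikit-learn with an oracle standing in for the human annotator (the decision-boundary-map visualization that guides the real user is omitted):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression

X, y = make_moons(n_samples=600, noise=0.25, random_state=0)
labeled = np.zeros(len(y), dtype=bool)
labeled[:20] = True  # small initial labeled pool
pseudo = y.copy()    # working labels; unlabeled entries get overwritten

clf = LogisticRegression()
for round_ in range(5):
    clf.fit(X[labeled], pseudo[labeled])
    probs = clf.predict_proba(X[~labeled])
    conf = probs.max(axis=1)
    idx_unlabeled = np.flatnonzero(~labeled)

    # Auto-accept confident pseudo labels...
    sure = idx_unlabeled[conf > 0.95]
    pseudo[sure] = clf.predict(X[sure])
    labeled[sure] = True

    # ...and send the most uncertain points to the "human" (oracle here).
    ask = idx_unlabeled[np.argsort(conf)[:10]]
    pseudo[ask] = y[ask]  # ground truth stands in for manual labeling
    labeled[ask] = True

print("labeled:", labeled.sum(), "accuracy:", clf.score(X, y))
```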

Citations: 0
SingVisio: Visual analytics of diffusion model for singing voice conversion
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-08-30 · DOI: 10.1016/j.cag.2024.104058

In this study, we present SingVisio, an interactive visual analysis system that aims to explain the diffusion model used in singing voice conversion. SingVisio provides a visual display of the generation process in diffusion models, showcasing the step-by-step denoising of the noisy spectrum and its transformation into a clean spectrum that captures the desired singer’s timbre. The system also facilitates side-by-side comparisons of different conditions, such as source content, melody, and target timbre, highlighting the impact of these conditions on the diffusion generation process and resulting conversions. Through comparative and comprehensive evaluations, SingVisio demonstrates its effectiveness in terms of system design, functionality, explainability, and user-friendliness. It offers users of various backgrounds valuable learning experiences and insights into the diffusion model for singing voice conversion.
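The step-by-step denoising that SingVisio displays can be captured generically by storing each intermediate state of the reverse diffusion process; in the sketch below the noise schedule is arbitrary and a toy function stands in for the trained denoiser:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50
betas = np.linspace(1e-4, 0.05, T)       # toy variance schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predicted_noise(x, t):
    # Stand-in for the trained denoiser; a real system would call the
    # acoustic model conditioned on content, melody, and target timbre.
    return x * 0.1

x = rng.normal(size=(80, 100))  # noisy "mel-spectrogram"
trajectory = [x.copy()]         # intermediate states a tool could display
for t in reversed(range(T)):    # DDPM-style reverse steps
    eps = predicted_noise(x, t)
    mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    noise = rng.normal(size=x.shape) if t > 0 else 0.0
    x = mean + np.sqrt(betas[t]) * noise
    trajectory.append(x.copy())

print(len(trajectory), "states captured for step-by-step visualization")
```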

Citations: 0
Virtual reality inspection of chromatin 3D and 2D data
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-08-30 · DOI: 10.1016/j.cag.2024.104059

Understanding the packing of long DNA strands into chromatin is one of the ultimate challenges in genomic research. An intrinsic part of this complex problem is studying the chromatin's spatial structure. Biologists reconstruct 3D models of chromatin from experimental data, yet the exploration and analysis of such 3D structures are limited in existing genomic data visualization tools. To improve this situation, we investigated the current options for immersive methods and designed a prototypical VR visualization tool for 3D chromatin models that leverages virtual reality to deal with the spatial data. We showcase the tool in three primary use cases. First, we provide an overall 3D shape overview of the chromatin to facilitate the identification of regions of interest and their selection for further investigation. Second, we include the option to export the selected regions and elements in the BED format, which can be loaded into common analytical tools. Third, we integrate epigenetic modification data along the sequence that influence gene expression, either as in-world 2D charts or overlaid on the 3D structure itself. We developed our application in collaboration with two domain experts and gathered insights from two informal studies with five other experts.
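BED itself is a simple tab-separated format (chromosome, start, end; 0-based, half-open intervals). A sketch of how selected model bins might be mapped back to genomic intervals and written out, with the bin size and selection hypothetical:

```python
def export_bed(selected_bins, chrom, bin_size, path):
    """Write selected chromatin-model bins as BED intervals.
    BED uses 0-based, half-open [start, end) coordinates."""
    # Merge runs of consecutive bins into single intervals.
    intervals = []
    for b in sorted(selected_bins):
        if intervals and b == intervals[-1][1]:
            intervals[-1][1] = b + 1
        else:
            intervals.append([b, b + 1])
    with open(path, "w") as f:
        for start_bin, end_bin in intervals:
            f.write(f"{chrom}\t{start_bin * bin_size}\t{end_bin * bin_size}\n")

# Hypothetical selection of 10 kb bins picked in the VR view.
export_bed({3, 4, 5, 12}, chrom="chr1", bin_size=10_000, path="selection.bed")
```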

Citations: 0