
Computers & Graphics-UK: Latest Articles

SingVisio: Visual analytics of diffusion model for singing voice conversion
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-30 | DOI: 10.1016/j.cag.2024.104058
Liumeng Xue, Chaoren Wang, Mingxuan Wang, Xueyao Zhang, Jun Han, Zhizheng Wu

In this study, we present SingVisio, an interactive visual analysis system that aims to explain the diffusion model used in singing voice conversion. SingVisio provides a visual display of the generation process in diffusion models, showcasing the step-by-step denoising of the noisy spectrum and its transformation into a clean spectrum that captures the desired singer’s timbre. The system also facilitates side-by-side comparisons of different conditions, such as source content, melody, and target timbre, highlighting the impact of these conditions on the diffusion generation process and resulting conversions. Through comparative and comprehensive evaluations, SingVisio demonstrates its effectiveness in terms of system design, functionality, explainability, and user-friendliness. It offers users of various backgrounds valuable learning experiences and insights into the diffusion model for singing voice conversion.

Computers & Graphics-UK, vol. 124, Article 104058.
Citations: 0
Virtual reality inspection of chromatin 3D and 2D data
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-30 | DOI: 10.1016/j.cag.2024.104059
Elena Molina, David Kouřil, Tobias Isenberg, Barbora Kozlíková, Pere-Pau Vázquez

Understanding the packing of long DNA strands into chromatin is one of the ultimate challenges in genomic research. An intrinsic part of this complex problem is studying the chromatin’s spatial structure. Biologists reconstruct 3D models of chromatin from experimental data, yet the exploration and analysis of such 3D structures is limited in existing genomic data visualization tools. To improve this situation, we investigated the current options of immersive methods and designed a prototypical VR visualization tool for 3D chromatin models that leverages virtual reality to deal with the spatial data. We showcase the tool in three primary use cases. First, we provide an overall 3D shape overview of the chromatin to facilitate the identification of regions of interest and the selection for further investigation. Second, we include the option to export the selected regions and elements in the BED format, which can be loaded into common analytical tools. Third, we integrate epigenetic modification data along the sequence that influence gene expression, either as in-world 2D charts or overlaid on the 3D structure itself. We developed our application in collaboration with two domain experts and gathered insights from two informal studies with five other experts.
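
The BED export mentioned in the second use case can be illustrated with a minimal sketch. It assumes the common four-column BED layout (chromosome, 0-based start, exclusive end, name), separated by tabs. The `Region` class and the `to_bed` function are hypothetical names chosen for illustration and are not taken from the authors' tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A selected genomic interval (hypothetical structure for illustration)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str = "."


def to_bed(regions):
    """Serialize regions as tab-separated BED lines, sorted by position."""
    lines = []
    for r in sorted(regions, key=lambda r: (r.chrom, r.start, r.end)):
        if r.start < 0 or r.end < r.start:
            raise ValueError(f"invalid interval: {r}")
        lines.append(f"{r.chrom}\t{r.start}\t{r.end}\t{r.name}")
    return "\n".join(lines) + "\n"
```

Sorting by chromosome and start position is a convenience here, since many downstream tools expect sorted input; it is not something the abstract specifies.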

Computers & Graphics-UK, vol. 124, Article 104059.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0097849324001948/pdfft?md5=f80ba96ee4f32f07bbbc948215d8362d&pid=1-s2.0-S0097849324001948-main.pdf
Citations: 0
Advanced visualization of aortic dissection anatomy and hemodynamics
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-30 | DOI: 10.1016/j.cag.2024.104060
Aaron Schroeder, Kai Ostendorf, Kathrin Bäumler, Domenico Mastrodicasa, Veit Sandfort, Dominik Fleischmann, Bernhard Preim, Gabriel Mistelbauer

Aortic dissection is a life-threatening cardiovascular disease constituted by the delamination of the aortic wall. Due to the weakened structure of the false lumen, the aorta often dilates over time, which can, after certain diameter thresholds are reached, increase the risk of fatal aortic rupture. The identification of patients with a high risk of late adverse events is an ongoing clinical challenge, further complicated by the complex dissection anatomy and the wide variety among patients. Moreover, patient-specific risk stratification depends not only on morphological, but also on hemodynamic factors, which can be derived from computer simulations or 4D flow magnetic resonance imaging (MRI). However, comprehensible visualizations that depict the complex anatomical and functional information in a single view are yet to be developed. These visualization tools will assist clinical research and decision-making by facilitating a comprehensive understanding of the aortic state. For that purpose, we identified several visualization tasks and requirements in close collaboration with cardiovascular imaging scientists and radiologists. We displayed true and false lumen hemodynamics using pathlines as well as surface hemodynamics on the dissection flap and the inner vessel wall. Pathlines indicate antegrade and retrograde flow, blood flow through fenestrations, and branch vessel supply. Dissection-specific hemodynamic measures, such as interluminal pressure difference and flap compliance, provide further insight into the blood flow throughout the cardiac cycle. Finally, we evaluated our visualization techniques with cardiothoracic and vascular surgeons in two separate virtual sessions.

Computers & Graphics-UK, vol. 124, Article 104060.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S009784932400195X/pdfft?md5=0ad3789d79a9874f94f737d74b5f7695&pid=1-s2.0-S009784932400195X-main.pdf
Citations: 0
Interactive data comics for communicating medical data to the general public: A study of engagement and ease of understanding
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-29 | DOI: 10.1016/j.cag.2024.104055
Melissa Fogwill, Areti Manataki

We are experiencing a health literacy crisis worldwide, which has alarming effects on individuals’ medical outcomes. This poses the challenge of communicating key information about health conditions and their management in a way that is easily understood by a general audience. In this paper, we propose the use of data-driven storytelling to address this challenge, in particular through interactive data comics. We developed an interactive data comic that communicates cancer data. A between-group study with 98 participants was carried out to evaluate the data comic’s ease of understanding and engagement, compared to a text medium that captures the same information. The study reveals that the data comic is perceived to be more engaging, and participants have greater recall and understanding of the data within the story, compared with the text medium.

Computers & Graphics-UK, vol. 124, Article 104055.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0097849324001900/pdfft?md5=40198602446338906e6765a4f491fd30&pid=1-s2.0-S0097849324001900-main.pdf
Citations: 0
EasySkinning: Target-oriented skinning by mesh contraction and curve editing
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-29 | DOI: 10.1016/j.cag.2024.104049
Jing Ma, Jituo Li, Dongliang Zhang

Skinning, a critical process in animation that defines how bones influence the vertices of a 3D character model, significantly impacts the visual effect in animation production. Traditional methods are time-intensive and skill-dependent, whereas automatic techniques lack flexibility and quality. Our research introduces EasySkinning, a user-friendly system applicable to complex meshes. This method comprises three key components: rigid weight initialization through Voronoi contraction, precise weight editing via curve tools, and smooth weight solving for reconstructing target deformations. EasySkinning begins by contracting the input mesh inwards to the skeletal bones, which improves vertex-to-bone mappings, particularly in intricate mesh areas. We also design intuitive curve-editing tools, allowing users to define more precise bone influence regions. The final stage employs advanced deformation algorithms for smooth weight solving, crucial for achieving desired animations. Through extensive experiments, we demonstrate that EasySkinning not only simplifies the creation of high-quality skinning weights but also consistently outperforms existing automatic and interactive skinning methods.
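
As a rough illustration of what a rigid weight initialization produces, the sketch below assigns each vertex entirely to its nearest bone segment, giving one-hot weights. This plain nearest-bone rule is a simplified stand-in, not the Voronoi contraction step the paper describes, and the function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (3D tuples)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def rigid_weights(vertices, bones):
    """Return one weight row per vertex: 1.0 for the nearest bone, 0.0 elsewhere."""
    weights = []
    for v in vertices:
        dists = [point_segment_distance(v, a, b) for a, b in bones]
        nearest = dists.index(min(dists))
        weights.append([1.0 if j == nearest else 0.0 for j in range(len(bones))])
    return weights
```

A purely distance-based rule like this tends to misassign vertices where separate body parts lie close together, which is the kind of case the abstract says mesh contraction is meant to improve.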

Computers & Graphics-UK, vol. 124, Article 104049.
Citations: 0
HiSEG: Human assisted instance segmentation
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-26 | DOI: 10.1016/j.cag.2024.104061
Muhammed Korkmaz, T. Metin Sezgin

Instance segmentation is a form of image detection which has a range of applications, such as object refinement, medical image analysis, and image/video editing, all of which demand a high degree of accuracy. However, this precision is often beyond the reach of what even state-of-the-art, fully automated instance segmentation algorithms can deliver. The performance gap becomes particularly prohibitive for small and complex objects. Practitioners typically resort to fully manual annotation, which can be a laborious process. In order to overcome this problem, we propose a novel approach to enable more precise predictions and generate higher-quality segmentation masks for high-curvature, complex and small-scale objects. Our human-assisted segmentation method, HiSEG, augments the existing Strong Mask R-CNN network to incorporate human-specified partial boundaries. We also present a dataset of hand-drawn partial object boundaries, which we refer to as “human attention maps”. In addition, the Partial Sketch Object Boundaries (PSOB) dataset contains hand-drawn partial object boundaries which represent curvatures of an object’s ground truth mask with several pixels. Through extensive evaluation using the PSOB dataset, we show that HiSEG outperforms state-of-the-art methods such as Mask R-CNN, Strong Mask R-CNN, Mask2Former, and Segment Anything, achieving respective increases of +42.0, +34.9, +29.9, and +13.4 points in AP$_{\text{Mask}}$ metrics for these four models. We hope that our novel approach will set a baseline for future human-aided deep learning models by combining fully automated and interactive instance segmentation architectures.

Computers & Graphics-UK, vol. 124, Article 104061.
Citations: 0
Fast direct multi-person radiance fields from sparse input with dense pose priors
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-26 | DOI: 10.1016/j.cag.2024.104063
João Paulo Lima, Hideaki Uchiyama, Diego Thomas, Veronica Teichrieb

Volumetric radiance fields have been popular in reconstructing small-scale 3D scenes from multi-view images. With additional constraints such as person correspondences, reconstructing a large 3D scene with multiple persons becomes possible. However, existing methods fail for sparse input views or when person correspondences are unavailable. In such cases, the conventional depth image supervision may be insufficient because it only captures the relative position of each person with respect to the camera center. In this paper, we investigate an alternative approach by supervising the optimization framework with a dense pose prior that represents correspondences between the SMPL model and the input images. The core ideas of our approach consist in exploiting dense pose priors estimated from the input images to perform person segmentation and incorporating such priors into the learning of the radiance field. Our proposed dense pose supervision is view-independent, significantly speeding up computational time and improving 3D reconstruction accuracy, with fewer floaters and less noise. We confirm the advantages of our proposed method with extensive evaluation in a subset of the publicly available CMU Panoptic dataset. When training with only five input views, our proposed method achieves an average improvement of 6.1% in PSNR, 3.5% in SSIM, 17.2% in LPIPS$^{\text{vgg}}$, 19.3% in LPIPS$^{\text{alex}}$, and 39.4% in training time.
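
For readers unfamiliar with the first reported metric, PSNR is conventionally defined as 10·log10(MAX² / MSE), where MSE is the mean squared error between a reference image and an estimate. The sketch below assumes images flattened to lists of floats in [0, 1]; `psnr` is an illustrative helper and not the authors' evaluation code.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the estimate is closer to the reference, whereas for LPIPS lower values are better, so an "improvement" in the abstract refers to a change in the favorable direction for each metric.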

Computers & Graphics-UK, vol. 124, Article 104063.
Citations: 0
Choreographing multi-degree of freedom behaviors in large-scale crowd simulations
IF 2.5 | Zone 4, Computer Science | Q2 Computer Science, Software Engineering | Pub Date: 2024-08-23 | DOI: 10.1016/j.cag.2024.104051
Kexiang Huang, Gangyi Ding, Dapeng Yan, Ruida Tang, Tianyu Huang, Nuria Pelechano

This study introduces a novel framework for choreographing multi-degree of freedom (MDoF) behaviors in large-scale crowd simulations. The framework integrates multi-objective optimization with spatio-temporal ordering to effectively generate and control diverse MDoF crowd behavior states. We propose a set of evaluation criteria for assessing the aesthetic quality of crowd states and employ multi-objective optimization to produce crowd states that meet these criteria. Additionally, we introduce time offset functions and interpolation progress functions to perform complex and diversified behavior state interpolations. Furthermore, we designed a user-centric interaction module that allows for intuitive and flexible adjustments of crowd behavior states through sketching, spline curves, and other interactive means. Qualitative tests and quantitative experiments on the evaluation criteria demonstrate the effectiveness of this method in generating and controlling MDoF behaviors in crowds. Finally, case studies, including real-world applications in the Opening Ceremony of the 2022 Beijing Winter Olympics, validate the practicality and adaptability of this approach.
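
One possible reading of the time offset and interpolation progress functions is sketched below: each agent starts its transition after its own delay, and an easing curve maps elapsed time to a blend factor between two behavior states. This composition, the smoothstep easing, and all function names are assumptions made for illustration; the abstract does not specify the actual functions.

```python
def smoothstep(u):
    """Ease-in/ease-out curve on [0, 1]; inputs outside the range are clamped."""
    u = min(1.0, max(0.0, u))
    return u * u * (3.0 - 2.0 * u)


def interpolation_progress(t, offset, duration):
    """Blend factor in [0, 1] for an agent that starts moving at time `offset`."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return smoothstep((t - offset) / duration)


def agent_state(start, end, t, offset, duration):
    """Interpolate a scalar state (e.g. an arm angle) for one agent at time t."""
    p = interpolation_progress(t, offset, duration)
    return start + (end - start) * p


def row_offsets(n_agents, delay):
    """Hypothetical time offset function: a wave sweeping along a row of agents."""
    return [i * delay for i in range(n_agents)]
```

With `row_offsets(3, 0.5)` and a one-second transition, the three agents are at different stages of the same motion at any given time, which produces a ripple across the row rather than a synchronized move.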

本研究介绍了一种在大规模人群模拟中编排多自由度(MDoF)行为的新型框架。该框架将多目标优化与时空排序相结合,可有效生成和控制多种 MDoF 人群行为状态。我们提出了一套评估人群状态美学质量的评价标准,并采用多目标优化来生成符合这些标准的人群状态。此外,我们还引入了时间偏移函数和插值进度函数,以执行复杂多样的行为状态插值。此外,我们还设计了一个以用户为中心的交互模块,通过草图、样条曲线和其他交互方式,直观灵活地调整人群行为状态。对评估标准的定性测试和定量实验证明了这种方法在生成和控制人群 MDoF 行为方面的有效性。最后,包括 2022 年北京冬奥会开幕式实际应用在内的案例研究验证了这种方法的实用性和适应性。
Citations: 0
US-Net: U-shaped network with Convolutional Attention Mechanism for ultrasound medical images
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-08-23 DOI: 10.1016/j.cag.2024.104054
Xiaoyu Xie, Pingping Liu, Yijun Lang, Zhenjie Guo, Zhongxi Yang, Yuhao Zhao

Ultrasound imaging, characterized by low contrast, high noise, and interference from surrounding tissues, poses significant challenges in lesion segmentation. To tackle these issues, we introduce an enhanced U-shaped network that incorporates several novel features for precise, automated segmentation. Firstly, our model utilizes a convolution-based self-attention mechanism to establish long-range dependencies in feature maps, crucial for small dataset applications, accompanied by a soft thresholding method for noise reduction. Secondly, we employ multi-sized convolutional kernels to enrich feature processing, coupled with curvature calculations to accentuate edge details via a soft-attention approach. Thirdly, an advanced skip connection strategy is implemented in the UNet architecture, integrating information entropy to assess and utilize texture-rich channels, thereby improving semantic detail in the encoder before merging with decoder outputs. We validated our approach using a newly curated dataset, VPUSI (Vascular Plaques Ultrasound Images), alongside the established datasets, BUSI, TN3K and DDTI. Comparative experiments on these datasets show that our model outperforms existing state-of-the-art techniques in segmentation accuracy.
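The abstract mentions a soft thresholding method for noise reduction without giving details. As a rough illustration only, the standard soft-thresholding (shrinkage) operator can be sketched as below; the function name `soft_threshold` and the fixed threshold `tau` are assumptions made for this sketch, and the paper may apply the idea differently (for example with a learned threshold inside the network).

```python
def soft_threshold(values, tau):
    """Standard soft thresholding: sign(v) * max(|v| - tau, 0).

    Values whose magnitude is at most tau are set to zero (treated as noise);
    larger values are shrunk toward zero by tau.
    `tau` is a hypothetical fixed threshold chosen by the caller.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    shrunk = []
    for v in values:
        magnitude = abs(v) - tau
        if magnitude <= 0:
            shrunk.append(0.0)          # small responses are suppressed
        elif v > 0:
            shrunk.append(magnitude)    # positive values shrink downward
        else:
            shrunk.append(-magnitude)   # negative values shrink upward
    return shrunk


if __name__ == "__main__":
    print(soft_threshold([3.0, -0.5, -2.0, 0.2], 1.0))
```

Small activations are zeroed while strong ones are kept with reduced magnitude, which is why the operator is commonly used for denoising.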

Citations: 0
ShapeBench: A new approach to benchmarking local 3D shape descriptors
IF 2.5 Zone 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-08-22 DOI: 10.1016/j.cag.2024.104052
Bart Iver van Blokland

The ShapeBench evaluation methodology is proposed as an extension to the popular Area Under Precision-Recall Curve (PRC/AUC) for measuring the matching performance of local 3D shape descriptors. It is observed that the PRC inadequately accounts for other similar surfaces in the same or different objects when determining whether a candidate match is a true positive. The novel Descriptor Distance Index (DDI) metric is introduced to address this limitation. In contrast to previous evaluation methodologies, which identify entire objects in a given scene, the DDI metric measures descriptor performance by analysing point-to-point distances. The ShapeBench methodology is also more scalable than previous approaches, by using procedural generation. The benchmark is used to evaluate both old and new descriptors. The results produced by the implementation of the benchmark are fully replicable, and are made publicly available.
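For readers unfamiliar with the baseline the abstract refers to, the area under a precision-recall curve is often estimated step-wise as average precision. The sketch below shows that baseline only, not the DDI metric, which the abstract does not define. The function name, the use of descriptor distances as the ranking score, and the boolean ground-truth labels are all assumptions made for illustration.

```python
def average_precision(distances, is_correct):
    """Step-wise estimate of the area under the precision-recall curve.

    `distances`  : hypothetical descriptor distances, smaller = better match.
    `is_correct` : hypothetical ground-truth flags, True if the match is right.
    """
    # Rank candidate matches from most to least confident (smallest distance first).
    ranked = sorted(zip(distances, is_correct), key=lambda pair: pair[0])
    total_positives = sum(1 for _, ok in ranked if ok)
    if total_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, (_, ok) in enumerate(ranked, start=1):
        if ok:
            hits += 1
            # Precision measured at each rank where recall increases.
            precision_sum += hits / rank
    return precision_sum / total_positives


if __name__ == "__main__":
    print(average_precision([0.1, 0.2, 0.3, 0.4], [True, False, True, False]))
```

Note that each match here is labelled simply true or false; the abstract's criticism is that such labelling ignores other similar surfaces in the same or different objects.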

Citations: 0