SingVisio: Visual analytics of diffusion model for singing voice conversion
Pub Date: 2024-08-30 DOI: 10.1016/j.cag.2024.104058
Liumeng Xue, Chaoren Wang, Mingxuan Wang, Xueyao Zhang, Jun Han, Zhizheng Wu
In this study, we present SingVisio, an interactive visual analysis system that aims to explain the diffusion model used in singing voice conversion. SingVisio provides a visual display of the generation process in diffusion models, showcasing the step-by-step denoising of the noisy spectrum and its transformation into a clean spectrum that captures the desired singer’s timbre. The system also facilitates side-by-side comparisons of different conditions, such as source content, melody, and target timbre, highlighting the impact of these conditions on the diffusion generation process and resulting conversions. Through comparative and comprehensive evaluations, SingVisio demonstrates its effectiveness in terms of system design, functionality, explainability, and user-friendliness. It offers users of various backgrounds valuable learning experiences and insights into the diffusion model for singing voice conversion.
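A minimal sketch of the step-by-step denoising such a visualization exposes, using a generic DDPM-style reverse process on a spectrogram tensor; the model callable, noise schedule, and conditioning argument here are illustrative assumptions, not SingVisio's actual implementation:

```python
import numpy as np

def ddpm_denoise_trajectory(model, cond, T=1000, shape=(80, 256), seed=0):
    """Run a generic DDPM reverse process and record every intermediate
    spectrogram so each denoising step can be visualized.

    `model(x, t, cond)` is assumed to predict the noise component;
    `cond` stands in for content/melody/timbre conditions."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)            # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)                # start from pure noise
    trajectory = [x.copy()]
    for t in reversed(range(T)):
        eps = model(x, t, cond)                   # predicted noise at step t
        # posterior mean of x_{t-1} given x_t (standard DDPM update)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                                 # add noise except at the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
        trajectory.append(x.copy())
    return trajectory                             # one frame per denoising step
```

Recording the full trajectory rather than only the final sample is what enables the per-step visual comparison the abstract describes.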
Virtual reality inspection of chromatin 3D and 2D data
Pub Date: 2024-08-30 DOI: 10.1016/j.cag.2024.104059
Elena Molina, David Kouřil, Tobias Isenberg, Barbora Kozlíková, Pere-Pau Vázquez
Understanding the packing of long DNA strands into chromatin is one of the ultimate challenges in genomic research. An intrinsic part of this complex problem is studying the chromatin’s spatial structure. Biologists reconstruct 3D models of chromatin from experimental data, yet the exploration and analysis of such 3D structures is limited in existing genomic data visualization tools. To improve this situation, we investigated current immersive methods and designed a prototypical VR visualization tool for 3D chromatin models that leverages virtual reality to handle the spatial data. We showcase the tool in three primary use cases. First, we provide an overall 3D shape overview of the chromatin to facilitate the identification of regions of interest and their selection for further investigation. Second, we include the option to export the selected regions and elements in the BED format, which can be loaded into common analytical tools. Third, we integrate epigenetic modification data that influence gene expression along the sequence, either as in-world 2D charts or overlaid on the 3D structure itself. We developed our application in collaboration with two domain experts and gathered insights from two informal studies with five other experts.
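BED is a plain tab-separated genomic interval format (0-based, half-open coordinates), so exporting selected regions for downstream tools is straightforward. A minimal sketch; the tuple layout and file name are illustrative, not the tool's actual export code:

```python
def export_bed(regions, path):
    """Write selected chromatin regions as a BED file.

    `regions` is an iterable of (chrom, start, end, name) tuples;
    BED coordinates are 0-based, half-open [start, end)."""
    with open(path, "w") as f:
        for chrom, start, end, name in regions:
            f.write(f"{chrom}\t{start}\t{end}\t{name}\n")

# e.g. a region selected in the VR view, ready for common analytical tools
export_bed([("chr1", 1_000_000, 1_250_000, "selection_01")], "selection.bed")
```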
Advanced visualization of aortic dissection anatomy and hemodynamics
Pub Date: 2024-08-30 DOI: 10.1016/j.cag.2024.104060
Aaron Schroeder, Kai Ostendorf, Kathrin Bäumler, Domenico Mastrodicasa, Veit Sandfort, Dominik Fleischmann, Bernhard Preim, Gabriel Mistelbauer
Aortic dissection is a life-threatening cardiovascular disease characterized by delamination of the aortic wall. Due to the weakened structure of the false lumen, the aorta often dilates over time, which can increase the risk of fatal aortic rupture once certain diameter thresholds are reached. Identifying patients at high risk of late adverse events is an ongoing clinical challenge, further complicated by the complex dissection anatomy and the wide variety among patients. Moreover, patient-specific risk stratification depends not only on morphological but also on hemodynamic factors, which can be derived from computer simulations or 4D flow magnetic resonance imaging (MRI). However, comprehensible visualizations that depict the complex anatomical and functional information in a single view have yet to be developed. Such visualization tools would assist clinical research and decision-making by facilitating a comprehensive understanding of the aortic state. For that purpose, we identified several visualization tasks and requirements in close collaboration with cardiovascular imaging scientists and radiologists. We display true and false lumen hemodynamics using pathlines as well as surface hemodynamics on the dissection flap and the inner vessel wall. Pathlines indicate antegrade and retrograde flow, blood flow through fenestrations, and branch vessel supply. Dissection-specific hemodynamic measures, such as interluminal pressure difference and flap compliance, provide further insight into the blood flow throughout the cardiac cycle. Finally, we evaluated our visualization techniques with cardiothoracic and vascular surgeons in two separate virtual sessions.
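Pathlines trace massless particles through the time-varying velocity field of the 4D flow data. A minimal sketch using forward-Euler integration; the sampling callback and step size are illustrative assumptions, and production tools would typically use a higher-order integrator such as RK4:

```python
import numpy as np

def trace_pathline(sample_velocity, seed, t0, t1, dt=0.005):
    """Integrate one pathline through a time-varying flow field.

    `sample_velocity(p, t)` is assumed to return the interpolated
    velocity 3-vector of the flow data at position p and time t."""
    p = np.asarray(seed, dtype=float)
    points, t = [p.copy()], t0
    while t < t1:
        p = p + dt * np.asarray(sample_velocity(p, t))  # forward Euler step
        t += dt
        points.append(p.copy())
    return np.array(points)  # polyline ready for rendering
```

Seeding such traces separately in the true and false lumen is one way to obtain the per-lumen pathline sets the abstract mentions.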
Interactive data comics for communicating medical data to the general public: A study of engagement and ease of understanding
Pub Date: 2024-08-29 DOI: 10.1016/j.cag.2024.104055
Melissa Fogwill, Areti Manataki
We are experiencing a health literacy crisis worldwide, which has alarming effects on individuals’ medical outcomes. This poses the challenge of communicating key information about health conditions and their management in a way that is easily understood by a general audience. In this paper, we propose the use of data-driven storytelling to address this challenge, in particular through interactive data comics. We developed an interactive data comic that communicates cancer data. A between-group study with 98 participants was carried out to evaluate the data comic’s ease of understanding and engagement, compared to a text medium that captures the same information. The study reveals that the data comic is perceived to be more engaging, and participants have greater recall and understanding of the data within the story, compared with the text medium.
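As a sketch of how such a between-group comparison of ordinal ratings might be analyzed; the test choice and the ratings below are illustrative assumptions, not the study's actual data or analysis:

```python
from scipy.stats import mannwhitneyu

# Hypothetical Likert-scale engagement ratings for the two conditions
comic_group = [5, 4, 5, 4, 5, 3, 5, 4]
text_group = [3, 4, 2, 3, 4, 3, 2, 3]

# Two-sided test: do the rating distributions differ between the groups?
stat, p = mannwhitneyu(comic_group, text_group, alternative="two-sided")
print(f"U = {stat}, p = {p:.4f}")
```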
EasySkinning: Target-oriented skinning by mesh contraction and curve editing
Pub Date: 2024-08-29 DOI: 10.1016/j.cag.2024.104049
Jing Ma, Jituo Li, Dongliang Zhang
Skinning, a critical process in animation that defines how bones influence the vertices of a 3D character model, significantly impacts the visual quality of animation production. Traditional methods are time-intensive and skill-dependent, whereas automatic techniques lack flexibility and quality. Our research introduces EasySkinning, a user-friendly system applicable to complex meshes. The method comprises three key components: rigid weight initialization through Voronoi contraction, precise weight editing via curve tools, and smooth weight solving for reconstructing target deformations. EasySkinning begins by contracting the input mesh inwards to the skeletal bones, which improves vertex-to-bone mappings, particularly in intricate mesh areas. We also design intuitive curve-editing tools that allow users to define more precise bone influence regions. The final stage employs advanced deformation algorithms for smooth weight solving, crucial for achieving the desired animations. Through extensive experiments, we demonstrate that EasySkinning not only simplifies the creation of high-quality skinning weights but also consistently outperforms existing automatic and interactive skinning methods.
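Once per-vertex weights are solved, deformation typically follows standard linear blend skinning, which is what makes the weight quality visible. A minimal sketch of that downstream step (EasySkinning's own initialization, editing, and solving stages are not shown):

```python
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """Deform vertices by blending bone transforms with skinning weights.

    vertices:        (V, 3) rest-pose positions
    weights:         (V, B) per-vertex bone weights, rows summing to 1
    bone_transforms: (B, 4, 4) bone transformation matrices"""
    V = vertices.shape[0]
    homo = np.hstack([vertices, np.ones((V, 1))])             # (V, 4) homogeneous
    # position of every vertex under every bone's transform: (B, V, 4)
    per_bone = np.einsum("bij,vj->bvi", bone_transforms, homo)
    # weighted blend across bones: (V, 4)
    blended = np.einsum("vb,bvi->vi", weights, per_bone)
    return blended[:, :3]
```

Poor weights show up here directly as candy-wrapper artifacts and collapsed joints, which is why precise weight editing pays off.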
HiSEG: Human assisted instance segmentation
Pub Date: 2024-08-26 DOI: 10.1016/j.cag.2024.104061
Muhammed Korkmaz, T. Metin Sezgin
Instance segmentation is a form of image detection with a range of applications, such as object refinement, medical image analysis, and image/video editing, all of which demand a high degree of accuracy. However, this precision is often beyond the reach of even state-of-the-art, fully automated instance segmentation algorithms. The performance gap becomes particularly prohibitive for small and complex objects, so practitioners typically resort to fully manual annotation, which can be a laborious process. To overcome this problem, we propose a novel approach that enables more precise predictions and generates higher-quality segmentation masks for high-curvature, complex, and small-scale objects. Our human-assisted segmentation method, HiSEG, augments the existing Strong Mask R-CNN network to incorporate human-specified partial boundaries. We also present the Partial Sketch Object Boundaries (PSOB) dataset of hand-drawn partial object boundaries, which we refer to as “human attention maps”; each sketch traces the curvature of an object’s ground-truth mask with a few pixels. Through extensive evaluation on the PSOB dataset, we show that HiSEG outperforms state-of-the-art methods such as Mask R-CNN, Strong Mask R-CNN, Mask2Former, and Segment Anything, achieving respective increases of +42.0, +34.9, +29.9, and +13.4 points in AP_Mask for these four models. We hope that our novel approach will set a baseline for future human-aided deep learning models by combining fully automated and interactive instance segmentation architectures.
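Mask AP is computed by matching predicted masks to ground truth via IoU, then integrating precision over recall. A simplified single-class, single-IoU-threshold sketch (COCO-style AP_Mask additionally averages over IoU thresholds 0.50:0.95 and uses all-points interpolation rather than trapezoidal integration):

```python
import numpy as np

def mask_iou(a, b):
    """IoU of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def average_precision(preds, gts, iou_thr=0.5):
    """preds: list of (score, mask); gts: list of ground-truth masks."""
    preds = sorted(preds, key=lambda p: -p[0])   # highest confidence first
    matched = [False] * len(gts)
    tps = []
    for score, mask in preds:
        ious = [mask_iou(mask, g) for g in gts]
        best = int(np.argmax(ious)) if ious else -1
        if best >= 0 and ious[best] >= iou_thr and not matched[best]:
            matched[best] = True                 # each GT matches at most once
            tps.append(1)
        else:
            tps.append(0)                        # duplicate or poor match -> FP
    tps = np.array(tps)
    cum_tp = np.cumsum(tps)
    precision = cum_tp / np.arange(1, len(tps) + 1)
    recall = cum_tp / max(len(gts), 1)
    return float(np.trapz(precision, recall))    # area under the PR curve
```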
Fast direct multi-person radiance fields from sparse input with dense pose priors
Pub Date: 2024-08-26 DOI: 10.1016/j.cag.2024.104063
João Paulo Lima, Hideaki Uchiyama, Diego Thomas, Veronica Teichrieb
Volumetric radiance fields have been popular for reconstructing small-scale 3D scenes from multi-view images. With additional constraints such as person correspondences, reconstructing a large 3D scene with multiple persons becomes possible. However, existing methods fail for sparse input views or when person correspondences are unavailable. In such cases, conventional depth image supervision may be insufficient because it only captures the relative position of each person with respect to the camera center. In this paper, we investigate an alternative approach by supervising the optimization framework with a dense pose prior that represents correspondences between the SMPL model and the input images. The core ideas of our approach are to exploit dense pose priors estimated from the input images to perform person segmentation and to incorporate these priors into the learning of the radiance field. Our proposed dense pose supervision is view-independent, significantly reducing computation time and improving 3D reconstruction accuracy, with fewer floaters and less noise. We confirm the advantages of our proposed method with extensive evaluation on a subset of the publicly available CMU Panoptic dataset. When training with only five input views, our proposed method achieves an average improvement of 6.1% in PSNR, 3.5% in SSIM, 17.2% in LPIPS (VGG), 19.3% in LPIPS (AlexNet), and 39.4% in training time.
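A radiance field is rendered by alpha-compositing density and color samples along each camera ray; a minimal sketch of that standard NeRF-style quadrature (the paper's dense-pose supervision and person segmentation are not shown):

```python
import numpy as np

def render_ray(sigmas, colors, deltas):
    """Composite density/color samples along one ray.

    sigmas: (N,) volume densities
    colors: (N, 3) per-sample RGB
    deltas: (N,) lengths of the ray segments between samples"""
    alphas = 1.0 - np.exp(-sigmas * deltas)                   # per-sample opacity
    # transmittance: how much light survives to reach each sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas]))[:-1]
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)            # final pixel RGB
```

Floaters correspond to spurious high-density samples that grab compositing weight; stronger priors suppress them, consistent with the improvements reported above.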
Choreographing multi-degree of freedom behaviors in large-scale crowd simulations
Kexiang Huang, Gangyi Ding, Dapeng Yan, Ruida Tang, Tianyu Huang, Nuria Pelechano
Pub Date: 2024-08-23 DOI: 10.1016/j.cag.2024.104051
This study introduces a novel framework for choreographing multi-degree of freedom (MDoF) behaviors in large-scale crowd simulations. The framework integrates multi-objective optimization with spatio-temporal ordering to effectively generate and control diverse MDoF crowd behavior states. We propose a set of evaluation criteria for assessing the aesthetic quality of crowd states and employ multi-objective optimization to produce crowd states that meet these criteria. Additionally, we introduce time offset functions and interpolation progress functions to perform complex and diversified behavior state interpolations. Furthermore, we design a user-centric interaction module that allows for intuitive and flexible adjustments of crowd behavior states through sketching, spline curves, and other interactive means. Qualitative tests and quantitative experiments on the evaluation criteria demonstrate the effectiveness of this method in generating and controlling MDoF behaviors in crowds. Finally, case studies, including real-world applications in the Opening Ceremony of the 2022 Beijing Winter Olympics, validate the practicality and adaptability of this approach.
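A minimal sketch of blending two crowd behavior states with per-agent time offsets and a shared progress function; the smoothstep easing and function forms are illustrative assumptions, not the paper's exact definitions:

```python
import numpy as np

def interpolate_states(state_a, state_b, t, offsets,
                       ease=lambda s: 3 * s**2 - 2 * s**3):
    """Blend each agent's degree-of-freedom values from state_a to state_b.

    state_a, state_b: (N, D) per-agent DoF values for the two states
    t:                global animation time
    offsets:          (N,) per-agent start delays (the spatio-temporal ordering)
    ease:             interpolation progress function (smoothstep by default)"""
    s = np.clip(t - offsets, 0.0, 1.0)          # local progress per agent
    w = ease(s)[:, None]                        # eased blend weight, shape (N, 1)
    return (1.0 - w) * state_a + w * state_b
```

Varying `offsets` spatially (e.g., by distance from a sketched curve) is one way such a framework can make transitions sweep across the formation rather than fire simultaneously.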
US-Net: U-shaped network with Convolutional Attention Mechanism for ultrasound medical images
Pub Date: 2024-08-23 DOI: 10.1016/j.cag.2024.104054
Xiaoyu Xie, Pingping Liu, Yijun Lang, Zhenjie Guo, Zhongxi Yang, Yuhao Zhao
Ultrasound imaging, characterized by low contrast, high noise, and interference from surrounding tissues, poses significant challenges for lesion segmentation. To tackle these issues, we introduce an enhanced U-shaped network that incorporates several novel features for precise, automated segmentation. First, our model uses a convolution-based self-attention mechanism to establish long-range dependencies in feature maps, which is crucial for small-dataset applications, together with a soft thresholding method for noise reduction. Second, we employ multi-sized convolutional kernels to enrich feature processing, coupled with curvature calculations that accentuate edge details via a soft-attention approach. Third, an advanced skip-connection strategy is implemented in the UNet architecture, integrating information entropy to assess and exploit texture-rich channels, thereby improving semantic detail in the encoder before merging with decoder outputs. We validated our approach on a newly curated dataset, VPUSI (Vascular Plaques Ultrasound Images), alongside the established BUSI, TN3K, and DDTI datasets. Comparative experiments on these datasets show that our model outperforms existing state-of-the-art techniques in segmentation accuracy.
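Soft thresholding is the classic shrinkage operator for noise reduction: it pulls every activation toward zero by a threshold and zeroes anything smaller. A minimal sketch (how US-Net derives its threshold per channel is not shown):

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink values toward zero by tau, zeroing everything with |x| <= tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

features = np.array([-1.5, -0.2, 0.05, 0.4, 2.0])
print(soft_threshold(features, tau=0.3))   # [-1.2  0.   0.   0.1  1.7]
```

Small-magnitude responses, which in ultrasound features are dominated by speckle noise, are suppressed while strong responses pass through with a constant offset.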
ShapeBench: A new approach to benchmarking local 3D shape descriptors
Pub Date: 2024-08-22 DOI: 10.1016/j.cag.2024.104052
Bart Iver van Blokland
The ShapeBench evaluation methodology is proposed as an extension to the popular Area Under Precision-Recall Curve (PRC/AUC) for measuring the matching performance of local 3D shape descriptors. It is observed that the PRC inadequately accounts for other similar surfaces in the same or different objects when determining whether a candidate match is a true positive. The novel Descriptor Distance Index (DDI) metric is introduced to address this limitation. In contrast to previous evaluation methodologies, which identify entire objects in a given scene, the DDI metric measures descriptor performance by analysing point-to-point distances. The ShapeBench methodology is also more scalable than previous approaches because it uses procedural generation. The benchmark is used to evaluate both old and new descriptors, and the results produced by the benchmark implementation are fully replicable and publicly available.
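As a rough sketch of the idea behind point-to-point distance analysis (the precise DDI definition is in the paper): a descriptor's distance to its true corresponding point is ranked against its distances to many distractor descriptors, so similar surfaces elsewhere count against it. All names and the synthetic data below are illustrative:

```python
import numpy as np

def distance_rank(query, counterpart, distractors):
    """Rank the query->counterpart distance among query->distractor distances.

    A rank of 0 means the true corresponding descriptor is the nearest one."""
    d_true = np.linalg.norm(query - counterpart)
    d_all = np.linalg.norm(distractors - query, axis=1)
    return int((d_all < d_true).sum())

rng = np.random.default_rng(1)
q = rng.standard_normal(32)                     # descriptor at a query point
c = q + 0.05 * rng.standard_normal(32)          # noisy true counterpart
noise = rng.standard_normal((1000, 32))         # distractor descriptors
print(distance_rank(q, c, noise))               # small rank => discriminative descriptor
```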